-
Notifications
You must be signed in to change notification settings - Fork 8
Repair diskless resource on sub-zero volume allocated size #88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 3.2.12-8.3
Are you sure you want to change the base?
Repair diskless resource on sub-zero volume allocated size #88
Conversation
drivers/linstorvolumemanager.py
Outdated
| self._linstor.resource_auto_place( | ||
| rsc_name=resource.name, | ||
| place_count=1, | ||
| diskless_type=linstor.consts.FLAG_DRBD_DISKLESS | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Must be recreated on the correct node, otherwise you change the behavior on the caller side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on what I read on the linstor docs, I'd say something like this would be appropriate. Is it ?
self._linstor.resource_create(
rscs=[
linstor.linstorapi.ResourceData(
node_name=resource.node_name,
rsc_name=resource.name,
storage_pool=volume.storage_pool_name,
diskless=linstor.consts.FLAG_DISKLESS in resource.flags,
drbd_diskless=linstor.consts.FLAG_DRBD_DISKLESS in resource.flags,
nvme_initiator=linstor.consts.FLAG_NVME_INITIATOR in resource.flags,
ebs_initiator=linstor.consts.FLAG_EBS_INITIATOR in resource.flags
)
]
)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion added in 6b32030
| place_count=1, | ||
| diskless_type=linstor.consts.FLAG_DRBD_DISKLESS | ||
| ) | ||
| self._mark_resource_cache_as_dirty() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure there is no issue for caller with for resource in self._get_resource_cache().resources code? 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here in current usage I break the parent loop with a return, however for now it is up to the caller to handle the fact that this function implicitly modifies shared state. I see two options:
- Document this clearly in a docstring, specifying that it mutates the resource cache
- Move cache invalidation responsibility to the caller
I'd prefer the first option since as I know there is little to no point to not invalidate the cache afterwards ? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been changed to a scan happening before the loop in 9a9dec1. So this is not an issue anymore.
drivers/linstorvolumemanager.py
Outdated
| return util.retry( | ||
| lambda: self.allocated_volume_size, | ||
| maxretry=1, | ||
| period=1 | ||
| ) | ||
| except LinstorVolumeManagerError as e: | ||
| raise e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this code. 😅 Maybe we should talk about it in a call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See answers below, the util.retry was unnecessary and removed in bae7edb, meanwhile the except->raise is there to prevent nested exceptions of the same type.
drivers/linstorvolumemanager.py
Outdated
| except LinstorVolumeManagerError as e: | ||
| raise e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the function will call itself if a LinstorVolumeManagerError is raised, that exception can be caught again below, unnecessarily nesting exceptions of the same type. This might be a bit of overkill though, so I can remove it if you see no use in it.
drivers/linstorvolumemanager.py
Outdated
| return util.retry( | ||
| lambda: self._get_volumes_info(volume_name), | ||
| maxretry=1, | ||
| period=1 | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why a retry, and with a maxretry of 1 and a return here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might have mixed up stuff because I'm not sure of why I did this in the first place. I have fixed this in bae7edb. However this brings the eventuality of a loop if the resource is gets broken in the meantime.
|
|
Following issues that @Wescoeur noted we decided to have this PR on hold. We don't really know what is the cause of this issue and it seems relatively rare. Not being sure of what happens when trying to delete a resource open in read-only and how to know if it is open makes room for further errors. Thus, manual repair remains preferred for the moment. This is something that will be easily handled in SMAPIv3. |
|
Then I suggest converting this PR back to draft. |
e7801da to
ca5b52d
Compare
66c7c8e to
aaaab27
Compare
9a9dec1 to
4ba4f62
Compare
4ba4f62 to
5e85ea6
Compare
Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>
5e85ea6 to
a53dd00
Compare
No description provided.