Repair diskless resource on sub-zero volume allocated size #88

Millefeuille42 · 2025-06-30T10:57:35Z

No description provided.

Wescoeur · 2025-07-02T14:53:10Z

drivers/linstorvolumemanager.py

+        self._linstor.resource_auto_place(
+            rsc_name=resource.name,
+            place_count=1,
+            diskless_type=linstor.consts.FLAG_DRBD_DISKLESS
+        )


Must be recreated on the correct node, otherwise you change the behavior on the caller side.

Based on what I read on the linstor docs, I'd say something like this would be appropriate. Is it ?

self._linstor.resource_create( rscs=[ linstor.linstorapi.ResourceData( node_name=resource.node_name, rsc_name=resource.name, storage_pool=volume.storage_pool_name, diskless=linstor.consts.FLAG_DISKLESS in resource.flags, drbd_diskless=linstor.consts.FLAG_DRBD_DISKLESS in resource.flags, nvme_initiator=linstor.consts.FLAG_NVME_INITIATOR in resource.flags, ebs_initiator=linstor.consts.FLAG_EBS_INITIATOR in resource.flags ) ] )

Suggestion added in 6b32030

Wescoeur · 2025-07-02T15:00:33Z

drivers/linstorvolumemanager.py

+            place_count=1,
+            diskless_type=linstor.consts.FLAG_DRBD_DISKLESS
+        )
+        self._mark_resource_cache_as_dirty()


Are you sure there is no issue for caller with for resource in self._get_resource_cache().resources code? 😄

Here in current usage I break the parent loop with a return, however for now it is up to the caller to handle the fact that this function implicitly modifies shared state. I see two options:

Document this clearly in a docstring, specifying that it mutates the resource cache

Move cache invalidation responsibility to the caller

I'd prefer the first option since as I know there is little to no point to not invalidate the cache afterwards ? 🤔

This has been changed to a scan happening before the loop in 9a9dec1. So this is not an issue anymore.

Wescoeur · 2025-07-02T15:02:01Z

drivers/linstorvolumemanager.py

+                        return util.retry(
+                            lambda: self.allocated_volume_size,
+                            maxretry=1,
+                            period=1
+                        )
+                    except LinstorVolumeManagerError as e:
+                        raise e


I don't understand this code. 😅 Maybe we should talk about it in a call.

See answers below, the util.retry was unnecessary and removed in bae7edb, meanwhile the except->raise is there to prevent nested exceptions of the same type.

Wescoeur · 2025-07-02T15:02:25Z

drivers/linstorvolumemanager.py

+                        except LinstorVolumeManagerError as e:
+                            raise e


Since the function will call itself if a LinstorVolumeManagerError is raised, that exception can be caught again below, unnecessarily nesting exceptions of the same type. This might be a bit of overkill though, so I can remove it if you see no use in it.

Wescoeur · 2025-07-02T15:03:12Z

drivers/linstorvolumemanager.py

+                            return util.retry(
+                                lambda: self._get_volumes_info(volume_name),
+                                maxretry=1,
+                                period=1
+                            )


Why a retry, and with a maxretry of 1 and a return here?

Might have mixed up stuff because I'm not sure of why I did this in the first place. I have fixed this in bae7edb. However this brings the eventuality of a loop if the resource is gets broken in the meantime.

Wescoeur · 2025-07-08T09:26:10Z

Can we delete a diskless that is open in read-only mode? Does DRBD prevent deletion if it is open?
Can we detect if a diskless is open (without using /sys/kernel/debug/)?

Millefeuille42 · 2025-07-17T09:31:24Z

Following issues that @Wescoeur noted we decided to have this PR on hold. We don't really know what is the cause of this issue and it seems relatively rare.

Not being sure of what happens when trying to delete a resource open in read-only and how to know if it is open makes room for further errors. Thus, manual repair remains preferred for the moment.

This is something that will be easily handled in SMAPIv3.

stormi · 2025-07-17T10:09:20Z

Then I suggest converting this PR back to draft.

Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>

Millefeuille42 requested review from Nambrok and Wescoeur June 30, 2025 10:57

Millefeuille42 self-assigned this Jun 30, 2025

Wescoeur requested changes Jul 2, 2025

View reviewed changes

Millefeuille42 marked this pull request as draft July 17, 2025 10:18

Wescoeur force-pushed the 3.2.12-8.3 branch 3 times, most recently from e7801da to ca5b52d Compare August 28, 2025 15:32

Wescoeur force-pushed the 3.2.12-8.3 branch 2 times, most recently from 66c7c8e to aaaab27 Compare December 1, 2025 23:24

Millefeuille42 force-pushed the mlr-1037-robustify-linstor-diskless-path-recreation branch from 9a9dec1 to 4ba4f62 Compare December 8, 2025 10:04

Wescoeur force-pushed the 3.2.12-8.3 branch from 2cbab06 to d7cd0b1 Compare December 9, 2025 18:04

Millefeuille42 force-pushed the mlr-1037-robustify-linstor-diskless-path-recreation branch from 4ba4f62 to 5e85ea6 Compare December 10, 2025 09:55

fix: repair diskless resource on sub-zero allocated size

a53dd00

Signed-off-by: Mathieu Labourier <mathieu.labourier@vates.tech>

Millefeuille42 force-pushed the mlr-1037-robustify-linstor-diskless-path-recreation branch from 5e85ea6 to a53dd00 Compare December 10, 2025 10:00

Repair diskless resource on sub-zero volume allocated size #88

Are you sure you want to change the base?

Repair diskless resource on sub-zero volume allocated size #88

Conversation

Millefeuille42 commented Jun 30, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Millefeuille42 Jul 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Wescoeur commented Jul 8, 2025

Uh oh!

Millefeuille42 commented Jul 17, 2025

Uh oh!

stormi commented Jul 17, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Millefeuille42 Jul 17, 2025 •

edited

Loading