Fix NPE in NASBackupProvider when no running KVM host is available #12805
sureshanaparti merged 3 commits into apache:4.22
ResourceManager.findOneRandomRunningHostByHypervisor() can return null when no KVM host in the zone has status=Up (e.g. during management server startup, brief agent disconnections, or host state transitions). NASBackupProvider.syncBackupStorageStats() and deleteBackup() call host.getId() without a null check, causing a NullPointerException that crashes the entire BackupSyncTask background job every sync interval.

This adds null checks in both methods:
- syncBackupStorageStats: log a warning and return early
- deleteBackup: throw CloudRuntimeException with a descriptive message
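
For readers outside the codebase, the two guard styles described above can be sketched in isolation. This is a minimal illustration, not the actual CloudStack code: `Host`, `HostFinder` and the class name are hypothetical stand-ins for the real `Host` and `ResourceManager` types, `IllegalStateException` stands in for `CloudRuntimeException`, and `System.err` stands in for the logger.

```java
public class NullHostGuardSketch {

    // Hypothetical stand-in for CloudStack's Host; only getId() is modelled.
    static final class Host {
        private final long id;
        Host(long id) { this.id = id; }
        long getId() { return id; }
    }

    // Hypothetical stand-in for ResourceManager: returns null when no KVM host is Up.
    interface HostFinder {
        Host findOneRandomRunningHost(long zoneId);
    }

    // Background path: warn and return early so the periodic job keeps running.
    // Returns true when the sync would proceed, false when it was skipped.
    static boolean syncBackupStorageStats(HostFinder finder, long zoneId) {
        Host host = finder.findOneRandomRunningHost(zoneId);
        if (host == null) {
            System.err.println("Unable to find a running KVM host in zone " + zoneId
                    + " to sync backup storage stats");
            return false;
        }
        host.getId(); // safe to dereference from here on
        return true;
    }

    // Caller-facing path: fail with a descriptive exception instead of an NPE.
    // IllegalStateException is an assumption standing in for CloudRuntimeException.
    static long deleteBackup(HostFinder finder, long zoneId, String backupUuid) {
        Host host = finder.findOneRandomRunningHost(zoneId);
        if (host == null) {
            throw new IllegalStateException(String.format(
                    "Unable to find a running KVM host in zone %d to delete backup %s",
                    zoneId, backupUuid));
        }
        return host.getId();
    }

    public static void main(String[] args) {
        HostFinder noHost = zoneId -> null; // simulates a zone with no host in Up state
        System.out.println("sync proceeded: " + syncBackupStorageStats(noHost, 1L));
        try {
            deleteBackup(noHost, 1L, "example-backup-uuid");
        } catch (IllegalStateException e) {
            System.out.println("delete failed: " + e.getMessage());
        }
    }
}
```

The split reflects the intent stated in the description: the periodic sync can simply try again on the next interval, while a delete request needs to report the failure to whoever issued it.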

@blueorangutan package

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #12805      +/-   ##
============================================
- Coverage     17.61%   17.60%   -0.01%
- Complexity    15662    15674      +12
============================================
  Files          5917     5917
  Lines        531415   531606     +191
  Branches      64973    64996      +23
============================================
+ Hits          93588    93601      +13
- Misses       427271   427446     +175
- Partials      10556    10559       +3
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 17121

@jmsperu, probably due to the rebase: seems an import is missing.

Good catch @DaanHoogland, the CollectionUtils import was dropped during the rebase. Fixed now, should build cleanly.

@blueorangutan package

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17198

@blueorangutan test

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests
Pull request overview
Fixes a crash in the NAS backup provider’s background sync/deletion flows when no running KVM host is available in a zone (i.e., ResourceManager.findOneRandomRunningHostByHypervisor(...) returns null), preventing BackupSyncTask from dying due to NullPointerException.
Changes:
- Add a null-host check in deleteBackup() and fail with a descriptive CloudRuntimeException.
- Add early returns in syncBackupStorageStats() when there are no repositories and when no running KVM host can be found (with a warning log).
plugins/backup/nas/src/main/java/org/apache/cloudstack/backup/NASBackupProvider.java
// deleteBackup(): fail with a descriptive exception when no running KVM host is found
if (host == null) {
    throw new CloudRuntimeException(String.format("Unable to find a running KVM host in zone %d to delete backup %s", backup.getZoneId(), backup.getUuid()));
}

// syncBackupStorageStats(): log a warning and skip this sync when no running KVM host is found
final Host host = resourceManager.findOneRandomRunningHostByHypervisor(Hypervisor.HypervisorType.KVM, zoneId);
if (host == null) {
    logger.warn("Unable to find a running KVM host in zone {} to sync backup storage stats", zoneId);
    return;
}

[SF] Trillian test result (tid-15701)

Thanks @DaanHoogland for driving the packaging and smoke tests. All builds green and the 8 test failures are pre-existing (Kubernetes, userdata, private gw ACL), unrelated to this change. Is there anything else needed to get this into 4.22.1?

@DaanHoogland @abh1sar Gentle ping: builds and tests are green. Is this ready to merge into 4.22?

@blueorangutan package

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@jmsperu Please address the comment about import ordering. We'll get this merged after that.

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17264

@blueorangutan package

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17267

Awesome work, congrats on your first merged pull request! |
Rebased onto 4.22 as requested (previously #12680).