Skip to content

Use ACPI event to reboot VM on KVM, and Use 'forced' reboot option to stop and start the VM(s) #4681

Merged
yadvr merged 3 commits intoapache:masterfrom
shapeblue:cs-kvm-libvirt-acpi-reboot-and-forced-reboot
Mar 6, 2021
Merged

Use ACPI event to reboot VM on KVM, and Use 'forced' reboot option to stop and start the VM(s) #4681
yadvr merged 3 commits intoapache:masterfrom
shapeblue:cs-kvm-libvirt-acpi-reboot-and-forced-reboot

Conversation

@sureshanaparti
Copy link
Copy Markdown
Contributor

@sureshanaparti sureshanaparti commented Feb 11, 2021

Description

Use ACPI event to reboot VM on KVM, and Use 'forced' reboot option to stop and start the VM(s).

This PR addresses the following:

  • Replaces the legacy stop/start logic to reboot VM on KVM, with libvirt's native reboot operation that will use ACPI event based rebooting.
  • Introduces new 'forced' parameter in the rebootVirtualMachine API, which facilitates to stop and start the user VM.
  • Introduces new 'forced' parameter in rebootSystemVm API, to stop and start the system VM.
  • Introduces new 'forced' parameter in rebootRouter API, to force stop and start the Router.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

Reboot Dialog with Force option in the UI:

RebootDialogWithForceOption_UI

How Has This Been Tested?

  • Deployed a VM (on KVM / VMware 6.7.0 / XenServer 7.0), and Performed normal Reboot.
  • Deployed a VM (on KVM / VMware 6.7.0 / XenServer 7.0), and Performed force Reboot.
  • Marvin Tests for User VM, System VM and Router force reboot.

@sureshanaparti
Copy link
Copy Markdown
Contributor Author

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@yadvr yadvr added this to the 4.15.1.0 milestone Feb 11, 2021
show: (record) => { return ['Running'].includes(record.state) },
args: (record, store) => {
var fields = []
fields.push('forced')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sureshanaparti what about systemvms, VR?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sureshanaparti what about systemvms, VR?

The reboot for systemVMs and VR are performed using rebootSystemVm and rebootRouter APIs respectively, should add support for these as well ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 'forced' reboot option for System VM and Router, as well. Existing reboot router operation already stops and starts the router, so the new 'forced' flag, would force stop and then start the router.

}
UserVm userVm = rebootVirtualMachine(CallContext.current().getCallingUserId(), vmId, enterSetup == null ? false : cmd.getBootIntoSetup());

UserVm userVm = rebootVirtualMachine(CallContext.current().getCallingUserId(), vmId, enterSetup == null ? false : cmd.getBootIntoSetup(), cmd.isForced());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sureshanaparti does forced parameter cause it to perform a stop/start at orchestration level?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes @rhtyd


if (forced) {
Host vmOnHost = _hostDao.findById(vm.getHostId());
if (vmOnHost == null || vmOnHost.getResourceState() != ResourceState.Enabled || vmOnHost.getStatus() != Status.Up ) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sureshanaparti is this a new behaviour, did it use to throw exception/fail when host is not enabled or up without forced=true?

Copy link
Copy Markdown
Contributor Author

@sureshanaparti sureshanaparti Feb 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhtyd Start VM can fail (or) either pick another host when the VM's host is not Enabled / not Up. As the force reboot stops and starts the VM, there are chances that start VM would fail in that case, and so it's safe to not allow when force reboot when host is not Enabled / not Up.

Copy link
Copy Markdown
Member

@yadvr yadvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments, testing required, otherwise LGTM.

@yadvr
Copy link
Copy Markdown
Member

yadvr commented Feb 11, 2021

@sureshanaparti is rebooting VM with forced=true|false covered in VM lifecycle smoketests?

@yadvr yadvr modified the milestones: 4.15.1.0, 4.16.0.0 Feb 11, 2021
@blueorangutan
Copy link
Copy Markdown

Packaging result: ✔centos7 ✔centos8 ✔debian. JID-2675

@sureshanaparti
Copy link
Copy Markdown
Contributor Author

@sureshanaparti is rebooting VM with forced=true|false covered in VM lifecycle smoketests?

no, existing test(s) should cover forced=false, will add test(s) for forced=true.

@sureshanaparti sureshanaparti marked this pull request as draft February 12, 2021 05:10
@sureshanaparti sureshanaparti changed the title Use ACPI event to reboot VM on KVM, and Use 'forced' reboot option to stop and start the VM Use ACPI event to reboot VM on KVM, and Use 'forced' reboot option to stop and start the VM(s) Feb 12, 2021
@sureshanaparti sureshanaparti force-pushed the cs-kvm-libvirt-acpi-reboot-and-forced-reboot branch from 68f6906 to 5c2c8c9 Compare February 12, 2021 05:33
@sureshanaparti
Copy link
Copy Markdown
Contributor Author

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result: ✔centos7 ✔centos8 ✔debian. JID-2679

@sureshanaparti sureshanaparti self-assigned this Feb 12, 2021
…ent, and Added 'forced' reboot option to stop and start the VM (using rebootVirtualMachine API)
- New parameter 'forced' in rebootSystemVm API, to stop and then start System VM
- New parameter 'forced' in rebootRouter API, to force stop and then start Router
@sureshanaparti sureshanaparti force-pushed the cs-kvm-libvirt-acpi-reboot-and-forced-reboot branch from 5c2c8c9 to c66d297 Compare February 17, 2021 06:11
@sureshanaparti
Copy link
Copy Markdown
Contributor Author

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@sureshanaparti
Copy link
Copy Markdown
Contributor Author

@sureshanaparti is rebooting VM with forced=true|false covered in VM lifecycle smoketests?

no, existing test(s) should cover forced=false, will add test(s) for forced=true.

@rhtyd Added tests to cover reboot with forced=true, for User VM, System VM and Router.

@sureshanaparti sureshanaparti marked this pull request as ready for review February 17, 2021 06:17
@blueorangutan
Copy link
Copy Markdown

Packaging result: ✖centos7 ✖centos8 ✖debian. JID-2719

@sureshanaparti
Copy link
Copy Markdown
Contributor Author

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result: ✔centos7 ✔centos8 ✔debian. JID-2726

@sureshanaparti
Copy link
Copy Markdown
Contributor Author

@blueorangutan test matrix

@blueorangutan
Copy link
Copy Markdown

@sureshanaparti a Trillian-Jenkins matrix job (centos7 mgmt + xs71, centos7 mgmt + vmware65, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@blueorangutan
Copy link
Copy Markdown

Trillian test result (tid-3566)
Environment: vmware-65u2 (x2), Advanced Networking with Mgmt server 7
Total time taken: 37807 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4681-t3566-vmware-65u2.zip
Intermittent failure detected: /marvin/tests/smoke/test_kubernetes_clusters.py
Intermittent failure detected: /marvin/tests/smoke/test_ssvm.py
Smoke tests completed. 85 look OK, 1 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_03_deploy_and_upgrade_kubernetes_cluster Failure 779.21 test_kubernetes_clusters.py

@yadvr
Copy link
Copy Markdown
Member

yadvr commented Feb 24, 2021

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result: ✔centos7 ✔centos8 ✔debian. JID-2821

@yadvr yadvr requested a review from borisstoyanov February 25, 2021 16:55
@yadvr
Copy link
Copy Markdown
Member

yadvr commented Feb 25, 2021

@borisstoyanov are you lgtm on this?
@blueorangutan test

@blueorangutan
Copy link
Copy Markdown

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link
Copy Markdown

Trillian test result (tid-3619)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 32387 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4681-t3619-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_vm_life_cycle.py
Smoke tests completed. 85 look OK, 1 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_01_migrate_VM_and_root_volume Error 59.21 test_vm_life_cycle.py
test_02_migrate_VM_with_two_data_disks Error 51.15 test_vm_life_cycle.py

@yadvr
Copy link
Copy Markdown
Member

yadvr commented Mar 6, 2021

Bobby has confirmed manual testing. LGTM, let's merge.

@yadvr yadvr merged commit 81dfcbb into apache:master Mar 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants