Fix snapshots garbage collection #4188

Merged: nvazquez merged 2 commits into apache:4.13 from shapeblue:fixgcsnaps on Jul 18, 2020
@nvazquez
Contributor

@nvazquez nvazquez commented Jun 29, 2020

Description

Cleanup orphan entries for primary storage on table snapshot_store_ref

Fixes: #4018 - Problem 1
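
For readers unfamiliar with the problem, here is a minimal sketch of what "orphan entries" could mean here. It is not the code from this PR. It assumes a simplified schema in which `snapshot_store_ref` rows carry a `snapshot_id` and a `store_role`, and `snapshots` rows carry a nullable `removed` timestamp; the function name `cleanup_orphan_primary_refs` and the use of an in-memory SQLite database are purely illustrative.

```python
import sqlite3


def cleanup_orphan_primary_refs(conn):
    """Delete Primary-role refs whose snapshot is missing or marked removed.

    Hypothetical helper: the schema below is a simplified assumption,
    not the real CloudStack table layout.
    """
    cur = conn.execute(
        """
        DELETE FROM snapshot_store_ref
        WHERE store_role = 'Primary'
          AND snapshot_id NOT IN (
              SELECT id FROM snapshots WHERE removed IS NULL
          )
        """
    )
    conn.commit()
    return cur.rowcount


# Build a tiny in-memory database to exercise the cleanup.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE snapshots (id INTEGER PRIMARY KEY, removed TEXT);
    CREATE TABLE snapshot_store_ref (
        id INTEGER PRIMARY KEY,
        snapshot_id INTEGER,
        store_role TEXT
    );
    -- Snapshot 1 is live, snapshot 2 is marked removed, snapshot 3 never existed.
    INSERT INTO snapshots VALUES (1, NULL), (2, '2020-06-29 00:00:00');
    INSERT INTO snapshot_store_ref VALUES
        (10, 1, 'Primary'),
        (11, 2, 'Primary'),
        (12, 2, 'Image'),
        (13, 3, 'Primary');
    """
)

deleted = cleanup_orphan_primary_refs(conn)
remaining = [row[0] for row in conn.execute(
    "SELECT id FROM snapshot_store_ref ORDER BY id"
)]
```

In this sketch only the primary-storage refs for removed or missing snapshots are deleted; the `Image` (secondary storage) ref is left untouched, which mirrors the "for primary storage" scope stated in the description.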

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Screenshots (if appropriate):

How Has This Been Tested?

@nvazquez
Contributor Author

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos7 ✔debian. JID-1493

@nvazquez nvazquez changed the title from "WIP: Fix snapshots garbage collection" to "Fix snapshots garbage collection" Jul 4, 2020
@nvazquez
Contributor Author

nvazquez commented Jul 4, 2020

@blueorangutan test matrix

@blueorangutan

@nvazquez a Trillian-Jenkins matrix job (centos7 mgmt + xs71, centos7 mgmt + vmware67, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@nvazquez nvazquez added this to the 4.15.0.0 milestone Jul 4, 2020
@blueorangutan

Trillian test result (tid-1988)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 44639 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4188-t1988-vmware-67u3.zip
Intermittent failure detected: /marvin/tests/smoke/test_ssvm.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Smoke tests completed. 78 look OK, 1 have error(s)
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_05_rvpc_multi_tiers | Failure | 220.55 | test_vpc_redundant.py |
| test_05_rvpc_multi_tiers | Error | 220.57 | test_vpc_redundant.py |
| ContextSuite context=TestVPCRedundancy>:teardown | Error | 220.58 | test_vpc_redundant.py |

@blueorangutan

Trillian test result (tid-1987)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 50139 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4188-t1987-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_kubernetes_clusters.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Smoke tests completed. 82 look OK, 1 have error(s)
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_04_deploy_and_upgrade_kubernetes_cluster | Failure | 873.79 | test_kubernetes_clusters.py |

@blueorangutan

Trillian test result (tid-1986)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 50672 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4188-t1986-xenserver-71.zip
Intermittent failure detected: /marvin/tests/smoke/test_scale_vm.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Smoke tests completed. 82 look OK, 1 have error(s)
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_scale_vm | Failure | 14.40 | test_scale_vm.py |

Contributor

@borisstoyanov borisstoyanov left a comment

LGTM. I manually checked that, after GC ran, there are no records left in the table.

@nvazquez nvazquez changed the base branch from master to 4.14 July 9, 2020 20:33
@nvazquez nvazquez changed the base branch from 4.14 to master July 9, 2020 20:34
@nvazquez nvazquez marked this pull request as ready for review July 9, 2020 20:35
@nvazquez
Contributor Author

nvazquez commented Jul 9, 2020

Thanks @borisstoyanov. As it's a bug fix completing the fix in #3969, do you agree with targeting this fix to branch 4.13 and forward-merging it to 4.14 and master? cc @rhtyd @DaanHoogland

@nvazquez nvazquez changed the base branch from master to 4.13 July 14, 2020 12:50
@nvazquez
Contributor Author

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos7 ✔debian. JID-1578

@nvazquez
Contributor Author

@blueorangutan test

@blueorangutan

@nvazquez a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@apache apache deleted a comment from blueorangutan Jul 15, 2020
@yadvr
Member

yadvr commented Jul 15, 2020

@blueorangutan test

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2128)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 61821 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4188-t2128-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_public_ip_range.py
Intermittent failure detected: /marvin/tests/smoke/test_reset_vm_on_reboot.py
Intermittent failure detected: /marvin/tests/smoke/test_resource_accounting.py
Intermittent failure detected: /marvin/tests/smoke/test_router_dhcphosts.py
Intermittent failure detected: /marvin/tests/smoke/test_router_dns.py
Intermittent failure detected: /marvin/tests/smoke/test_router_dnsservice.py
Intermittent failure detected: /marvin/tests/smoke/test_routers_iptables_default_policy.py
Intermittent failure detected: /marvin/tests/smoke/test_routers_network_ops.py
Intermittent failure detected: /marvin/tests/smoke/test_routers.py
Intermittent failure detected: /marvin/tests/smoke/test_secondary_storage.py
Intermittent failure detected: /marvin/tests/smoke/test_service_offerings.py
Intermittent failure detected: /marvin/tests/smoke/test_snapshots.py
Intermittent failure detected: /marvin/tests/smoke/test_ssvm.py
Intermittent failure detected: /marvin/tests/smoke/test_templates.py
Intermittent failure detected: /marvin/tests/smoke/test_usage.py
Intermittent failure detected: /marvin/tests/smoke/test_vm_life_cycle.py
Intermittent failure detected: /marvin/tests/smoke/test_vm_snapshots.py
Intermittent failure detected: /marvin/tests/smoke/test_volumes.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_router_nics.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_vpn.py
Intermittent failure detected: /marvin/tests/smoke/test_hostha_kvm.py
Smoke tests completed. 55 look OK, 22 have error(s)
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_02_vpc_privategw_static_routes | Failure | 202.38 | test_privategw_acl.py |
| test_03_vpc_privategw_restart_vpc_cleanup | Failure | 208.31 | test_privategw_acl.py |
| test_04_rvpc_privategw_static_routes | Failure | 265.79 | test_privategw_acl.py |
| ContextSuite context=TestResetVmOnReboot>:setup | Error | 0.00 | test_reset_vm_on_reboot.py |
| ContextSuite context=TestRAMCPUResourceAccounting>:setup | Error | 0.00 | test_resource_accounting.py |
| ContextSuite context=TestRouterDHCPHosts>:setup | Error | 0.00 | test_router_dhcphosts.py |
| ContextSuite context=TestRouterDHCPOpts>:setup | Error | 0.00 | test_router_dhcphosts.py |
| ContextSuite context=TestRouterDns>:setup | Error | 0.00 | test_router_dns.py |
| test_01_sys_vm_start | Failure | 0.07 | test_secondary_storage.py |
| ContextSuite context=TestRouterDnsService>:setup | Error | 0.00 | test_router_dnsservice.py |
| ContextSuite context=TestRouterIpTablesPolicies>:setup | Error | 0.00 | test_routers_iptables_default_policy.py |
| ContextSuite context=TestVPCIpTablesPolicies>:setup | Error | 0.00 | test_routers_iptables_default_policy.py |
| ContextSuite context=TestIsolatedNetworks>:setup | Error | 0.00 | test_routers_network_ops.py |
| ContextSuite context=TestRedundantIsolateNetworks>:setup | Error | 0.00 | test_routers_network_ops.py |
| ContextSuite context=TestRouterServices>:setup | Error | 0.00 | test_routers.py |
| ContextSuite context=TestCpuCapServiceOfferings>:setup | Error | 0.00 | test_service_offerings.py |
| ContextSuite context=TestServiceOfferings>:setup | Error | 0.15 | test_service_offerings.py |
| ContextSuite context=TestSnapshotRootDisk>:setup | Error | 0.00 | test_snapshots.py |
| test_01_list_sec_storage_vm | Failure | 0.03 | test_ssvm.py |
| test_02_list_cpvm_vm | Failure | 0.03 | test_ssvm.py |
| test_03_ssvm_internals | Failure | 0.03 | test_ssvm.py |
| test_04_cpvm_internals | Failure | 0.02 | test_ssvm.py |
| test_05_stop_ssvm | Failure | 0.03 | test_ssvm.py |
| test_06_stop_cpvm | Failure | 0.03 | test_ssvm.py |
| test_07_reboot_ssvm | Failure | 0.03 | test_ssvm.py |
| test_08_reboot_cpvm | Failure | 0.03 | test_ssvm.py |
| test_09_destroy_ssvm | Failure | 0.03 | test_ssvm.py |
| test_10_destroy_cpvm | Failure | 0.03 | test_ssvm.py |
| test_02_create_template_with_checksum_sha1 | Error | 65.34 | test_templates.py |
| test_03_create_template_with_checksum_sha256 | Error | 65.37 | test_templates.py |
| test_04_create_template_with_checksum_md5 | Error | 65.39 | test_templates.py |
| test_05_create_template_with_no_checksum | Error | 65.39 | test_templates.py |
| test_02_deploy_vm_from_direct_download_template | Error | 1.19 | test_templates.py |
| test_03_deploy_vm_wrong_checksum | Error | 1.25 | test_templates.py |
| ContextSuite context=TestTemplates>:setup | Error | 16.65 | test_templates.py |
| ContextSuite context=TestISOUsage>:setup | Error | 0.00 | test_usage.py |
| ContextSuite context=TestLBRuleUsage>:setup | Error | 0.00 | test_usage.py |
| ContextSuite context=TestNatRuleUsage>:setup | Error | 0.00 | test_usage.py |
| ContextSuite context=TestPublicIPUsage>:setup | Error | 0.00 | test_usage.py |
| ContextSuite context=TestSnapshotUsage>:setup | Error | 0.00 | test_usage.py |
| ContextSuite context=TestVmUsage>:setup | Error | 0.00 | test_usage.py |
| ContextSuite context=TestVolumeUsage>:setup | Error | 0.00 | test_usage.py |
| ContextSuite context=TestVpnUsage>:setup | Error | 0.00 | test_usage.py |
| ContextSuite context=Test01DeployVM>:setup | Error | 0.00 | test_vm_life_cycle.py |
| ContextSuite context=Test02VMLifeCycle>:setup | Error | 0.00 | test_vm_life_cycle.py |
| test_14_secure_to_secure_vm_migration | Error | 11.28 | test_vm_life_cycle.py |
| test_15_secured_to_nonsecured_vm_migration | Error | 74.11 | test_vm_life_cycle.py |
| test_16_nonsecured_to_secured_vm_migration | Error | 1.17 | test_vm_life_cycle.py |
| ContextSuite context=TestVmSnapshot>:setup | Error | 1.57 | test_vm_snapshots.py |
| ContextSuite context=TestCreateVolume>:setup | Error | 0.00 | test_volumes.py |
| ContextSuite context=TestVolumes>:setup | Error | 0.00 | test_volumes.py |
| ContextSuite context=TestVPCRedundancy>:setup | Error | 0.00 | test_vpc_redundant.py |
| ContextSuite context=TestVPCNics>:setup | Error | 0.00 | test_vpc_router_nics.py |
| ContextSuite context=TestRVPCSite2SiteVpn>:setup | Error | 0.00 | test_vpc_vpn.py |
| ContextSuite context=TestVPCSite2SiteVPNMultipleOptions>:setup | Error | 0.00 | test_vpc_vpn.py |
| ContextSuite context=TestVpcRemoteAccessVpn>:setup | Error | 0.00 | test_vpc_vpn.py |
| ContextSuite context=TestVpcSite2SiteVpn>:setup | Error | 0.00 | test_vpc_vpn.py |
| test_disable_oobm_ha_state_ineligible | Error | 1511.59 | test_hostha_kvm.py |

@nvazquez
Contributor Author

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos7 ✔debian. JID-1585

@nvazquez
Contributor Author

@blueorangutan test

@blueorangutan

@nvazquez a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2136)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 30014 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4188-t2136-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Smoke tests completed. 76 look OK, 1 have error(s)
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_02_vpc_privategw_static_routes | Failure | 182.99 | test_privategw_acl.py |
| test_03_vpc_privategw_restart_vpc_cleanup | Failure | 183.11 | test_privategw_acl.py |
| test_04_rvpc_privategw_static_routes | Failure | 250.29 | test_privategw_acl.py |

@nvazquez
Contributor Author

nvazquez commented Jul 17, 2020

@rhtyd can you please review this one?

Member

@GabrielBrascher GabrielBrascher left a comment

LGTM based on code review. I did not test it, though.

@nvazquez
Contributor Author

Thanks @GabrielBrascher. @borisstoyanov tested and approved as well. Merging it.

@nvazquez nvazquez merged commit f843c53 into apache:4.13 Jul 18, 2020
@nvazquez nvazquez deleted the fixgcsnaps branch July 18, 2020 17:13
shwstppr pushed a commit to shapeblue/cloudstack that referenced this pull request Jul 20, 2020
* Cleanup orphan entries from snapshot store ref for primary storage

* Add debug message
Pearl1594 pushed a commit to shapeblue/cloudstack that referenced this pull request Jul 27, 2020
* Cleanup orphan entries from snapshot store ref for primary storage

* Add debug message
Development

Successfully merging this pull request may close these issues.

Snapshots GC from DB and XenServer Primary Storage garbage