Add cpu speed detection methods #9762

Merged
Pearl1594 merged 1 commit into apache:4.20 from CLDIN:cpu-speed-detection on Feb 19, 2025

Contributor

@BartJM BartJM commented Oct 3, 2024

Description

This PR adds two additional methods to detect CPU speed on KVM hosts. This will improve speed detection on AMD EPYC CPUs. For CPUs where the GHz value is in the model name, no change will occur. For other CPUs, the detected CPU speed can change to the max MHz of the CPU.

  1. A match on the CPU max MHz value from lscpu
  2. An additional sysfs file scaling_max_freq
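
The two lookups listed above can be sketched roughly as follows. This is a Python illustration only, not the Java code merged into KVMHostInfo.java. The function names and the regular expression are hypothetical, and the sketch assumes that scaling_max_freq holds an integer in kHz, as the Linux cpufreq sysfs interface documents.

```python
import re
from typing import Optional

# Hypothetical pattern: the expression used in the PR may differ.
# Leading whitespace is tolerated because newer lscpu versions indent some fields.
_LSCPU_MAX_MHZ = re.compile(
    r"^\s*CPU max MHz:\s*([0-9]+(?:[.,][0-9]+)?)", re.MULTILINE
)


def mhz_from_lscpu(lscpu_output: str) -> Optional[int]:
    """Return the 'CPU max MHz' value from lscpu text, or None if the line is absent."""
    match = _LSCPU_MAX_MHZ.search(lscpu_output)
    if match is None:
        return None
    # Some locales print a decimal comma; normalise before converting.
    return int(float(match.group(1).replace(",", ".")))


def mhz_from_scaling_max_freq(file_contents: str) -> Optional[int]:
    """Convert the contents of a scaling_max_freq file (assumed kHz) to MHz."""
    try:
        khz = int(file_contents.strip())
    except ValueError:
        return None
    if khz <= 0:
        return None
    return khz // 1000
```

Both helpers take text rather than running lscpu or opening the sysfs file themselves, so they can be exercised on any machine; a caller would feed them the command output and the file contents.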

Fixes: #6914

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

How Has This Been Tested?

Tested on a KVM host with an AMD EPYC 7601 CPU.

  • With a normal agent start, the CPU speed is detected as the expected 2200 MHz.
  • Removed the CPU max MHz line from the lscpu output and restarted the agent. The detected speed is still the expected 2200 MHz.

On a KVM CentOS 8 VM where neither lscpu pattern matches and neither file is present, the agent still falls back on host capabilities.
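
The fallback behaviour described in these test notes amounts to trying each source in turn and using the first one that yields a value. A minimal Python sketch of that pattern follows. The function name, the detector callables, and their ordering are hypothetical; the actual order and method names in KVMHostInfo.java are not shown in this conversation.

```python
from typing import Callable, Iterable, Optional


def first_detected_speed(
    detectors: Iterable[Callable[[], Optional[int]]],
    fallback_mhz: int,
) -> int:
    """Return the first positive speed (MHz) reported by a detector.

    Each detector returns an int in MHz, or None when its source is
    unavailable (for example, no matching lscpu line or a missing file).
    If every detector fails, the fallback value is returned; in the PR's
    terms that would be the speed reported by host capabilities.
    """
    for detect in detectors:
        speed = detect()
        if speed is not None and speed > 0:
            return speed
    return fallback_mhz
```

For instance, `first_detected_speed([lambda: None, lambda: 2200], fallback_mhz=2000)` skips the first source and returns the second source's value, while a list of detectors that all return None yields the fallback.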

Contributor

@sureshanaparti sureshanaparti left a comment

clgtm

@sureshanaparti
Contributor

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

codecov bot commented Oct 3, 2024

Codecov Report

Attention: Patch coverage is 68.18182% with 7 lines in your changes missing coverage. Please review.

Project coverage is 15.78%. Comparing base (019f2c6) to head (78a981f).
Report is 206 commits behind head on 4.20.

Files with missing lines | Patch % | Lines
...org/apache/cloudstack/utils/linux/KVMHostInfo.java | 68.18% | 7 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##               4.20    #9762    +/-   ##
==========================================
  Coverage     15.78%   15.78%            
- Complexity    12564    12565     +1     
==========================================
  Files          5627     5627            
  Lines        492250   492261    +11     
  Branches      61405    62190   +785     
==========================================
+ Hits          77710    77718     +8     
- Misses       406066   406070     +4     
+ Partials       8474     8473     -1     
Flag | Coverage Δ
uitests | 4.04% <ø> (ø)
unittests | 16.60% <68.18%> (+<0.01%) ⬆️

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11266

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm

Contributor

@JoaoJandre JoaoJandre left a comment

LGTM, did not test it

@sureshanaparti
Contributor

@blueorangutan test

@blueorangutan

@sureshanaparti a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-11619)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 61373 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9762-t11619-kvm-ol8.zip
Smoke tests completed. 140 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File
test_01_secure_vm_migration | Error | 134.25 | test_vm_life_cycle.py
test_01_secure_vm_migration | Error | 134.25 | test_vm_life_cycle.py

Contributor

@BryanMLima BryanMLima left a comment

@DaanHoogland @BartJM it looks like some unwanted commits were added to this PR, probably due to the force push to main. @DaanHoogland, could you take a look at this?

@DaanHoogland
Contributor

> @DaanHoogland @BartJM it looks like some unwanted commits were added to this PR, probably due to the force push to main. @DaanHoogland, could you take a look at this?

@BartJM, you want to execute

git rebase --onto main c087de4adfe0db02802ec4fe0929a5b3d6dfba2a 0ceff7f5b4cfdd2b2f26591d933eb112d8cf2329

and force push (git push --force) your branch. Alternatively, start a new branch and git cherry-pick 0ceff7f5b4cfdd2b2f26591d933eb112d8cf2329 on that new branch. Then rename it to replace the branch in this PR, or start a new PR.

Added additional match for lscpu
Added additional file to check
@BartJM BartJM force-pushed the cpu-speed-detection branch from 0ceff7f to 78a981f on October 28, 2024 08:59

Contributor

@BryanMLima BryanMLima left a comment

CLGTM, I did not manually test it.

Contributor

@wido wido left a comment

Code looks good to merge

@weizhouapache
Member

@blueorangutan test ubuntu24 kvm-ubuntu24

@blueorangutan

[SF] Trillian test result (tid-11801)
Environment: kvm-ubuntu22 (x2), Advanced Networking with Mgmt server u22
Total time taken: 57205 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9762-t11801-kvm-ubuntu22.zip
Smoke tests completed. 139 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File
ContextSuite context=TestClusterDRS>:setup | Error | 0.00 | test_cluster_drs.py
test_hostha_enable_ha_when_host_disabled | Error | 3.01 | test_hostha_kvm.py
test_hostha_enable_ha_when_host_in_maintenance | Error | 302.14 | test_hostha_kvm.py

@DaanHoogland
Contributor

@kiranchavala can you check this and see if this fixes #9819?

@kiranchavala
Member

> @kiranchavala can you check this and see if this fixes #9819?

Sure @DaanHoogland I will take a look

@Pearl1594
Contributor

@BartJM could you rebase this onto 4.20, so that we can have this in the 4.20.1 release? Thanks.

@Pearl1594 Pearl1594 changed the base branch from main to 4.20 on February 14, 2025 14:24

@Pearl1594
Contributor

@blueorangutan package

@Pearl1594
Contributor

@blueorangutan package

@blueorangutan

@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12469

@apache apache deleted a comment from blueorangutan Feb 15, 2025 (10 times)

@Pearl1594
Contributor

@blueorangutan test

@blueorangutan

@Pearl1594 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-12468)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 55829 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9762-t12468-kvm-ol8.zip
Smoke tests completed. 139 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test | Result | Time (s) | Test File
test_11_isolated_network_with_dynamic_routed_mode | Error | 2.33 | test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode | Error | 3.47 | test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode | Error | 3.48 | test_ipv4_routing.py
test_06_purge_expunged_vm_background_task | Failure | 391.44 | test_purge_expunged_vms.py

@DaanHoogland
Contributor

> [SF] Trillian test result (tid-12468)
> Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
> Total time taken: 55829 seconds
> Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9762-t12468-kvm-ol8.zip
> Smoke tests completed. 139 look OK, 2 have errors, 0 did not run
> Only failed and skipped tests results shown below:
>
> Test | Result | Time (s) | Test File
> test_11_isolated_network_with_dynamic_routed_mode | Error | 2.33 | test_ipv4_routing.py
> test_12_vpc_and_tier_with_dynamic_routed_mode | Error | 3.47 | test_ipv4_routing.py
> test_12_vpc_and_tier_with_dynamic_routed_mode | Error | 3.48 | test_ipv4_routing.py
> test_06_purge_expunged_vm_background_task | Failure | 391.44 | test_purge_expunged_vms.py

@Pearl1594, cc @kiranchavala, these errors seem to have been consistent on 4.20 lately. Can we merge this?

@JoaoJandre
Contributor

> @Pearl1594, cc @kiranchavala, these errors seem to have been consistent on 4.20 lately. Can we merge this?

@DaanHoogland I have seen these exact failures in other PRs. I think we are safe to merge here.

@Pearl1594 Pearl1594 merged commit ee32f4c into apache:4.20 Feb 19, 2025
@Pearl1594 Pearl1594 moved this to Done in ACS 4.20.1 Mar 17, 2025
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Jun 19, 2025
Added additional match for lscpu
Added additional file to check
Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Hosts marked unavailable due to different CPU frequencies using host capabilities

10 participants