When the out_chg option is enabled in INPUT for the GPU version of ABACUS, the final step of the SCF calculation takes considerably longer than when the option is disabled. #6931

@LiYuqiii

Describe the bug

While testing GPU parallel efficiency, I observed that for large-scale tasks, enabling out_chg significantly increases the SCF runtime compared with disabling it. The timing data are shown in the figure below.

Image

The screen output shows that the final step of the SCF calculation took an unusually long time. With out_chg enabled, the SCF run with 12,800 random orbital samples on 64 GPUs gave the following output.

================================================================
SELF-CONSISTENT:

DONE(17.2895 SEC) : INIT SCF
ITER ETOT/eV EDIFF/eV DRHO TIME/s
CG1 -2.15838992e+05 0.00000000e+00 1.6106e+02 10.73
CG2 -2.15787655e+05 5.13377187e+01 1.2265e+01 10.25
CG3 -2.15794070e+05 -6.41501186e+00 6.2074e-03 10.28
CG4 -2.15794082e+05 -1.26070361e-02 3.4884e-05 10.31
CG5 -2.15794082e+05 -5.26812500e-05 1.6591e-06 10.25
CG6 -2.15794082e+05 2.61236538e-06 2.1107e-09 174.86

With out_chg disabled, the SCF run with 12,800 random orbital samples on 64 GPUs gave the following output.

================================================================
SELF-CONSISTENT:

DONE(17.3907 SEC) : INIT SCF
ITER ETOT/eV EDIFF/eV DRHO TIME/s
CG1 -2.15837842e+05 0.00000000e+00 1.6108e+02 10.68
CG2 -2.15786452e+05 5.13891162e+01 1.2267e+01 10.16
CG3 -2.15792861e+05 -6.40888612e+00 6.2055e-03 10.17
CG4 -2.15792874e+05 -1.26498832e-02 3.4873e-05 10.22
CG5 -2.15792874e+05 -4.03126567e-05 1.6631e-06 10.22
CG6 -2.15792874e+05 -7.28789952e-06 2.1194e-09 10.07
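
To quantify the gap, the two timing tables above can be compared step by step. The following is a minimal Python sketch and is not part of ABACUS; the helper name `step_times` and the assumption that every SCF row has five whitespace-separated fields starting with `CG<n>` are illustrative, based only on the screen output pasted in this issue.

```python
# Per-step TIME/s comparison of the two SCF logs quoted in this issue.
# The parsing rule (5 fields, first field "CG<n>") is an assumption
# inferred from the pasted output, not an ABACUS-defined format.

LOG_OUT_CHG_ON = """
ITER ETOT/eV EDIFF/eV DRHO TIME/s
CG1 -2.15838992e+05 0.00000000e+00 1.6106e+02 10.73
CG2 -2.15787655e+05 5.13377187e+01 1.2265e+01 10.25
CG3 -2.15794070e+05 -6.41501186e+00 6.2074e-03 10.28
CG4 -2.15794082e+05 -1.26070361e-02 3.4884e-05 10.31
CG5 -2.15794082e+05 -5.26812500e-05 1.6591e-06 10.25
CG6 -2.15794082e+05 2.61236538e-06 2.1107e-09 174.86
"""

LOG_OUT_CHG_OFF = """
ITER ETOT/eV EDIFF/eV DRHO TIME/s
CG1 -2.15837842e+05 0.00000000e+00 1.6108e+02 10.68
CG2 -2.15786452e+05 5.13891162e+01 1.2267e+01 10.16
CG3 -2.15792861e+05 -6.40888612e+00 6.2055e-03 10.17
CG4 -2.15792874e+05 -1.26498832e-02 3.4873e-05 10.22
CG5 -2.15792874e+05 -4.03126567e-05 1.6631e-06 10.22
CG6 -2.15792874e+05 -7.28789952e-06 2.1194e-09 10.07
"""


def step_times(log: str) -> dict[str, float]:
    """Map each SCF step label (e.g. 'CG6') to its TIME/s value."""
    times = {}
    for line in log.splitlines():
        parts = line.split()
        # Keep only rows such as "CG6 <etot> <ediff> <drho> <time>".
        if len(parts) == 5 and parts[0].startswith("CG") and parts[0][2:].isdigit():
            times[parts[0]] = float(parts[4])
    return times


t_on = step_times(LOG_OUT_CHG_ON)
t_off = step_times(LOG_OUT_CHG_OFF)

# Extra wall time per step when out_chg is enabled.
extra = {step: round(t_on[step] - t_off[step], 2) for step in t_on}
total_extra = round(sum(t_on.values()) - sum(t_off.values()), 2)
slowest = max(extra, key=extra.get)

for step, dt in extra.items():
    print(f"{step}: +{dt:.2f} s")
print(f"largest gap: {slowest}, total extra: {total_extra:.2f} s")
```

For the numbers above, steps CG1 to CG5 differ by about 0.1 s or less, while CG6 takes 164.79 s longer with out_chg enabled (174.86 s versus 10.07 s), which accounts for almost all of the 165.16 s difference in total SCF iteration time.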

Expected behavior

To Reproduce

No response

Environment

No response

Additional Context

No response

Labels: Bugs, GPU & DCU & HPC