StartDate: 2021-12-16 19:38:40+00:00
CpuId: 64x Intel Xeon W 2000 / D-2100 (Skylake / Cascade Lake) {Skylake}, 14nm
CommitSHA: 067e01a6732c64a193f73fcdfac5658fc19303bc
CommitTime: 2021-12-16 19:42:19 +0100
CommitAuthor: Matthias Krack
CommitSubject: Switch to new CRAY-XC50 arch file
Trying to pull image cp2k-toolchain-mpich... success :-)
Trying to pull image cp2k-perf-openmp... image not found.
#################### Building Image cp2k-perf-openmp ####################
Dockerfile: /tools/docker/Dockerfile.test_performance
Build-Args: TOOLCHAIN=gcr.io/cp2k-org-project/img_cp2k-toolchain-mpich-arch-b51:gittree-3be1e94-buildargs-68b329d
Sending build context to Docker daemon 77.31kB
Step 1/9 : ARG TOOLCHAIN=cp2k/toolchain:latest
Step 2/9 : FROM ${TOOLCHAIN}
 ---> bca32ca981f8
Step 3/9 : WORKDIR /workspace
 ---> Running in c9a32ca1e031
Removing intermediate container c9a32ca1e031
 ---> 54d6f5364841
Step 4/9 : COPY ./scripts/install_basics.sh .
 ---> 73da4a9bcfb6
Step 5/9 : RUN ./install_basics.sh
 ---> Running in 215b208e103b
Installing Ubuntu packages... debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package libpopt0:amd64.
(Reading database ... 15383 files and directories currently installed.)
Preparing to unpack .../libpopt0_1.16-14_amd64.deb ...
Unpacking libpopt0:amd64 (1.16-14) ...
Selecting previously unselected package rsync.
Preparing to unpack .../rsync_3.1.3-8ubuntu0.1_amd64.deb ...
Unpacking rsync (3.1.3-8ubuntu0.1) ...
Preparing to unpack .../wget_1.20.3-1ubuntu2_amd64.deb ...
Unpacking wget (1.20.3-1ubuntu2) over (1.20.3-1ubuntu1) ...
Setting up wget (1.20.3-1ubuntu2) ...
Setting up libpopt0:amd64 (1.16-14) ...
Setting up rsync (3.1.3-8ubuntu0.1) ...
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of start.
Processing triggers for libc-bin (2.31-0ubuntu9.2) ...
done.
Cloning cp2k repository... done.
Removing intermediate container 215b208e103b
 ---> 2365187a43fc
Step 6/9 : COPY ./scripts/install_performance.sh .
 ---> 0a6170791220
Step 7/9 : RUN ./install_performance.sh "local"
 ---> Running in 3e9a43a0970d
'./local.pdbg' -> '/opt/cp2k-toolchain/install/arch/local.pdbg'
'./local.psmp' -> '/opt/cp2k-toolchain/install/arch/local.psmp'
'./local.sdbg' -> '/opt/cp2k-toolchain/install/arch/local.sdbg'
'./local.ssmp' -> '/opt/cp2k-toolchain/install/arch/local.ssmp'
'./local_coverage.pdbg' -> '/opt/cp2k-toolchain/install/arch/local_coverage.pdbg'
'./local_static.psmp' -> '/opt/cp2k-toolchain/install/arch/local_static.psmp'
'./local_static.ssmp' -> '/opt/cp2k-toolchain/install/arch/local_static.ssmp'
'./local_warn.psmp' -> '/opt/cp2k-toolchain/install/arch/local_warn.psmp'
Warming cache by trying to compile cp2k... done.
Removing intermediate container 3e9a43a0970d
 ---> 397640b1fe26
Step 8/9 : COPY ./scripts/ci_entrypoint.sh ./scripts/test_performance.sh ./scripts/plot_performance.py ./
 ---> d7736f1340fa
Step 9/9 : CMD ["./ci_entrypoint.sh", "./test_performance.sh", "local"]
 ---> Running in d8242108db1e
Removing intermediate container d8242108db1e
 ---> 8277911857d2
Successfully built 8277911857d2
Successfully tagged gcr.io/cp2k-org-project/img_cp2k-perf-openmp-arch-b51:gittree-a7c81b9-buildargs-407bbea
Pushing image cp2k-perf-openmp... done.
#################### Running Image cp2k-perf-openmp ####################
========== Fetching Git Commit ==========
CommitSHA: 067e01a6732c64a193f73fcdfac5658fc19303bc
CommitTime: 2021-12-16 19:42:19 +0100
CommitAuthor: Matthias Krack
CommitSubject: Switch to new CRAY-XC50 arch file
========== Running Test ==========
========== Compiling CP2K ==========
Compiling cp2k... done.
========== Running Performance Test ==========
Running H2O-64.inp with 1 threads and 32 ranks... done.
Running H2O-64.inp with 32 threads and 1 ranks... done.
From /workspace/artifacts/H2O-64_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                T I M I N G                                  -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.040 0.040 174.039 174.039
qs_mol_dyn_low 1 2.0 0.004 0.004 173.197 173.197
qs_forces 11 3.9 0.002 0.002 173.136 173.136
qs_energies 11 4.9 0.001 0.001 161.230 161.230
scf_env_do_scf 11 5.9 0.001 0.001 132.411 132.411
velocity_verlet 10 3.0 0.002 0.002 120.449 120.449
scf_env_do_scf_inner_loop 108 6.5 0.010 0.010 90.346 90.346
init_scf_loop 11 6.9 0.000 0.000 41.868 41.868
prepare_preconditioner 11 7.9 0.000 0.000 37.547 37.547
make_preconditioner 11 8.9 0.000 0.000 37.547 37.547
make_full_inverse_cholesky 11 9.9 0.000 0.000 35.533 35.533
rebuild_ks_matrix 119 8.3 0.001 0.001 35.415 35.415
qs_ks_build_kohn_sham_matrix 119 9.3 0.020 0.020 35.414 35.414
qs_scf_new_mos 108 7.5 0.001 0.001 33.073 33.073
qs_scf_loop_do_ot 108 8.5 0.001 0.001 33.072 33.072
qs_ks_update_qs_env 119 7.6 0.001 0.001 33.031 33.031
qs_rho_update_rho 119 7.7 0.001 0.001 30.901 30.901
calculate_rho_elec 119 8.7 1.588 1.588 30.900 30.900
ot_scf_mini 108 9.5 0.004 0.004 30.886 30.886
dbcsr_multiply_generic 2286 12.5 0.220 0.220 28.060 28.060
grid_collocate_task_list 119 9.7 24.123 24.123 24.123 24.123
sum_up_and_integrate 119 10.3 0.400 0.400 22.194 22.194
integrate_v_rspace 119 11.3 0.619 0.619 21.794 21.794
cp_fm_cholesky_invert 11 10.9 21.276 21.276 21.276 21.276
ot_mini 108 10.5 0.001 0.001 18.466 18.466
grid_integrate_task_list 119 12.3 18.109 18.109 18.109 18.109
make_m2s 4572 13.5 0.068 0.068 15.379 15.379
init_scf_run 11 5.9 0.001 0.001 13.527 13.527
scf_env_initial_rho_setup 11 6.9 0.001 0.001 13.526 13.526
wfi_extrapolate 11 7.9 0.001 0.001 12.725 12.725
qs_energies_init_hamiltonians 11 5.9 0.000 0.000 11.573 11.573
cp_gemm 81 9.0 0.001 0.001 10.240 10.240
cp_gemm_cosma 81 10.0 10.239 10.239 10.239 10.239
qs_ot_get_derivative 108 11.5 0.002 0.002 10.171 10.171
pw_transfer 1439 11.6 0.101 0.101 9.071 9.071
fft_wrap_pw1pw2 1201 12.6 0.011 0.011 8.684 8.684
make_images 4572 14.5 3.094 3.094 8.408 8.408
ot_diis_step 108 11.5 0.006 0.006 8.290 8.290
cp_fm_cholesky_decompose 22 10.9 7.940 7.940 7.940 7.940
build_core_hamiltonian_matrix_ 11 4.9 0.001 0.001 7.661 7.661
qs_ot_get_p 119 10.4 0.001 0.001 7.659 7.659
dbcsr_copy 2102 12.0 0.366 0.366 7.463 7.463
fft_wrap_pw1pw2_140 487 13.2 0.699 0.699 7.224 7.224
dbcsr_make_dense_low 5837 15.5 0.139 0.139 7.027 7.027
dbcsr_copy_into_existing 22 7.9 6.986 6.986 6.987 6.987
make_dense_data 5837 16.5 5.671 5.671 6.864 6.864
multiply_cannon 2286 13.5 1.193 1.193 6.841 6.841
apply_preconditioner_dbcsr 119 12.6 0.000 0.000 6.759 6.759
apply_single 119 13.6 0.001 0.001 6.758 6.758
dbcsr_complete_redistribute 329 12.2 3.079 3.079 6.558 6.558
dbcsr_make_images_dense 3978 14.8 0.028 0.028 6.250 6.250
qs_env_update_s_mstruct 11 6.9 0.000 0.000 6.076 6.076
qs_ot_p2m_diag 50 11.0 0.225 0.225 5.622 5.622
qs_create_task_list 11 7.9 0.000 0.000 5.512 5.512
generate_qs_task_list 11 8.9 3.733 3.733 5.512 5.512
copy_dbcsr_to_fm 153 11.3 0.004 0.004 5.382 5.382
density_rs2pw 119 9.7 0.007 0.007 5.189 5.189
cp_dbcsr_syevd 50 12.0 0.005 0.005 5.000 5.000
build_core_hamiltonian_matrix 11 6.9 0.001 0.001 4.889 4.889
multiply_cannon_loop 2286 14.5 0.197 0.197 4.849 4.849
cp_fm_diag_elpa 50 13.0 0.000 0.000 4.824 4.824
cp_fm_diag_elpa_base 50 14.0 4.766 4.766 4.823 4.823
pw_poisson_solve 119 10.3 1.980 1.980 4.751 4.751
multiply_cannon_multrec 2286 15.5 4.569 4.569 4.651 4.651
transfer_dbcsr_to_fm 11 10.9 0.000 0.000 4.378 4.378
qs_ks_update_qs_env_forces 11 4.9 0.000 0.000 4.242 4.242
fft3d_s 1202 14.6 4.216 4.216 4.222 4.222
dbcsr_finalize 5186 13.8 0.387 0.387 3.759 3.759
qs_energies_compute_matrix_w 11 5.9 0.000 0.000 3.541 3.541
calculate_w_matrix_ot 11 6.9 0.009 0.009 3.541 3.541
-------------------------------------------------------------------------------
From /workspace/artifacts/H2O-64_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                T I M I N G                                  -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.009 0.015 75.054 75.055
qs_mol_dyn_low 1 2.0 0.005 0.005 74.926 74.931
qs_forces 11 3.9 0.002 0.002 74.834 74.835
qs_energies 11 4.9 0.001 0.002 69.777 69.779
scf_env_do_scf 11 5.9 0.001 0.001 63.470 63.474
scf_env_do_scf_inner_loop 108 6.5 0.003 0.011 58.708 58.710
velocity_verlet 10 3.0 0.002 0.002 45.035 45.037
rebuild_ks_matrix 119 8.3 0.001 0.001 28.777 28.836
qs_ks_build_kohn_sham_matrix 119 9.3 0.022 0.025 28.776 28.835
qs_ks_update_qs_env 119 7.6 0.001 0.001 25.602 25.651
qs_rho_update_rho 119 7.7 0.001 0.001 22.932 22.956
calculate_rho_elec 119 8.7 0.048 0.051 22.931 22.955
sum_up_and_integrate 119 10.3 0.049 0.052 22.442 22.475
integrate_v_rspace 119 11.3 0.005 0.005 22.393 22.425
dbcsr_multiply_generic 2286 12.5 0.136 0.138 18.007 18.139
grid_collocate_task_list 119 9.7 15.897 16.511 15.897 16.511
grid_integrate_task_list 119 12.3 15.792 16.118 15.792 16.118
qs_scf_new_mos 108 7.5 0.001 0.001 14.822 14.862
qs_scf_loop_do_ot 108 8.5 0.001 0.001 14.822 14.861
ot_scf_mini 108 9.5 0.003 0.004 13.880 13.911
multiply_cannon 2286 13.5 0.224 0.227 11.749 12.092
multiply_cannon_loop 2286 14.5 0.228 0.235 10.546 10.896
mp_waitall_1 169478 16.3 8.837 9.233 8.837 9.233
ot_mini 108 10.5 0.001 0.001 8.305 8.336
rs_pw_transfer 974 11.9 0.017 0.018 7.254 8.145
density_rs2pw 119 9.7 0.009 0.011 6.405 7.297
pw_transfer 1439 11.6 0.136 0.144 6.134 6.188
multiply_cannon_metrocomm3 18288 15.5 0.084 0.086 5.647 5.981
fft_wrap_pw1pw2 1201 12.6 0.015 0.016 5.844 5.900
potential_pw2rs 119 12.3 0.010 0.011 5.308 5.320
fft_wrap_pw1pw2_140 487 13.2 0.579 0.605 5.031 5.181
init_scf_loop 11 6.9 0.000 0.001 4.740 4.742
fft3d_ps 1201 14.6 2.353 2.501 4.357 4.430
init_scf_run 11 5.9 0.000 0.002 4.333 4.333
scf_env_initial_rho_setup 11 6.9 0.000 0.001 4.332 4.333
qs_ot_get_derivative 108 11.5 0.001 0.002 4.203 4.232
make_m2s 4572 13.5 0.077 0.079 4.082 4.155
ot_diis_step 108 11.5 0.005 0.006 4.061 4.063
apply_preconditioner_dbcsr 119 12.6 0.000 0.000 3.926 3.981
apply_single 119 13.6 0.001 0.001 3.925 3.981
wfi_extrapolate 11 7.9 0.001 0.001 3.954 3.954
multiply_cannon_multrec 18288 15.5 3.668 3.776 3.686 3.794
mp_waitany 9880 13.7 2.552 3.462 2.552 3.462
make_images 4572 14.5 0.190 0.194 3.364 3.430
qs_ks_update_qs_env_forces 11 4.9 0.000 0.000 3.405 3.416
rs_pw_transfer_RS2PW_140 130 11.5 0.577 0.608 2.363 3.264
rs_pw_transfer_PW2RS_140 130 13.9 1.242 1.301 2.614 2.649
mp_alltoall_d11v 2130 13.8 1.592 1.998 1.592 1.998
qs_ot_get_p 119 10.4 0.001 0.001 1.938 1.980
prepare_preconditioner 11 7.9 0.000 0.000 1.722 1.734
make_preconditioner 11 8.9 0.000 0.000 1.722 1.734
rs_gather_matrices 119 12.3 0.137 0.146 1.238 1.682
make_images_data 4572 15.5 0.063 0.069 1.473 1.599
make_full_inverse_cholesky 11 9.9 0.000 0.000 1.543 1.578
build_core_hamiltonian_matrix_ 11 4.9 0.001 0.001 1.475 1.578
-------------------------------------------------------------------------------
Plot: name="H2O-64_timings_32omp", title="Timings of H2O-64 with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="H2O-64_timings_32omp", name="rest", label="rest", y=87.78299999999999, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="grid_collocate_task_list", label="grid_collocate_task_list", y=24.123, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="cp_fm_cholesky_invert", label="cp_fm_cholesky_invert", y=21.276, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="grid_integrate_task_list", label="grid_integrate_task_list", y=18.109, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="cp_gemm_cosma", label="cp_gemm_cosma", y=10.239, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="cp_fm_cholesky_decompose", label="cp_fm_cholesky_decompose", y=7.94, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=4.569, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="mp_waitall_1", label="mp_waitall_1", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="mp_waitany", label="mp_waitany", y=0.0, yerr=0.0
Plot: name="H2O-64_timings_32mpi", title="Timings of H2O-64 with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="H2O-64_timings_32mpi", name="rest", label="rest", y=28.308, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="grid_collocate_task_list", label="grid_collocate_task_list", y=15.897, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="cp_fm_cholesky_invert", label="cp_fm_cholesky_invert", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="grid_integrate_task_list", label="grid_integrate_task_list", y=15.792, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="cp_gemm_cosma", label="cp_gemm_cosma", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="cp_fm_cholesky_decompose", label="cp_fm_cholesky_decompose", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=3.668, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="mp_waitall_1", label="mp_waitall_1", y=8.837, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="mp_waitany", label="mp_waitany", y=2.552, yerr=0.0
Running H2O-64_nonortho.inp with 1 threads and 32 ranks... done.
Running H2O-64_nonortho.inp with 32 threads and 1 ranks... done.
From /workspace/artifacts/H2O-64_nonortho_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                T I M I N G                                  -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.042 0.042 215.292 215.292
qs_mol_dyn_low 1 2.0 0.004 0.004 214.432 214.432
qs_forces 11 3.9 0.002 0.002 214.373 214.373
qs_energies 11 4.9 0.001 0.001 200.052 200.052
scf_env_do_scf 11 5.9 0.001 0.001 167.679 167.679
velocity_verlet 10 3.0 0.002 0.002 145.897 145.897
scf_env_do_scf_inner_loop 96 6.5 0.009 0.009 124.185 124.185
rebuild_ks_matrix 107 8.3 0.001 0.001 62.310 62.310
qs_ks_build_kohn_sham_matrix 107 9.3 0.018 0.018 62.309 62.309
qs_ks_update_qs_env 107 7.6 0.001 0.001 56.011 56.011
qs_rho_update_rho 107 7.7 0.001 0.001 55.877 55.877
calculate_rho_elec 107 8.7 1.411 1.411 55.876 55.876
sum_up_and_integrate 107 10.3 0.355 0.355 51.070 51.070
integrate_v_rspace 107 11.3 0.539 0.539 50.716 50.716
grid_collocate_task_list 107 9.7 50.140 50.140 50.140 50.140
grid_integrate_task_list 107 12.3 47.756 47.756 47.756 47.756
init_scf_loop 11 6.9 0.000 0.000 43.292 43.292
prepare_preconditioner 11 7.9 0.000 0.000 35.800 35.800
make_preconditioner 11 8.9 0.000 0.000 35.800 35.800
make_full_inverse_cholesky 11 9.9 0.000 0.000 33.780 33.780
qs_scf_new_mos 96 7.5 0.001 0.001 24.995 24.995
qs_scf_loop_do_ot 96 8.5 0.001 0.001 24.994 24.994
ot_scf_mini 96 9.5 0.003 0.003 23.237 23.237
dbcsr_multiply_generic 1966 12.4 0.177 0.177 21.541 21.541
cp_fm_cholesky_invert 11 10.9 19.893 19.893 19.893 19.893
init_scf_run 11 5.9 0.001 0.001 16.374 16.374
scf_env_initial_rho_setup 11 6.9 0.001 0.001 16.373 16.373
wfi_extrapolate 11 7.9 0.001 0.001 15.331 15.331
ot_mini 96 10.5 0.001 0.001 13.898 13.898
qs_energies_init_hamiltonians 11 5.9 0.000 0.000 12.173 12.173
make_m2s 3932 13.4 0.058 0.058 11.698 11.698
cp_gemm 81 9.0 0.000 0.000 10.187 10.187
cp_gemm_cosma 81 10.0 10.187 10.187 10.187 10.187
qs_ks_update_qs_env_forces 11 4.9 0.000 0.000 7.951 7.951
cp_fm_cholesky_decompose 22 10.9 7.670 7.670 7.670 7.670
qs_ot_get_derivative 96 11.5 0.001 0.001 7.645 7.645
pw_transfer 1295 11.6 0.090 0.090 7.449 7.449
fft_wrap_pw1pw2 1081 12.6 0.010 0.010 7.140 7.140
qs_env_update_s_mstruct 11 6.9 0.000 0.000 7.124 7.124
qs_create_task_list 11 7.9 0.000 0.000 6.561 6.561
generate_qs_task_list 11 8.9 4.791 4.791 6.561 6.561
dbcsr_complete_redistribute 317 12.2 3.003 3.003 6.560 6.560
build_core_hamiltonian_matrix_ 11 4.9 0.001 0.001 6.367 6.367
make_images 3932 14.4 2.344 2.344 6.354 6.354
ot_diis_step 96 11.5 0.005 0.005 6.248 6.248
fft_wrap_pw1pw2_140 439 13.2 0.692 0.692 6.080 6.080
qs_ot_get_p 107 10.4 0.001 0.001 5.843 5.843
dbcsr_copy 1855 11.9 0.299 0.299 5.600 5.600
multiply_cannon 1966 13.4 0.933 0.933 5.457 5.457
dbcsr_make_dense_low 4961 15.5 0.102 0.102 5.448 5.448
make_dense_data 4961 16.5 4.658 4.658 5.326 5.326
apply_preconditioner_dbcsr 107 12.6 0.000 0.000 5.311 5.311
apply_single 107 13.6 0.000 0.000 5.311 5.311
copy_dbcsr_to_fm 147 11.2 0.004 0.004 5.277 5.277
dbcsr_copy_into_existing 22 7.9 5.224 5.224 5.225 5.225
dbcsr_make_images_dense 3386 14.7 0.023 0.023 4.833 4.833
build_core_hamiltonian_matrix 11 6.9 0.001 0.001 4.444 4.444
qs_ot_p2m_diag 44 11.0 0.196 0.196 4.376 4.376
pw_poisson_solve 107 10.3 1.890 1.890 4.326 4.326
density_rs2pw 107 9.7 0.006 0.006 4.325 4.325
-------------------------------------------------------------------------------
From /workspace/artifacts/H2O-64_nonortho_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                T I M I N G                                  -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.007 0.008 129.650 129.650
qs_mol_dyn_low 1 2.0 0.005 0.006 129.533 129.539
qs_forces 11 3.9 0.002 0.002 129.476 129.476
qs_energies 11 4.9 0.001 0.002 120.560 120.562
scf_env_do_scf 11 5.9 0.001 0.001 111.145 111.146
scf_env_do_scf_inner_loop 96 6.5 0.003 0.009 103.144 103.145
velocity_verlet 10 3.0 0.002 0.002 78.073 78.074
rebuild_ks_matrix 107 8.3 0.001 0.001 59.189 59.238
qs_ks_build_kohn_sham_matrix 107 9.3 0.021 0.022 59.188 59.237
sum_up_and_integrate 107 10.3 0.042 0.045 53.403 53.453
integrate_v_rspace 107 11.3 0.004 0.005 53.360 53.409
qs_ks_update_qs_env 107 7.6 0.001 0.002 52.115 52.159
qs_rho_update_rho 107 7.7 0.001 0.001 49.815 49.828
calculate_rho_elec 107 8.7 0.043 0.046 49.814 49.827
grid_integrate_task_list 107 12.3 46.230 48.073 46.230 48.073
grid_collocate_task_list 107 9.7 42.660 44.239 42.660 44.239
dbcsr_multiply_generic 1966 12.4 0.118 0.122 15.625 15.843
qs_scf_new_mos 96 7.5 0.001 0.001 12.487 12.539
qs_scf_loop_do_ot 96 8.5 0.001 0.001 12.486 12.538
ot_scf_mini 96 9.5 0.003 0.003 11.690 11.745
multiply_cannon 1966 13.4 0.193 0.199 10.376 10.522
multiply_cannon_loop 1966 14.4 0.199 0.216 9.396 9.680
rs_pw_transfer 878 11.9 0.015 0.016 7.596 8.734
mp_waitall_1 146670 16.2 7.786 8.096 7.786 8.096
init_scf_loop 11 6.9 0.000 0.001 7.983 7.984
density_rs2pw 107 9.7 0.008 0.009 6.609 7.791
init_scf_run 11 5.9 0.000 0.002 7.432 7.432
scf_env_initial_rho_setup 11 6.9 0.000 0.001 7.431 7.432
qs_ks_update_qs_env_forces 11 4.9 0.000 0.000 7.282 7.292
ot_mini 96 10.5 0.001 0.001 6.957 7.013
wfi_extrapolate 11 7.9 0.001 0.001 6.832 6.832
pw_transfer 1295 11.6 0.123 0.131 5.441 5.505
multiply_cannon_metrocomm3 15728 15.4 0.072 0.077 5.014 5.317
fft_wrap_pw1pw2 1081 12.6 0.014 0.015 5.177 5.240
potential_pw2rs 107 12.3 0.009 0.010 4.839 4.848
fft_wrap_pw1pw2_140 439 13.2 0.525 0.546 4.500 4.684
mp_waitany 8968 13.7 3.438 4.626 3.438 4.626
rs_pw_transfer_RS2PW_140 118 11.5 0.446 0.471 3.144 4.278
fft3d_ps 1081 14.6 2.110 2.254 3.822 3.895
mp_alltoall_d11v 1998 13.7 2.540 3.806 2.540 3.806
make_m2s 3932 13.4 0.066 0.068 3.549 3.597
ot_diis_step 96 11.5 0.005 0.005 3.535 3.536
apply_preconditioner_dbcsr 107 12.6 0.000 0.000 3.484 3.535
apply_single 107 13.6 0.001 0.001 3.484 3.535
rs_gather_matrices 107 12.3 0.141 0.154 2.237 3.479
qs_ot_get_derivative 96 11.5 0.001 0.002 3.388 3.442
multiply_cannon_multrec 15728 15.4 3.304 3.393 3.320 3.410
make_images 3932 14.4 0.166 0.173 2.917 2.969
-------------------------------------------------------------------------------
Plot: name="H2O-64_nonortho_timings_32omp", title="Timings of H2O-64_nonortho with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="rest", label="rest", y=79.64600000000002, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="grid_collocate_task_list", label="grid_collocate_task_list", y=50.14, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="grid_integrate_task_list", label="grid_integrate_task_list", y=47.756, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="cp_fm_cholesky_invert", label="cp_fm_cholesky_invert", y=19.893, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="cp_gemm_cosma", label="cp_gemm_cosma", y=10.187, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="cp_fm_cholesky_decompose", label="cp_fm_cholesky_decompose", y=7.67, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="mp_waitany", label="mp_waitany", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="mp_waitall_1", label="mp_waitall_1", y=0.0, yerr=0.0
Plot: name="H2O-64_nonortho_timings_32mpi", title="Timings of H2O-64_nonortho with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="rest", label="rest", y=26.232000000000014, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="grid_collocate_task_list", label="grid_collocate_task_list", y=42.66, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="grid_integrate_task_list", label="grid_integrate_task_list", y=46.23, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="cp_fm_cholesky_invert", label="cp_fm_cholesky_invert", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="cp_gemm_cosma", label="cp_gemm_cosma", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="cp_fm_cholesky_decompose", label="cp_fm_cholesky_decompose", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=3.304, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="mp_waitany", label="mp_waitany", y=3.438, yerr=0.0
PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="mp_waitall_1", label="mp_waitall_1", y=7.786, yerr=0.0
Running H2O-hyb.inp with 1 threads and 32 ranks... done.
Running H2O-hyb.inp with 32 threads and 1 ranks... done.
From /workspace/artifacts/H2O-hyb_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                T I M I N G                                  -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.396 0.396 253.277 253.277
qs_energies 1 2.0 0.000 0.000 252.019 252.019
scf_env_do_scf 1 3.0 0.000 0.000 249.580 249.580
qs_ks_update_qs_env 8 5.0 0.000 0.000 232.539 232.539
rebuild_ks_matrix 7 6.0 0.000 0.000 232.435 232.435
qs_ks_build_kohn_sham_matrix 7 7.0 0.002 0.002 232.435 232.435
hfx_ks_matrix 7 8.0 0.000 0.000 170.264 170.264
integrate_four_center 7 9.0 2.148 2.148 170.232 170.232
integrate_four_center_main 7 10.0 0.786 0.786 158.831 158.831
integrate_four_center_bin 453 11.0 158.046 158.046 158.046 158.046
scf_env_do_scf_inner_loop 7 4.0 0.001 0.001 141.419 141.419
init_scf_loop 1 4.0 0.000 0.000 108.147 108.147
cp_gemm 129 10.3 0.001 0.001 46.993 46.993
cp_gemm_cosma 129 11.3 46.992 46.992 46.992 46.992
admm_mo_calc_rho_aux 7 8.0 0.000 0.000 30.164 30.164
admm_fit_mo_coeffs 7 9.0 0.000 0.000 28.426 28.426
admm_mo_merge_derivs 7 8.0 0.000 0.000 23.219 23.219
merge_mo_derivs_diag 7 9.0 0.023 0.023 23.219 23.219
purify_mo_diag 7 10.0 0.001 0.001 15.484 15.484
prepare_preconditioner 1 5.0 0.000 0.000 13.256 13.256
make_preconditioner 1 6.0 0.000 0.000 13.256 13.256
fit_mo_coeffs 7 10.0 0.000 0.000 12.942 12.942
integrate_four_center_load 7 10.0 0.000 0.000 8.854 8.854
hfx_load_balance 1 11.0 0.002 0.002 8.853 8.853
arnoldi_normal_ev 11 9.3 0.002 0.002 7.794 7.794
estimate_cond_num 1 7.0 0.000 0.000 7.722 7.722
build_subspace 28 9.5 0.015 0.015 7.650 7.650
qs_vxc_create 14 8.0 0.000 0.000 5.298 5.298
xc_vxc_pw_create 14 9.0 0.935 0.935 5.297 5.297
-------------------------------------------------------------------------------
From /workspace/artifacts/H2O-hyb_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                T I M I N G                                  -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.251 0.261 191.734 191.735
qs_energies 1 2.0 0.000 0.001 191.326 191.327
scf_env_do_scf 1 3.0 0.000 0.000 190.737 190.737
qs_ks_update_qs_env 8 5.0 0.000 0.000 186.903 186.904
rebuild_ks_matrix 7 6.0 0.000 0.000 186.882 186.884
qs_ks_build_kohn_sham_matrix 7 7.0 0.002 0.003 186.882 186.883
hfx_ks_matrix 7 8.0 0.001 0.001 174.642 174.647
integrate_four_center 7 9.0 0.103 0.417 174.624 174.625
integrate_four_center_main 7 10.0 0.005 0.006 159.719 163.301
integrate_four_center_bin 448 11.0 159.714 163.296 159.714 163.296
scf_env_do_scf_inner_loop 7 4.0 0.000 0.001 112.143 112.143
init_scf_loop 1 4.0 0.000 0.000 78.592 78.593
integrate_four_center_load 7 10.0 0.000 0.000 9.249 9.253
hfx_load_balance 1 11.0 0.001 0.002 9.249 9.253
mp_sync 70 11.3 4.764 7.187 4.764 7.187
hfx_load_balance_bin 1 12.0 4.529 4.667 4.529 4.667
hfx_load_balance_count 1 12.0 4.440 4.575 4.440 4.575
cp_gemm 129 10.3 0.000 0.001 4.155 4.164
cp_gemm_cosma 129 11.3 4.155 4.163 4.155 4.163
qs_vxc_create 14 8.0 0.001 0.001 4.024 4.024
xc_vxc_pw_create 14 9.0 0.022 0.025 4.023 4.024
-------------------------------------------------------------------------------
Plot: name="H2O-hyb_timings_32omp", title="Timings of H2O-hyb with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="H2O-hyb_timings_32omp", name="rest", label="rest", y=44.370000000000005, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32omp", name="integrate_four_center_bin", label="integrate_four_center_bin", y=158.046, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32omp", name="cp_gemm_cosma", label="cp_gemm_cosma", y=46.992, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32omp", name="integrate_four_center", label="integrate_four_center", y=2.148, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32omp", name="xc_vxc_pw_create", label="xc_vxc_pw_create", y=0.935, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32omp", name="integrate_four_center_main", label="integrate_four_center_main", y=0.786, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32omp", name="mp_sync", label="mp_sync", y=0.0, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32omp", name="hfx_load_balance_count", label="hfx_load_balance_count", y=0.0, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32omp", name="hfx_load_balance_bin", label="hfx_load_balance_bin", y=0.0, yerr=0.0
Plot: name="H2O-hyb_timings_32mpi", title="Timings of H2O-hyb with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="H2O-hyb_timings_32mpi", name="rest", label="rest", y=14.00200000000001, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32mpi", name="integrate_four_center_bin", label="integrate_four_center_bin", y=159.714, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32mpi", name="cp_gemm_cosma", label="cp_gemm_cosma", y=4.155, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32mpi", name="integrate_four_center", label="integrate_four_center", y=0.103, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32mpi", name="xc_vxc_pw_create", label="xc_vxc_pw_create", y=0.022, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32mpi", name="integrate_four_center_main", label="integrate_four_center_main", y=0.005, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32mpi", name="mp_sync", label="mp_sync", y=4.764, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32mpi", name="hfx_load_balance_count", label="hfx_load_balance_count", y=4.44, yerr=0.0
PlotPoint: plot="H2O-hyb_timings_32mpi", name="hfx_load_balance_bin", label="hfx_load_balance_bin", y=4.529, yerr=0.0
Running GW_PBE_4benzene.inp with 1 threads and 32 ranks... done.
Running GW_PBE_4benzene.inp with 32 threads and 1 ranks... done.
From /workspace/artifacts/GW_PBE_4benzene_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                T I M I N G                                  -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.016 0.016 354.648 354.648
qs_energies 1 2.0 0.000 0.000 354.146 354.146
mp2_main 1 3.0 0.000 0.000 347.614 347.614
mp2_gpw_main 1 4.0 0.000 0.000 347.101 347.101
rpa_ri_compute_en 1 5.0 0.000 0.000 333.339 333.339
rpa_num_int 1 6.0 0.001 0.001 333.314 333.314
compute_mat_P_omega 1 7.0 0.002 0.002 208.807 208.807
compute_mat_P_omega_contract 10 8.0 12.104 12.104 207.630 207.630
dbcsr_t_total 2336 9.6 0.016 0.016 198.114 198.114
dbcsr_t_contract 787 11.0 46.300 46.300 126.116 126.116
cp_gemm 105 8.4 0.000 0.000 93.665 93.665
cp_gemm_cosma 105 9.4 93.664 93.664 93.664 93.664
compute_mat_P_omega_calc_M_occ 250 9.0 12.109 12.109 79.866 79.866
dbcsr_tas_total 1149 12.2 0.049 0.049 73.646 73.646
dbcsr_tas_multiply 807 12.1 0.003 0.003 72.244 72.244
GW_matrix_operations 10 7.0 0.006 0.006 70.579 70.579
dbcsr_t_copy 1103 10.7 19.756 19.756 70.573 70.573
dbcsr_multiply_generic 837 15.8 0.127 0.127 58.713 58.713
dbcsr_tas_dbcsr 807 14.1 0.003 0.003 58.269 58.269
compute_mat_P_omega_calc_M_vir 250 9.0 0.001 0.001 52.596 52.596
dbcsr_tas_mm_1N 524 15.1 0.002 0.002 46.185 46.185
multiply_cannon 837 16.8 18.432 18.432 45.371 45.371
rpa_num_int_RPA_matrix_operati 10 7.0 0.000 0.000 32.862 32.862
contract_P_omega_with_mat_L 10 8.0 0.000 0.000 31.027 31.027
dbcsr_tas_reserve_blocks_index 3261 13.7 7.187 7.187 26.844 26.844
dbcsr_tas_copy 574 11.4 16.745 16.745 24.336 24.336
multiply_cannon_loop 837 17.8 0.171 0.171 24.288 24.288
multiply_cannon_multrec 837 18.8 22.641 22.641 23.184 23.184
dbcsr_t_reserve_blocks_index 2280 12.5 1.266 1.266 20.578 20.578
dbcsr_t_reserve_blocks_index_a 2222 11.6 0.011 0.011 20.287 20.287
dbcsr_reserve_blocks 3717 14.7 18.968 18.968 19.353 19.353
compute_mat_P_omega_copy_M_occ 250 9.0 0.002 0.002 19.279 19.279
compute_QP_energies 1 7.0 0.000 0.000 18.664 18.664
compute_self_energy_cubic_gw 1 8.0 0.093 0.093 18.663 18.663
compute_mat_P_omega_copy_M_vir 250 9.0 0.002 0.002 14.088 14.088
mp2_ri_gpw_compute_in 1 5.0 0.001 0.001 13.746 13.746
dbcsr_t_copy_nocomm 251 12.0 10.970 10.970 13.322 13.322
compute_mat_P_omega_calc_P_t 250 9.0 0.001 0.001 12.024 12.024
make_m2s 1674 16.8 0.104 0.104 10.852 10.852
make_images 1674 17.8 5.071 5.071 10.275 10.275
dbcsr_tas_mm_2 251 15.0 0.001 0.001 10.265 10.265
cp_fm_cholesky_invert 10 8.0 9.654 9.654 9.654 9.654
dbcsr_finalize 9888 13.6 1.509 1.509 8.221 8.221
contract_cubic_gw 21 9.0 0.000 0.000 7.783 7.783
mp2_ri_gpw_compute_in_copy_3c 6 6.0 0.665 0.665 7.122 7.122
-------------------------------------------------------------------------------
From /workspace/artifacts/GW_PBE_4benzene_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                T I M I N G                                  -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.006 0.009 57.062 57.063
qs_energies 1 2.0 0.001 0.001 56.942 56.948
mp2_main 1 3.0 0.001 0.001 55.492 55.498
mp2_gpw_main 1 4.0 0.000 0.001 55.435 55.441
rpa_ri_compute_en 1 5.0 0.000 0.000 53.464 53.471
rpa_num_int 1 6.0 0.000 0.001 53.456 53.463
dbcsr_t_total 2336 9.6 0.015 0.016 42.144 42.145
compute_mat_P_omega 1 7.0 0.001 0.002 41.053 41.063
compute_mat_P_omega_contract 10 8.0 0.768 0.790 40.813 40.818
dbcsr_t_contract 787 11.0 1.872 2.013 31.021 31.024
dbcsr_tas_total 1149 12.2 0.063 0.068 27.261 27.262
dbcsr_tas_multiply 807 12.1 0.003 0.003 27.116 27.118
dbcsr_tas_dbcsr 807 14.1 0.003 0.004 19.745 19.746
dbcsr_multiply_generic 837 15.8 0.071 0.076 16.515 17.472
compute_mat_P_omega_calc_M_occ 250 9.0 0.757 0.781 13.771 13.771
multiply_cannon 837 16.8 0.135 0.151 9.743 10.079
compute_mat_P_omega_calc_P_t 250 9.0 0.001 0.001 9.986 9.987
dbcsr_t_copy 1111 10.7 4.276 4.497 9.544 9.884
dbcsr_tas_mm_1N 524 15.1 0.003 0.003 8.765 9.586
multiply_cannon_loop 837 17.8 0.043 0.045 8.868 9.207
compute_mat_P_omega_calc_M_vir 250 9.0 0.001 0.001 8.700 8.700
cp_gemm 105 8.4 0.000 0.000 7.618 7.632
cp_gemm_cosma 105 9.4 7.617 7.631 7.617 7.631
dbcsr_tas_mm_2 251 15.0 0.002 0.002 7.534 7.534
multiply_cannon_multrec 1386 17.8 6.916 7.244 7.172 7.495
mp_sync 8696 11.6 6.350 7.409 6.350 7.409
make_m2s 1674 16.8 0.044 0.047 5.829 6.495
make_images 1674 17.8 0.243 0.257 5.747 6.411
GW_matrix_operations 10 7.0 0.001 0.002 5.026 5.033
compute_QP_energies 1 7.0 0.000 0.001 4.214 4.214
compute_self_energy_cubic_gw 1 8.0 0.005 0.006 4.210 4.214
dbcsr_t_communicate_buffer 1098 11.7 0.093 0.099 3.473 3.636
mp_waitall_2 3776 14.7 3.262 3.520 3.262 3.520
contract_cubic_gw 21 9.0 0.000 0.000 3.200 3.200
make_images_data 1674 18.8 0.037 0.039 3.032 3.159
dbcsr_t_reserve_blocks_index_a 2791 11.4 0.017 0.019 2.719 3.065
dbcsr_t_reserve_blocks_index 2849 12.4 0.105 0.110 2.716 3.063
hybrid_alltoall_any 1724 19.5 2.355 2.695 2.920 3.055
dbcsr_tas_reserve_blocks_index 3300 13.8 0.267 0.289 2.673 3.006
make_images_pack 1674 18.8 2.270 2.895 2.285 2.907
rpa_num_int_RPA_matrix_operati 10 7.0 0.000 0.000 2.779 2.787
dbcsr_reserve_blocks 3785 14.7 2.398 2.718 2.438 2.759
contract_P_omega_with_mat_L 10 8.0 0.000 0.000 2.663 2.671
convert_to_new_pgrid 2421 14.1 0.017 0.019 1.894 2.012
dbcsr_copy 3323 15.8 1.829 1.954 1.858 1.983
mp2_ri_gpw_compute_in 1 5.0 0.001 0.001 1.968 1.969
mp_waitall_1 26582 19.0 1.524 1.926 1.524 1.926
compute_mat_P_omega_copy_M_vir 250 9.0 0.002 0.002 1.716 1.722
dbcsr_add_anytype 909 13.7 1.022 1.077 1.585 1.654
compute_mat_P_omega_copy_M_occ 250 9.0 0.001 0.002 1.521 1.525
dbcsr_tas_replicate 396 14.1 0.797 0.885 1.356 1.423
scf_env_do_scf 1 3.0 0.000 0.000 1.396 1.396
scf_env_do_scf_inner_loop 17 4.0 0.001 0.001 1.396 1.396
mp_max_i 2058 9.6 1.005 1.267 1.005 1.267
-------------------------------------------------------------------------------
Plot: name="GW_PBE_4benzene_timings_32omp", title="Timings of GW_PBE_4benzene with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="rest", label="rest", y=153.31900000000005, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="cp_gemm_cosma", label="cp_gemm_cosma", y=93.664, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="dbcsr_t_contract", label="dbcsr_t_contract", y=46.3, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=22.641, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="dbcsr_t_copy", label="dbcsr_t_copy", y=19.756, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="dbcsr_reserve_blocks", label="dbcsr_reserve_blocks", y=18.968, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="mp_sync", label="mp_sync", y=0.0, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="mp_waitall_2", label="mp_waitall_2", y=0.0, yerr=0.0
Plot: name="GW_PBE_4benzene_timings_32mpi", title="Timings of GW_PBE_4benzene with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="rest", label="rest", y=24.370999999999995, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="cp_gemm_cosma", label="cp_gemm_cosma", y=7.617, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="dbcsr_t_contract", label="dbcsr_t_contract", y=1.872, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=6.916, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="dbcsr_t_copy", label="dbcsr_t_copy", y=4.276, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="dbcsr_reserve_blocks", label="dbcsr_reserve_blocks", y=2.398, yerr=0.0
PlotPoint:
plot="GW_PBE_4benzene_timings_32mpi", name="mp_sync", label="mp_sync", y=6.35, yerr=0.0 PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="mp_waitall_2", label="mp_waitall_2", y=3.262, yerr=0.0 Running RI-HFX_H2O-32.inp with 1 threads and 32 ranks... done. Running RI-HFX_H2O-32.inp with 32 threads and 1 ranks... done. From /workspace/artifacts/RI-HFX_H2O-32_32omp.out: ------------------------------------------------------------------------------- - - - T I M I N G - - - ------------------------------------------------------------------------------- SUBROUTINE CALLS ASD SELF TIME TOTAL TIME MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM CP2K 1 1.0 0.021 0.021 798.526 798.526 qs_forces 1 2.0 0.000 0.000 797.760 797.760 rebuild_ks_matrix 7 6.6 0.000 0.000 788.486 788.486 qs_ks_build_kohn_sham_matrix 7 7.6 0.001 0.001 788.486 788.486 hfx_ks_matrix 7 8.6 0.000 0.000 785.517 785.517 dbcsr_t_total 1342 10.2 0.010 0.010 669.713 669.713 qs_ks_update_qs_env_forces 1 3.0 0.000 0.000 478.969 478.969 hfx_ri_update_forces 1 7.0 0.024 0.024 434.031 434.031 dbcsr_t_contract 445 11.1 163.939 163.939 397.502 397.502 hfx_ri_update_ks 7 9.6 0.000 0.000 351.481 351.481 hfx_ri_update_ks_Pmat 7 10.6 60.167 60.167 351.477 351.477 qs_energies 1 3.0 0.000 0.000 318.734 318.734 scf_env_do_scf 1 4.0 0.000 0.000 318.323 318.323 qs_ks_update_qs_env 8 6.0 0.000 0.000 309.523 309.523 dbcsr_t_copy 441 11.6 91.208 91.208 261.046 261.046 scf_env_do_scf_inner_loop 6 5.0 0.001 0.001 185.244 185.244 dbcsr_tas_total 805 12.2 0.062 0.062 175.590 175.590 dbcsr_tas_reserve_blocks_index 2010 14.3 17.785 17.785 168.893 168.893 dbcsr_tas_multiply 456 12.2 0.002 0.002 164.440 164.440 dbcsr_reserve_blocks 2452 14.9 149.024 149.024 149.853 149.853 init_scf_loop 2 5.0 0.000 0.000 133.075 133.075 dbcsr_t_reserve_blocks_index 1344 13.2 3.170 3.170 132.940 132.940 dbcsr_t_reserve_blocks_index_a 1323 12.2 0.012 0.012 131.677 131.677 dbcsr_multiply_generic 611 14.8 0.104 0.104 128.527 128.527 dbcsr_tas_dbcsr 456 14.2 
0.002 0.002 127.790 127.790 hfx_ri_update_ks_Pmat_KS 63 11.6 0.001 0.001 98.600 98.600 multiply_cannon 611 15.8 5.159 5.159 87.921 87.921 multiply_cannon_loop 611 16.8 0.154 0.154 79.236 79.236 multiply_cannon_multrec 611 17.8 77.248 77.248 77.369 77.369 hfx_ri_forces_Pmat_2c_inv_2 9 8.0 0.000 0.000 69.463 69.463 precalc_derivatives 1 8.0 0.009 0.009 69.162 69.162 hfx_ri_forces_Pmat_metric 9 8.0 0.001 0.001 68.408 68.408 hfx_ri_forces_Pmat_3c_RI 9 8.0 0.001 0.001 67.226 67.226 dbcsr_tas_copy 290 12.3 27.919 27.919 66.848 66.848 hfx_ri_forces_Pmat_3c_AO 9 8.0 0.000 0.000 56.812 56.812 dbcsr_tas_mm_3N 94 14.7 0.000 0.000 55.694 55.694 hfx_ri_update_ks_Pmat_Px3C 63 11.6 0.000 0.000 52.235 52.235 dbcsr_tas_mm_2 283 14.9 0.002 0.002 50.279 50.279 hfx_ri_update_ks_Pmat_copy_2 63 11.6 0.000 0.000 46.383 46.383 hfx_ri_forces_Pmat_PQ_der 9 8.0 1.339 1.339 38.133 38.133 build_3c_derivatives 2 9.0 1.563 1.563 36.688 36.688 hfx_ri_pre_scf_Pmat 1 12.0 0.000 0.000 35.774 35.774 dbcsr_data_release 96260 17.1 32.661 32.661 32.661 32.661 make_m2s 1222 15.8 0.115 0.115 31.172 31.172 hfx_ri_forces_Pmat_2c_inv_1 1 8.0 4.405 4.405 30.907 30.907 make_images 1222 16.8 11.971 11.971 30.319 30.319 dbcsr_t_split_blocks_generic 138 11.8 15.659 15.659 27.714 27.714 dbcsr_t_split_copyback 69 11.8 16.361 16.361 25.338 25.338 dbcsr_destroy 11406 14.4 0.059 0.059 24.232 24.232 hfx_ri_forces_Pmat_Pmat_2 9 8.0 0.000 0.000 23.187 23.187 dbcsr_tas_mm_3T 77 17.1 0.000 0.000 21.708 21.708 dbcsr_t_communicate_buffer 151 13.0 20.107 20.107 20.107 20.107 ------------------------------------------------------------------------------- From /workspace/artifacts/RI-HFX_H2O-32_32mpi.out: ------------------------------------------------------------------------------- - - - T I M I N G - - - ------------------------------------------------------------------------------- SUBROUTINE CALLS ASD SELF TIME TOTAL TIME MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM CP2K 1 1.0 0.010 0.013 119.835 119.837 qs_forces 1 2.0 0.000 
0.000 119.628 119.628 rebuild_ks_matrix 7 6.6 0.000 0.000 118.561 118.562 qs_ks_build_kohn_sham_matrix 7 7.6 0.003 0.003 118.561 118.562 hfx_ks_matrix 7 8.6 0.001 0.001 116.970 116.970 dbcsr_t_total 1342 10.2 0.011 0.012 109.053 109.053 dbcsr_t_contract 445 11.1 7.232 7.596 86.018 86.028 qs_ks_update_qs_env_forces 1 3.0 0.000 0.000 85.876 85.877 hfx_ri_update_forces 1 7.0 0.002 0.003 81.379 81.379 dbcsr_tas_total 805 12.2 0.062 0.066 78.604 78.607 dbcsr_tas_multiply 456 12.2 0.002 0.002 73.218 73.221 dbcsr_tas_dbcsr 456 14.2 0.002 0.002 54.746 54.746 dbcsr_multiply_generic 611 14.8 0.050 0.053 49.659 51.854 hfx_ri_update_ks 7 9.6 0.000 0.000 35.589 35.590 hfx_ri_update_ks_Pmat 7 10.6 2.557 2.767 35.588 35.588 qs_energies 1 3.0 0.000 0.001 33.735 33.735 multiply_cannon 611 15.8 0.093 0.103 31.658 33.563 scf_env_do_scf 1 4.0 0.000 0.000 33.510 33.510 qs_ks_update_qs_env 8 6.0 0.000 0.000 32.686 32.687 multiply_cannon_loop 611 16.8 0.049 0.051 30.376 32.285 dbcsr_tas_mm_2 283 14.9 0.002 0.002 28.729 28.730 multiply_cannon_multrec 1893 15.6 24.914 27.013 25.256 27.341 hfx_ri_forces_Pmat_metric 9 8.0 0.001 0.001 22.663 22.663 scf_env_do_scf_inner_loop 6 5.0 0.000 0.001 18.926 18.927 dbcsr_t_copy 469 11.6 6.093 6.398 17.128 17.616 dbcsr_tas_mm_3N 94 14.7 0.000 0.001 15.692 16.465 make_m2s 1222 15.8 0.040 0.042 14.658 15.509 make_images 1222 16.8 0.319 0.331 14.563 15.413 hfx_ri_forces_Pmat_2c_inv_2 9 8.0 0.000 0.001 14.922 14.922 init_scf_loop 2 5.0 0.000 0.000 14.583 14.583 mp_sync 5251 12.1 10.172 14.249 10.172 14.249 hfx_ri_update_ks_Pmat_KS 63 11.6 0.001 0.001 14.040 14.040 hfx_ri_forces_Pmat_3c_RI 9 8.0 0.001 0.001 11.911 11.911 hfx_ri_forces_Pmat_PQ_der 9 8.0 0.069 0.072 9.443 9.443 make_images_data 1222 17.8 0.031 0.033 7.959 8.440 hfx_ri_update_ks_Pmat_Px3C 63 11.6 0.000 0.000 8.241 8.241 hybrid_alltoall_any 1272 18.5 6.480 7.520 7.789 8.213 mp_waitall_2 1948 15.2 7.284 7.938 7.284 7.938 dbcsr_tas_reserve_blocks_index 2058 14.4 0.657 0.711 7.028 7.696 
dbcsr_tas_mm_3T 77 17.1 0.000 0.000 7.313 7.691 dbcsr_reserve_blocks 2505 15.0 6.848 7.573 6.891 7.619 dbcsr_t_reserve_blocks_index 1523 13.2 0.171 0.183 5.994 6.643 make_images_pack 1222 17.8 6.012 6.599 6.025 6.614 dbcsr_t_reserve_blocks_index_a 1502 12.2 0.014 0.015 5.954 6.602 mp_sum_l 13555 13.9 4.955 6.599 4.955 6.599 mp_waitall_1 24260 17.6 4.694 6.020 4.694 6.020 precalc_derivatives 1 8.0 0.003 0.003 5.998 5.998 dbcsr_tas_replicate 359 14.0 1.717 3.000 5.728 5.993 convert_to_new_pgrid 1368 14.2 0.019 0.021 4.873 5.261 dbcsr_copy 1981 15.7 4.782 5.173 4.800 5.191 hfx_ri_forces_Pmat_3c_AO 9 8.0 0.000 0.001 5.073 5.073 dbcsr_tas_communicate_buffer 728 14.9 0.032 0.035 4.383 4.966 hfx_ri_forces_Pmat_2c_inv_1 1 8.0 0.159 0.170 4.735 4.741 hfx_ri_pre_scf_Pmat 1 12.0 0.000 0.000 4.603 4.604 hfx_ri_forces_Pmat_Pmat_2 9 8.0 0.000 0.000 4.446 4.446 dbcsr_tas_replicate_communicat 127 15.0 0.003 0.005 3.178 3.732 dbcsr_multiply_generic_mpsum_f 445 17.1 0.002 0.002 2.199 3.621 dbcsr_t_communicate_buffer 330 12.4 0.022 0.024 3.328 3.458 build_3c_derivatives 2 9.0 0.601 0.649 3.357 3.363 multiply_cannon_metrocomm3 1893 15.6 0.006 0.007 1.872 2.966 dbcsr_tas_merge 232 12.1 1.635 1.773 2.649 2.915 dbcsr_tas_copy 144 13.2 1.303 1.399 2.446 2.626 hfx_ri_pre_scf_Pmat_RIx3C 9 13.0 0.000 0.000 2.392 2.403 ------------------------------------------------------------------------------- Plot: name="RI-HFX_H2O-32_timings_32omp", title="Timings of RI-HFX_H2O-32 with 32 OpenMP Threads", ylabel="time [s]" PlotPoint: plot="RI-HFX_H2O-32_timings_32omp", name="rest", label="rest", y=256.94000000000005, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32omp", name="dbcsr_t_contract", label="dbcsr_t_contract", y=163.939, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32omp", name="dbcsr_reserve_blocks", label="dbcsr_reserve_blocks", y=149.024, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32omp", name="dbcsr_t_copy", label="dbcsr_t_copy", y=91.208, yerr=0.0 PlotPoint: 
plot="RI-HFX_H2O-32_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=77.248, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32omp", name="hfx_ri_update_ks_Pmat", label="hfx_ri_update_ks_Pmat", y=60.167, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32omp", name="mp_waitall_2", label="mp_waitall_2", y=0.0, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32omp", name="mp_sync", label="mp_sync", y=0.0, yerr=0.0 Plot: name="RI-HFX_H2O-32_timings_32mpi", title="Timings of RI-HFX_H2O-32 with 32 MPI Ranks", ylabel="time [s]" PlotPoint: plot="RI-HFX_H2O-32_timings_32mpi", name="rest", label="rest", y=54.734999999999985, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32mpi", name="dbcsr_t_contract", label="dbcsr_t_contract", y=7.232, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32mpi", name="dbcsr_reserve_blocks", label="dbcsr_reserve_blocks", y=6.848, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32mpi", name="dbcsr_t_copy", label="dbcsr_t_copy", y=6.093, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=24.914, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32mpi", name="hfx_ri_update_ks_Pmat", label="hfx_ri_update_ks_Pmat", y=2.557, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32mpi", name="mp_waitall_2", label="mp_waitall_2", y=7.284, yerr=0.0 PlotPoint: plot="RI-HFX_H2O-32_timings_32mpi", name="mp_sync", label="mp_sync", y=10.172, yerr=0.0 Running diag_cu144_broy.inp with 1 threads and 32 ranks... done. Running diag_cu144_broy.inp with 32 threads and 1 ranks... done. 
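
The "rest" PlotPoint values logged above appear to equal the CP2K total time minus the sum of the self times of the other plotted routines. This relation is inferred from the logged numbers, not taken from `plot_performance.py`. A minimal sketch that checks it for RI-HFX_H2O-32, where the helper name `rest_time` is hypothetical and the MPI total uses the AVERAGE column:

```python
from typing import Mapping

# Assumption (inferred from the log, not from plot_performance.py):
# rest = CP2K total time - sum of the self times of the plotted routines.
def rest_time(total: float, self_times: Mapping[str, float]) -> float:
    """Return the share of the total run time not covered by the listed routines."""
    return total - sum(self_times.values())

# RI-HFX_H2O-32 with 32 OpenMP threads (values copied from the log above).
omp_rest = rest_time(798.526, {
    "dbcsr_t_contract": 163.939,
    "dbcsr_reserve_blocks": 149.024,
    "dbcsr_t_copy": 91.208,
    "multiply_cannon_multrec": 77.248,
    "hfx_ri_update_ks_Pmat": 60.167,
    "mp_waitall_2": 0.0,
    "mp_sync": 0.0,
})

# RI-HFX_H2O-32 with 32 MPI ranks (average total time, average self times).
mpi_rest = rest_time(119.835, {
    "dbcsr_t_contract": 7.232,
    "dbcsr_reserve_blocks": 6.848,
    "dbcsr_t_copy": 6.093,
    "multiply_cannon_multrec": 24.914,
    "hfx_ri_update_ks_Pmat": 2.557,
    "mp_waitall_2": 7.284,
    "mp_sync": 10.172,
})

print(f"rest (32omp): {omp_rest:.3f} s")
print(f"rest (32mpi): {mpi_rest:.3f} s")
```

Both results agree with the logged `rest` values (256.94 and 54.735) up to floating-point rounding.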
From /workspace/artifacts/diag_cu144_broy_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.102 0.102 190.517 190.517
qs_energies 1 2.0 0.000 0.000 188.660 188.660
scf_env_do_scf 1 3.0 0.000 0.000 178.481 178.481
scf_env_do_scf_inner_loop 15 4.0 0.002 0.002 178.481 178.481
qs_scf_new_mos 15 5.0 0.001 0.001 78.290 78.290
qs_ks_update_qs_env 15 5.0 0.000 0.000 69.920 69.920
rebuild_ks_matrix 15 6.0 0.000 0.000 69.558 69.558
qs_ks_build_kohn_sham_matrix 15 7.0 0.003 0.003 69.558 69.558
eigensolver 15 6.0 0.002 0.002 65.051 65.051
cp_fm_diag_elpa 15 7.0 0.000 0.000 50.692 50.692
cp_fm_diag_elpa_base 15 8.0 46.041 46.041 50.691 50.691
qs_vxc_create 15 8.0 0.037 0.037 45.462 45.462
calculate_dispersion_nonloc 15 9.0 9.121 9.121 39.701 39.701
pw_transfer 1191 9.8 0.091 0.091 27.128 27.128
fft_wrap_pw1pw2 1086 10.9 0.013 0.013 26.833 26.833
qs_rho_update_rho 16 5.0 0.000 0.000 24.096 24.096
calculate_rho_elec 16 6.0 0.343 0.343 24.096 24.096
grid_collocate_task_list 16 7.0 22.522 22.522 22.522 22.522
sum_up_and_integrate 15 8.0 0.078 0.078 22.481 22.481
integrate_v_rspace 15 9.0 0.036 0.036 22.404 22.404
grid_integrate_task_list 15 10.0 21.761 21.761 21.761 21.761
fft_wrap_pw1pw2_150 765 12.0 3.374 3.374 20.301 20.301
fft3d_s 1087 12.8 11.075 11.075 11.087 11.087
pw_scatter_s 585 13.0 10.681 10.681 10.681 10.681
copy_dbcsr_to_fm 16 5.9 0.001 0.001 10.499 10.499
cp_fm_cholesky_restore 45 7.0 9.822 9.822 9.822 9.822
dbcsr_complete_redistribute 46 8.3 3.382 3.382 9.374 9.374
cp_fm_upper_to_full 30 8.0 9.185 9.185 9.185 9.185
vdW_energy 15 10.0 7.866 7.866 7.866 7.866
gspace_mixing 14 5.0 0.273 0.273 7.428 7.428
broyden_mixing 14 6.0 6.663 6.663 6.664 6.664
fft_wrap_pw1pw2_200 197 11.5 0.350 0.350 6.282 6.282
xc_vxc_pw_create 15 9.0 1.424 1.424 5.724 5.724
qs_energies_init_hamiltonians 1 3.0 0.000 0.000 4.701 4.701
init_scf_run 1 3.0 0.000 0.000 4.657 4.657
dbcsr_finalize 159 9.9 0.021 0.021 4.092 4.092
dbcsr_merge_all 91 11.1 0.078 0.078 3.940 3.940
-------------------------------------------------------------------------------

From /workspace/artifacts/diag_cu144_broy_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.013 0.017 85.653 85.654
qs_energies 1 2.0 0.000 0.001 85.264 85.265
scf_env_do_scf 1 3.0 0.000 0.000 80.153 80.154
scf_env_do_scf_inner_loop 15 4.0 0.001 0.002 80.153 80.154
qs_ks_update_qs_env 15 5.0 0.000 0.000 39.865 39.880
rebuild_ks_matrix 15 6.0 0.000 0.000 39.817 39.832
qs_ks_build_kohn_sham_matrix 15 7.0 0.004 0.005 39.817 39.832
sum_up_and_integrate 15 8.0 0.013 0.015 23.620 23.658
integrate_v_rspace 15 9.0 0.001 0.001 23.607 23.643
qs_rho_update_rho 16 5.0 0.000 0.000 22.884 22.887
calculate_rho_elec 16 6.0 0.011 0.012 22.884 22.886
grid_integrate_task_list 15 10.0 21.521 22.046 21.521 22.046
grid_collocate_task_list 16 7.0 20.977 21.415 20.977 21.415
qs_scf_new_mos 15 5.0 0.001 0.001 17.814 17.988
eigensolver 15 6.0 0.002 0.003 16.350 16.361
qs_vxc_create 15 8.0 0.001 0.001 15.672 15.684
calculate_dispersion_nonloc 15 9.0 1.418 1.492 12.767 12.781
pw_transfer 1191 9.8 0.133 0.146 11.935 12.042
cp_fm_diag_elpa 15 7.0 0.000 0.000 11.898 11.906
cp_fm_diag_elpa_base 15 8.0 11.646 11.680 11.893 11.896
fft_wrap_pw1pw2 1086 10.9 0.021 0.023 11.636 11.753
fft3d_ps 1086 12.9 5.137 5.251 8.790 9.001
fft_wrap_pw1pw2_150 765 12.0 0.688 0.721 7.810 7.849
cp_fm_cholesky_restore 45 7.0 4.210 4.268 4.210 4.268
fft_wrap_pw1pw2_200 197 11.5 0.369 0.397 3.676 3.754
qs_energies_init_hamiltonians 1 3.0 0.000 0.000 3.199 3.200
build_core_hamiltonian_matrix 1 4.0 0.000 0.000 2.746 3.028
xc_vxc_pw_create 15 9.0 0.057 0.076 2.904 2.919
mp_alltoall_z22v 1086 14.9 2.158 2.543 2.158 2.543
rs_pw_transfer 158 9.4 0.002 0.003 1.800 2.329
vdW_energy 15 10.0 2.091 2.196 2.091 2.196
x_to_yz 585 14.0 0.923 0.976 2.049 2.134
density_rs2pw 16 7.0 0.002 0.002 1.746 2.108
build_core_ppnl 1 5.0 1.832 2.021 1.832 2.021
yz_to_x 501 13.7 0.540 0.607 1.573 1.787
mp_waitany 520 11.3 1.159 1.778 1.159 1.778
-------------------------------------------------------------------------------

Plot: name="diag_cu144_broy_timings_32omp", title="Timings of diag_cu144_broy with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="diag_cu144_broy_timings_32omp", name="rest", label="rest", y=68.61500000000001, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32omp", name="cp_fm_diag_elpa_base", label="cp_fm_diag_elpa_base", y=46.041, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32omp", name="grid_collocate_task_list", label="grid_collocate_task_list", y=22.522, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32omp", name="grid_integrate_task_list", label="grid_integrate_task_list", y=21.761, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32omp", name="fft3d_s", label="fft3d_s", y=11.075, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32omp", name="pw_scatter_s", label="pw_scatter_s", y=10.681, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32omp", name="cp_fm_cholesky_restore", label="cp_fm_cholesky_restore", y=9.822, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32omp", name="fft3d_ps", label="fft3d_ps", y=0.0, yerr=0.0

Plot: name="diag_cu144_broy_timings_32mpi", title="Timings of diag_cu144_broy with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="diag_cu144_broy_timings_32mpi", name="rest", label="rest", y=22.162, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32mpi", name="cp_fm_diag_elpa_base", label="cp_fm_diag_elpa_base", y=11.646, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32mpi", name="grid_collocate_task_list", label="grid_collocate_task_list", y=20.977, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32mpi", name="grid_integrate_task_list", label="grid_integrate_task_list", y=21.521, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32mpi", name="fft3d_s", label="fft3d_s", y=0.0, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32mpi", name="pw_scatter_s", label="pw_scatter_s", y=0.0, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32mpi", name="cp_fm_cholesky_restore", label="cp_fm_cholesky_restore", y=4.21, yerr=0.0
PlotPoint: plot="diag_cu144_broy_timings_32mpi", name="fft3d_ps", label="fft3d_ps", y=5.137, yerr=0.0

Running bench_dftb.inp with 1 threads and 32 ranks... done.
Running bench_dftb.inp with 32 threads and 1 ranks... done.

From /workspace/artifacts/bench_dftb_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.085 0.085 306.708 306.708
qs_energies 1 2.0 0.000 0.000 306.551 306.551
ls_scf 1 3.0 0.000 0.000 304.783 304.783
ls_scf_main 1 4.0 0.002 0.002 291.579 291.579
density_matrix_trs4 11 5.0 0.011 0.011 175.789 175.789
ls_scf_dm_to_ks 11 5.0 0.000 0.000 109.017 109.017
dbcsr_multiply_generic 185 6.1 0.470 0.470 108.789 108.789
matrix_ls_to_qs 11 6.0 0.000 0.000 104.773 104.773
multiply_cannon 185 7.1 3.066 3.066 74.243 74.243
dbcsr_complete_redistribute 23 7.5 41.279 41.279 56.932 56.932
multiply_cannon_loop 185 8.1 0.391 0.391 53.150 53.150
dbcsr_copy_into_existing 11 7.0 52.651 52.651 52.652 52.652
matrix_decluster 11 7.0 0.000 0.000 52.120 52.120
multiply_cannon_multrec 185 9.1 50.964 50.964 51.016 51.016
arnoldi_extremal 12 6.1 0.000 0.000 48.045 48.045
arnoldi_normal_ev 12 7.1 0.030 0.030 48.045 48.045
build_subspace 23 8.1 0.131 0.131 47.390 47.390
dbcsr_matrix_vector_mult 652 9.0 0.279 0.279 36.758 36.758
dbcsr_matrix_vector_mult_local 652 10.0 35.085 35.085 35.094 35.094
make_m2s 370 7.1 0.031 0.031 28.479 28.479
make_images 370 8.1 7.400 7.400 26.036 26.036
dbcsr_finalize 646 7.5 0.206 0.206 21.082 21.082
dbcsr_merge_all 597 8.5 3.581 3.581 19.241 19.241
setup_rec_index_2d 370 8.1 17.795 17.795 17.795 17.795
dbcsr_sort_indices 1103 9.9 14.567 14.567 14.567 14.567
tree_to_linear_d 110 9.4 13.400 13.400 13.400 13.400
quick_finalize 395 10.0 0.515 0.515 12.443 12.443
ls_scf_init_scf 1 4.0 0.000 0.000 12.245 12.245
ls_scf_init_matrix_S 1 5.0 0.000 0.000 11.814 11.814
dbcsr_special_finalize 370 9.1 0.003 0.003 11.464 11.464
matrix_sqrt_Newton_Schulz 1 6.0 0.001 0.001 10.951 10.951
dbcsr_dot_sd 144 6.3 8.973 8.973 8.974 8.974
dbcsr_frobenius_norm 142 6.1 7.821 7.821 7.823 7.823
matrix_qs_to_ls 12 5.1 0.000 0.000 7.043 7.043
matrix_cluster 12 6.1 0.000 0.000 7.043 7.043
make_images_data 370 9.1 0.010 0.010 6.959 6.959
-------------------------------------------------------------------------------

From /workspace/artifacts/bench_dftb_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.009 0.010 94.003 94.003
qs_energies 1 2.0 0.000 0.000 93.910 93.911
ls_scf 1 3.0 0.000 0.000 93.833 93.834
ls_scf_main 1 4.0 0.001 0.003 90.100 90.101
density_matrix_trs4 11 5.0 0.009 0.012 86.343 86.408
dbcsr_multiply_generic 185 6.1 0.071 0.075 80.980 81.217
multiply_cannon 185 7.1 0.041 0.044 67.730 68.534
multiply_cannon_loop 185 8.1 0.218 0.227 63.914 65.027
multiply_cannon_multrec 1480 9.1 42.182 43.979 42.674 44.474
mp_waitall_1 11936 10.3 19.130 21.744 19.130 21.744
multiply_cannon_metrocomm3 1480 9.1 0.018 0.020 11.316 15.763
make_m2s 370 7.1 0.034 0.037 9.052 9.132
make_images 370 8.1 0.697 0.721 8.932 9.016
multiply_cannon_metrocomm1 1480 9.1 0.010 0.012 4.594 7.826
calculate_norms 2960 9.1 5.035 5.230 5.035 5.230
make_images_data 370 9.1 0.012 0.014 3.612 3.956
arnoldi_extremal 12 6.1 0.000 0.001 3.948 3.955
arnoldi_normal_ev 12 7.1 0.002 0.008 3.948 3.954
build_subspace 23 8.1 0.039 0.052 3.824 3.827
mp_sum_l 1039 5.9 2.911 3.537 2.911 3.537
ls_scf_dm_to_ks 11 5.0 0.000 0.000 3.249 3.340
dbcsr_matrix_vector_mult 652 9.0 0.019 0.081 3.193 3.269
hybrid_alltoall_any 393 9.9 0.321 1.616 2.942 3.240
dbcsr_complete_redistribute 23 7.5 1.812 1.926 2.873 2.969
matrix_ls_to_qs 11 6.0 0.000 0.000 2.834 2.951
ls_scf_init_scf 1 4.0 0.000 0.000 2.860 2.861
ls_scf_init_matrix_S 1 5.0 0.000 0.000 2.823 2.832
make_images_pack 370 9.1 2.478 2.698 2.483 2.703
matrix_decluster 11 7.0 0.000 0.000 2.584 2.682
dbcsr_matrix_vector_mult_local 652 10.0 2.550 2.676 2.554 2.680
matrix_sqrt_Newton_Schulz 1 6.0 0.001 0.001 2.584 2.587
dbcsr_multiply_generic_mpsum_f 137 7.1 0.000 0.001 2.018 2.508
buffer_matrices_ensure_size 370 8.1 2.200 2.308 2.200 2.308
dbcsr_add_d 280 6.0 0.002 0.002 2.089 2.159
dbcsr_add_anytype 280 7.0 1.131 1.196 2.087 2.158
dbcsr_finalize 646 7.5 0.014 0.014 1.944 2.075
-------------------------------------------------------------------------------

Plot: name="bench_dftb_timings_32omp", title="Timings of bench_dftb with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="bench_dftb_timings_32omp", name="rest", label="rest", y=108.93400000000003, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="dbcsr_copy_into_existing", label="dbcsr_copy_into_existing", y=52.651, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=50.964, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="dbcsr_complete_redistribute", label="dbcsr_complete_redistribute", y=41.279, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="dbcsr_matrix_vector_mult_local", label="dbcsr_matrix_vector_mult_local", y=35.085, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="setup_rec_index_2d", label="setup_rec_index_2d", y=17.795, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="mp_waitall_1", label="mp_waitall_1", y=0.0, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="calculate_norms", label="calculate_norms", y=0.0, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="mp_sum_l", label="mp_sum_l", y=0.0, yerr=0.0

Plot: name="bench_dftb_timings_32mpi", title="Timings of bench_dftb with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="bench_dftb_timings_32mpi", name="rest", label="rest", y=20.38300000000001, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="dbcsr_copy_into_existing", label="dbcsr_copy_into_existing", y=0.0, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=42.182, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="dbcsr_complete_redistribute", label="dbcsr_complete_redistribute", y=1.812, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="dbcsr_matrix_vector_mult_local", label="dbcsr_matrix_vector_mult_local", y=2.55, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="setup_rec_index_2d", label="setup_rec_index_2d", y=0.0, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="mp_waitall_1", label="mp_waitall_1", y=19.13, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="calculate_norms", label="calculate_norms", y=5.035, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="mp_sum_l", label="mp_sum_l", y=2.911, yerr=0.0

Running dbcsr.inp with 1 threads and 32 ranks... done.
Running dbcsr.inp with 32 threads and 1 ranks... done.
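
The `PlotPoint:` records above follow a regular `key=value` layout, so they can be collected per plot with a regular expression. The sketch below is not part of the CP2K tooling: the pattern is inferred from the log lines, and the names `PLOTPOINT_RE` and `parse_plotpoints` are hypothetical. It uses `finditer`, so it works whether the records are on separate lines or run together.

```python
import re

# Assumption: record layout inferred from the PlotPoint lines in this log.
PLOTPOINT_RE = re.compile(
    r'PlotPoint: plot="(?P<plot>[^"]+)", name="(?P<name>[^"]+)", '
    r'label="(?P<label>[^"]+)", y=(?P<y>[-+0-9.eE]+), yerr=(?P<yerr>[-+0-9.eE]+)'
)

def parse_plotpoints(text):
    """Map each plot name to a dict of {routine name: y value in seconds}."""
    points = {}
    for match in PLOTPOINT_RE.finditer(text):
        points.setdefault(match["plot"], {})[match["name"]] = float(match["y"])
    return points

# Three records copied from the bench_dftb 32mpi output above.
sample = (
    'PlotPoint: plot="bench_dftb_timings_32mpi", name="mp_waitall_1", '
    'label="mp_waitall_1", y=19.13, yerr=0.0 '
    'PlotPoint: plot="bench_dftb_timings_32mpi", name="calculate_norms", '
    'label="calculate_norms", y=5.035, yerr=0.0 '
    'PlotPoint: plot="bench_dftb_timings_32mpi", name="mp_sum_l", '
    'label="mp_sum_l", y=2.911, yerr=0.0'
)

points = parse_plotpoints(sample)
for name, seconds in points["bench_dftb_timings_32mpi"].items():
    print(f"{name}: {seconds} s")
```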
From /workspace/artifacts/dbcsr_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.006 0.006 94.583 94.583
lib_test 1 2.0 0.000 0.000 94.576 94.576
dbcsr_run_tests 3 3.0 0.002 0.002 94.576 94.576
test_multiplies_multiproc 3 4.0 0.001 0.001 75.401 75.401
dbcsr_redistribute 9 5.0 48.127 48.127 51.566 51.566
dbcsr_multiply_generic 9 5.0 0.001 0.001 22.209 22.209
dbcsr_make_random_matrix 9 4.0 13.893 13.893 19.089 19.089
multiply_cannon 9 6.0 0.002 0.002 15.913 15.913
multiply_cannon_loop 9 7.0 0.003 0.003 15.411 15.411
multiply_cannon_multrec 9 8.0 15.407 15.407 15.408 15.408
dbcsr_finalize 27 5.7 0.004 0.004 8.815 8.815
dbcsr_merge_all 18 6.5 3.079 3.079 8.103 8.103
tree_to_linear_d 9 7.0 3.115 3.115 3.115 3.115
mp_alltoall_d11v 27 6.0 3.110 3.110 3.110 3.110
dbcsr_data_release 975 7.6 2.425 2.425 2.425 2.425
make_m2s 18 6.0 0.001 0.001 2.107 2.107
make_images 18 7.0 0.657 0.657 2.036 2.036
-------------------------------------------------------------------------------

From /workspace/artifacts/dbcsr_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.003 0.005 26.137 26.138
lib_test 1 2.0 0.000 0.000 26.107 26.128
dbcsr_run_tests 3 3.0 0.001 0.001 26.106 26.127
test_multiplies_multiproc 3 4.0 0.001 0.001 24.955 25.038
dbcsr_multiply_generic 9 5.0 0.001 0.002 23.057 23.162
multiply_cannon 9 6.0 0.002 0.003 20.817 21.281
multiply_cannon_loop 9 7.0 0.004 0.004 20.368 20.800
multiply_cannon_multrec 72 8.0 17.169 18.315 17.170 18.316
mp_waitall_1 576 9.2 3.570 4.244 3.570 4.244
multiply_cannon_metrocomm1 72 8.0 0.002 0.002 2.811 3.646
multiply_cannon_metrocomm3 72 8.0 0.000 0.001 0.375 1.353
mp_sum_l 310 2.7 0.535 1.321 0.535 1.321
dbcsr_multiply_generic_mpsum_f 9 6.0 0.000 0.000 0.531 1.317
dbcsr_make_random_matrix 9 4.0 0.885 0.927 1.110 1.152
make_m2s 18 6.0 0.001 0.001 0.905 0.981
make_images 18 7.0 0.026 0.028 0.902 0.978
dbcsr_finalize 27 5.7 0.000 0.001 0.863 0.967
dbcsr_merge_all 18 6.5 0.137 0.168 0.760 0.857
dbcsr_data_release 444 7.6 0.643 0.761 0.643 0.761
dbcsr_redistribute 9 5.0 0.386 0.449 0.672 0.713
dbcsr_destroy 111 5.9 0.005 0.051 0.548 0.640
make_images_data 18 8.0 0.001 0.001 0.432 0.535
-------------------------------------------------------------------------------

Plot: name="dbcsr_timings_32omp", title="Timings of dbcsr with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="dbcsr_timings_32omp", name="rest", label="rest", y=8.506, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="dbcsr_redistribute", label="dbcsr_redistribute", y=48.127, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=15.407, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="dbcsr_make_random_matrix", label="dbcsr_make_random_matrix", y=13.893, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="tree_to_linear_d", label="tree_to_linear_d", y=3.115, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="mp_alltoall_d11v", label="mp_alltoall_d11v", y=3.11, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="dbcsr_data_release", label="dbcsr_data_release", y=2.425, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="mp_waitall_1", label="mp_waitall_1", y=0.0, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="mp_sum_l", label="mp_sum_l", y=0.0, yerr=0.0

Plot: name="dbcsr_timings_32mpi", title="Timings of dbcsr with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="dbcsr_timings_32mpi", name="rest", label="rest", y=2.948999999999998, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="dbcsr_redistribute", label="dbcsr_redistribute", y=0.386, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=17.169, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="dbcsr_make_random_matrix", label="dbcsr_make_random_matrix", y=0.885, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="tree_to_linear_d", label="tree_to_linear_d", y=0.0, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="mp_alltoall_d11v", label="mp_alltoall_d11v", y=0.0, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="dbcsr_data_release", label="dbcsr_data_release", y=0.643, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="mp_waitall_1", label="mp_waitall_1", y=3.57, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="mp_sum_l", label="mp_sum_l", y=0.535, yerr=0.0

Running MQAE_single_node.inp with 1 threads and 32 ranks... done.
Running MQAE_single_node.inp with 32 threads and 1 ranks... done.

From /workspace/artifacts/MQAE_single_node_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.042 0.042 138.974 138.974
qs_mol_dyn_low 1 2.0 0.004 0.004 137.130 137.130
velocity_verlet 5 3.0 0.004 0.004 110.838 110.838
qmmm_el_coupling 6 3.8 0.000 0.000 60.980 60.980
qmmm_elec_with_gaussian 6 4.8 0.182 0.182 60.974 60.974
qmmm_elec_with_gaussian_low 6 5.8 0.000 0.000 59.969 59.969
qmmm_elec_gaussian_low_G 6 6.8 58.571 58.571 58.571 58.571
qs_forces 6 3.8 0.001 0.001 56.052 56.052
qs_energies 6 4.8 0.000 0.000 49.871 49.871
scf_env_do_scf 6 5.8 0.001 0.001 46.131 46.131
scf_env_do_scf_inner_loop 39 6.8 0.003 0.003 38.916 38.916
rebuild_ks_matrix 45 8.4 0.000 0.000 38.689 38.689
qs_ks_build_kohn_sham_matrix 45 9.4 0.007 0.007 38.689 38.689
qs_ks_update_qs_env 45 7.8 0.000 0.000 33.231 33.231
pw_transfer 966 11.9 0.070 0.070 23.154 23.154
fft_wrap_pw1pw2 801 13.0 0.009 0.009 22.821 22.821
fft_wrap_pw1pw2_150 507 14.3 2.372 2.372 22.321 22.321
qs_vxc_create 45 10.4 0.001 0.001 20.941 20.941
xc_vxc_pw_create 45 11.4 4.297 4.297 20.940 20.940
fist_calc_energy_force 6 3.8 0.002 0.002 10.968 10.968
pw_scatter_s 429 15.4 10.375 10.375 10.375 10.375
qs_rho_update_rho 45 7.9 0.000 0.000 10.022 10.022
calculate_rho_elec 45 8.9 0.885 0.885 10.022 10.022
force_nonbond 6 4.8 9.738 9.738 9.738 9.738
xc_rho_set_and_dset_create 45 12.4 0.248 0.248 9.632 9.632
fft3d_s 802 15.0 8.738 8.738 8.748 8.748
qmmm_forces 6 3.8 0.001 0.001 8.314 8.314
qmmm_forces_with_gaussian 6 4.8 0.127 0.127 7.810 7.810
pw_integral_ab 2539 7.4 7.376 7.376 7.376 7.376
init_scf_loop 6 6.8 0.000 0.000 7.209 7.209
qmmm_force_with_gaussian_low 6 5.8 0.000 0.000 6.683 6.683
qs_ks_ddapc 45 10.4 0.001 0.001 6.503 6.503
qmmm_forces_gaussian_low_G 6 6.8 5.610 5.610 5.610 5.610
qs_ks_update_qs_env_forces 6 4.8 0.000 0.000 5.470 5.470
pw_poisson_solve 51 9.9 2.289 2.289 5.222 5.222
density_rs2pw 45 9.9 0.002 0.002 4.582 4.582
grid_collocate_task_list 45 9.9 4.554 4.554 4.554 4.554
sum_up_and_integrate 45 10.4 0.233 0.233 4.269 4.269
integrate_v_rspace 45 11.4 0.013 0.013 4.035 4.035
cp_ddapc_apply_CD 45 11.4 0.006 0.006 4.021 4.021
-------------------------------------------------------------------------------

From /workspace/artifacts/MQAE_single_node_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE CALLS ASD SELF TIME TOTAL TIME
MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM
CP2K 1 1.0 0.033 0.038 83.434 83.435
qs_mol_dyn_low 1 2.0 0.004 0.005 81.872 81.968
qs_forces 6 3.8 0.001 0.001 60.341 60.341
qs_energies 6 4.8 0.001 0.001 57.524 57.524
scf_env_do_scf 6 5.8 0.000 0.001 56.076 56.076
scf_env_do_scf_inner_loop 113 6.2 0.003 0.009 53.835 53.836
rebuild_ks_matrix 119 8.1 0.000 0.000 39.635 39.654
qs_ks_build_kohn_sham_matrix 119 9.1 0.020 0.022 39.635 39.654
qs_ks_update_qs_env 119 7.3 0.001 0.001 37.269 37.287
velocity_verlet 5 3.0 0.002 0.003 34.196 34.201
pw_transfer 2446 11.8 0.261 0.283 25.032 25.210
fft_wrap_pw1pw2 2059 12.8 0.033 0.036 24.272 24.499
fft_wrap_pw1pw2_150 1321 14.0 2.165 2.321 23.559 23.729
qs_vxc_create 119 10.1 0.003 0.004 20.010 20.014
xc_vxc_pw_create 119 11.1 0.438 0.621 20.006 20.010
fft3d_ps 2059 14.8 10.908 11.812 18.190 18.684
qs_rho_update_rho 119 7.3 0.001 0.001 15.713 15.714
calculate_rho_elec 119 8.3 0.086 0.095 15.713 15.714
sum_up_and_integrate 119 10.1 0.084 0.090 14.295 14.347
integrate_v_rspace 119 11.1 0.004 0.005 14.210 14.267
qmmm_forces 6 3.8 0.003 0.003 11.970 11.970
qmmm_forces_with_gaussian 6 4.8 0.366 0.446 11.593 11.802
rs_pw_transfer 988 11.5 0.015 0.017 10.643 11.275
density_rs2pw 119 9.3 0.010 0.012 9.342 9.880
xc_rho_set_and_dset_create 119 12.1 0.503 0.597 9.489 9.849
qmmm_el_coupling 6 3.8 0.000 0.000 8.451 8.522
qmmm_elec_with_gaussian 6 4.8 0.333 0.440 8.448 8.518
potential_pw2rs 119 12.1 0.010 0.011 8.326 8.340
grid_collocate_task_list 119 9.3 6.094 6.581 6.094 6.581
mp_alltoall_z22v 2059 16.8 4.396 6.016 4.396 6.016
qmmm_force_with_gaussian_low 6 5.8 0.000 0.000 5.756 5.838
grid_integrate_task_list 119 12.1 5.522 5.778 5.522 5.778
qmmm_forces_gaussian_low_G 6 6.8 4.711 4.790 4.711 4.790
rs_pw_transfer_PW2RS_150 125 13.9 2.405 2.501 4.647 4.701
pw_restrict_s3 18 5.8 2.317 2.525 4.335 4.495
rs_pw_transfer_RS2PW_150 125 11.2 1.946 2.129 3.840 4.438
yz_to_x 964 15.3 1.074 1.222 3.280 4.403
mp_waitany 4028 12.8 3.251 4.334 3.251 4.334
x_to_yz 1095 16.3 1.766 1.917 3.956 4.197
qs_scf_new_mos 113 7.2 0.001 0.001 3.604 3.612
qs_scf_loop_do_ot 113 8.2 0.001 0.001 3.604 3.612
qmmm_elec_with_gaussian:spline 6 5.8 0.000 0.000 3.499 3.567
pw_prolongate_s3 18 6.8 1.848 1.933 3.499 3.567
qmmm_elec_with_gaussian_low 6 5.8 0.000 0.000 3.359 3.494 ot_scf_mini 113 9.2 0.002 0.002 3.446 3.453 dbcsr_multiply_generic 2588 12.3 0.096 0.113 3.276 3.324 qs_ks_ddapc 119 10.1 0.003 0.004 2.766 2.917 pw_integral_ab 2761 7.7 2.118 2.141 2.468 2.701 qmmm_elec_gaussian_low_G 6 6.8 2.419 2.544 2.419 2.544 mp_sum_dm3 33 5.7 2.209 2.411 2.209 2.411 qs_ks_update_qs_env_forces 6 4.8 0.000 0.000 2.376 2.377 init_scf_loop 6 6.8 0.000 0.000 2.237 2.238 ot_mini 113 10.2 0.001 0.001 2.177 2.189 pw_gather_p 964 14.3 1.986 2.129 1.986 2.129 mp_waitall_1 188862 16.2 1.853 2.024 1.853 2.024 pw_scatter_p 1095 15.3 1.839 1.915 1.839 1.915 pw_derive 732 12.5 1.649 1.826 1.649 1.826 qs_ot_get_derivative 113 11.2 0.001 0.001 1.723 1.732 ------------------------------------------------------------------------------- Plot: name="MQAE_single_node_timings_32omp", title="Timings of MQAE_single_node with 32 OpenMP Threads", ylabel="time [s]" PlotPoint: plot="MQAE_single_node_timings_32omp", name="rest", label="rest", y=34.011999999999986, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32omp", name="qmmm_elec_gaussian_low_G", label="qmmm_elec_gaussian_low_G", y=58.571, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32omp", name="pw_scatter_s", label="pw_scatter_s", y=10.375, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32omp", name="force_nonbond", label="force_nonbond", y=9.738, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32omp", name="fft3d_s", label="fft3d_s", y=8.738, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32omp", name="pw_integral_ab", label="pw_integral_ab", y=7.376, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32omp", name="qmmm_forces_gaussian_low_G", label="qmmm_forces_gaussian_low_G", y=5.61, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32omp", name="grid_collocate_task_list", label="grid_collocate_task_list", y=4.554, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32omp", name="fft3d_ps", label="fft3d_ps", y=0.0, yerr=0.0 
PlotPoint: plot="MQAE_single_node_timings_32omp", name="mp_alltoall_z22v", label="mp_alltoall_z22v", y=0.0, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32omp", name="grid_integrate_task_list", label="grid_integrate_task_list", y=0.0, yerr=0.0 Plot: name="MQAE_single_node_timings_32mpi", title="Timings of MQAE_single_node with 32 MPI Ranks", ylabel="time [s]" PlotPoint: plot="MQAE_single_node_timings_32mpi", name="rest", label="rest", y=47.266, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="qmmm_elec_gaussian_low_G", label="qmmm_elec_gaussian_low_G", y=2.419, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="pw_scatter_s", label="pw_scatter_s", y=0.0, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="force_nonbond", label="force_nonbond", y=0.0, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="fft3d_s", label="fft3d_s", y=0.0, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="pw_integral_ab", label="pw_integral_ab", y=2.118, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="qmmm_forces_gaussian_low_G", label="qmmm_forces_gaussian_low_G", y=4.711, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="grid_collocate_task_list", label="grid_collocate_task_list", y=6.094, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="fft3d_ps", label="fft3d_ps", y=10.908, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="mp_alltoall_z22v", label="mp_alltoall_z22v", y=4.396, yerr=0.0 PlotPoint: plot="MQAE_single_node_timings_32mpi", name="grid_integrate_task_list", label="grid_integrate_task_list", y=5.522, yerr=0.0 Summary: Performance test took 56 minutes. Status: OK Uploading artifacts... done EndDate: 2021-12-16 20:49:33+00:00