StartDate: 2021-07-11 19:39:39+00:00
CpuId: 64x Intel Xeon W 2000 / D-2100 (Skylake / Cascade Lake) {Skylake}, 14nm
CommitSHA: 55d3731eea70b965a4015ad0e31caf946183de86
CommitTime: 2021-07-11 20:19:29 +0200
CommitAuthor: Tiziano Müller
CommitSubject: scf: add MO occupation print stat command
Trying to pull image cp2k-toolchain-mpich... success :-)
Trying to pull image cp2k-perf-openmp... image not found.
#################### Building Image cp2k-perf-openmp ####################
Dockerfile: /tools/docker/Dockerfile.test_performance
Build-Args: TOOLCHAIN=gcr.io/cp2k-org-project/img_cp2k-toolchain-mpich-arch-b51:gittree-2c6e181-buildargs-68b329d
Sending build context to Docker daemon 73.73kB
Step 1/9 : ARG TOOLCHAIN=cp2k/toolchain:latest
Step 2/9 : FROM ${TOOLCHAIN}
 ---> 629114bbc64e
Step 3/9 : WORKDIR /workspace
 ---> Running in bfed46f203a8
Removing intermediate container bfed46f203a8
 ---> 9c66f91925f0
Step 4/9 : COPY ./scripts/install_basics.sh .
 ---> cb293a754af9
Step 5/9 : RUN ./install_basics.sh
 ---> Running in c59c564a5741
Installing Ubuntu packages... debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package libpopt0:amd64.
(Reading database ... 14724 files and directories currently installed.)
Preparing to unpack .../libpopt0_1.16-14_amd64.deb ...
Unpacking libpopt0:amd64 (1.16-14) ...
Selecting previously unselected package rsync.
Preparing to unpack .../rsync_3.1.3-8_amd64.deb ...
Unpacking rsync (3.1.3-8) ...
Setting up libpopt0:amd64 (1.16-14) ...
Setting up rsync (3.1.3-8) ...
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of start.
Processing triggers for libc-bin (2.31-0ubuntu9.2) ...
done.
Cloning cp2k repository... done.
Removing intermediate container c59c564a5741
 ---> 4797a89d4dd1
Step 6/9 : COPY ./scripts/install_performance.sh .
 ---> a9a4cadc2194
Step 7/9 : RUN ./install_performance.sh "local"
 ---> Running in 9794c1cdaef0
'./local.pdbg' -> '/opt/cp2k-toolchain/install/arch/local.pdbg'
'./local.psmp' -> '/opt/cp2k-toolchain/install/arch/local.psmp'
'./local.sdbg' -> '/opt/cp2k-toolchain/install/arch/local.sdbg'
'./local.ssmp' -> '/opt/cp2k-toolchain/install/arch/local.ssmp'
'./local_coverage.pdbg' -> '/opt/cp2k-toolchain/install/arch/local_coverage.pdbg'
'./local_static.psmp' -> '/opt/cp2k-toolchain/install/arch/local_static.psmp'
'./local_static.ssmp' -> '/opt/cp2k-toolchain/install/arch/local_static.ssmp'
'./local_warn.psmp' -> '/opt/cp2k-toolchain/install/arch/local_warn.psmp'
Warming cache by trying to compile cp2k... done.
Removing intermediate container 9794c1cdaef0
 ---> 4fb73b510d1b
Step 8/9 : COPY ./scripts/ci_entrypoint.sh ./scripts/test_performance.sh ./scripts/plot_performance.py ./
 ---> 3994fef93f55
Step 9/9 : CMD ["./ci_entrypoint.sh", "./test_performance.sh", "local"]
 ---> Running in 18e753968bd1
Removing intermediate container 18e753968bd1
 ---> 6e3a0dd19b42
Successfully built 6e3a0dd19b42
Successfully tagged gcr.io/cp2k-org-project/img_cp2k-perf-openmp-arch-b51:gittree-89e326b-buildargs-fb7e31a
Pushing image cp2k-perf-openmp... done.
#################### Running Image cp2k-perf-openmp ####################
========== Fetching Git Commit ==========
CommitSHA: 55d3731eea70b965a4015ad0e31caf946183de86
CommitTime: 2021-07-11 20:19:29 +0200
CommitAuthor: Tiziano Müller
CommitSubject: scf: add MO occupation print stat command
========== Running Test ==========
========== Compiling CP2K ==========
Compiling cp2k... done.
========== Running Performance Test ==========
Running H2O-64.inp with 1 threads and 32 ranks... done.
Running H2O-64.inp with 32 threads and 1 ranks... done.
From /workspace/artifacts/H2O-64_32omp.out:
 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.039    0.039  154.721  154.721
 qs_mol_dyn_low                       1  2.0    0.004    0.004  153.910  153.910
 qs_forces                           11  3.9    0.002    0.002  153.852  153.852
 qs_energies                         11  4.9    0.001    0.001  142.676  142.676
 scf_env_do_scf                      11  5.9    0.001    0.001  114.348  114.348
 velocity_verlet                     10  3.0    0.002    0.002  103.981  103.981
 scf_env_do_scf_inner_loop          108  6.5    0.010    0.010   88.216   88.216
 rebuild_ks_matrix                  119  8.3    0.001    0.001   41.733   41.733
 qs_ks_build_kohn_sham_matrix       119  9.3    0.018    0.018   41.733   41.733
 qs_ks_update_qs_env                119  7.6    0.001    0.001   37.338   37.338
 qs_rho_update_rho                  119  7.7    0.001    0.001   35.331   35.331
 calculate_rho_elec                 119  8.7    1.519    1.519   35.330   35.330
 sum_up_and_integrate               119 10.3    0.422    0.422   29.945   29.945
 grid_collocate_task_list           119  9.7   29.619   29.619   29.619   29.619
 integrate_v_rspace                 119 11.3    0.129    0.129   29.523   29.523
 grid_integrate_task_list           119 12.3   27.049   27.049   27.049   27.049
 init_scf_loop                       11  6.9    0.000    0.000   25.932   25.932
 qs_scf_new_mos                     108  7.5    0.001    0.001   22.914   22.914
 qs_scf_loop_do_ot                  108  8.5    0.001    0.001   22.913   22.913
 dbcsr_multiply_generic            2286 12.5    0.162    0.162   21.865   21.865
 ot_scf_mini                        108  9.5    0.003    0.003   21.616   21.616
 prepare_preconditioner              11  7.9    0.000    0.000   21.257   21.257
 make_preconditioner                 11  8.9    0.000    0.000   21.257   21.257
 make_full_inverse_cholesky          11  9.9    0.000    0.000   19.084   19.084
 ot_mini                            108 10.5    0.001    0.001   14.131   14.131
 init_scf_run                        11  5.9    0.001    0.001   14.071   14.071
 scf_env_initial_rho_setup           11  6.9    0.001    0.001   14.070   14.070
 wfi_extrapolate                     11  7.9    0.001    0.001   13.236   13.236
 make_m2s                          4572 13.5    0.061    0.061   13.131   13.131
 cp_gemm                             81  9.0    0.001    0.001   10.952   10.952
 cp_gemm_cosma                       81 10.0   10.951   10.951   10.951   10.951
 qs_energies_init_hamiltonians       11  5.9    0.000    0.000   10.472   10.472
 ot_diis_step                       108 11.5    0.005    0.005    7.415    7.415
 cp_fm_cholesky_decompose            22 10.9    7.373    7.373    7.373    7.373
 pw_transfer                       1439 11.6    0.093    0.093    7.168    7.168
 fft_wrap_pw1pw2                   1201 12.6    0.010    0.010    6.910    6.910
 dbcsr_complete_redistribute        329 12.2    3.186    3.186    6.831    6.831
 make_images                       4572 14.5    2.564    2.564    6.739    6.739
 dbcsr_make_dense_low              5837 15.5    0.087    0.087    6.716    6.716
 qs_ot_get_derivative               108 11.5    0.001    0.001    6.711    6.711
 make_dense_data                   5837 16.5    5.951    5.951    6.612    6.612
 apply_preconditioner_dbcsr         119 12.6    0.000    0.000    6.383    6.383
 apply_single                       119 13.6    0.000    0.000    6.383    6.383
 qs_ks_update_qs_env_forces          11  4.9    0.000    0.000    6.297    6.297
 qs_env_update_s_mstruct             11  6.9    0.000    0.000    6.279    6.279
 dbcsr_make_images_dense           3978 14.8    0.023    0.023    6.001    6.001
 fft_wrap_pw1pw2_140                487 13.2    0.541    0.541    5.812    5.812
 qs_create_task_list                 11  7.9    0.000    0.000    5.734    5.734
 generate_qs_task_list               11  8.9    3.865    3.865    5.733    5.733
 copy_dbcsr_to_fm                   153 11.3    0.003    0.003    5.565    5.565
 cp_fm_cholesky_invert               11 10.9    4.943    4.943    4.943    4.943
 build_core_hamiltonian_matrix_      11  4.9    0.001    0.001    4.877    4.877
 pw_poisson_solve                   119 10.3    2.009    2.009    4.730    4.730
 transfer_dbcsr_to_fm                11 10.9    0.000    0.000    4.628    4.628
 multiply_cannon                   2286 13.5    0.255    0.255    4.615    4.615
 dbcsr_copy                        2102 12.0    0.236    0.236    4.449    4.449
 density_rs2pw                      119  9.7    0.006    0.006    4.192    4.192
 dbcsr_copy_into_existing            22  7.9    4.171    4.171    4.172    4.172
 multiply_cannon_loop              2286 14.5    0.047    0.047    3.945    3.945
 multiply_cannon_multrec           2286 15.5    3.835    3.835    3.896    3.896
 qs_ot_get_p                        119 10.4    0.001    0.001    3.839    3.839
 qs_energies_compute_matrix_w        11  5.9    0.000    0.000    3.596    3.596
 calculate_w_matrix_ot               11  6.9    0.009    0.009    3.596    3.596
 build_core_hamiltonian_matrix       11  6.9    0.001    0.001    3.565    3.565
 copy_fm_to_dbcsr                   176 11.2    0.002    0.002    3.422    3.422
 -------------------------------------------------------------------------------
From /workspace/artifacts/H2O-64_32mpi.out:
 -------------------------------------------------------------------------------
 -                                                                             -
 -                                T I M I N G                                  -
 -                                                                             -
 -------------------------------------------------------------------------------
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.010    0.015   71.388   71.389
 qs_mol_dyn_low                       1  2.0    0.005    0.005   71.265   71.271
 qs_forces                           11  3.9    0.002    0.002   71.210   71.210
 qs_energies                         11  4.9    0.001    0.002   66.165   66.167
 scf_env_do_scf                      11  5.9    0.001    0.001   60.983   60.985
 scf_env_do_scf_inner_loop          108  6.5    0.003    0.011   56.492   56.492
 velocity_verlet                     10  3.0    0.002    0.002   42.075   42.077
 rebuild_ks_matrix                  119  8.3    0.001    0.001   28.446   28.477
 qs_ks_build_kohn_sham_matrix       119  9.3    0.020    0.022   28.445   28.477
 qs_ks_update_qs_env                119  7.6    0.001    0.001   25.175   25.206
 qs_rho_update_rho                  119  7.7    0.001    0.001   22.635   22.647
 calculate_rho_elec                 119  8.7    0.048    0.049   22.634   22.646
 sum_up_and_integrate               119 10.3    0.044    0.048   22.611   22.642
 integrate_v_rspace                 119 11.3    0.004    0.005   22.567   22.597
 grid_collocate_task_list           119  9.7   16.511   17.244   16.511   17.244
 grid_integrate_task_list           119 12.3   16.470   17.144   16.470   17.144
 dbcsr_multiply_generic            2286 12.5    0.119    0.123   16.352   16.474
 qs_scf_new_mos                     108  7.5    0.001    0.001   13.353   13.388
 qs_scf_loop_do_ot                  108  8.5    0.001    0.001   13.352   13.387
 ot_scf_mini                        108  9.5    0.003    0.003   12.532   12.569
 multiply_cannon                   2286 13.5    0.223    0.227   11.013   11.275
 multiply_cannon_loop              2286 14.5    0.200    0.214    9.988   10.271
 mp_waitall_1                    169478 16.3    8.032    8.282    8.032    8.282
 ot_mini                            108 10.5    0.001    0.001    7.454    7.495
 rs_pw_transfer                     974 11.9    0.016    0.017    6.502    7.230
 density_rs2pw                      119  9.7    0.008    0.008    5.558    6.302
 pw_transfer                       1439 11.6    0.143    0.154    5.525    5.600
 multiply_cannon_metrocomm3       18288 15.5    0.068    0.073    5.084    5.420
 fft_wrap_pw1pw2                   1201 12.6    0.013    0.014    5.236    5.320
 potential_pw2rs                    119 12.3    0.009    0.010    4.931    4.940
 fft_wrap_pw1pw2_140                487 13.2    0.525    0.558    4.535    4.703
 init_scf_loop                       11  6.9    0.000    0.001    4.474    4.475
 multiply_cannon_multrec          18288 15.5    3.814    3.972    3.829    3.987
 fft3d_ps                          1201 14.6    2.144    2.268    3.894    3.968
 ot_diis_step                       108 11.5    0.004    0.005    3.759    3.760
 apply_preconditioner_dbcsr         119 12.6    0.000    0.000    3.693    3.727
 apply_single                       119 13.6    0.001    0.001    3.692    3.727
 make_m2s                          4572 13.5    0.070    0.074    3.647    3.704
 qs_ot_get_derivative               108 11.5    0.001    0.001    3.659    3.697
 init_scf_run                        11  5.9    0.000    0.002    3.549    3.549
 scf_env_initial_rho_setup           11  6.9    0.000    0.001    3.549    3.549
 qs_ks_update_qs_env_forces          11  4.9    0.000    0.000    3.478    3.486
 wfi_extrapolate                     11  7.9    0.001    0.001    3.187    3.187
 make_images                       4572 14.5    0.180    0.185    3.018    3.077
 mp_waitany                        9880 13.7    2.258    2.962    2.258    2.962
 rs_pw_transfer_RS2PW_140           130 11.5    0.498    0.536    2.031    2.771
 rs_pw_transfer_PW2RS_140           130 13.9    1.153    1.211    2.432    2.465
 mp_alltoall_d11v                  2130 13.8    1.433    1.962    1.433    1.962
 qs_ot_get_p                        119 10.4    0.001    0.001    1.742    1.777
 rs_gather_matrices                 119 12.3    0.131    0.144    1.116    1.601
 prepare_preconditioner              11  7.9    0.000    0.000    1.523    1.537
 make_preconditioner                 11  8.9    0.000    0.000    1.523    1.537
 make_images_data                  4572 15.5    0.056    0.060    1.411    1.525
 build_core_hamiltonian_matrix_      11  4.9    0.001    0.001    1.426    1.522
 -------------------------------------------------------------------------------
Plot: name="H2O-64_timings_32omp", title="Timings of H2O-64 with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="H2O-64_timings_32omp", name="rest", label="rest", y=69.94300000000001, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="grid_collocate_task_list", label="grid_collocate_task_list", y=29.619, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="grid_integrate_task_list", label="grid_integrate_task_list", y=27.049, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="cp_gemm_cosma", label="cp_gemm_cosma", y=10.951, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="cp_fm_cholesky_decompose", label="cp_fm_cholesky_decompose", y=7.373, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="make_dense_data", label="make_dense_data", y=5.951, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=3.835, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="mp_waitany", label="mp_waitany", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_timings_32omp", name="mp_waitall_1", label="mp_waitall_1", y=0.0, yerr=0.0
Plot: name="H2O-64_timings_32mpi", title="Timings of H2O-64 with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="H2O-64_timings_32mpi", name="rest", label="rest", y=24.30300000000001, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="grid_collocate_task_list", label="grid_collocate_task_list", y=16.511, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="grid_integrate_task_list", label="grid_integrate_task_list", y=16.47, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="cp_gemm_cosma", label="cp_gemm_cosma", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="cp_fm_cholesky_decompose", label="cp_fm_cholesky_decompose", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="make_dense_data", label="make_dense_data", y=0.0, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=3.814, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="mp_waitany", label="mp_waitany", y=2.258, yerr=0.0
PlotPoint: plot="H2O-64_timings_32mpi", name="mp_waitall_1", label="mp_waitall_1", y=8.032, yerr=0.0
Running H2O-64_nonortho.inp with 1 threads and 32 ranks... done.
Running H2O-64_nonortho.inp with 32 threads and 1 ranks... done.
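Note on the "rest" series: the H2O-64 PlotPoint values are consistent with "rest" being the average total time of the top-level CP2K timer minus the summed average self times of the routines plotted individually. The sketch below reproduces both H2O-64 "rest" values from numbers copied out of the timing tables. It is an illustration only: `rest_time` is a hypothetical helper, and it is an assumption (not confirmed by this log) that plot_performance.py derives the value this way.

```python
def rest_time(total, plotted_self_times):
    """Return the time not attributed to any individually plotted routine.

    total: average TOTAL TIME of the top-level "CP2K" timer, in seconds.
    plotted_self_times: routine name -> average SELF TIME, in seconds.
    """
    return total - sum(plotted_self_times.values())


# Average self times of the routines plotted for H2O-64 with 32 OpenMP threads.
omp_self = {
    "grid_collocate_task_list": 29.619,
    "grid_integrate_task_list": 27.049,
    "cp_gemm_cosma": 10.951,
    "cp_fm_cholesky_decompose": 7.373,
    "make_dense_data": 5.951,
    "multiply_cannon_multrec": 3.835,
}

# Average self times of the routines plotted for H2O-64 with 32 MPI ranks.
mpi_self = {
    "grid_collocate_task_list": 16.511,
    "grid_integrate_task_list": 16.470,
    "multiply_cannon_multrec": 3.814,
    "mp_waitany": 2.258,
    "mp_waitall_1": 8.032,
}

omp_rest = rest_time(154.721, omp_self)
mpi_rest = rest_time(71.388, mpi_self)

print(round(omp_rest, 3))
print(round(mpi_rest, 3))
```

Rounded to three decimals this gives 69.943 for the OpenMP run and 24.303 for the MPI run, which matches the logged "rest" values up to floating-point noise.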
From /workspace/artifacts/H2O-64_nonortho_32omp.out: ------------------------------------------------------------------------------- - - - T I M I N G - - - ------------------------------------------------------------------------------- SUBROUTINE CALLS ASD SELF TIME TOTAL TIME MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM CP2K 1 1.0 0.036 0.036 193.453 193.453 qs_mol_dyn_low 1 2.0 0.004 0.004 192.600 192.600 qs_forces 11 3.9 0.002 0.002 192.531 192.531 qs_energies 11 4.9 0.001 0.001 179.351 179.351 scf_env_do_scf 11 5.9 0.001 0.001 147.175 147.175 velocity_verlet 10 3.0 0.002 0.002 127.460 127.460 scf_env_do_scf_inner_loop 96 6.5 0.009 0.009 117.979 117.979 rebuild_ks_matrix 107 8.3 0.001 0.001 62.121 62.121 qs_ks_build_kohn_sham_matrix 107 9.3 0.017 0.017 62.120 62.120 qs_ks_update_qs_env 107 7.6 0.001 0.001 55.632 55.632 qs_rho_update_rho 107 7.7 0.001 0.001 55.341 55.341 calculate_rho_elec 107 8.7 1.364 1.364 55.340 55.340 sum_up_and_integrate 107 10.3 0.372 0.372 51.306 51.306 integrate_v_rspace 107 11.3 0.118 0.118 50.934 50.934 grid_collocate_task_list 107 9.7 50.079 50.079 50.079 50.079 grid_integrate_task_list 107 12.3 48.631 48.631 48.631 48.631 init_scf_loop 11 6.9 0.000 0.000 28.987 28.987 prepare_preconditioner 11 7.9 0.000 0.000 21.589 21.589 make_preconditioner 11 8.9 0.000 0.000 21.589 21.589 qs_scf_new_mos 96 7.5 0.001 0.001 19.433 19.433 qs_scf_loop_do_ot 96 8.5 0.001 0.001 19.433 19.433 make_full_inverse_cholesky 11 9.9 0.000 0.000 19.413 19.413 dbcsr_multiply_generic 1966 12.4 0.144 0.144 18.535 18.535 ot_scf_mini 96 9.5 0.003 0.003 18.269 18.269 init_scf_run 11 5.9 0.001 0.001 16.468 16.468 scf_env_initial_rho_setup 11 6.9 0.001 0.001 16.467 16.467 wfi_extrapolate 11 7.9 0.001 0.001 15.367 15.367 qs_energies_init_hamiltonians 11 5.9 0.000 0.000 11.883 11.883 ot_mini 96 10.5 0.001 0.001 11.710 11.710 make_m2s 3932 13.4 0.053 0.053 10.867 10.867 cp_gemm 81 9.0 0.001 0.001 10.865 10.865 cp_gemm_cosma 81 10.0 10.864 10.864 10.864 10.864 
qs_ks_update_qs_env_forces 11 4.9 0.000 0.000 8.223 8.223 qs_env_update_s_mstruct 11 6.9 0.000 0.000 7.628 7.628 cp_fm_cholesky_decompose 22 10.9 7.546 7.546 7.546 7.546 qs_create_task_list 11 7.9 0.000 0.000 7.078 7.078 generate_qs_task_list 11 8.9 5.180 5.180 7.078 7.078 dbcsr_complete_redistribute 317 12.2 3.171 3.171 6.979 6.979 pw_transfer 1295 11.6 0.090 0.090 6.693 6.693 fft_wrap_pw1pw2 1081 12.6 0.009 0.009 6.447 6.447 qs_ot_get_derivative 96 11.5 0.001 0.001 5.875 5.875 ot_diis_step 96 11.5 0.005 0.005 5.832 5.832 copy_dbcsr_to_fm 147 11.2 0.003 0.003 5.712 5.712 make_images 3932 14.4 2.128 2.128 5.660 5.660 dbcsr_make_dense_low 4961 15.5 0.075 0.075 5.503 5.503 fft_wrap_pw1pw2_140 439 13.2 0.554 0.554 5.451 5.451 make_dense_data 4961 16.5 4.868 4.868 5.414 5.414 apply_preconditioner_dbcsr 107 12.6 0.000 0.000 5.070 5.070 apply_single 107 13.6 0.000 0.000 5.070 5.070 cp_fm_cholesky_invert 11 10.9 4.961 4.961 4.961 4.961 build_core_hamiltonian_matrix_ 11 4.9 0.001 0.001 4.955 4.955 dbcsr_make_images_dense 3386 14.7 0.021 0.021 4.872 4.872 transfer_dbcsr_to_fm 11 10.9 0.000 0.000 4.731 4.731 dbcsr_copy 1855 11.9 0.220 0.220 4.539 4.539 pw_poisson_solve 107 10.3 2.047 2.047 4.516 4.516 dbcsr_copy_into_existing 22 7.9 4.279 4.279 4.279 4.279 multiply_cannon 1966 13.4 0.227 0.227 4.141 4.141 density_rs2pw 107 9.7 0.005 0.005 3.897 3.897 ------------------------------------------------------------------------------- From /workspace/artifacts/H2O-64_nonortho_32mpi.out: ------------------------------------------------------------------------------- - - - T I M I N G - - - ------------------------------------------------------------------------------- SUBROUTINE CALLS ASD SELF TIME TOTAL TIME MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM CP2K 1 1.0 0.007 0.010 123.015 123.016 qs_mol_dyn_low 1 2.0 0.005 0.005 122.903 122.909 qs_forces 11 3.9 0.002 0.002 122.851 122.851 qs_energies 11 4.9 0.001 0.001 114.294 114.296 scf_env_do_scf 11 5.9 0.001 0.001 106.380 106.380 
scf_env_do_scf_inner_loop 96 6.5 0.003 0.009 98.869 98.869 velocity_verlet 10 3.0 0.002 0.002 72.904 72.905 rebuild_ks_matrix 107 8.3 0.001 0.001 56.841 56.868 qs_ks_build_kohn_sham_matrix 107 9.3 0.018 0.019 56.840 56.868 sum_up_and_integrate 107 10.3 0.038 0.041 51.719 51.748 integrate_v_rspace 107 11.3 0.004 0.004 51.682 51.712 qs_ks_update_qs_env 107 7.6 0.001 0.001 50.022 50.048 qs_rho_update_rho 107 7.7 0.001 0.001 48.209 48.219 calculate_rho_elec 107 8.7 0.043 0.044 48.208 48.218 grid_integrate_task_list 107 12.3 45.575 46.799 45.575 46.799 grid_collocate_task_list 107 9.7 42.620 43.514 42.620 43.514 dbcsr_multiply_generic 1966 12.4 0.102 0.106 14.016 14.071 qs_scf_new_mos 96 7.5 0.001 0.001 11.280 11.306 qs_scf_loop_do_ot 96 8.5 0.001 0.001 11.279 11.306 ot_scf_mini 96 9.5 0.003 0.003 10.580 10.600 multiply_cannon 1966 13.4 0.192 0.199 9.484 9.653 multiply_cannon_loop 1966 14.4 0.169 0.177 8.638 8.882 init_scf_loop 11 6.9 0.000 0.000 7.496 7.496 rs_pw_transfer 878 11.9 0.014 0.016 6.091 7.323 mp_waitall_1 146670 16.2 6.876 7.162 6.876 7.162 qs_ks_update_qs_env_forces 11 4.9 0.000 0.000 7.005 7.012 density_rs2pw 107 9.7 0.007 0.007 5.095 6.358 ot_mini 96 10.5 0.001 0.001 6.297 6.320 init_scf_run 11 5.9 0.000 0.001 6.271 6.271 scf_env_initial_rho_setup 11 6.9 0.000 0.001 6.271 6.271 wfi_extrapolate 11 7.9 0.001 0.001 5.680 5.680 pw_transfer 1295 11.6 0.129 0.137 4.837 4.898 fft_wrap_pw1pw2 1081 12.6 0.012 0.013 4.583 4.645 multiply_cannon_metrocomm3 15728 15.4 0.058 0.060 4.327 4.613 potential_pw2rs 107 12.3 0.008 0.009 4.427 4.441 fft_wrap_pw1pw2_140 439 13.2 0.471 0.490 3.999 4.116 mp_waitany 8968 13.7 2.282 3.551 2.282 3.551 multiply_cannon_multrec 15728 15.4 3.378 3.489 3.390 3.502 fft3d_ps 1081 14.6 1.896 1.990 3.379 3.447 mp_alltoall_d11v 1998 13.7 1.923 3.292 1.923 3.292 apply_preconditioner_dbcsr 107 12.6 0.000 0.000 3.229 3.254 apply_single 107 13.6 0.000 0.001 3.229 3.254 rs_pw_transfer_RS2PW_140 118 11.5 0.419 0.442 2.012 3.233 ot_diis_step 96 11.5 
0.004 0.004 3.219 3.219 make_m2s 3932 13.4 0.061 0.063 3.144 3.182 qs_ot_get_derivative 96 11.5 0.001 0.001 3.051 3.073 rs_gather_matrices 107 12.3 0.116 0.127 1.630 2.983 make_images 3932 14.4 0.157 0.162 2.596 2.634 ------------------------------------------------------------------------------- Plot: name="H2O-64_nonortho_timings_32omp", title="Timings of H2O-64_nonortho with 32 OpenMP Threads", ylabel="time [s]" PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="rest", label="rest", y=71.15299999999999, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="grid_collocate_task_list", label="grid_collocate_task_list", y=50.079, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="grid_integrate_task_list", label="grid_integrate_task_list", y=48.631, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="cp_gemm_cosma", label="cp_gemm_cosma", y=10.864, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="cp_fm_cholesky_decompose", label="cp_fm_cholesky_decompose", y=7.546, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="generate_qs_task_list", label="generate_qs_task_list", y=5.18, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="mp_waitany", label="mp_waitany", y=0.0, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="mp_waitall_1", label="mp_waitall_1", y=0.0, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=0.0, yerr=0.0 Plot: name="H2O-64_nonortho_timings_32mpi", title="Timings of H2O-64_nonortho with 32 MPI Ranks", ylabel="time [s]" PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="rest", label="rest", y=22.284000000000006, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="grid_collocate_task_list", label="grid_collocate_task_list", y=42.62, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="grid_integrate_task_list", label="grid_integrate_task_list", y=45.575, 
yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="cp_gemm_cosma", label="cp_gemm_cosma", y=0.0, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="cp_fm_cholesky_decompose", label="cp_fm_cholesky_decompose", y=0.0, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="generate_qs_task_list", label="generate_qs_task_list", y=0.0, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="mp_waitany", label="mp_waitany", y=2.282, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="mp_waitall_1", label="mp_waitall_1", y=6.876, yerr=0.0 PlotPoint: plot="H2O-64_nonortho_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=3.378, yerr=0.0 Running H2O-hyb.inp with 1 threads and 32 ranks... done. Running H2O-hyb.inp with 32 threads and 1 ranks... done. From /workspace/artifacts/H2O-hyb_32omp.out: ------------------------------------------------------------------------------- - - - T I M I N G - - - ------------------------------------------------------------------------------- SUBROUTINE CALLS ASD SELF TIME TOTAL TIME MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM CP2K 1 1.0 0.376 0.376 247.374 247.374 qs_energies 1 2.0 0.000 0.000 246.145 246.145 scf_env_do_scf 1 3.0 0.000 0.000 243.569 243.569 qs_ks_update_qs_env 8 5.0 0.000 0.000 234.827 234.827 rebuild_ks_matrix 7 6.0 0.000 0.000 234.714 234.714 qs_ks_build_kohn_sham_matrix 7 7.0 0.002 0.002 234.713 234.713 hfx_ks_matrix 7 8.0 0.000 0.000 170.378 170.378 integrate_four_center 7 9.0 2.474 2.474 170.353 170.353 integrate_four_center_main 7 10.0 1.421 1.421 158.699 158.699 integrate_four_center_bin 446 11.0 157.277 157.277 157.277 157.277 scf_env_do_scf_inner_loop 7 4.0 0.001 0.001 145.747 145.747 init_scf_loop 1 4.0 0.000 0.000 97.805 97.805 cp_gemm 129 10.3 0.001 0.001 49.855 49.855 cp_gemm_cosma 129 11.3 49.853 49.853 49.853 49.853 admm_mo_calc_rho_aux 7 8.0 0.000 0.000 29.535 29.535 admm_fit_mo_coeffs 7 9.0 0.000 0.000 26.976 26.976 
admm_mo_merge_derivs 7 8.0 0.000 0.000 24.652 24.652 merge_mo_derivs_diag 7 9.0 0.023 0.023 24.652 24.652 purify_mo_diag 7 10.0 0.001 0.001 13.820 13.820 fit_mo_coeffs 7 10.0 0.000 0.000 13.156 13.156 integrate_four_center_load 7 10.0 0.001 0.001 8.804 8.804 hfx_load_balance 1 11.0 0.003 0.003 8.803 8.803 calculate_rho_elec 15 7.4 0.189 0.189 5.989 5.989 grid_collocate_task_list 15 8.4 5.262 5.262 5.262 5.262 ------------------------------------------------------------------------------- From /workspace/artifacts/H2O-hyb_32mpi.out: ------------------------------------------------------------------------------- - - - T I M I N G - - - ------------------------------------------------------------------------------- SUBROUTINE CALLS ASD SELF TIME TOTAL TIME MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM CP2K 1 1.0 0.014 0.017 183.962 183.963 qs_energies 1 2.0 0.001 0.001 183.816 183.817 scf_env_do_scf 1 3.0 0.000 0.001 183.216 183.216 qs_ks_update_qs_env 8 5.0 0.000 0.000 180.198 180.198 rebuild_ks_matrix 7 6.0 0.000 0.000 180.178 180.179 qs_ks_build_kohn_sham_matrix 7 7.0 0.002 0.004 180.178 180.179 hfx_ks_matrix 7 8.0 0.000 0.001 171.833 171.839 integrate_four_center 7 9.0 0.216 0.600 171.821 171.825 integrate_four_center_main 7 10.0 0.004 0.005 156.530 161.837 integrate_four_center_bin 448 11.0 156.525 161.833 156.525 161.833 scf_env_do_scf_inner_loop 7 4.0 0.001 0.001 107.484 107.484 init_scf_loop 1 4.0 0.000 0.000 75.731 75.731 integrate_four_center_load 7 10.0 0.000 0.000 8.916 8.918 hfx_load_balance 1 11.0 0.001 0.001 8.916 8.918 mp_sync 70 11.3 5.329 7.510 5.329 7.510 hfx_load_balance_bin 1 12.0 4.299 4.475 4.299 4.475 hfx_load_balance_count 1 12.0 4.290 4.433 4.290 4.433 qs_vxc_create 14 8.0 0.000 0.000 3.870 3.871 xc_vxc_pw_create 14 9.0 0.021 0.024 3.869 3.871 ------------------------------------------------------------------------------- Plot: name="H2O-hyb_timings_32omp", title="Timings of H2O-hyb with 32 OpenMP Threads", ylabel="time [s]" PlotPoint: 
plot="H2O-hyb_timings_32omp", name="rest", label="rest", y=31.087000000000018, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32omp", name="integrate_four_center_bin", label="integrate_four_center_bin", y=157.277, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32omp", name="cp_gemm_cosma", label="cp_gemm_cosma", y=49.853, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32omp", name="grid_collocate_task_list", label="grid_collocate_task_list", y=5.262, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32omp", name="integrate_four_center", label="integrate_four_center", y=2.474, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32omp", name="integrate_four_center_main", label="integrate_four_center_main", y=1.421, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32omp", name="mp_sync", label="mp_sync", y=0.0, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32omp", name="hfx_load_balance_count", label="hfx_load_balance_count", y=0.0, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32omp", name="hfx_load_balance_bin", label="hfx_load_balance_bin", y=0.0, yerr=0.0 Plot: name="H2O-hyb_timings_32mpi", title="Timings of H2O-hyb with 32 MPI Ranks", ylabel="time [s]" PlotPoint: plot="H2O-hyb_timings_32mpi", name="rest", label="rest", y=13.298999999999978, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32mpi", name="integrate_four_center_bin", label="integrate_four_center_bin", y=156.525, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32mpi", name="cp_gemm_cosma", label="cp_gemm_cosma", y=0.0, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32mpi", name="grid_collocate_task_list", label="grid_collocate_task_list", y=0.0, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32mpi", name="integrate_four_center", label="integrate_four_center", y=0.216, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32mpi", name="integrate_four_center_main", label="integrate_four_center_main", y=0.004, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32mpi", name="mp_sync", label="mp_sync", y=5.329, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32mpi", name="hfx_load_balance_count", 
label="hfx_load_balance_count", y=4.29, yerr=0.0 PlotPoint: plot="H2O-hyb_timings_32mpi", name="hfx_load_balance_bin", label="hfx_load_balance_bin", y=4.299, yerr=0.0 Running GW_PBE_4benzene.inp with 1 threads and 32 ranks... done. Running GW_PBE_4benzene.inp with 32 threads and 1 ranks... done. From /workspace/artifacts/GW_PBE_4benzene_32omp.out: ------------------------------------------------------------------------------- - - - T I M I N G - - - ------------------------------------------------------------------------------- SUBROUTINE CALLS ASD SELF TIME TOTAL TIME MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM CP2K 1 1.0 0.019 0.019 366.020 366.020 qs_energies 1 2.0 0.000 0.000 365.501 365.501 mp2_main 1 3.0 0.000 0.000 361.108 361.108 mp2_gpw_main 1 4.0 0.001 0.001 360.930 360.930 rpa_ri_compute_en 1 5.0 0.000 0.000 339.817 339.817 rpa_num_int 1 6.0 0.001 0.001 339.789 339.789 compute_mat_P_omega 1 7.0 0.002 0.002 199.809 199.809 compute_mat_P_omega_contract 10 8.0 13.432 13.432 198.571 198.571 dbcsr_t_total 2336 9.6 0.016 0.016 186.901 186.901 cp_gemm 105 8.4 0.001 0.001 116.841 116.841 cp_gemm_cosma 105 9.4 116.840 116.840 116.840 116.840 dbcsr_t_contract 787 11.0 49.634 49.634 108.695 108.695 GW_matrix_operations 10 7.0 0.006 0.006 77.385 77.385 dbcsr_t_copy 1103 10.7 21.389 21.389 76.693 76.693 compute_mat_P_omega_calc_M_occ 250 9.0 13.417 13.417 73.744 73.744 dbcsr_tas_total 1149 12.2 0.048 0.048 52.429 52.429 dbcsr_tas_multiply 807 12.1 0.002 0.002 50.939 50.939 compute_mat_P_omega_calc_M_vir 250 9.0 0.001 0.001 44.789 44.789 rpa_num_int_RPA_matrix_operati 10 7.0 0.000 0.000 40.893 40.893 contract_P_omega_with_mat_L 10 8.0 0.000 0.000 38.854 38.854 dbcsr_multiply_generic 837 15.8 0.138 0.138 37.555 37.555 dbcsr_tas_dbcsr 807 14.1 0.002 0.002 37.305 37.305 dbcsr_tas_reserve_blocks_index 3261 13.7 7.431 7.431 30.823 30.823 dbcsr_tas_copy 574 11.4 18.032 18.032 26.532 26.532 dbcsr_tas_mm_1N 524 15.1 0.002 0.002 26.230 26.230 multiply_cannon 837 16.8 0.369 0.369 
24.458 24.458 dbcsr_t_reserve_blocks_index 2280 12.5 1.192 1.192 23.575 23.575 dbcsr_reserve_blocks 3717 14.7 22.532 22.532 22.987 22.987 dbcsr_t_reserve_blocks_index_a 2222 11.6 0.011 0.011 22.088 22.088 multiply_cannon_loop 837 17.8 0.152 0.152 21.579 21.579 mp2_ri_gpw_compute_in 1 5.0 0.000 0.000 21.096 21.096 compute_mat_P_omega_copy_M_occ 250 9.0 0.001 0.001 20.726 20.726 multiply_cannon_multrec 837 18.8 20.040 20.040 20.577 20.577 compute_QP_energies 1 7.0 0.000 0.000 19.660 19.660 compute_self_energy_cubic_gw 1 8.0 0.108 0.108 19.659 19.659 compute_mat_P_omega_copy_M_vir 250 9.0 0.002 0.002 15.163 15.163 dbcsr_t_copy_nocomm 251 12.0 11.871 11.871 14.418 14.418 compute_mat_P_omega_calc_P_t 250 9.0 0.001 0.001 11.491 11.491 make_m2s 1674 16.8 0.107 0.107 10.651 10.651 make_images 1674 17.8 5.414 5.414 10.177 10.177 dbcsr_tas_mm_2 251 15.0 0.001 0.001 9.861 9.861 dbcsr_finalize 9888 13.6 1.659 1.659 8.542 8.542 mp2_ri_gpw_compute_in_copy_3c 6 6.0 0.687 0.687 8.039 8.039 contract_cubic_gw 21 9.0 0.000 0.000 7.820 7.820 build_3c_integrals 5 6.0 3.507 3.507 7.772 7.772 ------------------------------------------------------------------------------- From /workspace/artifacts/GW_PBE_4benzene_32mpi.out: ------------------------------------------------------------------------------- - - - T I M I N G - - - ------------------------------------------------------------------------------- SUBROUTINE CALLS ASD SELF TIME TOTAL TIME MAXIMUM AVERAGE MAXIMUM AVERAGE MAXIMUM CP2K 1 1.0 0.007 0.010 48.952 48.953 qs_energies 1 2.0 0.001 0.002 48.811 48.817 mp2_main 1 3.0 0.000 0.000 47.293 47.300 mp2_gpw_main 1 4.0 0.000 0.000 47.232 47.238 rpa_ri_compute_en 1 5.0 0.000 0.000 45.371 45.378 rpa_num_int 1 6.0 0.000 0.000 45.363 45.369 dbcsr_t_total 2336 9.6 0.015 0.017 40.912 40.912 compute_mat_P_omega 1 7.0 0.001 0.002 39.785 39.790 compute_mat_P_omega_contract 10 8.0 0.731 0.766 39.671 39.676 dbcsr_t_contract 787 11.0 1.848 2.001 30.086 30.090 dbcsr_tas_total 1149 12.2 0.059 0.062 
26.445 26.446
dbcsr_tas_multiply                 807 12.1    0.002    0.003   26.310   26.312
dbcsr_tas_dbcsr                    807 14.1    0.003    0.003   19.295   19.296
dbcsr_multiply_generic             837 15.8    0.068    0.072   15.954   17.045
compute_mat_P_omega_calc_M_occ     250  9.0    0.714    0.749   13.319   13.319
multiply_cannon                    837 16.8    0.129    0.140    9.452   10.046
compute_mat_P_omega_calc_P_t       250  9.0    0.001    0.001    9.676    9.676
dbcsr_t_copy                      1111 10.7    4.150    4.378    9.252    9.633
dbcsr_tas_mm_1N                    524 15.1    0.002    0.003    8.459    9.465
multiply_cannon_loop               837 17.8    0.039    0.043    8.612    9.177
compute_mat_P_omega_calc_M_vir     250  9.0    0.001    0.001    8.501    8.502
mp_sync                           8696 11.6    6.403    7.541    6.403    7.541
multiply_cannon_multrec           1386 17.8    6.712    7.201    6.953    7.423
dbcsr_tas_mm_2                     251 15.0    0.002    0.002    7.317    7.317
make_m2s                          1674 16.8    0.042    0.046    5.580    6.174
make_images                       1674 17.8    0.250    0.261    5.502    6.096
compute_QP_energies                  1  7.0    0.000    0.001    4.001    4.001
compute_self_energy_cubic_gw         1  8.0    0.005    0.005    3.998    4.000
dbcsr_t_communicate_buffer        1098 11.7    0.080    0.085    3.411    3.596
mp_waitall_2                      3776 14.7    3.240    3.522    3.240    3.522
contract_cubic_gw                   21  9.0    0.000    0.000    3.122    3.122
make_images_data                  1674 18.8    0.034    0.036    2.930    3.089
hybrid_alltoall_any               1724 19.5    2.257    2.513    2.821    2.975
dbcsr_t_reserve_blocks_index      2849 12.4    0.098    0.105    2.607    2.916
dbcsr_tas_reserve_blocks_index    3300 13.8    0.272    0.291    2.565    2.872
dbcsr_t_reserve_blocks_index_a    2791 11.4    0.015    0.016    2.560    2.858
make_images_pack                  1674 18.8    2.136    2.671    2.146    2.682
dbcsr_reserve_blocks              3785 14.7    2.282    2.573    2.322    2.614
mp_waitall_1                     26582 19.0    1.551    2.002    1.551    2.002
convert_to_new_pgrid              2421 14.1    0.015    0.017    1.789    1.900
dbcsr_copy                        3323 15.8    1.730    1.845    1.756    1.871
mp2_ri_gpw_compute_in                1  5.0    0.001    0.001    1.859    1.859
compute_mat_P_omega_copy_M_vir     250  9.0    0.001    0.002    1.697    1.705
dbcsr_add_anytype                  909 13.7    0.958    1.009    1.495    1.555
compute_mat_P_omega_copy_M_occ     250  9.0    0.001    0.001    1.506    1.511
scf_env_do_scf                       1  3.0    0.000    0.000    1.450    1.450
scf_env_do_scf_inner_loop           17  4.0    0.001    0.001    1.450    1.450
dbcsr_tas_replicate                396 14.1    0.748    0.826    1.229    1.306
mp_max_i                          2055  9.6    0.971    1.207    0.971    1.207
dbcsr_finalize                   10566 13.5    0.038    0.040    1.011    1.062
-------------------------------------------------------------------------------

Plot: name="GW_PBE_4benzene_timings_32omp", title="Timings of GW_PBE_4benzene with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="rest", label="rest", y=135.58499999999998, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="cp_gemm_cosma", label="cp_gemm_cosma", y=116.84, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="dbcsr_t_contract", label="dbcsr_t_contract", y=49.634, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="dbcsr_reserve_blocks", label="dbcsr_reserve_blocks", y=22.532, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="dbcsr_t_copy", label="dbcsr_t_copy", y=21.389, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=20.04, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="mp_waitall_2", label="mp_waitall_2", y=0.0, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32omp", name="mp_sync", label="mp_sync", y=0.0, yerr=0.0

Plot: name="GW_PBE_4benzene_timings_32mpi", title="Timings of GW_PBE_4benzene with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="rest", label="rest", y=24.317, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="cp_gemm_cosma", label="cp_gemm_cosma", y=0.0, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="dbcsr_t_contract", label="dbcsr_t_contract", y=1.848, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="dbcsr_reserve_blocks", label="dbcsr_reserve_blocks", y=2.282, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="dbcsr_t_copy", label="dbcsr_t_copy", y=4.15, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=6.712, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="mp_waitall_2", label="mp_waitall_2", y=3.24, yerr=0.0
PlotPoint: plot="GW_PBE_4benzene_timings_32mpi", name="mp_sync", label="mp_sync", y=6.403, yerr=0.0

Running bench_dftb.inp with 1 threads and 32 ranks... done.
Running bench_dftb.inp with 32 threads and 1 ranks... done.

From /workspace/artifacts/bench_dftb_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                               MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                 1  1.0    0.115    0.115  274.034  274.034
qs_energies                          1  2.0    0.000    0.000  273.847  273.847
ls_scf                               1  3.0    0.000    0.000  271.884  271.884
ls_scf_main                          1  4.0    0.002    0.002  260.381  260.381
density_matrix_trs4                 11  5.0    0.011    0.011  135.799  135.799
ls_scf_dm_to_ks                     11  5.0    0.000    0.000  117.343  117.343
matrix_ls_to_qs                     11  6.0    0.000    0.000  112.684  112.684
dbcsr_multiply_generic             185  6.1    0.469    0.469   88.452   88.452
dbcsr_copy_into_existing            11  7.0   59.700   59.700   59.701   59.701
dbcsr_complete_redistribute         23  7.5   41.481   41.481   58.055   58.055
matrix_decluster                    11  7.0    0.000    0.000   52.982   52.982
multiply_cannon                    185  7.1    0.326    0.326   52.370   52.370
multiply_cannon_loop               185  8.1    0.429    0.429   33.249   33.249
multiply_cannon_multrec            185  9.1   31.275   31.275   31.324   31.324
make_m2s                           370  7.1    0.031    0.031   30.199   30.199
make_images                        370  8.1    7.031    7.031   27.895   27.895
arnoldi_extremal                    12  6.1    0.000    0.000   25.474   25.474
arnoldi_normal_ev                   12  7.1    0.027    0.027   25.474   25.474
build_subspace                      23  8.1    0.136    0.136   24.814   24.814
dbcsr_matrix_vector_mult           652  9.0    0.216    0.216   23.958   23.958
dbcsr_matrix_vector_mult_local     652 10.0   22.648   22.648   22.665   22.665
dbcsr_finalize                     646  7.5    0.209    0.209   21.609   21.609
dbcsr_merge_all                    597  8.5    3.632    3.632   19.769   19.769
setup_rec_index_2d                 370  8.1   18.654   18.654   18.654   18.654
dbcsr_sort_indices                1103  9.9   17.874   17.874   17.874   17.874
quick_finalize                     395 10.0    0.524    0.524   15.255   15.255
tree_to_linear_d                   110  9.4   14.092   14.092   14.092   14.092
dbcsr_special_finalize             370  9.1    0.003    0.003   14.046   14.046
ls_scf_init_scf                      1  4.0    0.000    0.000   10.700   10.700
ls_scf_init_matrix_S                 1  5.0    0.000    0.000   10.238   10.238
dbcsr_dot_sd                       144  6.3    9.503    9.503    9.504    9.504
matrix_sqrt_Newton_Schulz            1  6.0    0.001    0.001    9.370    9.370
dbcsr_frobenius_norm               142  6.1    8.096    8.096    8.098    8.098
matrix_qs_to_ls                     12  5.1    0.000    0.000    7.572    7.572
matrix_cluster                      12  6.1    0.000    0.000    7.572    7.572
make_images_data                   370  9.1    0.010    0.010    6.673    6.673
dbcsr_new_transposed                 2  7.0    0.151    0.151    5.750    5.750
dbcsr_add_d                        280  6.0    0.001    0.001    5.569    5.569
dbcsr_add_anytype                  280  7.0    1.456    1.456    5.568    5.568
dbcsr_redistribute                   2  8.0    5.492    5.492    5.558    5.558
-------------------------------------------------------------------------------

From /workspace/artifacts/bench_dftb_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                               MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                 1  1.0    0.011    0.012   93.352   93.353
qs_energies                          1  2.0    0.000    0.000   93.247   93.248
ls_scf                               1  3.0    0.000    0.000   93.168   93.169
ls_scf_main                          1  4.0    0.000    0.002   89.447   89.447
density_matrix_trs4                 11  5.0    0.008    0.012   85.716   85.776
dbcsr_multiply_generic             185  6.1    0.068    0.082   80.470   80.871
multiply_cannon                    185  7.1    0.042    0.044   66.992   68.094
multiply_cannon_loop               185  8.1    0.207    0.214   63.199   64.696
multiply_cannon_multrec           1480  9.1   41.840   43.981   42.312   44.447
mp_waitall_1                     11936 10.3   19.089   21.808   19.089   21.808
multiply_cannon_metrocomm3        1480  9.1    0.017    0.020   11.200   15.263
make_m2s                           370  7.1    0.033    0.038    9.012    9.097
make_images                        370  8.1    0.723    0.752    8.894    8.982
multiply_cannon_metrocomm1        1480  9.1    0.008    0.009    4.654    6.766
calculate_norms                   2960  9.1    4.757    4.979    4.757    4.979
mp_sum_l                          1039  5.9    3.108    3.957    3.108    3.957
make_images_data                   370  9.1    0.011    0.012    3.599    3.902
arnoldi_extremal                    12  6.1    0.000    0.001    3.866    3.872
arnoldi_normal_ev                   12  7.1    0.002    0.008    3.865    3.871
build_subspace                      23  8.1    0.037    0.049    3.735    3.738
ls_scf_dm_to_ks                     11  5.0    0.000    0.000    3.229    3.349
dbcsr_matrix_vector_mult           652  9.0    0.017    0.076    3.122    3.208
hybrid_alltoall_any                393  9.9    0.299    1.521    2.947    3.168
dbcsr_complete_redistribute         23  7.5    1.752    1.878    2.810    2.942
matrix_ls_to_qs                     11  6.0    0.000    0.000    2.774    2.907
dbcsr_multiply_generic_mpsum_f     137  7.1    0.000    0.000    2.129    2.876
ls_scf_init_scf                      1  4.0    0.000    0.000    2.836    2.836
ls_scf_init_matrix_S                 1  5.0    0.000    0.000    2.801    2.808
make_images_pack                   370  9.1    2.388    2.708    2.393    2.712
matrix_decluster                    11  7.0    0.000    0.000    2.523    2.657
dbcsr_matrix_vector_mult_local     652 10.0    2.454    2.586    2.457    2.590
matrix_sqrt_Newton_Schulz            1  6.0    0.001    0.001    2.557    2.560
buffer_matrices_ensure_size        370  8.1    2.119    2.252    2.119    2.252
dbcsr_add_d                        280  6.0    0.001    0.001    2.016    2.086
dbcsr_add_anytype                  280  7.0    1.089    1.139    2.015    2.085
dbcsr_finalize                     646  7.5    0.013    0.014    1.916    2.030
-------------------------------------------------------------------------------

Plot: name="bench_dftb_timings_32omp", title="Timings of bench_dftb with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="bench_dftb_timings_32omp", name="rest", label="rest", y=100.27599999999998, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="dbcsr_copy_into_existing", label="dbcsr_copy_into_existing", y=59.7, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="dbcsr_complete_redistribute", label="dbcsr_complete_redistribute", y=41.481, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=31.275, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="dbcsr_matrix_vector_mult_local", label="dbcsr_matrix_vector_mult_local", y=22.648, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="setup_rec_index_2d", label="setup_rec_index_2d", y=18.654, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="mp_sum_l", label="mp_sum_l", y=0.0, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="mp_waitall_1", label="mp_waitall_1", y=0.0, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32omp", name="calculate_norms", label="calculate_norms", y=0.0, yerr=0.0

Plot: name="bench_dftb_timings_32mpi", title="Timings of bench_dftb with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="bench_dftb_timings_32mpi", name="rest", label="rest", y=20.352000000000004, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="dbcsr_copy_into_existing", label="dbcsr_copy_into_existing", y=0.0, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="dbcsr_complete_redistribute", label="dbcsr_complete_redistribute", y=1.752, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=41.84, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="dbcsr_matrix_vector_mult_local", label="dbcsr_matrix_vector_mult_local", y=2.454, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="setup_rec_index_2d", label="setup_rec_index_2d", y=0.0, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="mp_sum_l", label="mp_sum_l", y=3.108, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="mp_waitall_1", label="mp_waitall_1", y=19.089, yerr=0.0
PlotPoint: plot="bench_dftb_timings_32mpi", name="calculate_norms", label="calculate_norms", y=4.757, yerr=0.0

Running dbcsr.inp with 1 threads and 32 ranks... done.
Running dbcsr.inp with 32 threads and 1 ranks... done.
From /workspace/artifacts/dbcsr_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                               MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                 1  1.0    0.005    0.005  100.264  100.264
lib_test                             1  2.0    0.000    0.000  100.258  100.258
dbcsr_run_tests                      3  3.0    0.003    0.003  100.258  100.258
test_multiplies_multiproc            3  4.0    0.001    0.001   78.736   78.736
dbcsr_redistribute                   9  5.0   50.389   50.389   54.116   54.116
dbcsr_multiply_generic               9  5.0    0.001    0.001   22.759   22.759
dbcsr_make_random_matrix             9  4.0   15.328   15.328   21.426   21.426
multiply_cannon                      9  6.0    0.002    0.002   15.944   15.944
multiply_cannon_loop                 9  7.0    0.004    0.004   15.393   15.393
multiply_cannon_multrec              9  8.0   15.389   15.389   15.390   15.390
dbcsr_finalize                      27  5.7    0.004    0.004    9.973    9.973
dbcsr_merge_all                     18  6.5    3.285    3.285    9.207    9.207
tree_to_linear_d                     9  7.0    3.699    3.699    3.699    3.699
mp_alltoall_d11v                    27  6.0    3.376    3.376    3.376    3.376
dbcsr_data_release                 975  7.6    2.557    2.557    2.557    2.557
make_m2s                            18  6.0    0.001    0.001    2.293    2.293
make_images                         18  7.0    0.711    0.711    2.215    2.215
dbcsr_data_copy_aa2                  9  7.0    2.178    2.178    2.178    2.178
-------------------------------------------------------------------------------

From /workspace/artifacts/dbcsr_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                               MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                 1  1.0    0.003    0.006   25.646   25.647
lib_test                             1  2.0    0.000    0.000   25.612   25.635
dbcsr_run_tests                      3  3.0    0.001    0.001   25.611   25.633
test_multiplies_multiproc            3  4.0    0.001    0.001   24.491   24.569
dbcsr_multiply_generic               9  5.0    0.001    0.002   22.567   22.650
multiply_cannon                      9  6.0    0.002    0.002   20.316   20.687
multiply_cannon_loop                 9  7.0    0.003    0.004   19.859   20.243
multiply_cannon_multrec             72  8.0   16.601   17.624   16.602   17.625
mp_waitall_1                       576  9.2    3.662    4.328    3.662    4.328
multiply_cannon_metrocomm1          72  8.0    0.001    0.002    2.909    3.724
mp_sum_l                           310  2.7    0.517    1.161    0.517    1.161
dbcsr_multiply_generic_mpsum_f       9  6.0    0.000    0.000    0.511    1.155
dbcsr_make_random_matrix             9  4.0    0.881    0.903    1.073    1.105
multiply_cannon_metrocomm3          72  8.0    0.000    0.000    0.338    1.046
make_m2s                            18  6.0    0.001    0.001    0.927    1.024
make_images                         18  7.0    0.027    0.030    0.923    1.021
dbcsr_finalize                      27  5.7    0.000    0.001    0.830    0.899
dbcsr_merge_all                     18  6.5    0.138    0.162    0.729    0.839
dbcsr_data_release                 444  7.6    0.681    0.817    0.681    0.817
dbcsr_redistribute                   9  5.0    0.379    0.427    0.670    0.702
dbcsr_destroy                      111  5.9    0.012    0.110    0.593    0.686
make_images_data                    18  8.0    0.001    0.001    0.461    0.586
-------------------------------------------------------------------------------

Plot: name="dbcsr_timings_32omp", title="Timings of dbcsr with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="dbcsr_timings_32omp", name="rest", label="rest", y=9.525999999999982, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="dbcsr_redistribute", label="dbcsr_redistribute", y=50.389, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=15.389, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="dbcsr_make_random_matrix", label="dbcsr_make_random_matrix", y=15.328, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="tree_to_linear_d", label="tree_to_linear_d", y=3.699, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="mp_alltoall_d11v", label="mp_alltoall_d11v", y=3.376, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="dbcsr_data_release", label="dbcsr_data_release", y=2.557, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="mp_waitall_1", label="mp_waitall_1", y=0.0, yerr=0.0
PlotPoint: plot="dbcsr_timings_32omp", name="mp_sum_l", label="mp_sum_l", y=0.0, yerr=0.0

Plot: name="dbcsr_timings_32mpi", title="Timings of dbcsr with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="dbcsr_timings_32mpi", name="rest", label="rest", y=2.9250000000000007, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="dbcsr_redistribute", label="dbcsr_redistribute", y=0.379, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="multiply_cannon_multrec", label="multiply_cannon_multrec", y=16.601, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="dbcsr_make_random_matrix", label="dbcsr_make_random_matrix", y=0.881, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="tree_to_linear_d", label="tree_to_linear_d", y=0.0, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="mp_alltoall_d11v", label="mp_alltoall_d11v", y=0.0, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="dbcsr_data_release", label="dbcsr_data_release", y=0.681, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="mp_waitall_1", label="mp_waitall_1", y=3.662, yerr=0.0
PlotPoint: plot="dbcsr_timings_32mpi", name="mp_sum_l", label="mp_sum_l", y=0.517, yerr=0.0

Running MQAE_single_node.inp with 1 threads and 32 ranks... done.
Running MQAE_single_node.inp with 32 threads and 1 ranks... done.
From /workspace/artifacts/MQAE_single_node_32omp.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                               MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                 1  1.0    0.047    0.047  159.852  159.852
qs_mol_dyn_low                       1  2.0    0.004    0.004  157.818  157.818
velocity_verlet                      5  3.0    0.004    0.004  128.114  128.114
qmmm_el_coupling                     6  3.8    0.000    0.000   83.503   83.503
qmmm_elec_with_gaussian              6  4.8    0.209    0.209   83.496   83.496
qmmm_elec_with_gaussian_low          6  5.8    0.000    0.000   81.516   81.516
qmmm_elec_gaussian_low_G             6  6.8   80.109   80.109   80.109   80.109
qs_forces                            6  3.8    0.001    0.001   53.045   53.045
qs_energies                          6  4.8    0.000    0.000   46.814   46.814
scf_env_do_scf                       6  5.8    0.000    0.000   43.227   43.227
rebuild_ks_matrix                   45  8.4    0.000    0.000   37.594   37.594
qs_ks_build_kohn_sham_matrix        45  9.4    0.007    0.007   37.594   37.594
scf_env_do_scf_inner_loop           39  6.8    0.003    0.003   37.535   37.535
qs_ks_update_qs_env                 45  7.8    0.000    0.000   32.066   32.066
pw_transfer                        966 11.9    0.072    0.072   24.292   24.292
fft_wrap_pw1pw2                    801 13.0    0.009    0.009   24.006   24.006
fft_wrap_pw1pw2_150                507 14.3    2.590    2.590   23.457   23.457
qs_vxc_create                       45 10.4    0.001    0.001   18.022   18.022
xc_vxc_pw_create                    45 11.4    1.041    1.041   18.022   18.022
pw_scatter_s                       429 15.4   11.208   11.208   11.208   11.208
fist_calc_energy_force               6  3.8    0.002    0.002   10.341   10.341
pw_integral_ab                    2539  7.4   10.290   10.290   10.290   10.290
qmmm_forces                          6  3.8    0.001    0.001   10.056   10.056
xc_rho_set_and_dset_create          45 12.4    0.218    0.218    9.867    9.867
qs_rho_update_rho                   45  7.9    0.000    0.000    9.776    9.776
calculate_rho_elec                  45  8.9    0.906    0.906    9.776    9.776
qmmm_forces_with_gaussian            6  4.8    0.147    0.147    9.506    9.506
force_nonbond                        6  4.8    8.948    8.948    8.948    8.948
fft3d_s                            802 15.0    8.892    8.892    8.902    8.902
qs_ks_ddapc                         45 10.4    0.001    0.001    7.198    7.198
qmmm_force_with_gaussian_low         6  5.8    0.000    0.000    7.183    7.183
qmmm_forces_gaussian_low_G           6  6.8    6.093    6.093    6.093    6.093
pw_poisson_solve                    51  9.9    2.596    2.596    5.832    5.832
init_scf_loop                        6  6.8    0.000    0.000    5.686    5.686
qs_ks_update_qs_env_forces           6  4.8    0.000    0.000    5.541    5.541
density_rs2pw                       45  9.9    0.002    0.002    4.771    4.771
sum_up_and_integrate                45 10.4    0.264    0.264    4.727    4.727
cp_ddapc_apply_CD                   45 11.4    0.006    0.006    4.515    4.515
integrate_v_rspace                  45 11.4    0.010    0.010    4.463    4.463
grid_collocate_task_list            45  9.9    4.099    4.099    4.099    4.099
-------------------------------------------------------------------------------

From /workspace/artifacts/MQAE_single_node_32mpi.out:
-------------------------------------------------------------------------------
-                                                                             -
-                                 T I M I N G                                 -
-                                                                             -
-------------------------------------------------------------------------------
SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                               MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
CP2K                                 1  1.0    0.035    0.037   97.106   97.107
qs_mol_dyn_low                       1  2.0    0.004    0.005   95.434   95.531
qs_forces                            6  3.8    0.001    0.001   68.539   68.539
qs_energies                          6  4.8    0.001    0.001   65.479   65.479
scf_env_do_scf                       6  5.8    0.000    0.001   63.836   63.836
scf_env_do_scf_inner_loop          113  6.2    0.003    0.010   61.306   61.307
rebuild_ks_matrix                  119  8.1    0.000    0.001   44.834   44.852
qs_ks_build_kohn_sham_matrix       119  9.1    0.022    0.024   44.834   44.851
qs_ks_update_qs_env                119  7.3    0.001    0.001   42.243   42.259
velocity_verlet                      5  3.0    0.002    0.003   40.775   40.780
pw_transfer                       2446 11.8    0.290    0.303   28.754   28.943
fft_wrap_pw1pw2                   2059 12.8    0.033    0.035   27.905   28.104
fft_wrap_pw1pw2_150               1321 14.0    2.343    2.531   26.947   27.160
qs_vxc_create                      119 10.1    0.004    0.005   22.332   22.338
xc_vxc_pw_create                   119 11.1    0.373    0.460   22.328   22.335
fft3d_ps                          2059 14.8   12.561   13.777   21.158   21.378
qs_rho_update_rho                  119  7.3    0.001    0.001   18.214   18.216
calculate_rho_elec                 119  8.3    0.088    0.097   18.213   18.215
sum_up_and_integrate               119 10.1    0.090    0.099   16.664   16.711
integrate_v_rspace                 119 11.1    0.005    0.005   16.574   16.619
qmmm_forces                          6  3.8    0.002    0.003   15.066   15.066
qmmm_forces_with_gaussian            6  4.8    0.397    0.495   14.597   14.828
rs_pw_transfer                     988 11.5    0.017    0.021   12.127   12.521
density_rs2pw                      119  9.3    0.010    0.011   10.687   11.027
xc_rho_set_and_dset_create         119 12.1    0.514    0.595   10.401   10.745
qmmm_el_coupling                     6  3.8    0.000    0.000   10.469   10.566
qmmm_elec_with_gaussian              6  4.8    0.343    0.452   10.465   10.562
potential_pw2rs                    119 12.1    0.010    0.011    9.511    9.524
grid_collocate_task_list           119  9.3    7.178    7.698    7.178    7.698
mp_alltoall_z22v                  2059 16.8    5.474    7.029    5.474    7.029
grid_integrate_task_list           119 12.1    6.593    6.868    6.593    6.868
qmmm_force_with_gaussian_low         6  5.8    0.000    0.000    6.596    6.803
pw_restrict_s3                      18  5.8    2.450    2.504    6.342    6.446
pw_integral_ab                    2761  7.7    5.317    5.384    5.742    5.980
qmmm_forces_gaussian_low_G           6  6.8    5.503    5.706    5.503    5.706
qmmm_elec_with_gaussian:spline       6  5.8    0.000    0.000    5.106    5.199
pw_prolongate_s3                    18  6.8    1.969    2.028    5.106    5.199
rs_pw_transfer_PW2RS_150           125 13.9    2.603    2.681    5.129    5.163
yz_to_x                            964 15.3    1.144    1.290    3.917    5.035
x_to_yz                           1095 16.3    1.935    2.121    4.637    4.866
rs_pw_transfer_RS2PW_150           125 11.2    2.057    2.151    4.320    4.721
mp_waitany                        4028 12.8    3.837    4.469    3.837    4.469
qs_scf_new_mos                     113  7.2    0.001    0.001    3.979    3.989
qs_scf_loop_do_ot                  113  8.2    0.001    0.001    3.979    3.988
ot_scf_mini                        113  9.2    0.002    0.002    3.804    3.812
qmmm_elec_with_gaussian_low          6  5.8    0.000    0.000    3.590    3.790
dbcsr_multiply_generic            2588 12.3    0.092    0.107    3.550    3.619
qs_ks_ddapc                        119 10.1    0.003    0.003    3.053    3.198
qmmm_elec_gaussian_low_G             6  6.8    2.618    2.819    2.618    2.819
mp_sum_dm3                          33  5.7    2.513    2.673    2.513    2.673
qs_ks_update_qs_env_forces           6  4.8    0.000    0.000    2.605    2.606
mp_waitall_1                    188862 16.2    2.359    2.540    2.359    2.540
init_scf_loop                        6  6.8    0.000    0.000    2.525    2.526
pw_gather_p                        964 14.3    2.232    2.456    2.232    2.456
ot_mini                            113 10.2    0.001    0.001    2.403    2.414
pw_scatter_p                      1095 15.3    2.083    2.192    2.083    2.192
-------------------------------------------------------------------------------

Plot: name="MQAE_single_node_timings_32omp", title="Timings of MQAE_single_node with 32 OpenMP Threads", ylabel="time [s]"
PlotPoint: plot="MQAE_single_node_timings_32omp", name="rest", label="rest", y=30.212999999999994, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="qmmm_elec_gaussian_low_G", label="qmmm_elec_gaussian_low_G", y=80.109, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="pw_scatter_s", label="pw_scatter_s", y=11.208, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="pw_integral_ab", label="pw_integral_ab", y=10.29, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="force_nonbond", label="force_nonbond", y=8.948, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="fft3d_s", label="fft3d_s", y=8.892, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="qmmm_forces_gaussian_low_G", label="qmmm_forces_gaussian_low_G", y=6.093, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="grid_collocate_task_list", label="grid_collocate_task_list", y=4.099, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="grid_integrate_task_list", label="grid_integrate_task_list", y=0.0, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="fft3d_ps", label="fft3d_ps", y=0.0, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32omp", name="mp_alltoall_z22v", label="mp_alltoall_z22v", y=0.0, yerr=0.0

Plot: name="MQAE_single_node_timings_32mpi", title="Timings of MQAE_single_node with 32 MPI Ranks", ylabel="time [s]"
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="rest", label="rest", y=51.861999999999995, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="qmmm_elec_gaussian_low_G", label="qmmm_elec_gaussian_low_G", y=2.618, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="pw_scatter_s", label="pw_scatter_s", y=0.0, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="pw_integral_ab", label="pw_integral_ab", y=5.317, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="force_nonbond", label="force_nonbond", y=0.0, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="fft3d_s", label="fft3d_s", y=0.0, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="qmmm_forces_gaussian_low_G", label="qmmm_forces_gaussian_low_G", y=5.503, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="grid_collocate_task_list", label="grid_collocate_task_list", y=7.178, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="grid_integrate_task_list", label="grid_integrate_task_list", y=6.593, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="fft3d_ps", label="fft3d_ps", y=12.561, yerr=0.0
PlotPoint: plot="MQAE_single_node_timings_32mpi", name="mp_alltoall_z22v", label="mp_alltoall_z22v", y=5.474, yerr=0.0

Summary: Performance test works fine.
Status: OK

Uploading artifacts... done
EndDate: 2021-07-11 20:30:14+00:00