performance
Differences
This shows you the differences between two versions of the page.
performance [2017/03/24 16:13] – 79.64.85.68 | performance [2019/10/18 09:51] – [H2O-64] rschade | ||
---|---|---|---|
Line 5: | Line 5: | ||
The purpose of the CP2K benchmark suite is to provide performance data which can be used to guide users towards the best configuration (e.g. machine, number of MPI processes, number of OpenMP threads) for a particular problem, and to give a good estimate of the parallel performance of the code for different types of method. Five benchmarks are provided: '' | The purpose of the CP2K benchmark suite is to provide performance data which can be used to guide users towards the best configuration (e.g. machine, number of MPI processes, number of OpenMP threads) for a particular problem, and to give a good estimate of the parallel performance of the code for different types of method. Five benchmarks are provided: '' | ||
- | We encourage you to contribute benchmark results from your own local cluster or HPC system - just run the inputs and add timings in the relevant sections below. | + | We encourage you to contribute benchmark results from your own local cluster or HPC system - just run the inputs and add timings in the relevant sections below. |
If you have any questions or problems running benchmarks or using the scripts please contact Iain Bethune (< | If you have any questions or problems running benchmarks or using the scripts please contact Iain Bethune (< | ||
Line 23: | Line 23: | ||
=== Description === | === Description === | ||
- | // | + | // |
=== Availability === | === Availability === | ||
The benchmark is available (along with other water systems) from the CP2K source distribution: | The benchmark is available (along with other water systems) from the CP2K source distribution: | ||
- | [[src>cp2k/tests/ | + | [[src> |
=== Results === | === Results === | ||
Line 40: | Line 40: | ||
| Piz Daint | Cray XC30 | 12/05/2015 | 15268 | 19.885 | 192 cores | 1 OMP thread per MPI task, no GPU | [[performance: | | Piz Daint | Cray XC30 | 12/05/2015 | 15268 | 19.885 | 192 cores | 1 OMP thread per MPI task, no GPU | [[performance: | ||
| Cirrus | SGI ICE XA | 24/11/2016 | 17566 | 15.560 | 1152 cores | 9 OMP threads per MPI task | [[performance: | | Cirrus | SGI ICE XA | 24/11/2016 | 17566 | 15.560 | 1152 cores | 9 OMP threads per MPI task | [[performance: | ||
+ | | Noctua | Cray CS500 | 25/09/2019 | 9f58d81 | 13.3 | 640 cores | 10 OMP threads per MPI task | [[performance: | ||
+ | |||
==== Fayalite-FIST ==== | ==== Fayalite-FIST ==== | ||
Line 49: | Line 51: | ||
The benchmark is available from the CP2K source distribution: | The benchmark is available from the CP2K source distribution: | ||
- | [[src>cp2k/tests/ | + | [[src> |
=== Results === | === Results === | ||
Line 61: | Line 63: | ||
| Piz Daint | Cray XC30 | 12/05/2015 | 15268 | 207.972 | 512 cores | 2 OMP threads per MPI task, no GPU | [[performance: | | Piz Daint | Cray XC30 | 12/05/2015 | 15268 | 207.972 | 512 cores | 2 OMP threads per MPI task, no GPU | [[performance: | ||
| Cirrus | SGI ICE XA | 24/11/2016 | 17566 | 166.192 | 576 cores | 2 OMP threads per MPI task | [[performance: | | Cirrus | SGI ICE XA | 24/11/2016 | 17566 | 166.192 | 576 cores | 2 OMP threads per MPI task | [[performance: | ||
+ | | Noctua | Cray CS500 | 27/04/2019 | 3cf5f249 | 139.177 | 320 cores | 1 OMP thread per MPI task | [[performance: | ||
==== LiH-HFX ==== | ==== LiH-HFX ==== | ||
Line 70: | Line 73: | ||
=== Availability === | === Availability === | ||
- | The benchmark is available from [[src>cp2k/tests/ | + | The benchmark is available from [[src> |
=== Results === | === Results === | ||
Line 77: | Line 80: | ||
^ Machine Name ^ Architecture ^ Date ^ SVN Revision ^ Fastest time (s) ^ Configuration ^^ Detailed results ^ | ^ Machine Name ^ Architecture ^ Date ^ SVN Revision ^ Fastest time (s) ^ Configuration ^^ Detailed results ^ | ||
- | | HECToR | Cray XE6 | 21/1/2014 | 13196 | 121.362 | 65536 cores | 8 OMP threads per MPI task | [[performance: | + | | HECToR | Cray XE6 | 21/1/2014 | 13196(*) | 121.362 | 65536 cores | 8 OMP threads per MPI task | [[performance: |
- | | ARCHER | Cray XC30 | 9/1/2014 | 13473 | 51.172 | 49152 cores | 6 OMP threads per MPI task | [[performance: | + | | ARCHER | Cray XC30 | 9/1/2014 | 13473(*) | 51.172 | 49152 cores | 6 OMP threads per MPI task | [[performance: |
- | | Magnus | Cray XC40 | 10/11/2014 | 14377 | 62.075 | 24576 cores | 4 OMP threads per MPI task | [[performance: | + | | Magnus | Cray XC40 | 10/11/2014 | 14377(*) | 62.075 | 24576 cores | 4 OMP threads per MPI task | [[performance: |
- | | Piz Daint | Cray XC30 | 12/05/2015 | 15268(*) | 66.051 | 32768 cores | 4 OMP threads per MPI task, no GPU | [[performance: | + | | Piz Daint | Cray XC30 | 12/05/2015 | 15268 | 66.051 | 32768 cores | 4 OMP threads per MPI task, no GPU | [[performance: |
- | | Cirrus | SGI ICE XA | 24/11/2016 | 17566(*) | 483.676 | 2016 cores | 6 OMP threads per MPI task | [[performance: | + | | Cirrus | SGI ICE XA | 24/11/2016 | 17566 | 483.676 | 2016 cores | 6 OMP threads per MPI task | [[performance: |
- | + | | Noctua | Cray CS500 | 27/04/2019 | 3cf5f249 | 203.092 | 5120 cores | 1 OMP thread per MPI task | [[performance: | |
- | (*) Some time after Nov 2014, something changed resulting in around 50% more ERIs being included in the HFX calculation. | + | |
+ | (*) Prior to r14945, a bug resulted in an underestimation of the number of ERIs which should be computed (by roughly 50% for this benchmark). | ||
==== H2O-DFT-LS ==== | ==== H2O-DFT-LS ==== | ||
Line 95: | Line 98: | ||
The benchmark input file used to generate these results is {{performance: | The benchmark input file used to generate these results is {{performance: | ||
- | It is a slightly modified version of the more general one in the CP2K SVN at [[src>cp2k/tests/ | + | It is a slightly modified version of the more general one in the CP2K SVN at [[src> |
=== Results === | === Results === | ||
Line 107: | Line 110: | ||
| Piz Daint | Cray XC30 | 12/05/2015 | 15268 | 27.900 | 32768 cores | 2 OMP threads per MPI task, no GPU | [[performance: | | Piz Daint | Cray XC30 | 12/05/2015 | 15268 | 27.900 | 32768 cores | 2 OMP threads per MPI task, no GPU | [[performance: | ||
| Cirrus | SGI ICE XA | 24/11/2016 | 17566 | 543.032 | 2016 cores | 2 OMP threads per MPI task | [[performance: | | Cirrus | SGI ICE XA | 24/11/2016 | 17566 | 543.032 | 2016 cores | 2 OMP threads per MPI task | [[performance: | ||
+ | | Noctua | Cray CS500 | 27/04/2019 | 3cf5f249 | 77.413 | 5120 cores | 1 OMP thread per MPI task | [[performance: | ||
+ | |||
==== H2O-64-RI-MP2 ==== | ==== H2O-64-RI-MP2 ==== | ||
Line 115: | Line 120: | ||
=== Availability === | === Availability === | ||
- | The benchmark is in the CP2K SVN at: [[src>cp2k/tests/ | + | The benchmark is in the CP2K SVN at: [[src> |
=== Results === | === Results === | ||
Line 127: | Line 132: | ||
| Piz Daint | Cray XC30 | 12/05/2015 | 15268 | 48.15 | 32768 cores | 8 OMP threads per MPI task, no GPU | [[performance: | | Piz Daint | Cray XC30 | 12/05/2015 | 15268 | 48.15 | 32768 cores | 8 OMP threads per MPI task, no GPU | [[performance: | ||
| Cirrus | SGI ICE XA | 24/11/2016 | 17566 | 303.571 | 2016 cores | 1 OMP thread per MPI task | [[performance: | | Cirrus | SGI ICE XA | 24/11/2016 | 17566 | 303.571 | 2016 cores | 1 OMP thread per MPI task | [[performance: | ||
+ | | Noctua | Cray CS500 | 27/04/2019 | 3cf5f249 | 101.617 | 5120 cores | 1 OMP thread per MPI task | [[performance: | ||
+ |
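The "Configuration" columns in the results tables give a total core count and the number of OpenMP threads per MPI task. As a rough illustration of how to read a row, the sketch below derives the implied number of MPI tasks and the core-seconds consumed. It assumes that every core runs exactly one OpenMP thread (no idle cores, no hyperthreading), which the tables do not state; `parse_row` and the field names are hypothetical helpers, not part of CP2K or the benchmark scripts.

```python
import re


def parse_row(row: str) -> dict:
    """Parse one results-table row (first seven cells) into a dict.

    Assumption: cores = MPI tasks x OMP threads, i.e. every core is
    occupied by exactly one OpenMP thread.
    """
    # Drop the outer pipes, then split the remaining text into cells.
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    machine, arch, date, revision, time_s, cores, threads = cells[:7]

    # "640 cores" -> 640, "10 OMP threads per MPI task" -> 10
    n_cores = int(re.match(r"(\d+)", cores).group(1))
    n_threads = int(re.match(r"(\d+)", threads).group(1))

    return {
        "machine": machine,
        "architecture": arch,
        "date": date,
        "revision": revision,  # kept as text: SVN number or git hash
        "time_s": float(time_s),
        "cores": n_cores,
        "omp_threads": n_threads,
        "mpi_tasks": n_cores // n_threads,
        "core_seconds": float(time_s) * n_cores,
    }


if __name__ == "__main__":
    row = "| Noctua | Cray CS500 | 25/09/2019 | 9f58d81 | 13.3 | 640 cores | 10 OMP threads per MPI task |"
    result = parse_row(row)
    print(result["machine"], result["mpi_tasks"], result["core_seconds"])
```

Core-seconds are only a crude cost measure; the fastest time in each table was not necessarily obtained at the most efficient core count.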
performance.txt · Last modified: 2020/11/10 13:29 by rschade