There are two areas to consider when you are interested in performance: first the physics, and second the computational setup.
On the computational side, make sure CP2K is built with compiler optimisation (e.g. -O3) and an optimised BLAS/LAPACK library (e.g. MKL, GotoBLAS, ATLAS).
Before going too far down either of these routes, take a look at the timing report which is printed at the end of your CP2K job output. It shows which parts of the code are taking the most time, and therefore where tweaking is most likely to pay off.
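If you want to rank the entries in the timing report automatically, a small script can help. The following is only a minimal sketch: it assumes the report starts at a banner line containing "T I M I N G" and that each row holds a routine name followed by six numeric columns (calls, ASD, self time average/maximum, total time average/maximum). The function name `top_routines` and the sample text are hypothetical and made up for illustration, so check the layout against your own output before relying on it.

```python
import re

# Assumed row layout: routine name, CALLS, ASD, then four time columns
# (self average, self maximum, total average, total maximum).
ROW = re.compile(
    r"^\s*(\S+)\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def top_routines(output_text, n=5):
    """Return the n (routine, max total time) pairs with the largest total time."""
    rows = []
    in_report = False
    for line in output_text.splitlines():
        # Skip everything before the timing banner.
        if "T I M I N G" in line:
            in_report = True
            continue
        if not in_report:
            continue
        match = ROW.match(line)
        if match:
            # Group 7 is the last column, assumed to be the maximum total time.
            rows.append((match.group(1), float(match.group(7))))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n]


# Made-up sample in the assumed layout; not real timings.
SAMPLE = """\
 -                                T I M I N G                                  -
 SUBROUTINE                       CALLS  ASD         SELF TIME        TOTAL TIME
                                MAXIMUM       AVERAGE  MAXIMUM  AVERAGE  MAXIMUM
 CP2K                                 1  1.0    0.010    0.010   10.000   10.000
 qs_energies                          1  2.0    0.001    0.001    9.500    9.500
 scf_env_do_scf                       1  3.0    0.002    0.002    9.000    9.000
"""

for name, seconds in top_routines(SAMPLE, n=2):
    print(f"{name:20s} {seconds:8.3f}")
```

To use it on a real job, read your output file into a string and pass it to `top_routines` in place of `SAMPLE`. Total time includes time spent in called routines, so the top-level entries will always dominate; the interesting rows are usually the ones a few levels down.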