Reference counting

Reference counting is a memory management technique (pattern) that helps to share objects, using memory more efficiently and safely. It can be applied to objects that are always passed around via pointers.

Sharing objects

An object (type) can be shared among many others by having each of them keep a pointer to it. Sharing an object saves memory and keeps the data always in sync. Often this is an advantage, and it works especially well with immutable objects, but one must be careful with mutable objects.

This has one drawback: when is it safe to deallocate the shared object? Obviously, when nobody else is using it. Sometimes the code has a special structure that lets you know when you can deallocate the shared object, but often things get messy and you either take your chances and deallocate the shared object, or you just leave it around, leaking memory. (The other possibility is to always copy the whole object, which uses much more memory and leaves you with many out-of-sync copies of your object.)

Reference counting

Reference counting kicks in exactly at this point, to help decide when it is safe to deallocate the shared object. The idea is really simple: keep inside the shared object a counter (ref_count, the reference count) of how many objects are using it; when ref_count hits 0 (nobody is using it), deallocate the shared object.

Keeping the reference count always up to date takes some discipline, but the guidelines to follow are quite natural. Central to this is the concept of ownership: each object or piece of code can “own” other objects. If you create, copy or retain (more about retain later) an object, then you own it. The policy introduced by the retain-release technique is very simple: if you own an object you are guaranteed that it stays around, but you are also responsible for releasing it when you no longer need it.

The basic methods of reference counting are the following:

*_retain increments the reference count by one, and must be called when you want to keep around a shared copy of the object (for immutable objects it can be seen as a lightweight copy). After having retained an object you own it and you are responsible for releasing it.
*_release decrements the reference count by one and deallocates the object if ref_count hits 0. It must be called when you no longer need your shared copy (i.e. to relinquish ownership). Release replaces deallocation.
*_create*, *_copy* routines give back an object with a reference count of one (i.e. already retained), so that you own it and you are responsible for releasing it.
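
The three kinds of routines above can be sketched as a small module. This is only a minimal illustration, not the cp2k implementation: shared_type, shared_create, shared_retain and shared_release are hypothetical names, and nullifying the pointer on release is a design choice of this sketch.

```fortran
module refcount_sketch
  implicit none
  private
  public :: shared_type, shared_create, shared_retain, shared_release

  ! Hypothetical shared object; only ref_count is essential to the technique.
  type shared_type
    integer :: ref_count = 0
    real, dimension(:), allocatable :: values
  end type shared_type

contains

  ! *_create: gives back an object that is already retained (ref_count = 1).
  subroutine shared_create(obj, n)
    type(shared_type), pointer :: obj
    integer, intent(in) :: n
    allocate(obj)
    allocate(obj%values(n))
    obj%values = 0.0
    obj%ref_count = 1
  end subroutine shared_create

  ! *_retain: the caller becomes one more owner of the object.
  subroutine shared_retain(obj)
    type(shared_type), pointer :: obj
    obj%ref_count = obj%ref_count + 1
  end subroutine shared_retain

  ! *_release: the caller gives up ownership; the last owner frees the object.
  subroutine shared_release(obj)
    type(shared_type), pointer :: obj
    obj%ref_count = obj%ref_count - 1
    if (obj%ref_count == 0) then
      deallocate(obj)   ! also frees the allocatable component
    end if
    ! The caller no longer owns the object, so its pointer is cleared.
    nullify(obj)
  end subroutine shared_release

end module refcount_sketch
```

A second owner would point its own pointer at the object and call shared_retain. Each owner later calls shared_release on its pointer, and the storage goes away with the last of these calls.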

Summary

If you allocated, copied, or retained an object, then you are responsible for releasing it with -release when you no longer need it. If you did not allocate, copy, or retain an object, then you should not release it.
When you receive an object (as the result of a method call), it will normally remain valid until the end of your method, and the object can be safely returned as a result of your method. If you need the object to live longer than this, for example if you plan to store it in a type, then you must either -retain or -copy the object.

Sample code

Some sample code to get a feeling of how it works…

! create the matrix structure
  call cp_fmstruct_create(my_struct,...)
 
! create some matrices
  call cp_fm_create(new_matrix_1,matrix_struct=my_struct)
  call cp_fm_create(new_matrix_2,matrix_struct=my_struct)
  call cp_fm_create(new_matrix_3,matrix_struct=my_struct)
 
! get rid of the matrix struct as we do not need it anymore
! (the matrices do, but they should look after themselves)
  call cp_fmstruct_release(my_struct)
 
! work with the matrices
...
 
! get rid of the matrices
  call cp_fm_release(new_matrix_1)
  call cp_fm_release(new_matrix_2)
  call cp_fm_release(new_matrix_3) ! my_struct gets deallocated only here

subroutine my_env_set_matrix(my_env,matrix) 
  type(my_env_type), pointer :: my_env
  type(blacs_matrix_type), pointer :: matrix
 
! why should you not swap the following two calls?
  call cp_fm_retain(matrix)
  call cp_fm_release(my_env%matrix)
  my_env%matrix => matrix
end subroutine my_env_set_matrix
 
...
 
! in the deallocation code of my_env: either the branch of its
! release routine taken when my_env%ref_count==0, or a plain
! deallocate routine
  call cp_fm_release(my_env%matrix)
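
The ordering question in my_env_set_matrix can be made concrete with a minimal sketch. It uses hypothetical stand-ins (matrix_type, env_type, matrix_retain, matrix_release, env_set_matrix) rather than the real cp2k types and routines. Suppose the matrix passed in is the one the environment already holds, and the environment is its only owner. Releasing first would bring ref_count to 0 and deallocate the object before it could be retained again, while retaining first keeps it alive.

```fortran
module setter_order_sketch
  implicit none

  ! Hypothetical stand-in for the matrix type; a freshly allocated
  ! object starts with one owner.
  type matrix_type
    integer :: ref_count = 1
  end type matrix_type

  ! Hypothetical stand-in for my_env_type.
  type env_type
    type(matrix_type), pointer :: matrix => null()
  end type env_type

contains

  subroutine matrix_retain(m)
    type(matrix_type), pointer :: m
    m%ref_count = m%ref_count + 1
  end subroutine matrix_retain

  subroutine matrix_release(m)
    type(matrix_type), pointer :: m
    if (associated(m)) then
      m%ref_count = m%ref_count - 1
      if (m%ref_count == 0) deallocate(m)
    end if
    nullify(m)
  end subroutine matrix_release

  subroutine env_set_matrix(env, matrix)
    type(env_type), intent(inout) :: env
    type(matrix_type), pointer :: matrix
    ! Retain first: if matrix is the object env already holds, its
    ! count goes 1 -> 2 here and 2 -> 1 in the release below.
    call matrix_retain(matrix)
    ! Swapped order would go 1 -> 0, deallocate the object, and then
    ! retain a dangling pointer.
    call matrix_release(env%matrix)
    env%matrix => matrix
  end subroutine env_set_matrix

end module setter_order_sketch
```

With this order, setting the environment's matrix to the object it already holds leaves the object alive with an unchanged reference count.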

Mixing & details

cp2k does not use retain and release consistently (and it isn't always good to pass types by pointer, as reference counting requires), so it is useful to have some conventions about how to treat objects that don't implement reference counting.

If the object is just passed in, it is copied.
If the argument name ends with _ptr, the object is passed as a pointer and deallocated when no longer needed (unless explicitly noted otherwise).
If there is a logical variable named owns_* or should_dealloc_*, then the object is shared and deallocation depends on the value of that variable.
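
As an illustration of the last convention, here is a minimal sketch of a type that stores a pointer together with an owns_* flag. The names grid_type, grid_set_values and grid_dealloc are hypothetical and chosen only for this example; they are not cp2k routines.

```fortran
module owns_flag_sketch
  implicit none
  private
  public :: grid_type, grid_set_values, grid_dealloc

  ! Hypothetical object without reference counting.
  type grid_type
    real, dimension(:), pointer :: values => null()
    logical :: owns_values = .false.   ! true: this object must deallocate values
  end type grid_type

contains

  ! Store the pointer and record whether grid is responsible for freeing it.
  subroutine grid_set_values(grid, values_ptr, owns_values)
    type(grid_type), intent(inout) :: grid
    real, dimension(:), pointer :: values_ptr
    logical, intent(in) :: owns_values
    call grid_dealloc(grid)            ! drop whatever was stored before
    grid%values => values_ptr
    grid%owns_values = owns_values
  end subroutine grid_set_values

  ! Deallocate the array only if the flag says that grid owns it.
  subroutine grid_dealloc(grid)
    type(grid_type), intent(inout) :: grid
    if (grid%owns_values .and. associated(grid%values)) deallocate(grid%values)
    nullify(grid%values)
    grid%owns_values = .false.
  end subroutine grid_dealloc

end module owns_flag_sketch
```

When the flag is false, the array survives grid_dealloc and the caller remains responsible for it. When it is true, the caller must not use or deallocate the array afterwards.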

My retain and release have the following properties:

It is ok to release an unassociated pointer.
When a pointer is released, it is always nullified.
It is an error to retain an unassociated pointer.

References

Gamma et al., Design Patterns (Reference counting, I think)
Cocoa uses reference counting and there have been a couple of articles on it. They discuss retain cycles, and also auto-release pools, an extension to reference counting that makes it possible to return temporary objects. This is not implemented in cp2k and is hopefully avoidable (given the kind of code found in cp2k).