Algorithm Implementation Tips
Implementing algorithms is fun, but there are a few things that are good to keep in mind. In general, development should follow these priorities, in order:

1. Correct
2. Clear
3. Fast

Further, we always want LensKit to be easy to use. Code implementing algorithms, however, may be quite complex in order to achieve good performance.
Performance
We use Numba to optimize critical code paths and provide parallelism in a number of cases, such as ALS training. See the ALS source code for examples.
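To give a flavor of the pattern, here is a minimal sketch of a Numba-compiled, parallel scoring kernel. It is a hypothetical example, not the actual ALS code:

    import numpy as np
    from numba import njit, prange

    @njit(parallel=True)
    def score_items(user_vec, item_mat):
        # one score per item, computed in parallel across items
        n = item_mat.shape[0]
        scores = np.empty(n)
        for i in prange(n):
            scores[i] = np.dot(item_mat[i, :], user_vec)
        return scores

    # usage: scores = score_items(np.ones(8), np.random.randn(100, 8))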
We also use MKL sparse matrix routines directly, when available, for some operations. Whenever this is done in the main LensKit code base, however, we also provide fallback implementations for when MKL is not available; the k-NN recommenders both demonstrate different versions of this. The _mkl_ops module exposes the MKL operations; we implement them through C wrappers in the mkl_ops.c file, which are then called through FFI. This extra layer exists because the raw MKL routines are quite complex to call via FFI and are not particularly amenable to use with Numba, so we re-expose simplified interfaces that are also usable with Numba.
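The accelerated-with-fallback pattern looks roughly like the following sketch. The wrapper function name csr_mult_vec is hypothetical; the real k-NN code is more involved:

    # try the MKL-backed operations; fall back to pure SciPy without them
    try:
        from lenskit import _mkl_ops
    except ImportError:
        _mkl_ops = None  # MKL not available on this system

    def mult_vec(matrix, vector):
        "Multiply a sparse matrix by a vector, with MKL when we have it."
        if _mkl_ops is not None:
            return _mkl_ops.csr_mult_vec(matrix, vector)  # hypothetical name
        else:
            return matrix @ vector  # pure SciPy fallback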
Pickling and Sharing
LensKit uses Python pickling (or joblib's modified pickling in joblib.dump()) quite a bit to save and reload models and to share model data between concurrent processes. This generally just works: you don't need to implement any particular save/load logic for your algorithm to be savable and sharable. There are a few exceptions, though.
If your algorithm updates state after fitting, that state should not be pickled. An example would be caching predictions or recommendations to save time in subsequent calls. Only the model parameters and estimated parameters should be pickled. If you have caches or other ephemeral structures, override __getstate__ and __setstate__ to exclude them from the saved data and to initialize caches to empty values on unpickling.
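For example, a minimal sketch assuming a hypothetical _score_cache attribute:

    class MyAlgorithm:
        def fit(self, ratings, **kwargs):
            # ... estimate and store model parameters ...
            self._score_cache = {}  # ephemeral; rebuilt on demand
            return self

        def __getstate__(self):
            state = dict(self.__dict__)
            # drop the cache: it is derived state, not model parameters
            state.pop('_score_cache', None)
            return state

        def __setstate__(self, state):
            self.__dict__.update(state)
            # re-initialize the cache to empty after unpickling
            self._score_cache = {}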
If your model excludes secondary data structures from pickling, such as a reverse index of user-item interactions, then you should only exclude them when pickling for serialization. When pickling for model sharing (see lenskit.sharing.in_share_context()), you should include the derived structures so they can also be shared.
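A sketch of this conditional logic, assuming a hypothetical derived _user_index attribute:

    from lenskit.sharing import in_share_context

    class MyAlgorithm:
        def __getstate__(self):
            state = dict(self.__dict__)
            if not in_share_context():
                # pickling for serialization: drop the derived index;
                # it can be rebuilt after loading
                state.pop('_user_index', None)
            # pickling for sharing: keep it, so workers get it for free
            return state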
If your algorithm uses subsidiary models as a part of the training process, but does not need them for prediction or recommendation (for example, lenskit.algorithms.als.BiasMF's use of lenskit.algorithms.basic.Bias in fit, during which it copies the bias model's internal state to its own fields), then consider overriding __getstate__ to remove the subsidiary model or replace it with a cloned copy (via lenskit.util.clone()) to reduce serialized disk space (and deserialized memory use).
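A sketch of that override, assuming a hypothetical bias attribute holding the fitted subsidiary model:

    from lenskit.util import clone

    class MyMFAlgorithm:
        def __getstate__(self):
            state = dict(self.__dict__)
            if state.get('bias') is not None:
                # keep the bias configuration but drop its fitted state
                state['bias'] = clone(state['bias'])
            return state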
Random Number Generation
See lenskit.util.random for documentation on how to use random number generation. In general, algorithms using randomization should take an rng parameter that accepts a seed or RNG, and pass it to lenskit.util.random.rng() to obtain a random number generator.
Algorithms that use randomness at predict or recommendation time, not just training time, should support the value 'user' for the rng parameter; when it is passed, they should derive a new seed for each user with lenskit.util.random.derive_seed(), so that results are reproducible in the face of parallelism for common experimental designs. lenskit.util.random.derivable_rng() automates this logic.
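A minimal sketch of these conventions, assuming derivable_rng returns a callable that maps a user to a fresh RNG (the class and its sampling logic are hypothetical):

    from lenskit.util import random as lkr

    class MySampler:
        def __init__(self, rng=None):
            # accepts a seed, an RNG, or 'user'
            self._rng_source = lkr.derivable_rng(rng)

        def recommend(self, user, n=None):
            rng = self._rng_source(user)  # per-user RNG when spec is 'user'
            return rng.permutation(n)     # stand-in for real sampling logic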
Memory Map Friendliness
LensKit uses joblib.Parallel to parallelize internal operations (when it isn't using Numba). Joblib is pretty good about using shared memory to minimize memory overhead in parallel computations, and LensKit has some tricks to maximize this use. However, it does require a bit of attention in your algorithm implementation.
The easiest way to make this fail is to use many small NumPy or Pandas data structures. A dictionary of np.ndarray objects, for instance, will cause problems: each array will be memory-mapped, and each map re-opens the backing file, so having too many active open files will exhaust your process's file descriptors on many systems. Keep your object count to a small, ideally fixed number. In lenskit.algorithms.basic.UnratedItemSelector, we do this by storing user and item indexes along with a matrix.CSR containing the items rated by each user; the old implementation had a dictionary mapping user IDs to ndarrays of each user's rated items. This change reduced the count from |U|+1 arrays to 5 arrays.
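To make the contrast concrete, here is a toy version of the two layouts, using scipy.sparse.csr_matrix as a stand-in for matrix.CSR:

    import numpy as np
    from scipy.sparse import csr_matrix

    # toy data: (user index, item index) interaction pairs for 3 users
    users = np.array([0, 0, 1, 2, 2, 2])
    items = np.array([0, 1, 1, 0, 2, 3])

    # problematic layout: one ndarray per user, so |U| separate objects
    per_user = {u: items[users == u] for u in np.unique(users)}

    # mmap-friendly layout: one CSR matrix backed by 3 fixed arrays
    mat = csr_matrix((np.ones(len(users)), (users, items)))

    # each user's rated items are a row slice; no per-user objects
    u = 2
    rated = mat.indices[mat.indptr[u]:mat.indptr[u + 1]]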