Utility Functions


Miscellaneous utility functions.


Clone an algorithm, but not its fitted data. This is like sklearn.base.clone(), but may not work on arbitrary scikit-learn estimators. LensKit algorithms are compatible with the scikit-learn clone, however, so feel free to use that if you need more general capabilities.

This function is loosely derived from the scikit-learn implementation.

>>> from lenskit.algorithms.basic import Bias
>>> orig = Bias()
>>> copy = clone(orig)
>>> copy is orig
False
>>> copy.damping == orig.damping
True
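
The idea behind parameter-based cloning can be sketched without LensKit or scikit-learn installed. The example below is only an illustration under stated assumptions: ToyBias, its mean_ attribute, and clone_sketch are hypothetical names invented here, and the real clone() may differ in detail (for instance, in how it copies nested parameter values).

```python
import inspect


class ToyBias:
    """Hypothetical stand-in for an algorithm; not a LensKit class."""

    def __init__(self, damping=0.0):
        self.damping = damping

    def get_params(self):
        # Report every constructor argument together with its current value.
        names = inspect.signature(type(self).__init__).parameters
        return {name: getattr(self, name) for name in names if name != "self"}

    def fit(self, ratings):
        # Fitted state is stored separately from the constructor parameters.
        self.mean_ = sum(ratings) / len(ratings)
        return self


def clone_sketch(algo):
    """Build a fresh, unfitted object from the constructor parameters of algo."""
    return type(algo)(**algo.get_params())


orig = ToyBias(damping=5.0).fit([3.0, 4.0, 5.0])
copy = clone_sketch(orig)

print(copy is orig)                  # False: a new object is created
print(copy.damping == orig.damping)  # True: parameters carry over
print(hasattr(copy, "mean_"))        # False: fitted data does not
```

Because the copy is rebuilt from constructor parameters alone, anything learned during fit() is left behind, which is the behavior described above.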

Backport of os.fspath() function for Python 3.5.
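
For readers unfamiliar with what os.fspath() does, the following is a rough sketch of how such a backport could be written. It is not the LensKit implementation: fspath_sketch is a hypothetical name, and the pathlib fallback is an assumption about how older path objects without __fspath__ might be handled.

```python
import pathlib


def fspath_sketch(path):
    """Return the str or bytes file system representation of path."""
    # Plain strings and bytes are already file system paths.
    if isinstance(path, (str, bytes)):
        return path

    # Path-like objects expose __fspath__; look it up on the type,
    # as os.fspath() does.
    hook = getattr(type(path), "__fspath__", None)
    if hook is not None:
        result = hook(path)
        if isinstance(result, (str, bytes)):
            return result
        raise TypeError("__fspath__ must return str or bytes")

    # Assumed fallback for pathlib objects that lack __fspath__.
    if isinstance(path, pathlib.PurePath):
        return str(path)

    raise TypeError(
        "expected str, bytes or path-like object, not " + type(path).__name__
    )


print(fspath_sketch(pathlib.PurePosixPath("data/ratings.csv")))
```

On Python 3.6 and later, os.fspath() itself is available and a backport like this is unnecessary.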


Load the ratings from a modern MovieLens data set (ML-20M or one of the ‘latest’ data sets).

>>> load_ml_ratings().head()
   user  item  rating   timestamp
0     1    31     2.5  1260759144
1     1  1029     3.0  1260759179
2     1  1061     3.0  1260759182
3     1  1129     2.0  1260759185
4     1  1172     4.0  1260759205
Parameters: path – The path where the MovieLens data is unpacked.
Returns: The rating data, with user and item columns named properly for LensKit.
Return type: pandas.DataFrame
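
The column renaming described above can be approximated in a few lines of pandas. This is a sketch rather than the library's code: load_ml_ratings_sketch is a hypothetical name, and it assumes the unpacked directory contains a ratings.csv file with the userId, movieId, rating, and timestamp columns used by ML-20M and the 'latest' data sets.

```python
from pathlib import Path

import pandas as pd


def load_ml_ratings_sketch(path):
    """Read ratings.csv from an unpacked MovieLens directory (assumed layout)."""
    ratings_file = Path(path) / "ratings.csv"
    ratings = pd.read_csv(ratings_file)
    # Rename the MovieLens identifiers to the user/item names shown above;
    # rating and timestamp are left unchanged.
    return ratings.rename(columns={"userId": "user", "movieId": "item"})
```

Calling load_ml_ratings_sketch() with the directory where the data was unpacked should give a frame with user, item, rating, and timestamp columns, matching the layout in the example output.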

Read a Pandas data frame, auto-detecting the file format based on filename suffix. The following file types are supported:

  • .csv – read with pandas.read_csv().
  • .parquet, .parq, or .pq – read with pandas.read_parquet().
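
The suffix-based dispatch listed above might look something like the following. The names detect_format and read_df_sketch are hypothetical, and raising ValueError for an unrecognized suffix is an assumption, since the behavior for other file types is not documented here.

```python
from pathlib import Path

# Suffixes treated as Parquet, taken from the list above.
PARQUET_SUFFIXES = {".parquet", ".parq", ".pq"}


def detect_format(path):
    """Return "csv" or "parquet" based on the file name suffix."""
    suffix = Path(path).suffix
    if suffix == ".csv":
        return "csv"
    if suffix in PARQUET_SUFFIXES:
        return "parquet"
    # Assumed behavior: reject anything that is not in the supported list.
    raise ValueError("unsupported file suffix: " + repr(suffix))


def read_df_sketch(path):
    """Read a data frame with the pandas reader that matches the suffix."""
    import pandas as pd  # imported here so detect_format works without pandas

    if detect_format(path) == "csv":
        return pd.read_csv(path)
    # Needs a Parquet engine (pyarrow or fastparquet) to be installed.
    return pd.read_parquet(path)


print(detect_format("ratings.csv"))
print(detect_format(Path("data") / "ratings.pq"))
```

Keeping the detection step separate from the read makes the supported-suffix rule easy to check on its own.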

lenskit.util.write_parquet(path, frame, append=False)

Write a Parquet file.

  • path (pathlib.Path) – The path of the Parquet file to write.
  • frame (pandas.DataFrame) – The data to write.
  • append (bool) – Whether to append to the file or overwrite it.