Utility Functions
Miscellaneous
Miscellaneous utility functions.
lenskit.util.clone(algo)
Clone an algorithm, but not its fitted data. This is like sklearn.base.clone(), but may not work on arbitrary scikit-learn estimators. LensKit algorithms are compatible with the scikit-learn clone, however, so use that if you need more general capabilities. This function is loosely derived from the scikit-learn one.
>>> from lenskit.util import clone
>>> from lenskit.algorithms.basic import Bias
>>> orig = Bias()
>>> copy = clone(orig)
>>> copy is orig
False
>>> copy.damping == orig.damping
True
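The idea of cloning hyperparameters while leaving fitted state behind can be illustrated with a small standalone sketch. Everything here is hypothetical and for illustration only: `ToyBias`, its `get_params()` method, and `clone_sketch` are not LensKit names, and this is not how LensKit implements `clone`.

```python
import copy


class ToyBias:
    """Hypothetical stand-in for an algorithm; not a LensKit class."""

    def __init__(self, damping=0.0):
        self.damping = damping  # hyperparameter: carried over by the clone
        self.mean_ = None       # fitted state: not carried over

    def get_params(self):
        # Report only constructor hyperparameters, never fitted state.
        return {"damping": self.damping}

    def fit(self, ratings):
        self.mean_ = sum(ratings) / len(ratings)
        return self


def clone_sketch(algo):
    """Build a fresh, unfitted object with the same hyperparameters."""
    params = {
        name: copy.deepcopy(value)
        for name, value in algo.get_params().items()
    }
    return type(algo)(**params)


orig = ToyBias(damping=5.0).fit([3.0, 4.0, 5.0])
dup = clone_sketch(orig)
```

After this runs, `dup` shares `orig`'s `damping` value but has no fitted mean, which is the behaviour the description above attributes to cloning.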
lenskit.util.fspath(path)
Backport of the os.fspath() function for Python 3.5.
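For reference, the standard-library `os.fspath()` (available from Python 3.6) behaves as sketched below. The example uses the standard-library function rather than the LensKit backport, and the file names are made up for illustration.

```python
import os
from pathlib import PurePosixPath

# A path-like object is converted to its string form.
as_text = os.fspath(PurePosixPath("ml-latest-small") / "ratings.csv")

# Plain strings pass through unchanged.
unchanged = os.fspath("ratings.csv")

# Objects that are not str, bytes, or path-like are rejected.
try:
    os.fspath(42)
    rejected = False
except TypeError:
    rejected = True
```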
lenskit.util.load_ml_ratings(path='ml-latest-small')
Load the ratings from a modern MovieLens data set (ML-20M or one of the ‘latest’ data sets).
>>> load_ml_ratings().head()
   user  item  rating   timestamp
0     1    31     2.5  1260759144
1     1  1029     3.0  1260759179
2     1  1061     3.0  1260759182
3     1  1129     2.0  1260759185
4     1  1172     4.0  1260759205
Parameters: path – The path where the MovieLens data is unpacked.
Returns: The rating data, with user and item columns named properly for LensKit.
Return type: pandas.DataFrame
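A rough sketch of what such a loader might do is shown below. It assumes the usual MovieLens layout, where `ratings.csv` has the columns `userId`, `movieId`, `rating`, and `timestamp`. The function name `load_ratings_sketch` is hypothetical, and this is not necessarily how `load_ml_ratings` is implemented. The demo writes a two-row stand-in file so the sketch runs without the real data set.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_ratings_sketch(path="ml-latest-small"):
    """Read ratings.csv and rename the ID columns to 'user' and 'item'."""
    ratings = pd.read_csv(Path(path) / "ratings.csv")
    # Assumed source column names: userId, movieId, rating, timestamp.
    return ratings.rename(columns={"userId": "user", "movieId": "item"})


# Demo on a tiny stand-in file instead of the real MovieLens data.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "ratings.csv").write_text(
        "userId,movieId,rating,timestamp\n"
        "1,31,2.5,1260759144\n"
        "1,1029,3.0,1260759179\n"
    )
    demo = load_ratings_sketch(tmp)
```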
lenskit.util.read_df_detect(path)
Read a Pandas data frame, auto-detecting the file format based on the filename suffix. The following file types are supported:
- CSV: File has suffix .csv, read with pandas.read_csv().
- Parquet: File has suffix .parquet, .parq, or .pq, read with pandas.read_parquet().
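The suffix-based dispatch described above can be sketched as follows. The helper name `read_frame_sketch` is hypothetical, and raising `ValueError` for an unrecognized suffix is an assumption, since the text does not say what happens in that case. The demo exercises only the CSV branch, because reading Parquet needs an engine such as pyarrow or fastparquet to be installed.

```python
import tempfile
from pathlib import Path

import pandas as pd

PARQUET_SUFFIXES = {".parquet", ".parq", ".pq"}


def read_frame_sketch(path):
    """Pick a pandas reader from the file suffix (hypothetical helper)."""
    suffix = Path(path).suffix
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in PARQUET_SUFFIXES:
        # Requires a Parquet engine such as pyarrow or fastparquet.
        return pd.read_parquet(path)
    # Assumption: unknown suffixes are an error; the source does not say.
    raise ValueError(f"unsupported file type: {suffix!r}")


# Demo on a small temporary CSV file.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "ratings.csv"
    csv_path.write_text("user,item,rating\n1,31,2.5\n")
    frame = read_frame_sketch(csv_path)
```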
lenskit.util.write_parquet(path, frame, append=False)
Write a Parquet file.
Parameters:
- path (pathlib.Path) – The path of the Parquet file to write.
- frame (pandas.DataFrame) – The data to write.
- append (bool) – Whether to append to the file or overwrite it.