Utility Functions
Miscellaneous
Miscellaneous utility functions.
lenskit.util.clone(algo)
Clone an algorithm, but not its fitted data. This is like sklearn.base.clone(), but may not work on arbitrary scikit-learn estimators. LensKit algorithms are compatible with the scikit-learn clone(), however, so feel free to use that if you need more general capabilities.
This function is loosely derived from the scikit-learn one.
>>> from lenskit.util import clone
>>> from lenskit.algorithms.basic import Bias
>>> orig = Bias()
>>> copy = clone(orig)
>>> copy is orig
False
>>> copy.damping == orig.damping
True
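To make "clone an algorithm, but not its fitted data" concrete, here is a minimal sketch of parameter-based cloning. It is not LensKit's implementation: `ToyBias` and `clone_sketch` are hypothetical names, and the sketch assumes every constructor argument is stored on the instance under the same name.

```python
import inspect


class ToyBias:
    """Hypothetical stand-in for an algorithm with one hyperparameter."""

    def __init__(self, damping=0.0):
        self.damping = damping  # hyperparameter, copied by cloning
        self.mean_ = None       # fitted state, NOT copied by cloning

    def fit(self, ratings):
        self.mean_ = sum(ratings) / len(ratings)
        return self


def clone_sketch(algo):
    """Rebuild an object from its constructor arguments only."""
    sig = inspect.signature(type(algo).__init__)
    # Assumption: each constructor argument is stored as a same-named attribute.
    params = {
        name: getattr(algo, name)
        for name in sig.parameters
        if name != "self"
    }
    return type(algo)(**params)


orig = ToyBias(damping=5.0).fit([3.0, 4.0, 5.0])
copy = clone_sketch(orig)
```

Under these assumptions, `copy` has the same `damping` as `orig` but its `mean_` is still `None`, so it must be fitted again before use.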
lenskit.util.fspath(path)
Backport of the os.fspath() function for Python 3.5.
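For reference, the standard-library os.fspath() (available from Python 3.6) behaves as in the short sketch below. The example path is made up for illustration; the sketch shows the standard function, not the LensKit backport itself.

```python
import os
from pathlib import PurePosixPath

# A path object is converted to its string form.
p = PurePosixPath("data") / "ml-latest-small" / "ratings.csv"
as_text = os.fspath(p)

# str and bytes inputs are returned unchanged.
same_str = os.fspath("ratings.csv")
same_bytes = os.fspath(b"ratings.csv")

# Anything that is not str, bytes, or path-like raises TypeError.
try:
    os.fspath(42)
    rejected = False
except TypeError:
    rejected = True
```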
lenskit.util.load_ml_ratings(path='ml-latest-small')
Load the ratings from a modern MovieLens data set (ML-20M or one of the ‘latest’ data sets).
>>> load_ml_ratings().head()
   user  item  rating   timestamp
0     1    31     2.5  1260759144
1     1  1029     3.0  1260759179
2     1  1061     3.0  1260759182
3     1  1129     2.0  1260759185
4     1  1172     4.0  1260759205
- Parameters
path – The path where the MovieLens data is unpacked.
- Returns
The rating data, with user and item columns named properly for LensKit.
- Return type
pandas.DataFrame
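As a rough illustration of what "named properly for LensKit" means, the sketch below reads MovieLens-style ratings and renames the `userId` and `movieId` columns to `user` and `item`. The helper `load_ratings_sketch` is hypothetical and is not the library's code. It assumes the standard MovieLens `ratings.csv` header, and it reads from an in-memory sample rather than an unpacked data set.

```python
import io

import pandas as pd


def load_ratings_sketch(source):
    """Read MovieLens-style ratings and rename the ID columns."""
    ratings = pd.read_csv(source)
    # Assumption: the file uses the MovieLens header userId,movieId,rating,timestamp.
    return ratings.rename(columns={"userId": "user", "movieId": "item"})


# In-memory sample standing in for ml-latest-small/ratings.csv.
sample = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,31,2.5,1260759144\n"
    "1,1029,3.0,1260759179\n"
)
ratings = load_ratings_sketch(sample)
```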
lenskit.util.read_df_detect(path)
Read a Pandas data frame, auto-detecting the file format based on filename suffix. The following file types are supported:
- CSV
File has suffix .csv, read with pandas.read_csv().
- Parquet
File has suffix .parquet, .parq, or .pq, read with pandas.read_parquet().
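The suffix-based detection described above can be sketched with the standard library alone. `detect_format` is a hypothetical helper, not part of LensKit: it only maps a filename to a format label, where a real reader would then call pandas.read_csv() or pandas.read_parquet(). Lower-casing the suffix and raising `ValueError` for unknown suffixes are assumptions of this sketch.

```python
from pathlib import Path

# Suffix table taken from the list of supported file types.
_FORMATS = {
    ".csv": "csv",
    ".parquet": "parquet",
    ".parq": "parquet",
    ".pq": "parquet",
}


def detect_format(path):
    """Return 'csv' or 'parquet' based on the filename suffix."""
    suffix = Path(path).suffix.lower()  # assumption: match case-insensitively
    try:
        return _FORMATS[suffix]
    except KeyError:
        raise ValueError("unsupported file suffix: %r" % suffix) from None


csv_kind = detect_format("ratings.csv")
pq_kind = detect_format(Path("results") / "recs.parq")
```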
lenskit.util.write_parquet(path, frame, append=False)
Write a Parquet file.
- Parameters
path (pathlib.Path) – The path of the Parquet file to write.
frame (pandas.DataFrame) – The data to write.
append (bool) – Whether to append to the file or overwrite it.