lenskit.data.ItemListCollection

class lenskit.data.ItemListCollection(key=None, *, index=True)

Bases: Generic[KL], ABC

A collection of item lists. This protocol defines read access to the collection; see ItemListCollector for the ability to add new lists. See Item List Collections for an introduction to using this class.

An item list collection consists of a sequence of item lists with associated keys following a fixed schema. Item list collections support iteration (in order) and lookup by key. They are used to represent a variety of things, including test data and the results of a batch run.

The key schema can be specified either by a list of field names, or by providing a named tuple class (created by either namedtuple() or NamedTuple) defining the key schema. Schemas should not be nested: field values must be scalars, not tuples or lists. Keys should also be hashable.
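The two ways of describing a key schema can be sketched with the standard library alone. The names `UserSeqKey` and `FieldKey` and their fields are illustrative assumptions, not part of LensKit; the point is only that both forms yield flat, hashable keys with the same field names.

```python
from collections import namedtuple
from typing import NamedTuple


class UserSeqKey(NamedTuple):
    """Hypothetical key schema: one scalar value per field, no nesting."""

    user_id: int
    seq_no: int


# The same schema described by a list of field names.
FieldKey = namedtuple("FieldKey", ["user_id", "seq_no"])

typed_key = UserSeqKey(user_id=42, seq_no=0)
field_key = FieldKey(42, 0)

# Named tuples of scalars are hashable and compare equal to plain tuples,
# so a plain tuple can stand in for a key instance in a dictionary lookup.
lookup_table = {typed_key: "first list"}
found = lookup_table[(42, 0)]
same_fields = UserSeqKey._fields == FieldKey._fields
```

A nested value such as a list in one of the fields would break hashing, which is one reason a flat schema of scalars is a sensible requirement.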

This protocol and its implementations exist, instead of using raw dictionaries or lists, to consistently handle some of the nuances of multi-valued keys, and different collections having different key fields. For example, if a run produces item lists with both user IDs and sequence numbers, but your test data is only indexed by user ID, the projected lookup capabilities make it easy to find the test data to go with an item list in the run.
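The idea behind a projected lookup can be illustrated without LensKit. In this sketch, `RunKey`, `TruthKey`, and `project_key` are hypothetical names, and a plain dictionary stands in for the test data; it shows the name-based projection of a richer key onto a narrower schema, not how `lookup_projected()` is actually implemented.

```python
from typing import NamedTuple


class RunKey(NamedTuple):
    """Hypothetical key for run output: user ID plus sequence number."""

    user_id: int
    seq_no: int


class TruthKey(NamedTuple):
    """Hypothetical key for test data: user ID only."""

    user_id: int


def project_key(key, target_type):
    """Keep only the fields that target_type defines, matched by name."""
    values = {name: getattr(key, name) for name in target_type._fields}
    return target_type(**values)


# Test data indexed by user ID only (a dict standing in for a collection).
test_data = {TruthKey(user_id=7): ["item_a", "item_b"]}

# A key from the run carries an extra field that the test data lacks.
run_key = RunKey(user_id=7, seq_no=3)
projected = project_key(run_key, TruthKey)
truth = test_data[projected]
```

Matching fields by name rather than by position means the projection still works if the two schemas order their fields differently.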

Item list collections support indexing by position, like a list, returning a tuple of the key and list; iterating over an item list collection similarly produces (key, list) pairs (so an item list collection is a Sequence of key/list pairs).

If the item list collection is _indexed_ (constructed with index=True), it also supports lookup by _key_ with lookup(). The key can be supplied as either a tuple or an instance of the key type. If more than one list with the same key is inserted into the collection, then lookup() returns the _last_ one (just like a dictionary), but the others remain in the underlying list and still appear when it is iterated.
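The positional and keyed access semantics described above can be mimicked with a small toy class. `TinyCollection` and its `add` method are hypothetical stand-ins written for this illustration, not LensKit's `ListILC`; returning `None` for a missing key is a choice made for this sketch.

```python
from typing import NamedTuple


class UserKey(NamedTuple):
    user_id: int


class TinyCollection:
    """Toy stand-in: a list of (key, items) pairs plus an optional key index."""

    def __init__(self, key_type, index=True):
        self.key_type = key_type
        self._pairs = []
        self._index = {} if index else None

    def add(self, items, *key_values):
        key = self.key_type(*key_values)
        self._pairs.append((key, items))
        if self._index is not None:
            # A later insert with the same key overwrites the index entry,
            # but the earlier pair stays in the underlying list.
            self._index[key] = len(self._pairs) - 1

    def __len__(self):
        return len(self._pairs)

    def __getitem__(self, pos):
        return self._pairs[pos]

    def __iter__(self):
        return iter(self._pairs)

    def lookup(self, key):
        if self._index is None:
            raise TypeError("collection was built without an index")
        pos = self._index.get(tuple(key))
        return None if pos is None else self._pairs[pos][1]


ilc = TinyCollection(UserKey)
ilc.add(["a", "b"], 1)
ilc.add(["c"], 2)
ilc.add(["d", "e"], 1)  # duplicate key for user 1

first_key, first_items = ilc[0]      # positional access yields (key, list)
latest = ilc.lookup((1,))            # keyed lookup yields the last insert
all_keys = [key for key, _ in ilc]   # iteration still sees every pair
```

Keeping the index as a map from key to position, rather than from key to list, is what lets the duplicates survive in iteration order while lookups stay dictionary-like.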

Note

Constructing an item list collection yields a ListILC.

Parameters:

key (type[KL] | Sequence[str] | None) – The type (a NamedTuple class) or list of field names specifying the key schema.
index (bool) – Whether or not to index lists by key to facilitate fast lookups.

__init__(key)

Parameters:

key (type[KL] | Sequence[str])

Methods

`__init__`(key)
`empty`(key, *[, index])	Create a new empty, mutable item list collection.
`from_arrow`(table)	Convert an Arrow table into an item list collection. The table must be in "native" format.
`from_df`(df[, key])	Create an item list collection from a data frame.
`from_dict`()	Create an item list collection from a dictionary.
`items`()	Iterate over item lists and keys.
`keys`()	Iterate over keys.
`lists`()	Iterate over item lists without keys.
`load_parquet`()	Load an item list collection from a Parquet file.
`lookup`()	Look up a list by key.
`lookup_projected`(key)	Look up an item list using a projected key.
`record_batches`([batch_size, columns, layout])	Get the item list collection as Arrow record batches (in native layout).
`save_parquet`(path, *[, layout, batch_size, ...])	Save this item list collection to a Parquet file.
`to_arrow`(*[, batch_size, layout])	Convert this item list collection to an Arrow table.
`to_dataset`()	Construct a dataset populated with this item list collection's data as interactions.
`to_df`()	Convert this item list collection to a data frame.

Attributes

`key_fields`	The names of the key fields.
`key_type`	The type of collection keys.
`list_schema`	Get the schema for the lists in this ILC.

static empty(key, *, index=True)

Create a new empty, mutable item list collection.

Parameters:

key (type[K] | Sequence[str])
index (bool)

Return type:

MutableItemListCollection[K]

static from_dict(data: Mapping[tuple[int | str | bytes | integer[Any] | str_ | bytes_ | object_, ...] | int | str | bytes | integer[Any] | str_ | bytes_ | object_, ItemList], key: type[K]) → ItemListCollection[K]
static from_dict(data: Mapping[tuple[int | str | bytes | integer[Any] | str_ | bytes_ | object_, ...] | int | str | bytes | integer[Any] | str_ | bytes_ | object_, ItemList], key: Sequence[str] | str | None = None) → ItemListCollection[tuple[int | str | bytes | integer[Any] | str_ | bytes_ | object_, ...]]

Create an item list collection from a dictionary.

See also

lenskit.data.collection.ListILC.from_dict()

static from_df(df, key=None, *others)

Create an item list collection from a data frame.
