lenskit.data.EntitySet
- class lenskit.data.EntitySet(name, schema, vocabulary, table, _sel=None)
Bases: object
Representation of a set of entities from the dataset. Obtained from Dataset.entities().
- Parameters:
name (str)
schema (EntitySchema)
vocabulary (Vocabulary)
table (pa.Table)
_sel (pa.Int32Array | None)
- __init__(name, schema, vocabulary, table, _sel=None)
- Parameters:
name (str)
schema (EntitySchema)
vocabulary (Vocabulary)
table (Table)
_sel (Int32Array | None)
Methods
__init__(name, schema, vocabulary, table[, _sel])
arrow(): Get these entities and their attributes as a PyArrow table.
attribute(name): Get values of an attribute for the entities in this entity set.
count(): Return the number of entities in this entity set.
ids(): Get the identifiers of the entities in this set.
numbers(): Get the numbers (from the vocabulary) for the entities in this set.
pandas(): Get the entities and their attributes as a Pandas data frame.
select(*[, ids, numbers]): Select a subset of the entities in this set.
Attributes
attributes
name: The name of the entity class for these entities.
schema
vocabulary: The identifier vocabulary for this schema.
- vocabulary: Vocabulary
The identifier vocabulary for this schema.
- ids()
Get the identifiers of the entities in this set. The result is returned directly as a PyArrow array rather than a NumPy array.
- numbers()
Get the numbers (from the vocabulary) for the entities in this set.
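The relationship between identifiers and numbers can be illustrated with a small, dependency-free sketch. It does not use LensKit: `vocab_ids`, `number_of`, and `entity_ids` are hypothetical names, and the sketch assumes that an entity's number is its 0-based position in the vocabulary.

```python
# Toy stand-in for a vocabulary (NOT the LensKit Vocabulary class):
# an ordered list of identifiers, where an entity's number is assumed
# to be its 0-based position in that list.
vocab_ids = ["u10", "u42", "u7", "u99"]
number_of = {ident: num for num, ident in enumerate(vocab_ids)}

# Identifiers of the entities in a hypothetical entity set.
entity_ids = ["u42", "u99"]

# The corresponding numbers are looked up in the vocabulary.
entity_numbers = [number_of[ident] for ident in entity_ids]
print(entity_numbers)  # [1, 3]

# Numbers map back to identifiers through the same vocabulary.
round_trip = [vocab_ids[num] for num in entity_numbers]
print(round_trip)  # ['u42', 'u99']
```

In this picture, ids() and numbers() are two views of the same entities, linked by the vocabulary.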
- attribute(name)
Get values of an attribute for the entities in this entity set.
- Parameters:
name (str)
- Return type:
- select(*, ids=None, numbers=None)
Select a subset of the entities in this set.
Note
The vocabulary is unchanged, so numbers in the resulting set will be entity numbers in the dataset’s vocabulary. They are not rearranged to be relative to this entity set.
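The behavior described in the note can be sketched with a toy model. `ToyEntitySet` below is a hypothetical stand-in, not the LensKit implementation; it assumes numbers are 0-based vocabulary positions, and combining `ids` and `numbers` by intersection is a choice made for this sketch only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyEntitySet:
    """Hypothetical stand-in for an entity set; not the LensKit class."""

    vocab: tuple     # every identifier in the dataset, in vocabulary order
    selected: tuple  # vocabulary numbers of the entities in this set

    def ids(self):
        return [self.vocab[n] for n in self.selected]

    def numbers(self):
        return list(self.selected)

    def select(self, *, ids=None, numbers=None):
        keep = set(self.selected)
        if ids is not None:
            wanted = set(ids)
            keep &= {n for n in self.selected if self.vocab[n] in wanted}
        if numbers is not None:
            keep &= set(numbers)
        # The vocabulary is passed through unchanged, so the surviving
        # numbers are still dataset-level numbers.
        kept = tuple(n for n in self.selected if n in keep)
        return ToyEntitySet(self.vocab, kept)


items = ToyEntitySet(vocab=("a", "b", "c", "d"), selected=(0, 1, 2, 3))
subset = items.select(ids=["b", "d"])

print(subset.ids())      # ['b', 'd']
print(subset.numbers())  # [1, 3], not renumbered to [0, 1]
```

Because the numbers stay tied to the dataset vocabulary, they can still be used to index arrays that cover the whole dataset, but they should not be treated as positions within the selected subset.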