lenskit.data.Vocabulary
- class lenskit.data.Vocabulary(keys=None, name=None, *, reorder=True)
- Bases: object
- Vocabularies of entity identifiers for the LensKit data model.
- This class supports bidirectional mappings between key-like data and contiguous nonnegative integer indices. Its main use is to provide the entity ID vocabularies in Dataset, but it can also be used for things like item tags.
- IDs in a vocabulary must be unique. Constructing a vocabulary with reorder=True ensures uniqueness (and sorts the IDs), but does not preserve the order of IDs in the original input.
- It is currently a wrapper around pandas.Index, but this fact is not part of the stable public API.
- Parameters:
- keys (IDSequence | pd.Index | Iterable[ID] | None) – The IDs to put in the vocabulary. 
- name (str | None) – The vocabulary name (i.e. the entity class it stores IDs for). 
- reorder (bool) – If True (the default), sort and deduplicate the IDs. If False, use the IDs as-is (assigning each to its position in the input sequence).
 
- Stability:
- Caller (see Stability Levels).
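
The effect of reorder on the stored keys can be illustrated without LensKit. The following is a minimal pure-Python sketch, not the library's implementation: build_keys is a hypothetical helper, and raising ValueError for duplicate keys when reorder=False is an assumption made for illustration.

```python
def build_keys(keys, reorder=True):
    """Return the key list a vocabulary-like object would store (illustrative only)."""
    keys = list(keys)
    if reorder:
        # Sort and deduplicate; the original input order is not preserved.
        return sorted(set(keys))
    # Keep the keys as-is; each key's number is its position in the input.
    if len(set(keys)) != len(keys):
        raise ValueError("keys must be unique")  # assumed behaviour, not LensKit's
    return keys


sorted_keys = build_keys(["u3", "u1", "u2", "u1"])
as_is_keys = build_keys(["u3", "u1", "u2"], reorder=False)

# Forward mapping: key -> contiguous nonnegative number.
numbers = {key: i for i, key in enumerate(sorted_keys)}
```

In this sketch the sorted variant assigns "u1" the number 0, while the as-is variant would assign 0 to "u3", which is why the choice matters whenever numbers are stored alongside the vocabulary.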
 - __init__(keys=None, name=None, *, reorder=True)
 - Methods
 - __init__([keys, name, reorder])
 - id(num) – Alias for term() for greater readability for entity ID vocabularies.
 - id_array()
 - ids() – Alias for terms() for greater readability for entity ID vocabularies.
 - number() – Look up the number for a vocabulary ID.
 - numbers() – Look up the numbers for an array of terms or IDs.
 - term(num) – Look up the term with a particular number.
 - terms() – Get a list of terms, optionally for an array of term numbers.
 - Attributes
 - The vocabulary as a Pandas index.
 - Current vocabulary size.
 - The name of the vocabulary (e.g. “user”, “item”).
 - property index: Index
- The vocabulary as a Pandas index.
- Stability:
- Internal (see Stability Levels).
 
 - number(term: object, missing: Literal['error'] = 'error') → int
- number(term: object, missing: Literal['none'] | None) → int | None
- Look up the number for a vocabulary ID. 
 - numbers(terms: Sequence[Hashable] | numpy.typing.ArrayLike, missing: Literal['error', 'negative'] = 'error', *, format: Literal['numpy'] = 'numpy') → ndarray[tuple[Any, ...], dtype[int32]]
- numbers(terms: Sequence[Hashable] | numpy.typing.ArrayLike, missing: Literal['error', 'negative', 'null'] = 'error', *, format: Literal['arrow']) → Int32Array
- Look up the numbers for an array of terms or IDs. 
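
The missing options in the signatures above can be sketched in plain Python. This is an illustration only, not LensKit code: lookup_number and lookup_numbers are hypothetical stand-ins, the error type (KeyError) is an assumption, and so is the reading that 'negative' marks unknown terms with -1.

```python
TERMS = ["apple", "banana", "cherry"]
LOOKUP = {term: i for i, term in enumerate(TERMS)}


def lookup_number(term, missing="error"):
    """Single-term lookup: raise on unknown terms, or return None instead."""
    num = LOOKUP.get(term)
    if num is None and missing == "error":
        raise KeyError(term)  # assumed error type
    return num


def lookup_numbers(terms, missing="error"):
    """Bulk lookup: unknown terms become -1 when missing='negative' (assumed)."""
    nums = [LOOKUP.get(term, -1) for term in terms]
    if missing == "error" and any(n < 0 for n in nums):
        raise KeyError("unknown term in input")
    return nums


known = lookup_number("banana")
absent = lookup_number("durian", missing="none")
mixed = lookup_numbers(["cherry", "durian"], missing="negative")
```

A sentinel such as -1 keeps the output the same length as the input, so callers can mask out unknown terms instead of handling an exception.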
 - term(num)
- Look up the term with a particular number. Negative indexing is not supported. 
 - terms(nums: list[int] | ndarray[tuple[Any, ...], dtype[integer]] | Series | None = None, *, format: Literal['numpy'] = 'numpy') → ndarray[tuple[int], dtype[integer[Any] | str_ | bytes_ | object_]]
- terms(nums: list[int] | ndarray[tuple[Any, ...], dtype[integer]] | Series | None = None, *, format: Literal['arrow']) → Array
- Get a list of terms, optionally for an array of term numbers.
- Parameters:
- nums – The numbers (indices) of the terms to retrieve. If None, returns all terms.
- Returns:
- The terms corresponding to the specified numbers, or the full array of terms (in order) if nums=None.
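
The reverse direction (number to term) can be sketched in the same spirit. Again this is a plain-Python illustration under stated assumptions rather than the library's code: get_term and get_terms are hypothetical names, and rejecting out-of-range or negative numbers with IndexError is an assumption (the documentation only states that negative indexing is not supported for term()).

```python
TERMS = ["apple", "banana", "cherry"]


def get_term(num):
    """Return the term stored at position num; negative numbers are rejected."""
    if num < 0 or num >= len(TERMS):
        raise IndexError(f"term number out of range: {num}")  # assumed error type
    return TERMS[num]


def get_terms(nums=None):
    """Return all terms in order when nums is None, else the selected terms."""
    if nums is None:
        return list(TERMS)
    return [get_term(n) for n in nums]


everything = get_terms()
selected = get_terms([2, 0])
```

Rejecting negative numbers explicitly avoids Python's wrap-around list indexing, where -1 would silently return the last term.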
 
 - ids(nums: list[int] | ndarray[tuple[Any, ...], dtype[integer]] | Series | None = None, *, format: Literal['numpy'] = 'numpy') → ndarray[tuple[int], dtype[integer[Any] | str_ | bytes_ | object_]]
- ids(nums: list[int] | ndarray[tuple[Any, ...], dtype[integer]] | Series | None = None, *, format: Literal['arrow']) → Array
- Alias for terms() for greater readability for entity ID vocabularies.
 
