tlo.util module

This file contains helpful utility functions.

create_age_range_lookup(min_age: int, max_age: int, range_size: int = 5) -> (list, Dict[int, str])[source]

Create age-range categories and a dictionary that will map all whole years to age-range categories

If the minimum age is not zero, a below-minimum-age category is made. Age ranges of width range_size are then made from the minimum age up to the maximum age, and all other ages map to the greater-than-maximum-age category.

Parameters:
  • min_age – Minimum age for categories

  • max_age – Maximum age for categories; a greater-than-maximum-age category will also be made

  • range_size – Size of each category between minimum and maximum ages

Returns:

age_categories: ordered list of the age categories available

lookup: default dict mapping integer ages to age categories; ages not covered by a range map to the greater-than-maximum-age category
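The behaviour described above can be sketched as follows. This is a minimal illustration, not the library's code: the function name age_range_lookup_sketch and the label formats ("0-4", "20+") are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def age_range_lookup_sketch(
    min_age: int, max_age: int, range_size: int = 5
) -> Tuple[List[str], Dict[int, str]]:
    """Hypothetical re-implementation of the behaviour described above."""
    over_max = f"{max_age}+"  # label format is an assumption
    lookup: Dict[int, str] = defaultdict(lambda: over_max)
    categories: List[str] = []

    # Ages below the minimum share one category (only needed if min_age > 0).
    if min_age > 0:
        under_min = f"0-{min_age - 1}"  # label format is an assumption
        categories.append(under_min)
        for age in range(min_age):
            lookup[age] = under_min

    # Split [min_age, max_age) into consecutive ranges of width range_size.
    for start in range(min_age, max_age, range_size):
        stop = min(start + range_size, max_age)
        label = f"{start}-{stop - 1}"
        categories.append(label)
        for age in range(start, stop):
            lookup[age] = label

    # Every age not assigned above falls through to the default category.
    categories.append(over_max)
    return categories, lookup


categories, lookup = age_range_lookup_sketch(0, 20, 5)
```

Using a default dict means that any age outside the explicit ranges, such as lookup[95] here, resolves to the greater-than-maximum category without needing an entry of its own.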

transition_states(initial_series: Series, prob_matrix: DataFrame, rng: RandomState) Series[source]

Transition a series of states based on probability matrix

This should carry out all state transitions for a Series (i.e. a column in a DataFrame) based on the state-transition probability matrix.

Timing values for 1M rows per state, 4 states, 100 times:
  • Looping through groups: [59.5, 58.7, 59.5]
  • Using apply: [84.2, 83.3, 84.4]

Because of this, looping through the groups was chosen.

Parameters:
  • initial_series (Series) – the initial state series

  • prob_matrix (DataFrame) – DataFrame of state-transition probabilities; columns are the original state, rows are the new state, and values are the probabilities

  • rng (RandomState) – RandomState from the disease module

Returns:

Series with states changed according to probabilities
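A minimal sketch of the "looping through groups" approach, assuming the matrix layout given above (columns are original states, rows are new states). The function name transition_states_sketch and the example states are hypothetical; this is not the library's implementation.

```python
import numpy as np
import pandas as pd


def transition_states_sketch(
    initial_series: pd.Series, prob_matrix: pd.DataFrame, rng: np.random.RandomState
) -> pd.Series:
    """Hypothetical sketch: draw new states one group of rows at a time."""
    final_states = initial_series.copy()
    new_state_options = prob_matrix.index.to_numpy()  # rows are the new states

    for state in prob_matrix.columns:  # columns are the original states
        in_state = initial_series == state
        n = int(in_state.sum())
        if n == 0:
            continue
        # One vectorised draw for the whole group, using that state's column.
        draws = rng.choice(new_state_options, size=n, p=prob_matrix[state].to_numpy())
        final_states.loc[in_state] = draws

    return final_states


states = ["healthy", "sick"]
# Deterministic example: healthy always becomes sick, sick always becomes healthy.
prob_matrix = pd.DataFrame([[0.0, 1.0], [1.0, 0.0]], index=states, columns=states)
initial = pd.Series(["healthy", "sick", "healthy"])
result = transition_states_sketch(initial, prob_matrix, np.random.RandomState(0))
```

Each column of the matrix must sum to 1.0, because it is passed directly as the probability vector for the draw.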

sample_outcome(probs: DataFrame, rng: RandomState)[source]

Helper function to randomly sample an outcome for each individual in a population from a set of probabilities that are specific to each individual.

Parameters:
  • probs – Each row of this dataframe represents a person and the columns are the possible outcomes. The values are the probability of that outcome for that individual. For each individual, the probabilities of each outcome are assumed to be independent and mutually exclusive (but not necessarily exhaustive). If they sum to more than 1.0, then they are (silently) scaled so that they do sum to 1.0.

  • rng – Random number state to use for the generation of random numbers.

Returns:

A dict of the form {<index>:<outcome>} where an outcome is selected.
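One way to realise this sampling is to compare a uniform draw per individual against the cumulative probabilities across outcomes. The sketch below assumes that approach; the function name sample_outcome_sketch and the "_none_" sentinel column are hypothetical, and it is not presented as the library's implementation.

```python
import numpy as np
import pandas as pd


def sample_outcome_sketch(probs: pd.DataFrame, rng: np.random.RandomState) -> dict:
    """Hypothetical sketch of per-individual outcome sampling."""
    # Scale down any row whose probabilities sum to more than 1.0.
    row_sums = probs.sum(axis=1)
    scaled = probs.div(row_sums.where(row_sums > 1.0, 1.0), axis=0)

    # Add an implied "no outcome" column so that every row sums to 1.0.
    no_outcome = "_none_"  # hypothetical sentinel name
    scaled[no_outcome] = (1.0 - scaled.sum(axis=1)).clip(lower=0.0)

    # Pick the first outcome whose cumulative probability exceeds a uniform draw.
    cumulative = scaled.cumsum(axis=1)
    draws = pd.Series(rng.random_sample(len(probs)), index=probs.index)
    chosen = cumulative.gt(draws, axis=0).idxmax(axis=1)

    # Only individuals with a real outcome appear in the result.
    return chosen[chosen != no_outcome].to_dict()


probs = pd.DataFrame(
    {"a": [1.0, 0.0, 2.0], "b": [0.0, 0.0, 0.0]},
    index=[10, 11, 12],
)
outcomes = sample_outcome_sketch(probs, np.random.RandomState(0))
```

In this example person 10 is certain to get outcome "a", person 11 has no possible outcome and is omitted from the dict, and person 12's row is scaled from 2.0 down to 1.0 before sampling.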

class BitsetHandler(population: Population, column: str | None, elements: List[str])[source]

Bases: object

Provides methods to operate on int column(s) in the population dataframe as a bitset

property df: DataFrame

element_repr(*elements: str) int64[source]

Returns integer representation of the specified element(s)

to_strings(integer: int64) Set[str][source]

Given an integer value, returns the corresponding set of strings.

Parameters:

integer – The integer value for the bitset.

Returns:

Set of strings corresponding to integer value.
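The relationship between element_repr and to_strings can be illustrated with plain integers. This sketch assumes that the i-th element in the list occupies bit i (value 2**i); the element names and the two helper functions are hypothetical.

```python
from typing import Dict, List, Set

elements: List[str] = ["fever", "cough", "rash"]  # hypothetical element names
# Assumption: the i-th element is stored in bit i, i.e. has value 2**i.
element_to_int: Dict[str, int] = {name: 1 << i for i, name in enumerate(elements)}


def element_repr_sketch(*names: str) -> int:
    """Combine the bits of the named elements into one integer."""
    value = 0
    for name in names:
        value |= element_to_int[name]
    return value


def to_strings_sketch(integer: int) -> Set[str]:
    """Return the names whose bit is set in the integer."""
    return {name for name, bit in element_to_int.items() if integer & bit}


encoded = element_repr_sketch("fever", "rash")  # bits 0 and 2 set
decoded = to_strings_sketch(encoded)
```

Under this assumption the two functions are inverses of each other: decoding an encoded set of names returns the same set.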

set(where, *elements: str, columns: str | List[str] | None = None)[source]

Set (i.e. set to True) the bits corresponding to the specified elements.

The where argument is used verbatim as the first item in a df.loc[x, y] call. It can be index items, a boolean logical condition, or list of row indices e.g. “[0]”.

The elements are one or more valid items from the list of elements for this bitset.

Parameters:
  • where – Condition to filter rows that will be set.

  • elements – One or more elements to set to True.

  • columns – Optional argument specifying column(s) containing bitsets to update. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.

unset(where, *elements: str, columns: str | List[str] | None = None)[source]

Unset (i.e. set to False) the bits corresponding to the specified elements.

The where argument is used verbatim as the first item in a df.loc[x, y] call. It can be index items, a boolean logical condition, or list of row indices e.g. “[0]”.

The elements are one or more valid items from the list of elements for this bitset.

Parameters:
  • where – Condition to filter rows that will be unset.

  • elements – One or more elements to set to False.

  • columns – Optional argument specifying column(s) containing bitsets to update. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.

clear(where, columns: str | List[str] | None = None)[source]

Clears all the bits for the specified rows.

Parameters:
  • where – Condition to filter rows that will be cleared.

  • columns – Optional argument specifying column(s) containing bitsets to clear. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.
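The effect of set, unset and clear on an integer column can be sketched with ordinary bitwise operations on a df.loc selection. The column name, bit values and row selections below are hypothetical, and the sketch works on a plain dataframe rather than through a BitsetHandler.

```python
import pandas as pd

FEVER, COUGH = 1, 2  # hypothetical bit values for two elements
df = pd.DataFrame({"symptoms": [0, COUGH, COUGH]}, dtype="int64")  # hypothetical column

# "set": OR the element bits into the selected rows.
where = df.index.isin([0, 2])
df.loc[where, "symptoms"] |= FEVER

# "unset": AND the selected rows with the complement of the element bits.
df.loc[[2], "symptoms"] &= ~COUGH

# "clear": overwrite the selected rows with zero, dropping every bit.
df.loc[[1], "symptoms"] = 0
```

The where value is passed straight to df.loc in each case, so a boolean mask and a list of row indices are interchangeable here.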

has(where, element: str, first: bool = False, columns: str | List[str] | None = None) DataFrame | Series | bool[source]

Test whether bit(s) for a specified element are set.

Parameters:
  • where – Condition to filter rows that will be checked.

  • element – Element string to test if bit is set for.

  • first – Boolean keyword argument specifying whether to return only the first item / row in the computed column / dataframe.

  • columns – Optional argument specifying column(s) containing bitsets to check. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.

Returns:

Boolean value(s) indicating whether element bit(s) are set.

has_all(where, *elements: str, first: bool = False, columns: str | List[str] | None = None) DataFrame | Series | bool[source]

Check whether individual(s) have all the elements given set to True.

The where argument is used verbatim as the first item in a df.loc[x, y] call. It can be index items, a boolean logical condition, or list of row indices e.g. “[0]”.

The elements are one or more valid items from the list of elements for this bitset.

Parameters:
  • where – Condition to filter rows that will be checked.

  • elements – One or more elements to check.

  • first – Boolean keyword argument specifying whether to return only the first item / row in the computed column / dataframe.

  • columns – Optional argument specifying column(s) containing bitsets to check. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.

Returns:

Boolean value(s) indicating whether all element bit(s) are set.

has_any(where, *elements: str, first: bool = False, columns: str | List[str] | None = None) DataFrame | Series | bool[source]

Check whether individual(s) have any of the elements given set to True.

The where argument is used verbatim as the first item in a df.loc[x, y] call. It can be index items, a boolean logical condition, or list of row indices e.g. “[0]”.

The elements are one or more valid items from the list of elements for this bitset.

Parameters:
  • where – Condition to filter rows that will be checked.

  • elements – One or more elements to check.

  • first – Boolean keyword argument specifying whether to return only the first item / row in the computed column / dataframe.

  • columns – Optional argument specifying column(s) containing bitsets to check. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.

Returns:

Boolean value(s) indicating whether any element bit(s) are set.
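The three membership tests (has, has_all and has_any) differ only in how the masked value is compared. A minimal sketch on a plain integer Series, with hypothetical bit values:

```python
import pandas as pd

FEVER, COUGH, RASH = 1, 2, 4  # hypothetical bit values
bitsets = pd.Series([FEVER | COUGH, COUGH, RASH], dtype="int64")

# has: the single element's bit is non-zero after masking.
has_fever = (bitsets & FEVER) != 0

# has_all: every bit in the combined mask survives the AND.
mask = FEVER | COUGH
has_all = (bitsets & mask) == mask

# has_any: at least one bit in the combined mask survives the AND.
has_any = (bitsets & mask) != 0
```

The second row shows the difference: it has cough but not fever, so it passes the "any" test and fails the "all" test.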

get(where, first: bool = False, columns: str | List[str] | None = None) DataFrame | Series | Set[str][source]

Returns a series or dataframe with set of string elements where bit is True.

The where argument is used verbatim as the first item in a df.loc[x, y] call. It can be index items, a boolean logical condition, or list of row indices e.g. “[0]”.

The elements are one or more valid items from the list of elements for this bitset.

Parameters:
  • where – Condition to filter rows that will be returned.

  • first – Boolean keyword argument specifying whether to return only the first item / row in the computed column / dataframe.

  • columns – Optional argument specifying column(s) containing bitsets to return. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.

Returns:

Set(s) of strings corresponding to elements with bits set to True.

uncompress(where=None, columns: str | List[str] | None = None) DataFrame | Dict[str, DataFrame][source]

Returns an exploded representation of the bitset(s).

Each element bit becomes a column and each column is a bool indicating whether the bit is set for the element.

Parameters:
  • where – Condition to filter rows that an exploded representation will be returned for.

  • columns – Optional argument specifying column(s) containing bitsets to return exploded representation for. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.

Returns:

If columns is not set or is set to a single string, a dataframe is returned with a column for each element in the set and boolean values indicating whether the corresponding bit is set. If columns is specified as a list of multiple column names, a dictionary keyed by column name is returned, where each value is a dataframe holding the exploded representation of that column's bitset.
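For a single column, the exploded representation amounts to one boolean column per element. The sketch below builds it directly from an integer Series; the element names and bit values are hypothetical.

```python
import pandas as pd

element_bits = {"fever": 1, "cough": 2, "rash": 4}  # hypothetical elements and bit values
bitsets = pd.Series([3, 4, 0], dtype="int64", name="symptoms")

# One boolean column per element: True where that element's bit is set.
exploded = pd.DataFrame(
    {name: (bitsets & bit) != 0 for name, bit in element_bits.items()},
    index=bitsets.index,
)
```

For several bitset columns, the same construction would be repeated per column and the results collected in a dict keyed by column name.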

not_empty(where, first=False, columns: str | List[str] | None = None) DataFrame | Series | bool[source]

Returns Series of bool indicating whether the BitSet entry is not empty.

True if the set is not empty, False otherwise.

Parameters:
  • where – Condition to filter rows that will be checked.

  • first – Boolean keyword argument specifying whether to return only the first item / row in the computed column / dataframe.

  • columns – Optional argument specifying column(s) containing bitsets to check. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.

Returns:

Boolean value(s) indicating whether any element bits are set.

is_empty(where, first=False, columns: str | List[str] | None = None) DataFrame | Series | bool[source]

Returns Series of bool indicating whether the BitSet entry is empty.

True if the set is empty, False otherwise.

Parameters:
  • where – Condition to filter rows that will be checked.

  • first – Boolean keyword argument specifying whether to return only the first item / row in the computed column / dataframe.

  • columns – Optional argument specifying column(s) containing bitsets to check. If set to None (the default) a column argument must have been specified when constructing the BitsetHandler object.

Returns:

Boolean value(s) indicating whether all element bits are unset.

random_date(start, end, rng)[source]

hash_dataframe(dataframe: DataFrame)[source]

get_person_id_to_inherit_from(child_id, mother_id, population_dataframe, rng)[source]

Get index of person to inherit properties from.