tlo.lm module
- class Predictor(property_name: str = None, external: bool = False, conditions_are_mutually_exclusive: bool | None = None, conditions_are_exhaustive: bool | None = False)
Bases: object
A Predictor variable for the regression model.
- Parameters:
property_name – A property of the population dataframe (e.g. age, sex) or, if external=True, the name of the external property that will be passed as a keyword argument to the LinearModel.predict method.
external – Whether the named property is external (True), and so will be passed as a keyword argument to the LinearModel.predict method, or is a property of the population dataframe (False).
conditions_are_mutually_exclusive – Whether the conditions declared for this predictor are all mutually exclusive, that is, for any pair of conditions, one condition evaluating to True implies the other must evaluate to False. If this is declared to be the case, a more efficient method of evaluation will be used in LinearModel.predict. Note, however, that the validity of this declaration is not checked, so if this is set to True for a predictor with non-mutually exclusive conditions, the model output will be erroneous.
conditions_are_exhaustive – Whether the conditions declared for this predictor are exhaustive, that is, at least one condition will always be True irrespective of the value of the property. If this is declared to be the case, a more efficient method of evaluation may be used in LinearModel.predict, though if a catch-all otherwise condition is included this flag will provide no benefit. Note that the validity of this declaration is not checked, so if this is set to True for a predictor with non-exhaustive conditions, the model output will be erroneous.
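The warning about mutual exclusivity can be illustrated with a small sketch in plain Python. It does not reproduce the tlo.lm implementation: the two strategies shown (a first-match lookup versus a sum of value times indicator), the age bands, and the effect sizes are all assumptions chosen for illustration. The point is only that a shortcut relying on exactly one condition holding gives a different answer as soon as two conditions overlap.

```python
# Hypothetical conditions on an "age" property, each paired with an effect size.
# These names and values are illustrative only and are not taken from tlo.lm.
conditions = [
    (lambda age: age < 5, 0.5),
    (lambda age: 5 <= age < 15, 1.2),
    (lambda age: age >= 15, 2.0),
]


def first_match(age, conditions, default=None):
    """General strategy: return the value of the first condition that holds."""
    for test, value in conditions:
        if test(age):
            return value
    return default


def masked_sum(age, conditions):
    """Shortcut strategy: sum value * indicator over all conditions.

    Agrees with first_match only when exactly one condition holds.
    """
    return sum(value * test(age) for test, value in conditions)


# With mutually exclusive conditions both strategies agree for every age.
ages = [2, 10, 40]
exclusive = [(first_match(a, conditions), masked_sum(a, conditions)) for a in ages]

# With overlapping conditions both tests hold for age 2,
# so the shortcut adds the two effect sizes together.
overlapping = [
    (lambda age: age < 5, 0.5),
    (lambda age: age < 15, 1.2),
]
overlap_first = first_match(2, overlapping)
overlap_sum = masked_sum(2, overlapping)
```

Here `exclusive` holds matching pairs for each age, while `overlap_first` and `overlap_sum` differ, which is the kind of silent error the parameter description warns about.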
- class LinearModelType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
The type of model specifies how the results from the predictors are combined:
‘additive’ -> adds the effect_sizes from the predictors
‘logistic’ -> multiplies the effect_sizes from the predictors and applies the transform x/(1+x) [Thus, the intercept can be taken to be an odds and the effect_sizes odds ratios, and the prediction is a probability.]
‘multiplicative’ -> multiplies the effect_sizes from the predictors
- ADDITIVE = 1
- LOGISTIC = 2
- MULTIPLICATIVE = 3
- CUSTOM = 4
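As a rough illustration of the three combination rules described above, the following plain-Python sketch combines an intercept with a list of effect sizes. It is not the library's code: the function name `combine`, the string keys, and the assumption that the intercept is added in the additive case and multiplied in the other two are all illustrative choices.

```python
import math


def combine(lm_type, intercept, effect_sizes):
    """Combine an intercept and effect sizes under a named rule (illustrative only)."""
    if lm_type == "additive":
        # Assumed: the intercept is added to the sum of the effect sizes.
        return intercept + sum(effect_sizes)
    # Assumed: the intercept scales the product of the effect sizes.
    product = intercept * math.prod(effect_sizes)
    if lm_type == "multiplicative":
        return product
    if lm_type == "logistic":
        # Treat the product as an odds and convert it to a probability.
        return product / (1 + product)
    raise ValueError(f"unknown model type: {lm_type!r}")


additive = combine("additive", 1.0, [0.5, 0.25])
multiplicative = combine("multiplicative", 2.0, [0.5, 3.0])
# Baseline odds of 0.25 with two odds ratios of 2.0 give odds of 1.0.
logistic = combine("logistic", 0.25, [2.0, 2.0])
```

In the logistic case, odds of 1.0 map to a probability of 0.5 under x/(1+x), which is why the output can be read as a probability.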
- class LinearModel(lm_type: LinearModelType, intercept: float | int, *predictors: Predictor)
Bases: object
A linear model has an intercept and zero or more Predictor variables.
- Parameters:
lm_type – Model type to use.
intercept – Intercept term for the model.
*predictors – Any Predictor instances to use in computing output.
- property lm_type: LinearModelType
The model type.
- property intercept: float | int
The intercept value for the model.
- static multiplicative(*predictors: Predictor)
Returns a multiplicative LinearModel with intercept=1.0.
- Parameters:
predictors – One or more Predictor objects defining the model
- static custom(predict_function, **kwargs)
Define a linear model using the supplied function.
The function acts as a drop-in replacement for the predict function and must implement the interface:
(self: LinearModel, df: Union[pd.DataFrame, pd.Series], rng: Optional[np.random.RandomState] = None, **kwargs) -> pd.Series
It is the responsibility of the caller of predict to ensure they pass either a dataframe or an individual record as expected by the custom function.
See test_custom() in test_lm.py for a couple of examples.
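A function matching the interface above might look like the following sketch. The function name `predict_from_age`, the "age" column, the `scale` keyword, and the 0.01 coefficient are hypothetical and not part of tlo.lm. The function is called directly here, with `None` in place of the model instance, rather than through the library.

```python
from typing import Optional, Union

import numpy as np
import pandas as pd


def predict_from_age(
    self,
    df: Union[pd.DataFrame, pd.Series],
    rng: Optional[np.random.RandomState] = None,
    **kwargs,
) -> pd.Series:
    """Hypothetical custom predict function: output grows linearly with age."""
    # "scale" is an illustrative external value, not part of tlo.lm.
    scale = kwargs.get("scale", 1.0)
    # rng is accepted to satisfy the interface but is unused in this sketch.
    return pd.Series(0.01 * scale * df["age"], index=df.index)


# Called directly, with self=None, to show the expected inputs and output.
population = pd.DataFrame({"age": [10, 20]}, index=[3, 7])
result = predict_from_age(None, population)
scaled = predict_from_age(None, population, scale=2.0)
```

The returned Series keeps the index of the input dataframe, so its rows line up with the rows they were computed from.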
- predict(df: DataFrame, rng: RandomState | None = None, squeeze_single_row_output=True, **kwargs) → Series | bool_
Evaluate linear model output for a given set of input data.
- Parameters:
df – The input DataFrame containing the input data to evaluate the model with.
rng – If set to a NumPy RandomState instance, the returned output will be a boolean Series corresponding to Bernoulli random variables sampled according to the probabilities specified by the model output. Otherwise the model output is returned directly.
squeeze_single_row_output – If the rng argument is not None and this argument is set to True, the output for a df input with a single row will be a scalar boolean value rather than a boolean Series. If set to False, the output will always be a Series.
**kwargs – Values for any external variables included in model predictors.
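The behaviour described for rng and squeeze_single_row_output can be sketched as follows. This is an assumption about how such sampling could be done with NumPy and pandas, not the library's own routine. The helper name `sample_outcomes` is hypothetical, and the probabilities of 0.0 and 1.0 are chosen so that the outcomes do not depend on the random draws.

```python
import numpy as np
import pandas as pd


def sample_outcomes(probabilities, rng, squeeze_single_row_output=True):
    """Draw one Bernoulli outcome per row from a Series of probabilities."""
    # Uniform draws lie in [0, 1), so a row is True with its own probability.
    draws = rng.random_sample(len(probabilities))
    outcomes = pd.Series(draws < probabilities.to_numpy(), index=probabilities.index)
    if squeeze_single_row_output and len(outcomes) == 1:
        # A single-row input collapses to a scalar boolean.
        return outcomes.iloc[0]
    return outcomes


rng = np.random.RandomState(0)
many = sample_outcomes(pd.Series([0.0, 1.0]), rng)
single = sample_outcomes(pd.Series([1.0]), rng)
unsqueezed = sample_outcomes(pd.Series([1.0]), rng, squeeze_single_row_output=False)
```

Here `many` and `unsqueezed` are boolean Series, while `single` is a scalar boolean, mirroring the Series | bool_ return type in the signature.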