atml package

Submodules

atml.cat module

The :mod:`atml.cat` module contains a set of functions to perform adaptive testing on predictive ML models.

class atml.cat.Standard_CAT(irt_mdl)[source]

Bases: object

The class for standard adaptive testing on ML models.

testing(mdl, measure, data_dict, get_data, item_info='fisher', ability_0=None, N_test=None, remove_tested=True, sparse=False, cap_size=10000, tes_size=0.5)[source]

Perform the adaptive testing and record the testing sequence.

Parameters:
mdl: sklearn.predictor
An instance of an sklearn predictor. The model should have a fit(x, y) method for training and a predict_proba(x) method for testing.
measure: atml.Measure
An evaluation measure selected from the atml.measure module.
data_dict: dict
A dictionary that defines the index and the reference name of every dataset. Example: data_dict = {0: 'iris', 1: 'digits', 2: 'wine'}
get_data: Callable
A function that takes a dataset index and returns the features (x) and target (y) of the specified dataset.
item_info: string
The item information criterion. Options: 'fisher' (the Fisher item information), 'kl' (the Kullback-Leibler item information), 'random' (random Gaussian item information).
ability_0: float
The initial value of the ability parameter.
N_test: int
The number of tests to perform.
remove_tested: boolean
Whether to remove each tested dataset from the pool so that it is not tested again.
sparse: boolean
Whether to use only a subset of each dataset for the experiments.
cap_size: int
When sparse=True, the maximum number of samples per dataset used in the experiments.
tes_size: float
The proportion of each dataset used as the testing (validation) set.

Returns:
selected_dataset_index: numpy.ndarray
The index sequence of the datasets selected during the adaptive testing.
selected_dataset: list
The reference-name sequence of the datasets selected during the adaptive testing.
measurement: numpy.ndarray
The performance measurements on the datasets selected during the adaptive testing.
ability_seq: numpy.ndarray
The estimated ability over the adaptive testing sequence.
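
A minimal usage sketch, assuming Standard_CAT accepts a fitted atml.irt model and that atml.measure.Acc takes no constructor arguments; the loaders, candidate model, and toy measurements below are illustrative, not part of atml:

    import numpy as np
    from sklearn.datasets import load_digits, load_iris, load_wine
    from sklearn.linear_model import LogisticRegression

    import atml

    data_dict = {0: 'iris', 1: 'digits', 2: 'wine'}
    loaders = {0: load_iris, 1: load_digits, 2: load_wine}

    def get_data(idx):
        # Return the features (x) and target (y) of dataset idx.
        bunch = loaders[idx]()
        return bunch.data, bunch.target

    # Fit an IRT model on previously gathered results (toy values for illustration).
    irt_mdl = atml.irt.Beta_3_IRT()
    irt_mdl.fit(dataset_list=list(data_dict.values()),
                dataset=np.array([0, 1, 2, 0, 1, 2]),
                model_list=['nb', 'rf'],
                model=np.array([0, 0, 0, 1, 1, 1]),
                measure=np.array([0.90, 0.80, 0.95, 0.92, 0.85, 0.97]))

    # Adaptively test a new candidate model.
    cat = atml.cat.Standard_CAT(irt_mdl)
    cat.testing(LogisticRegression(max_iter=1000), atml.measure.Acc(),
                data_dict, get_data, item_info='fisher', N_test=3)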
atml.cat.get_fisher_item_information(ability, logit_delta, delta, log_a, log_s2, irt_type='beta3', n_sample=65536)[source]

Compute the Fisher item information for adaptive testing.

Parameters:
ability: float
The current estimated ability of the candidate model.
logit_delta: numpy.ndarray
The logit of the delta parameter of the IRT model.
delta: numpy.ndarray
The delta parameter of the IRT model.
log_a: numpy.ndarray
The log_a parameter of the IRT model.
log_s2: numpy.ndarray
The log_s2 parameter of the IRT model.
irt_type: string
The type of the IRT model.
n_sample: integer
The number of random samples used when calculating the item information.

Returns:
info: numpy.ndarray
The Fisher item information of every dataset at the given ability.
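
For reference, the Fisher item information follows the standard statistical definition (a general fact, stated here for context rather than extracted from this module's source):

    I_j(\hat{\theta}) = \mathbb{E}_{x \sim p_j(x \mid \hat{\theta})}\!\left[\left(\frac{\partial}{\partial \theta}\log p_j(x \mid \theta)\Big|_{\theta=\hat{\theta}}\right)^{2}\right]

where p_j(x | theta) is the IRT response distribution of dataset j; the adaptive test then selects the dataset with the largest information at the current ability estimate.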
atml.cat.get_kl_item_information(ability, logit_delta, delta, log_a, log_s2, d_ability=0.1, irt_type='beta3', n_sample=65536)[source]

Compute the KL item information for adaptive testing.

Parameters:
ability: float
The current estimated ability of the candidate model.
logit_delta: numpy.ndarray
The logit of the delta parameter of the IRT model.
delta: numpy.ndarray
The delta parameter of the IRT model.
log_a: numpy.ndarray
The log_a parameter of the IRT model.
log_s2: numpy.ndarray
The log_s2 parameter of the IRT model.
d_ability: float
The perturbation of the ability parameter used when calculating the KL item information.
irt_type: string
The type of the IRT model.
n_sample: integer
The number of random samples used when calculating the KL item information.

Returns:
info: numpy.ndarray
The KL item information of every dataset at the given ability.
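
The KL item information plausibly compares the response distributions at the current ability and at an ability perturbed by d_ability, with the expectation approximated by Monte Carlo over n_sample draws; a hedged sketch of one common form:

    KL_j(\hat{\theta}) = D_{\mathrm{KL}}\!\left(p_j(x \mid \hat{\theta} + \Delta\theta)\,\middle\|\,p_j(x \mid \hat{\theta})\right) \approx \frac{1}{S}\sum_{s=1}^{S}\log\frac{p_j(x_s \mid \hat{\theta} + \Delta\theta)}{p_j(x_s \mid \hat{\theta})}, \qquad x_s \sim p_j(\cdot \mid \hat{\theta} + \Delta\theta)

where Delta-theta corresponds to d_ability and S to n_sample.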
atml.cat.get_obj(parameter, data, extra_args)[source]

Wrapper function to get the value of the objective function.

Parameters:
parameter: tensorflow.Variable
The parameter of the IRT model.
data: tensorflow.Tensor
A tensor containing the selected dataset index and the performance measure of each experiment.
extra_args: tuple
A tuple containing the extra parameters of the IRT model: (irt_type, logit_delta, delta, log_a, log_s2).

Returns:
L: tensorflow.Tensor
The negative log-likelihood.
atml.cat.get_obj_g(parameter, data, extra_args)[source]

Wrapper function to get the gradient of the objective function.

Parameters:
parameter: tensorflow.Variable
The parameter of the IRT model.
data: tensorflow.Tensor
A tensor containing the selected dataset index and the performance measure of each experiment.
extra_args: tuple
A tuple containing the extra parameters of the IRT model: (irt_type, logit_delta, delta, log_a, log_s2).

Returns:
L: tensorflow.Tensor
The negative log-likelihood.
g: tensorflow.Tensor
The gradient of the parameters.
atml.cat.m_beta_3_irt(logit_theta, logit_delta, log_a)[source]

Predict the expected response for a set of IRT parameters.

Parameters:
logit_theta: numpy.ndarray
The logit_theta parameter of the Beta-3 IRT model.
logit_delta: numpy.ndarray
The logit_delta parameter of the Beta-3 IRT model.
log_a: numpy.ndarray
The log_a parameter of the Beta-3 IRT model.

Returns:
E: numpy.ndarray
The expected performance measurement.
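
In the standard Beta-3 IRT model the expected response has a closed form that simplifies neatly in logit space; the sketch below is a plausible reading of this function under that model, not necessarily its exact implementation:

    import numpy as np

    def m_beta_3_irt_sketch(logit_theta, logit_delta, log_a):
        # Beta-3 IRT expectation:
        #   E = 1 / (1 + (delta/(1-delta))^a * (theta/(1-theta))^(-a)),
        # which in logit space reduces to sigmoid(a * (logit_theta - logit_delta)).
        a = np.exp(log_a)
        return 1.0 / (1.0 + np.exp(-a * (logit_theta - logit_delta)))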
atml.cat.m_logistic_irt(theta, delta, log_a, log_s2)[source]

Predict the expected response for a set of IRT parameters.

Parameters:
theta: numpy.ndarray
The theta parameter of the Logistic IRT model.
delta: numpy.ndarray
The delta parameter of the Logistic IRT model.
log_a: numpy.ndarray
The log_a parameter of the Logistic IRT model.
log_s2: numpy.ndarray
The log_s2 parameter of the Logistic IRT model.

Returns:
E: numpy.ndarray
The expected performance measurement.
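
A corresponding sketch for the logistic model, assuming a 2PL-style mean response in which log_s2 parameterises observation noise around the mean rather than shifting it (an assumption about this module's convention):

    import numpy as np

    def m_logistic_irt_sketch(theta, delta, log_a):
        # 2PL-style expected response: E = sigmoid(a * (theta - delta)).
        # log_s2 is assumed to control the noise variance around this mean
        # and therefore does not appear in the expectation here.
        a = np.exp(log_a)
        return 1.0 / (1.0 + np.exp(-a * (theta - delta)))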
atml.cat.ml_beta_3_obj(logit_theta, logit_delta, log_a, measure, tested_list, using_samples=False)[source]

The log-likelihood objective function of the Beta-3 IRT model.

Parameters:
logit_theta: tensorflow.Variable
The current ability parameter of the IRT model.
logit_delta: tf.Tensor
The current logit_delta parameter of the IRT model.
log_a: tf.Tensor
The current log_a parameter of the IRT model.
measure: tf.Tensor
The performance measurements of the selected datasets.
tested_list: tf.Tensor
The indices of the selected datasets.
using_samples: boolean
Whether to return the sample-wise negative log-likelihood.

Returns:
nll: tf.Tensor
The negative log-likelihood of the IRT model.
atml.cat.ml_logistic_obj(theta, delta, log_a, log_s2, measure, tested_list, using_samples=False)[source]

The log-likelihood objective function of the Logistic IRT model.

Parameters:
theta: tensorflow.Variable
The current ability parameter of the IRT model.
delta: tf.Tensor
The current delta parameter of the IRT model.
log_a: tf.Tensor
The current log_a parameter of the IRT model.
log_s2: tf.Tensor
The current log_s2 parameter of the IRT model.
measure: tf.Tensor
The performance measurements of the selected datasets.
tested_list: tf.Tensor
The indices of the selected datasets.
using_samples: boolean
Whether to return the sample-wise negative log-likelihood.

Returns:
nll: tf.Tensor
The negative log-likelihood of the IRT model.

atml.exp module

The :mod:`atml.exp` module holds a set of functions to perform machine learning experiments and gather the corresponding performance metrics.

atml.exp.get_exhaustive_testing(data_dict, get_data, model_dict, get_model, measure, sparse=False, cap_size=10000, test_size=0.5)[source]

Perform testing experiments on all the possible combinations between different models and datasets.

Parameters:
data_dict: dict
A dictionary that defines the index and the reference name of every dataset. Example: data_dict = {0: 'iris', 1: 'digits', 2: 'wine'}
get_data: Callable
A function that takes a dataset index and returns the features (x) and target (y) of the specified dataset.
model_dict: dict
A dictionary that defines the index and the reference name of every model. Example: model_dict = {0: 'logistic regression', 1: 'random forest', 2: 'naive bayes'}
get_model: Callable
A function that takes a model index and returns an instance of a model class following the sklearn template. The model should have a fit(x, y) method for training and a predict_proba(x) method for testing.
measure: atml.Measure
An evaluation measure selected from the atml.measure module.
sparse: boolean
Whether to use only a subset of each dataset for the experiments.
cap_size: integer
When sparse=True, the maximum number of samples per dataset used in the experiments.
test_size: float
The proportion of each dataset used as the testing (validation) set.

Returns:
res: pandas.DataFrame
A table that contains the testing results of all model and dataset combinations.
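
A minimal usage sketch; the loaders and model builders below are illustrative assumptions, not part of atml:

    from sklearn.datasets import load_iris, load_wine
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB

    import atml

    data_dict = {0: 'iris', 1: 'wine'}
    model_dict = {0: 'logistic regression', 1: 'naive bayes'}
    loaders = {0: load_iris, 1: load_wine}

    def get_data(idx):
        # Return the features (x) and target (y) of dataset idx.
        bunch = loaders[idx]()
        return bunch.data, bunch.target

    def get_model(idx):
        # Return a fresh sklearn-style model instance for model idx.
        return LogisticRegression(max_iter=1000) if idx == 0 else GaussianNB()

    res = atml.exp.get_exhaustive_testing(data_dict, get_data, model_dict,
                                          get_model, atml.measure.Acc())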
atml.exp.get_openml_testing(openml_dict, flow_dict, measure, max_n_exp=10)[source]

Gather machine learning experiment results from OpenML

Parameters:
openml_dict: dict
A dictionary that defines (1) the user-defined dataset index, (2) the name of the dataset, (3) the OpenML dataset ID, and (4) the OpenML task ID. Example: {0: ('adult', 1590, 7592)}
flow_dict: dict
A dictionary that maps (1) the OpenML flow (model) ID to (2) the user-defined flow (model) index. Example: {1172: 0}
measure: str
The selected evaluation measure as defined by OpenML. Example: 'predictive_accuracy'. See: https://www.openml.org/search?type=measure
max_n_exp: int
The maximum number of results collected for each dataset and task combination.

Returns:
res: pandas.DataFrame
A table that contains the collected experiment results.
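
A usage sketch assembled from the examples in the parameter descriptions above:

    import atml

    openml_dict = {0: ('adult', 1590, 7592)}  # name, OpenML dataset ID, OpenML task ID
    flow_dict = {1172: 0}                     # OpenML flow ID -> user-defined model index
    res = atml.exp.get_openml_testing(openml_dict, flow_dict,
                                      measure='predictive_accuracy', max_n_exp=10)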
atml.exp.get_random_split_measurement(model_instance, x, y, measure, sparse=False, cap_size=10000, test_size=0.5)[source]

Perform a random split validation experiment for a given combination of model, dataset, and evaluation measure.

Parameters:
model_instance: sklearn.predictor
A model instance following the sklearn.predictor template; it should have a fit() method for model training and a predict_proba() method to predict probability vectors on test data.
x: numpy.ndarray
The data matrix with shape (n samples, d dimensions).
y: numpy.ndarray
The label vector with shape (n samples, 1).
measure: atml.Measure
An evaluation measure selected from the atml.measure module.
sparse: boolean
Whether to use only a subset of the dataset for the experiment.
cap_size: integer
When sparse=True, the maximum number of samples used in the experiment.
test_size: float
The proportion of the dataset used as the testing (validation) set.

Returns:
measurement: float
The performance measurement on the testing (validation) set.
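
The protocol is a single random train/test split; a minimal sketch of the idea (illustrative, and not necessarily identical to the module's implementation, e.g. the handling of sparse and cap_size is omitted):

    from sklearn.model_selection import train_test_split

    def random_split_measurement_sketch(model_instance, x, y, measure, test_size=0.5):
        # Split the data, train on one half, and score on the held-out half.
        x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=test_size)
        model_instance.fit(x_tr, y_tr)
        s = model_instance.predict_proba(x_te)
        return measure.get_measure(s, y_te)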
atml.exp.get_single_testing(data_idx, mdl, data_dict, get_data, measure, sparse=False, cap_size=10000, test_size=0.5)[source]

Perform a single testing experiment on a specified dataset with the given model.

Parameters:
data_idx: int
The index of the selected dataset, as defined in data_dict.
mdl: sklearn.predictor
An instance of an sklearn predictor. The model should have a fit(x, y) method for training and a predict_proba(x) method for testing.
data_dict: dict
A dictionary that defines the index and the reference name of every dataset. Example: data_dict = {0: 'iris', 1: 'digits', 2: 'wine'}
get_data: Callable
A function that takes a dataset index and returns the features (x) and target (y) of the specified dataset.
measure: atml.Measure
An evaluation measure selected from the atml.measure module.
sparse: boolean
Whether to use only a subset of the dataset for the experiment.
cap_size: int
When sparse=True, the maximum number of samples used in the experiment.
test_size: float
The proportion of the dataset used as the testing (validation) set.

Returns:
tmp_m: float
The performance measurement on the testing (validation) set.

atml.irt module

The :mod:`atml.irt` module contains a set of statistical models based on Item Response Theory.

class atml.irt.Beta_3_IRT[source]

Bases: object

The class for the Beta-3 IRT model.

curve(d_id)[source]

Generate the item characteristic curve (the expectation and the 0.75, 0.5, and 0.25 quantiles) for a given dataset.

Parameters:
d_id: int
The index of the given dataset.

Returns:
E: numpy.ndarray
The curve values of the expectation.
E_up: numpy.ndarray
The curve values of the 0.75 quantile.
E_mid: numpy.ndarray
The curve values of the 0.5 quantile (median).
E_low: numpy.ndarray
The curve values of the 0.25 quantile.
fit(dataset_list, dataset, model_list, model, measure)[source]

Fit the IRT model to a collection of testing results.

Parameters:
dataset_list: list
A list containing the reference names of all the datasets.
dataset: numpy.ndarray
A (n_experiment, ) numpy array containing the dataset index of each experiment.
model_list: list
A list containing the reference names of all the models.
model: numpy.ndarray
A (n_experiment, ) numpy array containing the model index of each experiment.
measure: numpy.ndarray
A (n_experiment, ) numpy array containing the performance measurement of each experiment.
predict(dataset, model)[source]

Predict the expected response for a set of combinations between different models and datasets.

Parameters:
dataset: numpy.ndarray
A (n_test, ) numpy array containing the dataset index of each testing experiment.
model: numpy.ndarray
A (n_test, ) numpy array containing the model index of each testing experiment.

Returns:
E: numpy.ndarray
A (n_test, ) numpy array containing the expected performance measurement of each testing experiment.
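
A small end-to-end sketch of fit() and predict() with toy indices and measurements (the names and values are illustrative only):

    import numpy as np

    import atml

    irt = atml.irt.Beta_3_IRT()
    irt.fit(dataset_list=['iris', 'wine'],
            dataset=np.array([0, 0, 1, 1]),
            model_list=['logreg', 'nb'],
            model=np.array([0, 1, 0, 1]),
            measure=np.array([0.95, 0.93, 0.97, 0.96]))

    # Expected performance of model 1 on both datasets.
    E = irt.predict(dataset=np.array([0, 1]), model=np.array([1, 1]))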
class atml.irt.Logistic_IRT[source]

Bases: object

The class for the Logistic IRT model.

curve(d_id)[source]

Generate the item characteristic curve (the expectation and the 0.75, 0.5, and 0.25 quantiles) for a given dataset.

Parameters:
d_id: int
The index of the given dataset.

Returns:
E: numpy.ndarray
The curve values of the expectation.
E_up: numpy.ndarray
The curve values of the 0.75 quantile.
E_mid: numpy.ndarray
The curve values of the 0.5 quantile (median).
E_low: numpy.ndarray
The curve values of the 0.25 quantile.
fit(dataset_list, dataset, model_list, model, measure)[source]

Fit the IRT model to a collection of testing results.

Parameters:
dataset_list: list
A list containing the reference names of all the datasets.
dataset: numpy.ndarray
A (n_experiment, ) numpy array containing the dataset index of each experiment.
model_list: list
A list containing the reference names of all the models.
model: numpy.ndarray
A (n_experiment, ) numpy array containing the model index of each experiment.
measure: numpy.ndarray
A (n_experiment, ) numpy array containing the performance measurement of each experiment.
predict(dataset, model)[source]

Predict the expected response for a set of combinations between different models and datasets.

Parameters:
dataset: numpy.ndarray
A (n_test, ) numpy array containing the dataset index of each testing experiment.
model: numpy.ndarray
A (n_test, ) numpy array containing the model index of each testing experiment.

Returns:
E: numpy.ndarray
A (n_test, ) numpy array containing the expected performance measurement of each testing experiment.
atml.irt.get_E_beta_3(parameter, fid, did, N_data, N_flow)[source]

Predict the expected response for a set of combinations between different models and datasets.

Parameters:
parameter: numpy.ndarray
The estimated parameters of the Beta-3 IRT model.
fid: numpy.ndarray
A (n_test, ) numpy array containing the model index of each testing experiment.
did: numpy.ndarray
A (n_test, ) numpy array containing the dataset index of each testing experiment.
N_data: int
The total number of datasets in the IRT model.
N_flow: int
The total number of models in the IRT model.

Returns:
E: numpy.ndarray
A (n_test, ) numpy array containing the expected performance measurement of each testing experiment.
atml.irt.get_E_logistic(parameter, fid, did, N_data, N_flow)[source]

Predict the expected response for a set of combinations between different models and datasets.

Parameters:
parameter: numpy.ndarray
The estimated parameters of the Logistic IRT model.
fid: numpy.ndarray
A (n_test, ) numpy array containing the model index of each testing experiment.
did: numpy.ndarray
A (n_test, ) numpy array containing the dataset index of each testing experiment.
N_data: int
The total number of datasets in the IRT model.
N_flow: int
The total number of models in the IRT model.

Returns:
E: numpy.ndarray
A (n_test, ) numpy array containing the expected performance measurement of each testing experiment.
atml.irt.get_curve_beta3(parameter, did, N_data, N_flow)[source]

Generate the item characteristic curve (the expectation and the 0.75, 0.5, and 0.25 quantiles) for a given dataset.

Parameters:
parameter: numpy.ndarray
The estimated parameters of the Beta-3 IRT model.
did: int
The index of the given dataset.
N_data: int
The total number of datasets in the IRT model.
N_flow: int
The total number of models in the IRT model.

Returns:
E: numpy.ndarray
The curve values of the expectation.
E_up: numpy.ndarray
The curve values of the 0.75 quantile.
E_mid: numpy.ndarray
The curve values of the 0.5 quantile (median).
E_low: numpy.ndarray
The curve values of the 0.25 quantile.
atml.irt.get_curve_logistic(parameter, did, N_data, N_flow)[source]

Generate the item characteristic curve (the expectation and the 0.75, 0.5, and 0.25 quantiles) for a given dataset.

Parameters:
parameter: numpy.ndarray
The estimated parameters of the Logistic IRT model.
did: int
The index of the given dataset.
N_data: int
The total number of datasets in the IRT model.
N_flow: int
The total number of models in the IRT model.

Returns:
E: numpy.ndarray
The curve values of the expectation.
E_up: numpy.ndarray
The curve values of the 0.75 quantile.
E_mid: numpy.ndarray
The curve values of the 0.5 quantile (median).
E_low: numpy.ndarray
The curve values of the 0.25 quantile.
atml.irt.get_obj(parameter, data, extra_args)[source]

Wrapper function to get the value of the objective function.

Parameters:
parameter: tensorflow.Variable
The parameter of the IRT model.
data: tensorflow.Tensor
A (n_batch, 3) tensor containing the model index, the dataset index, and the performance measure of each experiment.
extra_args: tuple
A tuple containing extra information for the IRT model: (irt_type, total number of datasets, total number of models).

Returns:
L: tensorflow.Tensor
The averaged log-likelihood of the IRT model.
atml.irt.get_obj_g(parameter, data, extra_args)[source]

Wrapper function to get the gradient of the objective function.

Parameters:
parameter: tensorflow.Variable
The parameter of the IRT model.
data: tensorflow.Tensor
A (n_batch, 3) tensor containing the model index, the dataset index, and the performance measure of each experiment.
extra_args: tuple
A tuple containing extra information for the IRT model: (irt_type, total number of datasets, total number of models).

Returns:
L: tensorflow.Tensor
The averaged log-likelihood of the IRT model.
g: tensorflow.Tensor
The gradient of the parameters.
atml.irt.ml_beta_3_obj(parameter, fid, did, measure, N_data, N_flow)[source]

The log-likelihood objective function of the Beta-3 IRT model.

Parameters:
parameter: tensorflow.Variable
The parameters of the Beta-3 IRT model.
fid: tf.Tensor
A (n_batch, ) tensor containing the model index of each experiment.
did: tf.Tensor
A (n_batch, ) tensor containing the dataset index of each experiment.
measure: tf.Tensor
A (n_batch, ) tensor containing the performance measurement of each experiment.
N_data: int
The total number of datasets in the IRT model.
N_flow: int
The total number of models in the IRT model.

Returns:
L: tf.Tensor
The averaged log-likelihood of the IRT model.
atml.irt.ml_logistic_obj(parameter, fid, did, measure, N_data, N_flow)[source]

The log-likelihood objective function of the Logistic IRT model.

Parameters:
parameter: tensorflow.Variable
The parameters of the Logistic IRT model.
fid: tf.Tensor
A (n_batch, ) tensor containing the model index of each experiment.
did: tf.Tensor
A (n_batch, ) tensor containing the dataset index of each experiment.
measure: tf.Tensor
A (n_batch, ) tensor containing the performance measurement of each experiment.
N_data: int
The total number of datasets in the IRT model.
N_flow: int
The total number of models in the IRT model.

Returns:
L: tf.Tensor
The averaged log-likelihood of the IRT model.

atml.measure module

The :mod:`atml.measure` module contains a set of common evaluation measures for predictive machine learning tasks.

class atml.measure.AUC(target_positive=0)[source]

Bases: atml.measure.Measure

Area under the ROC curve

get_measure(s, y)[source]

Parameters:
s: numpy.ndarray
The predicted probability scores.
y: numpy.ndarray
The true labels.

Returns:
auc: float
The area under the ROC curve.

static transform(m)[source]

Parameters:
m: float
The raw measurement.

Returns:
m_hat: float
The transformed measurement.

class atml.measure.Acc[source]

Bases: atml.measure.Measure

Multi-class accuracy

static get_measure(s, y)[source]

Parameters:
s: numpy.ndarray
The predicted probability scores.
y: numpy.ndarray
The true labels.

Returns:
acc: float
The multi-class accuracy.

class atml.measure.BAcc(target_positive=0)[source]

Bases: atml.measure.Measure

Binary accuracy

get_measure(s, y)[source]

Parameters:
s: numpy.ndarray
The predicted probability scores.
y: numpy.ndarray
The true labels.

Returns:
bacc: float
The binary accuracy.

class atml.measure.BS[source]

Bases: atml.measure.Measure

Brier score

static get_measure(s, y)[source]

Parameters:
s: numpy.ndarray
The predicted probability scores.
y: numpy.ndarray
The true labels.

Returns:
bs: float
The Brier score.

static transform(m)[source]

Parameters:
m: float
The raw measurement.

Returns:
m_hat: float
The transformed measurement.

class atml.measure.F1(target_positive=0)[source]

Bases: atml.measure.Measure

F1 score

get_measure(s, y)[source]

Parameters:
s: numpy.ndarray
The predicted probability scores.
y: numpy.ndarray
The true labels.

Returns:
f1: float
The F1 score.

class atml.measure.LL[source]

Bases: atml.measure.Measure

Logarithmic loss (cross-entropy)

static get_measure(s, y)[source]

Parameters:
s: numpy.ndarray
The predicted probability scores.
y: numpy.ndarray
The true labels.

Returns:
ll: float
The logarithmic loss (cross-entropy).

class atml.measure.Measure(task)[source]

Bases: object

The base measure class; it specifies the corresponding task type of the measure (e.g. classification).
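
New measures follow the same template as the built-in classes: subclass Measure, declare the task type, and implement get_measure(s, y). A hypothetical example (the ZeroOne class and the 'classification' task string are assumptions for illustration):

    import numpy as np

    from atml.measure import Measure

    class ZeroOne(Measure):
        # A hypothetical 0/1 error measure, shown only to illustrate the template.

        def __init__(self):
            super().__init__(task='classification')

        @staticmethod
        def get_measure(s, y):
            # s: predicted probability vectors, shape (n_samples, n_classes);
            # y: true labels, shape (n_samples,) or (n_samples, 1).
            return float(np.mean(np.argmax(s, axis=1) != np.ravel(y)))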

atml.visualisation module

atml.visualisation.get_beta3_curve(data_idx, data_ref, beta_mdl, res, measure)[source]
atml.visualisation.get_beta3_figures(target_measure=2)[source]
atml.visualisation.get_cat_figures(target_measure, test_mdl_class)[source]
atml.visualisation.get_gp_curve(target_measure=2)[source]
atml.visualisation.get_logistic_curve(data_idx, data_ref, logistic_mdl, res, measure)[source]
atml.visualisation.get_logistic_figures(target_measure=2)[source]

Module contents