QUMPHY package

qumphy.data.deepbeat module

File: qumphy/data/deepbeat.py Project: 22HLT01 QUMPHY Contact: oskar.pfeffer@ptb.de Gitlab: https://gitlab.com/qumphy Description: Functions handling the DeepBeat dataset.

class qumphy.data.deepbeat.DeepBeatDataModule(batch_size, num_workers, pin_memory=True, **dskwargs)[source]

Bases: LightningDataModule

LightningDataModule implementation for the DeepBeat dataset.

setup(stage)[source]

Called at the beginning of fit (train + validate), validate, test, or predict. This is a good hook when you need to build models dynamically or adjust something about them. This hook is called on every process when using DDP.

Parameters:: stage – either 'fit', 'validate', 'test', or 'predict'

Example:

class LitModel(...):
    def __init__(self):
        self.l1 = None

    def prepare_data(self):
        download_data()
        tokenize()

        # don't do this
        self.something = something_else

    def setup(self, stage):
        data = load_data(...)
        self.l1 = nn.Linear(28, data.num_classes)

test_dataloader()[source]

An iterable or collection of iterables specifying test samples.

For more information about multiple dataloaders, see this section.

For data processing use the following pattern:

download in prepare_data()

process and split in setup()

However, the above are only necessary for distributed processing.

Warning

do not assign state in prepare_data

test()
prepare_data()
setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a test dataset and a test_step(), you don’t need to implement this method.

train_dataloader()[source]

An iterable or collection of iterables specifying training samples.

For more information about multiple dataloaders, see this section.

The dataloader you return will not be reloaded unless you set the Trainer argument reload_dataloaders_every_n_epochs to a positive integer.

For data processing use the following pattern:

download in prepare_data()

process and split in setup()

However, the above are only necessary for distributed processing.

Warning

do not assign state in prepare_data

fit()
prepare_data()
setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

val_dataloader()[source]

An iterable or collection of iterables specifying validation samples.

For more information about multiple dataloaders, see this section.

The dataloader you return will not be reloaded unless you set the Trainer argument reload_dataloaders_every_n_epochs to a positive integer.

It’s recommended that all data downloads and preparation happen in prepare_data().

fit()
validate()
prepare_data()
setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a validation dataset and a validation_step(), you don’t need to implement this method.

class qumphy.data.deepbeat.DeepBeatDataset(data_directory, subset, dataset='set_revised', target_format='binary', normalize=False, dtype=torch.float32, load_data=True)[source]

Bases: Dataset

DeepBeat dataset class.

get_data()[source]

get_labels()[source]

load_data(data_directory)[source]

normalize_data(data)[source]

Rescales the data to the range [-1, 1].

Parameters:: data (array) – The data to rescale.
Returns:: array: The normalized data in the range [-1, 1].
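The rescaling to [-1, 1] can be illustrated with a minimal sketch. It assumes plain min-max scaling over the whole array; the actual normalize_data implementation may differ (for example by scaling each sample separately), and the helper name rescale_to_unit_range is hypothetical.

```python
import numpy as np


def rescale_to_unit_range(data: np.ndarray) -> np.ndarray:
    """Min-max rescale `data` to [-1, 1] (hypothetical helper, global min/max assumed)."""
    data = np.asarray(data, dtype=np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # Constant input has no range to scale, so map everything to 0.
        return np.zeros_like(data)
    return 2.0 * (data - lo) / (hi - lo) - 1.0


signal = np.array([0.0, 2.5, 5.0, 10.0])
scaled = rescale_to_unit_range(signal)
```

The smallest input value maps to -1, the largest to 1, and everything in between is placed linearly.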

select_subset_indices(metadata)[source]

Set an index mask of the memmapped dataset based on the selected subset.

Parameters:: metadata (pd.core.frame.DataFrame) – The metadata dataframe.
Returns:: The index mask.
Return type:: pd.core.indexes.base.Index

qumphy.data.pulsedb module

File: qumphy/data/pulsedb.py Project: 22HLT01 QUMPHY Contact: oskar.pfeffer@ptb.de Gitlab: https://gitlab.com/qumphy Description: Functions handling PulseDB data.

class qumphy.data.pulsedb.PulseDBDataModule(data_directory, dataset, source, batch_size, num_workers, **dskwargs)[source]

Bases: LightningDataModule

LightningDataModule implementation for the PulseDB dataset.

get_target_stats()[source]

setup(stage)[source]

Called at the beginning of fit (train + validate), validate, test, or predict. This is a good hook when you need to build models dynamically or adjust something about them. This hook is called on every process when using DDP.

Parameters:: stage – either 'fit', 'validate', 'test', or 'predict'

Example:

class LitModel(...):
    def __init__(self):
        self.l1 = None

    def prepare_data(self):
        download_data()
        tokenize()

        # don't do this
        self.something = something_else

    def setup(self, stage):
        data = load_data(...)
        self.l1 = nn.Linear(28, data.num_classes)

test_dataloader()[source]

An iterable or collection of iterables specifying test samples.

For more information about multiple dataloaders, see this section.

For data processing use the following pattern:

download in prepare_data()

process and split in setup()

However, the above are only necessary for distributed processing.

Warning

do not assign state in prepare_data

test()
prepare_data()
setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a test dataset and a test_step(), you don’t need to implement this method.

train_dataloader()[source]

An iterable or collection of iterables specifying training samples.

For more information about multiple dataloaders, see this section.

The dataloader you return will not be reloaded unless you set the Trainer argument reload_dataloaders_every_n_epochs to a positive integer.

For data processing use the following pattern:

download in prepare_data()

process and split in setup()

However, the above are only necessary for distributed processing.

Warning

do not assign state in prepare_data

fit()
prepare_data()
setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

val_dataloader()[source]

An iterable or collection of iterables specifying validation samples.

For more information about multiple dataloaders, see this section.

The dataloader you return will not be reloaded unless you set the Trainer argument reload_dataloaders_every_n_epochs to a positive integer.

It’s recommended that all data downloads and preparation happen in prepare_data().

fit()
validate()
prepare_data()
setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a validation dataset and a validation_step(), you don’t need to implement this method.

class qumphy.data.pulsedb.PulseDBDataset(data_directory, dataset, subset, source='all', pressure='both', normalize=False, dtype=torch.float32, load_data=True)[source]

Bases: Dataset

Dataset Class for the PulseDB dataset.

To use the PulseDB dataset, the data should be in a directory containing signals.npy and metadata.csv. Before using the dataset, run the write_target_stats_yaml(data_directory) function to create a stats.yaml file containing the statistics of the target, i.e., the mean, median, std, and baseline measures of the SBP and DBP.

calculate_baseline_measures()[source]

calculate_target_stats()[source]

get_data()[source]

get_labels()[source]

get_target_stats()[source]

Returns the statistics of the target.

Returns:: (BP_mean, BP_std, BP_median, MAE_baseline, RMSE_baseline)
Return type:: tuple

load_data(data_directory)[source]

load_target_stats(data_directory)[source]

Reads the target statistics of the PulseDB training datasets from a YAML file.

Parameters:: data_directory (pathlib.Path) – The directory where the data is located.
Return type:: None

normalize_data(data)[source]

Rescales the data to the range [-1, 1].

Parameters:: data (array) – The data to rescale.
Returns:: array: The normalized data in the range [-1, 1].

normalize_target(target, denormalize=False)[source]

Rescales the target to 0 mean and 1 standard deviation, using the BP_mean and BP_std attributes.

Parameters:

target (np.ndarray) – The input target to be normalized.
denormalize (bool, optional) – If True, the target will be denormalized back to its original scale.

Returns:

The (de)normalized target.

Return type:

np.ndarray
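A minimal sketch of the standardization described above, with the mean and standard deviation passed in explicitly instead of being read from the BP_mean and BP_std attributes. The function name standardize_target and the blood pressure values are illustrative assumptions, not taken from the package or from PulseDB.

```python
import numpy as np


def standardize_target(target, bp_mean, bp_std, denormalize=False):
    """Map targets to zero mean and unit std, or invert that mapping (hypothetical helper)."""
    target = np.asarray(target, dtype=np.float64)
    if denormalize:
        # Inverse transform: back to the original scale (e.g. mmHg).
        return target * bp_std + bp_mean
    return (target - bp_mean) / bp_std


# Illustrative (SBP, DBP) pairs, one row per sample.
bp = np.array([[120.0, 80.0], [135.0, 85.0], [110.0, 70.0]])
bp_mean, bp_std = bp.mean(axis=0), bp.std(axis=0)

z = standardize_target(bp, bp_mean, bp_std)
restored = standardize_target(z, bp_mean, bp_std, denormalize=True)
```

Calling the function with denormalize=True on model outputs is what brings predictions back to the original scale before metrics are computed.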

select_subset_indices(metadata)[source]

Return an index mask of the memmapped dataset based on the selected subset.

Parameters:: metadata (pd.core.frame.DataFrame) – The metadata dataframe.
Returns:: The index mask.
Return type:: pd.core.indexes.base.Index

qumphy.data.pulsedb.get_target_stats(data_directory, dataset, source, dtype=<class 'numpy.float16'>)[source]

Reads the target statistics of the PulseDB training datasets from a YAML file.

Parameters:

data_directory (string) – Full path of the directory containing the data.
dataset (string) – Choose between the “calibfree”, “calib”, “aami” or “mini” dataset.
source (string) – Either “mimiciii” or “vital” or “all” (both).
dtype (np.dtype) – The data type of the target statistics.

Returns:

The target statistics as a dictionary.

Return type:

dict

qumphy.data.pulsedb.write_target_stats_yaml(data_directory, verbose=True)[source]

Writes the target statistics of the PulseDB training datasets to a YAML file.

Parameters:: data_directory (string) – The directory where the data is located.
Returns:: None

qumphy.data.utils module

File: qumphy/data/utils.py Project: 22HLT01 QUMPHY Contact: nando.hegemann@ptb.de Gitlab: https://gitlab.com/qumphy Description: Loading functions for various datasets.

qumphy.data.utils.calculate_regression_baseline(dataset, median, mean)[source]

Calculate regression baseline metrics using the median and mean of the dataset.

Parameters:

dataset (Class) – Dataset object.
median (np.ndarray | float) – Median of the dataset.
mean (np.ndarray | float) – Mean of the dataset.

Returns:

Baseline metrics.

Return type:

tuple[float, float, float, float]

qumphy.data.utils.get_filename(dataset, filetype)[source]

Create filename for dataset descriptor.

Return type:

str

Parameters:

dataset (str) – Dataset descriptor, e.g. test, train3 or validate02.
filetype (str) – Descriptor for the data part, e.g. label or signal.

Returns:

Filename for the corresponding data.

Examples

>>> get_filename("test", "signal")
'test_signal.npy'

>>> get_filename("train3", "label")
'train_label_03.npy'

>>> get_filename("train03", "label")
'train_label_03.npy'
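The naming scheme in the examples above can be reproduced with a short sketch. It assumes the descriptor is a name followed by an optional number, which is zero-padded to two digits and appended after the file type; get_filename_sketch is a hypothetical stand-in, and the real function may handle more cases.

```python
import re


def get_filename_sketch(dataset: str, filetype: str) -> str:
    """Build '<name>_<filetype>[_NN].npy' from a descriptor such as 'train3'."""
    match = re.fullmatch(r"([A-Za-z]+)(\d*)", dataset)
    if match is None:
        raise ValueError(f"Unrecognized dataset descriptor: {dataset!r}")
    name, number = match.groups()
    if number:
        # 'train3' and 'train03' both map to the suffix '_03'.
        return f"{name}_{filetype}_{int(number):02d}.npy"
    return f"{name}_{filetype}.npy"


test_file = get_filename_sketch("test", "signal")
train_file = get_filename_sketch("train3", "label")
```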

qumphy.metrics module

File: qumphy/metrics.py Project: 22HLT01 QUMPHY Contact: nando.hegemann@ptb.de Gitlab: https://gitlab.com/qumphy Description: Evaluation metrics for model performance.

qumphy.metrics.all_binary_metrics(target, prediction)[source]

Evaluate all binary classification metrics.

Given a target and a prediction array, this function computes all metrics as decided for the QUMPHY common evaluation framework.

The metrics are returned as a dictionary with the following keys:

auc: Area under the curve calculated with raw probabilities
f1: F1-score calculated with a classification threshold of 0.5
mcc_sens: Matthews correlation coefficient calculated with a threshold achieving a sensitivity of 0.8
mcc_spec: Matthews correlation coefficient calculated with a threshold achieving a specificity of 0.8
sens: Sensitivity (with a threshold achieving a sensitivity of 0.8)
spec: Specificity (with a threshold achieving a specificity of 0.8)

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Model output predictions (raw probability of positive class).

Returns:

Dictionary with all metrics.

Return type:

Dict[str, float]

qumphy.metrics.all_regression_metrics(target, prediction, baseline_mae=None)[source]

Evaluate all regression metrics.

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Model output predictions.
baseline_mae (float) – Baseline mean absolute error.

Returns:

Dictionary with all metrics.

Return type:

dict[str, float]

qumphy.metrics.auc_score_binary(target, prediction, axis=0)[source]

Compute the area under the curve (AUC) score for binary classification.

Return type:

float | ndarray

Parameters:

target (np.ndarray) – Binary ground truth values for different samples.
prediction (np.ndarray) – Binary model output predictions (raw prob.) associated with the positive class.
axis (int, optional) – Axis to compute AUC over, by default 0.

Returns:

Array of AUC values.

See also

auc_score_multiclass: AUC score for more than two classes.

Examples

>>> target = np.array([0, 1, 0, 1, 1, 0, 1, 0, 1, 0])
>>> prediction = np.array([.99, .8, .6, .63, .77, .23, .3, .78, .2, 0.01])
>>> auc_score_binary(target, prediction)
xx

>>> target = np.random.randint(0, 2, (100, 2))
>>> auc_score_binary(target, target, axis=0)
(1.0, 1.0)

>>> target = np.random.randint(0, 2, (50, 100, 2, 5))
>>> auc_score_binary(target, target, axis=1).shape
(50, 2, 5)
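For 1D input, the binary AUC has a simple interpretation: it is the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score. The sketch below computes it this way, counting ties as one half. It is an illustration of the quantity, not the package's implementation, and pairwise_auc is a hypothetical name.

```python
import numpy as np


def pairwise_auc(target: np.ndarray, prediction: np.ndarray) -> float:
    """AUC as the share of correctly ordered (positive, negative) pairs; ties count 0.5."""
    target = np.asarray(target)
    prediction = np.asarray(prediction, dtype=np.float64)
    pos = prediction[target == 1]
    neg = prediction[target == 0]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative sample.")
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))


target = np.array([0, 1, 0, 1, 1, 0, 1, 0, 1, 0])
prediction = np.array([.99, .8, .6, .63, .77, .23, .3, .78, .2, 0.01])
auc = pairwise_auc(target, prediction)  # 13 of the 25 pairs are ordered correctly
```

The pairwise comparison needs memory proportional to the number of pairs, so a rank-based or ROC-curve-based computation is preferable for large arrays.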

qumphy.metrics.auc_score_multiclass(target, prediction, comparison_type='ovr')[source]

Compute the area under the curve (AUC) score for multi-class classification.

Return type:

float | ndarray

Parameters:

target (np.ndarray) – Multiclass ground truth values.
prediction (np.ndarray) – Array of model output probabilities of different classes for different samples. If target shape is (n_samples, ...) with n_classes different class values, then prediction needs to have shape (n_samples, n_classes, ...). Axis 1 (n_classes) needs to sum to one.
comparison_type (str) –
Comparison type for multiclasses, by default “ovr”.

ovr : Stands for one-vs-rest. Computes the AUC for each class against the rest of the classes.

ovo : Stands for one-vs-one. Computes the average AUC of all possible pairwise combinations of classes.

Returns:

Array of AUC values.

See also

auc_score_binary: AUC score for exactly two classes.

Examples

>>> target = np.array([0, 1, 2, 1, 2, 0])
>>> prediction = np.array([[0.8, 0.1, 0.1],
...                        [0.2, 0.5, 0.3],
...                        [0.8, 0.1, 0.1],
...                        [0.7, 0.2, 0.1],
...                        [0.4, 0.3, 0.3],
...                        [0.5, 0.4, 0.1]])
>>> auc_score_multiclass(target, prediction, comparison_type="ovo")
0.6875

>>> target = np.random.randint(0, 3, (100, 10, 5, 2))
>>> prediction = np.random.uniform(0, 1, (100, 3, 10, 5, 2))
>>> prediction /= np.expand_dims(np.sum(prediction, axis=1), 1)
>>> auc_score_multiclass(target, prediction).shape
(10, 5, 2)

qumphy.metrics.balanced_accuracy_score(target, prediction)[source]

Compute balanced accuracy score for binary or multi-class classification.

For the binary case the balanced accuracy score \(\operatorname{Acc}_b\) is given by the arithmetic mean of sensitivity (Se) and specificity (Sp), i.e.

\[\operatorname{Acc}_b = \frac{1}{2}(\operatorname{Se} + \operatorname{Sp}) = \frac{1}{2}\Bigl( \frac{\operatorname{TP}}{\operatorname{TP}+\operatorname{FN}} + \frac{\operatorname{TN}}{\operatorname{TN}+\operatorname{FP}} \Bigr),\]

expressed by true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). In general, balanced accuracy is computed by

\[\operatorname{Acc}_b(y_{\mathrm{true}}, y_{\mathrm{pred}}) = \frac{\sum_{i=1}^N w_i \, \delta(y_{\mathrm{true},i} = y_{\mathrm{pred}, i})}{\sum_{i=1}^N w_i}\]

with weights

\[w_i = \frac{1}{\sum_{j=1}^N \delta(y_{\mathrm{true},i} = y_{\mathrm{true},j})},\]

where \(\delta(y_i = y_j)\) denotes the Kronecker delta function.

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Predicted values.

Returns:

Balanced accuracy score for binary or multi-class classification.

Return type:

float

Examples

Computation of balanced accuracy score for binary classification, i.e., a one dimensional array with only 2 classes

>>> target = np.array([0, 1, 1, 0, 1, 0, 0, 1, 0, 1])
>>> prediction = np.array([0, 1, 0, 1, 1, 1, 0, 1, 1, 1])
>>> balanced_accuracy_score(target, prediction)
0.6

Computation of balanced accuracy score for a multi-class scenario, i.e., a 1D array with more than two classes

>>> target = np.array([1, 2, 2, 2, 1, 2, 1, 0, 1, 1])
>>> prediction = np.array([1, 1, 2, 0, 0, 1, 1, 0, 0, 2])
>>> balanced_accuracy_score(target, prediction)
0.55
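The weighted formula above can be checked directly against both examples. The following sketch implements it with NumPy, giving each sample the weight 1 / (number of samples with the same true class); balanced_accuracy_sketch is a hypothetical name and not the package function.

```python
import numpy as np


def balanced_accuracy_sketch(target, prediction) -> float:
    """Class-frequency-weighted accuracy, following the weighted-sum formula."""
    target = np.asarray(target)
    prediction = np.asarray(prediction)
    # counts[inverse[i]] is the number of samples that share the true class of sample i.
    _, inverse, counts = np.unique(target, return_inverse=True, return_counts=True)
    weights = 1.0 / counts[inverse]
    correct = (target == prediction).astype(np.float64)
    return float(np.sum(weights * correct) / np.sum(weights))


binary = balanced_accuracy_sketch(
    np.array([0, 1, 1, 0, 1, 0, 0, 1, 0, 1]),
    np.array([0, 1, 0, 1, 1, 1, 0, 1, 1, 1]),
)
multiclass = balanced_accuracy_sketch(
    np.array([1, 2, 2, 2, 1, 2, 1, 0, 1, 1]),
    np.array([1, 1, 2, 0, 0, 1, 1, 0, 0, 2]),
)
```

With these weights every class contributes a total weight of 1, so the result equals the mean of the per-class recalls.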

qumphy.metrics.f1_score(target, prediction, average=None)[source]

Compute F1-score of binary, multi-class or multi-label classification.

The \(F_1\) score is computed using the true positives (TP), false positives (FP) and false negatives (FN) via

\[F_1 = \frac{2\operatorname{TP}}{2\operatorname{TP} + \operatorname{FP} + \operatorname{FN}}.\]

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Predicted values.
average (str | None, optional) – Averaging of the F1-scores (default None). For binary classification, average='binary' is the default case. For multi-class and multi-label classification, average=None is the default case, which results in F1-scores for each individual class. For more detail about averaging see the documentation of sklearn.metrics.f1_score.

Returns:

F1 score(s) for binary, multi-class or multi-label classification.

Return type:

float | np.ndarray

Examples

Computation of F1-score for binary classification, i.e., a one dimensional array with only 2 classes

>>> target = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 1])
>>> prediction = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])
>>> f1_score(target, prediction)
0.5

Computation of F1-score for a multi-class scenario, i.e., a 1D array with more than two classes

>>> target = np.array([1, 2, 2, 2, 1, 2, 1, 0, 1, 1])
>>> prediction = np.array([1, 1, 2, 0, 0, 1, 1, 0, 0, 2])
>>> f1_score(target, prediction)
array([0.4, 0.44444444, 0.33333333])

Computation of F1-score for a multi-label scenario, i.e., a 2D array with columns representing different labels and values 0 or 1 as entries

>>> target = np.array([[0, 1, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])
>>> prediction = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 1], [1, 1, 0]])
>>> f1_score(target, prediction)
array([0.66666667, 0.4, 0.4])

Computation of F1-score for a multi-class scenario with averaging of F1-scores over the different classes

>>> target = np.array([1, 2, 2, 2, 1, 2, 1, 0, 1, 1])
>>> prediction = np.array([1, 1, 2, 0, 0, 1, 1, 0, 0, 2])
>>> f1_score(target, prediction, average='micro')
0.4

qumphy.metrics.false_discovery_rate(target, prediction, average=None)[source]

Compute false discovery rate (FDR).

The false discovery rate (FDR) is given by \(\operatorname{FDR} = 1 - \operatorname{PPV}\), where \(\operatorname{PPV}\) is the precision (positive predictive value).

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Predicted values.
average (str | None, optional) – Averaging type (default None). For more detail about averaging see the documentation of sklearn.metrics.precision_score.

Returns:

False discovery rate for binary, multi-class or multi-label classification.

Return type:

float | np.ndarray

See also

precision_score, f1_score

qumphy.metrics.general_threshold(target, prediction, metric, metric_value, greater_than=True)[source]

Find the threshold that sets the given metric closest to metric_value.

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Model output predictions.
metric (Callable[[np.ndarray, np.ndarray], float]) – A metric function that takes target and prediction arrays as input.
metric_value (float) – The value of the metric to be achieved.
greater_than (bool) – True if the metric is supposed to be higher than metric_value, false otherwise.

Returns:

The threshold that achieves the metric.

Return type:

float

qumphy.metrics.ieee_grades(target, prediction)[source]

Compute the IEEE grades of the predicted values.

The grades are calculated by comparing the difference between the target and the prediction. Returned are the percentage of samples that fall within each grade. The grading scores follow the IEEE Std 1708a™-2019 scheme, where instead of a mean absolute difference of two measurements with the standard device, we use only one measurement.

The grading for each sample is determined as follows

Grade A for error ≤5 mmHg
Grade B for error between 5-6 mmHg
Grade C for error between 6-7 mmHg
Grade D for error ≥7 mmHg

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Model output predictions.

Return type:

np.ndarray
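A minimal sketch of the per-grade percentages, assuming the absolute error |target - prediction| in mmHg is graded per sample. How errors of exactly 5, 6 or 7 mmHg are assigned is an assumption here (each upper edge is treated as inclusive) and may differ from ieee_grades; the helper name is hypothetical.

```python
import numpy as np


def ieee_grade_percentages(target, prediction) -> np.ndarray:
    """Percentage of samples in grades A, B, C, D (hypothetical helper, edge handling assumed)."""
    errors = np.abs(
        np.asarray(target, dtype=np.float64) - np.asarray(prediction, dtype=np.float64)
    )
    # Upper edges of grades A, B and C in mmHg; anything above the last edge is grade D.
    edges = np.array([5.0, 6.0, 7.0])
    grade_index = np.searchsorted(edges, errors, side="left")
    counts = np.bincount(grade_index, minlength=4)
    return 100.0 * counts / errors.size


target = np.array([120.0, 120.0, 120.0, 120.0, 120.0])
prediction = np.array([118.0, 125.5, 113.5, 130.0, 121.0])
grades = ieee_grade_percentages(target, prediction)  # order: A, B, C, D
```

The absolute errors here are 2, 5.5, 6.5, 10 and 1 mmHg, so two of the five samples fall into grade A and one into each of the other grades.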

qumphy.metrics.l1_norm(array, axis=0)[source]

Compute the \(L^1\)-norm of an array along an axis.

The \(L^1\)-norm of an array \(x\in\mathbb{R}^N\) is given by \(\Vert x \Vert_{L^1} = \frac{1}{N}\sum_{j=1}^N \vert x_j \vert\).

Return type:

float | ndarray

Parameters:

array (np.ndarray) – Data array.
axis (int, optional) – Axis, by default 0.

Returns:

Array of \(L^1\)-norms over the specified axes.

See also

mean_absolute_error: Wrapper for l1_norm(target - prediction).

l2_norm, root_mean_square_error

Examples

>>> l1_norm(np.array([1, 2, 3, 4]))
2.5

>>> array = np.random.normal(0, 1, (10, 5, 3, 2))
>>> l1_norm(array, axis=1).shape  # norm over second axis
(10, 3, 2)

qumphy.metrics.l2_norm(array, axis=0)[source]

Compute the \(L^2\)-norm along an axis.

The \(L^2\)-norm of an array \(x\in\mathbb{R}^N\) is given by \(\Vert x \Vert_{L^2} = \sqrt{\frac{1}{N}\sum_{j=1}^N x_j^2 }\).

Return type:

float | ndarray

Parameters:

array (np.ndarray) – Data array.
axis (int, optional) – Axis, by default 0.

Returns:

Array of \(L^2\)-norms over the specified axes.

See also

root_mean_square_error: Wrapper for l2_norm(target - prediction).

l1_norm, mean_absolute_error

Examples

>>> l2_norm(np.array([1, 2, 3, 4]))**2
30.0

>>> array = np.random.normal(0, 1, (10, 5, 3, 2))
>>> l2_norm(array, axis=1).shape  # norm over second axis
(10, 3, 2)
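Both norms, as defined by the formulas in this module, are averages over the N entries along the chosen axis rather than plain sums. A minimal NumPy sketch of the two definitions follows; the function names are hypothetical and the package implementation may differ in details such as dtype handling.

```python
import numpy as np


def l1_norm_sketch(array, axis=0):
    """Mean of absolute values along `axis`."""
    return np.mean(np.abs(array), axis=axis)


def l2_norm_sketch(array, axis=0):
    """Root of the mean of squares along `axis`."""
    return np.sqrt(np.mean(np.square(array), axis=axis))


x = np.array([[3.0, -4.0], [-3.0, 4.0]])
l1 = l1_norm_sketch(x)          # one value per column
l2 = l2_norm_sketch(x, axis=1)  # one value per row

# Reducing over one axis removes that axis from the shape.
reduced_shape = l2_norm_sketch(np.zeros((4, 6, 2)), axis=1).shape
```

With these definitions, the mean absolute error and root mean square error are the norms of the difference target - prediction.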

qumphy.metrics.matthews_correlation_coefficient(target, prediction)[source]

Compute Matthews correlation coefficient (Mcc) of binary or multi-class task.

The Matthews correlation coefficient (Mcc) is computed using the true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN) via

\[\operatorname{Mcc} = \frac{\operatorname{TP}\cdot\operatorname{TN} - \operatorname{FP}\cdot\operatorname{FN}}{\sqrt{(\operatorname{TP}+\operatorname{FP})(\operatorname{TP}+\operatorname{FN})(\operatorname{TN}+\operatorname{FP})(\operatorname{TN}+\operatorname{FN})}}.\]

For the multi-class case, let \(C\) be the confusion matrix for \(K\) classes and define the number of times class \(k\) truly occurs \(t_k = \sum_{i=1}^K C_{ik}\), the number of times class \(k\) was predicted \(p_k = \sum_{i=1}^K C_{ki}\), the total number of samples correctly predicted \(c = \sum_{k=1}^K C_{kk}\) and the total number of samples \(s = \sum_{i,j=1}^K C_{ij}\). Then the multiclass Mcc is defined as

\[\operatorname{Mcc} = \frac{c \cdot s -\sum_{k=1}^K p_k \cdot t_k}{\sqrt{ (s^2 - \sum_{k=1}^K p_k^2)(s^2 - \sum_{k=1}^K t_k^2)}}.\]

Note

When there are more than two labels, the value of the Mcc will no longer range between -1 and +1. Instead the minimum value will be somewhere between -1 and 0 depending on the number and distribution of ground truth labels. The maximum value is always +1.

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Predicted values.

Returns:

Mcc for binary and multi-class classification.

Return type:

float | np.ndarray

Examples

Computation of Mcc for binary classification, i.e., a one dimensional array with only 2 classes

>>> target = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 1])
>>> prediction = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])
>>> matthews_correlation_coefficient(target, prediction)
-0.10206207261596577

Computation of Mcc for a multi-class scenario, i.e., a 1D array with more than two classes

>>> target = np.array([1, 2, 2, 2, 1, 2, 1, 0, 1, 1])
>>> prediction = np.array([1, 1, 2, 0, 0, 1, 1, 0, 0, 2])
>>> matthews_correlation_coefficient(target, prediction)
0.13130643285972254
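The multi-class formula can be evaluated directly from a confusion matrix, and it reduces to the binary formula for two classes. The sketch below builds the confusion matrix with NumPy and reproduces both example values. It is an illustration of the formula with a hypothetical function name, not the package implementation; returning 0 for a zero denominator is an assumption.

```python
import numpy as np


def mcc_from_confusion_sketch(target, prediction) -> float:
    """Matthews correlation coefficient from the confusion-matrix sums t_k, p_k, c and s."""
    target = np.asarray(target)
    prediction = np.asarray(prediction)
    classes = np.unique(np.concatenate([target, prediction]))
    k = classes.size
    confusion = np.zeros((k, k), dtype=np.int64)
    # Rows index the true class, columns the predicted class.
    rows = np.searchsorted(classes, target)
    cols = np.searchsorted(classes, prediction)
    np.add.at(confusion, (rows, cols), 1)

    t_k = confusion.sum(axis=1)  # how often each class truly occurs
    p_k = confusion.sum(axis=0)  # how often each class is predicted
    c = int(np.trace(confusion))  # correctly predicted samples
    s = int(confusion.sum())      # all samples

    numerator = c * s - int(np.dot(p_k, t_k))
    denominator = np.sqrt(
        float((s**2 - int(np.dot(p_k, p_k))) * (s**2 - int(np.dot(t_k, t_k))))
    )
    # Assumed convention: a degenerate denominator yields 0.
    return 0.0 if denominator == 0.0 else float(numerator / denominator)


mcc_binary = mcc_from_confusion_sketch(
    np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 1]),
    np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1]),
)
mcc_multi = mcc_from_confusion_sketch(
    np.array([1, 2, 2, 2, 1, 2, 1, 0, 1, 1]),
    np.array([1, 1, 2, 0, 0, 1, 1, 0, 0, 2]),
)
```

The formula is symmetric in \(t_k\) and \(p_k\), so transposing the confusion matrix does not change the result.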

qumphy.metrics.mean_absolute_error(target, prediction, axis=0)[source]

Compute the mean absolute error (MAE) between target and prediction.

Return type:

float | ndarray

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Model output predictions.
axis (int, optional) – Axis to sum over, by default 0.

Returns:

Array of MAE values (\(L^1\)-norms) over the specified axes.

See also

l1_norm: This is a wrapper for l1_norm(target - prediction, axis=axis).

l2_norm, root_mean_square_error

Examples

>>> mean_absolute_error(np.array([1, 2, 3]), np.array([1, 2, 3]))
0.0

>>> target = np.random.normal(0, 1, (10, 5, 3, 2))
>>> prediction = np.random.normal(0, 1, (10, 5, 3, 2))
>>> mean_absolute_error(target, prediction, axis=1).shape  # norm over second axis
(10, 3, 2)

qumphy.metrics.mean_absolute_scaled_error(baseline_mae, model_mae)[source]

Compute mean absolute scaled error (MASE).

The MASE is a measure of the magnitude of the error relative to a baseline error. It is defined as the mean absolute error divided by the baseline error.

Parameters:

baseline_mae (float) – Mean absolute error of the baseline.
model_mae (float) – Mean absolute error of the model.

Returns:

Mean absolute scaled error.

Return type:

float
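As a worked illustration of the ratio, the sketch below computes a baseline MAE and a model MAE and divides them. Using a constant predictor at the target median as the baseline is an assumption made for this example, and both the helper name and the numbers are hypothetical.

```python
import numpy as np


def mase_sketch(baseline_mae: float, model_mae: float) -> float:
    """Model MAE relative to the baseline MAE; values below 1 beat the baseline."""
    if baseline_mae <= 0:
        raise ValueError("baseline_mae must be positive.")
    return model_mae / baseline_mae


target = np.array([100.0, 120.0, 140.0])
prediction = np.array([105.0, 118.0, 133.0])

# Assumed baseline: always predict the median of the targets.
baseline_mae = float(np.mean(np.abs(target - np.median(target))))
model_mae = float(np.mean(np.abs(target - prediction)))
mase = mase_sketch(baseline_mae, model_mae)
```

Here the baseline MAE is 40/3 and the model MAE is 14/3, so the model's error is 35 % of the baseline error.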

qumphy.metrics.precision_score(target, prediction, average=None)[source]

Compute precision (PPV) of binary, multi-class or multi-label classification.

The precision score (positive predictive value, PPV) is computed using the true positives (TP) and false positives (FP) via

\[\operatorname{PPV} = \frac{\operatorname{TP}}{\operatorname{TP} + \operatorname{FP}}.\]

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Predicted values.
average (str | None, optional) –
Averaging type for score (default None). For more detail about averaging see the documentation of sklearn.metrics.precision_score.

Returns:

Precision scores for binary, multi-class or multi-label classification.

Return type:

float | np.ndarray

See also

f1_score, false_discovery_rate

Examples

Computation of precision score for binary classification, i.e., a one dimensional array with only 2 classes

>>> target = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 1])
>>> prediction = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])
>>> precision_score(target, prediction)
0.375

Computation of precision score for a multi-class scenario, i.e., a 1D array with more than two classes

>>> target = np.array([1, 2, 2, 2, 1, 2, 1, 0, 1, 1])
>>> prediction = np.array([1, 1, 2, 0, 0, 1, 1, 0, 0, 2])
>>> precision_score(target, prediction)
array([0.25, 0.5, 0.5])

Computation of precision score for a multi-label scenario, i.e., a 2D array with columns representing different labels and values 0 or 1 as entries

>>> target = np.array([[0, 1, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])
>>> prediction = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 1], [1, 1, 0]])
>>> precision_score(target, prediction)
array([0.66666667, 0.5, 0.33333333])

Computation of precision score for a multi-class scenario with averaging of precision scores over the different classes

>>> target = np.array([1, 2, 2, 2, 1, 2, 1, 0, 1, 1])
>>> prediction = np.array([1, 1, 2, 0, 0, 1, 1, 0, 0, 2])
>>> precision_score(target, prediction, average='micro')
0.4

qumphy.metrics.recall_score_threshold(target, prediction, recall_value, pos_label=1, greater_than=True, dtype=<class 'numpy.float32'>)[source]

Compute the classification threshold so that the recall score is closest to the specified value, but greater (or lower).

Default: The threshold is computed for the sensitivity score.

The threshold is set as the next floating point number after (before) the value of the prediction that needs to be classified positive (negative) to achieve the desired recall score.

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Model output predictions.
recall_value (float) – The desired recall score.
pos_label (1 | 0, optional) – 1 to compute the threshold for sensitivity, 0 for specificity.
greater_than (bool, optional) – True to let the recall score be higher than recall_value, false to let the recall score be lower than recall_value.
dtype (np.dtype, optional) – The data type of the threshold, by default np.float32

Returns:

The classification threshold.

Return type:

float
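The idea of placing the threshold one floating-point step next to a specific prediction can be sketched for the sensitivity case with greater_than=True. The sketch assumes that a sample is classified positive when its prediction is strictly greater than the threshold, and it works in float64 rather than the documented float32 default; both choices, and the function name, are assumptions.

```python
import math

import numpy as np


def sensitivity_threshold_sketch(target, prediction, recall_value) -> float:
    """Largest threshold with sensitivity >= recall_value (prediction > threshold is positive)."""
    target = np.asarray(target)
    prediction = np.asarray(prediction, dtype=np.float64)
    positives = np.sort(prediction[target == 1])[::-1]  # scores of true positives, descending
    if positives.size == 0:
        raise ValueError("No positive samples in target.")
    # Number of positives that must be classified positive; the offset guards against round-off.
    needed = max(1, math.ceil(recall_value * positives.size - 1e-9))
    kth_score = positives[needed - 1]
    # Step one representable float below the k-th score so that it still counts as positive.
    return float(np.nextafter(kth_score, -np.inf))


target = np.array([0, 1, 0, 1, 1, 0, 1, 0, 1, 0])
prediction = np.array([.99, .8, .6, .63, .77, .23, .3, .78, .2, 0.01])
threshold = sensitivity_threshold_sketch(target, prediction, 0.8)
recall = float(np.mean(prediction[target == 1] > threshold))
```

Four of the five positive samples must be detected to reach a sensitivity of 0.8, so the threshold lands just below 0.3, the fourth-highest positive score.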

qumphy.metrics.root_mean_square_error(target, prediction, axis=0)[source]

Compute the root mean square error (RMSE) between target and prediction.

Return type:

float | ndarray

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Model output predictions.
axis (int, optional) – Axis to sum over, by default 0.

Returns:

Array of RMSE values (\(L^2\)-norms) over the specified axes.

See also

l2_norm: This is a wrapper for l2_norm(target - prediction, axis=axis).

l1_norm, mean_absolute_error

Examples

>>> root_mean_square_error(np.array([1, 2, 3]), np.array([1, 2, 3]))
0.0

>>> target = np.random.normal(0, 1, (10, 5, 3, 2))
>>> prediction = np.random.normal(0, 1, (10, 5, 3, 2))
>>> root_mean_square_error(target, prediction, axis=1).shape  # norm over second axis
(10, 3, 2)

qumphy.metrics.sensitivity(target, prediction)[source]

Compute the sensitivity (or true positive rate). Sensitivity is also known as the recall score of the positive class.

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Model output predictions.

Returns:

Sensitivity score.

Return type:

float

See also

specificity

qumphy.metrics.specificity(target, prediction)[source]

Compute the specificity (or true negative rate). Specificity is also known as the recall score of the negative class.

Parameters:

target (np.ndarray) – Ground truth values.
prediction (np.ndarray) – Model output predictions.

Returns:

Specificity score.

Return type:

float

See also

sensitivity

qumphy.misc module

File: qumphy/misc.py Project: 22HLT01 QUMPHY Contact: nando.hegemann@ptb.de Gitlab: https://gitlab.com/qumphy Description: Miscellaneous functions.

qumphy.misc.batch(iterable, n=1)[source]

Split iterable into batches of size n.

Return type:

Iterator

Parameters:

iterable (array_like) – Iterable to split.
n (int, default=1) – Batch size.

Yields:

Iterable for different batches.
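A minimal generator with the same call pattern, for illustration. It accepts any iterable and yields lists, whereas the package function is documented for array-like input and may yield slices instead; batch_sketch is a hypothetical name.

```python
from itertools import islice


def batch_sketch(iterable, n=1):
    """Yield consecutive chunks of at most `n` items from `iterable`."""
    if n < 1:
        raise ValueError("Batch size n must be at least 1.")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


batches = list(batch_sketch(range(7), n=3))
```

The last batch is shorter when the number of items is not a multiple of n.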

qumphy.misc.eval_argument_parser()[source]

qumphy.misc.eval_torch_model_by_numpy_ndarray(model, data)[source]

Evaluate a torch model (nn.Module) with a numpy ndarray.

Parameters:

model (nn.Module) – Torch model.
data (np.ndarray) – Input data.

Returns:

Model output predictions.

Return type:

np.ndarray

qumphy.misc.instantiate_class(config)[source]

Instantiate a class from a dictionary config.

The config dictionary should have a “class_path” key with the path to the class (e.g., ‘my_module.MyClass’). It should also have an “init_args” key with a dictionary of arguments to pass to the class constructor.

If the config dictionary has a “classes” key, the function will recursively instantiate the classes specified in the list and pass them as arguments to the class constructor. The “classes” key should be a list of dictionaries, each with a “class_path” key and a “keyword” key specifying the keyword argument to pass the class instance to. If the “class_list” key is present, a list of classes will be instantiated and passed to the same keyword argument.

Example:

>>> config = {
...     "class_path": "my_module.Class1",
...     "init_args": {"first_arg": 1, "second_arg": 2},
...     "classes": [
...         {"keyword": "third_arg", "class_path": "my_module.Class2", "init_args": {"x": 3}},
...         {"keyword": "fourth_arg", "class_list": [
...             {"class_path": "my_module.Class3", "init_args": {"y": 4}},
...             {"class_path": "my_module.Class4", "init_args": {"z": 5}}
...         ]}
...     ]
... }

Parameters:

config (dict) – A dictionary with the class path and arguments to pass to the class constructor.

Returns:

An instance of the specified class.

Return type:

object

qumphy.misc.instantiate_class_from_string(class_path, *init_args, **init_kwargs)[source]

Instantiate a class from a given string path.

Parameters:

class_path (str) – The full path to the class (e.g., ‘my_module.MyClass’).
*init_args – Arguments to pass to the class constructor.
**init_kwargs – Keyword arguments to pass to the class constructor.

Returns:

An instance of the specified class.

Return type:

object

qumphy.misc.parse_value(value)[source]

Try parsing a string as a boolean, integer, or float. If parsing fails, return the original string.

Parameters:

value (str) – The string to parse.

Returns:

The parsed value, or the original string if parsing fails.

Return type:

object

qumphy.misc.str2dict(text)[source]

Convert a string of the format “a.b.c:value” or “a.b.c=value” into a nested dictionary.

The value part of the string is parsed as a boolean, integer, float, or string.

Parameters:

text (str) – The string to convert.

Returns:

A nested dictionary whose keys are the ‘.’-separated parts of the string before the ‘:’ or ‘=’ separator, and whose innermost value is the parsed text after that separator.

Return type:

dict

Examples

>>> str2dict("a.b.c:1")
{'a': {'b': {'c': 1}}}
>>> str2dict("x.y.z:foo")
{'x': {'y': {'z': 'foo'}}}
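The conversion can be sketched as below, assuming values are parsed as the description states (boolean, then integer, then float, otherwise the original string). The names parse_value_sketch and str2dict_sketch, the accepted boolean spellings, and the choice to split at the first separator are assumptions for illustration; the library functions may differ in these details.

```python
def parse_value_sketch(value):
    """Try bool, then int, then float; fall back to the original string."""
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def str2dict_sketch(text):
    """Turn 'a.b.c:value' or 'a.b.c=value' into a nested dictionary."""
    # Split at whichever separator (':' or '=') occurs first.
    positions = [i for i in (text.find(":"), text.find("=")) if i >= 0]
    if not positions:
        raise ValueError("expected 'key:value' or 'key=value'")
    split_at = min(positions)
    keys, value = text[:split_at], text[split_at + 1:]
    # Build the dictionary from the innermost key outwards.
    result = parse_value_sketch(value)
    for key in reversed(keys.split(".")):
        result = {key: result}
    return result


print(str2dict_sketch("trainer.max_epochs=10"))
print(str2dict_sketch("x.y.z:foo"))
```

Building from the innermost key outwards avoids mutating a shared dictionary and keeps the helper free of recursion.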

qumphy.misc.train_argument_parser()[source]

qumphy.misc.update_dictionary(d1, d2)[source]

qumphy.uq module

File: qumphy/uq.py Project: 22HLT01 QUMPHY Contact: oskar.pfeffer@ptb.de Gitlab: https://gitlab.com/qumphy Description: Uncertainty quantification utilities.

qumphy.uq.deep_ensemble(models, data, weights=None)[source]

Compute the deep ensemble prediction for the data using the given models.

Parameters:

models (list) – List of callable models. Each model is expected to return an np.ndarray.
data (np.ndarray) – Input data.
weights (np.ndarray, optional) – Weights for each model.

Returns:

Weighted model output predictions.

Return type:

np.ndarray

Examples

Compute unweighted deep ensemble prediction of two models on given data.

>>> model0 = lambda x : np.dot(np.zeros((1, 2)), x.T).reshape(-1, 1)
>>> model1 = lambda x : np.dot(np.ones((1, 2)), x.T).reshape(-1, 1)
>>> data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
>>> deep_ensemble([model0, model1], data)
[[0. ]
 [0.5]
 [0.5]
 [1. ]]

Compute weighted deep ensemble prediction of two models on given data.

>>> model0 = lambda x : np.dot(np.zeros((1, 2)), x.T).reshape(-1, 1)
>>> model1 = lambda x : np.dot(np.ones((1, 2)), x.T).reshape(-1, 1)
>>> data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
>>> weights = np.array([0.0, 1.0])
>>> deep_ensemble([model0, model1], data, weights)
[[0.]
 [1.]
 [1.]
 [2.]]
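One way to obtain these outputs is a weighted average over the model axis, sketched below. It assumes uniform weights when none are given and that supplied weights already sum to one; deep_ensemble_sketch is a hypothetical stand-in, not the library implementation.

```python
import numpy as np


def deep_ensemble_sketch(models, data, weights=None):
    """Weighted average of the model outputs (assumed behaviour)."""
    # Stack the outputs to shape [#models, #samples, #outputs].
    predictions = np.stack([model(data) for model in models])
    if weights is None:
        # Uniform weights reduce to the plain mean over models.
        weights = np.full(len(models), 1.0 / len(models))
    # Contract the model axis: sum_i weights[i] * predictions[i].
    return np.tensordot(np.asarray(weights, dtype=float), predictions, axes=1)


model0 = lambda x: np.dot(np.zeros((1, 2)), x.T).reshape(-1, 1)
model1 = lambda x: np.dot(np.ones((1, 2)), x.T).reshape(-1, 1)
data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
print(deep_ensemble_sketch([model0, model1], data))
print(deep_ensemble_sketch([model0, model1], data, np.array([0.0, 1.0])))
```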

qumphy.uq.deep_ensemble_gaussian(prediction_mean, prediction_var, weights=None)[source]

Compute the deep ensemble from the given Gaussian predictions.

Parameters:

prediction_mean (np.ndarray) – Mean of the predicted Gaussian distribution.
prediction_var (np.ndarray) – Variance of the predicted Gaussian distribution.
weights (np.ndarray, optional) – Weights for each model.

Returns:

Weighted ensemble prediction mean and variance.

Return type:

Tuple[np.ndarray, np.ndarray]
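A common way to combine Gaussian predictions is to treat the ensemble as a weighted mixture and match its first two moments. The sketch below follows that approach; whether deep_ensemble_gaussian uses exactly this formula is an assumption, and gaussian_mixture_moments is a hypothetical name. Inputs are taken to have the model axis first.

```python
import numpy as np


def gaussian_mixture_moments(prediction_mean, prediction_var, weights=None):
    """Mean and variance of a weighted mixture of Gaussians (moment matching)."""
    mean = np.asarray(prediction_mean, dtype=float)
    var = np.asarray(prediction_var, dtype=float)
    n_models = mean.shape[0]
    if weights is None:
        # Assumption: uniform weights when none are given.
        weights = np.full(n_models, 1.0 / n_models)
    # Reshape the weights so they broadcast along the model axis.
    w = np.asarray(weights, dtype=float).reshape((-1,) + (1,) * (mean.ndim - 1))
    ensemble_mean = np.sum(w * mean, axis=0)
    # Law of total variance: E[var + mean^2] - (E[mean])^2.
    ensemble_var = np.sum(w * (var + mean**2), axis=0) - ensemble_mean**2
    return ensemble_mean, ensemble_var


# Two models, one sample, one output: means 0 and 2, unit variances.
means = np.array([[[0.0]], [[2.0]]])
variances = np.array([[[1.0]], [[1.0]]])
ens_mean, ens_var = gaussian_mixture_moments(means, variances)
print(ens_mean, ens_var)
```

With equal weights the ensemble mean is 1 and the ensemble variance is 2: the average predicted variance (1) plus the spread of the two means around the ensemble mean (1).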

Examples

Compute the deep ensemble from the given predictions. prediction_mean and prediction_var should both have the shape [#models, #samples, #outputs].

>>> prediction_mean_1 = np.array([[[0.0], [0.5]], [[0.5], [1.0]]])
>>> prediction_mean_2 = np.array([[[1.0], [0.5]], [[0.5], [1.0]]])
>>> prediction_var_1 = np.array([[[0.0], [1.0]], [[0.5], [1.0]]])
>>> prediction_var_2 = np.array([[[0.0], [0.0]], [[0.5], [1.0]]])
>>> prediction_mean = np.array([prediction_mean_1, prediction_mean_2])
>>> prediction_var = np.array([prediction_var_1, prediction_var_2])
>>> weights = np.array([0.25, 0.75])
>>> deep_ensemble_gaussian(prediction_mean, prediction_var, weights)