qumphy.data.pulsedb module

File: qumphy/data/pulsedb.py
Project: 22HLT01 QUMPHY
Contact: oskar.pfeffer@ptb.de
Gitlab: https://gitlab.com/qumphy
Description: Functions handling PulseDB data.

class qumphy.data.pulsedb.PulseDBDataModule(data_directory, dataset, source, batch_size, num_workers, sampling_rate, prefetch_factor=8, **dskwargs)[source]

Bases: LightningDataModule

LightningDataModule implementation for the PulseDB dataset.

get_target_stats()[source]
setup(stage)[source]
test_dataloader()[source]
train_dataloader()[source]
val_dataloader()[source]
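
A minimal usage sketch, assuming only the constructor signature listed above. The batch size, worker count, and sampling rate are illustrative values, the unit of sampling_rate is an assumption, and "/path/to/pulsedb" is a hypothetical location. The helper names datamodule_kwargs and build_datamodule are not part of qumphy.

```python
def datamodule_kwargs(data_directory):
    """Collect constructor arguments for PulseDBDataModule (illustrative values)."""
    return {
        "data_directory": data_directory,
        "dataset": "calib",    # documented choices: calibfree, calib, aami, mini
        "source": "all",       # documented choices: mimiciii, vital, all
        "batch_size": 64,      # assumed value
        "num_workers": 4,      # assumed value
        "sampling_rate": 125,  # assumed value and unit (Hz)
    }


def build_datamodule(data_directory):
    """Write stats.yaml, then construct and set up the data module."""
    # Imported lazily so the sketch can be read without qumphy installed.
    from qumphy.data.pulsedb import PulseDBDataModule, write_target_stats_yaml

    # The dataset documentation asks for stats.yaml to exist before use.
    write_target_stats_yaml(data_directory)
    datamodule = PulseDBDataModule(**datamodule_kwargs(data_directory))
    # "fit" is the usual Lightning stage name; assumed to be accepted here.
    datamodule.setup("fit")
    return datamodule
```

Calling build_datamodule("/path/to/pulsedb").train_dataloader() would then be expected to return the training loader, provided the directory holds the files the dataset requires.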
class qumphy.data.pulsedb.PulseDBDataset(data_directory, dataset, subset, source='all', pressure='both', normalize=False, dtype=torch.float32, load_data=True, data_fraction=1.0, filter_params=None, noise_params=None, input_sampling_rate=None, split_to_input_sampling_rate=None, target_sampling_rate=None)[source]

Bases: Dataset

Dataset Class for the PulseDB dataset.

To use the PulseDB dataset, place the data in a directory containing signals.npy and metadata.csv. Before using the dataset, run write_target_stats_yaml(data_directory) to create a stats.yaml file containing the statistics of the target, i.e., the mean, median, standard deviation, and baseline measures of the SBP and DBP.

calculate_baseline_measures()[source]
calculate_target_stats()[source]
get_data()[source]
get_labels()[source]
get_target_stats()[source]

Returns the statistics of the target.

Returns:

(BP_mean, BP_std, BP_median, MAE_baseline, RMSE_baseline)

Return type:

tuple
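
The following sketch shows one way such a tuple could be computed. It is not the qumphy implementation. In particular, the baseline definitions are assumptions: the MAE baseline is taken as the error of always predicting the median, and the RMSE baseline as the error of always predicting the mean. The function name target_stats is hypothetical.

```python
import numpy as np


def target_stats(bp):
    """Return (mean, std, median, MAE baseline, RMSE baseline) per column.

    `bp` is an array of shape (n_samples, 2) holding SBP and DBP values.
    """
    bp = np.asarray(bp, dtype=np.float64)
    mean = bp.mean(axis=0)
    std = bp.std(axis=0)
    median = np.median(bp, axis=0)
    # Assumed baseline: a constant predictor that always outputs the median.
    mae_baseline = np.abs(bp - median).mean(axis=0)
    # Assumed baseline: a constant predictor that always outputs the mean.
    rmse_baseline = np.sqrt(((bp - mean) ** 2).mean(axis=0))
    return mean, std, median, mae_baseline, rmse_baseline
```

Under these assumptions the RMSE baseline equals the population standard deviation, which gives a quick consistency check for any model that reports RMSE.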

load_data(data_directory, filter_params=None, noise_params=None)[source]
load_target_stats(data_directory)[source]

Reads the target statistics of the PulseDB training datasets from a YAML file.

Parameters:

data_directory (pathlib.Path) – The directory where the data is located.

Return type:

None

normalize_data(data)[source]

Rescales the data to the range [-1, 1].

Parameters:

data (array) – The data to be normalized.

Returns:

array: The normalized data in the range [-1, 1].
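
As an illustration of rescaling to [-1, 1], here is a min-max sketch. Whether normalize_data uses the global minimum and maximum, per-signal extrema, or some other rule is not stated here, so the global min-max choice and the function name rescale_to_unit_range are assumptions.

```python
import numpy as np


def rescale_to_unit_range(data):
    """Linearly map `data` so its minimum is -1 and its maximum is 1."""
    data = np.asarray(data, dtype=np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # A constant signal has no range to stretch; map it to zeros.
        return np.zeros_like(data)
    return 2.0 * (data - lo) / (hi - lo) - 1.0
```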

normalize_target(target, denormalize=False)[source]

Rescales the target to 0 mean and 1 standard deviation, using the BP_mean and BP_std attributes.

Parameters:
  • target (np.ndarray) – The input target to be normalized.

  • denormalize (bool, optional) – If True, the target will be denormalized back to its original scale.

Returns:

The (de)normalized target.

Return type:

np.ndarray
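
The behavior described above corresponds to standard z-score scaling and its inverse. The sketch below passes the mean and standard deviation as arguments, whereas the method reads them from the BP_mean and BP_std attributes; the function name zscore_target is hypothetical.

```python
import numpy as np


def zscore_target(target, bp_mean, bp_std, denormalize=False):
    """Scale `target` to zero mean and unit std, or undo that scaling."""
    target = np.asarray(target, dtype=np.float64)
    if denormalize:
        # Inverse transform: back to the original blood-pressure scale.
        return target * bp_std + bp_mean
    return (target - bp_mean) / bp_std
```

Applying the function twice, first with denormalize=False and then with denormalize=True, should recover the original values up to floating-point error.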

select_subset_indices(metadata)[source]

Return an index mask of the memmapped dataset based on the selected subset.

Parameters:

metadata (pd.core.frame.DataFrame) – The metadata dataframe.

Returns:

The index mask.

Return type:

pd.core.indexes.base.Index
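
A sketch of how a pandas Index can serve as a row selector for a signal array. The column name "subset" and its values are hypothetical, since the actual layout of metadata.csv is not documented here, and a small in-memory array stands in for the memory-mapped signals.npy.

```python
import numpy as np
import pandas as pd


def select_rows(metadata, subset):
    """Return the index labels of the rows belonging to `subset`."""
    # "subset" is a hypothetical column name used for illustration only.
    mask = (metadata["subset"] == subset).to_numpy()
    return metadata.index[mask]


# Toy metadata and signals; the real data would be loaded from disk.
metadata = pd.DataFrame({"subset": ["train", "test", "train"]})
signals = np.arange(12).reshape(3, 4)

train_index = select_rows(metadata, "train")
train_signals = signals[train_index.to_numpy()]
```

This relies on the metadata frame having a default integer index that lines up with the rows of the signal array.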

qumphy.data.pulsedb.get_target_stats(data_directory, dataset, source, dtype=numpy.float16)[source]

Reads the target statistics of the PulseDB training datasets from a YAML file.

Parameters:
  • data_directory (string) – Full path of the directory containing the data.

  • dataset (string) – Choose between the “calibfree”, “calib”, “aami” or “mini” dataset.

  • source (string) – One of “mimiciii”, “vital”, or “all” (both sources).

  • dtype (np.dtype) – The data type of the target statistics.

Returns:

The target statistics as a dictionary.

Return type:

dict

qumphy.data.pulsedb.write_target_stats_yaml(data_directory, verbose=True)[source]

Writes the target statistics of the PulseDB training datasets to a YAML file.

Parameters:

data_directory (string) – The directory where the data is located.

Returns:

None