qumphy.data.pulsedb module
File: qumphy/data/pulsedb.py
Project: 22HLT01 QUMPHY
Contact: oskar.pfeffer@ptb.de
Gitlab: https://gitlab.com/qumphy
Description: Functions handling PulseDB data.
- class qumphy.data.pulsedb.PulseDBDataModule(data_directory, dataset, source, batch_size, num_workers, sampling_rate, prefetch_factor=8, **dskwargs)[source]
Bases: LightningDataModule
LightningDataModule implementation for the PulseDB dataset.
- class qumphy.data.pulsedb.PulseDBDataset(data_directory, dataset, subset, source='all', pressure='both', normalize=False, dtype=torch.float32, load_data=True, data_fraction=1.0, filter_params=None, noise_params=None, input_sampling_rate=None, split_to_input_sampling_rate=None, target_sampling_rate=None)[source]
Bases: Dataset
Dataset class for the PulseDB dataset.
To use the PulseDB dataset, place the data in a directory containing signals.npy and metadata.csv. Before using the dataset, run write_target_stats_yaml(data_directory) to create a stats.yaml file containing the statistics of the target, i.e., the mean, median, standard deviation, and baseline measures of the SBP and DBP.
- get_target_stats()[source]
Returns the statistics of the target.
- Returns:
(BP_mean, BP_std, BP_median, MAE_baseline, RMSE_baseline)
- Return type:
tuple
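The quantities in this tuple can be illustrated with a small NumPy sketch. It is not the package's implementation: the helper name summarize_targets is hypothetical, and the baseline definitions (error of a constant predictor, using the median for MAE and the mean for RMSE) are an assumption, since the documentation above does not spell them out.

```python
import numpy as np


def summarize_targets(bp):
    """Hypothetical helper: summary statistics for blood pressure values (mmHg)."""
    bp = np.asarray(bp, dtype=np.float64)
    mean = bp.mean()
    std = bp.std()
    median = np.median(bp)
    # Assumed baselines: errors of a predictor that always outputs one constant.
    # The median minimizes the mean absolute error.
    mae_baseline = np.abs(bp - median).mean()
    # The mean minimizes the root mean squared error.
    rmse_baseline = np.sqrt(((bp - mean) ** 2).mean())
    return mean, std, median, mae_baseline, rmse_baseline


# Toy systolic values, for illustration only.
sbp = np.array([110.0, 120.0, 130.0, 140.0])
stats = summarize_targets(sbp)
```

Under these assumed definitions the RMSE baseline equals the population standard deviation, so a model is only useful if its RMSE falls below BP_std.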
- load_target_stats(data_directory)[source]
Reads the target statistics of the PulseDB training datasets from a YAML file.
- Parameters:
data_directory (pathlib.Path) – The directory where the data is located.
- Return type:
None
- normalize_data(data)[source]
Rescales the data to the range [-1, 1].
- Parameters:
data (array) – The input data to be rescaled.
- Returns:
array: The normalized data in the range [-1, 1].
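A minimal sketch of rescaling to [-1, 1], assuming plain min-max scaling over the whole array. Whether the method takes the minimum and maximum per sample, per channel, or over the whole array is not stated above, so the function name rescale_to_unit_range and the handling of constant inputs are illustrative assumptions.

```python
import numpy as np


def rescale_to_unit_range(data):
    """Hypothetical min-max rescaling of an array to the range [-1, 1]."""
    data = np.asarray(data, dtype=np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # A constant signal has no range; map it to zeros (assumed convention).
        return np.zeros_like(data)
    # Map lo -> -1 and hi -> +1 linearly.
    return 2.0 * (data - lo) / (hi - lo) - 1.0


signal = np.array([0.0, 0.5, 1.0, 2.0])
scaled = rescale_to_unit_range(signal)
```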
- normalize_target(target, denormalize=False)[source]
Rescales the target to 0 mean and 1 standard deviation, using the BP_mean and BP_std attributes.
- Parameters:
target (np.ndarray) – The input target to be normalized.
denormalize (bool, optional) – If True, the target will be denormalized back to its original scale.
- Returns:
The (de)normalized target.
- Return type:
np.ndarray
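The (de)normalization described above corresponds to a standard z-score transform and its inverse. The sketch below assumes that form; the function name zscore_target and the BP_MEAN and BP_STD values are made up for illustration and do not come from the package or from stats.yaml.

```python
import numpy as np

# Hypothetical [SBP, DBP] statistics in mmHg; real values would come from stats.yaml.
BP_MEAN = np.array([120.0, 70.0])
BP_STD = np.array([20.0, 10.0])


def zscore_target(target, denormalize=False, mean=BP_MEAN, std=BP_STD):
    """Hypothetical z-score transform of targets, or its inverse."""
    target = np.asarray(target, dtype=np.float64)
    if denormalize:
        # Inverse transform: back to the original scale.
        return target * std + mean
    # Forward transform: zero mean and unit standard deviation.
    return (target - mean) / std


raw = np.array([[140.0, 80.0], [100.0, 65.0]])
z = zscore_target(raw)
restored = zscore_target(z, denormalize=True)
```

Applying the transform with denormalize=True to model outputs recovers values on the original target scale, which is what error metrics are usually reported in.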
- qumphy.data.pulsedb.get_target_stats(data_directory, dataset, source, dtype=numpy.float16)[source]
Reads the target statistics of the PulseDB training datasets from a YAML file.
- Parameters:
data_directory (string) – Full path of the directory containing the data.
dataset (string) – Choose between the “calibfree”, “calib”, “aami” or “mini” dataset.
source (string) – Either “mimiciii” or “vital” or “all” (both).
dtype (np.dtype) – The data type of the target statistics.
- Returns:
The target statistics as a dictionary.
- Return type:
dict