hookeai.model_architectures.procedures.model_data_scaling.fit_data_scaler_from_dataset
- fit_data_scaler_from_dataset(dataset, features_type, n_features, scaling_type='mean-std', scaling_parameters={})
Fit a data scaler for a given features type from a data set.
The data scaler normalization tensors are fitted from the given data set, overriding any provided data scaling parameters.
- Parameters:
dataset (torch.utils.data.Dataset) – Time series data set. Each sample is stored as a dictionary in which each features type (key, str) maps to a 2D torch.Tensor of shape (sequence_length, n_features).
features_type (str) – Features type for which the data scaler is fitted (e.g., ‘features_in’, ‘features_out’). Must be directly available from the data set samples.
n_features (int) – Number of features (dimensionality).
scaling_type ({'min-max', 'mean-std'}, default='mean-std') – Type of data scaling: min-max scaling (‘min-max’) or standardization (‘mean-std’).
scaling_parameters (dict, default={}) – Data scaling parameters (item, dict) for each features type (key, str). For ‘min-max’ data scaling, the parameters are the ‘minimum’ and ‘maximum’ features normalization tensors, as well as the ‘norm_minimum’ and ‘norm_maximum’ normalization bounds. For ‘mean-std’ data scaling, the parameters are the ‘mean’ and ‘std’ features normalization tensors.
- Returns:
data_scaler – Data scaler.
- Return type:
{TorchStandardScaler, TorchMinMaxScaler}
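
Example

To illustrate what fitting normalization tensors from a data set can look like, the following is a minimal sketch in plain PyTorch. It is not the hookeai implementation: ToyTimeSeriesDataset and fit_scaling_statistics are hypothetical names introduced here, and the sketch assumes that statistics are computed per feature over all time steps of all samples, with the unbiased standard deviation that torch.Tensor.std uses by default. It returns plain dictionaries keyed like the scaling parameters described above rather than TorchStandardScaler or TorchMinMaxScaler objects.

```python
import torch
from torch.utils.data import Dataset


class ToyTimeSeriesDataset(Dataset):
    """Hypothetical data set: each sample is a dict of 2D tensors."""

    def __init__(self, n_samples=4, sequence_length=5, n_features=3):
        generator = torch.Generator().manual_seed(0)
        # Each sample stores one features type, 'features_in', as a tensor
        # of shape (sequence_length, n_features)
        self._samples = [
            {'features_in': torch.rand(sequence_length, n_features,
                                       generator=generator)}
            for _ in range(n_samples)
        ]

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, index):
        return self._samples[index]


def fit_scaling_statistics(dataset, features_type, n_features,
                           scaling_type='mean-std'):
    """Hypothetical helper: compute per-feature normalization tensors.

    Assumption: statistics are taken over every time step of every sample.
    """
    # Concatenate all samples along the time axis:
    # (total_time_steps, n_features)
    data = torch.cat(
        [dataset[i][features_type].reshape(-1, n_features)
         for i in range(len(dataset))],
        dim=0)
    if scaling_type == 'mean-std':
        # Standardization statistics (unbiased standard deviation)
        return {'mean': data.mean(dim=0), 'std': data.std(dim=0)}
    if scaling_type == 'min-max':
        # Min-max statistics (per-feature extrema)
        return {'minimum': data.min(dim=0).values,
                'maximum': data.max(dim=0).values}
    raise ValueError(f'Unknown scaling type: {scaling_type}')


# Usage: fit both kinds of statistics for the 'features_in' features type
dataset = ToyTimeSeriesDataset()
mean_std = fit_scaling_statistics(dataset, 'features_in', n_features=3)
min_max = fit_scaling_statistics(dataset, 'features_in', n_features=3,
                                 scaling_type='min-max')
```

Each returned tensor has shape (n_features,), so it broadcasts against a sample of shape (sequence_length, n_features) when normalizing, for instance (sample - mean_std['mean']) / mean_std['std'].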