hookeai.model_architectures.materials.strain_features.TimeSeriesDataset

class TimeSeriesDataset(dataset_directory, dataset_sample_files, dataset_basename='time_series_dataset')

Bases: Dataset

Time series data set.

_dataset_directory

Directory where the time series data set is stored (all data set sample files).

Type:

str

_dataset_sample_files

Paths of the time series data set sample files. Each sample file contains a dictionary that maps each feature name (key, str) to a torch.Tensor (2d) of shape (sequence_length, n_features).

Type:

list[str]

_dataset_basename

Data set file base name.

Type:

str

__len__(self)

Return size of data set (number of samples).

__getitem__(self, index)

Return the data set sample at the given index.

update_dataset_sample(self, index, time_series)

Update data set sample time series data.

get_dataset_directory(self)

Get directory where time series data set is stored.

get_dataset_sample_files(self)

Get paths of the time series data set sample files.

set_dataset_basename(self, dataset_basename)

Set data set file base name.

get_dataset_basename(self)

Get data set file base name.

update_dataset_file_internal_directory(dataset_file_path, new_directory, is_reload_data=False)

Update internal directory of stored data set in provided file.

_update_dataset_directory(self, dataset_directory, is_reload_data=False)

Update directory where time series data set is stored.


List of Public Methods

get_dataset_basename

Get data set file base name.

get_dataset_directory

Get directory where time series data set is stored.

get_dataset_sample_files

Get paths of the time series data set sample files.

set_dataset_basename

Set data set file base name.

update_dataset_file_internal_directory

Update internal directory of stored data set in provided file.

update_dataset_sample

Update data set sample time series data.

Methods

__init__(dataset_directory, dataset_sample_files, dataset_basename='time_series_dataset')

Constructor.

Parameters:
  • dataset_directory (str) – Directory where the time series data set is stored (all data set sample files).

  • dataset_sample_files (list[str]) – Paths of the time series data set sample files. Each sample file contains a dictionary that maps each feature name (key, str) to a torch.Tensor (2d) of shape (sequence_length, n_features).

  • dataset_basename (str, default='time_series_dataset') – Data set file base name.

  • dataset_samples (list[dict]) – Time series data set samples data. Each sample is stored as a dictionary that maps each feature name (key, str) to a torch.Tensor (2d) of shape (sequence_length, n_features).

_update_dataset_directory(dataset_directory)

Update directory where time series data set is stored.

The directory of each stored data set sample file path is updated to match the new directory.

Parameters:

dataset_directory (str) – Directory where the time series data set is stored (all data set sample files).
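
The path update described above can be sketched as follows. This is a minimal illustration, not the hookeai implementation: it assumes that only the directory part of each sample file path changes while the file name is preserved. The helper name rebase_sample_files, the example directories, and the '.pt' file extension are hypothetical.

```python
import os


def rebase_sample_files(dataset_sample_files, new_directory):
    """Return sample file paths relocated to new_directory.

    Assumption: each file keeps its base name and only the directory
    part of the path is replaced.
    """
    return [os.path.join(new_directory, os.path.basename(path))
            for path in dataset_sample_files]


# Hypothetical sample file paths stored in the data set
old_files = ['/data/run_a/sample_0.pt', '/data/run_a/sample_1.pt']
# Same file names, now pointing at the new directory
new_files = rebase_sample_files(old_files, '/scratch/run_a')
```

The original list is left untouched, so a caller can compare old and new paths before committing the change.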

get_dataset_basename()

Get data set file base name.

Returns:

dataset_basename – Data set file base name.

Return type:

str

get_dataset_directory()

Get directory where time series data set is stored.

Returns:

dataset_directory – Directory where the time series data set is stored (all data set samples files).

Return type:

str

get_dataset_sample_files()

Get paths of the time series data set sample files.

Returns:

dataset_sample_files – Paths of the time series data set sample files. Each sample file contains a dictionary that maps each feature name (key, str) to a torch.Tensor (2d) of shape (sequence_length, n_features).

Return type:

list[str]

set_dataset_basename(dataset_basename)

Set data set file base name.

Parameters:

dataset_basename (str) – Data set file base name.

static update_dataset_file_internal_directory(dataset_file_path, new_directory, is_reload_data=False)

Update internal directory of stored data set in provided file.

Update is only performed if the new directory does not match the internal directory of the stored data set.

Parameters:
  • dataset_file_path (str) – Data set file path.

  • new_directory (str) – Data set new directory.

  • is_reload_data (bool, default=False) – Reload and store data set samples in attribute dataset_samples_data. Only effective if is_store_dataset=True.
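
The conditional update described above (rewrite the stored data set only when the directories differ) might look like the sketch below. It is an illustration under stated assumptions, not the hookeai implementation: the data set file is assumed to be a pickled dictionary with a 'dataset_directory' entry, the function name update_internal_directory is hypothetical, and the is_reload_data option is not modeled.

```python
import os
import pickle
import tempfile


def update_internal_directory(dataset_file_path, new_directory):
    """Rewrite the stored directory only if it differs from new_directory.

    Returns True if the file was rewritten, False otherwise.
    Assumption: the file holds a pickled dict with key 'dataset_directory'.
    """
    with open(dataset_file_path, 'rb') as dataset_file:
        dataset = pickle.load(dataset_file)
    # Compare normalized paths so a trailing separator is not a mismatch
    stored = os.path.normpath(dataset['dataset_directory'])
    if stored == os.path.normpath(new_directory):
        return False
    dataset['dataset_directory'] = new_directory
    with open(dataset_file_path, 'wb') as dataset_file:
        pickle.dump(dataset, dataset_file)
    return True


# Store a stand-in data set file in a temporary directory
work_dir = tempfile.mkdtemp()
dataset_file_path = os.path.join(work_dir, 'time_series_dataset.pkl')
with open(dataset_file_path, 'wb') as dataset_file:
    pickle.dump({'dataset_directory': '/old/location'}, dataset_file)

# The first call rewrites the file, the second finds nothing to change
first_call = update_internal_directory(dataset_file_path, '/new/location')
second_call = update_internal_directory(dataset_file_path, '/new/location')
```

Skipping the write when the directories already match makes the operation safe to call repeatedly, for example every time a data set file is opened on a new machine.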

update_dataset_sample(index, time_series)

Update data set sample time series data.

Parameters:
  • index (int) – Index of the data set sample to update (index must be in [0, n_sample - 1]).

  • time_series (dict) – Data set sample defined as a dictionary that maps each feature name (key, str) to a torch.Tensor (2d) of shape (sequence_length, n_features).
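
To illustrate how a file-backed data set of this kind can expose __len__, __getitem__ and update_dataset_sample, here is a minimal stand-in. It is a sketch under assumptions, not the hookeai class: samples are serialized with pickle, nested lists of shape (sequence_length, n_features) stand in for torch.Tensor data, and the class name FileBackedDataset and feature name 'strain_path' are hypothetical.

```python
import os
import pickle
import tempfile


class FileBackedDataset:
    """Hypothetical stand-in: one sample dictionary per file."""

    def __init__(self, dataset_directory, dataset_sample_files,
                 dataset_basename='time_series_dataset'):
        self._dataset_directory = dataset_directory
        self._dataset_sample_files = list(dataset_sample_files)
        self._dataset_basename = dataset_basename

    def __len__(self):
        # Data set size is the number of sample files
        return len(self._dataset_sample_files)

    def __getitem__(self, index):
        # Load the sample dictionary from its file on demand
        with open(self._dataset_sample_files[index], 'rb') as sample_file:
            return pickle.load(sample_file)

    def update_dataset_sample(self, index, time_series):
        # Valid indices run from 0 to len(self) - 1
        if not 0 <= index < len(self):
            raise IndexError('Sample index out of range.')
        # Overwrite the sample file with the new dictionary
        with open(self._dataset_sample_files[index], 'wb') as sample_file:
            pickle.dump(time_series, sample_file)


# Write three samples, each with sequence_length=4 and n_features=2
dataset_directory = tempfile.mkdtemp()
sample_files = []
for i in range(3):
    path = os.path.join(dataset_directory, f'sample_{i}.pkl')
    sample = {'strain_path': [[float(i)] * 2 for _ in range(4)]}
    with open(path, 'wb') as sample_file:
        pickle.dump(sample, sample_file)
    sample_files.append(path)

dataset = FileBackedDataset(dataset_directory, sample_files)
# Replace the second sample and leave the others unchanged
new_sample = {'strain_path': [[9.0, 9.0] for _ in range(4)]}
dataset.update_dataset_sample(1, new_sample)
```

Because each sample lives in its own file, an update touches a single file and the change is visible the next time that index is read.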