graphorge.gnn_base_model.data.graph_dataset.split_dataset

split_dataset(dataset, split_sizes, is_save_subsets=False, subsets_directory=None, subsets_basename=None, seed=None)

Randomly split data set into non-overlapping subsets.

Parameters:
  • dataset (torch.utils.data.Dataset) – Data set.

  • split_sizes (dict) – Size (item, float) of each data subset, stored under the subset name (key, str). Each size is a fraction between 0 and 1, and the sum of all sizes must equal 1.

  • is_save_subsets (bool, default=False) – If True, then save data subsets to files.

  • subsets_directory (str, default=None) – Directory where the data subset files are stored.

  • subsets_basename (str, default=None) – Subset file base name.

  • seed (int, default=None) – Seed for random data set split generator.

Returns:

dataset_split – Data subsets (item, torch.utils.data.Subset) stored under each subset name (key, str).

Return type:

dict
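
Example:

The sketch below illustrates the kind of split described above: fractions that sum to 1, non-overlapping subsets, and a seed for reproducibility. It is not graphorge's implementation. The helper split_indices and its n_items argument are hypothetical names introduced here, it uses only the Python standard library, and it returns lists of indices rather than torch.utils.data.Subset objects. How fractional sizes are rounded to item counts is also an assumption of this sketch.

```python
import math
import random


def split_indices(n_items, split_sizes, seed=None):
    """Randomly partition range(n_items) into named, non-overlapping subsets.

    Hypothetical helper for illustration only; not part of graphorge.
    """
    sizes = list(split_sizes.values())
    if any(not 0.0 <= size <= 1.0 for size in sizes):
        raise ValueError("Each split size must lie between 0 and 1.")
    if not math.isclose(sum(sizes), 1.0):
        raise ValueError("The sum of all split sizes must equal 1.")
    # Shuffle with a dedicated generator so a given seed reproduces the split
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    # Hand out consecutive slices of the shuffled indices. The last subset
    # takes the remainder (assumed rounding policy), so every item is
    # assigned to exactly one subset.
    names = list(split_sizes)
    subsets = {}
    start = 0
    for name in names[:-1]:
        stop = min(n_items, start + round(split_sizes[name] * n_items))
        subsets[name] = indices[start:stop]
        start = stop
    subsets[names[-1]] = indices[start:]
    return subsets


split_sizes = {"training": 0.7, "validation": 0.2, "testing": 0.1}
dataset_split = split_indices(10, split_sizes, seed=0)
```

As with the return value documented above, the result is a dict keyed by subset name. Calling the helper again with the same seed gives the same partition.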