d3m.container.dataset
- 
class d3m.container.dataset.ComputeDigest(value)
- Bases: d3m.utils.Enum
- Enumeration of possible approaches to computing a dataset digest.
- 
class d3m.container.dataset.Dataset(resources, metadata=None, *, load_lazy=None, generate_metadata=False, check=True, source=None, timestamp=None)
- Bases: dict
- A class representing a dataset. Internally, it is a dictionary containing multiple resources (e.g., tables).
- Parameters
- resources (Mapping) – A map from resource IDs to resources.
- metadata (d3m.metadata.base.DataMetadata) – Metadata associated with the data.
- load_lazy (Optional[Callable[[Dataset], None]]) – If constructing a lazy dataset, calling this function will read all the data and convert the dataset to a non-lazy one.
- generate_metadata (bool) – Automatically generate and update the metadata.
- check (bool) – DEPRECATED: argument ignored.
- timestamp (Optional[datetime]) – DEPRECATED: argument ignored.
 
 - 
get_relations_graph()
- Builds the relations graph for the dataset.
- Each key in the output corresponds to a resource/table. The value under a key is the list of edges this table has. Each edge is represented by a tuple of five elements. For example, if the edge is (resource_id, True, index_1, index_2, custom_state), there is a foreign key that points to table resource_id. Specifically, the index_1 column in the current table points to the index_2 column in the table resource_id. custom_state is an empty dict when returned from this method, but allows users of this graph to store custom state there.
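To make the edge layout concrete, the following sketch walks a hand-built dictionary shaped like the structure described above. It does not call d3m: the resource IDs (`learningData`, `authors`), the column indices, the `visited` key, and the helper `describe_edges` are all hypothetical, and the second tuple element is simply carried along as it appears in the example.

```python
from typing import Any, Dict, List, Tuple

# (resource_id, flag, index_1, index_2, custom_state), as in the example above.
Edge = Tuple[str, bool, int, int, Dict[str, Any]]

# Hand-built stand-in for a relations graph; not real d3m output.
graph: Dict[str, List[Edge]] = {
    "learningData": [("authors", True, 2, 0, {})],
    "authors": [],
}


def describe_edges(graph: Dict[str, List[Edge]]) -> List[str]:
    """Render every edge as 'table[column] -> table[column]'."""
    lines = []
    for table, edges in graph.items():
        for target, _flag, index_1, index_2, custom_state in edges:
            # custom_state starts empty; callers may store their own data in it.
            custom_state["visited"] = True
            lines.append(f"{table}[{index_1}] -> {target}[{index_2}]")
    return lines


for line in describe_edges(graph):
    print(line)
```

Because `custom_state` is a plain dict stored inside the tuple, a traversal can annotate edges in place without rebuilding the graph.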
 - 
is_lazy()
- Return whether this dataset instance is lazy and not all data has been loaded.
- Returns
- True if this dataset instance is lazy.
- Return type
- bool
 
 - 
classmethod load(dataset_uri, *, dataset_id=None, dataset_version=None, dataset_name=None, lazy=False, compute_digest=<ComputeDigest.ONLY_IF_MISSING: 'ONLY_IF_MISSING'>, strict_digest=False, handle_score_split=True)
- Tries to load a dataset from dataset_uri using all registered dataset loaders.
- Parameters
- dataset_uri (str) – A URI to load.
- dataset_id (Optional[str]) – Override dataset ID determined by the loader.
- dataset_version (Optional[str]) – Override dataset version determined by the loader.
- dataset_name (Optional[str]) – Override dataset name determined by the loader.
- lazy (bool) – If True, load only top-level metadata and not the whole dataset.
- compute_digest (ComputeDigest) – Compute a digest over the data?
- strict_digest (bool) – If the computed digest does not match the one provided in metadata, raise an exception?
- handle_score_split (bool) – If a scoring dataset has target values in a separate file, merge them in?
 
- Returns
- A loaded dataset. 
- Return type
- Dataset
 
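A minimal usage sketch for `load`, assuming the d3m package is installed and that a local dataset is addressed with a `file://` URI. The path `/data/my_dataset/datasetDoc.json` and the helpers `to_file_uri` and `load_lazily` are hypothetical. The d3m import is deferred into the function body so that the URI helper can be run on its own.

```python
from urllib.parse import quote


def to_file_uri(path: str) -> str:
    """Turn an absolute POSIX-style path into a file:// URI."""
    if not path.startswith("/"):
        raise ValueError("an absolute path is required")
    # quote() percent-encodes unsafe characters and leaves '/' alone.
    return "file://" + quote(path)


def load_lazily(dataset_uri: str):
    """Load only top-level metadata first; requires d3m to be installed."""
    from d3m.container.dataset import Dataset  # deferred import

    dataset = Dataset.load(dataset_uri, lazy=True)
    # is_lazy() reports whether some data has not been loaded yet.
    print("lazy:", dataset.is_lazy())
    return dataset


uri = to_file_uri("/data/my_dataset/datasetDoc.json")  # hypothetical path
print(uri)
```

With d3m available, `load_lazily(uri)` would then pass the URI to whichever registered loader accepts it.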
 - 
classmethod register_loader(loader)
- Registers a new dataset loader.
- Parameters
- loader (Loader) – An instance of the loader class implementing a new loader.
- Return type
- None
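The registration mechanism can be pictured as a list of loader objects that are consulted in turn when a URI is loaded. The sketch below is a simplified, standard-library illustration of that pattern and not d3m's implementation: `ToyLoader`, `ToyRegistry`, and the `can_load`/`load` method names are assumptions chosen for the example.

```python
from typing import Any, Dict, List


class ToyLoader:
    """Hypothetical loader that handles URIs starting with one prefix."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    def can_load(self, dataset_uri: str) -> bool:
        return dataset_uri.startswith(self.prefix)

    def load(self, dataset_uri: str) -> Dict[str, Any]:
        return {"loaded_by": self.prefix, "uri": dataset_uri}


class ToyRegistry:
    """Hypothetical stand-in for a class that keeps a list of loaders."""

    loaders: List[ToyLoader] = []

    @classmethod
    def register_loader(cls, loader: ToyLoader) -> None:
        cls.loaders.append(loader)

    @classmethod
    def load(cls, dataset_uri: str) -> Dict[str, Any]:
        # Try loaders in registration order; the first that accepts the URI wins.
        for loader in cls.loaders:
            if loader.can_load(dataset_uri):
                return loader.load(dataset_uri)
        raise ValueError(f"No registered loader can load: {dataset_uri}")


ToyRegistry.register_loader(ToyLoader("file://"))
ToyRegistry.register_loader(ToyLoader("sklearn://"))
result = ToyRegistry.load("sklearn://iris")
print(result["loaded_by"])
```

In this toy version registration order matters, since the first loader that claims a URI handles it.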
 
 - 
classmethod register_saver(saver)
- Registers a new dataset saver.
- Parameters
- saver (Saver) – An instance of the saver class implementing a new saver.
- Return type
- None
 
 - 
save(dataset_uri, *, compute_digest=<ComputeDigest.ALWAYS: 'ALWAYS'>, preserve_metadata=True)
- Tries to save the dataset to dataset_uri using all registered dataset savers.
- Parameters
- dataset_uri (str) – A URI to save to.
- compute_digest (ComputeDigest) – Compute digest over the data when saving?
- preserve_metadata (bool) – When saving a dataset, store its metadata as well?
 
- Return type
- None
 
 - 
select_rows(row_indices_to_keep)
- Generate a new Dataset from the row indices for DataFrames. 
 - 
to_json_structure(*, canonical=False)
- Returns only a top-level dataset description.
- Return type
 
 - 
loaders: List[d3m.container.dataset.Loader] = [<d3m.container.dataset.D3MDatasetLoader object>, <d3m.container.dataset.CSVLoader object>, <d3m.container.dataset.SklearnExampleLoader object>, <d3m.container.dataset.OpenMLDatasetLoader object>]
 - 
metadata: d3m.metadata.base.DataMetadata
 
