d3m.runtime¶

class d3m.runtime.InputsConfig(inputs, test_inputs, score_inputs, data_pipeline, data_params, data_random_seed)[source]¶

Bases: tuple

property data_params[source]¶: Alias for field number 4

property data_pipeline[source]¶: Alias for field number 3

property data_random_seed[source]¶: Alias for field number 5

property inputs[source]¶: Alias for field number 0

property score_inputs[source]¶: Alias for field number 2

property test_inputs[source]¶: Alias for field number 1
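Because InputsConfig is a tuple subclass whose properties alias field numbers, its values can be read by name or by position. The sketch below recreates an equivalent named tuple with collections.namedtuple instead of importing d3m; the stand-in class and the placeholder values are assumptions for illustration only.

```python
from collections import namedtuple

# Stand-in with the same field order as documented above; not the d3m class itself.
InputsConfig = namedtuple(
    "InputsConfig",
    ["inputs", "test_inputs", "score_inputs", "data_pipeline", "data_params", "data_random_seed"],
)

# Placeholder values; real code would pass datasets and a pipeline object.
config = InputsConfig(
    inputs=["train-dataset"],
    test_inputs=["test-dataset"],
    score_inputs=["score-dataset"],
    data_pipeline=None,
    data_params=None,
    data_random_seed=0,
)

# Each property is an alias for a field number, so both access styles agree.
assert config.inputs is config[0]
assert config.data_random_seed == config[5]

# Tuple unpacking follows the same field order.
inputs, test_inputs, score_inputs, data_pipeline, data_params, data_random_seed = config
```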

class d3m.runtime.MultiResult(*args, **kwds)[source]¶

Bases: List[d3m.runtime.Result]

Results of running a pipeline multiple times.

check_success()[source]¶

Throws an exception if the pipeline has not successfully finished in any of the runs.

Return type: None

has_error()[source]¶

Returns True if any of the pipeline runs has not successfully finished.

Return type: bool

property pipeline_runs[source]¶

Return type: Sequence[PipelineRun]

class d3m.runtime.Result(pipeline_run, values, error=None)[source]¶

Bases: object

Results from running a pipeline.

Parameters

pipeline_run (PipelineRun) – A pipeline run description.
values (Dict[str, Any]) – A map between data references and their values computed during pipeline run.
error (Optional[Exception]) – If during a run an exception occurred, then it is available here.

check_success()[source]¶

Throws an exception if the pipeline has not successfully finished.

Return type: None

get_standard_pipeline_output()[source]¶

Returns the output value if it exists and it is from a standard pipeline.

Return type: Optional[DataFrame]

has_error()[source]¶

Returns True if the pipeline has not successfully finished.

Return type: bool
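The relationship between has_error() and check_success() on a single result and on a list of results can be sketched with simplified stand-ins. SketchResult and SketchMultiResult below are hypothetical classes, not the d3m implementations: they omit the pipeline run description, and re-raising the stored exception in check_success() is an assumption about how the error is surfaced.

```python
from typing import Any, Dict, Optional


class SketchResult:
    """Simplified stand-in for Result; the real class also carries a PipelineRun."""

    def __init__(self, values: Dict[str, Any], error: Optional[Exception] = None) -> None:
        self.values = values
        self.error = error

    def has_error(self) -> bool:
        return self.error is not None

    def check_success(self) -> None:
        # Assumption: surface the stored exception to the caller.
        if self.error is not None:
            raise self.error


class SketchMultiResult(list):
    """Simplified stand-in for MultiResult: a list of per-run results."""

    def has_error(self) -> bool:
        return any(result.has_error() for result in self)

    def check_success(self) -> None:
        for result in self:
            result.check_success()


results = SketchMultiResult([
    SketchResult({"outputs.0": "predictions"}),
    SketchResult({}, error=RuntimeError("step failed")),
])

try:
    results.check_success()
    failed = False
except RuntimeError:
    failed = True
```

In this model a single failed run is enough for the combined result to report an error, which matches the has_error() description for MultiResult.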

class d3m.runtime.Runtime(pipeline, hyperparams=None, *, problem_description=None, context, random_seed=0, volumes_dir=None, scratch_dir=None, is_standard_pipeline=False, environment=None, users=None)[source]¶

Bases: object

Reference runtime to fit and produce a pipeline.

Parameters

pipeline (d3m.metadata.pipeline.Pipeline) – A pipeline to run.
hyperparams (Optional[Sequence]) – Values for free hyper-parameters of the pipeline. It should be a list, where each element corresponds to free hyper-parameters of the corresponding pipeline step. Not all free hyper-parameters have to be specified. Default values are used for those which are not. Optional.
problem_description (Optional[d3m.metadata.problem.Problem]) – A parsed problem description in standard problem description schema.
context (d3m.metadata.base.Context) – In which context to run pipelines.
random_seed (int) – A random seed to use for every run. This controls all randomness during the run.
volumes_dir (Optional[str]) – Path to a directory with static files required by primitives, in the standard directory structure (as obtained by running python3 -m d3m primitive download).
scratch_dir (Optional[str]) – Path to a directory to store any temporary files needed during execution.
is_standard_pipeline (bool) – Is the pipeline a standard pipeline?
environment (d3m.metadata.pipeline_run.RuntimeEnvironment) – A description of the runtime environment, including engine versions, Docker images, compute resources, and benchmarks. If not provided, an attempt is made to determine it automatically.
users (Optional[Sequence[d3m.metadata.pipeline_run.User]]) – Users associated with running the pipeline.
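The shape of the hyperparams argument, one element per pipeline step with defaults filling in anything left unspecified, can be modelled as follows. This is a minimal sketch, assuming every step is a primitive step whose element is a plain dict; the hyper-parameter names, the default values, and the resolve helper are hypothetical and are not part of d3m.

```python
from typing import Any, Dict, List

# Hypothetical defaults for a three-step pipeline; names are illustrative only.
step_defaults: List[Dict[str, Any]] = [
    {},
    {"strategy": "mean"},
    {"n_estimators": 100, "max_depth": None},
]

# One element per pipeline step; steps without overrides get an empty dict.
hyperparams: List[Dict[str, Any]] = [
    {},
    {},
    {"n_estimators": 10},
]


def resolve(defaults: List[Dict[str, Any]], overrides: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Overlay the provided values on the defaults, step by step."""
    if len(defaults) != len(overrides):
        raise ValueError("expected one hyper-parameters element per pipeline step")
    return [
        {**step_default, **step_override}
        for step_default, step_override in zip(defaults, overrides)
    ]


resolved = resolve(step_defaults, hyperparams)
```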

fit(inputs, *, outputs_to_expose=None, return_values=None)[source]¶

Runs the “fit” phase of the pipeline.

Parameters

inputs (Sequence[Any]) – A list of inputs to the pipeline.
outputs_to_expose (Optional[Iterable[str]]) – Data references of all outputs of all steps to return. Requesting a data reference of an output which would otherwise not be produced is allowed and forces that output to be produced, but all necessary inputs have to be provided to the primitive; otherwise an error is logged and the output is skipped. If None, the outputs of the whole pipeline are returned.
return_values (Optional[Iterable[str]]) – DEPRECATED: use outputs_to_expose instead.

Returns

A result object with kept values, pipeline run description, and any exception.

Return type

Result

get_params()[source]¶

Return type: List[Union[Any, List]]

produce(inputs, *, outputs_to_expose=None, return_values=None)[source]¶

Runs the “produce” phase of the pipeline and returns outputs.

Parameters

inputs (Sequence[Any]) – A list of inputs to the pipeline.
outputs_to_expose (Optional[Iterable[str]]) – Data references of all outputs of all steps to return. Requesting a data reference of an output which would otherwise not be produced is allowed and forces that output to be produced, but all necessary inputs have to be provided to the primitive; otherwise an error is logged and the output is skipped. If None, the outputs of the whole pipeline are returned.
return_values (Optional[Iterable[str]]) – DEPRECATED: use outputs_to_expose instead.

Returns

A result object with kept values, pipeline run description, and any exception.

Return type

Result
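For both fit and produce, the effect of outputs_to_expose on the values kept in the result can be pictured as a filter over data references. The sketch below is a simplified model, not the runtime's code: select_values is a hypothetical helper, and the reference strings ("steps.0.produce", "outputs.0") assume the usual dotted form of d3m data references.

```python
from typing import Any, Dict, Iterable, Optional

# Hypothetical values computed during a run, keyed by data reference.
computed: Dict[str, Any] = {
    "steps.0.produce": "parsed dataframe",
    "steps.1.produce": "predictions",
    "outputs.0": "predictions",
}


def select_values(
    values: Dict[str, Any],
    outputs_to_expose: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Keep only requested references; with None, keep the pipeline outputs."""
    if outputs_to_expose is None:
        return {ref: value for ref, value in values.items() if ref.startswith("outputs.")}
    wanted = set(outputs_to_expose)
    return {ref: value for ref, value in values.items() if ref in wanted}


default_kept = select_values(computed)
step_kept = select_values(computed, ["steps.0.produce"])
```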

set_params(params)[source]¶

Return type: None

context: d3m.metadata.base.Context[source]¶: In which context to run pipelines.

current_step: int[source]¶: Which step is currently being run.

data_values: Dict[str, Any][source]¶: Map between available data references and their values during the run.

environment: d3m.metadata.pipeline_run.RuntimeEnvironment[source]¶: A description of the runtime environment, including engine versions, Docker images, compute resources, and benchmarks. If not provided, an attempt is made to determine it automatically.

hyperparams: Optional[Sequence][source]¶: Values for free hyper-parameters of the pipeline. It should be a list, where each element corresponds to free hyper-parameters of the corresponding pipeline step. Not all free hyper-parameters have to be specified. Default values are used for those which are not. Optional.

is_standard_pipeline: bool[source]¶: Is the pipeline a standard pipeline?

outputs_to_expose: Iterable[str][source]¶: Which step outputs the runtime should keep during a pipeline run, even after they are no longer necessary. Outputs which would otherwise not be produced are allowed, and that forces those outputs to be produced.

phase: d3m.metadata.base.PipelineRunPhase[source]¶: Which phase is currently running.

pipeline: d3m.metadata.pipeline.Pipeline[source]¶: A pipeline to run.

pipeline_run: Optional[d3m.metadata.pipeline_run.PipelineRun][source]¶: The current instance of the pipeline run.

problem_description: Optional[d3m.metadata.problem.Problem][source]¶: A parsed problem description in standard problem description schema.

random_seed: int[source]¶: A random seed to use for every run. This controls all randomness during the run.

scratch_dir: Optional[str][source]¶: Path to a directory to store any temporary files needed during execution.

steps_state: List[Union[Any, List]][source]¶: Fitted state for each step of the pipeline.

users: Optional[Sequence[d3m.metadata.pipeline_run.User]][source]¶: Users associated with running the pipeline.

volumes_dir: Optional[str][source]¶: Path to a directory with static files required by primitives, in the standard directory structure (as obtained by running python3 -m d3m primitive download).

d3m.runtime.combine_folds(scores_list)[source]¶

Return type: DataFrame

d3m.runtime.combine_pipeline_runs(standard_pipeline_run, *, data_pipeline_run=None, scoring_pipeline_run=None, score_inputs=None, metrics=None, scores=None, fold_group_uuid=None, fold_index=None)[source]¶

Return type: None

d3m.runtime.combine_random_seed(scores, random_seed)[source]¶

Return type: DataFrame

d3m.runtime.evaluate(pipeline, inputs, *, data_pipeline, scoring_pipeline, problem_description, data_params=None, metrics, context, scoring_params=None, hyperparams=None, random_seed=0, data_random_seed=0, scoring_random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None)[source]¶

Values in data_params should be serialized as JSON, as obtained by JSON-serializing the output of the hyper-parameter’s value_to_json_structure method.

Return type: Tuple[List[DataFrame], MultiResult]
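The requirement that each data_params value is itself a JSON string can be illustrated with the standard json module. This is a sketch under stated assumptions: the hyper-parameter names are hypothetical, and the plain Python values stand in for what value_to_json_structure would return for each hyper-parameter.

```python
import json

# Hypothetical hyper-parameter values, already in JSON-compatible form
# (standing in for the output of value_to_json_structure).
json_structures = {"number_of_folds": 5, "shuffle": True, "stratified": False}

# Each value is serialized to a JSON string individually.
data_params = {name: json.dumps(structure) for name, structure in json_structures.items()}

# Decoding a value recovers the original JSON-compatible structure.
decoded = {name: json.loads(text) for name, text in data_params.items()}
```

Note that the mapping as a whole is not serialized; only its values are, so every value in data_params is a str.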

d3m.runtime.evaluate_fold(pipeline, train_inputs, test_inputs, score_inputs, all_scores, all_results, *, data_pipeline_run, fold_group_uuid, fold_index, scoring_pipeline, problem_description, metrics, context, scoring_params=None, hyperparams=None, random_seed=0, scoring_random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None)[source]¶

Return type: None

d3m.runtime.evaluate_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]¶

Return type: None

d3m.runtime.evaluate_with_prepared_data(pipeline, inputs_dir, *, scoring_pipeline, problem_description, metrics, context, scoring_params=None, hyperparams=None, random_seed=0, scoring_random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None, dataset_resolver=None, compute_digest=<ComputeDigest.ONLY_IF_MISSING: 'ONLY_IF_MISSING'>, strict_digest=False)[source]¶

Return type: Tuple[List[DataFrame], MultiResult]

d3m.runtime.export_dataframe(dataframe, output_file=None)[source]¶

Return type: Optional[str]

d3m.runtime.fit(pipeline, inputs, *, problem_description, context, hyperparams=None, random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None, is_standard_pipeline=True, expose_produced_outputs=False, outputs_to_expose=None, data_pipeline=None, data_params=None, data_random_seed=0, data_pipeline_run=None, fold_group_uuid=None, fold_index=0)[source]¶

Return type: Tuple[Optional[Runtime], Optional[DataFrame], Result]

d3m.runtime.fit_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]¶

Return type: None

d3m.runtime.fit_produce_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]¶

Return type: None

d3m.runtime.fit_score_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]¶

Return type: None

d3m.runtime.get_metrics_from_list(metrics)[source]¶

Return type: Sequence[Dict]

d3m.runtime.get_metrics_from_problem_description(problem_description)[source]¶

Return type: Sequence[Dict]

d3m.runtime.get_singleton_value(value)[source]¶

A helper to extract a value from a singleton value (extracting a sole element of a container of length 1).

Return type: Any
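The idea of extracting the sole element of a length-1 container can be sketched as below. singleton_value is a hypothetical stand-in that only handles indexable sequences; it is not the d3m helper, and raising ValueError for other lengths is an assumption.

```python
from typing import Any, Sequence


def singleton_value(value: Sequence[Any]) -> Any:
    """Return the sole element of a length-1 container, or raise ValueError."""
    if len(value) != 1:
        raise ValueError(f"expected a container of length 1, got length {len(value)}")
    return value[0]


sole = singleton_value(["only item"])
```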

d3m.runtime.main(argv)[source]¶

Return type: None

d3m.runtime.parse_pipeline_run(pipeline_run_file, pipeline_search_paths, datasets_dir, *, pipeline_resolver=None, dataset_resolver=None, problem_resolver=None, strict_resolving=False, compute_digest=<ComputeDigest.ONLY_IF_MISSING: 'ONLY_IF_MISSING'>, strict_digest=False, handle_score_split=True)[source]¶

Return type: Sequence[Dict[str, Any]]

d3m.runtime.prepare_data(inputs, *, data_pipeline, problem_description, data_params=None, context, random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None)[source]¶

This function calls a data preparation pipeline. That pipeline can take as input one or more datasets but must always return only one dataset split into training, testing, and scoring splits (e.g., the pipeline combines multiple input datasets). Each split can be across multiple folds. So the data preparation pipeline must have three pipeline outputs, each returning a list of datasets, where every list item corresponds to a fold index.

Values in data_params should be serialized as JSON, as obtained by JSON-serializing the output of the hyper-parameter’s value_to_json_structure method.

Return type: Tuple[List, Result]
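The layout described above, three pipeline outputs that each hold one dataset per fold, can be regrouped into per-fold triples. This is a minimal sketch using placeholder strings in place of datasets; the variable names are hypothetical and nothing here calls d3m.

```python
# Hypothetical outputs of a data preparation pipeline run with two folds.
train_splits = ["train-fold-0", "train-fold-1"]
test_splits = ["test-fold-0", "test-fold-1"]
score_splits = ["score-fold-0", "score-fold-1"]

outputs = [train_splits, test_splits, score_splits]

# All three lists must cover the same folds.
if len({len(split) for split in outputs}) != 1:
    raise ValueError("every split must have the same number of folds")

# Regroup so that each item holds the three datasets for one fold index.
folds = list(zip(*outputs))

for fold_index, (train, test, score) in enumerate(folds):
    print(fold_index, train, test, score)
```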

d3m.runtime.prepare_data_and_save(save_dir, inputs, *, data_pipeline, problem_description, data_params=None, context, random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None, dataset_view_maps=None)[source]¶

Return type: None

d3m.runtime.prepare_data_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]¶

Return type: None

d3m.runtime.produce(fitted_pipeline, test_inputs, *, expose_produced_outputs=False, outputs_to_expose=None, data_pipeline=None, data_params=None, data_random_seed=0, data_pipeline_run=None, fold_group_uuid=None, fold_index=0)[source]¶

Return type: Tuple[Optional[DataFrame], Result]

d3m.runtime.produce_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]¶

Return type: None

d3m.runtime.save_exposed_outputs(results, output_dir)[source]¶

Return type: None

d3m.runtime.save_steps_outputs(results, output_dir)[source]¶

Return type: None

d3m.runtime.score(predictions, score_inputs, *, scoring_pipeline, problem_description, metrics, predictions_random_seed=None, context, scoring_params=None, random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None, data_pipeline=None, data_params=None, data_random_seed=0, data_pipeline_run=None, fold_group_uuid=None, fold_index=0)[source]¶

Return type: Tuple[Optional[DataFrame], Result]

d3m.runtime.score_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]¶

Return type: None

d3m.runtime.score_predictions_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]¶

Return type: None
