d3m.runtime

class d3m.runtime.InputsConfig(inputs, test_inputs, score_inputs, data_pipeline, data_params, data_random_seed)[source]

Bases: tuple

property data_params[source]

Alias for field number 4

property data_pipeline[source]

Alias for field number 3

property data_random_seed[source]

Alias for field number 5

property inputs[source]

Alias for field number 0

property score_inputs[source]

Alias for field number 2

property test_inputs[source]

Alias for field number 1
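
Because InputsConfig is a tuple subclass, each property above is a named view of one positional field. The following is a minimal sketch that rebuilds a stand-in with collections.namedtuple using the documented field order. It is not imported from d3m, and the field values are hypothetical placeholders.

```python
from collections import namedtuple

# Stand-in mirroring the documented field order; this is not the d3m class itself.
InputsConfig = namedtuple(
    "InputsConfig",
    [
        "inputs",
        "test_inputs",
        "score_inputs",
        "data_pipeline",
        "data_params",
        "data_random_seed",
    ],
)

# Placeholder values, chosen only to make the positions visible.
config = InputsConfig(
    inputs=["train"],
    test_inputs=["test"],
    score_inputs=["score"],
    data_pipeline=None,
    data_params=None,
    data_random_seed=0,
)

# Positional and named access refer to the same field.
for index, name in enumerate(InputsConfig._fields):
    assert config[index] == getattr(config, name)
```
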

class d3m.runtime.MultiResult(*args, **kwds)[source]

Bases: List[d3m.runtime.Result]

Results of running a pipeline multiple times.

check_success()[source]

Raises an exception if the pipeline has not finished successfully in any of the runs.

Return type

None

has_error()[source]

Returns True if any of the pipeline runs has not finished successfully.

Return type

bool

property pipeline_runs[source]
Return type

Sequence[PipelineRun]

class d3m.runtime.Result(pipeline_run, values, error=None)[source]

Bases: object

Results from running a pipeline.

Parameters
  • pipeline_run (PipelineRun) – A pipeline run description.

  • values (Dict[str, Any]) – A map between data references and their values computed during pipeline run.

  • error (Optional[Exception]) – If during a run an exception occurred, then it is available here.

check_success()[source]

Raises an exception if the pipeline has not finished successfully.

Return type

None

get_standard_pipeline_output()[source]

Returns the output value if it exists and it is from a standard pipeline.

Return type

Optional[DataFrame]

has_error()[source]

Returns True if the pipeline has not finished successfully.

Return type

bool
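
The relationship between has_error() and check_success(), for a single result and for a list of results, can be illustrated with stand-in classes. This is a sketch under stated assumptions, not the d3m implementation: SketchResult and SketchMultiResult are hypothetical names, and re-raising the stored exception in check_success() is an assumption about how the error is surfaced.

```python
from typing import Any, Dict, Optional


class SketchResult:
    """Hypothetical stand-in for a single pipeline result."""

    def __init__(
        self,
        pipeline_run: Any,
        values: Dict[str, Any],
        error: Optional[Exception] = None,
    ) -> None:
        self.pipeline_run = pipeline_run
        self.values = values
        self.error = error

    def has_error(self) -> bool:
        # A stored exception means the run did not finish successfully.
        return self.error is not None

    def check_success(self) -> None:
        # Assumption: the stored exception itself is raised.
        if self.error is not None:
            raise self.error


class SketchMultiResult(list):
    """Hypothetical stand-in for a list of results from multiple runs."""

    def has_error(self) -> bool:
        # True as soon as one run carries an error.
        return any(result.has_error() for result in self)

    def check_success(self) -> None:
        # Raises on the first run that did not succeed.
        for result in self:
            result.check_success()


succeeded = SketchResult(pipeline_run=None, values={"outputs.0": [1, 2, 3]})
failed = SketchResult(pipeline_run=None, values={}, error=ValueError("step failed"))
results = SketchMultiResult([succeeded, failed])
```

A caller would typically call check_success() before reading values, so that a failed run surfaces as an exception rather than as a missing key.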

class d3m.runtime.Runtime(pipeline, hyperparams=None, *, problem_description=None, context, random_seed=0, volumes_dir=None, scratch_dir=None, is_standard_pipeline=False, environment=None, users=None)[source]

Bases: object

Reference runtime to fit and produce a pipeline.

Parameters
  • pipeline (d3m.metadata.pipeline.Pipeline) – A pipeline to run.

  • hyperparams (Optional[Sequence]) – Values for free hyper-parameters of the pipeline. It should be a list, where each element corresponds to free hyper-parameters of the corresponding pipeline step. Not all free hyper-parameters have to be specified. Default values are used for those which are not. Optional.

  • problem_description (Optional[d3m.metadata.problem.Problem]) – A parsed problem description in standard problem description schema.

  • context (d3m.metadata.base.Context) – In which context to run pipelines.

  • random_seed (int) – A random seed to use for every run. This controls all randomness during the run.

  • volumes_dir (Optional[str]) – Path to a directory with static files required by primitives, in the standard directory structure (as obtained by running python3 -m d3m primitive download).

  • scratch_dir (Optional[str]) – Path to a directory to store any temporary files needed during execution.

  • is_standard_pipeline (bool) – Is the pipeline a standard pipeline?

  • environment (d3m.metadata.pipeline_run.RuntimeEnvironment) – A description of the runtime environment, including engine versions, Docker images, compute resources, and benchmarks. If not provided, an attempt is made to determine it automatically.

  • users (Optional[Sequence[d3m.metadata.pipeline_run.User]]) – Users associated with running the pipeline.
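
The per-step layout of hyperparams can be pictured as one entry per pipeline step, with defaults filling in anything left unspecified. The following is a minimal sketch of that idea, assuming each entry is a dict keyed by hyper-parameter name. The helper fill_step_hyperparams and the hyper-parameter names are hypothetical, and the runtime's own resolution logic is not shown in this reference.

```python
from typing import Any, Dict, List, Sequence


def fill_step_hyperparams(
    defaults: Sequence[Dict[str, Any]],
    provided: Sequence[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Combine per-step defaults with per-step overrides (hypothetical helper)."""
    merged = []
    for index, step_defaults in enumerate(defaults):
        # A step without a provided entry keeps all of its defaults.
        overrides = provided[index] if index < len(provided) else {}
        merged.append({**step_defaults, **overrides})
    return merged


# Hypothetical defaults for a two-step pipeline.
defaults = [
    {"use_semantic_types": False},
    {"n_estimators": 100, "max_depth": None},
]

# Only one hyper-parameter of the second step is overridden.
provided = [{}, {"n_estimators": 250}]

resolved = fill_step_hyperparams(defaults, provided)
```
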

fit(inputs, *, outputs_to_expose=None, return_values=None)[source]

Runs the “fit” phase of the pipeline.

Parameters
  • inputs (Sequence[Any]) – A list of inputs to the pipeline.

  • outputs_to_expose (Optional[Iterable[str]]) – Data references of the step outputs to return. Requesting a data reference of an output which would otherwise not be produced is allowed and forces that output to be produced. All necessary inputs must still be provided to the primitive; otherwise an error is logged and the output is skipped. If None, the outputs of the whole pipeline are returned.

  • return_values (Optional[Iterable[str]]) – DEPRECATED: use outputs_to_expose instead.

Returns

A result object with kept values, pipeline run description, and any exception.

Return type

Result

get_params()[source]
Return type

List[Union[Any, List]]

produce(inputs, *, outputs_to_expose=None, return_values=None)[source]

Runs the “produce” phase of the pipeline and returns outputs.

Parameters
  • inputs (Sequence[Any]) – A list of inputs to the pipeline.

  • outputs_to_expose (Optional[Iterable[str]]) – Data references of the step outputs to return. Requesting a data reference of an output which would otherwise not be produced is allowed and forces that output to be produced. All necessary inputs must still be provided to the primitive; otherwise an error is logged and the output is skipped. If None, the outputs of the whole pipeline are returned.

  • return_values (Optional[Iterable[str]]) – DEPRECATED: use outputs_to_expose instead.

Returns

A result object with kept values, pipeline run description, and any exception.

Return type

Result
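
A typical calling pattern for fit() followed by produce() might look like the sketch below. It is duck-typed so that it runs without d3m installed: fit_then_produce is a hypothetical helper, the data reference "outputs.0" is an assumed name for the first pipeline output, and reading result.values assumes the constructor argument is kept as an attribute. The fake classes exist only to exercise the helper.

```python
from typing import Any, Sequence

# Assumed data reference for the first pipeline output.
OUTPUT_REFERENCE = "outputs.0"


def fit_then_produce(
    runtime: Any,
    train_inputs: Sequence[Any],
    test_inputs: Sequence[Any],
) -> Any:
    """Fit on training inputs, then produce on test inputs (hypothetical helper)."""
    fit_result = runtime.fit(train_inputs, outputs_to_expose=[OUTPUT_REFERENCE])
    fit_result.check_success()

    produce_result = runtime.produce(test_inputs, outputs_to_expose=[OUTPUT_REFERENCE])
    produce_result.check_success()

    return produce_result.values[OUTPUT_REFERENCE]


class _FakeResult:
    """Test double exposing only what the helper uses."""

    def __init__(self, values):
        self.values = values

    def check_success(self):
        return None


class _FakeRuntime:
    """Test double that echoes its inputs back as the exposed output."""

    def fit(self, inputs, *, outputs_to_expose=None):
        return _FakeResult({ref: "fitted" for ref in outputs_to_expose})

    def produce(self, inputs, *, outputs_to_expose=None):
        return _FakeResult({ref: list(inputs) for ref in outputs_to_expose})


predictions = fit_then_produce(_FakeRuntime(), ["train"], ["test"])
```

With a real runtime object in place of the fake one, the same helper would use outputs_to_expose rather than the deprecated return_values argument.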

set_params(params)[source]
Return type

None

context: d3m.metadata.base.Context[source]

In which context to run pipelines.

current_step: int[source]

Which step is currently being run.

data_values: Dict[str, Any][source]

Map between available data references and their values during the run.

environment: d3m.metadata.pipeline_run.RuntimeEnvironment[source]

A description of the runtime environment, including engine versions, Docker images, compute resources, and benchmarks. If not provided, an attempt is made to determine it automatically.

hyperparams: Optional[Sequence][source]

Values for free hyper-parameters of the pipeline. It should be a list, where each element corresponds to free hyper-parameters of the corresponding pipeline step. Not all free hyper-parameters have to be specified. Default values are used for those which are not. Optional.

is_standard_pipeline: bool[source]

Is the pipeline a standard pipeline?

outputs_to_expose: Iterable[str][source]

Which step outputs the runtime should keep during a pipeline run, even after they are no longer necessary. Outputs which would otherwise not be produced are allowed, and requesting them forces those outputs to be produced.

phase: d3m.metadata.base.PipelineRunPhase[source]

Which phase is currently running.

pipeline: d3m.metadata.pipeline.Pipeline[source]

A pipeline to run.

pipeline_run: Optional[d3m.metadata.pipeline_run.PipelineRun][source]

The current pipeline run instance.

problem_description: Optional[d3m.metadata.problem.Problem][source]

A parsed problem description in standard problem description schema.

random_seed: int[source]

A random seed to use for every run. This controls all randomness during the run.

scratch_dir: Optional[str][source]

Path to a directory to store any temporary files needed during execution.

steps_state: List[Union[Any, List]][source]

Fitted state for each step of the pipeline.

users: Optional[Sequence[d3m.metadata.pipeline_run.User]][source]

Users associated with running the pipeline.

volumes_dir: Optional[str][source]

Path to a directory with static files required by primitives, in the standard directory structure (as obtained by running python3 -m d3m primitive download).

d3m.runtime.combine_folds(scores_list)[source]
Return type

DataFrame

d3m.runtime.combine_pipeline_runs(standard_pipeline_run, *, data_pipeline_run=None, scoring_pipeline_run=None, score_inputs=None, metrics=None, scores=None, fold_group_uuid=None, fold_index=None)[source]
Return type

None

d3m.runtime.combine_random_seed(scores, random_seed)[source]
Return type

DataFrame

d3m.runtime.evaluate(pipeline, inputs, *, data_pipeline, scoring_pipeline, problem_description, data_params=None, metrics, context, scoring_params=None, hyperparams=None, random_seed=0, data_random_seed=0, scoring_random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None)[source]

Values in data_params should be serialized as JSON, as obtained by JSON-serializing the output of hyper-parameter’s value_to_json_structure method call.

Return type

Tuple[List[DataFrame], MultiResult]
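
The JSON-serialization requirement for data_params means that each value is a JSON string rather than a native Python object. The following is a minimal sketch, assuming the JSON-compatible structures have already been obtained from each hyper-parameter's value_to_json_structure call. The hyper-parameter names and values here are hypothetical.

```python
import json

# Hypothetical JSON-compatible structures, standing in for what each
# hyper-parameter's value_to_json_structure call would return.
json_structures = {
    "number_of_folds": 3,
    "shuffle": True,
    "stratified": False,
}

# Each value is serialized individually, so data_params maps names to JSON strings.
data_params = {
    name: json.dumps(structure) for name, structure in json_structures.items()
}
```

Note that booleans become the JSON literals "true" and "false", not Python's "True" and "False".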

d3m.runtime.evaluate_fold(pipeline, train_inputs, test_inputs, score_inputs, all_scores, all_results, *, data_pipeline_run, fold_group_uuid, fold_index, scoring_pipeline, problem_description, metrics, context, scoring_params=None, hyperparams=None, random_seed=0, scoring_random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None)[source]
Return type

None

d3m.runtime.evaluate_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]
Return type

None

d3m.runtime.evaluate_with_prepared_data(pipeline, inputs_dir, *, scoring_pipeline, problem_description, metrics, context, scoring_params=None, hyperparams=None, random_seed=0, scoring_random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None, dataset_resolver=None, compute_digest=<ComputeDigest.ONLY_IF_MISSING: 'ONLY_IF_MISSING'>, strict_digest=False)[source]
Return type

Tuple[List[DataFrame], MultiResult]

d3m.runtime.export_dataframe(dataframe, output_file=None)[source]
Return type

Optional[str]

d3m.runtime.fit(pipeline, inputs, *, problem_description, context, hyperparams=None, random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None, is_standard_pipeline=True, expose_produced_outputs=False, outputs_to_expose=None, data_pipeline=None, data_params=None, data_random_seed=0, data_pipeline_run=None, fold_group_uuid=None, fold_index=0)[source]
Return type

Tuple[Optional[Runtime], Optional[DataFrame], Result]

d3m.runtime.fit_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]
Return type

None

d3m.runtime.fit_produce_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]
Return type

None

d3m.runtime.fit_score_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]
Return type

None

d3m.runtime.get_metrics_from_list(metrics)[source]
Return type

Sequence[Dict]

d3m.runtime.get_metrics_from_problem_description(problem_description)[source]
Return type

Sequence[Dict]

d3m.runtime.get_singleton_value(value)[source]

A helper to extract a value from a singleton value (extracting a sole element of a container of length 1).

Return type

Any
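
The idea of extracting the sole element of a length-1 container can be sketched for plain Python sequences as follows. This is an illustration, not the library's implementation: sole_element is a hypothetical name, and raising ValueError for any other length is an assumption.

```python
from typing import Any, Sequence


def sole_element(container: Sequence[Any]) -> Any:
    """Return the only element of a length-1 sequence (hypothetical helper)."""
    if len(container) != 1:
        # Assumption: anything other than exactly one element is an error.
        raise ValueError(f"Expected exactly one element, got {len(container)}.")
    return container[0]


value = sole_element([42])
```
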

d3m.runtime.main(argv)[source]
Return type

None

d3m.runtime.parse_pipeline_run(pipeline_run_file, pipeline_search_paths, datasets_dir, *, pipeline_resolver=None, dataset_resolver=None, problem_resolver=None, strict_resolving=False, compute_digest=<ComputeDigest.ONLY_IF_MISSING: 'ONLY_IF_MISSING'>, strict_digest=False, handle_score_split=True)[source]
Return type

Sequence[Dict[str, Any]]

d3m.runtime.prepare_data(inputs, *, data_pipeline, problem_description, data_params=None, context, random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None)[source]

This function calls a data preparation pipeline. That pipeline can take as input one or more datasets but must always return only one dataset split into training, testing, and scoring splits (e.g., the pipeline combines multiple input datasets). Each split can be across multiple folds. So the data preparation pipeline must have three pipeline outputs, each returning a list of datasets, where every list item corresponds to a fold index.

Values in data_params should be serialized as JSON, as obtained by JSON-serializing the output of hyper-parameter’s value_to_json_structure method call.

Return type

Tuple[List, Result]
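
The three-outputs-by-fold layout described above can be pictured as three parallel lists indexed by fold. The following is a minimal sketch that regroups them into one (train, test, score) triple per fold. The helper group_by_fold and the string placeholders standing in for datasets are hypothetical.

```python
from typing import Any, List, Sequence, Tuple


def group_by_fold(
    train: Sequence[Any],
    test: Sequence[Any],
    score: Sequence[Any],
) -> List[Tuple[Any, Any, Any]]:
    """Regroup three per-fold lists into per-fold triples (hypothetical helper)."""
    if not (len(train) == len(test) == len(score)):
        raise ValueError("All three outputs must cover the same number of folds.")
    # The list position is the fold index.
    return list(zip(train, test, score))


# Placeholders standing in for the datasets of a two-fold split.
folds = group_by_fold(
    ["train-0", "train-1"],
    ["test-0", "test-1"],
    ["score-0", "score-1"],
)

for fold_index, (train_split, test_split, score_split) in enumerate(folds):
    print(fold_index, train_split, test_split, score_split)
```
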

d3m.runtime.prepare_data_and_save(save_dir, inputs, *, data_pipeline, problem_description, data_params=None, context, random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None, dataset_view_maps=None)[source]
Return type

None

d3m.runtime.prepare_data_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]
Return type

None

d3m.runtime.produce(fitted_pipeline, test_inputs, *, expose_produced_outputs=False, outputs_to_expose=None, data_pipeline=None, data_params=None, data_random_seed=0, data_pipeline_run=None, fold_group_uuid=None, fold_index=0)[source]
Return type

Tuple[Optional[DataFrame], Result]

d3m.runtime.produce_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]
Return type

None

d3m.runtime.save_exposed_outputs(results, output_dir)[source]
Return type

None

d3m.runtime.save_steps_outputs(results, output_dir)[source]
Return type

None

d3m.runtime.score(predictions, score_inputs, *, scoring_pipeline, problem_description, metrics, predictions_random_seed=None, context, scoring_params=None, random_seed=0, volumes_dir=None, scratch_dir=None, runtime_environment=None, data_pipeline=None, data_params=None, data_random_seed=0, data_pipeline_run=None, fold_group_uuid=None, fold_index=0)[source]
Return type

Tuple[Optional[DataFrame], Result]

d3m.runtime.score_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]
Return type

None

d3m.runtime.score_predictions_handler(arguments, *, pipeline_resolver=None, pipeline_run_parser=None, dataset_resolver=None, problem_resolver=None)[source]
Return type

None