d3m.runtime module

class d3m.runtime.MultiResult[source]

Bases: typing.List

Results of running a pipeline multiple times.

check_success() → None[source]

Raises an exception if any of the runs has not finished successfully.

has_error() → bool[source]

Returns True if any of the pipeline runs has not finished successfully.

pipeline_runs[source]
class d3m.runtime.Result(pipeline_run: d3m.metadata.pipeline_run.PipelineRun, values: Dict[str, Any], error: Exception = None)[source]

Bases: object

Results from running a pipeline.

Parameters:
  • pipeline_run (PipelineRun) – A pipeline run description.
  • values (Dict[str, Any]) – A map between data references and their values computed during pipeline run.
  • error (Optional[Exception]) – If during a run an exception occurred, then it is available here.
check_success() → None[source]

Raises an exception if the pipeline has not finished successfully.

has_error() → bool[source]

Returns True if the pipeline has not finished successfully.
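The relationship between has_error() and check_success() on a single result and on a list of results can be illustrated with a minimal sketch. SketchResult and SketchMultiResult are hypothetical stand-ins written for this example, not the d3m classes, and the sketch assumes that check_success() re-raises the stored exception.

```python
from typing import Any, Dict, Optional


class SketchResult:
    """Hypothetical stand-in for d3m.runtime.Result (not the library class)."""

    def __init__(self, values: Dict[str, Any], error: Optional[Exception] = None) -> None:
        self.values = values
        self.error = error

    def has_error(self) -> bool:
        # A run counts as failed when an exception was recorded for it.
        return self.error is not None

    def check_success(self) -> None:
        # Assumption: the recorded exception itself is re-raised.
        if self.error is not None:
            raise self.error


class SketchMultiResult(list):
    """Hypothetical stand-in for d3m.runtime.MultiResult: a list of results."""

    def has_error(self) -> bool:
        # True as soon as any single run has failed.
        return any(result.has_error() for result in self)

    def check_success(self) -> None:
        # Raises for the first failed run; returns None if every run succeeded.
        for result in self:
            result.check_success()


results = SketchMultiResult([
    SketchResult({"outputs.0": [1, 2, 3]}),
    SketchResult({}, error=ValueError("step failed")),
])
```

Calling results.has_error() on this list reports a failure without raising, while results.check_success() would raise the ValueError stored in the second result.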

class d3m.runtime.Runtime(pipeline: d3m.metadata.pipeline.Pipeline, hyperparams: Sequence = None, *, problem_description: d3m.metadata.problem.Problem = None, context: d3m.metadata.base.Context, random_seed: int = 0, volumes_dir: str = None, scratch_dir: str = None, is_standard_pipeline: bool = False, environment: d3m.metadata.pipeline_run.RuntimeEnvironment = None, users: Sequence[d3m.metadata.pipeline_run.User] = None)[source]

Bases: object

Reference runtime to fit and produce a pipeline.

Parameters:
  • pipeline (Pipeline) – A pipeline to run.
  • hyperparams (Sequence) – Values for free hyper-parameters of the pipeline. It should be a list, where each element corresponds to free hyper-parameters of the corresponding pipeline step. Not all free hyper-parameters have to be specified. Default values are used for those which are not. Optional.
  • problem_description (Problem) – A parsed problem description in standard problem description schema.
  • context (Context) – The context in which to run pipelines. The default is TESTING.
  • random_seed (int) – A random seed to use for every run. This controls all randomness during the run.
  • volumes_dir (str) – Path to a directory with static files required by primitives. The directory should use the standard directory structure (as obtained by running python3 -m d3m index download).
  • scratch_dir (str) – Path to a directory to store any temporary files needed during execution.
  • is_standard_pipeline (bool) – Is the pipeline a standard pipeline?
  • environment (RuntimeEnvironment) – A description of the runtime environment, including engine versions, Docker images, compute resources, and benchmarks. If not provided, an attempt is made to determine it automatically.
  • users (Sequence[User]) – Users associated with running the pipeline.
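As an illustration of the hyperparams argument described above, the following sketch builds a list with one element per pipeline step. It assumes that each element is a dict mapping hyper-parameter names to values, which the text above does not state explicitly. The helper build_hyperparams and the hyper-parameter name n_estimators are hypothetical.

```python
from typing import Any, Dict, List


def build_hyperparams(number_of_steps: int,
                      overrides: Dict[int, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return one dict per pipeline step (hypothetical helper, not part of d3m).

    Steps without an entry in `overrides` get an empty dict, so their
    free hyper-parameters fall back to default values.
    """
    unknown_steps = sorted(set(overrides) - set(range(number_of_steps)))
    if unknown_steps:
        raise IndexError(f"No such pipeline steps: {unknown_steps}")
    # Copy each dict so later changes to `overrides` do not leak into the result.
    return [dict(overrides.get(step, {})) for step in range(number_of_steps)]


# Override a single (hypothetical) hyper-parameter of the third step only.
hyperparams = build_hyperparams(3, {2: {"n_estimators": 50}})
```

The resulting list has the same length as the pipeline, and the position of each element identifies the step it applies to.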
pipeline[source]

A pipeline to run.

Type:Pipeline
hyperparams[source]

Values for free hyper-parameters of the pipeline. It should be a list, where each element corresponds to free hyper-parameters of the corresponding pipeline step. Not all free hyper-parameters have to be specified. Default values are used for those which are not. Optional.

Type:Sequence
problem_description[source]

A parsed problem description in standard problem description schema.

Type:Problem
context[source]

The context in which to run pipelines. The default is TESTING.

Type:Context
random_seed[source]

A random seed to use for every run. This controls all randomness during the run.

Type:int
volumes_dir[source]

Path to a directory with static files required by primitives. The directory should use the standard directory structure (as obtained by running python3 -m d3m index download).

Type:str
scratch_dir[source]

Path to a directory to store any temporary files needed during execution.

Type:str
is_standard_pipeline[source]

Is the pipeline a standard pipeline?

Type:bool
environment[source]

A description of the runtime environment, including engine versions, Docker images, compute resources, and benchmarks. If not provided, an attempt is made to determine it automatically.

Type:RuntimeEnvironment
users[source]

Users associated with running the pipeline.

Type:Sequence[User]
current_step[source]

Which step is currently being run.

Type:int
phase[source]

Which phase is currently being run.

Type:PipelineRunPhase
pipeline_run[source]

The current pipeline run instance.

Type:PipelineRun
return_values[source]

Which values the runtime should keep during a pipeline run, even after they are no longer necessary.

Type:Sequence[str]
data_values[source]

Map between available data references and their values during the run.

Type:Dict[str, Any]
steps_state[source]

Fitted state for each step of the pipeline.

Type:List[Union[PrimitiveBase, Runtime]]
fit(inputs: Sequence[Any], *, return_values: Sequence[str] = None) → d3m.runtime.Result[source]

Runs the “fit” phase of the pipeline.

Parameters:
  • inputs (Sequence[Any]) – A list of inputs to the pipeline.
  • return_values (Sequence[str]) – A list of data references of all output values of all steps to return. If None, the output values of the whole pipeline are returned.
Returns:

A result object with kept values, pipeline run description, and any exception.

Return type:

Result

produce(inputs: Sequence[Any], *, return_values: Sequence[str] = None) → d3m.runtime.Result[source]

Runs the “produce” phase of the pipeline and returns outputs.

Parameters:
  • inputs (Sequence[Any]) – A list of inputs to the pipeline.
  • return_values (Sequence[str]) – A list of data references of all output values of all steps to return. If None, the output values of the whole pipeline are returned.
Returns:

A result object with kept values, pipeline run description, and any exception.

Return type:

Result
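The fit and produce methods above are typically called in sequence. The sketch below shows one possible way to chain them. It relies only on the method names and keyword arguments documented above, and assumes that the returned Result exposes its values as a values attribute. The function fit_then_produce is a hypothetical helper, and runtime can be any object with the same interface as Runtime.

```python
from typing import Any, Dict, Optional, Sequence


def fit_then_produce(runtime: Any,
                     train_inputs: Sequence[Any],
                     test_inputs: Sequence[Any],
                     return_values: Optional[Sequence[str]] = None) -> Dict[str, Any]:
    """Fit on train_inputs, produce on test_inputs, and return the kept values."""
    fit_result = runtime.fit(train_inputs, return_values=return_values)
    # Stop early if fitting failed: check_success() raises in that case.
    fit_result.check_success()

    produce_result = runtime.produce(test_inputs, return_values=return_values)
    produce_result.check_success()

    # Assumption: the values passed to the Result constructor are
    # available as an attribute of the same name.
    return produce_result.values
```

Checking each result before moving on keeps a failed fit from being followed by a produce call on a pipeline that was never fitted.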

d3m.runtime.combine_folds(scores_list: List[d3m.container.pandas.DataFrame]) → d3m.container.pandas.DataFrame[source]
d3m.runtime.combine_pipeline_runs(standard_pipeline_run: d3m.metadata.pipeline_run.PipelineRun, *, data_pipeline_run: d3m.metadata.pipeline_run.PipelineRun = None, scoring_pipeline_run: d3m.metadata.pipeline_run.PipelineRun = None, score_inputs: Sequence[Any] = None, metrics: Sequence[Dict] = None, scores: d3m.container.pandas.DataFrame = None, fold_group_uuid: uuid.UUID = None, fold_index: int = None) → None[source]
d3m.runtime.combine_random_seed(scores: d3m.container.pandas.DataFrame, random_seed: int) → d3m.container.pandas.DataFrame[source]
d3m.runtime.evaluate(pipeline: d3m.metadata.pipeline.Pipeline, inputs: Sequence[d3m.container.dataset.Dataset], *, data_pipeline: d3m.metadata.pipeline.Pipeline, scoring_pipeline: d3m.metadata.pipeline.Pipeline, problem_description: Optional[d3m.metadata.problem.Problem], data_params: Dict[str, str], metrics: Sequence[Dict], context: d3m.metadata.base.Context, scoring_params: Dict[str, str] = None, hyperparams: Sequence = None, random_seed: int = 0, data_random_seed: int = 0, scoring_random_seed: int = 0, volumes_dir: str = None, scratch_dir: str = None, runtime_environment: d3m.metadata.pipeline_run.RuntimeEnvironment = None) → Tuple[List[d3m.container.pandas.DataFrame], d3m.runtime.MultiResult][source]

Values in data_params should be serialized as JSON, as obtained by JSON-serializing the output of the hyper-parameter’s value_to_json_structure method call.
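The serialization requirement for data_params can be sketched as follows. The example assumes that the JSON structures have already been obtained from value_to_json_structure and only shows the final JSON-serializing step with the standard json module. The helper serialize_data_params and the hyper-parameter names number_of_folds and shuffle are hypothetical.

```python
import json
from typing import Any, Dict


def serialize_data_params(json_structures: Dict[str, Any]) -> Dict[str, str]:
    """JSON-serialize each value so the mapping matches Dict[str, str]."""
    return {name: json.dumps(value) for name, value in json_structures.items()}


# The plain Python values stand in for what value_to_json_structure
# would return for each (hypothetical) hyper-parameter.
data_params = serialize_data_params({"number_of_folds": 5, "shuffle": True})
```

Every value in the resulting dict is a string, so the integer 5 becomes "5" and the boolean True becomes "true".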

d3m.runtime.evaluate_handler(arguments: argparse.Namespace, *, pipeline_resolver: Callable = None, pipeline_run_parser: Callable = None, dataset_resolver: Callable = None, problem_resolver: Callable = None) → None[source]
d3m.runtime.export_dataframe(dataframe: d3m.container.pandas.DataFrame, output_file: IO[Any] = None) → Optional[str][source]
d3m.runtime.fit(pipeline: d3m.metadata.pipeline.Pipeline, inputs: Sequence[d3m.container.dataset.Dataset], *, problem_description: Optional[d3m.metadata.problem.Problem], context: d3m.metadata.base.Context, hyperparams: Sequence = None, random_seed: int = 0, volumes_dir: str = None, scratch_dir: str = None, runtime_environment: d3m.metadata.pipeline_run.RuntimeEnvironment = None, is_standard_pipeline: bool = True, expose_produced_outputs: bool = False) → Tuple[Optional[d3m.runtime.Runtime], Optional[d3m.container.pandas.DataFrame], d3m.runtime.Result][source]
d3m.runtime.fit_handler(arguments: argparse.Namespace, *, pipeline_resolver: Callable = None, pipeline_run_parser: Callable = None, dataset_resolver: Callable = None, problem_resolver: Callable = None) → None[source]
d3m.runtime.fit_produce_handler(arguments: argparse.Namespace, *, pipeline_resolver: Callable = None, pipeline_run_parser: Callable = None, dataset_resolver: Callable = None, problem_resolver: Callable = None) → None[source]
d3m.runtime.fit_score_handler(arguments: argparse.Namespace, *, pipeline_resolver: Callable = None, pipeline_run_parser: Callable = None, dataset_resolver: Callable = None, problem_resolver: Callable = None) → None[source]
d3m.runtime.get_metrics_from_list(metrics: Sequence[str]) → Sequence[Dict][source]
d3m.runtime.get_metrics_from_problem_description(problem_description: Optional[d3m.metadata.problem.Problem]) → Sequence[Dict][source]
d3m.runtime.get_singleton_value(value: Any) → Any[source]

A helper to extract a value from a singleton value (extracting a sole element of a container of length 1).
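The idea of extracting the sole element of a container of length 1 can be shown with a minimal sketch. The function sole_element is a hypothetical illustration, not the d3m implementation, which may treat specific container types differently.

```python
from typing import Any


def sole_element(container: Any) -> Any:
    """Return the only element of a length-1 container, or raise ValueError."""
    if len(container) != 1:
        raise ValueError(f"Expected exactly one element, got {len(container)}.")
    # next(iter(...)) also works for containers without integer indexing, such as sets.
    return next(iter(container))
```

For example, sole_element([42]) returns 42, while passing a container with zero or several elements raises ValueError.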

d3m.runtime.main(argv: Sequence) → None[source]
d3m.runtime.parse_pipeline_run(pipeline_run_file: IO[Any], pipeline_search_paths: Sequence[str], datasets_dir: Optional[str], *, pipeline_resolver: Callable = None, dataset_resolver: Callable = None, problem_resolver: Callable = None, strict_resolving: bool = False, compute_digest: d3m.container.dataset.ComputeDigest = <ComputeDigest.ONLY_IF_MISSING: 'ONLY_IF_MISSING'>, strict_digest: bool = False, handle_score_split: bool = True) → Sequence[Dict[str, Any]][source]
d3m.runtime.prepare_data(inputs: Sequence[d3m.container.dataset.Dataset], *, data_pipeline: d3m.metadata.pipeline.Pipeline, problem_description: Optional[d3m.metadata.problem.Problem], data_params: Dict[str, str], context: d3m.metadata.base.Context, random_seed: int = 0, volumes_dir: str = None, scratch_dir: str = None, runtime_environment: d3m.metadata.pipeline_run.RuntimeEnvironment = None) → Tuple[List, d3m.runtime.Result][source]

Values in data_params should be serialized as JSON, as obtained by JSON-serializing the output of the hyper-parameter’s value_to_json_structure method call.

d3m.runtime.produce(fitted_pipeline: d3m.runtime.Runtime, test_inputs: Sequence[d3m.container.dataset.Dataset], *, expose_produced_outputs: bool = False) → Tuple[Optional[d3m.container.pandas.DataFrame], d3m.runtime.Result][source]
d3m.runtime.produce_handler(arguments: argparse.Namespace, *, pipeline_resolver: Callable = None, pipeline_run_parser: Callable = None, dataset_resolver: Callable = None, problem_resolver: Callable = None) → None[source]
d3m.runtime.save_steps_outputs(results: Union[d3m.runtime.Result, d3m.runtime.MultiResult], output_dir: str) → None[source]
d3m.runtime.score(predictions: d3m.container.pandas.DataFrame, score_inputs: Sequence[d3m.container.dataset.Dataset], *, scoring_pipeline: d3m.metadata.pipeline.Pipeline, problem_description: Optional[d3m.metadata.problem.Problem], metrics: Sequence[Dict], predictions_random_seed: int = None, context: d3m.metadata.base.Context, scoring_params: Dict[str, str] = None, random_seed: int = 0, volumes_dir: str = None, scratch_dir: str = None, runtime_environment: d3m.metadata.pipeline_run.RuntimeEnvironment = None) → Tuple[Optional[d3m.container.pandas.DataFrame], d3m.runtime.Result][source]
d3m.runtime.score_handler(arguments: argparse.Namespace, *, pipeline_resolver: Callable = None, pipeline_run_parser: Callable = None, dataset_resolver: Callable = None, problem_resolver: Callable = None) → None[source]
d3m.runtime.score_predictions_handler(arguments: argparse.Namespace, *, pipeline_resolver: Callable = None, pipeline_run_parser: Callable = None, dataset_resolver: Callable = None, problem_resolver: Callable = None) → None[source]