TODO: Document that custom/additional fields are allowed (which are part of digest). Document _prefix fields (which are not part of digest).
TODO: Document sub-pipeline step. Document how data references for sub-pipelines are done.
TODO: Document placeholder step.
TODO: Document resolving of pipelines (by filename based on ID in the pipeline search path).
Interaction with Problem Description
TODO: Passing true targets and LUPI through semantic types from the problem description.
All input and output (container) values passed between primitives should
expose a Sequence protocol (a sequence of samples) and provide a
metadata attribute with metadata.
The d3m.container module exposes the following standard types:
Dataset – a class representing datasets, including D3M datasets, implemented in
List can be used to create a simple list container.
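The two requirements on container values (a sequence of samples plus a metadata attribute) can be sketched in plain Python. This is only an illustration: `SimpleListContainer` and `check_container` are hypothetical names, a plain dict stands in for the metadata object, and none of this is the actual d3m implementation.

```python
from collections.abc import Sequence


class SimpleListContainer(list):
    """Hypothetical stand-in for a list container; not the d3m implementation."""

    def __init__(self, iterable=(), metadata=None):
        super().__init__(iterable)
        # Metadata travels with the values. A plain dict is an assumption
        # made for this sketch only.
        self.metadata = dict(metadata or {})


def check_container(value):
    """Return True if value is a sequence of samples and carries metadata."""
    return isinstance(value, Sequence) and hasattr(value, "metadata")


# Two samples, each with two values, plus some illustrative metadata.
samples = SimpleListContainer(
    [[1.0, 2.0], [3.0, 4.0]],
    metadata={"dimension": {"length": 2}},
)
```

A built-in `list` satisfies the sequence requirement but has no `metadata` attribute, so `check_container` rejects it while accepting `samples`.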
It is strongly encouraged to use the DataFrame container type for
primitives which do not have strong reasons to use something else
(Datasets to operate on the initial pipeline input, optimized
high-dimensional packed data in ndarrays, or lists to pass as
values to hyper-parameters). This makes it easier to operate just on
columns without type casting while the data is being transformed to make
it useful for models.
When deciding which container type to use for the inputs and outputs of a
primitive, also consider where your primitive is expected to sit in the
pipeline. Generally, pipelines tend to have primitives operating on
Dataset at the beginning, then use DataFrame, and then convert to
ndarray.
Container types can contain values of the following types:
Placeholders can be used to define pipeline templates for use outside of the metalearning context. A placeholder is replaced with a pipeline step to form a complete pipeline. Restrictions may apply to placeholders, such as on their number, their position, and their allowed inputs and outputs.
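Replacing a placeholder can be sketched as follows. The dict layout is a simplified assumption rather than the real pipeline schema, and `TEMPLATE`, `fill_placeholder`, and the step names are hypothetical. The two checks are examples of the kind of restriction a system might impose, not rules stated by this document.

```python
import copy

# Simplified, hypothetical pipeline template: a list of step dicts.
TEMPLATE = {
    "id": "example-template",
    "steps": [
        {"type": "PRIMITIVE", "name": "dataset_to_dataframe"},
        {
            "type": "PLACEHOLDER",
            "inputs": ["steps.0.produce"],
            "outputs": ["produce"],
        },
    ],
}


def fill_placeholder(template, replacement_step):
    """Return a copy of template with its single placeholder replaced."""
    positions = [
        index
        for index, step in enumerate(template["steps"])
        if step["type"] == "PLACEHOLDER"
    ]
    # Example restriction: exactly one placeholder is allowed.
    if len(positions) != 1:
        raise ValueError(f"expected exactly one placeholder, found {len(positions)}")
    placeholder = template["steps"][positions[0]]
    # Example restriction: the replacement keeps the same inputs and outputs.
    for key in ("inputs", "outputs"):
        if replacement_step.get(key) != placeholder[key]:
            raise ValueError(f"replacement step has incompatible {key}")
    # Work on a copy so the template can be reused.
    pipeline = copy.deepcopy(template)
    pipeline["steps"][positions[0]] = copy.deepcopy(replacement_step)
    return pipeline


model_step = {
    "type": "PRIMITIVE",
    "name": "random_forest",
    "inputs": ["steps.0.produce"],
    "outputs": ["produce"],
}
pipeline = fill_placeholder(TEMPLATE, model_step)
```

Because the function copies the template instead of mutating it, the same template can be filled with different steps to produce several candidate pipelines.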