d3m.primitive_interfaces.base

class d3m.primitive_interfaces.base.CallResult(value, has_finished=True, iterations_done=None)

Bases: Generic[T]

Some methods return additional metadata about the method call itself (which is different from metadata about the returned value, which is stored in the metadata attribute of the value itself).

For a produce method call, has_finished is True if the last call to produce has produced the final outputs and a call with more time or more iterations cannot get different outputs.

For a fit method call, has_finished is True if the primitive has been fully fitted on the current training data and further calls to fit are unnecessary and will not change anything. False means that more iterations can be done (but it does not necessarily mean that more iterations are beneficial).

If a primitive has internal iterations, then iterations_done contains how many of those iterations have been made during the last call. If the primitive does not support them, iterations_done is None.

Those methods should return their value wrapped into this class.
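
For illustration, a hedged sketch of how a produce method might wrap its outputs and how a caller might inspect the result (the outputs value below is a placeholder, not something provided by the library):

    from d3m.primitive_interfaces.base import CallResult

    outputs = None  # placeholder for whatever the primitive has computed

    # Inside a hypothetical produce method: wrap the outputs together with
    # metadata about the call itself.
    result = CallResult(outputs, has_finished=False, iterations_done=3)

    # Caller side: the value and the metadata about the call are inspected
    # separately from any metadata stored on the value itself.
    predictions = result.value
    if not result.has_finished:
        pass  # more time or more iterations could still change the outputs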

class d3m.primitive_interfaces.base.ContinueFitMixin(*args, **kwds)

Bases: Generic[Inputs, Outputs, Params, Hyperparams]

abstract continue_fit(*, timeout=None, iterations=None)

Similar to the base fit, this method fits the primitive using inputs and outputs (if any) using currently set training data.

The difference is what happens when currently set training data is different from what the primitive might have already been fitted on. fit resets parameters and refits the primitive (restarts fitting), while continue_fit fits the primitive further on new training data. fit does not have to be called before continue_fit; calling continue_fit first starts fitting as well.

A caller can also call continue_fit multiple times on the same training data, in which case the primitive should try to improve the fit in the same way as with fit.

From the perspective of a caller of all other methods, the training data in effect is still just the currently set training data. If a caller wants to call gradient_output on all data on which the primitive has been fitted through multiple calls of continue_fit on different training data, the caller should pass all this data themselves through another call to set_training_data, not call fit or continue_fit again, and then use the gradient_output method. In this way primitives which truly support continuation of fitting and need only the latest data to do another fitting do not have to keep all past training data around themselves.

If a primitive supports this mixin, then both fit and continue_fit can be called. continue_fit always continues fitting, whether fitting was started through fit or continue_fit, as long as fitting has not already finished. Calling fit always restarts fitting after continue_fit has been called, even if training data has not changed.

Primitives supporting this mixin which operate on categorical target columns should use the all_distinct_values metadata to obtain all values (labels) which can appear in a target column, even if currently set training data does not contain all those values.

Parameters:
- timeout (Optional[float]) – A maximum time this primitive should take to continue fitting during this method call, in seconds.
- iterations (Optional[int]) – How many of internal iterations should the primitive do.

Returns: A CallResult with None value.

Return type: CallResult[None]
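
As a hedged, caller-side sketch (assuming a primitive instance implementing this mixin and an iterable of hypothetical training batches), fitting on a stream of data might look like this:

    def fit_on_stream(primitive, batches):
        # Each batch replaces the currently set training data; continue_fit
        # then fits the primitive further on it, so the caller does not have
        # to keep past batches around.
        for batch_inputs, batch_outputs in batches:
            primitive.set_training_data(inputs=batch_inputs, outputs=batch_outputs)
            primitive.continue_fit()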

class d3m.primitive_interfaces.base.DockerContainer(address, ports)

Bases: tuple

A tuple suitable to describe connection information necessary to connect to exposed ports of a running Docker container.

class d3m.primitive_interfaces.base.GradientCompositionalityMixin(*args, **kwds)

Bases: Generic[Inputs, Outputs, Params, Hyperparams]

This mixin provides additional abstract methods which primitives should implement to help callers with doing various end-to-end refinements using gradient-based compositionality.

This mixin adds methods to support at least:

- gradient-based, compositional end-to-end training
- regularized pre-training
- multi-task adaptation
- black box variational inference
- Hamiltonian Monte Carlo

abstract backward(*, gradient_outputs, fine_tune=False, fine_tune_learning_rate=1e-05, fine_tune_weight_decay=1e-05)

Returns the gradient with respect to inputs and with respect to params of a loss that is being backpropagated end-to-end in a pipeline.

This is the standard backpropagation algorithm: backpropagation needs to be preceded by a forward propagation (a forward method call).

Parameters:
- gradient_outputs (Gradients[Outputs]) – The gradient of the loss with respect to this primitive's output. During backpropagation, this comes from the next primitive in the pipeline, i.e., the primitive whose input is the output of this primitive during the forward execution with forward (and produce).
- fine_tune (bool) – If True, executes a fine-tuning gradient descent step as a part of this call. This provides the most straightforward way of end-to-end training/fine-tuning.
- fine_tune_learning_rate (float) – Learning rate for end-to-end training/fine-tuning gradient descent steps.
- fine_tune_weight_decay (float) – L2 regularization (weight decay) coefficient for end-to-end training/fine-tuning gradient descent steps.

Returns: A tuple of the gradient with respect to inputs and with respect to params.

forward(*, inputs)

Similar to the produce method, but meant to be used for a forward pass during backpropagation-based end-to-end training. A primitive can implement it differently than produce, e.g., the forward pass during training can enable dropout layers, or produce might not compute gradients while forward does.

By default it calls produce for one iteration.

Parameters:
- inputs (Inputs) – The inputs of shape [num_inputs, …].

Returns: The outputs of shape [num_inputs, …].

Return type: Outputs
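
A hedged, caller-side sketch of one end-to-end fine-tuning step through two primitives implementing this mixin (first, second, and loss_gradient are hypothetical; loss_gradient is assumed to be the gradient of the pipeline loss with respect to the output of second):

    def fine_tune_step(first, second, inputs, loss_gradient):
        # Forward pass through the two-primitive pipeline.
        middle = first.forward(inputs=inputs)
        _ = second.forward(inputs=middle)

        # Backward pass: each backward call returns the gradient with respect
        # to that primitive's inputs (passed further upstream) and with
        # respect to its params; fine_tune=True also applies a gradient
        # descent step inside the primitive.
        middle_gradient, second_params_gradient = second.backward(
            gradient_outputs=loss_gradient, fine_tune=True,
        )
        inputs_gradient, first_params_gradient = first.backward(
            gradient_outputs=middle_gradient, fine_tune=True,
        )
        return inputs_gradient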

abstract gradient_output(*, outputs, inputs)

Returns the gradient of the loss sum_i(L(output_i, produce_one(input_i))) with respect to outputs.

When the fit term temperature is set to non-zero, it should return the gradient with respect to outputs of:

    sum_i(L(output_i, produce_one(input_i))) + temperature * sum_i(L(training_output_i, produce_one(training_input_i)))

When used in combination with the ProbabilisticCompositionalityMixin, it returns the gradient of sum_i(log(p(output_i | input_i, params))) with respect to outputs. When the fit term temperature is set to non-zero, it should return the gradient with respect to outputs of:

    sum_i(log(p(output_i | input_i, params))) + temperature * sum_i(log(p(training_output_i | training_input_i, params)))

Parameters:
- outputs (Outputs) – The outputs.
- inputs (Inputs) – The inputs.

Returns: A structure similar to Container but the values are of type Optional[float].

Return type: Gradients[Outputs]

abstract gradient_params(*, outputs, inputs)

Returns the gradient of the loss sum_i(L(output_i, produce_one(input_i))) with respect to params.

When the fit term temperature is set to non-zero, it should return the gradient with respect to params of:

    sum_i(L(output_i, produce_one(input_i))) + temperature * sum_i(L(training_output_i, produce_one(training_input_i)))

When used in combination with the ProbabilisticCompositionalityMixin, it returns the gradient of sum_i(log(p(output_i | input_i, params))) with respect to params. When the fit term temperature is set to non-zero, it should return the gradient with respect to params of:

    sum_i(log(p(output_i | input_i, params))) + temperature * sum_i(log(p(training_output_i | training_input_i, params)))

Parameters:
- outputs (Outputs) – The outputs.
- inputs (Inputs) – The inputs.

Returns: A version of Params with all differentiable fields from Params and values set to the gradient for each parameter.

Return type: Gradients[Params]

class d3m.primitive_interfaces.base.Gradients(*args, **kwds)

Bases: Generic[Container]

A type representing a structure similar to Container, but the values are of type Optional[float]. A value is None if the gradient for that part of the structure is not possible.

class d3m.primitive_interfaces.base.LossFunctionMixin(*args, **kwds)

Bases: Generic[Inputs, Outputs, Params, Hyperparams]

Mixin which provides abstract methods for a caller to call to inspect which loss function or functions a primitive is using internally, and to compute loss on given inputs and outputs.

abstract get_loss_functions()

Returns a list of loss functions used by the primitive. Each element of the list can be:

- A D3M metric value of the loss function used by the primitive during the last fitting.
- A primitive instance used to compute loss. Primitives can be passed to other primitives as arguments, so some primitives can accept another primitive as a loss function to use, or use one internally; such a primitive can expose this loss primitive to others by providing directly the instance being used during the last fitting.
- None if using a non-standard loss function. Used so that the loss function can still be exposed through the loss and losses methods.

It should return an empty list if the primitive does not use loss functions at all.

The order in the list matters because the loss function index is used for the loss and losses methods.

Returns: A list where each element is a D3M standard metric value of the loss function used, or a D3M primitive used to compute loss, or None.

Return type: Sequence[Tuple[PerformanceMetric, PrimitiveBase, None]]

loss(*, loss_function, inputs, outputs, timeout=None, iterations=None)

Returns the loss sum_i(L(output_i, produce_one(input_i))) for all (input_i, output_i) pairs, using a loss function used by the primitive during the last fitting, identified by the loss_function index into the list of loss functions as returned by get_loss_functions.

By default it calls losses and tries to automatically compute a sum, but subclasses can implement a more efficient or even correct version.

Parameters:
- loss_function (int) – An index of the loss function to use.
- inputs (Inputs) – The inputs.
- outputs (Outputs) – The outputs.
- timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.
- iterations (Optional[int]) – How many of internal iterations should the primitive do.

Returns: sum_i(L(output_i, produce_one(input_i))) for all (input_i, output_i) pairs, wrapped inside a CallResult. The number of returned samples is always 1. The number of columns should match the number of target columns in outputs.

Return type: CallResult[Outputs]

abstract losses(*, loss_function, inputs, outputs, timeout=None, iterations=None)

Returns the loss L(output_i, produce_one(input_i)) for each (input_i, output_i) pair, using a loss function used by the primitive during the last fitting, identified by the loss_function index into the list of loss functions as returned by get_loss_functions.

Parameters:
- loss_function (int) – An index of the loss function to use.
- inputs (Inputs) – The inputs.
- outputs (Outputs) – The outputs.
- timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.
- iterations (Optional[int]) – How many of internal iterations should the primitive do.

Returns: L(output_i, produce_one(input_i)) for each (input_i, output_i) pair, wrapped inside a CallResult. The number of columns should match the number of target columns in outputs.

Return type: CallResult[Outputs]
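
A hedged, caller-side sketch of using this mixin (the primitive, inputs, and outputs arguments are placeholders, and the primitive is assumed to report at least one loss function):

    def compute_loss(primitive, inputs, outputs):
        loss_functions = primitive.get_loss_functions()
        if not loss_functions:
            return None  # the primitive does not use loss functions at all

        # Per-sample losses and their sum, both wrapped inside CallResult,
        # using the first reported loss function (index 0).
        per_sample = primitive.losses(loss_function=0, inputs=inputs, outputs=outputs)
        total = primitive.loss(loss_function=0, inputs=inputs, outputs=outputs)
        return per_sample.value, total.value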

class d3m.primitive_interfaces.base.MultiCallResult(values, has_finished=True, iterations_done=None)

Bases: object

Similar to CallResult, but used by multi_produce.

It has no precise typing information because the type would have to be a dependent type, which is not (yet) supported in standard Python typing. The type would depend on the produce_methods argument and the output types of the corresponding produce methods.

class d3m.primitive_interfaces.base.NeuralNetworkModuleMixin(*args, **kwds)

Bases: Generic[Inputs, Outputs, Params, Hyperparams, Module]

Mixin which provides an abstract method for connecting neural network modules together. The mixin is parameterized with the type variable Module. These modules can be either single layers, or they can be blocks of layers. The construction of these modules is done by mapping the neural network to the pipeline structure, where primitives (exposing modules through this abstract method) are passed to follow-up layers through hyper-parameters. The whole such structure is then passed one final time as a hyper-parameter to a training primitive, which then builds the internal representation of the neural network and trains it.

abstract get_neural_network_module(*, input_module)

Returns a neural network module corresponding to this primitive. That module might already be connected to other modules, which can be done by the primitive calling this method recursively on other primitives. If this is the initial layer of the neural network, its input is provided through the input_module argument.

Parameters:
- input_module (Module) – The input module to the initial layer of the neural network.

Returns: The Module instance corresponding to this primitive.

Return type: Module

class d3m.primitive_interfaces.base.NeuralNetworkObjectMixin(*args, **kwds)

Bases: Generic[Inputs, Outputs, Params, Hyperparams, Module]

Mixin which provides an abstract method which returns auxiliary objects for use in representing neural networks as pipelines: loss functions, optimizers, etc.

One should consider the use of other primitive metadata (primitive family, algorithm types) to describe the primitive implementing this mixin and to limit primitives in hyper-parameters.

abstract get_neural_network_object(module)

Returns a neural network object. The object is opaque from the perspective of the pipeline. The caller is responsible for assuring that the returned object is of the correct type and interface and that it is passed on to a correct consumer which understands the object.

Parameters:
- module (Module) – The module representing the neural network for which the object is requested. It should always be provided, even if a particular implementation does not use it.

Returns: An opaque object.

class d3m.primitive_interfaces.base.PrimitiveBase(*, hyperparams, random_seed=0, docker_containers=None, volumes=None, temporary_directory=None)

Bases: Generic[Inputs, Outputs, Params, Hyperparams]

A base class for primitives.

The class is parameterized using four type variables: Inputs, Outputs, Params, and Hyperparams. Params has to be a subclass of d3m.metadata.params.Params and should define all fields and their types for parameters which the primitive is fitting. Hyperparams has to be a subclass of d3m.metadata.hyperparams.Hyperparams. Hyper-parameters are those primitive's parameters which the primitive is not fitting and which generally do not change during the life-time of a primitive. Params and Hyperparams have to be picklable and copyable. See the pickle, copy, and copyreg Python modules for more information.

In this context we use the term method arguments to mean both formal parameters and actual parameters of a method. We do this to not confuse method parameters with primitive parameters (Params).

All arguments to all methods are keyword-only. No *args or **kwargs should ever be used in any method.

The standardized interface uses a few public attributes and no other public attributes are allowed, to assure future compatibility. For your own attributes use the convention that private symbols should start with _.

Primitives can have methods which are not part of the standardized interface classes:

- Additional "produce" methods which are prefixed with produce_ and have the same semantics as produce but potentially return different output container types instead of Outputs (in such a primitive, Outputs is seen as the primary output type, but the primitive also has secondary output types). They should return CallResult and have timeout and iterations arguments.
- Private methods prefixed with _.

No other public additional methods are allowed. If this represents a problem for you, open an issue. (The rationale is that for other methods an automatic system would not understand the semantics of the method.)

Method arguments which start with _ are seen as private and can be used for arguments useful for debugging and testing, but they should not be used by (or even known to) a caller during normal execution. Such arguments have to be optional (have a default value) so that the method can be called without knowledge of the argument.

All arguments to all methods and all hyper-parameters together are seen as arguments to the primitive as a whole. They are identified by their names. This means that any argument name must have the same type and semantics across all methods, effectively being the same argument. If a method argument matches a hyper-parameter in name, it has to match it in type and semantics as well; such a method argument overrides the hyper-parameter for that method call. All this is necessary so that callers have an easier time determining what values to pass to arguments, and so that it is easier to describe all the inputs to a primitive as a whole (the set of all arguments).

To recap, subclasses can extend arguments of standard methods with explicit typed keyword arguments used for the method call, or define new "produce" methods with arbitrary explicit typed keyword arguments. There are multiple kinds of such arguments allowed:

- An (additional) input argument of any container type, not necessarily of Inputs (in such a primitive, Inputs is seen as the primary input type, but the primitive also has secondary input types).
- An argument which overrides a hyper-parameter for the duration of the call. It should match a hyper-parameter in name and type. It should be a required argument (no default value) which the caller has to supply (either with the default value of the hyper-parameter, or with the same value as was passed to the constructor, or with some other value). This is meant just for fine control by a caller during fitting or producing, e.g., for a threshold or learning rate, and is not reasonable for most hyper-parameters.
- An (additional) value argument which is one of the standard data types, but not a container type. In this case a caller will try to satisfy the input by creating a part of a pipeline which ends with a primitive with a singleton produce method, extract the singleton value, and pass it without a container. This kind of argument is discouraged and should probably be a hyper-parameter instead (because it is unclear how a caller can determine which value is reasonable to pass in an automatic way), but it is defined for completeness and so that existing pipelines can be more easily described.
- A private argument prefixed with _ which is used for debugging and testing. It should not be used by (or even known to) a caller during normal execution. Such an argument has to be optional (have a default value) so that the method can be called without knowledge of the argument.

Each primitive's class automatically gets an instance of Python's logging logger stored into its logger class attribute. The instance is made under the name of the primitive's python_path metadata value. Primitives can use this logger to log information at various levels (debug, warning, error) and even associate extra data with a log record using the extra argument to the logger calls.

Subclasses of this class allow functional compositionality.
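
From a caller's perspective, the basic lifecycle of a primitive follows directly from the methods below. A hedged sketch (the primitive instance and the data variables are placeholders):

    def fit_and_predict(primitive, train_inputs, train_outputs, test_inputs):
        # Set training data, fit, and then produce predictions for new inputs.
        primitive.set_training_data(inputs=train_inputs, outputs=train_outputs)
        primitive.fit()
        result = primitive.produce(inputs=test_inputs)
        return result.value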

__getstate__()

Returns the state which is used to pickle an instance of a primitive.

By default it returns the standard constructor arguments and the value returned from the get_params method.

Consider extending the default implementation if your primitive accepts additional constructor arguments you would like to preserve when pickling.

Note that unpickled primitive instances can generally continue to work only inside the same environment they were pickled in, because they continue to use the same docker_containers, volumes, and temporary_directory values passed initially to the primitive's constructor. Those generally do not work in another environment where those resources might be available differently. Consider constructing the primitive instance directly, providing updated constructor arguments, and then using get_params/set_params to restore the primitive's state.

Returns: State to pickle.

__setstate__(state)

Uses state to restore the state of a primitive when unpickling.

By default it passes constructor arguments to the constructor and calls set_params.

Parameters:
- state (dict) – Unpickled state.

Return type: None
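
A hedged sketch of the resulting behaviour (the primitive argument is a placeholder; as noted above, the restored copy generally works only inside the same environment it was pickled in):

    import pickle

    def clone_via_pickle(primitive):
        # __getstate__ captures constructor arguments and get_params();
        # unpickling calls the constructor and then restores the parameters.
        return pickle.loads(pickle.dumps(primitive))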

abstract fit(*, timeout=None, iterations=None)

Fits the primitive using inputs and outputs (if any) using currently set training data.

The returned value should be a CallResult object with value set to None.

If fit has already been called in the past on different training data, this method fits it again from scratch using currently set training data.

On the other hand, a caller can call fit multiple times on the same training data to continue fitting.

If fit fully fits using the provided training data, there is no point in making further calls to this method with the same training data, and in fact further calls can be noops, or a primitive can decide to fully refit from scratch.

In the case fitting can continue with the same training data (even if it is maybe not reasonable, because the internal metric the primitive is using suggests that fitting will be degrading), if fit is called again (without setting training data), the primitive has to continue fitting.

A caller can provide timeout information to guide the length of the fitting process. Ideally, a primitive should adapt its fitting process to try to do the best fitting possible inside the time allocated. If this is not possible and the primitive reaches the timeout before fitting, it should raise a TimeoutError exception to signal that fitting was unsuccessful in the given time. The state of the primitive after the exception should be as if the method call had never happened, and the primitive should continue to operate normally. The purpose of timeout is to give an opportunity to a primitive to cleanly manage its state instead of having execution interrupted from outside. Maintaining stable internal state should have precedence over respecting the timeout (the caller can terminate a misbehaving primitive from outside anyway). If a longer timeout would produce different fitting, then CallResult's has_finished should be set to False.

Some primitives have internal fitting iterations (for example, epochs). For those, a caller can provide how many of the primitive's internal iterations the primitive should do before returning. Primitives should make iterations as small as reasonable. If iterations is None, then there is no limit on how many iterations the primitive should do and the primitive should choose the best number of iterations on its own (potentially controlled through hyper-parameters). If iterations is a number, a primitive has to do that number of iterations (even if not reasonable), if possible. timeout should still be respected, and potentially fewer iterations can be done because of that. Primitives with internal iterations should make CallResult contain correct values.

For primitives which do not have internal iterations, any value of iterations means that they should fit fully, respecting only timeout.

Parameters:
- timeout (Optional[float]) – A maximum time this primitive should take to fit during this method call, in seconds.
- iterations (Optional[int]) – How many of internal iterations should the primitive do.

Returns: A CallResult with None value.

Return type: CallResult[None]
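
A hedged, caller-side sketch of driving fitting incrementally for a primitive with internal iterations (the step size, timeout, and call limit are arbitrary):

    def fit_incrementally(primitive, *, step_iterations=10, max_calls=100):
        # Ask for a few internal iterations at a time and stop once the
        # primitive reports that more time or iterations cannot change anything.
        for _ in range(max_calls):
            result = primitive.fit(timeout=60, iterations=step_iterations)
            if result.has_finished:
                break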

fit_multi_produce(*, produce_methods, inputs, outputs, timeout=None, iterations=None)

A method calling fit and after that multiple produce methods at once.

This method allows the primitive author to implement an optimized version of both fitting and producing a primitive on the same data.

If any additional method arguments are added to the primitive's set_training_data method or produce method(s), or removed from them, they have to be added to or removed from this method as well. This method should accept a union of all arguments accepted by the primitive's set_training_data method and produce method(s) and then use them accordingly when computing results. Despite accepting all arguments, they can be passed as None by the caller when they are not needed by any of the produce methods in produce_methods and by set_training_data.

The default implementation of this method just calls first the set_training_data method, then the fit method, and then all produce methods listed in produce_methods in order, and is potentially inefficient.

Parameters:
- produce_methods (Sequence[str]) – A list of names of produce methods to call.
- inputs (Inputs) – The inputs given to set_training_data and all produce methods.
- outputs (Outputs) – The outputs given to set_training_data.
- timeout (Optional[float]) – A maximum time this primitive should take to both fit the primitive and produce outputs for all produce methods listed in the produce_methods argument, in seconds.
- iterations (Optional[int]) – How many of internal iterations should the primitive do for both fitting and producing outputs of all produce methods.

Returns: A dict of values for each produce method wrapped inside MultiCallResult.

Return type: MultiCallResult
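
A hedged, caller-side sketch (the primitive and data variables are placeholders, and the per-method values are assumed to be exposed through MultiCallResult's values attribute, keyed by method name):

    def fit_and_produce(primitive, train_inputs, train_outputs):
        # By default equivalent to set_training_data + fit + produce, but a
        # primitive author may provide an optimized implementation.
        results = primitive.fit_multi_produce(
            produce_methods=['produce'],
            inputs=train_inputs,
            outputs=train_outputs,
        )
        return results.values['produce']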

abstract get_params()

Returns parameters of this primitive.

Parameters are all parameters of the primitive which can potentially change during the life-time of a primitive. Parameters which cannot change are passed through the constructor.

Parameters should include all data which is necessary to create a new instance of this primitive behaving exactly the same as this instance, when the new instance is created by passing the same arguments to the class constructor and calling set_params.

No other arguments to the method are allowed (except for private arguments).

Returns: An instance of parameters.

Return type: Params

multi_produce(*, produce_methods, inputs, timeout=None, iterations=None)

A method calling multiple produce methods at once.

When a primitive has multiple produce methods, it is common that they might compute the same internal results for the same inputs but return different representations of those results. If a caller is interested in multiple of those representations, calling multiple produce methods might lead to recomputing the same internal results multiple times. To address this, this method allows the primitive author to implement an optimized version which computes internal results only once for multiple calls of produce methods, but returns those different representations.

If any additional method arguments are added to the primitive's produce method(s), they have to be added to this method as well. This method should accept a union of all arguments accepted by the primitive's produce method(s) and then use them accordingly when computing results. Despite accepting all arguments, they can be passed as None by the caller when they are not needed by any of the produce methods in produce_methods.

The default implementation of this method just calls all produce methods listed in produce_methods in order and is potentially inefficient.

If the primitive should have been fitted before calling this method, but it has not been, the primitive should raise a PrimitiveNotFittedError exception.

Parameters:
- produce_methods (Sequence[str]) – A list of names of produce methods to call.
- inputs (Inputs) – The inputs given to all produce methods.
- timeout (Optional[float]) – A maximum time this primitive should take to produce outputs for all produce methods listed in the produce_methods argument, in seconds.
- iterations (Optional[int]) – How many of internal iterations should the primitive do.

Returns: A dict of values for each produce method wrapped inside MultiCallResult.

Return type: MultiCallResult
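
A hedged, caller-side sketch of requesting two representations at once from a fitted primitive which also defines a hypothetical produce_feature_importances method (again assuming MultiCallResult exposes per-method values through its values attribute):

    def produce_both(primitive, inputs):
        # Internal results are ideally computed only once for both methods.
        results = primitive.multi_produce(
            produce_methods=['produce', 'produce_feature_importances'],
            inputs=inputs,
        )
        return results.values['produce'], results.values['produce_feature_importances']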

abstract produce(*, inputs, timeout=None, iterations=None)

Produce the primitive's best choice of output for each of the inputs.

The output value should be wrapped inside a CallResult object before returning.

In many cases producing an output is a quick operation in comparison with fit, but not all cases are like that. For example, a primitive can start a potentially long optimization process to compute outputs. timeout and iterations can serve as a way for a caller to guide the length of this process.

Ideally, a primitive should adapt its call to try to produce the best outputs possible inside the time allocated. If this is not possible and the primitive reaches the timeout before producing outputs, it should raise a TimeoutError exception to signal that the call was unsuccessful in the given time. The state of the primitive after the exception should be as if the method call had never happened, and the primitive should continue to operate normally. The purpose of timeout is to give an opportunity to a primitive to cleanly manage its state instead of having execution interrupted from outside. Maintaining stable internal state should have precedence over respecting the timeout (the caller can terminate a misbehaving primitive from outside anyway). If a longer timeout would produce different outputs, then CallResult's has_finished should be set to False.

Some primitives have internal iterations (for example, optimization iterations). For those, a caller can provide how many of the primitive's internal iterations the primitive should do before returning outputs. Primitives should make iterations as small as reasonable. If iterations is None, then there is no limit on how many iterations the primitive should do and the primitive should choose the best number of iterations on its own (potentially controlled through hyper-parameters). If iterations is a number, a primitive has to do that number of iterations, if possible. timeout should still be respected, and potentially fewer iterations can be done because of that. Primitives with internal iterations should make CallResult contain correct values.

For primitives which do not have internal iterations, any value of iterations means that they should run fully, respecting only timeout.

If the primitive should have been fitted before calling this method, but it has not been, the primitive should raise a PrimitiveNotFittedError exception.

Parameters:
- inputs (Inputs) – The inputs of shape [num_inputs, …].
- timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.
- iterations (Optional[int]) – How many of internal iterations should the primitive do.

Returns: The outputs of shape [num_inputs, …] wrapped inside CallResult.

Return type: CallResult[Outputs]

abstract set_params(*, params)

Sets parameters of this primitive.

Parameters are all parameters of the primitive which can potentially change during the life-time of a primitive. Parameters which cannot change are passed through the constructor.

No other arguments to the method are allowed (except for private arguments).

Parameters:
- params (Params) – An instance of parameters.

Return type: None

abstract set_training_data(*, inputs, outputs)

Sets current training data of this primitive.

This marks training data as changed even if the new training data is the same as the previous training data.

Standard subclasses in this package do not adhere to the Liskov substitution principle when inheriting this method, because they do not necessarily accept all arguments found in the base class. This means that one has to inspect which arguments are accepted at runtime, or in other words, one has to inspect exactly which subclass a primitive implements, if accepting a wider range of primitives. This relaxation is allowed only for standard subclasses found in this package. Primitives themselves should not break the Liskov substitution principle but should inherit from a suitable base class.

Parameters:
- inputs (Inputs) – The inputs.
- outputs (Outputs) – The outputs.

Return type: None

docker_containers: Dict[str, d3m.primitive_interfaces.base.DockerContainer]

A dict mapping Docker image keys from the primitive's metadata to (named) tuples containing the container's address under which the container is accessible by the primitive, and a dict mapping exposed ports to ports on that address.
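
A hedged sketch of how a primitive might use this attribute from inside one of its methods (the image key and exposed-port key are hypothetical and depend on the primitive's metadata and the container configuration):

    def _service_endpoint(self):
        # Look up the running container for a hypothetical image key declared
        # in this primitive's metadata and build an address:port endpoint.
        container = self.docker_containers['my_service']
        port = container.ports['8080/tcp']  # hypothetical exposed-port key
        return '{address}:{port}'.format(address=container.address, port=port)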

logger: ClassVar[logging.Logger]

Primitive's logger. Available as a class attribute. This gets automatically set to the primitive's logger in the metaclass.
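
For example, inside a primitive method one might log at various levels and attach extra data (a hedged sketch; the message and extra fields are arbitrary):

    def _log_progress(self, iterations_done):
        # The logger is created under the primitive's python_path metadata value.
        self.logger.debug('Finished an internal iteration.', extra={'iterations_done': iterations_done})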

metadata: ClassVar[d3m.metadata.base.PrimitiveMetadata]

Primitive's metadata. Available as a class attribute. The primitive author should provide, inside the code, all fields which cannot be determined automatically. In this way metadata stays close to the code and it is easier for consumers to make sure the metadata they are using really matches the code they are using. The PrimitiveMetadata class updates itself with metadata about code and other things it can extract automatically.

class d3m.primitive_interfaces.base.ProbabilisticCompositionalityMixin(*args, **kwds)

Bases: Generic[Inputs, Outputs, Params, Hyperparams]

This mixin provides additional abstract methods which primitives should implement to help callers with doing various end-to-end refinements using probabilistic compositionality.

This mixin adds methods to support at least:

- Metropolis-Hastings

This mixin should be used together with the SamplingCompositionalityMixin mixin.
mixin.-
log_likelihood
(*, outputs, inputs, timeout=None, iterations=None)[source]¶ Returns log probability of outputs given inputs and params under this primitive:
sum_i(log(p(output_i | input_i, params)))
By default it calls
log_likelihoods
and tries to automatically compute a sum, but subclasses can implement a more efficient or even correct version.- Parameters
outputs (~Outputs) – The outputs. The number of samples should match
inputs
.inputs (~Inputs) – The inputs. The number of samples should match
outputs
.timeout (
Optional
[float
]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.iterations (
Optional
[int
]) – How many of internal iterations should the primitive do.
- Return type
CallResult
[~Outputs]- Returns
sum_i(log(p(output_i | input_i, params))) wrapped inside
CallResult
. The number of returned samples is always 1. The number of columns should match the number of target columns inoutputs
.

abstract log_likelihoods(*, outputs, inputs, timeout=None, iterations=None)

Returns the log probability of outputs given inputs and params under this primitive:

    log(p(output_i | input_i, params))

Parameters:
- outputs (Outputs) – The outputs. The number of samples should match inputs.
- inputs (Inputs) – The inputs. The number of samples should match outputs.
- timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.
- iterations (Optional[int]) – How many of internal iterations should the primitive do.

Returns: log(p(output_i | input_i, params)) for each (input_i, output_i) pair, wrapped inside a CallResult. The number of columns should match the number of target columns in outputs.

Return type: CallResult[Outputs]

class d3m.primitive_interfaces.base.SamplingCompositionalityMixin(*args, **kwds)

Bases: Generic[Inputs, Outputs, Params, Hyperparams]

This mixin signals to a caller that the primitive is probabilistic but may be likelihood-free.

abstract sample(*, inputs, num_samples=1, timeout=None, iterations=None)

Sample an output for each input from inputs, num_samples times.

The semantics of timeout and iterations are the same as in produce.

Parameters:
- inputs (Inputs) – The inputs of shape [num_inputs, …].
- num_samples (int) – The number of samples to return in a set of samples.
- timeout (Optional[float]) – A maximum time this primitive should take to sample outputs during this method call, in seconds.
- iterations (Optional[int]) – How many of internal iterations should the primitive do.

Returns: The multiple sets of samples of shape [num_samples, num_inputs, …] wrapped inside CallResult. While the output value type is specified as Sequence[Outputs], the output value can in fact be any container type with dimensions/shape equal to the combined Sequence[Outputs] dimensions/shape. Subclasses should specify which exact type the output is.

Return type: CallResult[Sequence[Outputs]]
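
A hedged, caller-side sketch of drawing several samples of outputs for the same inputs (the primitive and inputs arguments are placeholders):

    def draw_samples(primitive, inputs):
        result = primitive.sample(inputs=inputs, num_samples=10)
        # result.value has shape [num_samples, num_inputs, ...] (or an
        # equivalent container type, as described above).
        return result.value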

d3m.primitive_interfaces.base.inputs_across_samples(func=None, inputs=None, *args)

A produce method can use this decorator to signal which of the inputs (arguments) it uses across all samples and not sample by sample.

For many produce methods it does not matter whether they are called 100 times on 1 sample or once on 100 samples, but not all produce methods are like that and some produce results which depend on the full set of inputs given to them. If just a subset of inputs is given, results are different. An example of this is a produce_distance_matrix method which returns an NxN matrix, where N is the number of samples, computing a distance from each sample to each other sample.

When inputs have a primary key without a uniqueness constraint, then "sample" for the purpose of this decorator means all samples with the same primary key value.

The decorator accepts a list of input argument names which are used across all samples. By default, the inputs argument name is used.
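
A hedged sketch of such a produce method (meant to live inside a primitive class; the distance computation helper is hypothetical), relying on the default behaviour of marking the inputs argument:

    from d3m.primitive_interfaces.base import CallResult, inputs_across_samples

    @inputs_across_samples
    def produce_distance_matrix(self, *, inputs, timeout=None, iterations=None):
        # The NxN result depends on all inputs jointly, so it cannot be
        # computed sample by sample.
        return CallResult(self._pairwise_distances(inputs))  # hypothetical helper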

d3m.primitive_interfaces.base.singleton(f)

If a produce method uses this decorator, it signals that all outputs from the produce method are sequences of length 1. This is useful because a caller can then directly extract this element.

Examples of such produce methods are produce methods of primitives which compute loss, returning one number for multiple inputs. With this decorator they can return a sequence with this one number, and a caller which cares about the loss can extract it, while other callers which operate only on sequences can continue to operate normally.

We can see other produce methods as mapping produce methods, and produce methods with this decorator as reducing produce methods.
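
A hedged sketch of such a reducing produce method (meant to live inside a primitive class; the scoring helper is hypothetical):

    from d3m.primitive_interfaces.base import CallResult, singleton

    @singleton
    def produce_score(self, *, inputs, timeout=None, iterations=None):
        # One number for all inputs, wrapped in a length-1 sequence so that
        # callers can extract the single element directly.
        return CallResult([self._score(inputs)])  # hypothetical helper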