d3m.primitive_interfaces.base
- class d3m.primitive_interfaces.base.CallResult(value, has_finished=True, iterations_done=None)

  Bases: Generic[T]

  Some methods return additional metadata about the method call itself (which is different from metadata about the value returned, which is stored in the metadata attribute of the value itself).

  For a produce method call, has_finished is True if the last call to produce has produced the final outputs and a call with more time or more iterations cannot get different outputs.

  For a fit method call, has_finished is True if a primitive has been fully fitted on the current training data and further calls to fit are unnecessary and will not change anything. False means that more iterations can be done (but it does not necessarily mean that more iterations are beneficial).

  If a primitive has iterations internally, then iterations_done contains how many of those iterations have been made during the last call. If the primitive does not support them, iterations_done is None.

  Those methods should return their value wrapped into this class.
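  For illustration, a produce method might wrap its computed outputs like this. This is only a minimal sketch; the DataFrame construction stands in for the primitive's real computation:

    from d3m import container
    from d3m.primitive_interfaces.base import CallResult

    # Sketch of a produce method body: wrap the computed outputs in a CallResult.
    # has_finished=True signals that more time or iterations would not change the
    # result; iterations_done=None signals that there are no internal iterations.
    def produce(self, *, inputs, timeout=None, iterations=None):
        outputs = container.DataFrame(inputs, generate_metadata=True)  # stand-in computation
        return CallResult(value=outputs, has_finished=True, iterations_done=None)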
- class d3m.primitive_interfaces.base.ContinueFitMixin(*args, **kwds)

  Bases: Generic[Inputs, Outputs, Params, Hyperparams]

  Abstract base class for generic types.

  A generic type is typically declared by inheriting from this class parameterized with one or more type variables. For example, a generic mapping type might be defined as:

    class Mapping(Generic[KT, VT]):
        def __getitem__(self, key: KT) -> VT:
            ...
        # Etc.

  This class can then be used as follows:

    def lookup_name(mapping: Mapping[KT, VT], key: KT, default: VT) -> VT:
        try:
            return mapping[key]
        except KeyError:
            return default
- abstract continue_fit(*, timeout=None, iterations=None)

  Similar to the base fit, this method fits the primitive using inputs and outputs (if any) using the currently set training data.

  The difference is what happens when the currently set training data is different from what the primitive might have already been fitted on. fit resets parameters and refits the primitive (restarts fitting), while continue_fit fits the primitive further on the new training data. fit does not have to be called before continue_fit; calling continue_fit first starts fitting as well.

  The caller can also call continue_fit multiple times on the same training data, in which case the primitive should try to improve the fit in the same way as with fit.

  From the perspective of a caller of all other methods, the training data in effect is still just the currently set training data. If a caller wants to call gradient_output on all data on which the primitive has been fitted through multiple calls of continue_fit on different training data, the caller should pass all this data themselves through another call to set_training_data, not call fit or continue_fit again, and use the gradient_output method. In this way primitives which truly support continuation of fitting and need only the latest data to do another fitting do not have to keep all past training data around themselves.

  If a primitive supports this mixin, then both fit and continue_fit can be called. continue_fit always continues fitting, whether fitting was started through fit or continue_fit, as long as fitting has not already finished. Calling fit always restarts fitting after continue_fit has been called, even if training data has not changed.

  Primitives supporting this mixin which operate on categorical target columns should use all_distinct_values metadata to obtain all values (labels) which can appear in a target column, even if the currently set training data does not contain all those values.

  - Parameters

  - Returns

    A CallResult with None value.

  - Return type

    CallResult[None]
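  A minimal sketch of these semantics, assuming a primitive with a hypothetical incremental model (_model, _new_model, _update, _training_inputs, and _training_outputs are made-up internals):

    from d3m.primitive_interfaces.base import CallResult

    def continue_fit(self, *, timeout=None, iterations=None):
        # If fit/continue_fit was never called, continue_fit starts fitting as well.
        if self._model is None:
            self._model = self._new_model()
        # Fit further on the currently set training data instead of restarting.
        self._update(self._model, self._training_inputs, self._training_outputs)
        return CallResult(None, has_finished=False)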
- class d3m.primitive_interfaces.base.DockerContainer(address, ports)

  Bases: tuple

  A tuple suitable to describe the connection information necessary to connect to exposed ports of a running Docker container.
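  For illustration, a runtime might pass connection information like this to a primitive (the address and port values are made up):

    from d3m.primitive_interfaces.base import DockerContainer

    # A container exposing port 8000/tcp, remapped by the runtime to port 49153
    # on the given address.
    connection = DockerContainer(address='172.17.0.2', ports={'8000/tcp': '49153'})
    print(connection.address, connection.ports['8000/tcp'])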
- class d3m.primitive_interfaces.base.GradientCompositionalityMixin(*args, **kwds)

  Bases: Generic[Inputs, Outputs, Params, Hyperparams]

  This mixin provides additional abstract methods which primitives should implement to help callers with doing various end-to-end refinements using gradient-based compositionality.

  This mixin adds methods to support at least:

  - gradient-based, compositional end-to-end training
  - regularized pre-training
  - multi-task adaptation
  - black box variational inference
  - Hamiltonian Monte Carlo
- abstract backward(*, gradient_outputs, fine_tune=False, fine_tune_learning_rate=1e-05, fine_tune_weight_decay=1e-05)

  Returns the gradient with respect to inputs and with respect to params of a loss that is being backpropagated end-to-end in a pipeline.

  This is the standard backpropagation algorithm: backpropagation needs to be preceded by a forward propagation (a forward method call).

  - Parameters

    - gradient_outputs (Gradients[~Outputs]) – The gradient of the loss with respect to this primitive's output. During backpropagation, this comes from the next primitive in the pipeline, i.e., the primitive whose input is the output of this primitive during the forward execution with forward (and produce).
    - fine_tune (bool) – If True, executes a fine-tuning gradient descent step as a part of this call. This provides the most straightforward way of end-to-end training/fine-tuning.
    - fine_tune_learning_rate (float) – Learning rate for end-to-end training/fine-tuning gradient descent steps.
    - fine_tune_weight_decay (float) – L2 regularization (weight decay) coefficient for end-to-end training/fine-tuning gradient descent steps.

  - Returns

    A tuple of the gradient with respect to inputs and with respect to params.

  - Return type
- forward(*, inputs)

  Similar to the produce method, but meant to be used for a forward pass during backpropagation-based end-to-end training. A primitive can implement it differently than produce, e.g., the forward pass during training can enable dropout layers, or produce might not compute gradients while forward does.

  By default it calls produce for one iteration.

  - Parameters

    inputs (~Inputs) – The inputs of shape [num_inputs, …].

  - Returns

    The outputs of shape [num_inputs, …].

  - Return type

    ~Outputs
- abstract gradient_output(*, outputs, inputs)

  Returns the gradient of the loss sum_i(L(output_i, produce_one(input_i))) with respect to outputs.

  When the fit term temperature is set to non-zero, it should return the gradient with respect to outputs of:

    sum_i(L(output_i, produce_one(input_i))) + temperature * sum_i(L(training_output_i, produce_one(training_input_i)))

  When used in combination with the ProbabilisticCompositionalityMixin, it returns the gradient of sum_i(log(p(output_i | input_i, params))) with respect to outputs. When the fit term temperature is set to non-zero, it should return the gradient with respect to outputs of:

    sum_i(log(p(output_i | input_i, params))) + temperature * sum_i(log(p(training_output_i | training_input_i, params)))

  - Parameters

    - outputs (~Outputs) – The outputs.
    - inputs (~Inputs) – The inputs.

  - Returns

    A structure similar to Container but the values are of type Optional[float].

  - Return type

    Gradients[~Outputs]
- abstract gradient_params(*, outputs, inputs)

  Returns the gradient of the loss sum_i(L(output_i, produce_one(input_i))) with respect to params.

  When the fit term temperature is set to non-zero, it should return the gradient with respect to params of:

    sum_i(L(output_i, produce_one(input_i))) + temperature * sum_i(L(training_output_i, produce_one(training_input_i)))

  When used in combination with the ProbabilisticCompositionalityMixin, it returns the gradient of sum_i(log(p(output_i | input_i, params))) with respect to params. When the fit term temperature is set to non-zero, it should return the gradient with respect to params of:

    sum_i(log(p(output_i | input_i, params))) + temperature * sum_i(log(p(training_output_i | training_input_i, params)))

  - Parameters

    - outputs (~Outputs) – The outputs.
    - inputs (~Inputs) – The inputs.

  - Returns

    A version of Params with all differentiable fields from Params and values set to the gradient for each parameter.

  - Return type

    Gradients[~Params]
- class d3m.primitive_interfaces.base.Gradients(*args, **kwds)

  Bases: Generic[Container]

  A type representing a structure similar to Container, but the values are of type Optional[float]. A value is None if the gradient for that part of the structure is not possible.
- class d3m.primitive_interfaces.base.LossFunctionMixin(*args, **kwds)

  Bases: Generic[Inputs, Outputs, Params, Hyperparams]

  Mixin which provides abstract methods for a caller to call to inspect which loss function or functions a primitive is using internally, and to compute loss on given inputs and outputs.
- abstract get_loss_functions()

  Returns a list of loss functions used by the primitive. Each element of the list can be:

  - A D3M metric value of the loss function used by the primitive during the last fitting.
  - A primitive instance used to compute the loss. Primitives can be passed to other primitives as arguments, so some primitives can accept another primitive as a loss function to use, or use one internally. A primitive can expose this loss primitive to others by providing directly the instance of the primitive used during the last fitting.
  - None if a non-standard loss function is used. This is used so that the loss function can still be exposed through the loss and losses methods.

  It should return an empty list if the primitive does not use loss functions at all.

  The order in the list matters because the loss function index is used for the loss and losses methods.

  - Return type

    Sequence[Union[PerformanceMetric, PrimitiveBase, None]]

  - Returns

    A list where each element is a D3M standard metric value of the loss function used, a D3M primitive used to compute the loss, or None.
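  A sketch of what an implementation might return, assuming the primitive internally minimizes a single standard loss (mean squared error) and assuming the d3m.metadata.problem.PerformanceMetric enumeration:

    from d3m.metadata.problem import PerformanceMetric

    def get_loss_functions(self):
        # One standard loss, used during the last fitting; an empty list would mean
        # the primitive does not use loss functions at all.
        return [PerformanceMetric.MEAN_SQUARED_ERROR]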
- loss(*, loss_function, inputs, outputs, timeout=None, iterations=None)

  Returns the loss sum_i(L(output_i, produce_one(input_i))) for all (input_i, output_i) pairs, using the loss function used by the primitive during the last fitting, identified by the loss_function index into the list of loss functions as returned by get_loss_functions.

  By default it calls losses and tries to automatically compute a sum, but subclasses can implement a more efficient or even more correct version.

  - Parameters

    - loss_function (int) – An index of the loss function to use.
    - inputs (~Inputs) – The inputs.
    - outputs (~Outputs) – The outputs.
    - timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.
    - iterations (Optional[int]) – How many of its internal iterations the primitive should do.

  - Return type

    CallResult[~Outputs]

  - Returns

    sum_i(L(output_i, produce_one(input_i))) for all (input_i, output_i) pairs wrapped inside a CallResult. The number of returned samples is always 1. The number of columns should match the number of target columns in outputs.
- abstract losses(*, loss_function, inputs, outputs, timeout=None, iterations=None)

  Returns the loss L(output_i, produce_one(input_i)) for each (input_i, output_i) pair, using the loss function used by the primitive during the last fitting, identified by the loss_function index into the list of loss functions as returned by get_loss_functions.

  - Parameters

    - loss_function (int) – An index of the loss function to use.
    - inputs (~Inputs) – The inputs.
    - outputs (~Outputs) – The outputs.
    - timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.
    - iterations (Optional[int]) – How many of its internal iterations the primitive should do.

  - Return type

    CallResult[~Outputs]

  - Returns

    L(output_i, produce_one(input_i)) for each (input_i, output_i) pair wrapped inside a CallResult. The number of columns should match the number of target columns in outputs.
- class d3m.primitive_interfaces.base.MultiCallResult(values, has_finished=True, iterations_done=None)

  Bases: object

  Similar to CallResult, but used by multi_produce.

  It has no precise typing information because the type would have to be a dependent type, which is not (yet) supported in standard Python typing. The type would depend on the produce_methods argument and the output types of the corresponding produce methods.
- class d3m.primitive_interfaces.base.NeuralNetworkModuleMixin(*args, **kwds)

  Bases: Generic[Inputs, Outputs, Params, Hyperparams, Module]

  Mixin which provides an abstract method for connecting neural network modules together. The mixin is parameterized with the type variable Module. These modules can be either single layers or blocks of layers. The construction of these modules is done by mapping the neural network to the pipeline structure, where primitives (exposing modules through this abstract method) are passed to follow-up layers through hyper-parameters. The whole structure is then passed for the final time as a hyper-parameter to a training primitive, which then builds the internal representation of the neural network and trains it.
- abstract get_neural_network_module(*, input_module)

  Returns a neural network module corresponding to this primitive. That module might already be connected to other modules, which can be done by the primitive calling this method recursively on other primitives. If this is the initial layer of the neural network, its input is provided through the input_module argument.

  - Parameters

    input_module (~Module) – The input module to the initial layer of the neural network.

  - Returns

    The Module instance corresponding to this primitive.

  - Return type

    ~Module
- class d3m.primitive_interfaces.base.NeuralNetworkObjectMixin(*args, **kwds)

  Bases: Generic[Inputs, Outputs, Params, Hyperparams, Module]

  Mixin which provides an abstract method which returns auxiliary objects for use in representing neural networks as pipelines: loss functions, optimizers, etc.

  One should consider the use of other primitive metadata (primitive family, algorithm types) to describe the primitive implementing this mixin and to limit which primitives are allowed in hyper-parameters.
- abstract get_neural_network_object(module)

  Returns a neural network object. The object is opaque from the perspective of the pipeline. The caller is responsible for assuring that the returned object is of the correct type and interface and that it is passed on to a correct consumer which understands the object.

  - Parameters

    module (~Module) – The module representing the neural network for which the object is requested. It should always be provided even if a particular implementation does not use it.

  - Returns

    An opaque object.

  - Return type
- class d3m.primitive_interfaces.base.PrimitiveBase(*, hyperparams, random_seed=0, docker_containers=None, volumes=None, temporary_directory=None)

  Bases: Generic[Inputs, Outputs, Params, Hyperparams]

  A base class for primitives.

  The class is parameterized using four type variables: Inputs, Outputs, Params, and Hyperparams. Params has to be a subclass of d3m.metadata.params.Params and should define all fields and their types for parameters which the primitive is fitting. Hyperparams has to be a subclass of d3m.metadata.hyperparams.Hyperparams. Hyper-parameters are those of the primitive's parameters which the primitive is not fitting and which generally do not change during the life-time of a primitive. Params and Hyperparams have to be picklable and copyable. See the pickle, copy, and copyreg Python modules for more information.

  In this context we use the term method arguments to mean both formal parameters and actual parameters of a method. We do this to not confuse method parameters with primitive parameters (Params).

  All arguments to all methods are keyword-only. No *args or **kwargs should ever be used in any method.

  The standardized interface uses a few public attributes and no other public attributes are allowed, to assure future compatibility. For your own attributes, use the convention that private symbols should start with _.

  Primitives can have methods which are not part of the standardized interface classes:

  - Additional "produce" methods which are prefixed with produce_ and have the same semantics as produce but potentially return different output container types instead of Outputs (in such a primitive, Outputs is seen as the primary output type, but the primitive also has secondary output types). They should return CallResult and have timeout and iterations arguments.
  - Private methods prefixed with _.

  No other public additional methods are allowed. If this represents a problem for you, open an issue. (The rationale is that for other methods an automatic system will not understand the semantics of the method.)

  Method arguments which start with _ are seen as private and can be used for arguments useful for debugging and testing, but they should not be used by (or even known to) a caller during normal execution. Such arguments have to be optional (have a default value) so that the method can be called without knowledge of the argument.

  All arguments to all methods and all hyper-parameters together are seen as arguments to the primitive as a whole. They are identified by their names. This means that an argument with a given name must have the same type and semantics across all methods, effectively being the same argument. If a method argument matches a hyper-parameter in name, it has to match it in type and semantics as well. Such a method argument overrides the hyper-parameter for a method call. All this is necessary so that callers have an easier time determining what values to pass to arguments and so that it is easier to describe which values are inputs to the primitive as a whole (the set of all arguments).

  To recap, subclasses can extend arguments of standard methods with explicit typed keyword arguments used for the method call, or define new "produce" methods with arbitrary explicit typed keyword arguments. There are multiple kinds of such arguments allowed:

  - An (additional) input argument of any container type and not necessarily of Inputs (in such a primitive, Inputs is seen as the primary input type, but the primitive also has secondary input types).
  - An argument which overrides a hyper-parameter for the duration of the call. It should match a hyper-parameter in name and type. It should be a required argument (no default value) which the caller has to supply (with the default value of the hyper-parameter, with the same hyper-parameter value as passed to the constructor, or with some other value). This is meant just for fine control by a caller during fitting or producing, e.g., for a threshold or learning rate, and is not reasonable for most hyper-parameters.
  - An (additional) value argument which is one of the standard data types, but not a container type. In this case a caller will try to satisfy the input by creating part of a pipeline which ends with a primitive with a singleton produce method, extract the singleton value, and pass it without a container. This kind of argument is discouraged and should probably be a hyper-parameter instead (because it is unclear how a caller can determine which value is a reasonable value to pass in an automatic way), but it is defined for completeness and so that existing pipelines can be more easily described.
  - A private argument prefixed with _ which is used for debugging and testing. It should not be used by (or even known to) a caller during normal execution. Such an argument has to be optional (have a default value) so that the method can be called without knowledge of the argument.

  Each primitive's class automatically gets an instance of Python's logging logger stored into its logger class attribute. The instance is made under the name of the primitive's python_path metadata value. Primitives can use this logger to log information at various levels (debug, warning, error) and even associate extra data with a log record using the extra argument to the logger calls.

  Subclasses of this class allow functional compositionality.
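  To make the call conventions concrete, here is a sketch of a typical caller-side lifecycle. SomePrimitive, SomeHyperparams, train_inputs, train_outputs, and test_inputs are placeholders for a concrete primitive class, its Hyperparams class, and container values:

    # Instantiate with default hyper-parameters.
    primitive = SomePrimitive(hyperparams=SomeHyperparams.defaults(), random_seed=0)

    # Fit on training data, then produce outputs for new inputs.
    primitive.set_training_data(inputs=train_inputs, outputs=train_outputs)
    primitive.fit(timeout=60.0)                     # returns CallResult[None]
    result = primitive.produce(inputs=test_inputs)  # returns CallResult[~Outputs]
    predictions = result.value

    # Fitted state can be extracted and restored on a fresh instance.
    params = primitive.get_params()
    clone = SomePrimitive(hyperparams=SomeHyperparams.defaults(), random_seed=0)
    clone.set_params(params=params)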
- __getstate__()

  Returns the state which is used to pickle an instance of a primitive.

  By default it returns the standard constructor arguments and the value returned from the get_params method.

  Consider extending the default implementation if your primitive accepts additional constructor arguments you would like to preserve when pickling.

  Note that unpickled primitive instances can generally continue to work only inside the same environment they were pickled in, because they continue to use the same docker_containers, volumes, and temporary_directory values passed initially to the primitive's constructor. Those generally do not work in another environment where those resources might be available differently. Consider constructing a primitive instance directly, providing updated constructor arguments, and then using get_params/set_params to restore the primitive's state.

  - Returns

    State to pickle.

  - Return type
- __setstate__(state)

  Uses state to restore the state of a primitive when unpickling.

  By default it passes constructor arguments to the constructor and calls set_params.

  - Parameters

    state (dict) – Unpickled state.

  - Return type

    None
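  A short illustration of the round trip, assuming primitive is a fitted instance (continuing the placeholder names used above) and that unpickling happens in the same environment:

    import pickle

    # __getstate__ captures constructor arguments plus the fitted parameters;
    # __setstate__ reconstructs the instance and restores those parameters.
    blob = pickle.dumps(primitive)
    restored = pickle.loads(blob)
    assert restored.hyperparams == primitive.hyperparams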
- abstract fit(*, timeout=None, iterations=None)

  Fits the primitive using inputs and outputs (if any) using the currently set training data.

  The returned value should be a CallResult object with value set to None.

  If fit has already been called in the past on different training data, this method fits it again from scratch using the currently set training data.

  On the other hand, the caller can call fit multiple times on the same training data to continue fitting.

  If fit fully fits using the provided training data, there is no point in making further calls to this method with the same training data, and in fact further calls can be noops, or the primitive can decide to fully refit from scratch.

  In the case that fitting can continue with the same training data (even if it is maybe not reasonable, because the internal metric the primitive is using suggests that fitting will be degrading), if fit is called again (without setting training data), the primitive has to continue fitting.

  The caller can provide timeout information to guide the length of the fitting process. Ideally, a primitive should adapt its fitting process to try to do the best fitting possible inside the time allocated. If this is not possible and the primitive reaches the timeout before fitting, it should raise a TimeoutError exception to signal that fitting was unsuccessful in the given time. The state of the primitive after the exception should be as if the method call had never happened and the primitive should continue to operate normally. The purpose of timeout is to give an opportunity to a primitive to cleanly manage its state instead of having execution interrupted from outside. Maintaining stable internal state should have precedence over respecting the timeout (the caller can terminate a misbehaving primitive from outside anyway). If a longer timeout would produce different fitting, then CallResult's has_finished should be set to False.

  Some primitives have internal fitting iterations (for example, epochs). For those, the caller can provide how many of the primitive's internal iterations it should do before returning. Primitives should make iterations as small as reasonable. If iterations is None, then there is no limit on how many iterations the primitive should do and the primitive should choose the best number of iterations on its own (potentially controlled through hyper-parameters). If iterations is a number, the primitive has to do that number of iterations (even if not reasonable), if possible. timeout should still be respected and potentially fewer iterations can be done because of that. Primitives with internal iterations should make CallResult contain correct values.

  For primitives which do not have internal iterations, any value of iterations means that they should fit fully, respecting only timeout.

  - Parameters

  - Returns

    A CallResult with None value.

  - Return type

    CallResult[None]
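  A sketch of a fit method with internal iterations that follows the timeout and iterations semantics above; self._step, self._training_inputs, self._training_outputs, and the 'epochs' hyper-parameter are hypothetical:

    import time

    from d3m.primitive_interfaces.base import CallResult

    def fit(self, *, timeout=None, iterations=None):
        start = time.perf_counter()
        # With iterations=None the primitive chooses its own budget.
        wanted = iterations if iterations is not None else self.hyperparams['epochs']
        done = 0
        while done < wanted:
            if timeout is not None and time.perf_counter() - start > timeout:
                break  # stop cleanly; a longer timeout could produce a different fit
            self._step(self._training_inputs, self._training_outputs)  # one internal iteration
            done += 1
        return CallResult(None, has_finished=(done >= wanted), iterations_done=done)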
- fit_multi_produce(*, produce_methods, inputs, outputs, timeout=None, iterations=None)

  A method calling fit and after that multiple produce methods at once.

  This method allows a primitive author to implement an optimized version of both fitting and producing a primitive on the same data.

  If any additional method arguments are added to the primitive's set_training_data method or produce method(s), or removed from them, they have to be added to or removed from this method as well. This method should accept a union of all arguments accepted by the primitive's set_training_data method and produce method(s) and then use them accordingly when computing results. Despite accepting all arguments, they can be passed as None by the caller when they are not needed by any of the produce methods in produce_methods or by set_training_data.

  The default implementation of this method just calls the set_training_data method, the fit method, and all produce methods listed in produce_methods in order, and is potentially inefficient.

  - Parameters

    - produce_methods (Sequence[str]) – A list of names of produce methods to call.
    - inputs (~Inputs) – The inputs given to set_training_data and all produce methods.
    - outputs (~Outputs) – The outputs given to set_training_data.
    - timeout (Optional[float]) – A maximum time this primitive should take to both fit the primitive and produce outputs for all produce methods listed in the produce_methods argument, in seconds.
    - iterations (Optional[int]) – How many of its internal iterations the primitive should do for both fitting and producing outputs of all produce methods.

  - Returns

    A dict of values for each produce method wrapped inside MultiCallResult.

  - Return type

    MultiCallResult
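  From the caller's side this looks like a single call that both fits and produces; produce_feature_importances is a hypothetical secondary produce method, and the other names continue the placeholders used above:

    multi_result = primitive.fit_multi_produce(
        produce_methods=['produce', 'produce_feature_importances'],
        inputs=train_inputs,
        outputs=train_outputs,
    )
    predictions = multi_result.values['produce']
    importances = multi_result.values['produce_feature_importances']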
- abstract get_params()

  Returns parameters of this primitive.

  Parameters are all parameters of the primitive which can potentially change during the life-time of a primitive. Parameters which cannot change are passed through the constructor.

  Parameters should include all data which is necessary to create a new instance of this primitive behaving exactly the same as this instance, when the new instance is created by passing the same parameters to the class constructor and calling set_params.

  No other arguments to the method are allowed (except for private arguments).

  - Returns

    An instance of parameters.

  - Return type

    ~Params
- multi_produce(*, produce_methods, inputs, timeout=None, iterations=None)

  A method calling multiple produce methods at once.

  When a primitive has multiple produce methods, it is common that they compute the same internal results for the same inputs but return different representations of those results. If a caller is interested in multiple of those representations, calling multiple produce methods might lead to recomputing the same internal results multiple times. To address this, this method allows a primitive author to implement an optimized version which computes internal results only once for multiple calls of produce methods, but returns those different representations.

  If any additional method arguments are added to the primitive's produce method(s), they have to be added to this method as well. This method should accept a union of all arguments accepted by the primitive's produce method(s) and then use them accordingly when computing results. Despite accepting all arguments, they can be passed as None by the caller when they are not needed by any of the produce methods in produce_methods.

  The default implementation of this method just calls all produce methods listed in produce_methods in order and is potentially inefficient.

  If the primitive should have been fitted before calling this method, but it has not been, the primitive should raise a PrimitiveNotFittedError exception.

  - Parameters

    - produce_methods (Sequence[str]) – A list of names of produce methods to call.
    - inputs (~Inputs) – The inputs given to all produce methods.
    - timeout (Optional[float]) – A maximum time this primitive should take to produce outputs for all produce methods listed in the produce_methods argument, in seconds.
    - iterations (Optional[int]) – How many of its internal iterations the primitive should do.

  - Returns

    A dict of values for each produce method wrapped inside MultiCallResult.

  - Return type

    MultiCallResult
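  A sketch of an optimized override in the spirit described above: an expensive shared computation is done once and reused for every requested produce method. _embed, _to_dataframe, and _to_list are hypothetical helpers, and produce_list is a hypothetical secondary produce method; timeout handling is omitted:

    from d3m.primitive_interfaces.base import MultiCallResult

    def multi_produce(self, *, produce_methods, inputs, timeout=None, iterations=None):
        embeddings = self._embed(inputs)  # expensive result shared by all produce methods
        values = {}
        if 'produce' in produce_methods:
            values['produce'] = self._to_dataframe(embeddings)
        if 'produce_list' in produce_methods:
            values['produce_list'] = self._to_list(embeddings)
        return MultiCallResult(values=values, has_finished=True, iterations_done=None)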
- abstract produce(*, inputs, timeout=None, iterations=None)

  Produce the primitive's best choice of output for each of the inputs.

  The output value should be wrapped inside a CallResult object before returning.

  In many cases producing an output is a quick operation in comparison with fit, but not all cases are like that. For example, a primitive can start a potentially long optimization process to compute outputs. timeout and iterations can serve as a way for a caller to guide the length of this process.

  Ideally, a primitive should adapt its call to try to produce the best outputs possible inside the time allocated. If this is not possible and the primitive reaches the timeout before producing outputs, it should raise a TimeoutError exception to signal that the call was unsuccessful in the given time. The state of the primitive after the exception should be as if the method call had never happened and the primitive should continue to operate normally. The purpose of timeout is to give an opportunity to a primitive to cleanly manage its state instead of having execution interrupted from outside. Maintaining stable internal state should have precedence over respecting the timeout (the caller can terminate a misbehaving primitive from outside anyway). If a longer timeout would produce different outputs, then CallResult's has_finished should be set to False.

  Some primitives have internal iterations (for example, optimization iterations). For those, the caller can provide how many of the primitive's internal iterations it should do before returning outputs. Primitives should make iterations as small as reasonable. If iterations is None, then there is no limit on how many iterations the primitive should do and the primitive should choose the best number of iterations on its own (potentially controlled through hyper-parameters). If iterations is a number, the primitive has to do that number of iterations, if possible. timeout should still be respected and potentially fewer iterations can be done because of that. Primitives with internal iterations should make CallResult contain correct values.

  For primitives which do not have internal iterations, any value of iterations means that they should run fully, respecting only timeout.

  If the primitive should have been fitted before calling this method, but it has not been, the primitive should raise a PrimitiveNotFittedError exception.

  - Parameters

  - Returns

    The outputs of shape [num_inputs, …] wrapped inside CallResult.

  - Return type

    CallResult[~Outputs]
- abstract set_params(*, params)

  Sets parameters of this primitive.

  Parameters are all parameters of the primitive which can potentially change during the life-time of a primitive. Parameters which cannot change are passed through the constructor.

  No other arguments to the method are allowed (except for private arguments).

  - Parameters

    params (~Params) – An instance of parameters.

  - Return type

    None
- abstract set_training_data(*, inputs, outputs)

  Sets the current training data of this primitive.

  This marks training data as changed even if the new training data is the same as the previous training data.

  Standard subclasses in this package do not adhere to the Liskov substitution principle when inheriting this method, because they do not necessarily accept all arguments found in the base class. This means that one has to inspect which arguments are accepted at runtime, or in other words, one has to inspect exactly which subclass a primitive implements, if one is accepting a wider range of primitives. This relaxation is allowed only for standard subclasses found in this package. Primitives themselves should not break the Liskov substitution principle but should inherit from a suitable base class.

  - Parameters

    - inputs (~Inputs) – The inputs.
    - outputs (~Outputs) – The outputs.

  - Return type

    None
- docker_containers: Dict[str, d3m.primitive_interfaces.base.DockerContainer]

  A dict mapping Docker image keys from the primitive's metadata to (named) tuples containing the container's address under which the container is accessible by the primitive, and a dict mapping exposed ports to ports on that address.
- logger: ClassVar[logging.Logger]

  Primitive's logger. Available as a class attribute. This gets automatically set to the primitive's logger in the metaclass.
- metadata: ClassVar[d3m.metadata.base.PrimitiveMetadata]

  Primitive's metadata. Available as a class attribute. The primitive author should provide all fields which cannot be determined automatically from the code. In this way metadata stays close to the code and it is easier for consumers to make sure the metadata they are using really matches the code they are using. The PrimitiveMetadata class updates itself with metadata about the code and other things it can extract automatically.
- class d3m.primitive_interfaces.base.ProbabilisticCompositionalityMixin(*args, **kwds)

  Bases: Generic[Inputs, Outputs, Params, Hyperparams]

  This mixin provides additional abstract methods which primitives should implement to help callers with doing various end-to-end refinements using probabilistic compositionality.

  This mixin adds methods to support at least:

  - Metropolis-Hastings

  This mixin should be used together with the SamplingCompositionalityMixin mixin.
- log_likelihood(*, outputs, inputs, timeout=None, iterations=None)

  Returns the log probability of outputs given inputs and params under this primitive:

    sum_i(log(p(output_i | input_i, params)))

  By default it calls log_likelihoods and tries to automatically compute a sum, but subclasses can implement a more efficient or even more correct version.

  - Parameters

    - outputs (~Outputs) – The outputs. The number of samples should match inputs.
    - inputs (~Inputs) – The inputs. The number of samples should match outputs.
    - timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.
    - iterations (Optional[int]) – How many of its internal iterations the primitive should do.

  - Return type

    CallResult[~Outputs]

  - Returns

    sum_i(log(p(output_i | input_i, params))) wrapped inside a CallResult. The number of returned samples is always 1. The number of columns should match the number of target columns in outputs.
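  A sketch of the default behaviour described above, assuming Outputs is a DataFrame-like container so the per-sample values can be summed column-wise:

    from d3m.primitive_interfaces.base import CallResult

    def log_likelihood(self, *, outputs, inputs, timeout=None, iterations=None):
        # Delegate to log_likelihoods and sum per-sample values into a single row.
        per_sample = self.log_likelihoods(outputs=outputs, inputs=inputs,
                                          timeout=timeout, iterations=iterations)
        summed = per_sample.value.sum(axis=0).to_frame().T  # one row, one column per target
        return CallResult(summed, per_sample.has_finished, per_sample.iterations_done)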
- abstract log_likelihoods(*, outputs, inputs, timeout=None, iterations=None)

  Returns the log probability of outputs given inputs and params under this primitive:

    log(p(output_i | input_i, params))

  - Parameters

    - outputs (~Outputs) – The outputs. The number of samples should match inputs.
    - inputs (~Inputs) – The inputs. The number of samples should match outputs.
    - timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.
    - iterations (Optional[int]) – How many of its internal iterations the primitive should do.

  - Return type

    CallResult[~Outputs]

  - Returns

    log(p(output_i | input_i, params)) for each (input_i, output_i) pair wrapped inside a CallResult. The number of columns should match the number of target columns in outputs.
- class d3m.primitive_interfaces.base.SamplingCompositionalityMixin(*args, **kwds)

  Bases: Generic[Inputs, Outputs, Params, Hyperparams]

  This mixin signals to a caller that the primitive is probabilistic but may be likelihood-free.
- abstract sample(*, inputs, num_samples=1, timeout=None, iterations=None)

  Sample outputs for each input from inputs, num_samples times.

  The semantics of timeout and iterations are the same as in produce.

  - Parameters

    - inputs (~Inputs) – The inputs of shape [num_inputs, …].
    - num_samples (int) – The number of samples to return in a set of samples.
    - timeout (Optional[float]) – A maximum time this primitive should take to sample outputs during this method call, in seconds.
    - iterations (Optional[int]) – How many of its internal iterations the primitive should do.

  - Return type

    CallResult[Sequence[~Outputs]]

  - Returns

    The multiple sets of samples of shape [num_samples, num_inputs, …] wrapped inside a CallResult. While the output value type is specified as Sequence[Outputs], the output value can in fact be any container type with dimensions/shape equal to the combined Sequence[Outputs] dimensions/shape. Subclasses should specify exactly which type the output is.
- d3m.primitive_interfaces.base.inputs_across_samples(func=None, inputs=None, *args)

  A produce method can use this decorator to signal which of its inputs (arguments) it uses across all samples and not sample by sample.

  For many produce methods it does not matter whether they are called 100 times on 1 sample or once on 100 samples, but not all produce methods are like that; some produce results based on all the inputs given to them. If just a subset of inputs is given, the results are different. An example of this is a produce_distance_matrix method which returns an NxN matrix, where N is the number of samples, computing a distance from each sample to each other sample.

  When inputs have a primary key without a uniqueness constraint, then a "sample" for the purpose of this decorator means all samples with the same primary key value.

  The decorator accepts a list of input argument names which are used across all samples. By default, the inputs argument name is used.

  - Return type
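  For illustration, the produce_distance_matrix example mentioned above could mark its inputs argument like this (_pairwise_distances is a hypothetical helper):

    from d3m.primitive_interfaces.base import CallResult, inputs_across_samples

    @inputs_across_samples
    def produce_distance_matrix(self, *, inputs, timeout=None, iterations=None):
        # The result depends on the whole set of inputs, not on each sample separately.
        matrix = self._pairwise_distances(inputs)  # NxN, N = number of samples
        return CallResult(matrix)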
- d3m.primitive_interfaces.base.singleton(f)

  If a produce method uses this decorator, it is signaling that all outputs from the produce method are sequences of length 1. This is useful because a caller can then directly extract this element.

  Examples of such produce methods are produce methods of primitives which compute loss and return one number for multiple inputs. With this decorator they can return a sequence with this one number, and a caller which cares about the loss can extract it, while other callers which operate only on sequences can continue to operate normally.

  We can see other produce methods as mapping produce methods, and produce methods with this decorator as reducing produce methods.

  - Return type
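  For illustration, a hypothetical loss-style produce method could be written as follows; container.List is the d3m list container and _compute_loss is a made-up helper:

    from d3m import container
    from d3m.primitive_interfaces.base import CallResult, singleton

    @singleton
    def produce_loss(self, *, inputs, timeout=None, iterations=None):
        # One number for all inputs, returned as a sequence of length 1 so that
        # callers aware of the decorator can extract the single element directly.
        value = self._compute_loss(inputs)
        return CallResult(container.List([value], generate_metadata=True))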