d3m.primitive_interfaces.clustering module

class d3m.primitive_interfaces.clustering.ClusteringLearnerPrimitiveBase(*, hyperparams, random_seed=0, docker_containers=None, volumes=None, temporary_directory=None)[source]

Bases: d3m.primitive_interfaces.unsupervised_learning.UnsupervisedLearnerPrimitiveBase

A base class for primitives implementing a clustering algorithm which learns clusters.

metadata[source]

Primitive’s metadata. Available as a class attribute.

logger[source]

Primitive’s logger. Available as a class attribute.

hyperparams[source]

Hyperparams passed to the constructor.

random_seed[source]

Random seed passed to the constructor.

docker_containers[source]

A dict mapping Docker image keys from primitive’s metadata to (named) tuples containing container’s address under which the container is accessible by the primitive, and a dict mapping exposed ports to ports on that address.

volumes[source]

A dict mapping volume keys from primitive’s metadata to file and directory paths where downloaded and extracted files are available to the primitive.

temporary_directory[source]

An absolute path to a temporary directory a primitive can use to store any files for the duration of the current pipeline run phase. Directory is automatically cleaned up after the current pipeline run phase finishes.

docker_containers = None[source]
hyperparams = None[source]
abstract produce(*, inputs, timeout=None, iterations=None)[source]

The produce method should return a membership map.

A membership map is a data structure that tells, for each input sample, which cluster that sample was assigned to. Outputs should therefore have the same number of samples as Inputs, and the value at each output sample should represent a cluster. Consider representing it with a simple numeric identifier.

Parameters
  • inputs (~Inputs) – The inputs of shape [num_inputs, …].

  • timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.

  • iterations (Optional[int]) – How many internal iterations the primitive should do.

Return type

CallResult[~Outputs]

Returns

The outputs of shape [num_inputs, 1] wrapped inside CallResult for a simple numeric cluster identifier.
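The following is a minimal sketch of what a membership map looks like for a learner-style clustering primitive. It does not use the d3m package: fit_centroids and produce_membership_map are hypothetical stand-ins for the learning and produce steps, and the small k-means loop is an assumption chosen only for illustration. The point is the output shape: one row per input sample, each holding a numeric cluster identifier.

```python
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two samples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids: Sequence[Point], sample: Point) -> int:
    """Index of the centroid closest to the sample (ties go to the lowest index)."""
    return min(range(len(centroids)), key=lambda c: squared_distance(centroids[c], sample))


def fit_centroids(samples: Sequence[Point], k: int, iterations: int = 10) -> List[List[float]]:
    """Hypothetical learning step: a tiny k-means seeded with the first k samples."""
    centroids = [list(s) for s in samples[:k]]
    for _ in range(iterations):
        groups: List[List[Point]] = [[] for _ in range(k)]
        for s in samples:
            groups[nearest(centroids, s)].append(s)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def produce_membership_map(centroids: Sequence[Point], inputs: Sequence[Point]) -> List[List[int]]:
    """Hypothetical produce step: shape [num_inputs, 1], one cluster id per sample."""
    return [[nearest(centroids, s)] for s in inputs]


training = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]]
centroids = fit_centroids(training, k=2)
membership = produce_membership_map(centroids, [[0.5, 0.5], [9.0, 10.0], [1.0, 0.0]])
```

In an actual primitive the returned value would be wrapped inside CallResult and use the primitive's Outputs container type; the nested list here only mirrors the [num_inputs, 1] shape.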

random_seed = None[source]
temporary_directory = None[source]
volumes = None[source]
class d3m.primitive_interfaces.clustering.ClusteringTransformerPrimitiveBase(*, hyperparams, random_seed=0, docker_containers=None, volumes=None, temporary_directory=None)[source]

Bases: d3m.primitive_interfaces.transformer.TransformerPrimitiveBase

A base class for primitives implementing a clustering algorithm without learning any sort of model.

metadata[source]

Primitive’s metadata. Available as a class attribute.

logger[source]

Primitive’s logger. Available as a class attribute.

hyperparams[source]

Hyperparams passed to the constructor.

random_seed[source]

Random seed passed to the constructor.

docker_containers[source]

A dict mapping Docker image keys from primitive’s metadata to (named) tuples containing container’s address under which the container is accessible by the primitive, and a dict mapping exposed ports to ports on that address.

volumes[source]

A dict mapping volume keys from primitive’s metadata to file and directory paths where downloaded and extracted files are available to the primitive.

temporary_directory[source]

An absolute path to a temporary directory a primitive can use to store any files for the duration of the current pipeline run phase. Directory is automatically cleaned up after the current pipeline run phase finishes.

docker_containers = None[source]
hyperparams = None[source]
abstract produce(*, inputs, timeout=None, iterations=None)[source]

The produce method should return a membership map.

A membership map is a data structure that tells, for each input sample, which cluster that sample was assigned to. Outputs should therefore have the same number of samples as Inputs, and the value at each output sample should represent a cluster. Consider representing it with a simple numeric identifier.

If an implementation of this method computes clusters based on the whole set of input samples, use the inputs_across_samples decorator to mark inputs as being computed across samples.

Parameters
  • inputs (~Inputs) – The inputs of shape [num_inputs, …].

  • timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.

  • iterations (Optional[int]) – How many internal iterations the primitive should do.

Return type

CallResult[~Outputs]

Returns

The outputs of shape [num_inputs, 1] wrapped inside CallResult for a simple numeric cluster identifier.
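As an illustration of clustering "based on the whole set of input samples" without learning a model, here is a small sketch that is independent of d3m. The function cluster_by_threshold and its threshold parameter are hypothetical: it links any two samples whose Euclidean distance is at most the threshold and reports connected groups as clusters. Because one sample can bridge two others, the identifier assigned to a sample depends on all the inputs, which is the situation the inputs_across_samples decorator is meant to flag.

```python
from typing import Dict, List, Sequence


def cluster_by_threshold(inputs: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Hypothetical transformer-style clustering: no fitted state, whole-set computation."""
    n = len(inputs)
    parent = list(range(n))  # union-find forest over sample indices

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link every pair of samples that lie within the threshold of each other.
    for i in range(n):
        for j in range(i + 1, n):
            dist = sum((a - b) ** 2 for a, b in zip(inputs[i], inputs[j])) ** 0.5
            if dist <= threshold:
                parent[find(j)] = find(i)

    # Number the clusters 0, 1, 2, ... in order of first appearance.
    ids: Dict[int, int] = {}
    membership = []
    for i in range(n):
        cluster_id = ids.setdefault(find(i), len(ids))
        membership.append([cluster_id])  # shape [num_inputs, 1]
    return membership


samples = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [0.0, 2.0], [5.0, 6.0]]
with_bridge = cluster_by_threshold(samples, threshold=1.0)

# Dropping the second sample removes the link between [0, 0] and [0, 2].
without_bridge = cluster_by_threshold(samples[:1] + samples[2:], threshold=1.0)
```

With the bridging sample present, [0, 0] and [0, 2] share a cluster; without it they do not, even though neither sample changed.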

random_seed = None[source]
temporary_directory = None[source]
volumes = None[source]
class d3m.primitive_interfaces.clustering.ClusteringDistanceMatrixMixin[source]

Bases: typing.Generic

Abstract base class for generic types.

A generic type is typically declared by inheriting from this class parameterized with one or more type variables. For example, a generic mapping type might be defined as:

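from typing import Generic, TypeVar

KT, VT = TypeVar('KT'), TypeVar('VT')  # key and value type variables

class Mapping(Generic[KT, VT]):
    def __getitem__(self, key: KT) -> VT:
        ...
    # Etc.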

This class can then be used as follows:

def lookup_name(mapping: Mapping[KT, VT], key: KT, default: VT) -> VT:
    try:
        return mapping[key]
    except KeyError:
        return default
abstract produce_distance_matrix(*, inputs, timeout=None, iterations=None)[source]

The semantics of this call are the same as those of a regular produce call, except that the output is a distance matrix instead of a membership map.

Implementations of this method should use the inputs_across_samples decorator to mark inputs as being computed across samples.

When this mixin is used with ClusteringTransformerPrimitiveBase, the Params type variable should be set to None.

Parameters
  • inputs (~Inputs) – The inputs of shape [num_inputs, …].

  • timeout (Optional[float]) – A maximum time this primitive should take to produce outputs during this method call, in seconds.

  • iterations (Optional[int]) – How many internal iterations the primitive should do.

Return type

CallResult[~DistanceMatrixOutput]

Returns

The distance matrix of shape [num_inputs, num_inputs, …] wrapped inside CallResult, where the (i, j) element of the matrix represents the distance between the i-th and j-th samples in the inputs.
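A minimal sketch of the distance matrix layout described above, written without d3m. The function pairwise_distance_matrix is a hypothetical stand-in, and Euclidean distance is an assumption; an implementation may use any distance. The (i, j) element holds the distance between the i-th and j-th input samples, so the result has shape [num_inputs, num_inputs].

```python
import math
from typing import List, Sequence


def pairwise_distance_matrix(inputs: Sequence[Sequence[float]]) -> List[List[float]]:
    """Hypothetical stand-in: element (i, j) is the Euclidean distance
    between inputs[i] and inputs[j]."""
    return [[math.dist(a, b) for b in inputs] for a in inputs]


inputs = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
matrix = pairwise_distance_matrix(inputs)
```

In an actual primitive this value would be wrapped inside CallResult and use the DistanceMatrixOutput container type; the nested list only shows the indexing convention.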