d3m.utils module

class d3m.utils.AbstractMetaclass(name, bases, namespace, **kwargs)[source]

Bases: abc.ABCMeta, d3m.utils.Metaclass

A metaclass which makes sure docstrings are inherited. For use with abstract classes.

class d3m.utils.CallbackHandler(callback)[source]

Bases: logging.Handler

Calls a callback with logging records as they are, without any conversion except for:

  • formatting the logging message and adding it to the record object

  • ensuring asctime is set

  • converting exception exc_info into the exception’s name

  • making sure args are JSON-compatible, or removing them

  • making sure there are no null values
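The conversions listed above can be sketched with the standard library alone. This is a minimal illustration, not the d3m implementation: the class name DictCallbackHandler is hypothetical, and it simply drops args instead of checking them for JSON compatibility.

```python
import logging


class DictCallbackHandler(logging.Handler):
    """Hypothetical handler that passes each record to a callback as a dict."""

    def __init__(self, callback):
        super().__init__()
        self.callback = callback

    def prepare(self, record):
        # Format the message and make sure asctime is set on the record.
        record.message = record.getMessage()
        record.asctime = logging.Formatter().formatTime(record)
        output = dict(record.__dict__)
        # Replace exception info with the name of the exception class.
        if output.get("exc_info"):
            output["exc_info"] = output["exc_info"][0].__name__
        # Simplification: always drop args instead of validating them.
        output.pop("args", None)
        # Remove null values.
        return {key: value for key, value in output.items() if value is not None}

    def emit(self, record):
        self.callback(self.prepare(record))


records = []
logger = logging.getLogger("callback_handler_sketch")
logger.propagate = False
logger.setLevel(logging.INFO)
logger.addHandler(DictCallbackHandler(records.append))
logger.info("loaded %d rows", 3)
```

After the call, records holds one plain dict whose "message" entry is the formatted text, which is convenient for forwarding log entries to a JSON-based sink.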

emit(record)[source]

Do whatever it takes to actually log the specified logging record.

This version is intended to be implemented by subclasses and so raises a NotImplementedError.

Return type

None

prepare(record)[source]
Return type

Dict

class d3m.utils.Enum(value)[source]

Bases: enum.Enum

An extension of the Enum base class where:

  • Instances are equal to their string names, too.

  • It registers itself with the “yaml” module to serialize itself as a string.

  • It allows dynamic registration of additional values using register_value.
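The first point, equality with string names, can be illustrated with a small standard-library sketch. The names NamedEnum and Color are hypothetical, and the YAML registration and register_value behaviour are not reproduced here.

```python
import enum


class NamedEnum(enum.Enum):
    """Hypothetical base class whose members compare equal to their names."""

    def __eq__(self, other):
        if isinstance(other, str):
            return self.name == other
        return super().__eq__(other)

    def __hash__(self):
        # Defining __eq__ disables the inherited hash, so restore one that
        # is consistent with comparing by name.
        return hash(self.name)


class Color(NamedEnum):
    RED = 1
    GREEN = 2


matches_name = Color.RED == "RED"
matches_other_name = Color.RED == "GREEN"
```

Here matches_name is True and matches_other_name is False, while comparisons between members keep their usual identity semantics.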

classmethod register_value(name, value)[source]
Return type

Any

class d3m.utils.EnumMeta(class_name, bases, namespace, **kwargs)[source]

Bases: enum.EnumMeta

class d3m.utils.Evolver(original_pmap)[source]

Bases: pyrsistent._pmap.PMap._Evolver

persistent()[source]
Return type

PMap

class d3m.utils.FileType(mode='r', bufsize=-1, encoding=None, errors=None)[source]

Bases: argparse.FileType

class d3m.utils.GenericMetaclass(name, bases, namespace, tvars=None, args=None, origin=None, extra=None, orig_bases=None)[source]

Bases: typing.GenericMeta, d3m.utils.Metaclass

A metaclass which makes sure docstrings are inherited. For use with generic classes (which are also abstract).

class d3m.utils.JsonEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: json.encoder.JSONEncoder

JSON encoder with extensions. The main ones are:

  • Frozen dict is encoded as a dict.

  • Python types are encoded into strings describing them.

  • Python enumerations are encoded into their string names.

  • Sets are encoded into lists.

  • ndarray and DataFrame are encoded as nested lists.

  • datetime is encoded into ISO format with the UTC timezone.

  • Everything else which cannot be encoded is converted to a string.

You probably want to use to_json_structure and not this class, because to_json_structure also encodes NaN, Infinity, and -Infinity as strings.

It does not necessarily produce JSON which can then be parsed back to reconstruct the original value.
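A few of these extensions can be sketched on top of json.JSONEncoder. This is an illustration under assumptions, not the d3m encoder: SketchEncoder and Mode are hypothetical names, the string format used for Python types is a guess, and frozen dicts, ndarray, and DataFrame are left out.

```python
import datetime
import enum
import json


class SketchEncoder(json.JSONEncoder):
    """Hypothetical encoder covering a subset of the extensions above."""

    def default(self, o):
        if isinstance(o, enum.Enum):
            return o.name
        if isinstance(o, (set, frozenset)):
            return list(o)
        if isinstance(o, datetime.datetime):
            # Convert to UTC before producing the ISO string.
            return o.astimezone(datetime.timezone.utc).isoformat()
        if isinstance(o, type):
            # Assumed format: module and qualified name joined by a dot.
            return f"{o.__module__}.{o.__qualname__}"
        # Fallback: anything else becomes a string.
        return str(o)


class Mode(enum.Enum):
    TRAIN = 1


encoded = json.dumps(
    {
        "mode": Mode.TRAIN,
        "tags": {"a"},
        "when": datetime.datetime(2020, 1, 1, tzinfo=datetime.timezone.utc),
        "type": int,
    },
    cls=SketchEncoder,
    sort_keys=True,
)
```

Parsing encoded back yields only strings and lists, which shows why the output cannot in general be used to reconstruct the original values.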

default(o)[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
Return type

Any

class d3m.utils.Metaclass(class_name, class_bases, class_dict)[source]

Bases: custom_inherit._metaclass_base.DocInheritorBase

A metaclass which makes sure docstrings are inherited.

It knows how to merge numpy-style docstrings, combining parent sections with child sections. For example, a child class does not need to repeat the documentation for parameters which have not changed.

static attr_doc_inherit(prnt_doc=None, child_doc=None)[source]

Merge the docstrings of method or property from parent class and the corresponding attribute of its child.

Parameters
Raises

NotImplementedError

Notes

This works for properties, methods, static methods, class methods, and decorated methods/properties.

Return type

Optional[str]

static class_doc_inherit(prnt_doc=None, child_doc=None)[source]

Merge the docstrings of a parent class and its child.

Parameters
Raises

NotImplementedError

Return type

Optional[str]

class d3m.utils.PMap(size, buckets)[source]

Bases: pyrsistent._pmap.PMap

Extends pyrsistent.PMap to (by default) iterate over its items in sorted order.

evolver()[source]

Create a new evolver for this pmap. For a discussion on evolvers in general see the documentation for the pvector evolver.

Create the evolver and perform various mutating updates to it:

>>> m1 = m(a=1, b=2)
>>> e = m1.evolver()
>>> e['c'] = 3
>>> len(e)
3
>>> del e['a']

The underlying pmap remains the same:

>>> m1
pmap({'a': 1, 'b': 2})

The changes are kept in the evolver. An updated pmap can be created using the persistent() function on the evolver.

>>> m2 = e.persistent()
>>> m2
pmap({'c': 3, 'b': 2})

The new pmap will share data with the original pmap in the same way that would have been done if only using operations on the pmap.

Return type

Evolver

items(*, sort=True, reverse=False)[source]
Return type

AbstractSet

iteritems(*, sort=True, reverse=False)[source]
Return type

Iterable

iterkeys(*, sort=True, reverse=False)[source]
Return type

Iterable

itervalues(*, sort=True, reverse=False)[source]
Return type

Iterable

keys(*, sort=True, reverse=False)[source]
Return type

AbstractSet

values(*, sort=True, reverse=False)[source]
Return type

ValuesView

class d3m.utils.RefResolverNoRemote(base_uri, referrer, store=(), cache_remote=True, handlers=(), urljoin_cache=None, remote_cache=None)[source]

Bases: jsonschema.validators.RefResolver

resolve_remote(uri)[source]

Resolve a remote uri.

If called directly, does not check the store first, but after retrieving the document at the specified URI it will be saved in the store if cache_remote is True.

Note

If the requests library is present, jsonschema will use it to request the remote uri, so that the correct encoding is detected and used.

If it isn’t, or if the scheme of the uri is not http or https, UTF-8 is assumed.

Parameters

uri (str) – The URI to resolve

Return type

Any

Returns

The retrieved document

class d3m.utils.StreamToLogger(logger, level, pass_through_stream=None)[source]

Bases: object

close()[source]
Return type

None

fileno()[source]
Return type

int

flush()[source]
Return type

None

isatty()[source]
Return type

bool

read(n=-1)[source]
Return type

str

readable()[source]
Return type

bool

readline(limit=-1)[source]
Return type

str

readlines(hint=-1)[source]
Return type

List[str]

seek(offset, whence=0)[source]
Return type

int

seekable()[source]
Return type

bool

tell()[source]
Return type

int

truncate(size=None)[source]
Return type

int

writable()[source]
Return type

bool

write(buffer)[source]
Return type

int

writelines(lines)[source]
Return type

None

d3m.utils.check_immutable(obj)[source]

Checks that obj is immutable. Raises an exception if this is not true.

Parameters

obj (Any) – Object to check.

Return type

None

d3m.utils.columns_sum(inputs, *, source=None)[source]

Computes sum per column.

Return type

Any

d3m.utils.compute_digest(obj, extra_data=None)[source]

Input should be a JSON-compatible structure.

Return type

str
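One reason to require a JSON-compatible input is that it can be serialized canonically before hashing. The sketch below shows that idea only; the function name sketch_digest, the use of SHA-256, and the serialization settings are assumptions and may differ from what compute_digest does (extra_data is not modelled).

```python
import hashlib
import json


def sketch_digest(obj):
    """Hypothetical digest: hash a canonical JSON serialization of obj."""
    # Sorting keys and fixing separators makes the output independent of
    # dict insertion order and whitespace.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"), allow_nan=False)
    return hashlib.sha256(canonical.encode("utf8")).hexdigest()


first = sketch_digest({"a": 1, "b": [1, 2]})
second = sketch_digest({"b": [1, 2], "a": 1})
```

Both calls produce the same hexadecimal string, because the two dicts differ only in key order.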

d3m.utils.compute_hash_id(obj)[source]

Input should be a JSON-compatible structure.

Return type

str

d3m.utils.create_enum_from_json_schema_enum(class_name, obj, json_paths, *, module=None, qualname=None, base_class=None)[source]
Return type

Any

d3m.utils.current_git_commit(path, search_parent_directories=True)[source]

Returns a git commit hash of the repo at path or above if search_parent_directories is True.

To get the commit hash of a Python package this way, the package has to be installed in “editable” mode (pip install -e).

Parameters
  • path (str) – A path to repo or somewhere under the repo.

  • search_parent_directories (bool) – Whether to search for a git repository in directories above path.

Returns

A git commit hash.

d3m.utils.datetime_for_json(timestamp)[source]
Return type

str

d3m.utils.enum_validator(validator, enums, instance, schema)[source]
d3m.utils.filter_local_location_uris(doc, *, empty_value=None)[source]
Return type

None

d3m.utils.fix_uri(uri, *, allow_relative_path=True)[source]

Make a real file URI from a path.

Parameters
  • uri (str) – An input URI.

  • allow_relative_path (bool) – Whether to allow the path to be relative.

Returns

A fixed URI.
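As a rough illustration of turning a path into a file URI, here is a standard-library sketch. The function sketch_fix_uri is hypothetical, and its handling of existing URIs and of rejected relative paths (a ValueError here) is an assumption rather than the documented behaviour of fix_uri.

```python
import pathlib


def sketch_fix_uri(uri, *, allow_relative_path=True):
    """Hypothetical helper: return a file URI for a local path."""
    # Assumption: anything that already looks like a URI is returned as-is.
    if "://" in uri:
        return uri
    path = pathlib.Path(uri)
    if not path.is_absolute():
        if not allow_relative_path:
            raise ValueError(f"Relative path is not allowed: {uri}")
        # Resolve against the current working directory.
        path = path.resolve()
    return path.as_uri()


absolute_uri = sketch_fix_uri("/tmp/data.csv")
remote_uri = sketch_fix_uri("https://example.com/dataset.json")
```

The absolute path gains a file:// prefix, while the HTTPS URI passes through unchanged.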

d3m.utils.from_reversible_json_structure(obj)[source]
Return type

Any

d3m.utils.get_datasets_and_problems(datasets_dir, handle_score_split=True)[source]
Return type

Tuple[Dict[str, str], Dict[str, str]]

d3m.utils.get_dict_path(input_dict, path)[source]
Return type

Any

d3m.utils.get_full_name(value)[source]
Return type

str

d3m.utils.get_type(obj)[source]
Return type

type

d3m.utils.get_type_arguments(cls, *, unique_names=False)[source]

Returns a mapping from the type arguments of a given class cls to their types.

Parameters
  • cls (type) – A class to return mapping for.

  • unique_names (bool) – Whether to force unique names of type parameters.

Returns

A mapping from type argument to its type.

d3m.utils.get_type_hints(func)[source]
Return type

Dict[str, Any]

class d3m.utils.global_randomness_warning(enable=True)[source]

Bases: contextlib.AbstractContextManager

A Python context manager which issues a warning if global sources of randomness are used. Currently it checks the Python built-in global random source, the NumPy global random source, and NumPy default_rng being used without a seed.

d3m.utils.has_duplicates(data)[source]

Returns True if data has duplicate elements.

It works with both hashable and non-hashable elements.

Return type

bool
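One way to support both kinds of elements is to try a set first and fall back to pairwise comparison. The sketch below assumes data is a sized sequence; sketch_has_duplicates is a hypothetical name and not necessarily how has_duplicates is implemented.

```python
def sketch_has_duplicates(data):
    """Hypothetical check for duplicates in a sequence."""
    try:
        # Fast path: works when every element is hashable.
        return len(set(data)) != len(data)
    except TypeError:
        # Slow path: compare by equality for non-hashable elements.
        seen = []
        for item in data:
            if item in seen:
                return True
            seen.append(item)
        return False


hashable_result = sketch_has_duplicates([1, 2, 2])
unhashable_result = sketch_has_duplicates([[1], [2], [1]])
```

Both results are True. The fallback path is quadratic in the number of elements, which is the price of not requiring hashability.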

d3m.utils.is_class_method_on_class(method)[source]
Return type

bool

d3m.utils.is_class_method_on_object(method, object)[source]
Return type

bool

d3m.utils.is_float(typ)[source]
Return type

bool

d3m.utils.is_instance(obj, cls)[source]
Return type

bool

d3m.utils.is_instance_method_on_class(method)[source]
Return type

bool

d3m.utils.is_instance_method_on_object(method, object)[source]
Return type

bool

d3m.utils.is_int(typ)[source]
Return type

bool

d3m.utils.is_numeric(typ)[source]
Return type

bool

d3m.utils.is_sequence(value)[source]
Return type

bool

d3m.utils.is_subclass(subclass, superclass)[source]
Return type

bool

d3m.utils.is_type(obj)[source]
Return type

bool

d3m.utils.is_uri(uri)[source]

Test if a given string is a URI.

Parameters

uri (str) – A potential URI to test.

Returns

True if string is a URI, False otherwise.
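A loose approximation of such a test can be written with urllib.parse. The function sketch_is_uri is hypothetical and only checks for a scheme followed by some location or path; real URI validation, and the check done by is_uri, may be stricter.

```python
from urllib.parse import urlparse


def sketch_is_uri(uri):
    """Hypothetical, permissive URI test based on the presence of a scheme."""
    try:
        parsed = urlparse(uri)
    except ValueError:
        return False
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


http_result = sketch_is_uri("http://example.com/data.csv")
path_result = sketch_is_uri("/tmp/data.csv")
```

Here http_result is True and path_result is False, since a bare file system path has no scheme.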

d3m.utils.json_schema_is_array(checker, instance)[source]
Return type

bool

d3m.utils.json_schema_is_object(checker, instance)[source]
Return type

bool

d3m.utils.json_schema_is_python_type(instance)[source]
Return type

bool

d3m.utils.json_schema_is_string(checker, instance)[source]
Return type

bool

d3m.utils.json_structure_equals(obj1, obj2, ignore_keys=None)[source]
Parameters
  • obj1 (Any) – JSON serializable object to compare with obj2.

  • obj2 (Any) – JSON serializable object to compare with obj1.

  • ignore_keys (Optional[Set]) – If obj1 and obj2 are of type Mapping, any keys found in this set will not be considered to determine whether obj1 and obj2 are equal.

Returns

A boolean indicating whether obj1 and obj2 are equal.
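The effect of ignore_keys can be sketched as a recursive comparison. This is an illustration only: sketch_json_equals is a hypothetical name, and applying ignore_keys to nested mappings as well as the top level is an assumption.

```python
from collections.abc import Mapping


def sketch_json_equals(obj1, obj2, ignore_keys=None):
    """Hypothetical structural comparison that can skip selected keys."""
    ignore_keys = set(ignore_keys or ())
    if isinstance(obj1, Mapping) and isinstance(obj2, Mapping):
        keys1 = set(obj1) - ignore_keys
        keys2 = set(obj2) - ignore_keys
        if keys1 != keys2:
            return False
        # Assumption: ignored keys are skipped at every nesting level.
        return all(sketch_json_equals(obj1[k], obj2[k], ignore_keys) for k in keys1)
    if isinstance(obj1, (list, tuple)) and isinstance(obj2, (list, tuple)):
        return len(obj1) == len(obj2) and all(
            sketch_json_equals(a, b, ignore_keys) for a, b in zip(obj1, obj2)
        )
    return obj1 == obj2


left = {"id": 1, "steps": [{"id": 10, "name": "fit"}]}
right = {"id": 2, "steps": [{"id": 20, "name": "fit"}]}
equal_ignoring_id = sketch_json_equals(left, right, ignore_keys={"id"})
equal_strict = sketch_json_equals(left, right)
```

With "id" ignored the two structures compare equal; without it they do not.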

d3m.utils.list_files(base_directory)[source]
Return type

Sequence[str]

d3m.utils.load_schema_validators(schemas, load_validators)[source]
Return type

List[Validator]

d3m.utils.log_once(logger, level, msg, *args, ignore_modules=None, **kwargs)[source]
Return type

None

d3m.utils.make_immutable_copy(obj)[source]

Converts a given obj into an immutable copy of it, if possible.

Parameters

obj (Any) – Object to convert.

Returns

Return type

An immutable copy of obj.

d3m.utils.matches_structural_type(source_structural_type, target_structural_type)[source]
Return type

bool

d3m.utils.normalize_numbers(obj)[source]
Return type

Dict

d3m.utils.open(file, mode='r', buffering=-1, encoding=None, errors=None)[source]
Return type

IO[Any]

d3m.utils.outside_package_context()[source]
Return type

Optional[Context]

d3m.utils.pmap(initial={}, pre_size=0)[source]
Return type

PMap

class d3m.utils.redirect_to_logging(logger=None, stdout_level='INFO', stderr_level='ERROR', pass_through=True)[source]

Bases: contextlib.AbstractContextManager

A Python context manager which redirects all writes to stdout and stderr to Python logging.

Primitives should use logging to log messages, but some do not, and libraries they depend on might not either. One can use this context manager to ensure that writes to stdout and stderr by primitives (at least those made from Python) are redirected to logging:

with redirect_to_logging(logger=PrimitiveClass.logger):
    primitive = PrimitiveClass(...)
    primitive.set_training_data(...)
    primitive.fit(...)
    primitive.produce(...)
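
The underlying idea, a file-like object which forwards writes to a logger, can be sketched with the standard library. The names LineLogger and ListHandler are hypothetical, and this sketch uses contextlib.redirect_stdout and contextlib.redirect_stderr rather than d3m's own classes; it also has no pass-through to the original streams.

```python
import contextlib
import io
import logging


class LineLogger(io.TextIOBase):
    """Hypothetical stream that logs every non-empty line written to it."""

    def __init__(self, logger, level):
        super().__init__()
        self.logger = logger
        self.level = level

    def writable(self):
        return True

    def write(self, buffer):
        for line in buffer.splitlines():
            if line.strip():
                self.logger.log(self.level, line)
        return len(buffer)


captured = []


class ListHandler(logging.Handler):
    """Collects (level name, message) pairs so the effect is visible."""

    def emit(self, record):
        captured.append((record.levelname, record.getMessage()))


logger = logging.getLogger("redirect_sketch")
logger.propagate = False
logger.setLevel(logging.DEBUG)
logger.addHandler(ListHandler())

with contextlib.redirect_stdout(LineLogger(logger, logging.INFO)):
    with contextlib.redirect_stderr(LineLogger(logger, logging.ERROR)):
        print("fitting model")
```

After the block, captured contains a single INFO entry for the printed line. Writes made by native code directly to the process file descriptors are not intercepted by this approach.
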
d3m.utils.register_yaml_representers()[source]
Return type

None

d3m.utils.register_yaml_resolvers()[source]
Return type

None

d3m.utils.set_dict_path(input_dict, path, value)[source]
Return type

None

d3m.utils.silence()[source]

Hides logging and stdout output.

Return type

Generator
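A generator-based context manager with a similar effect can be sketched as follows. The name sketch_silence is hypothetical; the sketch hides Python-level stdout writes and disables logging up to CRITICAL, which may differ from what silence does in detail.

```python
import contextlib
import io
import logging


@contextlib.contextmanager
def sketch_silence():
    """Hypothetical context manager hiding logging and stdout output."""
    previous = logging.root.manager.disable
    logging.disable(logging.CRITICAL)
    try:
        # Send stdout writes to a throwaway in-memory buffer.
        with contextlib.redirect_stdout(io.StringIO()):
            yield
    finally:
        # Restore whatever logging threshold was active before.
        logging.disable(previous)


outer = io.StringIO()
with contextlib.redirect_stdout(outer):
    with sketch_silence():
        print("hidden")
    print("visible")
```

Only the second print reaches the outer buffer.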

d3m.utils.to_json_structure(obj)[source]

In addition to what JsonEncoder encodes, this function also encodes the float values NaN, Infinity, and -Infinity as strings.

It does not necessarily produce a JSON structure which can then be parsed back to reconstruct the original value. For that, use to_reversible_json_structure.

Return type

Any
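The handling of special float values can be illustrated in isolation. In this sketch the function name sketch_stringify_special_floats and the exact strings "NaN", "Infinity", and "-Infinity" are assumptions; the remaining JsonEncoder conversions are not included.

```python
import json
import math


def sketch_stringify_special_floats(obj):
    """Hypothetical helper replacing non-finite floats with strings."""
    if isinstance(obj, float):
        if math.isnan(obj):
            return "NaN"
        if math.isinf(obj):
            return "Infinity" if obj > 0 else "-Infinity"
        return obj
    if isinstance(obj, dict):
        return {key: sketch_stringify_special_floats(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [sketch_stringify_special_floats(value) for value in obj]
    return obj


structure = sketch_stringify_special_floats(
    {"loss": float("nan"), "bounds": [float("-inf"), 1.5, float("inf")]}
)
# Strict JSON serialization now succeeds because no NaN or Infinity remains.
strict_json = json.dumps(structure, allow_nan=False)
```

Without the conversion, json.dumps with allow_nan=False would raise a ValueError for this input.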

d3m.utils.to_reversible_json_structure(obj)[source]

Operation is not idempotent.

Return type

Any

d3m.utils.type_to_str(obj)[source]
Return type

str

d3m.utils.yaml_add_representer(value_type, represented)[source]
Return type

None

d3m.utils.yaml_dump(data, stream=None, **kwds)[source]
Return type

Any

d3m.utils.yaml_dump_all(documents, stream=None, **kwds)[source]
Return type

Any

d3m.utils.yaml_load(stream)[source]
Return type

Any

d3m.utils.yaml_load_all(stream)[source]
Return type

Any