
Resolvers

matchbox.client.resolvers

Resolver methodologies and resolver DAG nodes.

Modules:

  • base

    Base classes for resolver methodologies.

  • components

    Connected-components resolver methodology.

  • resolvers

    Resolver nodes and methodology registry for client-side execution.

Classes:

  • ResolverMethod

    Base class for resolver methodologies.

  • ResolverSettings

    Base settings type for resolver methodologies.

  • Components

    Resolver methodology that computes connected components.

  • ComponentsSettings

    Settings for the Components resolver methodology.

  • Resolver

    Client-side node that computes clusters from model and resolver inputs.

Functions:

  • add_resolver_class

    Register a resolver methodology class.

ResolverMethod

Bases: BaseModel, ABC



Base class for resolver methodologies.

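Methods:

  • compute_clusters

    Compute cluster assignments from model edges.

Attributes:

  • resolver_type (ResolverType)

  • settings (ResolverSettings)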

resolver_type class-attribute

resolver_type: ResolverType

settings instance-attribute

settings: ResolverSettings

compute_clusters abstractmethod

compute_clusters(model_edges: Mapping[ModelStepName, DataFrame]) -> DataFrame

Compute cluster assignments from model edges.

Parameters:

  • model_edges
    (Mapping[ModelStepName, DataFrame]) –

    A mapping of model names to model edges which conform to SCHEMA_MODEL_EDGES

Returns:

  • DataFrame

    A Polars DataFrame which conforms to SCHEMA_CLUSTERS

Raises:

  • RuntimeError

    if supplied model names don’t match the Resolver’s settings

ResolverSettings

Bases: BaseModel, ABC



Base settings type for resolver methodologies.

Methods:

  • validate_inputs

    Validates whether the models’ clusters can be computed with this object.

validate_inputs abstractmethod

validate_inputs(model_names: Iterable[ModelStepName]) -> None

Validates whether the models’ clusters can be computed with this object.

Should be used in conjunction with ResolverMethod.compute_clusters().

Parameters:

  • model_names
    (Iterable[ModelStepName]) –

    The model names to check against these settings.

Raises:

  • RuntimeError

    if supplied model names don’t match the settings
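
The pairing of validate_inputs with compute_clusters can be sketched roughly as follows. This is a minimal illustration that does not import matchbox: SketchSettings, SketchMethod, ExpectedNames and CountEdges are hypothetical stand-ins, plain lists replace Polars DataFrames, and the rule that supplied names must equal an expected set is an assumption rather than the library's documented behaviour.

```python
from abc import ABC, abstractmethod
from collections.abc import Iterable, Mapping


class SketchSettings(ABC):
    """Hypothetical stand-in for ResolverSettings (not the real class)."""

    @abstractmethod
    def validate_inputs(self, model_names: Iterable[str]) -> None:
        """Raise RuntimeError if the model names don't match the settings."""


class SketchMethod(ABC):
    """Hypothetical stand-in for ResolverMethod (not the real class)."""

    def __init__(self, settings: SketchSettings) -> None:
        self.settings = settings

    @abstractmethod
    def compute_clusters(self, model_edges: Mapping[str, list]) -> list:
        """Return cluster assignments computed from the model edges."""


class ExpectedNames(SketchSettings):
    """Assumed rule: the supplied names must equal a fixed expected set."""

    def __init__(self, expected: set) -> None:
        self.expected = expected

    def validate_inputs(self, model_names: Iterable[str]) -> None:
        supplied = set(model_names)
        if supplied != self.expected:
            mismatched = sorted(supplied ^ self.expected)
            raise RuntimeError(f"Model names don't match settings: {mismatched}")


class CountEdges(SketchMethod):
    """Toy methodology: validates first, then reports edge counts per model."""

    def compute_clusters(self, model_edges: Mapping[str, list]) -> list:
        # Validate before doing any work, so mismatches fail early.
        self.settings.validate_inputs(model_edges.keys())
        return [(name, len(edges)) for name, edges in sorted(model_edges.items())]
```

Calling validate_inputs as the first step of compute_clusters means a mismatch surfaces as a RuntimeError before any clustering work starts.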

Components

Bases: ResolverMethod



Resolver methodology that computes connected components.

A model with no entry in the settings' thresholds is assumed to have a threshold of 0.0.

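Methods:

  • compute_clusters

    Compute cluster assignments from model edges.

Attributes:

  • resolver_type (ResolverType)

  • settings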

resolver_type class-attribute

resolver_type: ResolverType = COMPONENTS

settings instance-attribute

compute_clusters

compute_clusters(model_edges: Mapping[ModelStepName, DataFrame]) -> DataFrame

Compute cluster assignments from model edges.

Parameters:

  • model_edges
    (Mapping[ModelStepName, DataFrame]) –

    A mapping of model names to model edges which conform to SCHEMA_MODEL_EDGES

Returns:

  • DataFrame

    A Polars DataFrame which conforms to SCHEMA_CLUSTERS

Raises:

  • RuntimeError

    if supplied model names don’t match the Resolver’s settings
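
As a rough illustration of the connected-components idea, the sketch below groups IDs with a union-find structure. It is not the library's implementation: edges are plain (left_id, right_id, probability) tuples rather than Polars DataFrames, that column layout is an assumption because SCHEMA_MODEL_EDGES is not shown here, and keeping edges whose probability is at or above the threshold (rather than strictly above) is also assumed.

```python
from collections.abc import Iterable, Mapping

Edge = tuple[int, int, float]  # (left_id, right_id, probability): assumed layout


def connected_components(
    model_edges: Mapping[str, Iterable[Edge]],
    thresholds: Mapping[str, float],
) -> list[list[int]]:
    """Group IDs joined by any edge scoring at or above its model's threshold."""
    parent: dict[int, int] = {}

    def find(node: int) -> int:
        # Walk up to the root, halving the path along the way.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for model_name, edges in model_edges.items():
        # Models without an explicit threshold fall back to 0.0.
        threshold = thresholds.get(model_name, 0.0)
        for left, right, probability in edges:
            if probability >= threshold:
                root_left, root_right = find(left), find(right)
                if root_left != root_right:
                    parent[root_right] = root_left

    # IDs that only appear in discarded edges never enter `parent`,
    # so they are absent from the output of this sketch.
    clusters: dict[int, list[int]] = {}
    for node in list(parent):
        clusters.setdefault(find(node), []).append(node)
    return sorted(sorted(members) for members in clusters.values())
```

Edges from every model feed the same union-find structure, so a link found by one model can merge clusters formed by another.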

ComponentsSettings

Bases: ResolverSettings



Settings for the Components resolver methodology.

Methods:

  • validate_inputs

    Validates whether the models’ clusters can be computed with this object.

Attributes:

thresholds class-attribute instance-attribute

thresholds: dict[ModelStepName, Annotated[float, Field(ge=0.0, le=1.0)]] = Field(default_factory=dict)

validate_inputs

validate_inputs(model_names: Iterable[ModelStepName]) -> None

Validates whether the models’ clusters can be computed with this object.

Should be used in conjunction with ResolverMethod.compute_clusters().

Parameters:

  • model_names
    (Iterable[ModelStepName]) –

    The model names to check against these settings.

Raises:

  • RuntimeError

    if supplied model names don’t match the settings
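
A minimal sketch of how per-model thresholds might be validated, using a standard-library dataclass instead of Pydantic. ThresholdSettingsSketch and threshold_for are hypothetical names, and the rule that every thresholded model must be among the supplied names is an assumption about what "match" means here. The 0.0 fallback follows the note on the Components methodology.

```python
from collections.abc import Iterable
from dataclasses import dataclass, field


@dataclass
class ThresholdSettingsSketch:
    """Hypothetical stand-in for ComponentsSettings, without Pydantic."""

    thresholds: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Mirror the Field(ge=0.0, le=1.0) constraint on each value.
        for name, value in self.thresholds.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"Threshold for {name!r} must be in [0.0, 1.0]")

    def validate_inputs(self, model_names: Iterable[str]) -> None:
        # Assumed rule: every thresholded model must be among the inputs.
        unknown = set(self.thresholds) - set(model_names)
        if unknown:
            raise RuntimeError(f"Thresholds set for unknown models: {sorted(unknown)}")

    def threshold_for(self, model_name: str) -> float:
        # Hypothetical helper: unspecified models default to 0.0.
        return self.thresholds.get(model_name, 0.0)
```

In the real class the range check is declared on the field type, so Pydantic would reject out-of-range values at construction time in a similar way.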

Resolver

Resolver(dag: DAG, name: ResolverStepName, inputs: Iterable[Model], resolver_class: type[ResolverMethod] | str, resolver_settings: ResolverSettings | dict[str, Any], description: str | None = None)

Bases: StepABC



Client-side node that computes clusters from model and resolver inputs.

Methods:

  • clear_data

    Drop locally computed data.

  • delete

    Delete this step and its associated data from the backend.

  • download

    Fetch remote data for this step and store it locally.

  • sync

    Send step config and local data to the server.

  • compute_clusters

    Delegate cluster computation to the configured resolver instance.

  • run

    Run the resolver and materialise cluster assignments.

  • to_dto

    Convert to Step DTO for API calls.

  • from_dto

    Reconstruct from Step DTO.

  • query

    Create a query rooted at this resolver.

Attributes:

dag instance-attribute

dag = dag

name instance-attribute

name = name

description instance-attribute

description = description

local_data property

local_data: DataFrame | None

The locally computed results for this step.

inputs instance-attribute

inputs = tuple(deduped_inputs)

resolver_class instance-attribute

resolver_class: type[ResolverMethod] = _RESOLVER_CLASSES[resolver_class]

resolver_instance instance-attribute

resolver_instance = resolver_class(settings=resolver_settings)

resolver_settings instance-attribute

resolver_settings = SettingsClass(**resolver_settings)

results property writable

results: DataFrame | None

The locally computed cluster assignments. Alias for local_data.

results_eval property

results_eval: DataFrame

Get mapping of result clusters to leaf IDs from the server.

config property

Generate config DTO from Resolver.

sources property

sources: set[SourceStepName]

Set of source names upstream of this node.

path property

Return resolver step path.

clear_data

clear_data() -> None

Drop locally computed data.

delete

delete(certain: bool = False) -> bool

Delete this step and its associated data from the backend.

download

download() -> DataFrame

Fetch remote data for this step and store it locally.

sync

sync() -> None

Send step config and local data to the server.

Not resistant to race conditions: only one client should call sync at a time.

compute_clusters

compute_clusters(model_edges: Mapping[StepName, DataFrame]) -> DataFrame

Delegate cluster computation to the configured resolver instance.

run

run() -> DataFrame

Run the resolver and materialise cluster assignments.

to_dto

to_dto() -> Step

Convert to Step DTO for API calls.

from_dto classmethod

from_dto(step: Step, step_name: str, dag: DAG, **kwargs: Any) -> Resolver

Reconstruct from Step DTO.

query

query(*sources: Source, **kwargs: Any) -> Query

Create a query rooted at this resolver.

add_resolver_class

add_resolver_class(resolver_class: type[ResolverMethod]) -> None

Register a resolver methodology class.
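
The registry pattern behind add_resolver_class, together with the `type[ResolverMethod] | str` argument accepted by Resolver, can be sketched as below. This is an illustration only: _REGISTRY, register, resolve_class and MyMethod are hypothetical names, and keying the registry by the class's `__name__` and raising ValueError for unknown names are both assumptions.

```python
from __future__ import annotations

from typing import Any

_REGISTRY: dict[str, type] = {}  # hypothetical stand-in for _RESOLVER_CLASSES


def register(resolver_class: type) -> None:
    """Record a class under its own name so it can be looked up by string."""
    _REGISTRY[resolver_class.__name__] = resolver_class


def resolve_class(resolver_class: type | str) -> type:
    """Accept either a class or its registered name, and return the class."""
    if isinstance(resolver_class, str):
        name = resolver_class
    else:
        name = resolver_class.__name__
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown resolver class: {name!r}") from None


class MyMethod:
    """Toy methodology used only to exercise the registry."""

    def __init__(self, settings: dict[str, Any]) -> None:
        self.settings = settings


register(MyMethod)
# Look the class up by name, then build an instance from its settings.
instance = resolve_class("MyMethod")(settings={"thresholds": {}})
```

Looking classes up by name lets a step configuration refer to a methodology as a plain string, for example when it is rebuilt from a DTO.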

base

Base classes for resolver methodologies.

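Classes:

  • ResolverSettings

    Base settings type for resolver methodologies.

  • ResolverMethod

    Base class for resolver methodologies.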

ResolverSettings

Bases: BaseModel, ABC



Base settings type for resolver methodologies.

Methods:

  • validate_inputs

    Validates whether the models’ clusters can be computed with this object.

validate_inputs abstractmethod

validate_inputs(model_names: Iterable[ModelStepName]) -> None

Validates whether the models’ clusters can be computed with this object.

Should be used in conjunction with ResolverMethod.compute_clusters().

Parameters:

  • model_names
    (Iterable[ModelStepName]) –

    The model names to check against these settings.

Raises:

  • RuntimeError

    if supplied model names don’t match the settings

ResolverMethod

Bases: BaseModel, ABC



Base class for resolver methodologies.

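Methods:

  • compute_clusters

    Compute cluster assignments from model edges.

Attributes:

  • resolver_type (ResolverType)

  • settings (ResolverSettings)

resolver_type class-attribute

resolver_type: ResolverType

settings instance-attribute

settings: ResolverSettings

compute_clusters abstractmethod

compute_clusters(model_edges: Mapping[ModelStepName, DataFrame]) -> DataFrame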

Compute cluster assignments from model edges.

Parameters:

  • model_edges
    (Mapping[ModelStepName, DataFrame]) –

    A mapping of model names to model edges which conform to SCHEMA_MODEL_EDGES

Returns:

  • DataFrame

    A Polars DataFrame which conforms to SCHEMA_CLUSTERS

Raises:

  • RuntimeError

    if supplied model names don’t match the Resolver’s settings

components

Connected-components resolver methodology.

Classes:

  • ComponentsSettings

    Settings for the Components resolver methodology.

  • Components

    Resolver methodology that computes connected components.

ComponentsSettings

Bases: ResolverSettings



Settings for the Components resolver methodology.

Methods:

  • validate_inputs

    Validates whether the models’ clusters can be computed with this object.

Attributes:

thresholds class-attribute instance-attribute

thresholds: dict[ModelStepName, Annotated[float, Field(ge=0.0, le=1.0)]] = Field(default_factory=dict)

validate_inputs

validate_inputs(model_names: Iterable[ModelStepName]) -> None

Validates whether the models’ clusters can be computed with this object.

Should be used in conjunction with ResolverMethod.compute_clusters().

Parameters:

  • model_names
    (Iterable[ModelStepName]) –

    The model names to check against these settings.

Raises:

  • RuntimeError

    if supplied model names don’t match the settings

Components

Bases: ResolverMethod



Resolver methodology that computes connected components.

A model with no entry in the settings' thresholds is assumed to have a threshold of 0.0.

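Methods:

  • compute_clusters

    Compute cluster assignments from model edges.

Attributes:

  • resolver_type (ResolverType)

  • settings

resolver_type class-attribute

resolver_type: ResolverType = COMPONENTS

settings instance-attribute

compute_clusters

compute_clusters(model_edges: Mapping[ModelStepName, DataFrame]) -> DataFrame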

Compute cluster assignments from model edges.

Parameters:

  • model_edges
    (Mapping[ModelStepName, DataFrame]) –

    A mapping of model names to model edges which conform to SCHEMA_MODEL_EDGES

Returns:

  • DataFrame

    A Polars DataFrame which conforms to SCHEMA_CLUSTERS

Raises:

  • RuntimeError

    if supplied model names don’t match the Resolver’s settings

resolvers

Resolver nodes and methodology registry for client-side execution.

Classes:

  • Resolver

    Client-side node that computes clusters from model and resolver inputs.

Functions:

  • add_resolver_class

    Register a resolver methodology class.

Resolver

Resolver(dag: DAG, name: ResolverStepName, inputs: Iterable[Model], resolver_class: type[ResolverMethod] | str, resolver_settings: ResolverSettings | dict[str, Any], description: str | None = None)

Bases: StepABC



Client-side node that computes clusters from model and resolver inputs.

Methods:

  • compute_clusters

    Delegate cluster computation to the configured resolver instance.

  • run

    Run the resolver and materialise cluster assignments.

  • to_dto

    Convert to Step DTO for API calls.

  • from_dto

    Reconstruct from Step DTO.

  • query

    Create a query rooted at this resolver.

  • clear_data

    Drop locally computed data.

  • delete

    Delete this step and its associated data from the backend.

  • download

    Fetch remote data for this step and store it locally.

  • sync

    Send step config and local data to the server.

Attributes:

inputs instance-attribute

inputs = tuple(deduped_inputs)

resolver_class instance-attribute

resolver_class: type[ResolverMethod] = _RESOLVER_CLASSES[resolver_class]

resolver_instance instance-attribute

resolver_instance = resolver_class(settings=resolver_settings)

resolver_settings instance-attribute

resolver_settings = SettingsClass(**resolver_settings)

results property writable

results: DataFrame | None

The locally computed cluster assignments. Alias for local_data.

results_eval property

results_eval: DataFrame

Get mapping of result clusters to leaf IDs from the server.

config property

Generate config DTO from Resolver.

sources property

sources: set[SourceStepName]

Set of source names upstream of this node.

path property

Return resolver step path.

dag instance-attribute

dag = dag

name instance-attribute

name = name

description instance-attribute

description = description

local_data property

local_data: DataFrame | None

The locally computed results for this step.

compute_clusters

compute_clusters(model_edges: Mapping[StepName, DataFrame]) -> DataFrame

Delegate cluster computation to the configured resolver instance.

run

run() -> DataFrame

Run the resolver and materialise cluster assignments.

to_dto

to_dto() -> Step

Convert to Step DTO for API calls.

from_dto classmethod

from_dto(step: Step, step_name: str, dag: DAG, **kwargs: Any) -> Resolver

Reconstruct from Step DTO.

query

query(*sources: Source, **kwargs: Any) -> Query

Create a query rooted at this resolver.

clear_data

clear_data() -> None

Drop locally computed data.

delete

delete(certain: bool = False) -> bool

Delete this step and its associated data from the backend.

download

download() -> DataFrame

Fetch remote data for this step and store it locally.

sync

sync() -> None

Send step config and local data to the server.

Not resistant to race conditions: only one client should call sync at a time.

add_resolver_class

add_resolver_class(resolver_class: type[ResolverMethod]) -> None

Register a resolver methodology class.