Resolvers
matchbox.common.factories.resolvers
Factory helpers for resolver testkits.
Classes:
- MockResolverSettings – Settings type for MockResolver.
- MockResolver – Mock resolver methodology used by resolver testkits.
- ResolverTestkit – Resolver plus local expected data for tests.
Functions:
- resolver_factory – Generate a complete resolver testkit.
MockResolverSettings
Bases: ResolverSettings (matchbox.client.resolvers.base.ResolverSettings)
Settings type for MockResolver.
Methods:
- validate_inputs – Validate all model names are present in thresholds.
Attributes:
- thresholds (dict[ModelStepName, Annotated[float, Field(ge=0.0, le=1.0)]])
thresholds
class-attribute
instance-attribute
thresholds: dict[ModelStepName, Annotated[float, Field(ge=0.0, le=1.0)]] = Field(default_factory=dict)
validate_inputs
validate_inputs(model_names: Iterable[ModelStepName]) -> None
Validate all model names are present in thresholds.
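The check described here can be illustrated with a minimal standalone sketch. It does not use matchbox itself: `check_thresholds_cover_models` is a hypothetical helper, model names are plain strings, and raising `ValueError` is an assumption, since the reference does not say which exception `validate_inputs` raises.

```python
from collections.abc import Iterable, Mapping


def check_thresholds_cover_models(
    thresholds: Mapping[str, float], model_names: Iterable[str]
) -> None:
    """Raise if any model name has no threshold (hypothetical helper)."""
    missing = sorted(set(model_names) - set(thresholds))
    if missing:
        # The exception type is an assumption; the reference does not state it.
        raise ValueError(f"Missing thresholds for models: {missing}")


# Passes silently: every model name has a threshold.
check_thresholds_cover_models(
    {"dedupe_a": 0.5, "link_ab": 0.8}, ["dedupe_a", "link_ab"]
)
```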
MockResolver
Bases: ResolverMethod (matchbox.client.resolvers.base.ResolverMethod)
Mock resolver methodology used by resolver testkits.
Methods:
- compute_clusters – Compute mock clusters with connected components.
compute_clusters
compute_clusters(model_edges: Mapping[ModelStepName, DataFrame]) -> DataFrame
Compute mock clusters with connected components.
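As a rough illustration of clustering by connected components, the sketch below uses a small union-find over plain tuples. It is not the MockResolver implementation. The edge shape `(left_id, right_id, score)`, the explicit `thresholds` argument, the inclusive `>=` comparison, and keeping unlinked nodes as singleton clusters are all assumptions; the real method takes a mapping of model names to DataFrames whose columns are not described here.

```python
from collections.abc import Iterable, Mapping

# Assumed edge shape: (left_id, right_id, score).
Edge = tuple[int, int, float]


def connected_components(
    model_edges: Mapping[str, Iterable[Edge]],
    thresholds: Mapping[str, float],
) -> list[set[int]]:
    """Group node ids linked by edges that meet each model's threshold."""
    parent: dict[int, int] = {}

    def find(node: int) -> int:
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for model_name, edges in model_edges.items():
        threshold = thresholds.get(model_name, 0.0)
        for left, right, score in edges:
            root_left, root_right = find(left), find(right)
            # Assumption: an edge links two nodes when score >= threshold.
            if score >= threshold and root_left != root_right:
                parent[root_right] = root_left

    clusters: dict[int, set[int]] = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return sorted(clusters.values(), key=min)


edges = {
    "dedupe": [(1, 2, 0.9), (2, 3, 0.4)],
    "link": [(3, 4, 0.7)],
}
print(connected_components(edges, {"dedupe": 0.5, "link": 0.5}))
# → [{1, 2}, {3, 4}]
```

With both thresholds at 0.0, the weak edge between 2 and 3 also counts, so all four ids fall into a single cluster.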
ResolverTestkit
Bases: BaseModel
Resolver plus local expected data for tests.
Methods:
- query – Thin wrapper to Query this testkit’s Sources via its Resolver.
- fake_run – Set resolver results without running the resolver.
- into_dag – Return kwargs for explicit DAG insertion.
Attributes:
- resolver (Resolver)
- assignments (DataFrame)
- entities (tuple[ClusterEntity, ...])
- name (str) – Return resolver name.
resolver_factory
resolver_factory(dag: DAG | None = None, inputs: Iterable[ModelTestkit] | None = None, true_entities: Iterable[SourceEntity] | None = None, name: ResolverStepName | None = None, description: str | None = None, thresholds: Mapping[ModelStepName, float] | None = None, seed: int = 42) -> ResolverTestkit
Generate a complete resolver testkit.
Allows autoconfiguration with minimal settings, or more nuanced control.
Can either be used to generate a resolver in a pipeline, interconnected with existing testkit objects, or generate a standalone resolver with random data.
Parameters:
- dag (DAG | None, default: None) – DAG containing this resolver. Inferred from the first input testkit if not provided. A default DAG is created when inputs are also absent.
- inputs (Iterable[ModelTestkit] | None, default: None) – An iterable of ModelTestkit objects to use as resolver inputs. If None, a single default deduper model testkit is created automatically. All inputs must belong to the same DAG.
- true_entities (Iterable[SourceEntity] | None, default: None) – Ground truth SourceEntity objects used to generate the expected cluster assignments. If None, the resolver testkit will have no expected entities.
- name (ResolverStepName | None, default: None) – Name of the resolver. Defaults to a randomly generated word suffixed with ‘_resolver’.
- description (str | None, default: None) – Description of the resolver.
- thresholds (Mapping[ModelStepName, float] | None, default: None) – Per-model score thresholds in [0.0, 1.0]. If omitted, defaults to 0.0 for all resolver inputs.
- seed (int, default: 42) – Random seed for reproducibility.
Returns:
- ResolverTestkit – A resolver testkit with generated assignments and expected entities.
Raises:
- TypeError – If any element of inputs is not a ModelTestkit.
- ValueError – If inputs belong to different DAGs.