Scenarios

matchbox.common.factories.scenarios

Scenario factories for creating TestkitDAG scenarios.

ScenarioBuilder module-attribute

ScenarioBuilder = Callable[..., TestkitDAG]

SCENARIO_REGISTRY module-attribute

SCENARIO_REGISTRY: dict[str, ScenarioBuilder] = {}

register_scenario

register_scenario(name: str) -> Callable[[ScenarioBuilder], ScenarioBuilder]

Decorator to register a new scenario builder function.
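The registration pattern implied by the signatures above can be sketched as follows. This is a minimal illustration, not matchbox's actual implementation: `FakeTestkitDAG`, `create_tiny_scenario`, and the scenario name `"tiny"` are hypothetical stand-ins so the snippet runs without matchbox installed.

```python
from collections.abc import Callable
from typing import Any


class FakeTestkitDAG:
    """Hypothetical stand-in for matchbox's TestkitDAG."""


ScenarioBuilder = Callable[..., FakeTestkitDAG]
SCENARIO_REGISTRY: dict[str, ScenarioBuilder] = {}


def register_scenario(name: str) -> Callable[[ScenarioBuilder], ScenarioBuilder]:
    """Return a decorator that stores the decorated builder under `name`."""

    def decorator(func: ScenarioBuilder) -> ScenarioBuilder:
        SCENARIO_REGISTRY[name] = func
        return func  # the builder itself is returned unchanged

    return decorator


# "tiny" is a made-up scenario name used only for this sketch.
@register_scenario("tiny")
def create_tiny_scenario(
    n_entities: int = 10, seed: int = 42, **kwargs: Any
) -> FakeTestkitDAG:
    return FakeTestkitDAG()
```

After the decorator runs, `SCENARIO_REGISTRY["tiny"]` refers to `create_tiny_scenario`, so a builder can be looked up by name.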

create_bare_scenario

create_bare_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a bare TestkitDAG scenario.

The warehouse and backend are empty, and there are no users.

create_admin_scenario

create_admin_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create an admin TestkitDAG scenario.

The warehouse and backend are empty except for a single admin user, alice.

create_closed_collection_scenario

create_closed_collection_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a closed collection scenario for permission testing.

Users:

- alice: admin (from setup_mode_admin)
- bob: member of ‘readers’ group (has READ)
- charlie: member of ‘writers’ group (has READ + WRITE)
- dave: no permissions (public group only)

Collection ‘restricted’ has:

- READ permission granted to ‘readers’ group
- WRITE permission granted to ‘writers’ group
- No public permissions
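When writing permission tests against this scenario, it can help to encode the expectations as a lookup table. The sketch below records only the effective permissions stated for bob, charlie, and dave. The names `EXPECTED_PERMISSIONS` and `can` are hypothetical, and alice is left out because the exact behaviour of admin checks is not described here.

```python
# Effective permissions on the 'restricted' collection, as described above.
# EXPECTED_PERMISSIONS and can() are illustrative names, not matchbox API.
EXPECTED_PERMISSIONS: dict[str, set[str]] = {
    "bob": {"READ"},               # member of 'readers'
    "charlie": {"READ", "WRITE"},  # member of 'writers'
    "dave": set(),                 # public group only, no permissions
}


def can(user: str, permission: str) -> bool:
    """Return True if the scenario description grants `permission` to `user`."""
    return permission in EXPECTED_PERMISSIONS.get(user, set())
```

A test could then loop over this table and compare each expectation with what the backend reports.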

create_preindex_scenario

create_preindex_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a preindex TestkitDAG scenario.

One admin user, alice.

The warehouse contains three interlinked tables that cover common linkage problem scenarios, but are not yet indexed.

create_index_scenario

create_index_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create an index TestkitDAG scenario.

One admin user, alice.

The warehouse contains three interlinked tables that cover common linkage problem scenarios. They are indexed in the backend.

create_dedupe_scenario

create_dedupe_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a dedupe TestkitDAG scenario.

One admin user, alice.

The warehouse contains three interlinked tables that cover common linkage problem scenarios. They are indexed and deduplicated in the backend.

create_probabilistic_dedupe_scenario

create_probabilistic_dedupe_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a probabilistic dedupe TestkitDAG scenario.

One admin user, alice.

The warehouse contains three interlinked tables that cover common linkage problem scenarios. They are indexed and deduplicated in the backend using probabilistic methodologies.

create_link_scenario

create_link_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a link TestkitDAG scenario.

One admin user, alice.

The warehouse contains three interlinked tables that cover common linkage problem scenarios. They are indexed, deduplicated and linked in the backend.

create_alt_dedupe_scenario

create_alt_dedupe_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a TestkitDAG scenario with two alternative dedupers.

One admin user, alice.

The warehouse contains a single table, indexed in the backend. It has been deduplicated twice, by two rival probabilistic models.

create_convergent_partial_scenario

create_convergent_partial_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a TestkitDAG scenario with convergent sources.

One admin user, alice.

Two sources index almost identically. TestkitDAG contains two indexed sources with repetition, and two naive dedupe models that haven’t yet had their results inserted.

create_convergent_scenario

create_convergent_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a TestkitDAG scenario with convergent sources, deduped.

One admin user, alice.

Two sources index almost identically. TestkitDAG contains two indexed sources with repetition, and two naive dedupe models, all inserted.

create_mega_scenario

create_mega_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG

Create a TestkitDAG scenario that produces large clusters.

One admin user, alice.

Two tables with many features are in the warehouse. They are indexed and linked in the backend.

Aims to produce “mega” clusters with more features than the CLI has screen rows, and more variations than the CLI has screen columns.

setup_scenario

setup_scenario(backend: MatchboxDBAdapter, scenario_type: Literal['bare', 'admin', 'closed_collection', 'preindex', 'index', 'dedupe', 'link', 'probabilistic_dedupe', 'alt_dedupe', 'convergent'], warehouse: Engine, n_entities: int = 10, seed: int = 42, **kwargs: dict[str, Any]) -> Generator[TestkitDAG, None, None]

Context manager for creating TestkitDAG scenarios.
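The `Generator[TestkitDAG, None, None]` return type suggests a generator-based context manager. A minimal sketch of that shape is shown below, assuming the scenario is looked up by name in a registry. `FakeDAG`, `REGISTRY`, and `setup_scenario_sketch` are hypothetical stand-ins, and the real function's `backend` and `warehouse` arguments, along with any caching or cleanup it performs, are omitted.

```python
from collections.abc import Callable, Generator
from contextlib import contextmanager
from typing import Any


class FakeDAG:
    """Hypothetical stand-in for TestkitDAG."""


# Hypothetical registry with a single builder that ignores its arguments.
REGISTRY: dict[str, Callable[..., FakeDAG]] = {
    "bare": lambda **kwargs: FakeDAG(),
}


@contextmanager
def setup_scenario_sketch(
    scenario_type: str, n_entities: int = 10, seed: int = 42, **kwargs: Any
) -> Generator[FakeDAG, None, None]:
    """Build the named scenario and hand it to the `with` block."""
    if scenario_type not in REGISTRY:
        raise ValueError(f"Unknown scenario type: {scenario_type}")
    builder = REGISTRY[scenario_type]
    dag = builder(n_entities=n_entities, seed=seed, **kwargs)
    yield dag  # the body of the `with` block runs here


with setup_scenario_sketch("bare", n_entities=5) as dag:
    scenario_is_dag = isinstance(dag, FakeDAG)
```

With the real function, the equivalent call would presumably take the form `with setup_scenario(backend, "bare", warehouse=engine) as dag:`, based on the signature above.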