Scenarios
matchbox.common.factories.scenarios
Scenario factories for creating TestkitDAG scenarios.
Functions:
- register_scenario – Decorator to register a new scenario builder function.
- create_bare_scenario – Create a bare TestkitDAG scenario.
- create_admin_scenario – Create an admin TestkitDAG scenario.
- create_closed_collection_scenario – Create a closed collection scenario for permission testing.
- create_preindex_scenario – Create a preindex TestkitDAG scenario.
- create_index_scenario – Create an index TestkitDAG scenario.
- create_dedupe_scenario – Create a dedupe TestkitDAG scenario.
- create_probabilistic_dedupe_scenario – Create a probabilistic dedupe TestkitDAG scenario.
- create_link_scenario – Create a link TestkitDAG scenario.
- create_alt_dedupe_scenario – Create a TestkitDAG scenario with two alternative dedupers.
- create_convergent_partial_scenario – Create a TestkitDAG scenario with convergent sources.
- create_convergent_scenario – Create a TestkitDAG scenario with convergent sources, deduped.
- create_mega_scenario – Create a TestkitDAG scenario that produces large clusters.
- setup_scenario – Context manager for creating TestkitDAG scenarios.
register_scenario
register_scenario(name: str) -> Callable[[ScenarioBuilder], ScenarioBuilder]
Decorator to register a new scenario builder function.
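The signature suggests a decorator factory: it takes a name and returns a decorator that hands the builder back unchanged. A minimal sketch of that pattern follows, assuming a module-level dictionary as the registry. `SCENARIO_REGISTRY`, the `ScenarioBuilder` alias shown here, and `create_tiny_scenario` are hypothetical stand-ins, not matchbox's actual internals.

```python
from collections.abc import Callable
from typing import Any

# Hypothetical stand-in for matchbox's ScenarioBuilder type.
ScenarioBuilder = Callable[..., Any]

# Hypothetical registry mapping scenario names to builder functions.
SCENARIO_REGISTRY: dict[str, ScenarioBuilder] = {}


def register_scenario(name: str) -> Callable[[ScenarioBuilder], ScenarioBuilder]:
    """Return a decorator that stores the decorated builder under `name`."""

    def decorator(func: ScenarioBuilder) -> ScenarioBuilder:
        SCENARIO_REGISTRY[name] = func
        return func  # returned unchanged, so it stays directly callable

    return decorator


@register_scenario("tiny")
def create_tiny_scenario(n_entities: int = 10, seed: int = 42, **kwargs: Any) -> dict[str, int]:
    # A real builder would return a TestkitDAG; a dict keeps the sketch standalone.
    return {"n_entities": n_entities, "seed": seed}
```

With a registry like this, a caller could look a builder up by its string name, which is presumably how a scenario type is resolved to a builder.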
create_bare_scenario
create_bare_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a bare TestkitDAG scenario.
The warehouse and backend are empty, and there are no users.
create_admin_scenario
create_admin_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create an admin TestkitDAG scenario.
The warehouse and backend are empty except for a single admin user, alice.
create_closed_collection_scenario
create_closed_collection_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a closed collection scenario for permission testing.
Users:
- alice: admin (from setup_mode_admin)
- bob: member of ‘readers’ group (has READ)
- charlie: member of ‘writers’ group (has READ + WRITE)
- dave: no permissions (public group only)
Collection ‘restricted’ has:
- READ permission granted to ‘readers’ group
- WRITE permission granted to ‘writers’ group
- No public permissions
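The permission layout above can be modelled in a few lines to show which checks a test against this scenario might expect to pass. This is a standalone sketch, not matchbox's permission API. It makes two assumptions: WRITE implies READ (which would explain charlie having READ + WRITE when only WRITE is granted to ‘writers’), and admins pass every check. The function `can` and the dictionaries are hypothetical.

```python
# Group membership as described above; every user also belongs to "public".
GROUPS: dict[str, set[str]] = {
    "bob": {"public", "readers"},
    "charlie": {"public", "writers"},
    "dave": {"public"},
}

# Grants on the 'restricted' collection, per group. "public" has none.
GRANTS: dict[str, set[str]] = {
    "readers": {"READ"},
    "writers": {"WRITE"},
}

# Assumption: holding WRITE also satisfies a READ check.
IMPLIES: dict[str, set[str]] = {"WRITE": {"READ"}}

ADMINS: set[str] = {"alice"}


def can(user: str, permission: str) -> bool:
    """Hypothetical check: admins always pass; others need a group grant."""
    if user in ADMINS:
        return True
    held: set[str] = set()
    for group in GROUPS.get(user, set()):
        for granted in GRANTS.get(group, set()):
            held.add(granted)
            held |= IMPLIES.get(granted, set())
    return permission in held


for user in ("alice", "bob", "charlie", "dave"):
    print(user, {p: can(user, p) for p in ("READ", "WRITE")})
```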
create_preindex_scenario
create_preindex_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a preindex TestkitDAG scenario.
One admin user, alice.
The warehouse contains three interlinked tables that cover common linkage problem scenarios, but are not yet indexed.
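Every builder accepts `n_entities` and `seed`, which suggests the generated warehouse tables are meant to be reproducible across test runs. The sketch below illustrates that idea with the standard library only. `make_entities` and its fields are hypothetical and do not reflect how matchbox generates its data.

```python
import random


def make_entities(n_entities: int = 10, seed: int = 42) -> list[dict[str, int]]:
    """Hypothetical generator: build `n_entities` fake records from a seed."""
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    return [
        {"id": i, "company_number": rng.randrange(10**8)}
        for i in range(n_entities)
    ]


# The same seed yields identical records, so assertions on them are stable.
first = make_entities(n_entities=5, seed=42)
second = make_entities(n_entities=5, seed=42)
print(first == second)
```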
create_index_scenario
create_index_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create an index TestkitDAG scenario.
One admin user, alice.
The warehouse contains three interlinked tables that cover common linkage problem scenarios. They are indexed in the backend.
create_dedupe_scenario
create_dedupe_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a dedupe TestkitDAG scenario.
One admin user, alice.
The warehouse contains three interlinked tables that cover common linkage problem scenarios. They are indexed and deduplicated in the backend.
create_probabilistic_dedupe_scenario
create_probabilistic_dedupe_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a probabilistic dedupe TestkitDAG scenario.
One admin user, alice.
The warehouse contains three interlinked tables that cover common linkage problem scenarios. They are indexed and deduplicated in the backend using probabilistic methodologies.
create_link_scenario
create_link_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a link TestkitDAG scenario.
One admin user, alice.
The warehouse contains three interlinked tables that cover common linkage problem scenarios. They are indexed, deduplicated and linked in the backend.
create_alt_dedupe_scenario
create_alt_dedupe_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a TestkitDAG scenario with two alternative dedupers.
One admin user, alice.
The warehouse contains a single table, indexed in the backend. It has been deduplicated twice, by two rival probabilistic models.
create_convergent_partial_scenario
create_convergent_partial_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a TestkitDAG scenario with convergent sources.
One admin user, alice.
Two sources index almost identically. TestkitDAG contains two indexed sources with repetition, and two naive dedupe models that haven’t yet had their results inserted.
create_convergent_scenario
create_convergent_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a TestkitDAG scenario with convergent sources, deduped.
One admin user, alice.
Two sources index almost identically. TestkitDAG contains two indexed sources with repetition, and two naive dedupe models, all with their results inserted.
create_mega_scenario
create_mega_scenario(backend: MatchboxDBAdapter, warehouse_engine: Engine, n_entities: int = 10, seed: int = 42, **kwargs: Any) -> TestkitDAG
Create a TestkitDAG scenario that produces large clusters.
One admin user, alice.
Two tables with many features are in the warehouse. They are indexed and linked in the backend.
Aims to produce “mega” clusters with more features than the CLI has screen rows, and more variations than the CLI has screen columns.
setup_scenario
setup_scenario(backend: MatchboxDBAdapter, scenario_type: Literal['bare', 'admin', 'closed_collection', 'preindex', 'index', 'dedupe', 'link', 'probabilistic_dedupe', 'alt_dedupe', 'convergent'], warehouse: Engine, n_entities: int = 10, seed: int = 42, **kwargs: dict[str, Any]) -> Generator[TestkitDAG, None, None]
Context manager for creating TestkitDAG scenarios.
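The `Generator[TestkitDAG, None, None]` return type is the usual shape of a function wrapped with `contextlib.contextmanager`: build the scenario, yield it to the `with` block, then clean up. The sketch below shows that shape with standalone stand-ins. `setup_scenario_sketch`, `BUILDERS`, and the cleanup step are assumptions for illustration, and a plain dict takes the place of a TestkitDAG.

```python
from collections.abc import Callable, Generator
from contextlib import contextmanager
from typing import Any

# Hypothetical builders keyed by scenario type; real ones return a TestkitDAG.
BUILDERS: dict[str, Callable[..., dict[str, Any]]] = {
    "bare": lambda n_entities, seed: {"type": "bare", "users": []},
    "admin": lambda n_entities, seed: {"type": "admin", "users": ["alice"]},
}


@contextmanager
def setup_scenario_sketch(
    scenario_type: str, n_entities: int = 10, seed: int = 42
) -> Generator[dict[str, Any], None, None]:
    """Build the named scenario, yield it, then run cleanup."""
    if scenario_type not in BUILDERS:
        raise ValueError(f"Unknown scenario type: {scenario_type}")
    dag = BUILDERS[scenario_type](n_entities=n_entities, seed=seed)
    try:
        yield dag
    finally:
        # Assumed cleanup; a real implementation might clear the backend here.
        dag["cleaned_up"] = True


with setup_scenario_sketch("admin") as dag:
    print(dag["users"])
```

The `try`/`finally` around the `yield` means cleanup runs even if the body of the `with` block raises, which matters when tests share a backend.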