PostgreSQL

A backend adapter for deploying Matchbox with PostgreSQL.

This backend stores two connected structures:

  • An execution graph in steps and step_from, covering sources, models, and resolvers.
  • A data graph in clusters, contains, model_edges, resolver_clusters, and cluster_source_key.

Source steps index source clusters. Model steps store score edges between clusters. Resolver steps point to the clusters that form a published entity view.
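
As a rough illustration of how the data graph tables relate, the sketch below models resolver_clusters, contains, and cluster_source_key as plain Python lists and walks from a resolver step to the source keys behind each published cluster. The sample rows, the step ID, and the helper name keys_for_resolver are hypothetical, and this is not the adapter's actual query logic.

```python
from collections import defaultdict

# Hypothetical sample rows, shaped like the tables named above.
resolver_clusters = [(7, 100), (7, 101)]      # (step_id, cluster_id)
contains = [(100, 1), (100, 2), (101, 3)]     # (root, leaf)
cluster_source_key = [                        # (cluster_id, source_config_id, key)
    (1, 10, "a-1"),
    (2, 11, "b-9"),
    (3, 10, "a-2"),
]


def keys_for_resolver(step_id: int) -> dict[int, list[tuple[int, str]]]:
    """Map each cluster published by a resolver step to its leaf source keys."""
    leaves_by_root = defaultdict(list)
    for root, leaf in contains:
        leaves_by_root[root].append(leaf)

    keys_by_cluster = defaultdict(list)
    for cluster_id, source_config_id, key in cluster_source_key:
        keys_by_cluster[cluster_id].append((source_config_id, key))

    view = {}
    for step, root in resolver_clusters:
        if step != step_id:
            continue
        # Assumption: a root with no contains rows is itself a source cluster.
        leaves = leaves_by_root.get(root, [root])
        view[root] = [k for leaf in leaves for k in keys_by_cluster.get(leaf, [])]
    return view


entity_view = keys_for_resolver(7)
print(entity_view)
```

Each key in the result is a published cluster, and each value lists the (source_config_id, key) pairs of the source records it groups together.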

erDiagram
    Collections {
        bigint collection_id PK
        text name
    }
    Runs {
        bigint run_id PK
        bigint collection_id FK
        boolean is_mutable
        boolean is_default
    }
    Steps {
        bigint step_id PK
        bigint run_id FK
        text name
        text description
        text type
        bytea fingerprint
        enum upload_stage
    }
    StepFrom {
        bigint parent PK,FK
        bigint child PK,FK
        integer level
    }
    SourceConfigs {
        bigint source_config_id PK
        bigint step_id FK
        text location_type
        text location_name
        text extract_transform
    }
    SourceFields {
        bigint field_id PK
        bigint source_config_id FK
        integer index
        text name
        text type
        boolean is_key
    }
    ModelConfigs {
        bigint model_config_id PK
        bigint step_id FK
        text model_class
        jsonb model_settings
        jsonb left_query
        jsonb right_query
    }
    ResolverConfigs {
        bigint resolver_config_id PK
        bigint step_id FK
        text resolver_class
        jsonb resolver_settings
    }
    Clusters {
        bigint cluster_id PK
        bytea cluster_hash
    }
    ClusterSourceKey {
        bigint key_id PK
        bigint cluster_id FK
        bigint source_config_id FK
        text key
    }
    Contains {
        bigint root PK,FK
        bigint leaf PK,FK
    }
    ModelEdges {
        bigint result_id PK
        bigint step_id FK
        bigint left_id FK
        bigint right_id FK
        real score
    }
    ResolverClusters {
        bigint step_id PK,FK
        bigint cluster_id PK,FK
    }
    Users {
        bigint user_id PK
        text name
        text email
    }
    Groups {
        bigint group_id PK
        text name
        text description
        boolean is_system
    }
    UserGroups {
        bigint user_id PK,FK
        bigint group_id PK,FK
    }
    Permissions {
        bigint permission_id PK
        text permission
        bigint group_id FK
        bigint collection_id FK
        boolean is_system
    }
    EvalJudgements {
        bigint judgement_id PK
        bigint user_id FK
        bigint endorsed_cluster_id FK
        bigint shown_cluster_id FK
        datetime timestamp
    }

    Collections ||--o{ Runs : ""
    Collections ||--o{ Permissions : ""
    Runs ||--o{ Steps : ""
    Steps ||--o{ StepFrom : "parent"
    StepFrom }o--|| Steps : "child"
    Steps |o--|| SourceConfigs : ""
    Steps |o--|| ModelConfigs : ""
    Steps |o--|| ResolverConfigs : ""
    Steps ||--o{ ModelEdges : ""
    Steps ||--o{ ResolverClusters : ""
    SourceConfigs ||--o{ SourceFields : ""
    SourceConfigs ||--o{ ClusterSourceKey : ""
    Clusters ||--o{ ClusterSourceKey : ""
    Clusters ||--o{ Contains : "root"
    Contains }o--|| Clusters : "leaf"
    Clusters ||--o{ ModelEdges : "left_id"
    Clusters ||--o{ ModelEdges : "right_id"
    Clusters ||--o{ ResolverClusters : ""
    Clusters ||--o{ EvalJudgements : "endorsed_cluster_id"
    Clusters ||--o{ EvalJudgements : "shown_cluster_id"
    Users ||--o{ UserGroups : ""
    Users ||--o{ EvalJudgements : ""
    Groups ||--o{ UserGroups : ""
    Groups ||--o{ Permissions : ""
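
StepFrom carries a level column alongside parent and child. One common reading of that shape is a closure table, where each row records an ancestor step, a descendant step, and the distance between them. The sketch below builds such rows from direct edges. The closure-table interpretation, the sample step IDs, and the function name closure_rows are assumptions rather than documented Matchbox behaviour.

```python
def closure_rows(direct_edges: list[tuple[int, int]]) -> set[tuple[int, int, int]]:
    """Expand direct (parent, child) edges into (parent, child, level) rows.

    level is the smallest number of hops from parent to child.
    """
    children: dict[int, list[int]] = {}
    for parent, child in direct_edges:
        children.setdefault(parent, []).append(child)

    rows = set()
    for start in children:
        # Breadth-first walk, so the first visit to a node gives its shortest distance.
        seen = {start}
        frontier = [start]
        level = 0
        while frontier:
            level += 1
            next_frontier = []
            for node in frontier:
                for child in children.get(node, []):
                    if child not in seen:
                        seen.add(child)
                        rows.add((start, child, level))
                        next_frontier.append(child)
            frontier = next_frontier
    return rows


# Hypothetical run: sources 1 and 2 feed model 3, which feeds resolver 4.
rows = closure_rows([(1, 3), (2, 3), (3, 4)])
print(sorted(rows))
```

Storing every ancestor pair this way lets a single lookup answer "which steps does this step depend on" without a recursive query, at the cost of more rows.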

matchbox.server.postgresql

PostgreSQL adapter for Matchbox server.

Modules:

  • adapter

    Composed PostgreSQL adapter for Matchbox server.

  • db

    Matchbox PostgreSQL database connection.

  • mixin

    A module for defining mixins for the PostgreSQL backend ORM.

  • orm

    ORM classes for the Matchbox PostgreSQL database.

  • utils

    Utilities for using the PostgreSQL backend.

Classes:

__all__ module-attribute

__all__ = ['MatchboxPostgres', 'MatchboxPostgresSettings']

MatchboxPostgres

MatchboxPostgres(settings: MatchboxPostgresSettings)

Bases: MatchboxPostgresQueryMixin, MatchboxPostgresEvaluationMixin, MatchboxPostgresCollectionsMixin, MatchboxPostgresAdminMixin, MatchboxPostgresGroupsMixin, MatchboxDBAdapter


flowchart TD
    matchbox.server.postgresql.MatchboxPostgres[MatchboxPostgres]
    matchbox.server.postgresql.adapter.query.MatchboxPostgresQueryMixin[MatchboxPostgresQueryMixin]
    matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin[MatchboxPostgresEvaluationMixin]
    matchbox.server.postgresql.adapter.collections.MatchboxPostgresCollectionsMixin[MatchboxPostgresCollectionsMixin]
    matchbox.server.postgresql.adapter.admin.MatchboxPostgresAdminMixin[MatchboxPostgresAdminMixin]
    matchbox.server.postgresql.adapter.groups.MatchboxPostgresGroupsMixin[MatchboxPostgresGroupsMixin]
    matchbox.server.base.MatchboxDBAdapter[MatchboxDBAdapter]

    matchbox.server.postgresql.adapter.query.MatchboxPostgresQueryMixin --> matchbox.server.postgresql.MatchboxPostgres
    matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin --> matchbox.server.postgresql.MatchboxPostgres
    matchbox.server.postgresql.adapter.eval._MixinBase --> matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin
    matchbox.server.postgresql.adapter.collections.MatchboxPostgresCollectionsMixin --> matchbox.server.postgresql.MatchboxPostgres
    matchbox.server.postgresql.adapter.admin.MatchboxPostgresAdminMixin --> matchbox.server.postgresql.MatchboxPostgres
    matchbox.server.postgresql.adapter.groups.MatchboxPostgresGroupsMixin --> matchbox.server.postgresql.MatchboxPostgres
    matchbox.server.base.MatchboxDBAdapter --> matchbox.server.postgresql.MatchboxPostgres

A PostgreSQL adapter for Matchbox.

Methods:

Attributes:

settings instance-attribute

settings = settings

sources instance-attribute

sources = SourceConfigs

models instance-attribute

models = ModelConfigs

resolvers instance-attribute

resolvers = ResolverConfigs

source_clusters instance-attribute

source_clusters = FilteredClusters(has_source=True)

model_clusters instance-attribute

model_clusters = FilteredClusters(has_source=False)

all_clusters instance-attribute

all_clusters = FilteredClusters()

creates instance-attribute

creates = ResolverClusters

merges instance-attribute

merges = Contains

proposes instance-attribute

proposes = FilteredProbabilities()

source_steps instance-attribute

source_steps = FilteredSteps(sources=True, models=False, resolvers=False)

users instance-attribute

users = Users

query

query(source: SourceStepPath, resolver: ResolverStepPath | None = None, return_leaf_id: bool = False, limit: int | None = None) -> Table

match

match(key: str, source: SourceStepPath, targets: list[SourceStepPath], resolver: ResolverStepPath) -> list[Match]

create_collection

create_collection(name: CollectionName, permissions: list[PermissionGrant]) -> Collection

get_collection

get_collection(name: CollectionName) -> Collection

list_collections

list_collections() -> list[CollectionName]

delete_collection

delete_collection(name: CollectionName, certain: bool) -> None

create_run

create_run(collection: CollectionName) -> Run

set_run_mutable

set_run_mutable(collection: CollectionName, run_id: RunID, mutable: bool) -> Run

set_run_default

set_run_default(collection: CollectionName, run_id: RunID, default: bool) -> Run

get_run

get_run(collection: CollectionName, run_id: RunID) -> Run

delete_run

delete_run(collection: CollectionName, run_id: RunID, certain: bool) -> None

create_step

create_step(step: Step, path: StepPath) -> None

get_step

get_step(path: StepPath) -> Step

update_step

update_step(step: Step, path: StepPath) -> None

delete_step

delete_step(path: StepPath, certain: bool) -> None

lock_step_data

lock_step_data(path: StepPath) -> None

unlock_step_data

unlock_step_data(path: StepPath, complete: bool = False) -> None

get_step_stage

get_step_stage(path: StepPath) -> UploadStage

insert_source_data

insert_source_data(path: SourceStepPath, data_hashes: Table) -> None

insert_model_data

insert_model_data(path: ModelStepPath, results: Table) -> None

insert_resolver_data

insert_resolver_data(path: ResolverStepPath, data: Table) -> None

get_model_data

get_model_data(path: ModelStepPath) -> Table

get_resolver_data

get_resolver_data(path: ResolverStepPath) -> Table

validate_ids

validate_ids(ids: list[int]) -> bool

dump

dump() -> MatchboxSnapshot

drop

drop(certain: bool) -> None

clear

clear(certain: bool) -> None

restore

restore(snapshot: MatchboxSnapshot) -> None

delete_orphans

delete_orphans() -> int

login

login(user: User) -> LoginResponse

get_user_groups

get_user_groups(user_name: str) -> list[GroupName]

list_groups

list_groups() -> list[Group]

get_group

get_group(name: GroupName) -> Group

create_group

create_group(group: Group) -> None

delete_group

delete_group(name: GroupName, certain: bool = False) -> None

add_user_to_group

add_user_to_group(user_name: str, group_name: GroupName) -> None

remove_user_from_group

remove_user_from_group(user_name: str, group_name: GroupName) -> None

check_permission

check_permission(user_name: str, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> bool

get_permissions

get_permissions(resource: Literal[SYSTEM] | CollectionName) -> list[PermissionGrant]

grant_permission

grant_permission(group_name: GroupName, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> None

revoke_permission

revoke_permission(group_name: GroupName, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> None
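
The Users, UserGroups, Groups, and Permissions tables suggest that permissions are granted to groups and reach users through group membership. A minimal in-memory sketch of that lookup follows. The class name PermissionStore, the SYSTEM placeholder value, the permission strings, and the exact-match rule (no permission implying another) are assumptions for illustration, not the adapter's implementation.

```python
from dataclasses import dataclass, field

SYSTEM = "system"  # hypothetical stand-in for the system-wide resource


@dataclass
class PermissionStore:
    """In-memory stand-in for the UserGroups and Permissions tables."""

    user_groups: dict[str, set[str]] = field(default_factory=dict)
    # Each grant is a (group_name, permission, resource) triple.
    grants: set[tuple[str, str, str]] = field(default_factory=set)

    def add_user_to_group(self, user_name: str, group_name: str) -> None:
        self.user_groups.setdefault(user_name, set()).add(group_name)

    def grant_permission(self, group_name: str, permission: str, resource: str) -> None:
        self.grants.add((group_name, permission, resource))

    def revoke_permission(self, group_name: str, permission: str, resource: str) -> None:
        self.grants.discard((group_name, permission, resource))

    def check_permission(self, user_name: str, permission: str, resource: str) -> bool:
        # A user holds a permission if any of their groups has a matching grant.
        groups = self.user_groups.get(user_name, set())
        return any((g, permission, resource) in self.grants for g in groups)


store = PermissionStore()
store.add_user_to_group("alice", "analysts")
store.grant_permission("analysts", "read", "companies")
can_read = store.check_permission("alice", "read", "companies")
can_write = store.check_permission("alice", "write", "companies")
can_admin = store.check_permission("alice", "admin", SYSTEM)
```

Keeping grants on groups rather than users means revoking one grant affects every member at once, which matches the shape of the Permissions table (it references group_id, not user_id).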

insert_judgement

insert_judgement(user_name: str, judgement: Judgement) -> None

get_judgements

get_judgements(tag: str | None = None) -> tuple[Table, Table]

sample_for_eval

sample_for_eval(n: int, path: ResolverStepPath, user_name: str) -> Table

Sample some clusters from a resolver step.

MatchboxPostgresSettings

Bases: MatchboxServerSettings


flowchart TD
    matchbox.server.postgresql.MatchboxPostgresSettings[MatchboxPostgresSettings]
    matchbox.server.base.MatchboxServerSettings[MatchboxServerSettings]

    matchbox.server.base.MatchboxServerSettings --> matchbox.server.postgresql.MatchboxPostgresSettings

Settings for the Matchbox PostgreSQL backend.

Inherits the core settings and adds the PostgreSQL-specific settings.

Methods:

Attributes:

model_config class-attribute instance-attribute

model_config = SettingsConfigDict(env_prefix='MB__SERVER__', env_nested_delimiter='__', use_enum_values=True, env_file='.env', env_file_encoding='utf-8', extra='ignore')
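
The env_prefix and env_nested_delimiter values in model_config determine how environment variable names map onto settings fields. The helper below is a standard-library sketch of that naming convention only. It does not reproduce the parsing or validation done by pydantic-settings, the function name env_to_nested is made up for this example, and the nested field name host under postgres is a hypothetical placeholder.

```python
ENV_PREFIX = "MB__SERVER__"
NESTED_DELIMITER = "__"


def env_to_nested(environ: dict[str, str]) -> dict:
    """Group prefixed variables into a nested dict keyed by lower-cased field names."""
    settings: dict = {}
    for name, value in environ.items():
        if not name.startswith(ENV_PREFIX):
            continue  # variables without the prefix are not settings
        path = name[len(ENV_PREFIX):].lower().split(NESTED_DELIMITER)
        node = settings
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return settings


example_env = {
    "MB__SERVER__BATCH_SIZE": "100000",
    "MB__SERVER__LOG_LEVEL": "DEBUG",
    "MB__SERVER__POSTGRES__HOST": "localhost",  # hypothetical nested field
    "PATH": "/usr/bin",
}
parsed = env_to_nested(example_env)
print(parsed)
```

Note that BATCH_SIZE stays a single field name because it contains a single underscore, whereas the double underscore in POSTGRES__HOST marks a step into a nested settings object.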

batch_size class-attribute instance-attribute

batch_size: int = Field(default=250000)

datastore instance-attribute

task_runner instance-attribute

task_runner: Literal['api', 'celery']

redis_uri instance-attribute

redis_uri: str | None

uploads_expiry_minutes instance-attribute

uploads_expiry_minutes: int | None

authorisation class-attribute instance-attribute

authorisation: bool = True

public_key class-attribute instance-attribute

public_key: SecretBytes | None = Field(default=None)

log_level class-attribute instance-attribute

log_level: LogLevelType = 'INFO'

backend_type class-attribute instance-attribute

backend_type: MatchboxBackends = POSTGRES

postgres class-attribute instance-attribute

validate_public_key classmethod

validate_public_key(v: str | bytes | None) -> bytes | None

Validate and normalise PEM public key format.
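
As an illustration of what validating and normalising a PEM public key can involve, the sketch below accepts text or bytes, checks for the standard PUBLIC KEY armour lines, and returns bytes with trimmed lines and uniform line endings. The function name normalise_pem and the specific checks are assumptions; the actual validator may apply different rules.

```python
from __future__ import annotations

PEM_HEADER = b"-----BEGIN PUBLIC KEY-----"
PEM_FOOTER = b"-----END PUBLIC KEY-----"


def normalise_pem(value: str | bytes | None) -> bytes | None:
    """Return the key as bytes with trimmed lines, or None if no key was given."""
    if value is None:
        return None
    raw = value.encode("ascii") if isinstance(value, str) else value
    # Drop surrounding whitespace, blank lines, and carriage returns.
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    if len(lines) < 3 or lines[0] != PEM_HEADER or lines[-1] != PEM_FOOTER:
        raise ValueError("expected a PEM-encoded public key")
    return b"\n".join(lines) + b"\n"


# Placeholder body for demonstration, not a real key.
messy = "  -----BEGIN PUBLIC KEY-----\r\n  QUJD\r\n-----END PUBLIC KEY-----  \n"
clean = normalise_pem(messy)
```

This sketch only checks the armour lines; it does not decode the Base64 body or verify that it encodes a usable key.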

check_settings

check_settings() -> Self

Check that legal combinations of settings are provided.

adapter

Composed PostgreSQL adapter for Matchbox server.

Modules:

  • admin

    Admin PostgreSQL mixin for Matchbox server.

  • collections

    Collections PostgreSQL mixin for Matchbox server.

  • eval

    Evaluation PostgreSQL mixin for Matchbox server.

  • groups

    Groups PostgreSQL mixin for Matchbox server.

  • main

    Composed PostgreSQL adapter for Matchbox server.

  • query

    Query PostgreSQL mixin for Matchbox server.

Classes:

__all__ module-attribute

__all__ = ('MatchboxPostgres', 'MatchboxPostgresSettings')

MatchboxPostgres

MatchboxPostgres(settings: MatchboxPostgresSettings)

Bases: MatchboxPostgresQueryMixin, MatchboxPostgresEvaluationMixin, MatchboxPostgresCollectionsMixin, MatchboxPostgresAdminMixin, MatchboxPostgresGroupsMixin, MatchboxDBAdapter


flowchart TD
    matchbox.server.postgresql.adapter.MatchboxPostgres[MatchboxPostgres]
    matchbox.server.postgresql.adapter.query.MatchboxPostgresQueryMixin[MatchboxPostgresQueryMixin]
    matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin[MatchboxPostgresEvaluationMixin]
    matchbox.server.postgresql.adapter.collections.MatchboxPostgresCollectionsMixin[MatchboxPostgresCollectionsMixin]
    matchbox.server.postgresql.adapter.admin.MatchboxPostgresAdminMixin[MatchboxPostgresAdminMixin]
    matchbox.server.postgresql.adapter.groups.MatchboxPostgresGroupsMixin[MatchboxPostgresGroupsMixin]
    matchbox.server.base.MatchboxDBAdapter[MatchboxDBAdapter]

    matchbox.server.postgresql.adapter.query.MatchboxPostgresQueryMixin --> matchbox.server.postgresql.adapter.MatchboxPostgres
    matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin --> matchbox.server.postgresql.adapter.MatchboxPostgres
    matchbox.server.postgresql.adapter.eval._MixinBase --> matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin
    matchbox.server.postgresql.adapter.collections.MatchboxPostgresCollectionsMixin --> matchbox.server.postgresql.adapter.MatchboxPostgres
    matchbox.server.postgresql.adapter.admin.MatchboxPostgresAdminMixin --> matchbox.server.postgresql.adapter.MatchboxPostgres
    matchbox.server.postgresql.adapter.groups.MatchboxPostgresGroupsMixin --> matchbox.server.postgresql.adapter.MatchboxPostgres
    matchbox.server.base.MatchboxDBAdapter --> matchbox.server.postgresql.adapter.MatchboxPostgres

A PostgreSQL adapter for Matchbox.

Methods:

Attributes:

settings instance-attribute
settings = settings
sources instance-attribute
sources = SourceConfigs
models instance-attribute
models = ModelConfigs
resolvers instance-attribute
resolvers = ResolverConfigs
source_clusters instance-attribute
source_clusters = FilteredClusters(has_source=True)
model_clusters instance-attribute
model_clusters = FilteredClusters(has_source=False)
all_clusters instance-attribute
all_clusters = FilteredClusters()
creates instance-attribute
creates = ResolverClusters
merges instance-attribute
merges = Contains
proposes instance-attribute
proposes = FilteredProbabilities()
source_steps instance-attribute
source_steps = FilteredSteps(sources=True, models=False, resolvers=False)
users instance-attribute
users = Users
query
query(source: SourceStepPath, resolver: ResolverStepPath | None = None, return_leaf_id: bool = False, limit: int | None = None) -> Table
match
match(key: str, source: SourceStepPath, targets: list[SourceStepPath], resolver: ResolverStepPath) -> list[Match]
create_collection
create_collection(name: CollectionName, permissions: list[PermissionGrant]) -> Collection
get_collection
get_collection(name: CollectionName) -> Collection
list_collections
list_collections() -> list[CollectionName]
delete_collection
delete_collection(name: CollectionName, certain: bool) -> None
create_run
create_run(collection: CollectionName) -> Run
set_run_mutable
set_run_mutable(collection: CollectionName, run_id: RunID, mutable: bool) -> Run
set_run_default
set_run_default(collection: CollectionName, run_id: RunID, default: bool) -> Run
get_run
get_run(collection: CollectionName, run_id: RunID) -> Run
delete_run
delete_run(collection: CollectionName, run_id: RunID, certain: bool) -> None
create_step
create_step(step: Step, path: StepPath) -> None
get_step
get_step(path: StepPath) -> Step
update_step
update_step(step: Step, path: StepPath) -> None
delete_step
delete_step(path: StepPath, certain: bool) -> None
lock_step_data
lock_step_data(path: StepPath) -> None
unlock_step_data
unlock_step_data(path: StepPath, complete: bool = False) -> None
get_step_stage
get_step_stage(path: StepPath) -> UploadStage
insert_source_data
insert_source_data(path: SourceStepPath, data_hashes: Table) -> None
insert_model_data
insert_model_data(path: ModelStepPath, results: Table) -> None
insert_resolver_data
insert_resolver_data(path: ResolverStepPath, data: Table) -> None
get_model_data
get_model_data(path: ModelStepPath) -> Table
get_resolver_data
get_resolver_data(path: ResolverStepPath) -> Table
validate_ids
validate_ids(ids: list[int]) -> bool
dump
dump() -> MatchboxSnapshot
drop
drop(certain: bool) -> None
clear
clear(certain: bool) -> None
restore
restore(snapshot: MatchboxSnapshot) -> None
delete_orphans
delete_orphans() -> int
login
login(user: User) -> LoginResponse
get_user_groups
get_user_groups(user_name: str) -> list[GroupName]
list_groups
list_groups() -> list[Group]
get_group
get_group(name: GroupName) -> Group
create_group
create_group(group: Group) -> None
delete_group
delete_group(name: GroupName, certain: bool = False) -> None
add_user_to_group
add_user_to_group(user_name: str, group_name: GroupName) -> None
remove_user_from_group
remove_user_from_group(user_name: str, group_name: GroupName) -> None
check_permission
check_permission(user_name: str, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> bool
get_permissions
get_permissions(resource: Literal[SYSTEM] | CollectionName) -> list[PermissionGrant]
grant_permission
grant_permission(group_name: GroupName, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> None
revoke_permission
revoke_permission(group_name: GroupName, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> None
insert_judgement
insert_judgement(user_name: str, judgement: Judgement) -> None
get_judgements
get_judgements(tag: str | None = None) -> tuple[Table, Table]
sample_for_eval
sample_for_eval(n: int, path: ResolverStepPath, user_name: str) -> Table

Sample some clusters from a resolver step.

MatchboxPostgresSettings

Bases: MatchboxServerSettings


flowchart TD
    matchbox.server.postgresql.adapter.MatchboxPostgresSettings[MatchboxPostgresSettings]
    matchbox.server.base.MatchboxServerSettings[MatchboxServerSettings]

    matchbox.server.base.MatchboxServerSettings --> matchbox.server.postgresql.adapter.MatchboxPostgresSettings

Settings for the Matchbox PostgreSQL backend.

Inherits the core settings and adds the PostgreSQL-specific settings.

Methods:

Attributes:

model_config class-attribute instance-attribute
model_config = SettingsConfigDict(env_prefix='MB__SERVER__', env_nested_delimiter='__', use_enum_values=True, env_file='.env', env_file_encoding='utf-8', extra='ignore')
batch_size class-attribute instance-attribute
batch_size: int = Field(default=250000)
datastore instance-attribute
task_runner instance-attribute
task_runner: Literal['api', 'celery']
redis_uri instance-attribute
redis_uri: str | None
uploads_expiry_minutes instance-attribute
uploads_expiry_minutes: int | None
authorisation class-attribute instance-attribute
authorisation: bool = True
public_key class-attribute instance-attribute
public_key: SecretBytes | None = Field(default=None)
log_level class-attribute instance-attribute
log_level: LogLevelType = 'INFO'
backend_type class-attribute instance-attribute
backend_type: MatchboxBackends = POSTGRES
postgres class-attribute instance-attribute
validate_public_key classmethod
validate_public_key(v: str | bytes | None) -> bytes | None

Validate and normalise PEM public key format.

check_settings
check_settings() -> Self

Check that legal combinations of settings are provided.

admin

Admin PostgreSQL mixin for Matchbox server.

Classes:

MatchboxPostgresAdminMixin

Admin mixin for the PostgreSQL adapter for Matchbox.

Methods:

Attributes:

settings instance-attribute
login
login(user: User) -> LoginResponse
check_permission
check_permission(user_name: str, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> bool
get_permissions
get_permissions(resource: Literal[SYSTEM] | CollectionName) -> list[PermissionGrant]
grant_permission
grant_permission(group_name: GroupName, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> None
revoke_permission
revoke_permission(group_name: GroupName, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> None
validate_ids
validate_ids(ids: list[int]) -> bool
dump
dump() -> MatchboxSnapshot
drop
drop(certain: bool) -> None
clear
clear(certain: bool) -> None
restore
restore(snapshot: MatchboxSnapshot) -> None
delete_orphans
delete_orphans() -> int

collections

Collections PostgreSQL mixin for Matchbox server.

Classes:

MatchboxPostgresCollectionsMixin

Collections mixin for the PostgreSQL adapter for Matchbox.

Methods:

create_collection
create_collection(name: CollectionName, permissions: list[PermissionGrant]) -> Collection
get_collection
get_collection(name: CollectionName) -> Collection
list_collections
list_collections() -> list[CollectionName]
delete_collection
delete_collection(name: CollectionName, certain: bool) -> None
create_run
create_run(collection: CollectionName) -> Run
set_run_mutable
set_run_mutable(collection: CollectionName, run_id: RunID, mutable: bool) -> Run
set_run_default
set_run_default(collection: CollectionName, run_id: RunID, default: bool) -> Run
get_run
get_run(collection: CollectionName, run_id: RunID) -> Run
delete_run
delete_run(collection: CollectionName, run_id: RunID, certain: bool) -> None
create_step
create_step(step: Step, path: StepPath) -> None
get_step
get_step(path: StepPath) -> Step
update_step
update_step(step: Step, path: StepPath) -> None
delete_step
delete_step(path: StepPath, certain: bool) -> None
lock_step_data
lock_step_data(path: StepPath) -> None
unlock_step_data
unlock_step_data(path: StepPath, complete: bool = False) -> None
get_step_stage
get_step_stage(path: StepPath) -> UploadStage
insert_source_data
insert_source_data(path: SourceStepPath, data_hashes: Table) -> None
insert_model_data
insert_model_data(path: ModelStepPath, results: Table) -> None
insert_resolver_data
insert_resolver_data(path: ResolverStepPath, data: Table) -> None
get_model_data
get_model_data(path: ModelStepPath) -> Table
get_resolver_data
get_resolver_data(path: ResolverStepPath) -> Table

eval

Evaluation PostgreSQL mixin for Matchbox server.

Classes:

MatchboxPostgresEvaluationMixin

Bases: _MixinBase


flowchart TD
    matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin[MatchboxPostgresEvaluationMixin]

    matchbox.server.postgresql.adapter.eval._MixinBase --> matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin

Evaluation mixin for the PostgreSQL adapter for Matchbox.

Methods:

insert_judgement
insert_judgement(user_name: str, judgement: Judgement) -> None
get_judgements
get_judgements(tag: str | None = None) -> tuple[Table, Table]
sample_for_eval
sample_for_eval(n: int, path: ResolverStepPath, user_name: str) -> Table

Sample some clusters from a resolver step.

groups

Groups PostgreSQL mixin for Matchbox server.

Classes:

MatchboxPostgresGroupsMixin

Groups mixin for the PostgreSQL adapter for Matchbox.

Methods:

get_user_groups
get_user_groups(user_name: str) -> list[GroupName]
list_groups
list_groups() -> list[Group]
get_group
get_group(name: GroupName) -> Group
create_group
create_group(group: Group) -> None
delete_group
delete_group(name: GroupName, certain: bool = False) -> None
add_user_to_group
add_user_to_group(user_name: str, group_name: GroupName) -> None
remove_user_from_group
remove_user_from_group(user_name: str, group_name: GroupName) -> None

main

Composed PostgreSQL adapter for Matchbox server.

Classes:

FilteredClusters

Bases: BaseModel



Wrapper class for filtered cluster queries.

Methods:

  • count

    Counts the number of clusters in the database.

Attributes:

has_source class-attribute instance-attribute
has_source: bool | None = None
count
count() -> int

Counts the number of clusters in the database.
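
The adapter attributes source_clusters, model_clusters, and all_clusters use has_source=True, has_source=False, and the default None respectively. The sketch below shows one way such a tri-state filter could count clusters. It uses a dataclass in place of the pydantic BaseModel, in-memory sets in place of the database, and assumes a cluster "has a source" when it appears in cluster_source_key; all three are illustrative assumptions, as is the class name FilteredClustersSketch.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical data: every cluster ID, and those with a row in cluster_source_key.
ALL_CLUSTER_IDS = {1, 2, 3, 100, 101}
CLUSTERS_WITH_SOURCE_KEY = {1, 2, 3}


@dataclass(frozen=True)
class FilteredClustersSketch:
    """Tri-state filter: True, False, or None (no filtering)."""

    has_source: bool | None = None

    def count(self) -> int:
        if self.has_source is None:
            return len(ALL_CLUSTER_IDS)
        with_source = ALL_CLUSTER_IDS & CLUSTERS_WITH_SOURCE_KEY
        if self.has_source:
            return len(with_source)
        return len(ALL_CLUSTER_IDS - with_source)


counts = (
    FilteredClustersSketch(has_source=True).count(),
    FilteredClustersSketch(has_source=False).count(),
    FilteredClustersSketch().count(),
)
```

Using None as the "no filter" value keeps a single class behind all three attributes instead of three near-identical wrappers.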

FilteredProbabilities

Bases: BaseModel



Wrapper class for filtered model edge queries.

Methods:

  • count

    Counts the number of model edges in the database.

count
count() -> int

Counts the number of model edges in the database.

FilteredSteps

Bases: BaseModel



Wrapper class for filtered step queries.

Methods:

  • count

    Counts the number of steps in the database.

Attributes:

sources class-attribute instance-attribute
sources: bool = False
models class-attribute instance-attribute
models: bool = False
resolvers class-attribute instance-attribute
resolvers: bool = False
count
count() -> int

Counts the number of steps in the database.

MatchboxPostgres
MatchboxPostgres(settings: MatchboxPostgresSettings)

Bases: MatchboxPostgresQueryMixin, MatchboxPostgresEvaluationMixin, MatchboxPostgresCollectionsMixin, MatchboxPostgresAdminMixin, MatchboxPostgresGroupsMixin, MatchboxDBAdapter


flowchart TD
    matchbox.server.postgresql.adapter.main.MatchboxPostgres[MatchboxPostgres]
    matchbox.server.postgresql.adapter.query.MatchboxPostgresQueryMixin[MatchboxPostgresQueryMixin]
    matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin[MatchboxPostgresEvaluationMixin]
    matchbox.server.postgresql.adapter.collections.MatchboxPostgresCollectionsMixin[MatchboxPostgresCollectionsMixin]
    matchbox.server.postgresql.adapter.admin.MatchboxPostgresAdminMixin[MatchboxPostgresAdminMixin]
    matchbox.server.postgresql.adapter.groups.MatchboxPostgresGroupsMixin[MatchboxPostgresGroupsMixin]
    matchbox.server.base.MatchboxDBAdapter[MatchboxDBAdapter]

    matchbox.server.postgresql.adapter.query.MatchboxPostgresQueryMixin --> matchbox.server.postgresql.adapter.main.MatchboxPostgres
    matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin --> matchbox.server.postgresql.adapter.main.MatchboxPostgres
    matchbox.server.postgresql.adapter.eval._MixinBase --> matchbox.server.postgresql.adapter.eval.MatchboxPostgresEvaluationMixin
    matchbox.server.postgresql.adapter.collections.MatchboxPostgresCollectionsMixin --> matchbox.server.postgresql.adapter.main.MatchboxPostgres
    matchbox.server.postgresql.adapter.admin.MatchboxPostgresAdminMixin --> matchbox.server.postgresql.adapter.main.MatchboxPostgres
    matchbox.server.postgresql.adapter.groups.MatchboxPostgresGroupsMixin --> matchbox.server.postgresql.adapter.main.MatchboxPostgres
    matchbox.server.base.MatchboxDBAdapter --> matchbox.server.postgresql.adapter.main.MatchboxPostgres

A PostgreSQL adapter for Matchbox.

Methods:

Attributes:

settings instance-attribute
settings = settings
sources instance-attribute
sources = SourceConfigs
models instance-attribute
models = ModelConfigs
resolvers instance-attribute
resolvers = ResolverConfigs
source_clusters instance-attribute
source_clusters = FilteredClusters(has_source=True)
model_clusters instance-attribute
model_clusters = FilteredClusters(has_source=False)
all_clusters instance-attribute
all_clusters = FilteredClusters()
creates instance-attribute
creates = ResolverClusters
merges instance-attribute
merges = Contains
proposes instance-attribute
proposes = FilteredProbabilities()
source_steps instance-attribute
source_steps = FilteredSteps(sources=True, models=False, resolvers=False)
users instance-attribute
users = Users
query
query(source: SourceStepPath, resolver: ResolverStepPath | None = None, return_leaf_id: bool = False, limit: int | None = None) -> Table
match
match(key: str, source: SourceStepPath, targets: list[SourceStepPath], resolver: ResolverStepPath) -> list[Match]
create_collection
create_collection(name: CollectionName, permissions: list[PermissionGrant]) -> Collection
get_collection
get_collection(name: CollectionName) -> Collection
list_collections
list_collections() -> list[CollectionName]
delete_collection
delete_collection(name: CollectionName, certain: bool) -> None
create_run
create_run(collection: CollectionName) -> Run
set_run_mutable
set_run_mutable(collection: CollectionName, run_id: RunID, mutable: bool) -> Run
set_run_default
set_run_default(collection: CollectionName, run_id: RunID, default: bool) -> Run
get_run
get_run(collection: CollectionName, run_id: RunID) -> Run
delete_run
delete_run(collection: CollectionName, run_id: RunID, certain: bool) -> None
create_step
create_step(step: Step, path: StepPath) -> None
get_step
get_step(path: StepPath) -> Step
update_step
update_step(step: Step, path: StepPath) -> None
delete_step
delete_step(path: StepPath, certain: bool) -> None
lock_step_data
lock_step_data(path: StepPath) -> None
unlock_step_data
unlock_step_data(path: StepPath, complete: bool = False) -> None
get_step_stage
get_step_stage(path: StepPath) -> UploadStage
insert_source_data
insert_source_data(path: SourceStepPath, data_hashes: Table) -> None
insert_model_data
insert_model_data(path: ModelStepPath, results: Table) -> None
insert_resolver_data
insert_resolver_data(path: ResolverStepPath, data: Table) -> None
get_model_data
get_model_data(path: ModelStepPath) -> Table
get_resolver_data
get_resolver_data(path: ResolverStepPath) -> Table
validate_ids
validate_ids(ids: list[int]) -> bool
dump
dump() -> MatchboxSnapshot
drop
drop(certain: bool) -> None
clear
clear(certain: bool) -> None
restore
restore(snapshot: MatchboxSnapshot) -> None
delete_orphans
delete_orphans() -> int
login
login(user: User) -> LoginResponse
get_user_groups
get_user_groups(user_name: str) -> list[GroupName]
list_groups
list_groups() -> list[Group]
get_group
get_group(name: GroupName) -> Group
create_group
create_group(group: Group) -> None
delete_group
delete_group(name: GroupName, certain: bool = False) -> None
add_user_to_group
add_user_to_group(user_name: str, group_name: GroupName) -> None
remove_user_from_group
remove_user_from_group(user_name: str, group_name: GroupName) -> None
check_permission
check_permission(user_name: str, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> bool
get_permissions
get_permissions(resource: Literal[SYSTEM] | CollectionName) -> list[PermissionGrant]
grant_permission
grant_permission(group_name: GroupName, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> None
revoke_permission
revoke_permission(group_name: GroupName, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> None
insert_judgement
insert_judgement(user_name: str, judgement: Judgement) -> None
get_judgements
get_judgements(tag: str | None = None) -> tuple[Table, Table]
sample_for_eval
sample_for_eval(n: int, path: ResolverStepPath, user_name: str) -> Table

Sample some clusters from a resolver step.

query

Query PostgreSQL mixin for Matchbox server.

Classes:

MatchboxPostgresQueryMixin

Query mixin for the PostgreSQL adapter for Matchbox.

Methods:

query
query(source: SourceStepPath, resolver: ResolverStepPath | None = None, return_leaf_id: bool = False, limit: int | None = None) -> Table
match
match(key: str, source: SourceStepPath, targets: list[SourceStepPath], resolver: ResolverStepPath) -> list[Match]

db

Matchbox PostgreSQL database connection.

Classes:

Attributes:

MBDB module-attribute

MatchboxPostgresCoreSettings

Bases: BaseModel


              flowchart TD
              matchbox.server.postgresql.db.MatchboxPostgresCoreSettings[MatchboxPostgresCoreSettings]

              

            

PostgreSQL-specific settings for Matchbox.

Methods:

Attributes:

host instance-attribute
host: str
port instance-attribute
port: int
user instance-attribute
user: str
password instance-attribute
password: str
database instance-attribute
database: str
db_schema instance-attribute
db_schema: str
alembic_config class-attribute instance-attribute
alembic_config: Path = Field(default=Path('src/matchbox/server/postgresql/alembic.ini'))
get_alembic_config
get_alembic_config() -> Config

Get the Alembic config.

MatchboxPostgresSettings

Bases: MatchboxServerSettings


              flowchart TD
              matchbox.server.postgresql.db.MatchboxPostgresSettings[MatchboxPostgresSettings]
              matchbox.server.base.MatchboxServerSettings[MatchboxServerSettings]

                              matchbox.server.base.MatchboxServerSettings --> matchbox.server.postgresql.db.MatchboxPostgresSettings
                


            

Settings for the Matchbox PostgreSQL backend.

Inherits the core settings and adds the PostgreSQL-specific settings.

Methods:

Attributes:

backend_type class-attribute instance-attribute
backend_type: MatchboxBackends = POSTGRES
postgres class-attribute instance-attribute
model_config class-attribute instance-attribute
model_config = SettingsConfigDict(env_prefix='MB__SERVER__', env_nested_delimiter='__', use_enum_values=True, env_file='.env', env_file_encoding='utf-8', extra='ignore')
batch_size class-attribute instance-attribute
batch_size: int = Field(default=250000)
datastore instance-attribute
task_runner instance-attribute
task_runner: Literal['api', 'celery']
redis_uri instance-attribute
redis_uri: str | None
uploads_expiry_minutes instance-attribute
uploads_expiry_minutes: int | None
authorisation class-attribute instance-attribute
authorisation: bool = True
public_key class-attribute instance-attribute
public_key: SecretBytes | None = Field(default=None)
log_level class-attribute instance-attribute
log_level: LogLevelType = 'INFO'
validate_public_key classmethod
validate_public_key(v: str | bytes | None) -> bytes | None

Validate and normalise PEM public key format.

check_settings
check_settings() -> Self

Check that legal combinations of settings are provided.
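
The `model_config` above sets `env_prefix='MB__SERVER__'` and `env_nested_delimiter='__'`, which is the pydantic-settings convention for reading nested settings from environment variables. The sketch below only illustrates the naming scheme in plain Python; it is not how pydantic-settings is implemented, and it assumes the `postgres` attribute holds the nested PostgreSQL settings (`host`, `db_schema`, and so on). The helper name `nest_env` and the example values are hypothetical.

```python
ENV_PREFIX = "MB__SERVER__"
NESTED_DELIMITER = "__"


def nest_env(environ: dict[str, str]) -> dict:
    """Group prefixed environment variables into a nested dict of raw strings."""
    nested: dict = {}
    for name, value in environ.items():
        if not name.upper().startswith(ENV_PREFIX):
            continue  # not a Matchbox server setting
        # Strip the prefix, then split the remainder on the nested delimiter.
        path = name[len(ENV_PREFIX):].lower().split(NESTED_DELIMITER)
        node = nested
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return nested


# Hypothetical environment; values stay strings here, whereas a settings
# class would also validate and coerce them (e.g. batch_size to int).
example = nest_env(
    {
        "MB__SERVER__BATCH_SIZE": "250000",
        "MB__SERVER__POSTGRES__HOST": "localhost",
        "MB__SERVER__POSTGRES__DB_SCHEMA": "mb",
        "PATH": "/usr/bin",
    }
)
```

Under this convention a single underscore stays part of the field name (`db_schema`), while a double underscore descends one level.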

MatchboxDatabase

MatchboxDatabase(settings: MatchboxPostgresSettings)

Matchbox PostgreSQL database connection.

Methods:

Attributes:

settings instance-attribute
settings = settings
MatchboxBase instance-attribute
MatchboxBase = declarative_base(metadata=MetaData(schema=db_schema))
alembic_config instance-attribute
alembic_config = get_alembic_config()
sorted_tables property
sorted_tables: list[Table]

Return a list of SQLAlchemy tables in order of creation.

connection_string
connection_string(driver: bool = True) -> str

Get the connection string for PostgreSQL.
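
As a rough illustration of what such a method might assemble, the sketch below builds a PostgreSQL URL from the core settings fields listed earlier. It assumes that `driver=True` adds a SQLAlchemy-style `+driver` suffix to the scheme; the driver name `psycopg`, the `CoreSettings` stand-in, and the URL-quoting of credentials are assumptions for the example, not details confirmed by this page.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class CoreSettings:
    """Hypothetical stand-in for MatchboxPostgresCoreSettings."""

    host: str
    port: int
    user: str
    password: str
    database: str


def connection_string(
    settings: CoreSettings, driver: bool = True, driver_name: str = "psycopg"
) -> str:
    """Build a PostgreSQL URL, optionally with a '+driver' suffix (assumed)."""
    scheme = f"postgresql+{driver_name}" if driver else "postgresql"
    # Percent-encode credentials so characters such as '@' or '/' are safe.
    user = quote(settings.user, safe="")
    password = quote(settings.password, safe="")
    return (
        f"{scheme}://{user}:{password}"
        f"@{settings.host}:{settings.port}/{settings.database}"
    )


settings = CoreSettings("localhost", 5432, "mb", "p@ss/word", "matchbox")
plain_url = connection_string(settings, driver=False)
driver_url = connection_string(settings, driver=True)
```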

get_engine
get_engine() -> Engine

Get the database engine.

get_session
get_session() -> Session

Get a new session.

get_adbc_connection
get_adbc_connection() -> Generator[Connection, Any, Any]

Get a new ADBC connection wrapped by a SQLAlchemy pool proxy.

The pool proxy is held and managed within the context manager, yielding the connection directly.
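
The pattern described here, holding a pooled wrapper inside a context manager while handing the caller the raw connection, can be sketched with `contextlib`. Everything below is a stand-in: `FakePoolProxy`, its `driver_connection` attribute, and the `checkout` callable are hypothetical and do not reproduce the adapter's real pool handling.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


class FakePoolProxy:
    """Hypothetical pooled wrapper around a raw driver connection."""

    def __init__(self) -> None:
        self.driver_connection = object()  # stands in for the ADBC connection
        self.returned = False

    def close(self) -> None:
        # For a pooled proxy, close() hands the connection back to the pool.
        self.returned = True


@contextmanager
def adbc_connection(checkout: Callable[[], FakePoolProxy]) -> Iterator[Any]:
    """Yield the raw connection; keep the proxy alive until the block exits."""
    proxy = checkout()
    try:
        yield proxy.driver_connection
    finally:
        proxy.close()


proxy = FakePoolProxy()
with adbc_connection(lambda: proxy) as conn:
    inside = (conn is proxy.driver_connection, proxy.returned)
```

The caller never sees the proxy, yet the connection is still returned to the pool when the `with` block ends, including when an exception is raised inside it.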

run_migrations
run_migrations() -> None

Create the database and all tables expected in the schema.

clear_database
clear_database() -> None

Delete all rows in every table in the database schema.

  • TRUNCATE tables that are part of the core ORM (preserves structure)
  • DROP tables that are not in the ORM (removes temporary/test tables)
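
The two rules above amount to partitioning the tables found in the schema by whether the ORM knows about them. A minimal sketch of that partition, with a hypothetical helper name and table names, and no claim about how the real method issues its SQL:

```python
def plan_clear(existing: set[str], orm_tables: set[str]) -> dict[str, list[str]]:
    """Split tables present in the schema into those to TRUNCATE and to DROP."""
    return {
        # Core ORM tables keep their structure and only lose their rows.
        "truncate": sorted(existing & orm_tables),
        # Anything else (temporary or test tables) is removed entirely.
        "drop": sorted(existing - orm_tables),
    }


plan = plan_clear(
    existing={"clusters", "contains", "tmp_upload"},
    orm_tables={"clusters", "contains", "steps"},
)
```
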
drop_database
drop_database() -> None

Drop all tables in the database schema and recreate them.

vacuum_analyze
vacuum_analyze(*table_names: str) -> None

Run VACUUM ANALYZE on specified tables.

VACUUM ANALYZE reclaims storage and updates statistics for the query planner. PostgreSQL may not fully utilise indexes until VACUUM ANALYZE is run. According to https://www.postgresql.org/docs/current/sql-vacuum.html, VACUUM ANALYZE is recommended over just ANALYZE for optimal performance.

Parameters:

  • *table_names
    (str, default: () ) –

    Fully qualified table names to vacuum. If none provided, vacuums the entire database.
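
The parameter behaviour can be sketched as a function that produces the statements to run. This is an illustration only: the helper name is hypothetical, and issuing one statement per table is an assumption about the implementation. Note that PostgreSQL does not allow `VACUUM` inside a transaction block, so whatever executes these statements needs an autocommit connection.

```python
def vacuum_statements(*table_names: str) -> list[str]:
    """Return VACUUM ANALYZE statements for the given tables, or for everything."""
    if not table_names:
        # No tables named: a bare VACUUM ANALYZE covers the whole database.
        return ["VACUUM ANALYZE"]
    # Names are interpolated directly, so they must come from trusted code.
    return [f"VACUUM ANALYZE {name}" for name in table_names]


whole_database = vacuum_statements()
two_tables = vacuum_statements("mb.clusters", "mb.contains")
```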

mixin

A module for defining mixins for the PostgreSQL backend ORM.

Classes:

  • CountMixin

    A mixin for counting the number of rows in a table.

Attributes:

  • T

T module-attribute

T = TypeVar('T')

CountMixin

A mixin for counting the number of rows in a table.

Methods:

  • count

    Counts the number of rows in the table.

count classmethod
count() -> int

Counts the number of rows in the table.
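
A counting mixin of this kind can be sketched with the standard library. The version below uses `sqlite3` and takes an explicit connection so that it runs on its own; the real `count()` takes no arguments and works against the Matchbox PostgreSQL database, so the `conn` parameter and the table definition are assumptions for the example.

```python
import sqlite3


class CountMixin:
    """Adds a row-counting classmethod to any class that names its table."""

    __tablename__: str

    @classmethod
    def count(cls, conn: sqlite3.Connection) -> int:
        # The table name comes from the class itself, never from user input.
        (n,) = conn.execute(f"SELECT COUNT(*) FROM {cls.__tablename__}").fetchone()
        return n


class Collections(CountMixin):
    __tablename__ = "collections"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collections (collection_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.executemany("INSERT INTO collections (name) VALUES (?)", [("alpha",), ("beta",)])
collection_count = Collections.count(conn)
```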

orm

ORM classes for the Matchbox PostgreSQL database.

Classes:

  • Collections

    Named collections of steps and runs.

  • Runs

    Runs of collections of steps.

  • StepFrom

    Step lineage closure table.

  • Steps

    Table of steps corresponding to models, resolvers, and sources.

  • SourceFields

    Table for storing column details for SourceConfigs.

  • ClusterSourceKey

    Table for storing source primary keys for clusters.

  • SourceConfigs

    Table of source configs for Matchbox.

  • ModelConfigs

    Table of model configs for Matchbox.

  • ResolverConfigs

    Table of resolver configs for Matchbox.

  • Contains

    Cluster lineage table.

  • Clusters

    Table of indexed data and clusters that match it.

  • UserGroups

    Association table for user-group membership.

  • Users

    Table of user identities.

  • Groups

    Groups for permission management.

  • Permissions

    Permissions granted to groups on resources.

  • EvalJudgements

    Table of evaluation judgements produced by human validators.

  • ModelEdges

    Table of results for a model step.

  • ResolverClusters

    Association table linking resolver steps to cluster IDs.

Collections

Bases: CountMixin, MatchboxBase


              flowchart TD
              matchbox.server.postgresql.orm.Collections[Collections]
              matchbox.server.postgresql.mixin.CountMixin[CountMixin]

                              matchbox.server.postgresql.mixin.CountMixin --> matchbox.server.postgresql.orm.Collections
                


            

Named collections of steps and runs.

Methods:

  • from_name

    Resolve a collection name to a Collections object.

  • to_dto

    Convert ORM collection to a matchbox.common Collection object.

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'collections'
collection_id class-attribute instance-attribute
collection_id: Mapped[int] = mapped_column(BIGINT, primary_key=True, autoincrement=True)
name class-attribute instance-attribute
name: Mapped[str] = mapped_column(TEXT, nullable=False)
runs class-attribute instance-attribute
runs: Mapped[list[Runs]] = relationship(back_populates='collection')
permissions class-attribute instance-attribute
permissions: Mapped[list[Permissions]] = relationship(back_populates='collection', passive_deletes=True)
__table_args__ class-attribute instance-attribute
__table_args__ = (UniqueConstraint('name', name='collections_name_key'),)
from_name classmethod
from_name(name: CollectionName, session: Session | None = None) -> Collections

Resolve a collection name to a Collections object.

Parameters:

  • name
    (CollectionName) –

    The name of the collection to resolve.

  • session
    (Session | None, default: None ) –

    Optional session to use for the query.

Raises:

to_dto
to_dto() -> Collection

Convert ORM collection to a matchbox.common Collection object.

count classmethod
count() -> int

Counts the number of rows in the table.

Runs

Bases: CountMixin, MatchboxBase


              flowchart TD
              matchbox.server.postgresql.orm.Runs[Runs]
              matchbox.server.postgresql.mixin.CountMixin[CountMixin]

                              matchbox.server.postgresql.mixin.CountMixin --> matchbox.server.postgresql.orm.Runs
                


            

Runs of collections of steps.

Methods:

  • from_id

    Resolve a collection name and run ID to a Runs object.

  • to_dto

    Convert ORM run to a matchbox.common Run object.

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'runs'
run_id class-attribute instance-attribute
run_id: Mapped[int] = mapped_column(BIGINT, primary_key=True, autoincrement=True)
collection_id class-attribute instance-attribute
collection_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('collections.collection_id', ondelete='CASCADE'), nullable=False)
is_mutable class-attribute instance-attribute
is_mutable: Mapped[bool] = mapped_column(BOOLEAN, default=False, nullable=True)
is_default class-attribute instance-attribute
is_default: Mapped[bool] = mapped_column(BOOLEAN, default=False, nullable=True)
collection class-attribute instance-attribute
collection: Mapped[Collections] = relationship(back_populates='runs')
steps class-attribute instance-attribute
steps: Mapped[list[Steps]] = relationship(back_populates='run')
__table_args__ class-attribute instance-attribute
__table_args__ = (UniqueConstraint('collection_id', 'run_id', name='unique_run_id'), Index('ix_default_run_collection', 'collection_id', unique=True, postgresql_where=text('is_default = true')))
from_id classmethod
from_id(collection: CollectionName, run_id: RunID, session: Session | None = None) -> Runs

Resolve a collection name and run ID to a Runs object.

Parameters:

  • collection
    (CollectionName) –

    The name of the collection containing the run.

  • run_id
    (RunID) –

    The ID of the run within that collection.

  • session
    (Session | None, default: None ) –

    Optional session to use for the query.

Raises:

to_dto
to_dto() -> Run

Convert ORM run to a matchbox.common Run object.

count classmethod
count() -> int

Counts the number of rows in the table.
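
The `ix_default_run_collection` index in `__table_args__` is a partial unique index: uniqueness on `collection_id` is enforced only for rows where `is_default` is true, so each collection can have at most one default run and any number of non-default runs. The sketch below reproduces the idea in SQLite, which also supports partial indexes; the table is a cut-down stand-in for `runs`, and SQLite's `1`/`0` replace PostgreSQL's `true`/`false`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE runs (
        run_id INTEGER PRIMARY KEY,
        collection_id INTEGER NOT NULL,
        is_default BOOLEAN DEFAULT 0
    );
    -- Uniqueness applies only to rows flagged as default.
    CREATE UNIQUE INDEX ix_default_run_collection
        ON runs (collection_id) WHERE is_default = 1;
    """
)
conn.execute("INSERT INTO runs (collection_id, is_default) VALUES (1, 1)")
conn.execute("INSERT INTO runs (collection_id, is_default) VALUES (1, 0)")  # allowed
conn.execute("INSERT INTO runs (collection_id, is_default) VALUES (2, 1)")  # other collection

try:
    # A second default run in collection 1 violates the partial index.
    conn.execute("INSERT INTO runs (collection_id, is_default) VALUES (1, 1)")
    second_default_rejected = False
except sqlite3.IntegrityError:
    second_default_rejected = True

(run_count,) = conn.execute("SELECT COUNT(*) FROM runs").fetchone()
```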

StepFrom

Bases: CountMixin, MatchboxBase


              flowchart TD
              matchbox.server.postgresql.orm.StepFrom[StepFrom]
              matchbox.server.postgresql.mixin.CountMixin[CountMixin]

                              matchbox.server.postgresql.mixin.CountMixin --> matchbox.server.postgresql.orm.StepFrom
                


            

Step lineage closure table.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'step_from'
parent class-attribute instance-attribute
parent: Mapped[int] = mapped_column(BIGINT, ForeignKey('steps.step_id', ondelete='CASCADE'), primary_key=True)
child class-attribute instance-attribute
child: Mapped[int] = mapped_column(BIGINT, ForeignKey('steps.step_id', ondelete='CASCADE'), primary_key=True)
level class-attribute instance-attribute
level: Mapped[int] = mapped_column(INTEGER, nullable=False, primary_key=True)
__table_args__ class-attribute instance-attribute
__table_args__ = (CheckConstraint('parent != child', name='no_self_reference'), CheckConstraint('level > 0', name='positive_level'))
count classmethod
count() -> int

Counts the number of rows in the table.
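
A closure table stores one row per ancestor and descendant pair together with the distance between them, so direct parents are the rows with `level == 1` and all ancestors are found without recursion. The sketch below derives such rows from direct parent links in plain Python. The step IDs and the helper name are hypothetical, and the input is assumed to be acyclic, in line with the `no_self_reference` and `positive_level` constraints above.

```python
def closure_rows(direct_parents: dict[int, list[int]]) -> set[tuple[int, int, int]]:
    """Return (parent, child, level) rows for every ancestor relationship."""
    rows: set[tuple[int, int, int]] = set()
    for child, parents in direct_parents.items():
        frontier = [(parent, 1) for parent in parents]
        while frontier:
            parent, level = frontier.pop()
            rows.add((parent, child, level))
            # Each step up the graph is one level further from the child.
            frontier.extend(
                (grandparent, level + 1)
                for grandparent in direct_parents.get(parent, [])
            )
    return rows


# Hypothetical step IDs: sources 1 and 2 feed model 3, which feeds resolver 4.
rows = closure_rows({3: [1, 2], 4: [3]})
parents_of_4 = sorted(p for p, c, level in rows if c == 4 and level == 1)
ancestors_of_4 = sorted(p for p, c, level in rows if c == 4)
```

Because `level` is part of the primary key, the same pair of steps may appear at more than one distance when two paths of different lengths connect them.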

Steps

Bases: CountMixin, MatchboxBase


              flowchart TD
              matchbox.server.postgresql.orm.Steps[Steps]
              matchbox.server.postgresql.mixin.CountMixin[CountMixin]

                              matchbox.server.postgresql.mixin.CountMixin --> matchbox.server.postgresql.orm.Steps
                


            

Table of steps corresponding to models, resolvers, and sources.

Models produce edges and resolvers produce cluster assignments.

Methods:

  • get_lineage

    Returns lineage ordered by priority.

  • from_path

    Resolve a step path to a Step ORM object.

  • from_dto

    Create a Steps instance from a Step DTO object.

  • to_dto

    Convert ORM step to a matchbox.common Step object.

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'steps'
step_id class-attribute instance-attribute
step_id: Mapped[int] = mapped_column(BIGINT, primary_key=True, autoincrement=True)
run_id class-attribute instance-attribute
run_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('runs.run_id', ondelete='CASCADE'), nullable=False)
upload_stage class-attribute instance-attribute
upload_stage: Mapped[UploadStage] = mapped_column(Enum(UploadStage, native_enum=True, name='upload_stages', schema='mb'), nullable=False, default=READY)
name class-attribute instance-attribute
name: Mapped[str] = mapped_column(TEXT, nullable=False)
description class-attribute instance-attribute
description: Mapped[str | None] = mapped_column(TEXT, nullable=True)
type class-attribute instance-attribute
type: Mapped[str] = mapped_column(TEXT, nullable=False)
fingerprint class-attribute instance-attribute
fingerprint: Mapped[bytes] = mapped_column(BYTEA, nullable=False)
source_config class-attribute instance-attribute
source_config: Mapped[Optional[SourceConfigs]] = relationship(back_populates='source_step', uselist=False)
model_config class-attribute instance-attribute
model_config: Mapped[Optional[ModelConfigs]] = relationship(back_populates='model_step', uselist=False)
resolver_config class-attribute instance-attribute
resolver_config: Mapped[Optional[ResolverConfigs]] = relationship(back_populates='resolver_step', uselist=False)
model_edges class-attribute instance-attribute
model_edges: Mapped[list[ModelEdges]] = relationship(back_populates='proposed_by', passive_deletes=True)
resolver_clusters class-attribute instance-attribute
resolver_clusters: Mapped[list[ResolverClusters]] = relationship(back_populates='proposed_by', passive_deletes=True)
parents class-attribute instance-attribute
parents: Mapped[list[Steps]] = relationship(secondary=__table__, primaryjoin=dedent('            and_(\n                Steps.step_id == StepFrom.child,\n                StepFrom.level == 1\n            )\n        '), secondaryjoin='Steps.step_id == StepFrom.parent', viewonly=True, order_by='StepFrom.parent')
run class-attribute instance-attribute
run: Mapped[Runs] = relationship(back_populates='steps')
__table_args__ class-attribute instance-attribute
__table_args__ = (CheckConstraint("type IN ('model', 'source', 'resolver')", name='step_type_constraints'), UniqueConstraint('run_id', 'name', name='steps_name_key'))
ancestors property
ancestors: set[Steps]

Return all ancestors (parents, grandparents, etc.) of this step.

descendants property
descendants: set[Steps]

Return all descendants (children, grandchildren, etc.) of this step.

get_lineage
get_lineage(sources: list[SourceConfigs] | None = None, queryable_only: bool = False) -> list[tuple[int, int | None]]

Returns lineage ordered by priority.

Highest priority (lowest level) first, then by step_id for stability.

Parameters:

  • sources
    (list[SourceConfigs] | None, default: None ) –

    If provided, only return lineage paths that lead to these sources

  • queryable_only
    (bool, default: False ) –

    If true, only include queryable step types

Returns:

  • list[tuple[int, int | None]]

    List of tuples (step_id, source_config_id) ordered by priority.

from_path classmethod
from_path(path: StepPath, res_type: StepType | None = None, session: Session | None = None, for_update: bool = False) -> Steps

Resolve a step path to a Step ORM object.

Parameters:

  • path
    (StepPath) –

    The path of the step to resolve.

  • res_type
    (StepType | None, default: None ) –

    A step type to use as filter.

  • session
    (Session | None, default: None ) –

    A session to get the step for updates.

  • for_update
    (bool, default: False ) –

    Locks the row until updated.

Raises:

from_dto classmethod
from_dto(step: Step, path: StepPath, session: Session) -> Steps

Create a Steps instance from a Step DTO object.

The step will be added to the session and flushed (but not committed).

For model steps, lineage entries will be created automatically.

Parameters:

  • step
    (Step) –

    The Step DTO to convert

  • path
    (StepPath) –

    The full step path

  • session
    (Session) –

    Database session (caller must commit)

Returns:

  • Steps

    A Steps ORM instance with ID and relationships established

to_dto
to_dto() -> Step

Convert ORM step to a matchbox.common Step object.

count classmethod
count() -> int

Counts the number of rows in the table.

SourceFields

Bases: CountMixin, MatchboxBase


              flowchart TD
              matchbox.server.postgresql.orm.SourceFields[SourceFields]
              matchbox.server.postgresql.mixin.CountMixin[CountMixin]

                              matchbox.server.postgresql.mixin.CountMixin --> matchbox.server.postgresql.orm.SourceFields
                


            

Table for storing column details for SourceConfigs.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'source_fields'
field_id class-attribute instance-attribute
field_id: Mapped[int] = mapped_column(BIGINT, primary_key=True)
source_config_id class-attribute instance-attribute
source_config_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('source_configs.source_config_id', ondelete='CASCADE'), nullable=False)
index class-attribute instance-attribute
index: Mapped[int] = mapped_column(INTEGER, nullable=False)
name class-attribute instance-attribute
name: Mapped[str] = mapped_column(TEXT, nullable=False)
type class-attribute instance-attribute
type: Mapped[str] = mapped_column(TEXT, nullable=False)
is_key class-attribute instance-attribute
is_key: Mapped[bool] = mapped_column(BOOLEAN, nullable=False)
source_config class-attribute instance-attribute
source_config: Mapped[SourceConfigs] = relationship(back_populates='fields', foreign_keys=[source_config_id])
__table_args__ class-attribute instance-attribute
__table_args__ = (UniqueConstraint('source_config_id', 'index', name='unique_index'), Index('ix_source_columns_source_config_id', 'source_config_id'), Index('ix_unique_key_field', 'source_config_id', unique=True, postgresql_where=text('is_key = true')))
count classmethod
count() -> int

Counts the number of rows in the table.

ClusterSourceKey

Bases: CountMixin, MatchboxBase


              flowchart TD
              matchbox.server.postgresql.orm.ClusterSourceKey[ClusterSourceKey]
              matchbox.server.postgresql.mixin.CountMixin[CountMixin]

                              matchbox.server.postgresql.mixin.CountMixin --> matchbox.server.postgresql.orm.ClusterSourceKey
                


            

Table for storing source primary keys for clusters.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'cluster_keys'
key_id class-attribute instance-attribute
key_id: Mapped[int] = mapped_column(BIGINT, primary_key=True)
cluster_id class-attribute instance-attribute
cluster_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('clusters.cluster_id', ondelete='CASCADE'), nullable=False)
source_config_id class-attribute instance-attribute
source_config_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('source_configs.source_config_id', ondelete='CASCADE'), nullable=False)
key class-attribute instance-attribute
key: Mapped[str] = mapped_column(TEXT, nullable=False)
cluster class-attribute instance-attribute
cluster: Mapped[Clusters] = relationship(back_populates='keys')
source_config class-attribute instance-attribute
source_config: Mapped[SourceConfigs] = relationship(back_populates='cluster_keys')
__table_args__ class-attribute instance-attribute
__table_args__ = (Index('ix_cluster_keys_cluster_id', 'cluster_id'), Index('ix_cluster_keys_keys', 'key'), Index('ix_cluster_keys_source_config_id', 'source_config_id'), UniqueConstraint('key_id', 'source_config_id', name='unique_keys_source'))
count classmethod
count() -> int

Counts the number of rows in the table.

SourceConfigs

SourceConfigs(key_field: SourceFields | None = None, index_fields: list[SourceFields] | None = None, **kwargs: Any)

Bases: CountMixin, MatchboxBase


              flowchart TD
              matchbox.server.postgresql.orm.SourceConfigs[SourceConfigs]
              matchbox.server.postgresql.mixin.CountMixin[CountMixin]

                              matchbox.server.postgresql.mixin.CountMixin --> matchbox.server.postgresql.orm.SourceConfigs
                


            

Table of source configs for Matchbox.

Methods:

  • list_all

    Returns all source_configs in the database.

  • from_dto

    Create a SourceConfigs instance from a SourceConfig DTO object.

  • to_dto

    Convert ORM source to a matchbox.common.SourceConfig object.

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'source_configs'
source_config_id class-attribute instance-attribute
source_config_id: Mapped[int] = mapped_column(BIGINT, Identity(start=1), primary_key=True)
step_id class-attribute instance-attribute
step_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('steps.step_id', ondelete='CASCADE'), nullable=False)
location_type class-attribute instance-attribute
location_type: Mapped[str] = mapped_column(TEXT, nullable=False)
location_name class-attribute instance-attribute
location_name: Mapped[str] = mapped_column(TEXT, nullable=False)
extract_transform class-attribute instance-attribute
extract_transform: Mapped[str] = mapped_column(TEXT, nullable=False)
name property
name: str

Get the name of the related step.

source_step class-attribute instance-attribute
source_step: Mapped[Steps] = relationship(back_populates='source_config')
fields class-attribute instance-attribute
fields: Mapped[list[SourceFields]] = relationship(back_populates='source_config', passive_deletes=True, cascade='all, delete-orphan')
key_field class-attribute instance-attribute
key_field: Mapped[Optional[SourceFields]] = relationship(primaryjoin='and_(SourceConfigs.source_config_id == SourceFields.source_config_id, SourceFields.is_key == True)', viewonly=True, uselist=False)
index_fields class-attribute instance-attribute
index_fields: Mapped[list[SourceFields]] = relationship(primaryjoin='and_(SourceConfigs.source_config_id == SourceFields.source_config_id, SourceFields.is_key == False)', viewonly=True, order_by='SourceFields.index', collection_class=list)
cluster_keys class-attribute instance-attribute
cluster_keys: Mapped[list[ClusterSourceKey]] = relationship(back_populates='source_config', passive_deletes=True)
clusters class-attribute instance-attribute
clusters: Mapped[list[Clusters]] = relationship(secondary=__table__, primaryjoin='SourceConfigs.source_config_id == ClusterSourceKey.source_config_id', secondaryjoin='ClusterSourceKey.cluster_id == Clusters.cluster_id', viewonly=True)
list_all classmethod
list_all() -> list[SourceConfigs]

Returns all source_configs in the database.

from_dto classmethod
from_dto(config: SourceConfig) -> SourceConfigs

Create a SourceConfigs instance from a SourceConfig DTO object.

to_dto
to_dto() -> SourceConfig

Convert ORM source to a matchbox.common.SourceConfig object.

count classmethod
count() -> int

Counts the number of rows in the table.

ModelConfigs

Bases: CountMixin, MatchboxBase


              flowchart TD
              matchbox.server.postgresql.orm.ModelConfigs[ModelConfigs]
              matchbox.server.postgresql.mixin.CountMixin[CountMixin]

                              matchbox.server.postgresql.mixin.CountMixin --> matchbox.server.postgresql.orm.ModelConfigs
                


            

Table of model configs for Matchbox.

Methods:

  • list_all

    Returns all model_configs in the database.

  • from_dto

    Create a ModelConfigs instance from a ModelConfig DTO object.

  • to_dto

    Convert ORM model config to a matchbox.common.ModelConfig object.

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'model_configs'
model_config_id class-attribute instance-attribute
model_config_id: Mapped[int] = mapped_column(BIGINT, Identity(start=1), primary_key=True)
step_id class-attribute instance-attribute
step_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('steps.step_id', ondelete='CASCADE'), nullable=False)
model_class class-attribute instance-attribute
model_class: Mapped[str] = mapped_column(TEXT, nullable=False)
model_settings class-attribute instance-attribute
model_settings: Mapped[dict[str, Any]] = mapped_column(JSONB, nullable=False)
left_query class-attribute instance-attribute
left_query: Mapped[dict[str, Any]] = mapped_column(JSONB, nullable=False)
right_query class-attribute instance-attribute
right_query: Mapped[dict[str, Any] | None] = mapped_column(JSONB, nullable=True)
name property
name: str

Get the name of the related step.

model_step class-attribute instance-attribute
model_step: Mapped[Steps] = relationship(back_populates='model_config')
list_all classmethod
list_all() -> list[SourceConfigs]

Returns all model_configs in the database.

from_dto classmethod
from_dto(config: ModelConfig) -> ModelConfigs

Create a ModelConfigs instance from a Step DTO object.

to_dto
to_dto() -> ModelConfig

Convert ORM model config to a matchbox.common.ModelConfig object.

count classmethod
count() -> int

Counts the number of rows in the table.

ResolverConfigs

Bases: CountMixin, MatchboxBase


Table of resolver configs for Matchbox.

Methods:

  • from_dto

    Create a ResolverConfigs instance from a Step DTO object.

  • to_dto

    Convert ORM resolver config to a matchbox.common.ResolverConfig object.

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'resolver_configs'
resolver_config_id class-attribute instance-attribute
resolver_config_id: Mapped[int] = mapped_column(BIGINT, Identity(start=1), primary_key=True)
step_id class-attribute instance-attribute
step_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('steps.step_id', ondelete='CASCADE'), nullable=False)
resolver_class class-attribute instance-attribute
resolver_class: Mapped[str] = mapped_column(TEXT, nullable=False)
resolver_settings class-attribute instance-attribute
resolver_settings: Mapped[dict[str, Any]] = mapped_column(JSONB, nullable=False)
resolver_step class-attribute instance-attribute
resolver_step: Mapped[Steps] = relationship(back_populates='resolver_config')
__table_args__ class-attribute instance-attribute
__table_args__ = (UniqueConstraint('step_id', name='resolver_configs_step_key'),)
from_dto classmethod
from_dto(config: ResolverConfig) -> ResolverConfigs

Create a ResolverConfigs instance from a Step DTO object.

to_dto
to_dto() -> ResolverConfig

Convert ORM resolver config to a matchbox.common.ResolverConfig object.

count classmethod
count() -> int

Counts the number of rows in the table.

Contains

Bases: CountMixin, MatchboxBase


Cluster lineage table.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'contains'
root class-attribute instance-attribute
root: Mapped[int] = mapped_column(BIGINT, ForeignKey('clusters.cluster_id', ondelete='CASCADE'), primary_key=True)
leaf class-attribute instance-attribute
leaf: Mapped[int] = mapped_column(BIGINT, ForeignKey('clusters.cluster_id', ondelete='CASCADE'), primary_key=True)
__table_args__ class-attribute instance-attribute
__table_args__ = (CheckConstraint('root != leaf', name='no_self_containment'), UniqueConstraint('root', 'leaf'), Index('ix_contains_root_leaf', 'root', 'leaf'), Index('ix_contains_leaf_root', 'leaf', 'root'))
count classmethod
count() -> int

Counts the number of rows in the table.
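
The constraints on this table can be tried outside PostgreSQL. The sketch below is a minimal illustration that uses the standard-library sqlite3 module as a stand-in, on the assumption that its composite primary key and CHECK behave closely enough for the purpose. The table, column, and constraint names come from the definition above; the helper try_insert, the sample IDs, and the omission of foreign keys and indexes are illustrative choices, not part of Matchbox.

```python
import sqlite3

# In-memory stand-in for the contains table (foreign keys and indexes omitted).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE contains (
        root INTEGER NOT NULL,
        leaf INTEGER NOT NULL,
        PRIMARY KEY (root, leaf),
        CONSTRAINT no_self_containment CHECK (root != leaf)
    )
    """
)


def try_insert(root: int, leaf: int) -> bool:
    """Hypothetical helper: True if the row is accepted, False if rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO contains (root, leaf) VALUES (?, ?)", (root, leaf)
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(10, 1),  # cluster 10 contains leaf 1
    try_insert(10, 2),  # cluster 10 contains leaf 2
    try_insert(10, 1),  # duplicate (root, leaf) pair
    try_insert(3, 3),   # a cluster cannot contain itself
]

# Look up the leaves of one root cluster.
leaves_of_10 = [
    row[0]
    for row in conn.execute(
        "SELECT leaf FROM contains WHERE root = ? ORDER BY leaf", (10,)
    )
]
```

Only the first two inserts succeed, so the lookup for root 10 returns its two leaves.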

Clusters

Bases: CountMixin, MatchboxBase


Table of indexed data and clusters that match it.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'clusters'
cluster_id class-attribute instance-attribute
cluster_id: Mapped[int] = mapped_column(BIGINT, primary_key=True)
cluster_hash class-attribute instance-attribute
cluster_hash: Mapped[bytes] = mapped_column(BYTEA, nullable=False)
keys class-attribute instance-attribute
keys: Mapped[list[ClusterSourceKey]] = relationship(back_populates='cluster', passive_deletes=True)
leaves class-attribute instance-attribute
leaves: Mapped[list[Clusters]] = relationship(secondary=__table__, primaryjoin='Clusters.cluster_id == Contains.root', secondaryjoin='Clusters.cluster_id == Contains.leaf', backref='roots')
source_configs class-attribute instance-attribute
source_configs: Mapped[list[SourceConfigs]] = relationship(secondary=__table__, primaryjoin='Clusters.cluster_id == ClusterSourceKey.cluster_id', secondaryjoin='ClusterSourceKey.source_config_id == SourceConfigs.source_config_id', viewonly=True)
__table_args__ class-attribute instance-attribute
__table_args__ = (UniqueConstraint('cluster_hash', name='clusters_hash_key'),)
count classmethod
count() -> int

Counts the number of rows in the table.

UserGroups

Bases: MatchboxBase


Association table for user-group membership.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'user_groups'
user_id class-attribute instance-attribute
user_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('users.user_id', ondelete='CASCADE'), primary_key=True)
group_id class-attribute instance-attribute
group_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('groups.group_id', ondelete='CASCADE'), primary_key=True)

Users

Bases: CountMixin, MatchboxBase


Table of user identities.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'users'
user_id class-attribute instance-attribute
user_id: Mapped[int] = mapped_column(BIGINT, primary_key=True)
name class-attribute instance-attribute
name: Mapped[str] = mapped_column(TEXT, nullable=False)
email class-attribute instance-attribute
email: Mapped[str] = mapped_column(TEXT, nullable=True)
judgements class-attribute instance-attribute
judgements: Mapped[list[EvalJudgements]] = relationship(back_populates='user')
groups class-attribute instance-attribute
groups: Mapped[list[Groups]] = relationship(secondary=__table__, back_populates='members')
__table_args__ class-attribute instance-attribute
__table_args__ = (UniqueConstraint('name', name='user_name_unique'),)
count classmethod
count() -> int

Counts the number of rows in the table.

Groups

Bases: CountMixin, MatchboxBase


Groups for permission management.

Methods:

  • initialise

    Create standard users, groups, and permissions.

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'groups'
group_id class-attribute instance-attribute
group_id: Mapped[int] = mapped_column(BIGINT, primary_key=True, autoincrement=True)
name class-attribute instance-attribute
name: Mapped[str] = mapped_column(TEXT, nullable=False)
description class-attribute instance-attribute
description: Mapped[str | None] = mapped_column(TEXT, nullable=True)
is_system class-attribute instance-attribute
is_system: Mapped[bool] = mapped_column(BOOLEAN, default=False, nullable=False)
members class-attribute instance-attribute
members: Mapped[list[Users]] = relationship(secondary=__table__, back_populates='groups')
permissions class-attribute instance-attribute
permissions: Mapped[list[Permissions]] = relationship(back_populates='group', passive_deletes=True)
__table_args__ class-attribute instance-attribute
__table_args__ = (UniqueConstraint('name', name='groups_name_key'),)
initialise classmethod
initialise() -> None

Create standard users, groups, and permissions.

count classmethod
count() -> int

Counts the number of rows in the table.

Permissions

Bases: CountMixin, MatchboxBase


Permissions granted to groups on resources.

Each resource type should have its own column. This produces many nulls, but nulls are cheap in PostgreSQL and the table is ultimately small, and it avoids a polymorphic association.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'permissions'
permission_id class-attribute instance-attribute
permission_id: Mapped[int] = mapped_column(BIGINT, primary_key=True, autoincrement=True)
permission class-attribute instance-attribute
permission: Mapped[str] = mapped_column(TEXT, nullable=False)
group_id class-attribute instance-attribute
group_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('groups.group_id', ondelete='CASCADE'), nullable=False)
collection_id class-attribute instance-attribute
collection_id: Mapped[int | None] = mapped_column(BIGINT, ForeignKey('collections.collection_id', ondelete='CASCADE'), nullable=True)
is_system class-attribute instance-attribute
is_system: Mapped[bool | None] = mapped_column(BOOLEAN, nullable=True)
group class-attribute instance-attribute
group: Mapped[Groups] = relationship(back_populates='permissions')
collection class-attribute instance-attribute
collection: Mapped[Collections | None] = relationship(back_populates='permissions')
__table_args__ class-attribute instance-attribute
__table_args__ = (CheckConstraint("permission IN ('read', 'write', 'admin')", name='valid_permission'), CheckConstraint('(collection_id IS NOT NULL AND is_system IS NULL) OR (collection_id IS NULL AND is_system = true)', name='exactly_one_resource'), UniqueConstraint('permission', 'group_id', 'collection_id', 'is_system', name='unique_permission_grant', postgresql_nulls_not_distinct=True))
count classmethod
count() -> int

Counts the number of rows in the table.
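
As a rough illustration of the table constraints above, the following sketch mirrors valid_permission and the intent suggested by the name exactly_one_resource in plain Python. Grant and is_valid_grant are hypothetical names introduced here, not Matchbox APIs. The function uses ordinary Boolean logic, whereas PostgreSQL evaluates the CHECK expression with three-valued logic, so this is an approximation of the intent rather than a reproduction of the database's behaviour.

```python
from __future__ import annotations

from dataclasses import dataclass

# Values allowed by the valid_permission check constraint.
VALID_PERMISSIONS = {"read", "write", "admin"}


@dataclass(frozen=True)
class Grant:
    """Hypothetical stand-in for one row of the permissions table."""

    permission: str
    group_id: int
    collection_id: int | None = None
    is_system: bool | None = None


def is_valid_grant(grant: Grant) -> bool:
    """Check a grant targets exactly one resource with a known permission."""
    valid_permission = grant.permission in VALID_PERMISSIONS
    on_collection = grant.collection_id is not None and grant.is_system is None
    on_system = grant.collection_id is None and grant.is_system is True
    return valid_permission and (on_collection or on_system)


grants = [
    Grant("read", 1, collection_id=7),
    Grant("read", 1, collection_id=7),  # same grant again
    Grant("admin", 2, is_system=True),
]

# Python treats None == None as equal, which loosely resembles
# postgresql_nulls_not_distinct=True on the unique constraint.
unique_grants = set(grants)
```

A grant that names both a collection and the system resource, or that uses a permission outside the three allowed values, fails the check.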

EvalJudgements

Bases: CountMixin, MatchboxBase


Table of evaluation judgements produced by human validators.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'eval_judgements'
judgement_id class-attribute instance-attribute
judgement_id: Mapped[int] = mapped_column(BIGINT, primary_key=True)
user_id class-attribute instance-attribute
user_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('users.user_id', ondelete='CASCADE'), nullable=False)
endorsed_cluster_id class-attribute instance-attribute
endorsed_cluster_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('clusters.cluster_id', ondelete='CASCADE'), nullable=False)
shown_cluster_id class-attribute instance-attribute
shown_cluster_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('clusters.cluster_id', ondelete='CASCADE'), nullable=False)
tag class-attribute instance-attribute
tag: Mapped[str] = mapped_column(TEXT, nullable=True)
timestamp class-attribute instance-attribute
timestamp: Mapped[DateTime] = mapped_column(DateTime(timezone=True), nullable=False)
user class-attribute instance-attribute
user: Mapped[Users] = relationship(back_populates='judgements')
count classmethod
count() -> int

Counts the number of rows in the table.

ModelEdges

Bases: CountMixin, MatchboxBase


Table of results for a model step.

Stores the raw left/right scores created by a model.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'model_edges'
result_id class-attribute instance-attribute
result_id: Mapped[int] = mapped_column(BIGINT, primary_key=True, autoincrement=True)
step_id class-attribute instance-attribute
step_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('steps.step_id', ondelete='CASCADE'), nullable=False)
left_id class-attribute instance-attribute
left_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('clusters.cluster_id', ondelete='CASCADE'), nullable=False)
right_id class-attribute instance-attribute
right_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('clusters.cluster_id', ondelete='CASCADE'), nullable=False)
score class-attribute instance-attribute
score: Mapped[float] = mapped_column(REAL, nullable=False)
proposed_by class-attribute instance-attribute
proposed_by: Mapped[Steps] = relationship(back_populates='model_edges')
__table_args__ class-attribute instance-attribute
__table_args__ = (Index('ix_model_edges_step', 'step_id'), CheckConstraint('score >= 0.0 AND score <= 1.0', name='valid_score'), UniqueConstraint('step_id', 'left_id', 'right_id'))
count classmethod
count() -> int

Counts the number of rows in the table.

ResolverClusters

Bases: CountMixin, MatchboxBase


Association table linking resolver steps to cluster IDs.

Methods:

  • count

    Counts the number of rows in the table.

Attributes:

__tablename__ class-attribute instance-attribute
__tablename__ = 'resolver_clusters'
step_id class-attribute instance-attribute
step_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('steps.step_id', ondelete='CASCADE'), primary_key=True)
cluster_id class-attribute instance-attribute
cluster_id: Mapped[int] = mapped_column(BIGINT, ForeignKey('clusters.cluster_id', ondelete='CASCADE'), primary_key=True)
proposed_by class-attribute instance-attribute
proposed_by: Mapped[Steps] = relationship(back_populates='resolver_clusters')
__table_args__ class-attribute instance-attribute
__table_args__ = (Index('ix_resolver_clusters_step', 'step_id'),)
count classmethod
count() -> int

Counts the number of rows in the table.

utils

Utilities for using the PostgreSQL backend.

Modules:

  • db

    General utilities for the PostgreSQL backend.

  • insert

    Utilities for inserting data into the PostgreSQL backend.

  • query

    Utilities for querying and matching in the PostgreSQL backend.

db

General utilities for the PostgreSQL backend.

Functions:

dump
dump() -> MatchboxSnapshot

Dumps the entire database to a snapshot.

Returns:

  • MatchboxSnapshot

    A MatchboxSnapshot object of type “postgres” with the database’s current state.

restore

Restores the database from a snapshot.

Parameters:

  • snapshot
    (MatchboxSnapshot) –

    A MatchboxSnapshot object of type “postgres” with the database’s state

  • batch_size
    (int) –

    The number of records to insert in each batch

Raises:

grant_permission
grant_permission(session: Session, group_name: GroupName, permission: PermissionType, resource: Literal[SYSTEM] | CollectionName) -> None

Grant permission within a session.

Committing (or rolling back) is left to the caller.
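
The caller-controlled transaction pattern can be sketched with the standard-library sqlite3 module. This does not call grant_permission itself: the grants table and the helpers grant and count_grants are hypothetical stand-ins, used only to show that a helper which stages a write without committing lets the caller decide the outcome.

```python
import sqlite3


def grant(conn: sqlite3.Connection, group_name: str, permission: str) -> None:
    """Hypothetical helper: stage a grant but leave the transaction open."""
    conn.execute(
        "INSERT INTO grants (group_name, permission) VALUES (?, ?)",
        (group_name, permission),
    )


def count_grants(conn: sqlite3.Connection) -> int:
    """Count rows visible on this connection."""
    return conn.execute("SELECT COUNT(*) FROM grants").fetchall()[0][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grants (group_name TEXT, permission TEXT)")
conn.commit()

# The caller decides to discard the staged grant.
grant(conn, "analysts", "read")
conn.rollback()
after_rollback = count_grants(conn)

# The caller batches two grants into one commit.
grant(conn, "analysts", "read")
grant(conn, "analysts", "write")
conn.commit()
after_commit = count_grants(conn)
```

Keeping the commit outside the helper lets several grants succeed or fail together.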

sqa_profiled
sqa_profiled() -> Generator[None, None, None]

SQLAlchemy profiler.

Taken directly from the SQLAlchemy docs: https://docs.sqlalchemy.org/en/20/faq/performance.html#query-profiling
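
For orientation, a profiling context manager of this kind can be built from the standard-library cProfile and pstats modules. The sketch below is written in that spirit and is not the Matchbox implementation: the name profiled and the choice to write the report into a caller-supplied buffer instead of printing it are assumptions made for illustration.

```python
import contextlib
import cProfile
import io
import pstats
from collections.abc import Generator


@contextlib.contextmanager
def profiled(report: io.StringIO) -> Generator[None, None, None]:
    """Profile the enclosed block and write cumulative stats to `report`."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield
    finally:
        profiler.disable()
        stats = pstats.Stats(profiler, stream=report)
        stats.sort_stats("cumulative").print_stats()


report = io.StringIO()
with profiled(report):
    # Any code can go here, for example a block that runs ORM queries.
    total = sum(i * i for i in range(10_000))

profile_text = report.getvalue()
```

After the block exits, profile_text holds the call statistics sorted by cumulative time.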

compile_sql
compile_sql(stmt: Select) -> str

Compiles a SQLAlchemy statement into a string.

Parameters:

  • stmt
    (Select) –

    The SQLAlchemy statement to compile.

Returns:

  • str

    The compiled SQL statement as a string.

ingest_to_temporary_table
ingest_to_temporary_table(table_name: str, schema_name: str, data: Table, column_types: dict[str, TypeEngine], max_chunksize: int | None = None) -> Generator[Table, None, None]

Context manager to ingest Arrow data to a temporary table with explicit types.

Parameters:

  • table_name
    (str) –

    Base name for the temporary table

  • schema_name
    (str) –

    Schema where the temporary table will be created

  • data
    (Table) –

    PyArrow table containing the data to ingest

  • column_types
    (dict[str, TypeEngine]) –

    Map of column names to SQLAlchemy type instances

  • max_chunksize
    (int | None, default: None ) –

    Optional maximum chunk size for batches

Yields:

  • Table

    A SQLAlchemy Table object representing the temporary table
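
The general shape of such a context manager (create a uniquely named temporary table, load data in chunks, yield, then drop) can be sketched with the standard-library sqlite3 module. This is an illustration under stated assumptions, not the Matchbox code: temporary_table is a hypothetical helper, plain tuples stand in for Arrow data, SQL type strings stand in for SQLAlchemy types, and it yields the table name where the real function yields a Table object.

```python
from __future__ import annotations

import contextlib
import sqlite3
import uuid
from collections.abc import Iterator


@contextlib.contextmanager
def temporary_table(
    conn: sqlite3.Connection,
    base_name: str,
    columns: dict[str, str],
    rows: list[tuple],
    max_chunksize: int | None = None,
) -> Iterator[str]:
    """Hypothetical helper: load rows into a temp table, drop it on exit."""
    # A unique suffix avoids clashes between concurrent uploads.
    # Identifiers are interpolated directly, so they must be trusted.
    name = f"{base_name}_{uuid.uuid4().hex}"
    column_sql = ", ".join(f"{col} {sql_type}" for col, sql_type in columns.items())
    conn.execute(f"CREATE TEMP TABLE {name} ({column_sql})")
    try:
        placeholders = ", ".join("?" for _ in columns)
        step = max_chunksize or max(len(rows), 1)
        for start in range(0, len(rows), step):
            conn.executemany(
                f"INSERT INTO {name} VALUES ({placeholders})",
                rows[start : start + step],
            )
        yield name
    finally:
        conn.execute(f"DROP TABLE IF EXISTS {name}")


conn = sqlite3.connect(":memory:")
data = [("a", b"\x01"), ("b", b"\x02"), ("c", b"\x03")]
with temporary_table(
    conn, "hashes", {"key": "TEXT", "hash": "BLOB"}, data, max_chunksize=2
) as name:
    table_name = name
    row_count = len(conn.execute(f"SELECT key FROM {name}").fetchall())

# The table no longer exists once the block has exited.
remaining = len(
    conn.execute(
        "SELECT name FROM sqlite_temp_master WHERE name = ?", (table_name,)
    ).fetchall()
)
```

Dropping the table in a finally clause means it is cleaned up even if the enclosed block raises.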

insert

Utilities for inserting data into the PostgreSQL backend.

Functions:

insert_hashes
insert_hashes(path: SourceStepPath, data_hashes: Table, batch_size: int) -> None

Indexes hash data for a source.

Parameters:

  • path
    (SourceStepPath) –

    The path of the source step

  • data_hashes
    (Table) –

    Arrow table containing hash data

  • batch_size
    (int) –

    Batch size for bulk operations

Raises:

insert_model_edges
insert_model_edges(path: ModelStepPath, results: Table, batch_size: int) -> None

Writes model edges to Matchbox.

insert_clusters
insert_clusters(path: ResolverStepPath, cluster_assignments: Table, batch_size: int) -> None

Write resolver cluster assignments.

The function proceeds in three phases:

  1. Validate: check fingerprint, ensure no prior data, short-circuit if the upload is empty
  2. Compute hashes: ingest assignments to a temp table, expand each child cluster to its leaves, and derive a cluster hash per parent cluster (Python round-trip via _compute_resolver_hashes)
  3. Insert everything: with both temp tables live in one session, materialise new Clusters rows, then insert Contains and ResolverClusters membership rows
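
The second phase can be pictured with a small pure-Python sketch. The real logic lives in _compute_resolver_hashes and is not shown on this page, so everything here is an assumption made for illustration: the function name compute_parent_hashes, the in-memory dictionaries standing in for temp tables, the rule that a child with no recorded leaves counts as its own leaf, and the choice of SHA-256 over the sorted leaf hashes.

```python
import hashlib
from collections import defaultdict


def compute_parent_hashes(
    assignments: list[tuple[int, int]],
    leaves_by_cluster: dict[int, set[int]],
    leaf_hashes: dict[int, bytes],
) -> tuple[dict[int, set[int]], dict[int, bytes]]:
    """Hypothetical sketch: expand children to leaves, then hash per parent."""
    leaves_by_parent: dict[int, set[int]] = defaultdict(set)
    for parent_id, child_id in assignments:
        # Assumption: a child with no recorded leaves is itself a leaf.
        leaves_by_parent[parent_id] |= leaves_by_cluster.get(child_id, {child_id})

    hashes: dict[int, bytes] = {}
    for parent_id, leaves in leaves_by_parent.items():
        digest = hashlib.sha256()
        # Sorting makes the result independent of assignment order.
        for leaf_hash in sorted(leaf_hashes[leaf] for leaf in leaves):
            digest.update(leaf_hash)
        hashes[parent_id] = digest.digest()
    return dict(leaves_by_parent), hashes


# (parent_id, child_id) rows; cluster 10 already contains leaves 1 and 2.
assignments = [(100, 1), (100, 10), (200, 3)]
leaves_by_cluster = {10: {1, 2}}
leaf_hashes = {1: b"a", 2: b"b", 3: b"c"}

leaves, hashes = compute_parent_hashes(assignments, leaves_by_cluster, leaf_hashes)
_, reversed_hashes = compute_parent_hashes(
    list(reversed(assignments)), leaves_by_cluster, leaf_hashes
)
```

Deriving the hash from the set of leaves means two parents that cover the same leaves get the same hash, whichever children they were assembled from.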

Parameters:

  • path
    (ResolverStepPath) –

    The resolver step path to upload cluster assignments for

  • cluster_assignments
    (Table) –

    Arrow table conforming to SCHEMA_CLUSTERS, having (parent_id, child_id) columns

  • batch_size
    (int) –

    Batch size for temporary table ingestion

Raises:

query

Utilities for querying and matching in the PostgreSQL backend.

Functions:

require_complete_resolver
require_complete_resolver(session: Session, path: ResolverStepPath) -> Steps

Resolve and validate a resolver path for query-time operations.

resolver_membership_subquery
resolver_membership_subquery(step_id: int, alias: str = 'resolver_membership') -> Subquery

Build root_id/leaf_id membership rows for a resolver.
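
A rough picture of root_id/leaf_id membership rows, assuming they come from joining resolver_clusters to contains on the cluster ID. The sketch uses the standard-library sqlite3 module with simplified stand-in tables and made-up sample rows; the function resolver_membership is hypothetical, and the real subquery may treat cases such as clusters without Contains rows differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contains (
        root INTEGER, leaf INTEGER, PRIMARY KEY (root, leaf)
    );
    CREATE TABLE resolver_clusters (
        step_id INTEGER, cluster_id INTEGER, PRIMARY KEY (step_id, cluster_id)
    );
    INSERT INTO contains VALUES (10, 1), (10, 2), (11, 3), (12, 1);
    INSERT INTO resolver_clusters VALUES (5, 10), (5, 11), (6, 12);
    """
)


def resolver_membership(step_id: int) -> list[tuple[int, int]]:
    """Hypothetical helper: (root_id, leaf_id) rows for one resolver step."""
    return conn.execute(
        """
        SELECT rc.cluster_id AS root_id, c.leaf AS leaf_id
        FROM resolver_clusters AS rc
        JOIN contains AS c ON c.root = rc.cluster_id
        WHERE rc.step_id = ?
        ORDER BY root_id, leaf_id
        """,
        (step_id,),
    ).fetchall()


membership = resolver_membership(5)
```

Each resolver step sees only the clusters it published, so leaf 1 can belong to root 10 under step 5 and to root 12 under step 6.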

query
query(source: SourceStepPath, resolver: ResolverStepPath | None = None, return_leaf_id: bool = False, limit: int | None = None) -> Table

Query Matchbox to retrieve linked data for a source.

match
match(key: str, source: SourceStepPath, targets: list[SourceStepPath], resolver: ResolverStepPath) -> list[Match]

Match a source key against targets via a resolver.