DTOs
matchbox.common.dtos
Data transfer objects for Matchbox API.
Classes:
- OKMessage – Generic HTTP OK response.
- BackendCountableType – Enumeration of supported backend countable types.
- BackendResourceType – Enumeration of resource types referenced by client or API.
- BackendParameterType – Enumeration of parameter types passable to the API.
- CRUDOperation – Enumeration of CRUD operations.
- LocationType – Enumeration of location types.
- PermissionType – Permission levels for resource access.
- User – User identity.
- LoginResponse – Response from login endpoint.
- AuthStatusResponse – Response model for authentication status.
- Group – Group definition.
- PermissionGrant – A permission on a resource.
- StepPath – Base step identifier with collection, run, and name.
- StepType – Types of nodes in a DAG.
- LocationConfig – Metadata for a location.
- SourceField – A field in a source that can be indexed in the Matchbox database.
- SourceConfig – Configuration of a source that can be, or has been, indexed in the backend.
- QueryCombineType – Enumeration of ways to combine multiple rows having the same matchbox ID.
- QueryConfig – Configuration of a query that generates model inputs.
- ModelType – Enumeration of supported model types.
- ModelConfig – Configuration for a model that has been, or could be, added to the server.
- ResolverType – Enumeration of supported resolver methodology types.
- ResolverConfig – Configuration for a resolver that combines model and resolver outputs.
- Match – A match between primary keys in the Matchbox database.
- Step – Unified step type with common fields and discriminated config.
- Run – A run within a collection.
- Collection – A collection of runs.
- ResourceOperationStatus – Status response for any resource operation.
- CountResult – Response model for count results.
- UploadStage – Enumeration of stages of a file upload and its processing.
- UploadInfo – Response model for file upload processes.
- NotFoundError – API error for a 404 status code.
- InvalidParameterError – API error for a custom 422 status code.
- ErrorResponse – Unified error response for all HTTP error status codes.
- DefaultUser – Default user identities.
- DefaultGroup – Default group names.
Functions:
- validate_matchbox_name – Validate matchbox name format.
Attributes:
- MatchboxName (TypeAlias)
- JsonObject (TypeAlias)
- GroupName (TypeAlias) – Type alias for group names.
- CollectionName (TypeAlias) – Type alias for collection names.
- RunID (TypeAlias) – Type alias for run IDs.
- SourceStepName (TypeAlias) – Type alias for source step names.
- ModelStepName (TypeAlias) – Type alias for model step names.
- ResolverStepName (TypeAlias) – Type alias for resolver step names.
- StepName (TypeAlias) – Type alias for any step names.
- SourceStepPath (TypeAlias) – Type alias for source step paths.
- ModelStepPath (TypeAlias) – Type alias for model step paths.
- ResolverStepPath (TypeAlias) – Type alias for resolver step paths.
MatchboxName
module-attribute
MatchboxName: TypeAlias = Annotated[str, StringConstraints(pattern='^[a-zA-Z0-9_.-]+$', min_length=1, strip_whitespace=True), AfterValidator(validate_matchbox_name), Field(description='Valid name for Matchbox database objects. Must contain only alphanumeric characters, underscores, dots, or hyphens.', examples=['my-dataset', 'user_data.v2', 'experiment_001'], json_schema_extra={'pattern': '^[a-zA-Z0-9_.-]+$'})]
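The string constraints in the `MatchboxName` definition can be approximated with the standard library alone. The sketch below is illustrative only: `looks_like_matchbox_name` is a hypothetical helper, and it does not run the `validate_matchbox_name` after-validator, whose behaviour is not shown here.

```python
import re

# Pattern copied from the MatchboxName definition.
MATCHBOX_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_.-]+$")


def looks_like_matchbox_name(value: str) -> bool:
    """Hypothetical helper: approximate the string constraints only."""
    stripped = value.strip()  # mirrors strip_whitespace=True
    if len(stripped) < 1:  # mirrors min_length=1
        return False
    return MATCHBOX_NAME_PATTERN.fullmatch(stripped) is not None


# The three documented examples, plus two names that should be rejected.
for candidate in ["my-dataset", "user_data.v2", "experiment_001", "bad name", ""]:
    print(repr(candidate), looks_like_matchbox_name(candidate))
```

In practice the Pydantic type itself should be the source of truth; a check like this is only useful for quick pre-validation on the client side.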
CollectionName
module-attribute
CollectionName: TypeAlias = MatchboxName
Type alias for collection names.
SourceStepName
module-attribute
SourceStepName: TypeAlias = MatchboxName
Type alias for source step names.
ModelStepName
module-attribute
ModelStepName: TypeAlias = MatchboxName
Type alias for model step names.
ResolverStepName
module-attribute
ResolverStepName: TypeAlias = MatchboxName
Type alias for resolver step names.
StepName
module-attribute
StepName: TypeAlias = SourceStepName | ModelStepName | ResolverStepName
Type alias for any step names.
SourceStepPath
module-attribute
Type alias for source step paths.
ModelStepPath
module-attribute
Type alias for model step paths.
ResolverStepPath
module-attribute
Type alias for resolver step paths.
OKMessage
Bases: BaseModel
Generic HTTP OK response.
Attributes:
BackendCountableType
Bases: StrEnum
Enumeration of supported backend countable types.
Attributes:
- SOURCES
- MODELS
- SOURCE_CLUSTERS
- MODEL_CLUSTERS
- CLUSTERS
- CREATES
- MERGES
- PROPOSES
BackendResourceType
Bases: StrEnum
Enumeration of resource types referenced by client or API.
Attributes:
BackendParameterType
Bases: StrEnum
Enumeration of parameter types passable to the API.
Attributes:
- SAMPLE_SIZE
- NAME
CRUDOperation
Bases: StrEnum
Enumeration of CRUD operations.
Attributes:
LocationType
PermissionType
Bases: StrEnum
Permission levels for resource access.
Attributes:
User
Bases: BaseModel
User identity.
Attributes:
LoginResponse
Bases: BaseModel
Response from login endpoint.
Attributes:
- user (User)
- setup_mode_admin (bool)
AuthStatusResponse
Bases: BaseModel
Response model for authentication status.
Attributes:
- authenticated (bool)
- user (User | None)
Group
Bases: BaseModel
Group definition.
Attributes:
PermissionGrant
Bases: BaseModel
A permission on a resource.
Resource context should always be supplied.
Attributes:
- group_name (GroupName)
- permission (PermissionType)
StepPath
Bases: BaseModel
Base step identifier with collection, run, and name.
Attributes:
- collection (CollectionName)
- run (RunID)
- name (StepName)
StepType
Bases: StrEnum
Types of nodes in a DAG.
Attributes:
LocationConfig
Bases: BaseModel
Metadata for a location.
Attributes:
- type (LocationType)
- name (str)
SourceField
Bases: BaseModel
A field in a source that can be indexed in the Matchbox database.
Attributes:
SourceConfig
Bases: BaseModel
Configuration of a source that can be, or has been, indexed in the backend.
Sources are foundational processes on top of which linking and deduplication models can build new steps.
Methods:
- validate_key_field – Ensure that the key field is a string and not in the index fields.
- prefix – Get the prefix for the source.
- qualified_key – Get the qualified key for the source.
- qualified_index_fields – Get the qualified index fields for the source.
- qualify_field – Qualify field names with the source name.
- f – Qualify one or more field names with the source name.
Attributes:
- location_config (LocationConfig)
- extract_transform (str)
- key_field (SourceField)
- index_fields (tuple[SourceField, ...])
- dependencies (list[StepName]) – Local execution prerequisites.
- parents (list[StepName]) – Direct DAG edges to this node.
location_config
class-attribute
instance-attribute
location_config: LocationConfig = Field(description='The location of the source. Used to run the extract/transform logic.')
extract_transform
class-attribute
instance-attribute
extract_transform: str = Field(description='Logic to extract and transform data from the source. Language is location dependent.')
key_field
class-attribute
instance-attribute
key_field: SourceField = Field(description=dedent("\n The key field. This is the source's key for unique\n entities, such as a primary key in a relational database.\n\n Keys must ALWAYS be a string.\n\n For example, if the source describes companies, it may have used\n a Companies House number as its key.\n\n This key is ALWAYS correct. It should be something generated and\n owned by the source being indexed.\n \n For example, your organisation's CRM ID is a key field within the CRM.\n \n A CRM ID entered by hand in another dataset shouldn't be used \n as a key field.\n "))
index_fields
class-attribute
instance-attribute
index_fields: tuple[SourceField, ...] = Field(default=None, description=dedent('\n The fields to index in this source, after the extract/transform logic \n has been applied. \n\n This is usually set manually, and should map onto the columns that the\n extract/transform logic returns.\n '))
dependencies
property
Local execution prerequisites.
While this can contain information about graph topology, it should only be used to check validity, never to reconstruct it.
validate_key_field
validate_key_field() -> Self
Ensure that the key field is a string and not in the index fields.
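The rule that `validate_key_field` describes can be sketched without Pydantic. Everything here is hypothetical: `FieldSketch` stands in for `SourceField` (whose attributes are not listed in this reference), the `"string"` type label is an assumption, and comparing fields by name is one plausible reading of "not in the index fields".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSketch:
    """Hypothetical stand-in for SourceField: a name plus a type label."""

    name: str
    type: str


def check_key_field(key_field: FieldSketch, index_fields: tuple) -> None:
    """Raise ValueError if the key is not a string or is also indexed."""
    if key_field.type != "string":  # assumed type label
        raise ValueError("the key field must be a string")
    if any(field.name == key_field.name for field in index_fields):
        raise ValueError("the key field must not be one of the index fields")


key = FieldSketch(name="company_number", type="string")
indexed = (FieldSketch(name="company_name", type="string"),)
check_key_field(key, indexed)  # passes silently
```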
prefix
qualified_key
qualified_index_fields
qualify_field
f
QueryCombineType
Bases: StrEnum
Enumeration of ways to combine multiple rows having the same matchbox ID.
Attributes:
QueryConfig
Bases: BaseModel
Configuration of a query that generates model inputs.
A QueryConfig is a view onto the step subgraph, a triangulation of a set of sources and an optional resolver. It doesn’t describe topology, which is why it has no .parents attribute.
Methods:
- validate_steps – Ensure that step settings are compatible.
Attributes:
- sources (tuple[SourceStepName, ...])
- resolver (ResolverStepName | None)
- combine_type (QueryCombineType)
- cleaning (dict[str, str] | None)
- dependencies (list[StepName]) – Local execution prerequisites.
- resolves_from (StepName) – Return the step name that the query resolves from.
dependencies
property
Local execution prerequisites.
While this can contain information about graph topology, it should only be used to check validity, never to reconstruct it.
ModelType
Bases: StrEnum
Enumeration of supported model types.
Attributes:
ModelConfig
Bases: BaseModel
Configuration for a model that has been, or could be, added to the server.
Methods:
- validate_right_query – Ensure that a right query is set if and only if model is linker.
Attributes:
- type (ModelType)
- model_class (str)
- model_settings (JsonObject)
- left_query (QueryConfig)
- right_query (QueryConfig | None)
- dependencies (list[StepName]) – Local execution prerequisites.
- parents (list[StepName]) – Direct DAG edges to this node.
dependencies
property
Local execution prerequisites.
While this can contain information about graph topology, it should only be used to check validity, never to reconstruct it.
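The "if and only if" rule that `validate_right_query` enforces on `ModelConfig` can be illustrated with a plain dataclass. This is a sketch under assumptions, not the library's code: `ModelKind` and its members stand in for `ModelType` (whose members are not listed in this reference), and queries are reduced to strings.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ModelKind(str, Enum):
    """Hypothetical stand-in for ModelType; member names are assumed."""

    LINKER = "linker"
    DEDUPER = "deduper"


@dataclass
class ModelSketch:
    kind: ModelKind
    left_query: str
    right_query: Optional[str] = None

    def __post_init__(self) -> None:
        is_linker = self.kind is ModelKind.LINKER
        has_right = self.right_query is not None
        # "If and only if": both conditions must be true or both false.
        if is_linker != has_right:
            raise ValueError(
                "right_query must be set if and only if the model is a linker"
            )


ModelSketch(ModelKind.LINKER, left_query="crm", right_query="companies")
ModelSketch(ModelKind.DEDUPER, left_query="crm")
```

Comparing the two booleans directly covers both failure modes (a linker with no right query, and a non-linker with one) in a single check.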
ResolverType
Bases: StrEnum
Enumeration of supported resolver methodology types.
Attributes:
ResolverConfig
Bases: BaseModel
Configuration for a resolver that combines model and resolver outputs.
Methods:
- validate_inputs – Ensure resolver config has at least one input.
Attributes:
- resolver_class (str)
- resolver_settings (JsonObject)
- inputs (tuple[ModelStepName, ...])
- dependencies (list[ModelStepName]) – Local execution prerequisites.
- parents (list[ModelStepName]) – Direct DAG edges to this node.
dependencies
property
dependencies: list[ModelStepName]
Local execution prerequisites.
While this can contain information about graph topology, it should only be used to check validity, never to reconstruct it.
Match
Bases: BaseModel
A match between primary keys in the Matchbox database.
Methods:
- found_or_none – Ensure that a match has sources and a cluster if target was found.
- serialise_ids – Turn set to sorted list when serialising.
Attributes:
Step
Bases: BaseModel
Unified step type with common fields and discriminated config.
Methods:
- validate_description – Ensure the description is not empty if provided.
- validate_step_type_matches_config – Ensure step_type matches the config type.
Attributes:
- description (str | None)
- fingerprint (Annotated[bytes, PlainSerializer(hash_to_base64, return_type=str), PlainValidator(base64_to_hash)])
- step_type (StepType)
- config (SourceConfig | ModelConfig | ResolverConfig)
description
class-attribute
instance-attribute
description: str | None = Field(default=None, description='Description')
fingerprint
instance-attribute
fingerprint: Annotated[bytes, PlainSerializer(hash_to_base64, return_type=str), PlainValidator(base64_to_hash)]
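The `fingerprint` annotation pairs a serializer with a validator so that raw bytes travel as base64 text in JSON and are converted back on the way in. The sketch below shows that round trip with the standard library only. The two functions are hypothetical stand-ins for `hash_to_base64` and `base64_to_hash`, whose actual implementations (for example, standard versus URL-safe base64) are not shown in this reference, and SHA-256 is an assumed hash.

```python
import base64
import hashlib


def encode_fingerprint(digest: bytes) -> str:
    """Hypothetical stand-in for hash_to_base64: bytes to ASCII text."""
    return base64.b64encode(digest).decode("ascii")


def decode_fingerprint(text: str) -> bytes:
    """Hypothetical stand-in for base64_to_hash: ASCII text back to bytes."""
    return base64.b64decode(text, validate=True)


# SHA-256 is an assumption; the reference does not say how fingerprints are made.
digest = hashlib.sha256(b"example step config").digest()
encoded = encode_fingerprint(digest)

# The round trip must give back exactly the same bytes.
assert decode_fingerprint(encoded) == digest
```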
validate_description
classmethod
Ensure the description is not empty if provided.
Run
Bases: BaseModel
A run within a collection.
Attributes:
- run_id (RunID | None)
- is_default (bool)
- is_mutable (bool)
- steps (dict[StepName, Step])
run_id
class-attribute
instance-attribute
run_id: RunID | None = Field(description='Unique ID of the run')
is_default
class-attribute
instance-attribute
is_default: bool = Field(default=False, description='Whether this run is the default in its collection')
Collection
Bases: BaseModel
A collection of runs.
Methods:
- validate_default_run – Check default run is within all runs.
Attributes:
- default_run (RunID | None)
- runs (list[RunID])
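The consistency check that `validate_default_run` describes for `Collection` might look like the following. This is a minimal sketch, assuming run IDs are integers (the definition of `RunID` is not shown in this reference); `CollectionSketch` is a hypothetical stand-in, not the Pydantic model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CollectionSketch:
    """Hypothetical stand-in for Collection, with integer run IDs assumed."""

    runs: list = field(default_factory=list)
    default_run: Optional[int] = None

    def __post_init__(self) -> None:
        # A collection may have no default run, but a default must be a known run.
        if self.default_run is not None and self.default_run not in self.runs:
            raise ValueError(
                f"default run {self.default_run} is not among runs {self.runs}"
            )


CollectionSketch(runs=[1, 2, 3], default_run=2)
CollectionSketch(runs=[1, 2, 3])  # no default run is also acceptable
```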
ResourceOperationStatus
Bases: BaseModel
Status response for any resource operation.
Methods:
- error_examples – Examples for error codes.
Attributes:
CountResult
Bases: BaseModel
Response model for count results.
Attributes:
- entities (dict[BackendCountableType, int])
UploadStage
Bases: StrEnum
Enumeration of stages of a file upload and its processing.
Attributes:
- READY
- PROCESSING
- COMPLETE
UploadInfo
Bases: BaseModel
Response model for file upload processes.
Attributes:
- stage (UploadStage | None)
- error (str | None)
NotFoundError
Bases: BaseModel
API error for a 404 status code.
Attributes:
- details (str)
- entity (BackendResourceType)
InvalidParameterError
Bases: BaseModel
API error for a custom 422 status code.
Attributes:
- details (str)
- parameter (BackendParameterType | None)
ErrorResponse
Bases: BaseModel
Unified error response for all HTTP error status codes.
This DTO enables the client to reconstruct the exact exception type that was raised on the server.
Attributes:
- exception_type (MatchboxExceptionType)
- message (str)
- details (dict[str, Any] | None)
exception_type
class-attribute
instance-attribute
exception_type: MatchboxExceptionType = Field(description='The name of the exception class raised on the server')
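One way a client could rebuild an exception from an `ErrorResponse` payload is a name-to-class registry. The sketch below is an assumption about the general technique, not Matchbox's client code: the exception classes, `EXCEPTION_REGISTRY`, and `rebuild_exception` are all hypothetical, and the payload is a plain dict with the three documented fields.

```python
class SketchNotFoundError(Exception):
    """Hypothetical client-side exception."""


class SketchInvalidParameterError(Exception):
    """Hypothetical client-side exception."""


# Map class names, as they would appear in exception_type, to classes.
EXCEPTION_REGISTRY = {
    cls.__name__: cls
    for cls in (SketchNotFoundError, SketchInvalidParameterError)
}


def rebuild_exception(payload: dict) -> Exception:
    """Map an error payload back to an exception instance."""
    # Unknown names fall back to RuntimeError rather than failing the lookup.
    exc_class = EXCEPTION_REGISTRY.get(payload["exception_type"], RuntimeError)
    return exc_class(payload["message"])


payload = {
    "exception_type": "SketchNotFoundError",
    "message": "collection not found",
    "details": None,
}
error = rebuild_exception(payload)
```

A caller would then `raise error`, so that code catching the specific exception class behaves the same whether the failure happened locally or on the server.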