Results

matchbox.client.results

Objects representing the results of running a model client-side.

Classes:

ModelResults

ModelResults(probabilities: DataFrame, left_root_leaf: DataFrame | None = None, right_root_leaf: DataFrame | None = None)

Results of a model run.

Contains:

  • The probabilities of each pair being a match
  • (Optional) The clusters of connected components at each threshold
  • (Optional) The leaf_id mapping to trace results back to source clusters

Allows users to easily interrogate the outputs of models, explore decisions on choosing thresholds for clustering, and upload the results to Matchbox.
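To make "clusters of connected components at each threshold" concrete, here is a minimal sketch in plain Python. It is not Matchbox's implementation: the `components_at_threshold` helper, the `(left_id, right_id, probability)` tuple layout, and the `>=` comparison are all assumptions made for illustration, and the real columns are whatever SCHEMA_RESULTS defines.

```python
from collections import defaultdict


def components_at_threshold(pairs, threshold):
    """Group IDs into connected components using pairs scoring >= threshold.

    `pairs` is a hypothetical list of (left_id, right_id, probability) tuples.
    IDs that appear only in pairs below the threshold are not returned.
    """
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for left, right, prob in pairs:
        if prob >= threshold:  # assumption: inclusive comparison
            root_l, root_r = find(left), find(right)
            if root_l != root_r:
                parent[root_r] = root_l

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


pairs = [(1, 2, 0.95), (2, 3, 0.80), (4, 5, 0.60)]
high = components_at_threshold(pairs, 0.9)
low = components_at_threshold(pairs, 0.7)
print(high, low)
```

Lowering the threshold admits more pairs, so components can only grow or merge. That trade-off is what exploring thresholds is about.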

Parameters:

  • probabilities

    (DataFrame) –

    dataframe with SCHEMA_RESULTS

  • left_root_leaf

    (DataFrame | None, default: None ) –

    optional dataframe with columns: id, leaf_id

  • right_root_leaf

    (DataFrame | None, default: None ) –

    optional dataframe with columns: id, leaf_id

Methods:

  • root_leaf

    Returns all roots and leaves implied by these results.

Attributes:

left_root_leaf instance-attribute

left_root_leaf = None

right_root_leaf instance-attribute

right_root_leaf = None

probabilities instance-attribute

probabilities: DataFrame = cast(Schema(SCHEMA_RESULTS))

clusters property

clusters: DataFrame

Retrieve new clusters implied by these results.

root_leaf

root_leaf() -> DataFrame

Returns all roots and leaves implied by these results.

ResolvedMatches

ResolvedMatches(sources: list[Source], query_results: list[DataFrame])

Matches according to resolution.

Parameters:

  • sources

    (list[Source]) –

    List of Source objects

  • query_results

    (list[DataFrame]) –

    List of tables with SCHEMA_QUERY_WITH_LEAVES

Methods:

  • as_lookup

    Return lookup across matchbox ID and source keys.

  • as_cluster_key_map

    Return mapping across root, leaf, source and keys.

  • as_leaf_sets

    Return grouping of leaf IDs.

  • view_cluster

    Return source data for all records in cluster.

  • merge

    Combine two instances of resolved matches by merging clusters.

Attributes:

sources instance-attribute

sources = sources

query_results instance-attribute

query_results = query_results

as_lookup

as_lookup() -> DataFrame

Return lookup across matchbox ID and source keys.

as_cluster_key_map

as_cluster_key_map() -> DataFrame

Return mapping across root, leaf, source and keys.

as_leaf_sets

as_leaf_sets() -> list[list[int]]

Return grouping of leaf IDs.
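The `list[list[int]]` return type suggests one inner list of leaf IDs per cluster. The sketch below shows that shape in plain Python. The `leaf_sets` helper and the `(root_id, leaf_id)` row layout are hypothetical and are not taken from the library.

```python
from collections import defaultdict


def leaf_sets(rows):
    """Group leaf IDs by root ID.

    `rows` is a hypothetical iterable of (root_id, leaf_id) tuples.
    Duplicate rows collapse because leaves are collected in a set.
    """
    grouped = defaultdict(set)
    for root_id, leaf_id in rows:
        grouped[root_id].add(leaf_id)
    # One sorted list of leaves per root, ordered by root ID.
    return [sorted(leaves) for _, leaves in sorted(grouped.items())]


rows = [(10, 1), (10, 2), (11, 3), (10, 2)]
sets_ = leaf_sets(rows)
print(sets_)
```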

view_cluster

view_cluster(cluster_id: int, merge_fields: bool = False) -> DataFrame

Return source data for all records in cluster.

Parameters:

  • cluster_id

    (int) –

    ID of root cluster to view

  • merge_fields

    (bool, default: False ) –

    whether to remove the source qualifier when concatenating rows. Applies only to index fields; key fields are not affected.

merge

merge(other: Self) -> Self

Combine two instances of resolved matches by merging clusters.

All cluster IDs will be replaced with negative integers and lose their association with cluster IDs on the backend.
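As a rough illustration of why merged clusters need fresh IDs, the sketch below unions overlapping leaf sets from two inputs and numbers the results with negative integers. It assumes that merging means taking the union of clusters that share a leaf. The source does not spell this out, and the `merge_leaf_sets` helper and its numbering scheme are hypothetical.

```python
def merge_leaf_sets(first, second):
    """Union overlapping leaf sets and assign new negative cluster IDs.

    `first` and `second` are hypothetical lists of leaf-ID lists.
    Assumption: clusters that share any leaf are merged into one.
    """
    merged = []  # list of pairwise-disjoint sets
    for group in [*first, *second]:
        current = set(group)
        remaining = []
        for existing in merged:
            if existing & current:
                current |= existing  # absorb every overlapping cluster
            else:
                remaining.append(existing)
        merged = remaining + [current]

    ordered = sorted(sorted(group) for group in merged)
    # A merged cluster has no single original ID, so number from -1 down.
    return {-(i + 1): leaves for i, leaves in enumerate(ordered)}


a = [[1, 2], [3]]
b = [[2, 4], [5]]
merged = merge_leaf_sets(a, b)
print(merged)
```

Here the cluster containing leaves 1, 2 and 4 exists in neither input, so no original ID describes it. This is consistent with the note that merged IDs lose their association with the backend.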