
Evaluation

matchbox.client.eval

Module implementing client-side evaluation features.

Modules:

  • utils

    Collection of client-side functions in aid of model evaluation.

Classes:

  • EvalData

    Object which caches evaluation data to measure performance of models.

Functions:

  • compare_models

    Compare metrics of models based on evaluation data.

  • get_samples

    Retrieve samples enriched with source data, grouped by resolution cluster.

EvalData

EvalData()

Object which caches evaluation data to measure performance of models.

Methods:

  • precision_recall

    Computes precision and recall at one threshold.

  • pr_curve

    Computes precision and recall for each threshold in results.

precision_recall

precision_recall(
    results: Results, threshold: float
) -> PrecisionRecall

Computes precision and recall at one threshold.
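
The internals of Results and PrecisionRecall are not shown here; as an illustrative sketch only, assuming scored match candidates can be reduced to (probability, is_true_match) pairs against cached judgements, precision and recall at one threshold work like this (the function name and shapes below are hypothetical, not the library's API):

```python
# Hypothetical sketch of precision/recall at a single threshold.
# `pairs` stands in for scored results joined against ground-truth
# judgements; the real method operates on Results and cached EvalData.
def precision_recall_at(
    pairs: list[tuple[float, bool]], threshold: float
) -> tuple[float, float]:
    # Predictions at or above the threshold are treated as positives
    predicted = [truth for score, truth in pairs if score >= threshold]
    true_positives = sum(predicted)
    # True matches scored below the threshold are missed (false negatives)
    false_negatives = sum(truth for score, truth in pairs if score < threshold)
    actual_positives = true_positives + false_negatives
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / actual_positives if actual_positives else 0.0
    return precision, recall

pairs = [(0.9, True), (0.8, True), (0.7, False), (0.4, True)]
print(precision_recall_at(pairs, 0.6))  # precision 2/3, recall 2/3
```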

pr_curve

pr_curve(results: Results) -> Figure

Computes precision and recall for each threshold in results.
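
The returned Figure plots one precision/recall point per threshold. As a dependency-free sketch of the underlying sweep (hypothetical names, not the library's implementation), each distinct score in the results can serve as a threshold:

```python
# Hypothetical sketch of a PR-curve sweep: evaluate precision and recall
# at every distinct score found in the results, lowest threshold first.
def pr_points(
    pairs: list[tuple[float, bool]],
) -> list[tuple[float, float, float]]:
    points = []
    positives = sum(truth for _, truth in pairs)  # total true matches
    for threshold in sorted({score for score, _ in pairs}):
        kept = [truth for score, truth in pairs if score >= threshold]
        true_positives = sum(kept)
        precision = true_positives / len(kept) if kept else 0.0
        recall = true_positives / positives if positives else 0.0
        points.append((threshold, precision, recall))
    return points
```

Lower thresholds keep more candidate matches, trading precision for recall; a plotting library can then draw recall against precision from these points.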

compare_models

Compare metrics of models based on evaluation data.

Returns:

  • ModelComparison

    A model comparison object, listing metrics for each model.
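
The exact shape of ModelComparison is not shown here; as a rough sketch under that caveat, comparing models amounts to computing the same metrics for each model against shared evaluation data and tabulating them per model (all names below are illustrative stand-ins):

```python
# Hypothetical sketch: score each model's candidate pairs against shared
# ground truth at one threshold and collect per-model metrics.
def compare(
    models: dict[str, list[tuple[float, bool]]], threshold: float
) -> dict[str, dict[str, float]]:
    comparison = {}
    for name, pairs in models.items():
        kept = [truth for score, truth in pairs if score >= threshold]
        true_positives = sum(kept)
        positives = sum(truth for _, truth in pairs)
        comparison[name] = {
            "precision": true_positives / len(kept) if kept else 0.0,
            "recall": true_positives / positives if positives else 0.0,
        }
    return comparison
```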

get_samples

get_samples(
    n: int,
    user_id: int,
    resolution: ModelResolutionName | None = None,
    clients: dict[str, Any] | None = None,
    use_default_client: bool = False,
) -> dict[int, DataFrame]

Retrieve samples enriched with source data, grouped by resolution cluster.

Parameters:

  • n

    (int) –

    Number of clusters to sample

  • user_id

    (int) –

    ID of the user requesting the samples

  • resolution

    (ModelResolutionName | None, default: None ) –

    Model resolution proposing the clusters. If not set, will use a default resolution.

  • clients

    (dict[str, Any] | None, default: None ) –

    Dictionary mapping location names to a valid client for each. Locations whose names are missing from the dictionary will be skipped.

  • use_default_client

    (bool, default: False ) –

    Whether to use, for all locations without a set client, a SQLAlchemy engine for the default warehouse defined in the MB__CLIENT__DEFAULT_WAREHOUSE environment variable.

Returns:

  • dict[int, DataFrame]

    Dictionary of cluster ID to dataframe describing the cluster
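
get_samples returns one pandas DataFrame per cluster; to keep this sketch dependency-free, plain dicts stand in for enriched source rows, and the grouping step is illustrated in isolation (the helper and field names are hypothetical):

```python
# Hypothetical sketch of the return shape: sampled rows, already enriched
# with source data, grouped into one collection per cluster ID.
from collections import defaultdict


def group_by_cluster(rows: list[dict]) -> dict[int, list[dict]]:
    clusters: dict[int, list[dict]] = defaultdict(list)
    for row in rows:
        clusters[row["cluster_id"]].append(row)
    return dict(clusters)


rows = [
    {"cluster_id": 1, "name": "Acme Ltd"},
    {"cluster_id": 1, "name": "ACME Limited"},
    {"cluster_id": 2, "name": "Globex"},
]
clusters = group_by_cluster(rows)  # {1: [...two rows...], 2: [...one row...]}
```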

utils

Collection of client-side functions in aid of model evaluation.

Classes:

  • EvalData

    Object which caches evaluation data to measure performance of models.

Functions:

  • get_samples

    Retrieve samples enriched with source data, grouped by resolution cluster.

  • compare_models

    Compare metrics of models based on evaluation data.

EvalData

EvalData()

Object which caches evaluation data to measure performance of models.

Methods:

  • precision_recall

    Computes precision and recall at one threshold.

  • pr_curve

    Computes precision and recall for each threshold in results.

precision_recall

precision_recall(
    results: Results, threshold: float
) -> PrecisionRecall

Computes precision and recall at one threshold.

pr_curve

pr_curve(results: Results) -> Figure

Computes precision and recall for each threshold in results.

get_samples

get_samples(
    n: int,
    user_id: int,
    resolution: ModelResolutionName | None = None,
    clients: dict[str, Any] | None = None,
    use_default_client: bool = False,
) -> dict[int, DataFrame]

Retrieve samples enriched with source data, grouped by resolution cluster.

Parameters:

  • n
    (int) –

    Number of clusters to sample

  • user_id
    (int) –

    ID of the user requesting the samples

  • resolution
    (ModelResolutionName | None, default: None ) –

    Model resolution proposing the clusters. If not set, will use a default resolution.

  • clients
    (dict[str, Any] | None, default: None ) –

    Dictionary mapping location names to a valid client for each. Locations whose names are missing from the dictionary will be skipped.

  • use_default_client
    (bool, default: False ) –

    Whether to use, for all locations without a set client, a SQLAlchemy engine for the default warehouse defined in the MB__CLIENT__DEFAULT_WAREHOUSE environment variable.

Returns:

  • dict[int, DataFrame]

    Dictionary of cluster ID to dataframe describing the cluster

compare_models

Compare metrics of models based on evaluation data.

Returns:

  • ModelComparison

    A model comparison object, listing metrics for each model.