Arrow
matchbox.common.arrow
Common Arrow utilities.
Functions:
- table_to_buffer – Converts an Arrow table to a BytesIO buffer.
Attributes:
- SCHEMA_MB_IDS (Final[Schema]) – Data transfer schema for Matchbox IDs keyed to primary keys.
- SCHEMA_INDEX (Final[Schema]) – Data transfer schema for data to be indexed in Matchbox.
- SCHEMA_RESULTS (Final[Schema]) – Data transfer schema for the results of a deduplication or linking process.
SCHEMA_MB_IDS
module-attribute
SCHEMA_MB_IDS: Final[Schema] = schema(
[("id", int64()), ("source_pk", large_string())]
)
Data transfer schema for Matchbox IDs keyed to primary keys.
SCHEMA_INDEX
module-attribute
SCHEMA_INDEX: Final[Schema] = schema(
    [
        ("hash", large_binary()),
        ("source_pk", large_list(large_string())),
    ]
)
Data transfer schema for data to be indexed in Matchbox.