Database
matchbox.common.db
Common database utilities for Matchbox.
Classes:
- QueryReturnType – Enumeration of dataframe types to return from query.
Functions:
- sql_to_df – Executes the given SQLAlchemy statement or SQL string using Polars.
Attributes:
QueryReturnType
sql_to_df
sql_to_df(stmt: str, connection: Engine | Connection, return_type: QueryReturnType, *, return_batches: Literal[False] = False, batch_size: int | None = None, rename: dict[str, str] | Callable | None = None, schema_overrides: dict[str, DataType] | None = None, execute_options: dict[str, Any] | None = None) -> QueryReturnClass
sql_to_df(stmt: str, connection: Engine | Connection, return_type: QueryReturnType, *, return_batches: Literal[True], batch_size: int | None = None, rename: dict[str, str] | Callable | None = None, schema_overrides: dict[str, DataType] | None = None, execute_options: dict[str, Any] | None = None) -> Iterator[QueryReturnClass]
sql_to_df(stmt: str, connection: Engine | Connection, return_type: QueryReturnType = PANDAS, *, return_batches: bool = False, batch_size: int | None = None, rename: dict[str, str] | Callable | None = None, schema_overrides: dict[str, DataType] | None = None, execute_options: dict[str, Any] | None = None) -> QueryReturnClass | Iterator[QueryReturnClass]
Executes the given SQLAlchemy statement or SQL string using Polars.
Parameters:
- stmt (str) – A SQL string to be executed.
- connection (Engine | Connection) – A SQLAlchemy Engine object or ADBC connection.
- return_type (QueryReturnType, default: PANDAS) – The type of the return value. One of “arrow”, “pandas”, or “polars”.
- return_batches (bool, default: False) – If True, return an iterator that yields each batch separately. If False, return a single DataFrame with all results.
- batch_size (int | None, default: None) – The size of each batch when results are processed in batches.
- rename (dict[str, str] | Callable | None, default: None) – A dictionary mapping old column names to new column names, or a callable that takes a DataFrame and returns a DataFrame with renamed columns.
- schema_overrides (dict[str, DataType] | None, default: None) – A dictionary mapping column names to dtypes.
- execute_options (dict[str, Any] | None, default: None) – Options passed through to the underlying query execution method as keyword arguments.
Returns:
- QueryReturnClass – If return_batches is False: a dataframe of the query results in the specified format.
- Iterator[QueryReturnClass] – If return_batches is True: an iterator of dataframes in the specified format.
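To make the difference between the two return shapes concrete, the sketch below contrasts fetching a whole result at once with consuming it in fixed-size batches. It uses only the standard-library `sqlite3` module so that it runs anywhere. `iter_batches` and the `items` table are hypothetical names for this example; this is not how `sql_to_df` is implemented, which according to the description above executes the statement using Polars.

```python
import sqlite3
from collections.abc import Iterator


def iter_batches(
    conn: sqlite3.Connection, stmt: str, batch_size: int
) -> Iterator[list[tuple]]:
    """Yield the result of stmt in lists of at most batch_size rows."""
    cursor = conn.execute(stmt)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Hypothetical in-memory table with five rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [("a",), ("b",), ("c",), ("d",), ("e",)],
)

query = "SELECT id, name FROM items ORDER BY id"

# Single result: every row is held in memory at once.
all_rows = conn.execute(query).fetchall()

# Batched result: only one batch needs to be in memory at a time.
batch_sizes = [len(batch) for batch in iter_batches(conn, query, 2)]
```

The batched form is the one to reach for when a result set is too large to hold in memory, since each batch can be processed and discarded before the next is fetched.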
Raises:
- ValueError –
  - If the connection is not properly configured or if an unsupported return type is specified.
  - If batch_size and return_batches are either both set or both unset.