14. Simplification of current transaction queries¶
Date: 2021-09-27
Status¶
Approved
Context¶
The TaMaTo application keeps a version history of the UK tariff, with each version of a record having a version ID and an associated transaction ID. TaMaTo enables users to create draft edits to the tariff, and these are stored in the database as unpublished versions.
When querying the database, either to display the current UK tariff details, or to check business rules are not violated, those queries must specify the “current” transaction to fetch only the versions of records that are pertinent. This is further complicated by the need to fetch unpublished versions when querying data that has been modified in the user’s current session.
This complication affects several aspects of the application:
Data entry. When users enter data into forms, the options they are presented with are retrieved from the database. These options must be only those that are valid at the time of the request, and must not include any historical versions.
Business rules. The current workbasket contains versions of database records which are not yet considered fact, but must be included in business rule queries to check that when they are published in the tariff they will not violate those rules.
Alternatives considered¶
Thread-local storage¶
We store the current transaction in thread-local storage as a sort of global variable that provides the context to all queries in the current thread - this avoids interference with other concurrent requests to the application. Django-CRUM provides an API to managing thread-local storage.
Some Django form fields (eg ModelChoiceField
) take a queryset as an initialisation
argument and execute the query at render-time. In this case, we want to construct a
queryset that retrieves the current transaction at execution time and not at
initialisation. To ensure we fetch the current transaction ID value at query execution
time, we can use a dynamic Value
object, for example:
from crum import get_current_request
from django.db.models import IntegerField, Value
from workbaskets.models import WorkBasket
class CurrentTransactionId(Value):
def __init__(self, **kwargs):
kwargs["output_field"] = IntegerField()
# skip Value constructor which assigns to self.value
super(Value, self).__init__(**kwargs)
@property
def value(self):
request = get_current_request()
if request:
transaction = WorkBasket.get_current_transaction(request)
if transaction:
return transaction.id
return None
This allows us to construct a queryset that fetches the current transaction at query execution time:
queryset = Footnotes.objects.filter(transaction_id=CurrentTransactionId())
Using these dynamic Value
objects, we can refactor latest_approved
and
approved_up_to_transaction
as follows:
def latest_approved(
self,
transaction: Optional[Transaction] = None,
) -> TrackedModelQuerySet:
"""
Get the approved versions of the model, or the latest draft versions if they
exist within a transaction preceding (and including) the given transaction in
the workbasket of the given transaction.
"""
if transaction:
current_workbasket_id = Value(transaction.workbasket.id, output_field=IntegerField())
transaction_order = transaction.order
else:
# dynamic values defined similarly to CurrentTransactionId above
current_workbasket_id = CurrentWorkBasketId()
transaction_order = CurrentTransactionOrder()
is_current_version = Q(is_current__isnull=False)
in_current_workbasket = Q(
transaction__workbasket_id=F("current_workbasket_id"),
transaction__order__lte=transaction_order,
)
versions_in_workbasket = Q(
version_group__versions__transaction__workbasket_id=F("current_workbasket_id"),
version_group__versions__transaction__order__lte=transaction_order,
)
latest_version_id = Max(
"version_group__versions",
filter=(Q(current_workbasket_id__isnull=False) & versions_in_workbasket),
)
unapproved_if_no_workbasket = Q(current_workbasket_id__isnull=True, is_current__isnull=True)
older_versions_in_workbasket = Q(current_workbasket_id__isnull=False) & ~Q(latest=F("id"))
deletions = Q(update_type=UpdateType.DELETE)
return (
self.annotate(current_workbasket_id=current_workbasket_id)
.filter(is_current_version | in_current_workbasket)
.annotate(latest=latest_version_id)
.exclude(unapproved_if_no_workbasket)
.exclude(older_versions_in_workbasket)
.exclude(deletions)
)
This allows initialising (for example) a ModelChoiceField
with a queryset that uses the current
transaction at form render time:
measure_type = ModelChoiceField(queryset=MeasureType.objects.latest_approved())
Temporal database extension¶
We extend the Postgres database with a temporal tables extension, which handles the versioning of data so that we do not have to do it in our code. Compiled extensions may not be permitted to install on our hosted database servers, so nearform/temporal_tables could be used instead as it is implemented in PL/SQL.
This approach, while moving a lot of complexity in queries out of the application and into the database extension, would still require keeping track of the current transaction in the application, possibly using the thread-local storage solution above.
Decision¶
For the purposes of making our Django ORM queries simpler, we will make use of Django-CRUM and thread-local storage to automate fetching the current transaction from the request (if it exists) and filtering querysets using it.
Consequences¶
Easier to read and invoke queries that must filter by the current transaction
An added dependency to maintain
Slightly “magical” code may make maintenance more difficult