Transactions

Transaction streams
Transactions are atomic
- Validation rules are applied after transaction
- Errored transactions must be removed in full
Validation rules
- A record or subrecord may only appear once per transaction
  - Test data

Updates to the Tariff are communicated as an ordered sequence of data changes. This allows Tariff modifications to be shared efficiently. If the Tariff were only ever published in full, data consumers would need to download and process many millions of unchanged records every time a change was made. Communicating only the changes to the data means changes can be frequent and simple.

A transaction is a set of changes to tariff records that must be considered together. Each transaction contains one or more record changes, in no particular order, each of which adds a new version of a record.

Data is only guaranteed to be coherent and to comply with business rules after a whole transaction has been processed. This is because a compliant change to one record may first require changes to other records, and those changes would not be valid by themselves.

For example, modifying the start date of a described record involves modifying the start date on the record itself and also modifying the start date on its earliest description period record. A validation rule failure would occur if either change were made in isolation, so the changes need to be grouped together in a transaction.

Note that a transaction should not represent a logical unit of work of arbitrary size covering modifications to a number of independent records. A transaction should contain only the minimum unit of work that is permissible under the validation rules. A single transaction should only make changes to a single record or its subrecords.

For example, adding an end date to a commodity code may also require adding end dates to any measures that reference the code. These changes should be split across multiple transactions: first there would be one transaction per measure to add an end date to each measure and then a final transaction to add the end date to the commodity code.

Transactions have an identifier that is unique within each transaction stream. All records with the same transaction identifier are considered part of the same transaction.
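
As an illustration of grouping by transaction identifier, the following minimal sketch collects a flat list of record changes into transactions. The tuple layout `(transaction_id, record_id, update_type)` and the function name `group_by_transaction` are assumptions made for this example and are not part of any implementing format.

```python
from collections import defaultdict

# Hypothetical flat list of record changes from one transaction stream:
# (transaction_id, record_id, update_type). The layout is an assumption.
changes = [
    (1, "A", "CREATE"),
    (2, "B", "UPDATE"),
    (1, "A.B", "CREATE"),
]


def group_by_transaction(changes):
    """Collect record changes that share a transaction identifier."""
    transactions = defaultdict(list)
    for txn_id, record_id, update_type in changes:
        transactions[txn_id].append((record_id, update_type))
    return dict(transactions)


grouped = group_by_transaction(changes)
```

Here the changes to `A` and `A.B` end up in transaction 1 even though another transaction's change was transmitted between them.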

Each implementing format uses a different physical data model for transactions. For example, in the SQLite format transactions are implemented as a separate database table whereas in the XML format transactions are somewhat ephemeral and are referred to by identifier only.

Transaction streams

A transaction stream is simply an ordered sequence of transactions from a single source. Each transaction is part of a single transaction stream.

Transaction streams are append-only. Once a transaction has been published as part of a transaction stream, it cannot be removed.

Each transaction stream has its own sequence of transaction identifiers. Identifiers are not required to be contiguous within a stream (i.e. transaction 1 does not need to be followed by transaction 2), but each transaction must have a larger identifier than every transaction before it in the stream (i.e. transaction 1 only needs to be followed by a transaction with an ID greater than 1).
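
The ordering requirement can be checked with a single pass over the identifiers of one stream. This is a minimal sketch that assumes identifiers are integers; the function name `check_identifier_order` is hypothetical.

```python
def check_identifier_order(txn_ids):
    """Return the first (previous, current) pair that breaks the ordering
    requirement, or None if every identifier is larger than the one before.

    Gaps are allowed; repeats and decreases are not.
    """
    previous = None
    for current in txn_ids:
        if previous is not None and current <= previous:
            return (previous, current)
        previous = current
    return None
```

Under these assumptions, `check_identifier_order([1, 2, 5, 9])` returns `None` because gaps are acceptable, while `check_identifier_order([1, 3, 3])` returns `(3, 3)`.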

This means that, without additional context, it’s ambiguous to refer to “transaction 1” without also specifying the transaction stream, e.g. as “transaction 1 from the UK transaction stream”. Each transaction stream may have its own “transaction 1”.

Transaction streams exist to allow multiple sources to create transactions without each source needing to share a global sequence number. This allows sources to be independent of each other.

For example, tariff updates from the EU form one transaction stream and tariff updates from the UK form another. An implementing system can ingest data from both sources as long as it keeps track of which transaction stream the transactions are coming from, and applies validation logic accordingly.
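
One way an implementing system might keep track of the originating stream is to key every transaction by the pair of stream and identifier rather than by identifier alone. The sketch below assumes string stream names such as "EU" and "UK"; the `TransactionRef` class is hypothetical and not part of any implementing format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransactionRef:
    """Unambiguous reference to a transaction: stream plus identifier."""
    stream: str
    txn_id: int


# Each stream may have its own "transaction 1"; the two entries below
# do not collide because the stream is part of the key.
store = {
    TransactionRef("EU", 1): [("A", "CREATE")],
    TransactionRef("UK", 1): [("B", "CREATE")],
}

eu_first = store[TransactionRef("EU", 1)]
uk_first = store[TransactionRef("UK", 1)]
```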

Each system that authors tariff data should begin a new transaction stream with its own set of unique transaction identifiers. Systems that only filter transactions or modify transactions without authoring new ones do not need to make a new stream. Systems that merge transactions from different streams and publish the resulting transactions should use their own transaction stream.

Transactions are atomic

Each transaction represents an indivisible set of changes. There is no intermediate step between “before” a transaction and “after” it – either all the changes to records in a transaction are considered or none are considered at all.

Validation rules are applied after transaction

Validation rules should be applied to every transaction.

Records within a transaction should not be considered to be in any specific order. Even if a record is physically transmitted first, no validation rules should be applied to it until after the rest of the transaction is processed.

Validation rules must operate over the full state of the data after all of the changes in the transaction have been applied.

Implementations may try to apply validation rules as each record is received if subsequent records received in the same transaction would not affect the result. In this case, implementations must verify their assumptions about what is allowed to be contained in the rest of the transaction, e.g. by applying the transaction validation rules described below.

Errored transactions must be removed in full

If any validation rule fails for records contained within the transaction, all of the records contained in the transaction must be completely ignored. A transaction should not be accepted if any business rule fails.
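
The all-or-nothing behavior can be sketched by applying every change to a copy of the data, validating the resulting state, and keeping the copy only if no rule fails. Everything in this sketch is an assumption made for illustration: the in-memory `dict` state, the `(record_id, update_type, data)` change layout, and the example rule `description_starts_match`, which loosely mirrors the start date example given earlier.

```python
import copy


def apply_transaction(state, changes, validate):
    """Apply all changes to a copy of state, then validate the result.

    Returns (new_state, failures). If any rule fails, the original state
    is returned untouched and every change in the transaction is ignored.
    """
    candidate = copy.deepcopy(state)
    for record_id, update_type, data in changes:
        if update_type == "DELETE":
            candidate.pop(record_id, None)
        else:  # CREATE or UPDATE adds a new version of the record
            candidate[record_id] = data
    failures = validate(candidate)  # rules only see the full resulting state
    if failures:
        return state, failures
    return candidate, []


def description_starts_match(state):
    """Hypothetical rule: a record and its description period start together."""
    failures = []
    for record_id, data in state.items():
        if "." in record_id:
            continue  # subrecords are checked via their parent record
        description = state.get(record_id + ".description")
        if description is None or description["start"] != data["start"]:
            failures.append(f"{record_id}: description period start mismatch")
    return failures


state = {
    "A": {"start": "2021-01-01"},
    "A.description": {"start": "2021-01-01"},
}

# Both changes together are valid, whatever order they arrive in.
ok_state, ok_failures = apply_transaction(
    state,
    [
        ("A.description", "UPDATE", {"start": "2021-02-01"}),
        ("A", "UPDATE", {"start": "2021-02-01"}),
    ],
    description_starts_match,
)

# Changing only the record fails the rule, so the state is left as it was.
bad_state, bad_failures = apply_transaction(
    state,
    [("A", "UPDATE", {"start": "2021-03-01"})],
    description_starts_match,
)
```

Copying the whole state is only practical for small examples; a database-backed implementation would more likely rely on its own rollback mechanism.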

Once a transaction has failed, further transactions should not be processed. This is because further transactions may rely on records in the errored transaction and will produce spurious validation rule failures.

When a transaction has failed, implementations should immediately stop and show the validation rule failures via an appropriate method.
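
The stop-on-failure behavior might look like the following sketch. The `process_stream` function and the `apply_and_validate` callable, assumed to return a list of failure messages for one transaction, are hypothetical names chosen for this example.

```python
def process_stream(transactions, apply_and_validate):
    """Process (txn_id, changes) pairs in order, stopping at the first failure.

    Returns (processed_ids, failed_id, failures). Transactions after the
    failed one are never examined, so they cannot produce spurious failures.
    """
    processed = []
    for txn_id, changes in transactions:
        failures = apply_and_validate(changes)
        if failures:
            return processed, txn_id, failures
        processed.append(txn_id)
    return processed, None, []


examined = []


def hypothetical_check(changes):
    """Stand-in validator that fails any transaction containing 'bad'."""
    examined.append(changes)
    return ["rule failed"] if "bad" in changes else []


result = process_stream(
    [(1, ["ok"]), (2, ["bad"]), (3, ["ok"])],
    hypothetical_check,
)
```

In this run transaction 1 is accepted, transaction 2 is reported as failed, and transaction 3 is never passed to the validator.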

Validation rules

A record or subrecord may only appear once per transaction

This rule means that a transaction cannot change the same record twice. For example:

- If a record is created in a transaction, the same transaction cannot update or delete it.
- Similarly, if a record is updated in a transaction, the same transaction cannot update it again or delete it.

Test data

In this test data each table represents a record in a transaction. The transaction is identified by Txn #.

The Record ID is the identifying field for the record type. If A is a record identifier, then A.B denotes some subrecord of the record A.

Case 1: Modifying a record multiple times in the same transaction

The following transactions should fail this rule:

Txn #	Record ID	Update type
1	A	CREATE
1	A	UPDATE
2	B	CREATE
2	B	DELETE
3	C.D	UPDATE
3	C.D	DELETE

Case 2: Modifying a record in different transactions

The following transactions should pass this rule:

Txn #	Record ID	Update type
1	A	CREATE
2	A	UPDATE
3	B	CREATE
4	B	DELETE
5	C	UPDATE
6	C	DELETE

Case 3: Modifying a record and subrecord in the same transaction

The following transactions should pass this rule:

Txn #	Record ID	Update type
1	A	CREATE
1	A.B	CREATE
2	C	UPDATE
2	C.D	CREATE
3	E	UPDATE
3	E.F	DELETE
4	G	DELETE
4	G.H	DELETE
5	I	UPDATE
5	I.J	UPDATE
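
The three cases above can be exercised with a small check that counts how often each record ID appears within a transaction. This is a minimal sketch: the row layout `(txn, record_id, update_type)` mirrors the table columns, and the function name `duplicated_records` is an assumption made for this example.

```python
from collections import Counter


def duplicated_records(rows):
    """Return sorted (txn, record_id) pairs that appear more than once.

    An empty result means the rows pass the rule. A record and its
    subrecord (e.g. A and A.B) have different IDs, so they never clash.
    """
    counts = Counter((txn, record_id) for txn, record_id, _ in rows)
    return sorted(key for key, n in counts.items() if n > 1)


case_1 = [
    (1, "A", "CREATE"), (1, "A", "UPDATE"),
    (2, "B", "CREATE"), (2, "B", "DELETE"),
    (3, "C.D", "UPDATE"), (3, "C.D", "DELETE"),
]
case_2 = [
    (1, "A", "CREATE"), (2, "A", "UPDATE"),
    (3, "B", "CREATE"), (4, "B", "DELETE"),
    (5, "C", "UPDATE"), (6, "C", "DELETE"),
]
case_3 = [
    (1, "A", "CREATE"), (1, "A.B", "CREATE"),
    (2, "C", "UPDATE"), (2, "C.D", "CREATE"),
    (3, "E", "UPDATE"), (3, "E.F", "DELETE"),
    (4, "G", "DELETE"), (4, "G.H", "DELETE"),
    (5, "I", "UPDATE"), (5, "I.J", "UPDATE"),
]
```

With this data, `duplicated_records(case_1)` reports all three transactions as failing, while `case_2` and `case_3` produce an empty list.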