Parsers and Namespaces

class importer.parsers.BooleanElement(*args, true_value: str = '1', false_value: str = '0', **kwargs)[source]

Represents an element which contains a true or false value.

By default, the value in the XML is assumed to be 1 for True and 0 for False. This can be customised by passing different values for true_value and false_value.

<msg:some.value>1</msg:some.value>
<msg:some.value>0</msg:some.value>
clean()[source]

Clean up data.

native_type

alias of bool
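The mapping described above can be sketched as a standalone function. This is a minimal illustration, not the importer's implementation: parse_boolean is a hypothetical helper, and raising ValueError for any other text is an assumption, since the source does not say how unexpected values are handled.

```python
def parse_boolean(text: str, true_value: str = "1", false_value: str = "0") -> bool:
    """Map the element text to a bool using the configured markers."""
    if text == true_value:
        return True
    if text == false_value:
        return False
    # Assumption: anything else is treated as invalid input.
    raise ValueError(f"expected {true_value!r} or {false_value!r}, got {text!r}")


default_true = parse_boolean("1")
custom_false = parse_boolean("N", true_value="Y", false_value="N")
```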

class importer.parsers.CompoundElement(tag: importer.namespaces.Tag, *extra_fields: str, separator: str = '|')[source]

Represents an element in XML that is actually a concatenation of one or more logical values and separators.

The separator is assumed to be a pipe character by default. The parsed data will always contain a tuple whose length equals the number of expected fields (the original field plus any extras). If fewer separators than expected occur, the rightmost fields will have the value None.

<msg:some.value>one|two|three</msg:some.value>
clean()[source]

Clean up data.

native_type

alias of tuple
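The splitting and padding behaviour can be sketched as follows. split_compound is a hypothetical helper written for illustration only; how the real parser treats text with more separators than expected is not stated in the source, so this sketch simply keeps any surplus in the last field.

```python
from typing import Optional, Tuple


def split_compound(text: str, field_count: int, separator: str = "|") -> Tuple[Optional[str], ...]:
    """Split text into exactly field_count values, padding with None."""
    # Assumption: surplus separators stay inside the last field.
    parts = text.split(separator, field_count - 1)
    padding = [None] * (field_count - len(parts))
    return tuple(parts + padding)


full = split_compound("one|two|three", 3)
short = split_compound("one", 3)
```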

class importer.parsers.ConstantElement(tag: importer.namespaces.Tag, value: str)[source]

Represents an element that is always a constant value in the XML.

The actual value is ignored and not put into the database. The value specified in the constructor will be put back into the XML.

clean()[source]

Clean up data.

class importer.parsers.ElementParser(tag: importer.namespaces.Tag = None, many: bool = False, depth: int = 1)[source]

Base class for element specific parsers.

ElementParser classes use introspection to build a lookup table that maps child element parsers to their output JSON field names.

This allows two options for adding child elements to a parent element.

Option 1:

class ChildElement(ElementParser):
    tag = Tag("child", prefix="ns")
    field = TextElement("field")

class ParentElement(ElementParser):
    tag = Tag("parent", prefix="ns")
    child = ChildElement()

Option 2:

class ParentElement(ElementParser):
    tag = Tag("parent", prefix="ns")


@ParentElement.register_child("child")
class ChildElement(ElementParser):
    tag = Tag("child", prefix="ns")
    some_field = TextElement("field")

When handling XML such as:

<ns:parent>
    <ns:child id="2">
        <ns:field>Text</ns:field>
    </ns:child>
</ns:parent>

This class will build a JSON object in self.data with the following structure:

{"child": {"id": 2, "field": "Text"}}
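The introspection idea can be sketched with small stand-in classes. Nothing here is the importer's own code: SketchParser, TextField, and the namespace URI urn:example are hypothetical, and the sketch keeps attribute values as strings rather than converting them.

```python
import xml.etree.ElementTree as ET


class TextField:
    """Stand-in for a text-valued child parser."""

    def __init__(self, name):
        self.name = name

    def parse(self, element):
        return element.text


class SketchParser:
    """Stand-in base class; not the real ElementParser."""

    name = None

    @classmethod
    def children(cls):
        # Introspection: each class attribute that is itself a parser becomes
        # an entry mapping XML tag name -> (output field name, parser).
        table = {}
        for field_name, value in vars(cls).items():
            if isinstance(value, (TextField, SketchParser)):
                table[value.name] = (field_name, value)
        return table

    def parse(self, element):
        data = dict(element.attrib)  # attributes are kept as strings here
        table = self.children()
        for child in element:
            local_name = child.tag.split("}")[-1]  # drop the "{uri}" part
            if local_name in table:
                field_name, parser = table[local_name]
                data[field_name] = parser.parse(child)
        return data


class Child(SketchParser):
    name = "child"
    field = TextField("field")


class Parent(SketchParser):
    name = "parent"
    child = Child()


xml_text = (
    '<ns:parent xmlns:ns="urn:example">'
    '<ns:child id="2"><ns:field>Text</ns:field></ns:child>'
    "</ns:parent>"
)
result = Parent().parse(ET.fromstring(xml_text))
```

In this sketch the output key comes from the attribute name on the parent class, which is why renaming the attribute changes the JSON field name without changing the XML tag that is matched.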
clean()[source]

Clean up data.

is_parser_for_element(parser: importer.parsers.ElementParser, element: xml.etree.ElementTree.Element) bool[source]

Check if the parser matches the element.

record_code: str

The type id of this model’s type family in the TARIC specification.

This number groups together a number of different models into ‘records’. Where two models share a record code, they are conceptually expressing different properties of the same logical model.

In theory, each Transaction should contain only models with a single record_code (but differing subrecord_code values).

start(element: xml.etree.ElementTree.Element, parent: importer.parsers.ElementParser = None)[source]

Handle the start of an XML tag. The tag may not yet have all of its children.

We have a few cases where there are tags nested within a tag of the same name.

Example:

<oub:additional.code>
    <oub:additional.code.sid>00000001</oub:additional.code.sid>
    <oub:additional.code.type.id>A</oub:additional.code.type.id>
    <oub:additional.code>AAA</oub:additional.code>
    <oub:validity.start.date>2021-01-01</oub:validity.start.date>
</oub:additional.code>

In this case, matching on tags is not enough, so we also need to keep track of whether this parser is already parsing an element. If it is, we don’t want to select any child parsers. If it is not, we know that this is an element that this parser should be parsing.
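The state tracking described above can be sketched with xml.etree.ElementTree.iterparse. This is an illustration under simplifying assumptions, not the importer's start() method: classify is a hypothetical function, and the oub: prefix is dropped from the sample so that no namespace declaration is needed.

```python
import io
import xml.etree.ElementTree as ET

XML = """<additional.code>
    <additional.code.sid>00000001</additional.code.sid>
    <additional.code.type.id>A</additional.code.type.id>
    <additional.code>AAA</additional.code>
    <validity.start.date>2021-01-01</validity.start.date>
</additional.code>"""


def classify(xml_text, tag):
    """Label each occurrence of tag as the outer record or a nested field."""
    roles = []
    owner = None  # the element this "parser" is currently handling, if any
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, element in ET.iterparse(source, events=("start", "end")):
        if element.tag != tag:
            continue
        if event == "start":
            if owner is None:
                owner = element  # not yet parsing: this element is ours
                roles.append("record")
            else:
                roles.append("field")  # already parsing: same name, nested
        elif element is owner:
            owner = None  # the outer element has ended
    return roles


roles = classify(XML, "additional.code")
```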

subrecord_code: str

The type id of this model in the TARIC specification. The subrecord_code when combined with the record_code uniquely identifies the type within the specification.

The subrecord code gives the intended order for models in a transaction, with comparatively smaller subrecord codes needing to come before larger ones.
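As a small illustration of the ordering rule, the sketch below sorts models within one transaction by subrecord code. The model names and code values are hypothetical, and the sketch assumes codes are zero-padded strings of equal length so that string comparison matches numeric order.

```python
# (model name, record_code, subrecord_code): all values are made up.
models = [
    ("Description", "400", "10"),
    ("Parent", "400", "00"),
    ("Period", "400", "05"),
]

# Smaller subrecord codes come first within the shared record code.
ordered = sorted(models, key=lambda model: model[2])
ordered_names = [name for name, _, _ in ordered]
```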

validate()[source]

Validate data.

class importer.parsers.IntElement(*args, format: str = 'FM99999999999999999999')[source]

Represents an element which contains an integer value.

<msg:record.code>430</msg:record.code>
native_type

alias of int

exception importer.parsers.InvalidDataError[source]

exception importer.parsers.ParserError[source]

class importer.parsers.RangeLowerElement(tag: importer.namespaces.Tag = None, many: bool = False, depth: int = 1)[source]

Represents an element that is the lower part of a range.

class importer.parsers.RangeUpperElement(tag: importer.namespaces.Tag = None, many: bool = False, depth: int = 1)[source]

Represents an element that is the upper part of a range.

class importer.parsers.TextElement(tag: importer.namespaces.Tag = None, many: bool = False, depth: int = 1)[source]

Represents an element which contains a text value.

<msg:record.code>Example Text</msg:record.code>
native_type

alias of str

class importer.parsers.ValidityMixin[source]

Parse validity start and end dates.

class importer.parsers.ValidityStartMixin[source]

Parse validity start date.

class importer.parsers.ValueElementMixin[source]

Provides a convenient way to define a parser for elements that contain only a text value and have no attributes or children.

native_type: type

The Python type that most closely matches the type of the XML element.

class importer.parsers.Writable[source]

A parser which implements the Writable interface can write its changes to the database.

Not all TARIC3 elements correspond to database entities (particularly simple text elements, but also envelopes and app.messages).

create(data: Mapping[str, Any], transaction_id: int)[source]

Prepares the given data as a create record and submits it to the nursery for processing.

delete(data: Mapping[str, Any], transaction_id: int)[source]

Delete a DB record with provided data.

update(data: Mapping[str, Any], transaction_id: int)[source]

Update a DB record with provided data.

Provides dataclasses and config classes for XML elements and the TARIC schema.

class importer.namespaces.SchemaTagsBase[source]

Provides a base dataclass for schema element tag definitions.

class importer.namespaces.Tag(name: str, prefix: str = 'ns2', nsmap: Dict[str, str] = <factory>)[source]

A dataclass for XML element tags.

name corresponds to the name attribute of the element declaration in the XML Schema.

prefix reflects the namespace prefixes defined in the taric3 and envelope XSD files.

nsmap is a prefix-to-namespace mapping in the format required by xml.etree.ElementTree.

first(parent: xml.etree.ElementTree.Element) xml.etree.ElementTree.Element[source]

Returns the first descendant of the parent matching this tag’s name.

property is_pattern: bool

Returns true if the tag name is a regex pattern.

iter(parent: xml.etree.ElementTree.Element) Iterator[xml.etree.ElementTree.Element][source]

Returns an iterator of descendants of the parent matching this tag’s name.

property namespace: str

Returns the namespace for the tag.

property pattern

Returns a compiled regex pattern.

property prefixed_name: str

Returns the prefixed element tag.

property qualified_name: str

Returns a fully qualified element tag.
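A rough sketch of how such a tag dataclass could work with xml.etree.ElementTree is shown here. SketchTag is a hypothetical stand-in, not the real Tag: the namespace URI is invented, and the exact string formats returned by prefixed_name and qualified_name are assumptions (prefix:name and ElementTree's {uri}name notation respectively).

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional


@dataclass
class SketchTag:
    name: str
    prefix: str = "ns2"
    # Hypothetical default mapping; the real nsmap comes from the schemas.
    nsmap: Dict[str, str] = field(
        default_factory=lambda: {"ns2": "urn:example:taric"}
    )

    @property
    def namespace(self) -> str:
        return self.nsmap[self.prefix]

    @property
    def prefixed_name(self) -> str:
        return f"{self.prefix}:{self.name}"

    @property
    def qualified_name(self) -> str:
        return f"{{{self.namespace}}}{self.name}"

    def first(self, parent: ET.Element) -> Optional[ET.Element]:
        # ".//" searches all descendants; nsmap resolves the prefix.
        return parent.find(f".//{self.prefixed_name}", self.nsmap)

    def iter(self, parent: ET.Element) -> Iterator[ET.Element]:
        return parent.iterfind(f".//{self.prefixed_name}", self.nsmap)


root = ET.fromstring(
    '<ns2:root xmlns:ns2="urn:example:taric">'
    "<ns2:item>a</ns2:item><ns2:item>b</ns2:item>"
    "</ns2:root>"
)
tag = SketchTag("item")
first_text = tag.first(root).text
all_texts = [element.text for element in tag.iter(root)]
```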

importer.namespaces.make_schema_dataclass(xsd_schema_paths: Dict[str, str]) importer.namespaces.TTags[source]

Returns a dynamic dataclass with TARIC schema element tag definitions.

importer.namespaces.xsd_schema_paths: Dict[str, str] = (('env', PosixPath('/home/runner/work/tamato/tamato/common/assets/envelope.xsd')), ('oub', PosixPath('/home/runner/work/tamato/tamato/common/assets/taric3.xsd')))

Define additional groups in the dictionary below for use as a record_group argument to importer.chunker.chunk_taric.

Check importer.forms.UploadTaricForm.save for example usage when users check the ‘Commodities Only’ box in /importers/create.

The only group defined at the moment is commodities, which is easily extensible to additional record groups.