The XML matcher parses XML documents, creating a structure that adapts to the content of the XML document.
The following mapping rules apply when using the XML matcher:
Attributes are mapped to fields with @ as a prefix in the field name. For example, attribute <... attr="..."> becomes field @attr.
Text content is mapped to the #text field.
The #text field becomes an array when mixing child elements (or comments) with multiple text content parts.
Namespace prefixes are kept: <prefix:element> becomes field prefix:element.
Alternatively, you can use the XML_PLAIN or XML_VERBOSE matchers with a fixed output structure.
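The mapping rules above can be sketched in a few lines of Python. This is only an illustration of the rules, not the matcher's implementation: the function name to_record is hypothetical, and collecting child elements into lists is an assumption made to keep the sketch short.

```python
from xml.dom import minidom, Node


def to_record(elem):
    """Map a DOM element to a dict using the rules described above (illustrative only)."""
    record = {}
    # Attributes become fields prefixed with "@".
    for name, value in elem.attributes.items():
        record["@" + name] = value
    texts = []
    for child in elem.childNodes:
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
            # Keep text parts that are not pure whitespace.
            if child.data.strip():
                texts.append(child.data)
        elif child.nodeType == Node.ELEMENT_NODE:
            # tagName keeps the namespace prefix, e.g. "xhtml:b".
            # Assumption: child elements are always collected in a list.
            record.setdefault(child.tagName, []).append(to_record(child))
        # Comment nodes are skipped, but they still split the surrounding text.
    if len(texts) == 1:
        record["#text"] = texts[0]
    elif texts:
        # Several text parts (mixed content) turn #text into an array.
        record["#text"] = texts
    return record


xml = ('<content xmlns:xhtml="http://www.w3.org/1999/xhtml" type="xhtml">'
       'More <xhtml:b>text</xhtml:b> here.</content>')
print(to_record(minidom.parseString(xml).documentElement))
```

The sample element is taken from the example document below. Its text is split by a child element, so the sketch stores two text parts under #text.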
An XML document:
<?xml version="1.0" encoding="UTF-8"?>
<messages xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <thread id="1">
    <topic>XML Parsing</topic>
    <!-- comment -->
    <message id="101">
      <sender>Alice</sender>
      <content type="plain"><b>text</b></content>
    </message>
    <message id="102">
      <sender>Bob</sender>
      <content type="plain"><![CDATA[<b>text</b>]]></content>
    </message>
    <message id="103">
      <sender>John</sender>
      <content type="xhtml">More <xhtml:b>text</xhtml:b> here.</content>
    </message>
    <message id="104">
      <sender>Mary</sender>
      <content type="xhtml">Some <!-- hidden text --> included.</content>
    </message>
  </thread>
</messages>
Can be parsed using the pattern:
XML:xml
The result is:
With XML_PLAIN, you get a streamlined version of the XML data. The matcher discards the attributes of XML elements. This is helpful when attribute information is unnecessary, because dropping it reduces the complexity of the output structure and makes the parsed data easier to work with.
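As a rough illustration of what discarding attributes means for the resulting structure, the following Python sketch maps elements to nested values and ignores attributes entirely. The function name to_plain and the exact output shape (text-only elements collapse to strings, children are grouped in lists) are assumptions for this sketch and may differ from what XML_PLAIN actually produces.

```python
import xml.etree.ElementTree as ET


def to_plain(elem):
    """Map an element to nested values, ignoring all attributes (illustrative only)."""
    children = list(elem)
    if not children:
        # Assumption: an element without children collapses to its text.
        return elem.text or ""
    out = {}
    for child in children:
        # Assumption: children are grouped by tag name into lists.
        out.setdefault(child.tag, []).append(to_plain(child))
    return out


root = ET.fromstring(
    '<message id="101"><sender>Alice</sender>'
    '<content type="plain">hi</content></message>'
)
print(to_plain(root))
```

Neither the id nor the type attribute appears in the result, which is the trade-off described above: less information, but a simpler structure. Text mixed with child elements is not handled in this sketch.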
XML_PLAIN uses the following mapping rules:
<prefix:element> becomes field prefix:element.
An XML document:
<?xml version="1.0" encoding="UTF-8"?>
<messages xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <thread id="1">
    <topic>XML Parsing</topic>
    <!-- comment -->
    <message id="101">
      <sender>Alice</sender>
      <content type="plain"><b>text</b></content>
    </message>
    <message id="102">
      <sender>Bob</sender>
      <content type="plain"><![CDATA[<b>text</b>]]></content>
    </message>
    <message id="103">
      <sender>John</sender>
      <content type="xhtml">More <xhtml:b>text</xhtml:b> here.</content>
    </message>
    <message id="104">
      <sender>Mary</sender>
      <content type="xhtml">Some <!-- hidden text --> included.</content>
    </message>
  </thread>
</messages>
Can be parsed using the pattern:
XML_PLAIN:xml
The result is:
You receive the most detailed and comprehensive data structure when parsing XML documents using the XML_VERBOSE matcher. In contrast to the XML matcher, the XML_VERBOSE matcher creates an output structure that is fixed and does not depend on the presence of element attributes or child elements.
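The difference between an adaptive and a fixed output structure can be illustrated with a small Python sketch. Both functions are hypothetical, and the field names used in fixed (name, attributes, text, children) are invented for this illustration; they are not the field names that XML_VERBOSE emits.

```python
import xml.etree.ElementTree as ET


def adaptive(elem):
    """Shape depends on content: a bare text element collapses to a string."""
    if not elem.attrib and len(elem) == 0:
        return elem.text or ""
    rec = {"@" + key: value for key, value in elem.attrib.items()}
    if elem.text and elem.text.strip():
        rec["#text"] = elem.text
    for child in elem:
        rec.setdefault(child.tag, []).append(adaptive(child))
    return rec


def fixed(elem):
    """Shape never changes: every element yields the same four keys."""
    return {
        "name": elem.tag,                      # hypothetical field name
        "attributes": dict(elem.attrib),       # empty dict when there are none
        "text": [elem.text] if elem.text and elem.text.strip() else [],
        "children": [fixed(child) for child in elem],
    }


plain = ET.fromstring("<sender>Alice</sender>")
with_attr = ET.fromstring('<content type="plain">hi</content>')
print(adaptive(plain), adaptive(with_attr))   # a string versus a record
print(fixed(plain).keys() == fixed(with_attr).keys())
```

With a fixed shape, downstream code can access the same fields for every element without first checking whether attributes or child elements are present, at the cost of a more verbose result.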
XML_VERBOSE uses the following mapping rules:
Attributes are mapped to fields with @ as a prefix in the field name. For example, attribute <... attr="..."> becomes field @attr.
Text content is mapped to the #text field.
The #text field becomes an array when mixing child elements (or comments) with multiple text content parts.
Namespace prefixes are kept: <prefix:element> becomes field prefix:element.
An XML document:
<?xml version="1.0" encoding="UTF-8"?>
<messages xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <thread id="1">
    <topic>XML Parsing</topic>
    <!-- comment -->
    <message id="101">
      <sender>Alice</sender>
      <content type="plain"><b>text</b></content>
    </message>
    <message id="102">
      <sender>Bob</sender>
      <content type="plain"><![CDATA[<b>text</b>]]></content>
    </message>
    <message id="103">
      <sender>John</sender>
      <content type="xhtml">More <xhtml:b>text</xhtml:b> here.</content>
    </message>
    <message id="104">
      <sender>Mary</sender>
      <content type="xhtml">Some <!-- hidden text --> included.</content>
    </message>
  </thread>
</messages>
Can be parsed using the pattern:
XML_VERBOSE:xml
The result is: