API¶

Main Interface¶

xml4h.parse(to_parse, ignore_whitespace_text_nodes=True, adapter=None)[source]¶

Parse an XML document into an xml4h-wrapped DOM representation using an underlying XML library implementation.

Parameters:

to_parse (a file-like object or string) – an XML document file, document bytes, or the path to an XML file. If a bytes value is given that contains a < character it is treated as literal XML data, otherwise a bytes value is treated as a file path.
ignore_whitespace_text_nodes (bool) – if True pure whitespace nodes are stripped from the parsed document, since these are usually noise introduced by XML docs serialized to be human-friendly.
adapter (adapter class or None) – the xml4h implementation adapter class used to parse the document and to interact with the resulting nodes. If None, best_adapter will be used.

Returns:

an xml4h.nodes.Document node representing the parsed document.

Delegates to an adapter’s parse_string() or parse_file() implementation.
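The input rule described for to_parse can be sketched roughly as follows. This is a hypothetical helper written only to illustrate the documented behaviour, not xml4h's actual code: the name classify_input and its return labels are assumptions, and input types the description does not cover are rejected rather than guessed at.

```python
import io


def classify_input(to_parse):
    """Hypothetical helper mirroring the documented rule; not part of xml4h."""
    if hasattr(to_parse, "read"):
        return "file-like object"
    if isinstance(to_parse, bytes):
        # A "<" character marks the bytes as literal XML data.
        return "literal XML" if b"<" in to_parse else "file path"
    raise TypeError("input type not covered by this sketch")


kind_of_xml = classify_input(b"<doc/>")
kind_of_path = classify_input(b"data/films.xml")
kind_of_stream = classify_input(io.BytesIO(b"<doc/>"))
print(kind_of_xml, kind_of_path, kind_of_stream, sep=" | ")
```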

xml4h.build(tagname_or_element, ns_uri=None, adapter=None)[source]¶

Return a Builder that represents an element in a new or existing XML DOM and provides “chainable” methods focussed specifically on adding XML content.

Parameters:

tagname_or_element (string or Element node) – a string name for the root node of a new XML document, or an Element node in an existing document.
ns_uri (string or None) – a namespace URI to apply to the new root node. This argument has no effect when this method is acting on an existing element.
adapter (adapter class or None) – the xml4h implementation adapter class used to interact with the document DOM nodes. If None, best_adapter will be used.

Returns:

a Builder instance that represents an Element node in an XML DOM.

xml4h.best_adapter¶: alias of xml4h.impls.xml_etree_elementtree.cElementTreeAdapter

Builder¶

Builder is a utility class that makes it easy to create valid, well-formed XML documents using relatively sparse Python code. The builder class works by wrapping an xml4h.nodes.Element node to provide “chainable” methods focussed specifically on adding XML content.

Each method that adds content returns a Builder instance representing the current or the newly-added element. Behind the scenes, the builder uses the xml4h.nodes node traversal and manipulation methods to add content directly to the underlying DOM.

You will not generally create Builder instances directly, but will instead call the xml4h.build() function with the name for a new root element or with an existing xml4h.nodes.Element node.
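The chaining behaviour described above can be illustrated with the standard library alone. The MiniBuilder class and the build() function below are hypothetical stand-ins built on xml.dom.minidom; they are not xml4h's Builder, and they only mimic the documented return-value rules: adding a child returns a builder for the new element, while adding attributes or text returns the current builder.

```python
from xml.dom import minidom


class MiniBuilder:
    """Hypothetical stand-in for a chainable builder; not xml4h's Builder."""

    def __init__(self, element):
        self._element = element

    def element(self, name):
        # Adding a child returns a builder for the *new* element.
        child = self._element.ownerDocument.createElement(name)
        self._element.appendChild(child)
        return MiniBuilder(child)

    def attributes(self, **attrs):
        # Adding attributes returns the *current* builder.
        for name, value in attrs.items():
            self._element.setAttribute(name, str(value))
        return self

    def text(self, value):
        # Adding text also returns the current builder.
        text_node = self._element.ownerDocument.createTextNode(value)
        self._element.appendChild(text_node)
        return self

    def up(self):
        # Step back to the parent element to keep adding siblings.
        return MiniBuilder(self._element.parentNode)

    def xml(self):
        return self._element.toxml()


def build(root_name):
    """Create a new document and return a builder for its root element."""
    doc = minidom.Document()
    root = doc.createElement(root_name)
    doc.appendChild(root)
    return MiniBuilder(root)


films = (
    build("Films")
    .element("Film").attributes(year=1975)
        .element("Title").text("Holy Grail").up()
    .up()
)
print(films.xml())
```

Because each call returns a builder, the shape of the chained expression can follow the shape of the document being produced.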

class xml4h.builder.Builder(element)[source]¶

Builder class that wraps an xml4h.nodes.Element node with methods for adding XML content to an underlying DOM.

a(*args, **kwargs)¶: Alias of attributes()

attributes(*args, **kwargs)[source]¶

Add one or more attributes to the xml4h.nodes.Element node represented by this Builder.

Returns:	the current Builder.

Delegates to xml4h.nodes.Element.set_attributes().

attrs(*args, **kwargs)¶: Alias of attributes()

c(text)¶: Alias of comment()

cdata(text)[source]¶

Add a CDATA node to the xml4h.nodes.Element node represented by this Builder.

Returns:	the current Builder.

Delegates to xml4h.nodes.Element.add_cdata().

clone(node)[source]¶

Clone a node from another document to become a child of the xml4h.nodes.Element node represented by this Builder.

Returns:	a new Builder that represents the current element (not the cloned node).

Delegates to xml4h.nodes.Node.clone_node().

comment(text)[source]¶

Add a comment node to the xml4h.nodes.Element node represented by this Builder.

Returns:	the current Builder.

Delegates to xml4h.nodes.Element.add_comment().

d(text)¶: Alias of cdata()

data(text)¶: Alias of cdata()

document¶

Returns:	the `xml4h.nodes.Document` node that contains the element represented by this Builder.

dom_element¶

Returns:	the `xml4h.nodes.Element` node represented by this Builder.

e(*args, **kwargs)¶: Alias of element()

elem(*args, **kwargs)¶: Alias of element()

element(*args, **kwargs)[source]¶

Add a child element to the xml4h.nodes.Element node represented by this Builder.

Returns:	a new Builder that represents the child element.

Delegates to xml4h.nodes.Element.add_element().

find(**kwargs)[source]¶

Find descendants of the element represented by this builder that match the given constraints.

Returns:	a list of `xml4h.nodes.Element` nodes

Delegates to xml4h.nodes.Node.find()

find_doc(**kwargs)[source]¶

Find nodes in this element’s owning xml4h.nodes.Document that match the given constraints.

Returns:	a list of `xml4h.nodes.Element` nodes

Delegates to xml4h.nodes.Node.find_doc().

i(target, data)¶: Alias of processing_instruction()

instruction(target, data)¶: Alias of processing_instruction()

ns_prefix(prefix, ns_uri)[source]¶

Set the namespace prefix of the xml4h.nodes.Element node represented by this Builder.

Returns:	the current Builder.

Delegates to xml4h.nodes.Element.set_ns_prefix().

processing_instruction(target, data)[source]¶

Add a processing instruction node to the xml4h.nodes.Element node represented by this Builder.

Returns:	the current Builder.

Delegates to xml4h.nodes.Element.add_instruction().

root¶

Returns:	the `xml4h.nodes.Element` root node ancestor of the element represented by this Builder

t(text)¶: Alias of text()

text(text)[source]¶

Add a text node to the xml4h.nodes.Element node represented by this Builder.

Returns:	the current Builder.

Delegates to xml4h.nodes.Element.add_text().

transplant(node)[source]¶

Transplant a node from another document to become a child of the xml4h.nodes.Element node represented by this Builder.

Returns:	a new Builder that represents the current element (not the transplanted node).

Delegates to xml4h.nodes.Node.transplant_node().

up(count_or_element_name=1)[source]¶

Returns:	a builder representing an ancestor of the current element, by default the parent element.
Parameters:	count_or_element_name (integer or string) – when an integer, return the n’th ancestor element up to the document’s root element. When a string, return the nearest ancestor element with that name, or the document’s root element if there are no matching ancestors. Defaults to the integer value 1, which means the immediate parent.
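The two modes of count_or_element_name can be sketched against a plain minidom tree. The function ancestor_of is a hypothetical illustration of the rule as documented (an integer walks up that many levels but stops at the root, a string finds the nearest ancestor with that name and falls back to the root); it is not xml4h's implementation.

```python
from xml.dom import minidom


def ancestor_of(element, count_or_element_name=1):
    """Hypothetical sketch of the documented up() lookup rule."""
    root = element.ownerDocument.documentElement
    current = element
    if isinstance(count_or_element_name, int):
        # Walk up n levels, never going past the root element.
        for _ in range(count_or_element_name):
            if current is root:
                break
            current = current.parentNode
        return current
    # Otherwise treat the argument as an element name.
    while current is not root:
        current = current.parentNode
        if current.nodeName == count_or_element_name:
            return current
    return root


doc = minidom.parseString("<a><b><c><d/></c></b></a>")
d = doc.getElementsByTagName("d")[0]

print(ancestor_of(d).tagName)        # immediate parent
print(ancestor_of(d, 2).tagName)     # grandparent
print(ancestor_of(d, "b").tagName)   # nearest ancestor named "b"
print(ancestor_of(d, "zzz").tagName) # no match, so the root element
```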

write(*args, **kwargs)[source]¶

Write XML bytes for the element represented by this builder.

Delegates to xml4h.nodes.Node.write().

write_doc(*args, **kwargs)[source]¶

Write XML bytes for the Document containing the element represented by this builder.

Delegates to xml4h.nodes.Node.write_doc().

xml(**kwargs)[source]¶

Returns:	XML string for the element represented by this builder.

Delegates to xml4h.nodes.Node.xml().

xml_doc(**kwargs)[source]¶

Returns:	XML string for the Document containing the element represented by this builder.

Delegates to xml4h.nodes.Node.xml_doc().

Writer¶

Writer to serialize XML DOM documents or sections to text.

xml4h.writer.write_node(node, writer, encoding='utf-8', indent=0, newline='', omit_declaration=False, node_depth=0, quote_char='"')[source]¶

Serialize an xml4h DOM node and its descendants to text, writing the output to the given writer.

Parameters:

node (an xml4h.nodes.Node or subclass) – the DOM node whose content and descendants will be serialized.
writer (a file, stream, etc) – a file or stream to which XML text is written.
encoding (string) – the character encoding for serialized text.
indent (string, int, bool, or None) –
indentation prefix to apply to descendant nodes for pretty-printing. The value can take many forms:
- int: the number of spaces to indent. 0 means no indent.
- string: a literal prefix for indented nodes, such as \t.
- bool: no indent if False, four spaces indent if True.
- None: no indent.
newline (string, bool, or None) –
the string value used to separate lines of output. The value can take a number of forms:
- string: the literal newline value, such as \n or \r. An empty string means no newline.
- bool: no newline if False, \n newline if True.
- None: no newline.
omit_declaration (boolean) – if True the XML declaration header is omitted, otherwise it is included. Note that the declaration is only output when serializing an xml4h.nodes.Document node.
node_depth (int) – the indentation level to start at, such as 2 to indent output as if the given node has two ancestors. This parameter will only be useful if you need to output XML text fragments that can be assembled into a document. This parameter has no effect unless indentation is applied.
quote_char (string) – the character that delimits quoted content. You should never need to mess with this.
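The accepted forms of indent and newline listed above amount to a small normalization step. The two helpers below are a hypothetical sketch of that mapping, written from the parameter descriptions rather than taken from the xml4h source; their names are assumptions.

```python
def normalize_indent(indent):
    """Hypothetical: turn the documented indent forms into a prefix string."""
    # Check bools before ints, because bool is a subclass of int in Python.
    if indent is None or indent is False:
        return ""
    if indent is True:
        return " " * 4
    if isinstance(indent, int):
        return " " * indent
    return str(indent)


def normalize_newline(newline):
    """Hypothetical: turn the documented newline forms into a separator string."""
    if newline is None or newline is False:
        return ""
    if newline is True:
        return "\n"
    return str(newline)


print(repr(normalize_indent(2)), repr(normalize_indent(True)), repr(normalize_indent("\t")))
print(repr(normalize_newline(True)), repr(normalize_newline(None)))
```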

DOM Nodes API¶

class xml4h.nodes.Attribute(node, adapter)[source]¶: Node representing an attribute of a Document or Element node.

class xml4h.nodes.AttributeDict(attr_impl_nodes, impl_element, adapter)[source]¶

Dictionary-like object of element attributes that always reflects the state of the underlying element node, and that allows for in-place modifications that will immediately affect the element.
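The idea of a dictionary that always reflects the underlying element can be sketched as a mapping that stores nothing itself and reads and writes through to the DOM node. LiveAttributes below is a hypothetical illustration over xml.dom.minidom, not xml4h's AttributeDict.

```python
from collections.abc import MutableMapping
from xml.dom import minidom


class LiveAttributes(MutableMapping):
    """Hypothetical live view over a minidom element's attributes."""

    def __init__(self, element):
        self._element = element

    def __getitem__(self, name):
        if not self._element.hasAttribute(name):
            raise KeyError(name)
        return self._element.getAttribute(name)

    def __setitem__(self, name, value):
        self._element.setAttribute(name, str(value))

    def __delitem__(self, name):
        if not self._element.hasAttribute(name):
            raise KeyError(name)
        self._element.removeAttribute(name)

    def __iter__(self):
        # Iterate in name order, so items() comes out sorted by attribute name.
        return iter(sorted(self._element.attributes.keys()))

    def __len__(self):
        return len(self._element.attributes.keys())


doc = minidom.parseString('<Film year="1975" title="Holy Grail"/>')
film = doc.documentElement
attrs = LiveAttributes(film)

attrs["year"] = 1979               # writes through to the element
film.setAttribute("rating", "PG")  # a direct DOM change shows up in the view
del attrs["title"]

print(list(attrs.items()))
```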

__init__(attr_impl_nodes, impl_element, adapter)[source]¶: Initialize self. See help(type(self)) for accurate signature.

__repr__()[source]¶: Return repr(self).

__weakref__¶: list of weak references to the object (if defined)

element¶

Returns:	the `Element` that contains these attributes.

impl_attributes¶

Returns:	the attribute node objects from the underlying XML implementation.

items()[source]¶

Returns:	a list of name/value attribute pairs sorted by attribute name.

keys()[source]¶

Returns:	a list of attribute name strings.

namespace_uri(name)[source]¶

Parameters:	name (string) – the name of an attribute to look up.
Returns:	the namespace URI associated with the named attribute, or None.

prefix(name)[source]¶

Parameters:	name (string) – the name of an attribute to look up.
Returns:	the prefix component of the named attribute’s name, or None.

to_dict¶

Returns:	an `OrderedDict` of attribute name/value pairs.

values()[source]¶

Returns:	a list of attribute value strings.

class xml4h.nodes.CDATA(node, adapter)[source]¶: Node representing character data in an XML document.

class xml4h.nodes.Comment(node, adapter)[source]¶: Node representing a comment in an XML document.

class xml4h.nodes.Document(node, adapter)[source]¶: Node representing an entire XML document.

class xml4h.nodes.DocumentFragment(node, adapter)[source]¶: Node representing an XML document fragment.

class xml4h.nodes.DocumentType(node, adapter)[source]¶: Node representing the type of an XML document.

class xml4h.nodes.Element(node, adapter)[source]¶

Node representing an element in an XML document, with support for manipulating and adding content to the element.

add_cdata(data)[source]¶

Add a character data node to this element.

Parameters:	data (string) – text content to add as character data.

add_comment(text)[source]¶

Add a comment node to this element.

Parameters:	text (string) – text content to add as a comment.

add_element(name, ns_uri=None, attributes=None, text=None, before_this_element=False)[source]¶

Add a new child element to this element, with an optional namespace definition. If no namespace is provided the child will be assigned to the default namespace.

Parameters:

name (string) –
a name for the child node. The name may be used to apply a namespace to the child by including:
- a prefix component in the name of the form ns_prefix:element_name, where the prefix has already been defined for a namespace URI (such as via set_ns_prefix()).
- a literal namespace URI value delimited by curly braces, of the form {ns_uri}element_name.
ns_uri (string or None) – a URI specifying the new element’s namespace. If the name parameter specifies a namespace this parameter is ignored.
attributes (dict, list, tuple, or None) – collection of attributes to assign to the new child.
text (string or None) – text value to assign to the new child.
before_this_element (bool) – if True the new element is added as a sibling preceding this element, instead of as a child. In other words, the new element will be a child of this element’s parent node, and will immediately precede this element in the DOM.

Returns:

the new child as an Element node.
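The two name forms that can carry namespace information, ns_prefix:element_name and {ns_uri}element_name, can be told apart with a few string operations. The helper split_element_name is a hypothetical sketch of that parsing step only; it does not resolve prefixes to URIs and is not taken from xml4h.

```python
def split_element_name(name):
    """Hypothetical: return (prefix, ns_uri, local_name) for an element name."""
    if name.startswith("{"):
        # Literal namespace URI delimited by curly braces.
        ns_uri, _, local_name = name[1:].partition("}")
        return None, ns_uri, local_name
    if ":" in name:
        # Prefix form; the prefix must already be defined elsewhere.
        prefix, _, local_name = name.partition(":")
        return prefix, None, local_name
    return None, None, name


print(split_element_name("Film"))
print(split_element_name("mp:Film"))
print(split_element_name("{urn:films}Film"))
```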

add_instruction(target, data)[source]¶

Add an instruction node to this element.

Parameters:	target (string) – the target of the processing instruction. data (string) – the data content of the processing instruction.

add_text(text)[source]¶

Add a text node to this element.

Adding text with this method is subtly different from assigning a new text value with the text accessor, because it “appends” to rather than replaces this element’s set of text nodes.

Parameters:	text (string, or anything that can be coerced by `unicode()`) – text content to add to this element.

attrib¶: Alias of attributes()

attribute_node(name, ns_uri=None)[source]¶

Parameters:	name (string) – the name of the attribute to return. ns_uri (string or None) – a URI defining a namespace constraint on the attribute.
Returns:	this element’s attributes that match `ns_uri` as `Attribute` nodes.

attribute_nodes¶

Returns:	a list of this element’s attributes as `Attribute` nodes.

attributes¶: Get or set this element’s attributes as name/value pairs.

Note

Setting element attributes via this accessor will remove any existing attributes, as opposed to the set_attributes() method, which only adds or updates the attributes it is given.

attrs¶: Alias of attributes()

builder¶

Returns:	a `Builder` representing this element with convenience methods for adding XML content.

set_attributes(attr_obj=None, ns_uri=None, **attr_dict)[source]¶

Add or update this element’s attributes, where attributes can be specified in a number of ways.

Parameters:	attr_obj (dict, list, tuple, or None) – a dictionary or list of attribute name/value pairs. ns_uri (string or None) – a URI defining a namespace for the new attributes. attr_dict (dict) – attribute name and values specified as keyword arguments.
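The different ways of passing attributes (a dictionary, a list of name/value pairs, or keyword arguments) can all be reduced to one update of the existing attribute set. merge_attributes below is a hypothetical sketch over plain dictionaries; it leaves out the ns_uri argument and is not xml4h's implementation.

```python
def merge_attributes(current, attr_obj=None, **attr_dict):
    """Hypothetical: return a copy of `current` with the new values applied."""
    merged = dict(current)
    if attr_obj is not None:
        # dict() accepts either a mapping or an iterable of (name, value) pairs.
        merged.update(dict(attr_obj))
    merged.update(attr_dict)
    return merged


existing = {"id": "42", "lang": "en"}
from_dict = merge_attributes(existing, {"lang": "fr"})
from_pairs = merge_attributes(existing, [("lang", "de"), ("year", "1975")])
from_kwargs = merge_attributes(existing, year="1979")
print(from_dict, from_pairs, from_kwargs, sep="\n")
```

In each case the attributes that were not mentioned ("id" here) are kept, which is the add-or-update behaviour the method describes.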

set_ns_prefix(prefix, ns_uri)[source]¶

Define a namespace prefix that will serve as shorthand for the given namespace URI in element names.

Parameters:	prefix (string) – prefix that will serve as an alias for the namespace URI. ns_uri (string) – namespace URI that will be denoted by the prefix.

text¶: Get or set the text content of this element.

class xml4h.nodes.Entity(node, adapter)[source]¶: Node representing an entity in an XML document.

class xml4h.nodes.EntityReference(node, adapter)[source]¶: Node representing an entity reference in an XML document.

class xml4h.nodes.NameValueNodeMixin(node, adapter)[source]¶

Provide methods to access node name and value attributes, where the node name may also be composed of “prefix” and “local” components.

__repr__()[source]¶: Return repr(self).

local_name¶

Returns:	the local component of a node name excluding any prefix.

name¶: Get the name of a node, possibly including prefix and local components.

prefix¶

Returns:	the namespace prefix component of a node name, or None.

value¶: Get or set the value of a node.

class xml4h.nodes.Node(node, adapter)[source]¶

Base class for xml4h DOM nodes that represent and interact with a node in the underlying XML implementation.

XMLNS_URI = 'http://www.w3.org/2000/xmlns/'¶: URI constant for XMLNS

__eq__(other)[source]¶: Return self==value.

__init__(node, adapter)[source]¶

Construct an object that represents and wraps a DOM node in the underlying XML implementation.

Parameters:	node – node object from the underlying XML implementation. adapter – the `xml4h.impls.XmlImplAdapter` subclass implementation to mediate operations on the node in the underlying XML implementation.

__repr__()[source]¶: Return repr(self).

__weakref__¶: list of weak references to the object (if defined)

_convert_nodelist(impl_nodelist)[source]¶: Convert a list of underlying implementation nodes into a list of xml4h wrapper nodes.

adapter¶

Returns:	the `xml4h.impls.XmlImplAdapter` subclass implementation that mediates operations on the node in the underlying XML implementation.

adapter_class¶

Returns:	the `class` of the `xml4h.impls.XmlImplAdapter` subclass implementation that mediates operations on the node in the underlying XML implementation.

ancestors¶

Returns:	the ancestors of this node in a list ordered by proximity to this node, that is: parent, grandparent, great-grandparent etc.

child(local_name=None, name=None, ns_uri=None, node_type=None, filter_fn=None)[source]¶

Returns:	the first child node matching the given constraints, or None if there are no matching child nodes.

Delegates to NodeList.filter().

children¶

Returns:	a `NodeList` of this node’s child nodes.

clone_node(node)[source]¶

Clone a node from another document to become a child of this node, by copying the node’s data into this document but leaving the node untouched in the source document. The node to be cloned can be a Node based on the same underlying XML library implementation and adapter, or a “raw” node from that implementation.

Parameters:	node (xml4h or implementation node) – the node in another document to clone.

delete(destroy=True)[source]¶

Delete this node from the owning document.

Parameters:	destroy (bool) – if True the child node will be destroyed in addition to being removed from the document.
Returns:	the removed child node, or None if the child was destroyed.

document¶

Returns:	the `Document` node that contains this node, or `self` if this node is the document.

find(name=None, ns_uri=None, first_only=False)[source]¶

Find Element node descendants of this node, with optional constraints to limit the results.

Parameters:	name (string or None) – limit results to elements with this name. If None or `'*'` all element names are matched. ns_uri (string or None) – limit results to elements within this namespace URI. If None all elements are matched, regardless of namespace. first_only (bool) – if True only return the first result node or None if there is no matching node.
Returns:	a list of `Element` nodes matching any given constraints, or a single node if `first_only=True`.
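The search and its return-value rule can be sketched as a depth-first walk over a minidom tree. find_elements is a hypothetical function written from the description above, not xml4h code; for simplicity it treats only None as "match everything".

```python
from xml.dom import minidom


def find_elements(node, name=None, ns_uri=None, first_only=False):
    """Hypothetical depth-first search for descendant elements."""
    matches = []

    def walk(parent):
        for child in parent.childNodes:
            if child.nodeType != child.ELEMENT_NODE:
                continue
            name_ok = name is None or child.nodeName == name
            ns_ok = ns_uri is None or child.namespaceURI == ns_uri
            if name_ok and ns_ok:
                matches.append(child)
            walk(child)

    walk(node)
    if first_only:
        # A single node (or None) instead of a list.
        return matches[0] if matches else None
    return matches


doc = minidom.parseString(
    '<Films><Film><Title/></Film><Film xmlns="urn:example"><Title/></Film></Films>'
)
all_titles = find_elements(doc, name="Title")
namespaced = find_elements(doc, ns_uri="urn:example")
first_film = find_elements(doc, name="Film", first_only=True)
missing = find_elements(doc, name="Actor", first_only=True)
print(len(all_titles), [n.nodeName for n in namespaced], missing)
```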

find_doc(name=None, ns_uri=None, first_only=False)[source]¶

Find Element node descendants of the document containing this node, with optional constraints to limit the results.

Delegates to find() applied to this node’s owning document.

find_first(name=None, ns_uri=None)[source]¶

Find the first Element node descendant of this node that matches any optional constraints, or None if there are no matching elements.

Delegates to find() with first_only=True.

has_feature(feature_name)[source]¶

Returns:	True if a named feature is supported by the adapter implementation underlying this node.

impl_document¶

Returns:	the document object from the underlying XML implementation that contains the node represented by this xml4h node.

impl_node¶

Returns:	the node object from the underlying XML implementation that is represented by this xml4h node.

is_attribute¶

Returns:	True if this is an `Attribute` node.

is_cdata¶

Returns:	True if this is a `CDATA` node.

is_comment¶

Returns:	True if this is a `Comment` node.

is_document¶

Returns:	True if this is a `Document` node.

is_document_fragment¶

Returns:	True if this is a `DocumentFragment` node.

is_document_type¶

Returns:	True if this is a `DocumentType` node.

is_element¶

Returns:	True if this is an `Element` node.

is_entity¶

Returns:	True if this is an `Entity` node.

is_entity_reference¶

Returns:	True if this is an `EntityReference` node.

is_notation¶

Returns:	True if this is a `Notation` node.

is_processing_instruction¶

Returns:	True if this is a `ProcessingInstruction` node.

is_root¶

Returns:	True if this node is the document’s root element

is_text¶

Returns:	True if this is a `Text` node.

is_type(node_type_constant)[source]¶

Returns:	True if this node’s int type matches the given value.

namespace_uri¶

Returns:	this node’s namespace URI or None.

node_type¶

Returns:	an int constant value that identifies the type of this node, such as `ELEMENT_NODE` or `TEXT_NODE`.

ns_uri¶: Alias for namespace_uri()

parent¶

Returns:	the parent of this node, or None if the node has no parent.

root¶

Returns:	the root `Element` node of the document that contains this node, or `self` if this node is the root element.

siblings¶

Returns:	a list of this node’s sibling nodes.
Return type:	NodeList

siblings_after¶

Returns:	a list of this node’s siblings that occur after this node in the DOM.

siblings_before¶

Returns:	a list of this node’s siblings that occur before this node in the DOM.

transplant_node(node)[source]¶

Transplant a node from another document to become a child of this node, removing it from the source document. The node to be transplanted can be a Node based on the same underlying XML library implementation and adapter, or a “raw” node from that implementation.

Parameters:	node (xml4h or implementation node) – the node in another document to transplant.
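The difference between cloning and transplanting a node can be shown with xml.dom.minidom alone. The two helpers below are hypothetical and only illustrate the documented outcomes: a clone leaves the source document untouched, while a transplant removes the node from it. They do not show how xml4h or its adapters perform these operations.

```python
from xml.dom import minidom


def clone_into(target_parent, node):
    """Copy `node` into the target document; the source keeps its node."""
    copy = target_parent.ownerDocument.importNode(node, True)
    target_parent.appendChild(copy)
    return copy


def transplant_into(target_parent, node):
    """Copy `node` into the target document, then remove it from the source."""
    copy = clone_into(target_parent, node)
    node.parentNode.removeChild(node)
    return copy


source = minidom.parseString("<source><a/><b/></source>")
target = minidom.parseString("<target/>")
a, b = source.documentElement.childNodes

clone_into(target.documentElement, a)       # <a/> now exists in both documents
transplant_into(target.documentElement, b)  # <b/> now exists only in the target

print(source.documentElement.toxml())
print(target.documentElement.toxml())
```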

write(writer, encoding='utf-8', indent=0, newline='', omit_declaration=False, node_depth=0, quote_char='"')[source]¶

Serialize this node and its descendants to text, writing the output to the given writer.

Parameters:

writer (a file, stream, etc) – a file or stream to which XML text is written.
encoding (string) – the character encoding for serialized text.
indent (string, int, bool, or None) –
indentation prefix to apply to descendant nodes for pretty-printing. The value can take many forms:
- int: the number of spaces to indent. 0 means no indent.
- string: a literal prefix for indented nodes, such as \t.
- bool: no indent if False, four spaces indent if True.
- None: no indent
newline (string, bool, or None) –
the string value used to separate lines of output. The value can take a number of forms:
- string: the literal newline value, such as \n or \r. An empty string means no newline.
- bool: no newline if False, \n newline if True.
- None: no newline.
omit_declaration (boolean) – if True the XML declaration header is omitted, otherwise it is included. Note that the declaration is only output when serializing an xml4h.nodes.Document node.
node_depth (int) – the indentation level to start at, such as 2 to indent output as if the given node has two ancestors. This parameter will only be useful if you need to output XML text fragments that can be assembled into a document. This parameter has no effect unless indentation is applied.
quote_char (string) – the character that delimits quoted content. You should never need to mess with this.

Delegates to xml4h.writer.write_node() applied to this node.

write_doc(writer, *args, **kwargs)[source]¶

Serialize to text the document containing this node, writing the output to the given writer.

Parameters:	writer (a file, stream, etc) – a file or stream to which XML text is written.

Delegates to write()

xml(encoding='utf-8', indent=4, **kwargs)[source]¶

Returns:	this node as an XML string.

Delegates to write()

xml_doc(encoding='utf-8', **kwargs)[source]¶

Returns:	the document containing this node as an XML string.

Delegates to xml()
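The delegation chain above, where the string-returning methods rely on the stream-writing ones, is a common pattern: serialize into an in-memory buffer and return its contents. The sketch below shows that pattern with xml.dom.minidom and io.StringIO; the functions write_xml and xml_string are hypothetical and are not xml4h's methods.

```python
import io
from xml.dom import minidom


def write_xml(node, writer, indent="", newline=""):
    """Serialize a minidom node to any object with a write() method."""
    node.writexml(writer, addindent=indent, newl=newline)


def xml_string(node, **kwargs):
    """Return the serialized node by delegating to write_xml()."""
    buffer = io.StringIO()
    write_xml(node, buffer, **kwargs)
    return buffer.getvalue()


doc = minidom.parseString("<a><b/></a>")
compact = xml_string(doc.documentElement)
pretty = xml_string(doc.documentElement, indent="  ", newline="\n")
print(compact)
print(pretty)
```

Keeping the serialization logic in the writer-based function means files, sockets, and in-memory strings all go through the same code path.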

class xml4h.nodes.NodeAttrAndChildElementLookupsMixin[source]¶

Perform “magical” lookup of a node’s attributes via dict-style keyword reference, and child elements via class attribute reference.

__getattr__(child_name)[source]¶

Retrieve this node’s child element by tag name regardless of the element’s namespace, assuming the name given doesn’t match an existing attribute or method.

Parameters:	child_name (string) – tag name of the child element to look up. To avoid name clashes with class attributes the child name may include a trailing underscore (`_`) character, which is removed to get the real child tag name. The child name must not begin with underscore characters.
Returns:	the type of the return value depends on how many child elements match the name: a single `Element` node if only one child element matches; a list of `Element` nodes if there is more than one match.
Raise:	AttributeError if the node has no child element with the given name, or if the given name does not match the required pattern.

__getitem__(attr_name)[source]¶

Retrieve this node’s attribute value by name using dict-style keyword lookup.

Parameters:	attr_name (string) – name of the attribute. If the attribute has a namespace prefix that must be included, in other words the name must be a qname not local name.
Raise:	KeyError if the node has no such attribute.
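Both lookups can be sketched with a thin wrapper around a minidom element. MagicNode is a hypothetical class that follows the rules documented above (trailing underscore stripped, leading underscore rejected, one match returned as a single node and several as a list); it is not xml4h's mixin.

```python
from xml.dom import minidom


class MagicNode:
    """Hypothetical wrapper illustrating the documented lookup rules."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, child_name):
        # Only called when normal attribute lookup fails.
        if child_name.startswith("_"):
            raise AttributeError(child_name)
        tag = child_name[:-1] if child_name.endswith("_") else child_name
        matches = [
            MagicNode(child)
            for child in self._element.childNodes
            if child.nodeType == child.ELEMENT_NODE
            and child.tagName.split(":")[-1] == tag  # ignore any prefix
        ]
        if not matches:
            raise AttributeError(child_name)
        return matches[0] if len(matches) == 1 else matches

    def __getitem__(self, attr_name):
        if not self._element.hasAttribute(attr_name):
            raise KeyError(attr_name)
        return self._element.getAttribute(attr_name)


doc = minidom.parseString(
    '<Film year="1975"><Title>Holy Grail</Title><Actor/><Actor/><class/></Film>'
)
film = MagicNode(doc.documentElement)

print(film["year"])                 # attribute lookup, dict-style
print(type(film.Title).__name__)    # one match: a single node
print(len(film.Actor))              # several matches: a list
print(film.class_._element.tagName) # trailing underscore avoids the keyword
```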

__weakref__¶: list of weak references to the object (if defined)

class xml4h.nodes.NodeList[source]¶

Custom implementation for Node lists that provides additional functionality, such as node filtering.

__call__(local_name=None, name=None, ns_uri=None, node_type=None, filter_fn=None, first_only=False)¶: Alias for filter().

__weakref__¶: list of weak references to the object (if defined)

filter(local_name=None, name=None, ns_uri=None, node_type=None, filter_fn=None, first_only=False)[source]¶

Apply filters to the set of nodes in this list.

Parameters:

local_name (string or None) – a local name used to filter the nodes.
name (string or None) – a name used to filter the nodes.
ns_uri (string or None) – a namespace URI used to filter the nodes. If None all nodes are returned regardless of namespace.
node_type (int node type constant, class, or None) – a node type definition used to filter the nodes.
filter_fn (function or None) –
an arbitrary function to filter nodes in this list. This function must accept a single Node argument and return a bool indicating whether to include the node in the filtered results.

Note

if filter_fn is provided all other filter arguments are ignored.

Returns:

the type of the return value depends on the value of the first_only parameter and how many nodes match the filter:

if first_only=False return a NodeList of filtered nodes, which will be empty if there are no matching nodes.
if first_only=True and at least one node matches, return the first matching Node
if first_only=True and there are no matching nodes, return None
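The return-value rules above can be sketched with a small list subclass. SketchNodeList is hypothetical and supports only a name filter and filter_fn; it illustrates how first_only changes the return type and how making the list callable can alias filter(), without claiming to match xml4h's NodeList internals.

```python
from xml.dom import minidom


class SketchNodeList(list):
    """Hypothetical list subclass with a filter() method."""

    def filter(self, name=None, filter_fn=None, first_only=False):
        if filter_fn is None:
            # Only build a name filter when no custom function is given,
            # so a supplied filter_fn overrides the other arguments.
            def filter_fn(node):
                return name is None or node.nodeName == name

        matches = SketchNodeList(node for node in self if filter_fn(node))
        if first_only:
            return matches[0] if matches else None
        return matches

    # Calling the list itself is an alias for filter().
    __call__ = filter


doc = minidom.parseString("<r><a/><b/><a/></r>")
nodes = SketchNodeList(doc.documentElement.childNodes)

all_a = nodes.filter(name="a")
first_b = nodes(name="b", first_only=True)
no_match = nodes.filter(name="zzz", first_only=True)
empty = nodes.filter(name="zzz")
custom = nodes.filter(name="a", filter_fn=lambda node: node.nodeName == "b")
print(len(all_a), first_b.nodeName, no_match, empty, len(custom))
```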

first¶

Returns:	the first of the available children nodes, or None if there are no children.

class xml4h.nodes.Notation(node, adapter)[source]¶: Node representing a notation in an XML document.

class xml4h.nodes.ProcessingInstruction(node, adapter)[source]¶

Node representing a processing instruction in an XML document.

data¶: Get or set the value of a node.

target¶: Get the name of a node, possibly including prefix and local components.

class xml4h.nodes.Text(node, adapter)[source]¶: Node representing text content in an XML document.

class xml4h.nodes.XPathMixin[source]¶

Provide xpath() method to nodes that support XPath searching.

__weakref__¶: list of weak references to the object (if defined)

xpath(xpath, **kwargs)[source]¶

Perform an XPath query on the current node.

Parameters:	xpath (string) – XPath query. kwargs (dict) – Optional keyword arguments that are passed through to the underlying XML library implementation.
Returns:	results of the query as a list of `Node` objects, or a list of base type objects if the XPath query does not reference node objects.

XML Library Adapters¶

class xml4h.impls.interface.XmlImplAdapter(document)[source]¶

Base class that defines how xml4h interacts with an underlying XML library that the adapter “wraps” to provide additional (or at least different) functionality.

This class should be treated as an abstract class. It provides some common implementation code used by all xml4h adapter implementations, but mostly it sketches out the methods the real implementation subclasses must provide.

clear_caches()[source]¶

Clear any in-adapter cached data, for cases where cached data could become outdated e.g. by making DOM changes directly outside of xml4h.

This is a no-op if the implementing adapter has no cached data.

find_node_elements(node, name='*', ns_uri='*')[source]¶

Returns:	element node descendants of the given node that match the search constraints.
Parameters:	node – a node object from the underlying XML library. name (string) – only elements with a matching name will be returned. If the value is `*` all names will match. ns_uri (string) – only elements with a matching namespace URI will be returned. If the value is `*` all namespaces will match.

get_ns_info_from_node_name(name, impl_node)[source]¶: Return a three-element tuple with the prefix, local name, and namespace URI for the given element/attribute name (in the context of the given node’s hierarchy). If the name has no associated prefix or namespace information, None is returned for those tuple members.

classmethod has_feature(feature_name)[source]¶

Returns:	True if a named feature is supported by this adapter.

classmethod ignore_whitespace_text_nodes(wrapped_node)[source]¶

Find and delete any text nodes containing nothing but whitespace in the given node and its descendants.

This is useful for cleaning up excess low-value text nodes in a document DOM after parsing a pretty-printed XML document.
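The clean-up described above can be sketched as a short recursive pass over a minidom tree. strip_whitespace_text_nodes is a hypothetical function illustrating the idea; it is not the adapter's own implementation.

```python
from xml.dom import minidom


def strip_whitespace_text_nodes(node):
    """Remove text nodes that contain nothing but whitespace, recursively."""
    # Iterate over a copy, because children are removed during the loop.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_text_nodes(child)


pretty_printed = "<a>\n  <b>text</b>\n  <c> </c>\n</a>"
doc = minidom.parseString(pretty_printed)
before = len(doc.documentElement.childNodes)

strip_whitespace_text_nodes(doc)
after = len(doc.documentElement.childNodes)

print(before, after)
print(doc.documentElement.toxml())
```

Text nodes with real content, such as the "text" inside the b element, are left alone; only the indentation and line-break nodes are removed.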

classmethod is_available()[source]¶

Returns:	True if this adapter’s underlying XML library is available in the Python environment.

class xml4h.impls.lxml_etree.LXMLAdapter(document)[source]¶

Adapter to the lxml XML library implementation.

find_node_elements(node, name='*', ns_uri='*')[source]¶

Returns:	element node descendants of the given node that match the search constraints.
Parameters:	node – a node object from the underlying XML library. name (string) – only elements with a matching name will be returned. If the value is `*` all names will match. ns_uri (string) – only elements with a matching namespace URI will be returned. If the value is `*` all namespaces will match.

classmethod is_available()[source]¶

Returns:	True if this adapter’s underlying XML library is available in the Python environment.

xpath_on_node(node, xpath, **kwargs)[source]¶

Return result of performing the given XPath query on the given node.

All known namespace prefix-to-URI mappings in the document are automatically included in the XPath invocation.

If an empty/default namespace (i.e. None) is defined, this is converted to the prefix name ‘_’ so it can be used despite empty namespace prefixes being unsupported by XPath.
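The default-namespace workaround can be illustrated with the standard library's ElementTree, which also takes a prefix-to-URI mapping for path queries. This sketch deliberately uses xml.etree.ElementTree rather than lxml, and the helper xpath_namespaces is hypothetical: it only shows the idea of renaming the empty prefix to "_" so that elements in the default namespace can be addressed in a query.

```python
import xml.etree.ElementTree as ET


def xpath_namespaces(ns_map):
    """Hypothetical: replace the empty (None) prefix with "_" for queries."""
    return {("_" if prefix is None else prefix): uri for prefix, uri in ns_map.items()}


root = ET.fromstring(
    '<Films xmlns="urn:films" xmlns:x="urn:extra"><Film/><Film/><x:Note/></Films>'
)

# Prefix mapping as it might be collected from the document (assumed here).
document_namespaces = {None: "urn:films", "x": "urn:extra"}
namespaces = xpath_namespaces(document_namespaces)

films = root.findall("_:Film", namespaces)
notes = root.findall("x:Note", namespaces)
unqualified = root.findall("Film")  # no namespace given, so nothing matches
print(len(films), len(notes), len(unqualified))
```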

class xml4h.impls.xml_etree_elementtree.ElementTreeAdapter(document)[source]¶

Adapter to the ElementTree XML library.

This code must work with either the base ElementTree pure python implementation or the C-based cElementTree implementation, since it is reused in the cElementTree class defined below.

ET = <module 'xml.etree.ElementTree'>¶

clear_caches()[source]¶

Clear any in-adapter cached data, for cases where cached data could become outdated e.g. by making DOM changes directly outside of xml4h.

This is a no-op if the implementing adapter has no cached data.

find_node_elements(node, name='*', ns_uri='*')[source]¶

Returns:	element node descendants of the given node that match the search constraints.
Parameters:	node – a node object from the underlying XML library. name (string) – only elements with a matching name will be returned. If the value is `*` all names will match. ns_uri (string) – only elements with a matching namespace URI will be returned. If the value is `*` all namespaces will match.

classmethod is_available()[source]¶

Returns:	True if this adapter’s underlying XML library is available in the Python environment.

xpath_on_node(node, xpath, **kwargs)[source]¶

Return result of performing the given XPath query on the given node.

All known namespace prefix-to-URI mappings in the document are automatically included in the XPath invocation.

If an empty/default namespace (i.e. None) is defined, this is converted to the prefix name ‘_’ so it can be used despite empty namespace prefixes being unsupported by XPath.

class xml4h.impls.xml_etree_elementtree.cElementTreeAdapter(document)[source]¶

Adapter to the C-based implementation of the ElementTree XML library.

classmethod is_available()[source]¶

Returns:	True if this adapter’s underlying XML library is available in the Python environment.

class xml4h.impls.xml_dom_minidom.XmlDomImplAdapter(document)[source]¶

Adapter to the minidom XML library implementation.

find_node_elements(node, name='*', ns_uri='*')[source]¶

Returns:	element node descendants of the given node that match the search constraints.
Parameters:	node – a node object from the underlying XML library. name (string) – only elements with a matching name will be returned. If the value is `*` all names will match. ns_uri (string) – only elements with a matching namespace URI will be returned. If the value is `*` all namespaces will match.

get_node_text(node)[source]¶: Return the concatenated value of all text node children of this element.

classmethod is_available()[source]¶

Returns:	True if this adapter’s underlying XML library is available in the Python environment.

set_node_text(node, text)[source]¶: Set text value as sole Text child node of element; any existing Text nodes are removed

Custom Exceptions¶

Custom xml4h exceptions.

exception xml4h.exceptions.FeatureUnavailableException[source]¶: User has attempted to use a feature that is available in some xml4h implementations/adapters, but is not available in the current one.

exception xml4h.exceptions.IncorrectArgumentTypeException(arg, expected_types)[source]¶: Richer flavour of a ValueError that describes exactly what argument types are expected.

exception xml4h.exceptions.UnknownNamespaceException[source]¶: User has attempted to refer to an unknown or undeclared namespace by prefix or URI.

exception xml4h.exceptions.Xml4hException[source]¶: Base exception class for all non-standard exceptions raised by xml4h.

exception xml4h.exceptions.Xml4hImplementationBug[source]¶: xml4h implementation has a bug, probably.