API

Main Interface

xml4h.parse(to_parse, ignore_whitespace_text_nodes=True, adapter=None)

Parse an XML document into an xml4h-wrapped DOM representation using an underlying XML library implementation.

Parameters:
  • to_parse (a file-like object or string) – an XML document file, document string, or the path to an XML file. A string value that contains a < character is treated as literal XML data; any other string value is treated as a file path.
  • ignore_whitespace_text_nodes (bool) – if True pure whitespace nodes are stripped from the parsed document, since these are usually noise introduced by XML docs serialized to be human-friendly.
  • adapter (adapter class or None) – the xml4h implementation adapter class used to parse the document and to interact with the resulting nodes. If None, best_adapter will be used.
Returns:

an xml4h.nodes.Document node representing the parsed document.

Delegates to an adapter’s parse_string() or parse_file() implementation.
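
The rule for string arguments can be sketched with the standard library alone. This is not xml4h's implementation: the helper name parse_any is hypothetical, and xml.dom.minidom merely stands in for whichever adapter would be chosen. It only mirrors the documented dispatch (a string containing < is literal XML, any other string is a file path, anything else is assumed to be file-like).

```python
import io
from xml.dom import minidom


def parse_any(to_parse):
    """Hypothetical helper mirroring the documented to_parse rule."""
    if isinstance(to_parse, str):
        if "<" in to_parse:
            # A string containing '<' is treated as literal XML data.
            return minidom.parseString(to_parse)
        # Any other string is treated as a path to an XML file.
        return minidom.parse(to_parse)
    # Anything else is assumed to be a file-like object.
    return minidom.parse(to_parse)


doc = parse_any("<Root><Child/></Root>")
print(doc.documentElement.tagName)

stream_doc = parse_any(io.BytesIO(b"<Other/>"))
print(stream_doc.documentElement.tagName)
```

One consequence of this rule is that a file whose path happens to contain a < character cannot be passed as a string; an open file object would be needed instead.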

xml4h.build(tagname_or_element, ns_uri=None, adapter=None)

Return a Builder that represents an element in a new or existing XML DOM and provides “chainable” methods focussed specifically on adding XML content.

Parameters:
  • tagname_or_element (string or Element node) – a string name for the root node of a new XML document, or an Element node in an existing document.
  • ns_uri (string or None) – a namespace URI to apply to the new root node. This argument has no effect when this method is acting on an existing element.
  • adapter (adapter class or None) – the xml4h implementation adapter class used to interact with the document DOM nodes. If None, best_adapter will be used.
Returns:

a Builder instance that represents an Element node in an XML DOM.

xml4h.best_adapter

alias of cElementTreeAdapter

Builder

Builder is a utility class that makes it easy to create valid, well-formed XML documents using relatively sparse Python code. The builder class works by wrapping an xml4h.nodes.Element node to provide “chainable” methods focussed specifically on adding XML content.

Each method that adds content returns a Builder instance representing the current or the newly-added element. Behind the scenes, the builder uses the xml4h.nodes node traversal and manipulation methods to add content directly to the underlying DOM.

You will not generally create Builder instances directly, but will instead call the xml4h.build() function with the name for a new root element or with an existing xml4h.nodes.Element node.

class xml4h.builder.Builder(element)

Builder class that wraps an xml4h.nodes.Element node with methods for adding XML content to an underlying DOM.

a(*args, **kwargs)

Add one or more attributes to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.set_attributes().

attributes(*args, **kwargs)

Add one or more attributes to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.set_attributes().

attrs(*args, **kwargs)

Add one or more attributes to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.set_attributes().

c(text)

Add a comment node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_comment().

cdata(text)

Add a CDATA node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_cdata().

clone(node)

Clone a node from another document to become a child of the xml4h.nodes.Element node represented by this Builder.

Returns:a new Builder that represents the current element (not the cloned node).

Delegates to xml4h.nodes.Node.clone_node().

comment(text)

Add a comment node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_comment().

d(text)

Add a CDATA node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_cdata().

data(text)

Add a CDATA node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_cdata().

document
Returns:the xml4h.nodes.Document node that contains the element represented by this Builder.
dom_element
Returns:the xml4h.nodes.Element node represented by this Builder.
e(*args, **kwargs)

Add a child element to the xml4h.nodes.Element node represented by this Builder.

Returns:a new Builder that represents the child element.

Delegates to xml4h.nodes.Element.add_element().

elem(*args, **kwargs)

Add a child element to the xml4h.nodes.Element node represented by this Builder.

Returns:a new Builder that represents the child element.

Delegates to xml4h.nodes.Element.add_element().

element(*args, **kwargs)

Add a child element to the xml4h.nodes.Element node represented by this Builder.

Returns:a new Builder that represents the child element.

Delegates to xml4h.nodes.Element.add_element().

find(**kwargs)

Find descendants of the element represented by this builder that match the given constraints.

Returns:a list of xml4h.nodes.Element nodes

Delegates to xml4h.nodes.Node.find()

find_doc(**kwargs)

Find nodes in this element’s owning xml4h.nodes.Document that match the given constraints.

Returns:a list of xml4h.nodes.Element nodes

Delegates to xml4h.nodes.Node.find_doc().

i(target, data)

Add a processing instruction node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_instruction().

instruction(target, data)

Add a processing instruction node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_instruction().

ns_prefix(prefix, ns_uri)

Set the namespace prefix of the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.set_ns_prefix().

processing_instruction(target, data)

Add a processing instruction node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_instruction().

root
Returns:the xml4h.nodes.Element root node ancestor of the element represented by this Builder
t(text)

Add a text node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_text().

text(text)

Add a text node to the xml4h.nodes.Element node represented by this Builder.

Returns:the current Builder.

Delegates to xml4h.nodes.Element.add_text().

transplant(node)

Transplant a node from another document to become a child of the xml4h.nodes.Element node represented by this Builder.

Returns:a new Builder that represents the current element (not the transplanted node).

Delegates to xml4h.nodes.Node.transplant_node().

up(count=1, to_name=None)
Returns:

a builder representing an ancestor of the current element, by default the parent element.

Parameters:
  • count (integer >= 1 or None) – return the n’th ancestor element; defaults to 1, which means the immediate parent. If count is greater than the number of ancestors, return the document’s root element.
  • to_name (string or None) – return the nearest ancestor element with the matching name, or the document’s root element if there are no matching elements. This argument trumps the count argument.
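
The way count and to_name select an ancestor can be sketched as follows. The helper name ancestor is hypothetical and the code works on plain xml.dom.minidom elements rather than Builder objects, so it illustrates only the documented selection rule, not the library's code. Treating a count of None like the default of 1 is an assumption.

```python
from xml.dom import minidom


def ancestor(element, count=1, to_name=None):
    """Hypothetical helper mirroring the documented up() selection rule."""
    if count is None:
        count = 1  # assumption: None behaves like the default
    root = element.ownerDocument.documentElement
    current = element
    steps = 0
    while current is not root:
        current = current.parentNode
        steps += 1
        if to_name is not None:
            # to_name trumps count: stop at the nearest matching ancestor.
            if current.tagName == to_name:
                return current
        elif steps >= count:
            return current
    # Ran out of ancestors: fall back to the document's root element.
    return root


doc = minidom.parseString("<A><B><C><D/></C></B></A>")
d = doc.getElementsByTagName("D")[0]

print(ancestor(d).tagName)
print(ancestor(d, count=2).tagName)
print(ancestor(d, count=10).tagName)
print(ancestor(d, to_name="B").tagName)
```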
write(*args, **kwargs)

Write XML text for the element represented by this builder.

Delegates to xml4h.nodes.Node.write().

write_doc(*args, **kwargs)

Write XML text for the Document containing the element represented by this builder.

Delegates to xml4h.nodes.Node.write_doc().

Writer

Writer to serialize XML DOM documents or sections to text.

xml4h.writer.write_node(node, writer=None, encoding='utf-8', indent=0, newline='', omit_declaration=False, node_depth=0, quote_char='"')

Serialize an xml4h DOM node and its descendants to text, writing the output to a given writer or to stdout.

Parameters:
  • node (an xml4h.nodes.Node or subclass) – the DOM node whose content and descendants will be serialized.
  • writer (a file, stream, etc or None) – an object such as a file or stream to which XML text is sent. If None text is sent to sys.stdout.
  • encoding (string) – the character encoding for serialized text.
  • indent (string, int, bool, or None) –

    indentation prefix to apply to descendent nodes for pretty-printing. The value can take many forms:

    • int: the number of spaces to indent. 0 means no indent.
    • string: a literal prefix for indented nodes, such as \t.
    • bool: no indent if False, four spaces indent if True.
    • None: no indent.
  • newline (string, bool, or None) –

    the string value used to separate lines of output. The value can take a number of forms:

    • string: the literal newline value, such as \n or \r. An empty string means no newline.
    • bool: no newline if False, \n newline if True.
    • None: no newline.
  • omit_declaration (boolean) – if True the XML declaration header is omitted, otherwise it is included. Note that the declaration is only output when serializing an xml4h.nodes.Document node.
  • node_depth (int) – the indentation level to start at, such as 2 to indent output as if the given node has two ancestors. This parameter will only be useful if you need to output XML text fragments that can be assembled into a document. This parameter has no effect unless indentation is applied.
  • quote_char (string) – the character that delimits quoted content. You should never need to mess with this.
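
The accepted forms of indent and newline can be summarised as a normalisation step that turns each argument into a plain string. The helper names normalize_indent and normalize_newline are hypothetical; this sketch follows the rules listed above and is not taken from the writer's source.

```python
def normalize_indent(indent):
    """Hypothetical helper: map an indent argument to a prefix string."""
    # Booleans are handled before ints because bool is a subclass of int.
    if indent is None or indent is False:
        return ""
    if indent is True:
        return " " * 4
    if isinstance(indent, int):
        return " " * indent
    # Otherwise treat the value as a literal prefix such as "\t".
    return str(indent)


def normalize_newline(newline):
    """Hypothetical helper: map a newline argument to a separator string."""
    if newline is None or newline is False:
        return ""
    if newline is True:
        return "\n"
    return str(newline)


for value in (0, 2, "\t", True, False, None):
    print(repr(value), "->", repr(normalize_indent(value)))
```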

DOM Nodes API

class xml4h.nodes.Attribute(node, adapter)

Node representing an attribute of a Document or Element node.

class xml4h.nodes.AttributeDict(attr_impl_nodes, impl_element, adapter)

Dictionary-like object of element attributes that always reflects the state of the underlying element node, and that allows for in-place modifications that will immediately affect the element.

element
Returns:the Element that contains these attributes.
impl_attributes
Returns:the attribute node objects from the underlying XML implementation.
items()
Returns:a list of name/value attribute pairs sorted by attribute name.
keys()
Returns:a list of attribute name strings.
namespace_uri(name)
Parameters:name (string) – the name of an attribute to look up.
Returns:the namespace URI associated with the named attribute, or None.
prefix(name)
Parameters:name (string) – the name of an attribute to look up.
Returns:the prefix component of the named attribute’s name, or None.
to_dict
Returns:an OrderedDict of attribute name/value pairs.
values()
Returns:a list of attribute value strings.
class xml4h.nodes.CDATA(node, adapter)

Node representing character data in an XML document.

class xml4h.nodes.Comment(node, adapter)

Node representing a comment in an XML document.

class xml4h.nodes.Document(node, adapter)

Node representing an entire XML document.

class xml4h.nodes.DocumentFragment(node, adapter)

Node representing an XML document fragment.

class xml4h.nodes.DocumentType(node, adapter)

Node representing the type of an XML document.

class xml4h.nodes.Element(node, adapter)

Node representing an element in an XML document, with support for manipulating and adding content to the element.

add_cdata(data)

Add a character data node to this element.

Parameters:data (string) – text content to add as character data.
add_comment(text)

Add a comment node to this element.

Parameters:text (string) – text content to add as a comment.
add_element(name, ns_uri=None, attributes=None, text=None, before_this_element=False)

Add a new child element to this element, with an optional namespace definition. If no namespace is provided the child will be assigned to the default namespace.

Parameters:
  • name (string) –

    a name for the child node. The name may be used to apply a namespace to the child by including:

    • a prefix component in the name of the form ns_prefix:element_name, where the prefix has already been defined for a namespace URI (such as via set_ns_prefix()).
    • a literal namespace URI value delimited by curly braces, of the form {ns_uri}element_name.
  • ns_uri (string or None) – a URI specifying the new element’s namespace. If the name parameter specifies a namespace this parameter is ignored.
  • attributes (dict, list, tuple, or None) – collection of attributes to assign to the new child.
  • text (string or None) – text value to assign to the new child.
  • before_this_element (bool) – if True the new element is added as a sibling preceding this element, instead of as a child. In other words, the new element will be a child of this element’s parent node, and will immediately precede this element in the DOM.
Returns:

the new child as an Element node.
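
The two ways of embedding a namespace in the name argument can be told apart with simple string handling. The helper split_element_name is hypothetical, and urn:example is a made-up URI; the sketch shows only how the two documented forms differ, not how xml4h resolves a prefix to its URI.

```python
def split_element_name(name):
    """Hypothetical helper: return (prefix, ns_uri, local_name) for a name."""
    if name.startswith("{"):
        # Literal namespace URI form: {ns_uri}element_name
        ns_uri, _, local_name = name[1:].partition("}")
        return None, ns_uri, local_name
    if ":" in name:
        # Prefixed form: ns_prefix:element_name
        prefix, _, local_name = name.partition(":")
        return prefix, None, local_name
    # Plain name with no namespace information.
    return None, None, name


print(split_element_name("{urn:example}Item"))
print(split_element_name("ex:Item"))
print(split_element_name("Item"))
```

The curly-brace form is checked first because a namespace URI usually contains a colon itself.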

add_instruction(target, data)

Add a processing instruction node to this element.

Parameters:
  • target (string) – the target of the processing instruction.
  • data (string) – the data content of the processing instruction.
add_text(text)

Add a text node to this element.

Adding text with this method is subtly different from assigning a new text value with the text accessor, because it appends to this element’s set of text nodes rather than replacing them.

Parameters:text (string or anything that can be coerced by unicode()) – text content to add to this element.
attrib

Get or set this element’s attributes as name/value pairs.

Note

Setting element attributes via this accessor will remove any existing attributes, as opposed to the set_attributes() method, which only adds or updates the attributes it is given.

attribute_node(name, ns_uri=None)
Parameters:
  • name (string) – the name of the attribute to return.
  • ns_uri (string or None) – a URI defining a namespace constraint on the attribute.
Returns:

the attribute of this element that matches name and ns_uri, as an Attribute node.

attribute_nodes
Returns:a list of this element’s attributes as Attribute nodes.
attributes

Get or set this element’s attributes as name/value pairs.

Note

Setting element attributes via this accessor will remove any existing attributes, as opposed to the set_attributes() method, which only adds or updates the attributes it is given.

attrs

Get or set this element’s attributes as name/value pairs.

Note

Setting element attributes via this accessor will remove any existing attributes, as opposed to the set_attributes() method, which only adds or updates the attributes it is given.

builder
Returns:a Builder representing this element with convenience methods for adding XML content.
set_attributes(attr_obj=None, ns_uri=None, **attr_dict)

Add or update this element’s attributes, where attributes can be specified in a number of ways.

Parameters:
  • attr_obj (dict, list, tuple, or None) – a dictionary or list of attribute name/value pairs.
  • ns_uri (string or None) – a URI defining a namespace for the new attributes.
  • attr_dict (dict) – attribute name and values specified as keyword arguments.
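
The add-or-update behaviour, and the different ways attributes can be supplied, can be modelled with ordinary dictionaries. The function merged_attributes is hypothetical and ignores namespaces entirely; it is a sketch of the documented semantics only.

```python
def merged_attributes(existing, attr_obj=None, **attr_dict):
    """Hypothetical model of add-or-update attribute semantics."""
    updated = dict(existing)
    if attr_obj is not None:
        # dict() accepts a mapping or a list/tuple of name/value pairs.
        updated.update(dict(attr_obj))
    # Keyword arguments are applied last.
    updated.update(attr_dict)
    return updated


current = {"a": "1", "b": "2"}
print(merged_attributes(current, {"b": "3"}))
print(merged_attributes(current, [("c", "4")], d="5"))
```

Existing attributes that are not named survive the call, which is the difference from assigning through the attributes accessor.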
set_ns_prefix(prefix, ns_uri)

Define a namespace prefix that will serve as shorthand for the given namespace URI in element names.

Parameters:
  • prefix (string) – prefix that will serve as an alias for the namespace URI.
  • ns_uri (string) – namespace URI that will be denoted by the prefix.
text

Get or set the text content of this element.

class xml4h.nodes.Entity(node, adapter)

Node representing an entity in an XML document.

class xml4h.nodes.EntityReference(node, adapter)

Node representing an entity reference in an XML document.

class xml4h.nodes.NameValueNodeMixin(node, adapter)

Provide methods to access node name and value attributes, where the node name may also be composed of “prefix” and “local” components.

local_name
Returns:the local component of a node name excluding any prefix.
name
Get or set the name of a node, possibly including prefix and local components.
prefix
Returns:the namespace prefix component of a node name, or None.
value

Get or set the value of a node.

class xml4h.nodes.Node(node, adapter)

Base class for xml4h DOM nodes that represent and interact with a node in the underlying XML implementation.

__init__(node, adapter)

Construct an object that represents and wraps a DOM node in the underlying XML implementation.

Parameters:
  • node – node object from the underlying XML implementation.
  • adapter – the xml4h.impls.XmlImplAdapter subclass implementation to mediate operations on the node in the underlying XML implementation.
_convert_nodelist(impl_nodelist)

Convert a list of underlying implementation nodes into a list of xml4h wrapper nodes.

adapter
Returns:the xml4h.impls.XmlImplAdapter subclass implementation that mediates operations on the node in the underlying XML implementation.
adapter_class
Returns:the class of the xml4h.impls.XmlImplAdapter subclass implementation that mediates operations on the node in the underlying XML implementation.
ancestors
Returns:the ancestors of this node in a list ordered by proximity to this node, that is: parent, grandparent, great-grandparent etc.
child(local_name=None, name=None, ns_uri=None, node_type=None, filter_fn=None)
Returns:the first child node matching the given constraints, or None if there are no matching child nodes.

Delegates to NodeList.filter().

children
Returns:a NodeList of this node’s child nodes.
clone_node(node)

Clone a node from another document to become a child of this node, by copying the node’s data into this document but leaving the node untouched in the source document. The node to be cloned can be a Node based on the same underlying XML library implementation and adapter, or a “raw” node from that implementation.

Parameters:node (xml4h or implementation node) – the node in another document to clone.
delete(destroy=True)

Delete this node from the owning document.

Parameters:destroy (bool) – if True the child node will be destroyed in addition to being removed from the document.
Returns:the removed child node, or None if the child was destroyed.
document
Returns:the Document node that contains this node, or self if this node is the document.
find(name=None, ns_uri=None, first_only=False)

Find Element node descendants of this node, with optional constraints to limit the results.

Parameters:
  • name (string or None) – limit results to elements with this name. If None or '*' all element names are matched.
  • ns_uri (string or None) – limit results to elements within this namespace URI. If None all elements are matched, regardless of namespace.
  • first_only (bool) – if True only return the first result node or None if there is no matching node.
Returns:

a list of Element nodes matching any given constraints, or a single node if first_only=True.
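
The constraint handling can be approximated with xml.dom.minidom, where '*' is also the wildcard. The helper find_elements is hypothetical, and it assumes that name is compared against the local part of the element name; the text above does not say whether xml4h compares local or qualified names.

```python
from xml.dom import minidom


def find_elements(node, name=None, ns_uri=None, first_only=False):
    """Hypothetical helper mirroring the documented find() constraints."""
    local_name = "*" if name is None else name
    namespace = "*" if ns_uri is None else ns_uri
    # Returns descendants only; the node itself is never included.
    results = list(node.getElementsByTagNameNS(namespace, local_name))
    if first_only:
        return results[0] if results else None
    return results


doc = minidom.parseString(
    "<R xmlns:x='urn:x'><A/><x:A/><B><A/></B></R>"
)
root = doc.documentElement

print(len(find_elements(root)))
print(len(find_elements(root, "A")))
print(len(find_elements(root, "A", ns_uri="urn:x")))
print(find_elements(root, "Z", first_only=True))
```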

find_doc(name=None, ns_uri=None, first_only=False)

Find Element node descendants of the document containing this node, with optional constraints to limit the results.

Delegates to find() applied to this node’s owning document.

find_first(name=None, ns_uri=None)

Find the first Element node descendant of this node that matches any optional constraints, or None if there are no matching elements.

Delegates to find() with first_only=True.

has_feature(feature_name)
Returns:True if a named feature is supported by the adapter implementation underlying this node.
impl_document
Returns:the document object from the underlying XML implementation that contains the node represented by this xml4h node.
impl_node
Returns:the node object from the underlying XML implementation that is represented by this xml4h node.
is_attribute
Returns:True if this is an Attribute node.
is_cdata
Returns:True if this is a CDATA node.
is_comment
Returns:True if this is a Comment node.
is_document
Returns:True if this is a Document node.
is_document_fragment
Returns:True if this is a DocumentFragment node.
is_document_type
Returns:True if this is a DocumentType node.
is_element
Returns:True if this is an Element node.
is_entity
Returns:True if this is an Entity node.
is_entity_reference
Returns:True if this is an EntityReference node.
is_notation
Returns:True if this is a Notation node.
is_processing_instruction
Returns:True if this is a ProcessingInstruction node.
is_root
Returns:True if this node is the document’s root element
is_text
Returns:True if this is a Text node.
is_type(node_type_constant)
Returns:True if this node’s int type matches the given value.
namespace_uri
Returns:this node’s namespace URI or None.
node_type
Returns:an int constant value that identifies the type of this node, such as ELEMENT_NODE or TEXT_NODE.
ns_uri
Returns:this node’s namespace URI or None.
parent
Returns:the parent of this node, or None if the node has no parent.
root
Returns:the root Element node of the document that contains this node, or self if this node is the root element.
siblings
Returns:a list of this node’s sibling nodes.
Return type:NodeList
siblings_after
Returns:a list of this node’s siblings that occur after this node in the DOM.
siblings_before
Returns:a list of this node’s siblings that occur before this node in the DOM.
transplant_node(node)

Transplant a node from another document to become a child of this node, removing it from the source document. The node to be transplanted can be a Node based on the same underlying XML library implementation and adapter, or a “raw” node from that implementation.

Parameters:node (xml4h or implementation node) – the node in another document to transplant.
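
The difference between cloning and transplanting can be shown with xml.dom.minidom. The helpers clone_into and transplant_into are hypothetical; they illustrate the documented effect on the source document and are not the adapter code xml4h uses.

```python
from xml.dom import minidom


def clone_into(parent, node):
    """Copy node (and its descendants) into parent's document."""
    copy = parent.ownerDocument.importNode(node, True)
    parent.appendChild(copy)
    return copy


def transplant_into(parent, node):
    """Copy node into parent's document, then remove it from its source."""
    copy = clone_into(parent, node)
    node.parentNode.removeChild(node)
    return copy


source = minidom.parseString("<Source><Item id='1'/></Source>")
target = minidom.parseString("<Target/>")
item = source.documentElement.firstChild

clone_into(target.documentElement, item)
after_clone = len(source.documentElement.childNodes)

transplant_into(target.documentElement, item)
after_transplant = len(source.documentElement.childNodes)

print(after_clone, after_transplant)
print(len(target.documentElement.childNodes))
```

After the clone the source document still holds its Item element; after the transplant it does not.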
write(writer=None, encoding='utf-8', indent=0, newline='', omit_declaration=False, node_depth=0, quote_char='"')

Serialize this node and its descendants to text, writing the output to a given writer or to stdout.

Parameters:
  • writer (a file, stream, etc or None) – an object such as a file or stream to which XML text is sent. If None text is sent to sys.stdout.
  • encoding (string) – the character encoding for serialized text.
  • indent (string, int, bool, or None) –

    indentation prefix to apply to descendent nodes for pretty-printing. The value can take many forms:

    • int: the number of spaces to indent. 0 means no indent.
    • string: a literal prefix for indented nodes, such as \t.
    • bool: no indent if False, four spaces indent if True.
    • None: no indent.
  • newline (string, bool, or None) –

    the string value used to separate lines of output. The value can take a number of forms:

    • string: the literal newline value, such as \n or \r. An empty string means no newline.
    • bool: no newline if False, \n newline if True.
    • None: no newline.
  • omit_declaration (boolean) – if True the XML declaration header is omitted, otherwise it is included. Note that the declaration is only output when serializing an xml4h.nodes.Document node.
  • node_depth (int) – the indentation level to start at, such as 2 to indent output as if the given node has two ancestors. This parameter will only be useful if you need to output XML text fragments that can be assembled into a document. This parameter has no effect unless indentation is applied.
  • quote_char (string) – the character that delimits quoted content. You should never need to mess with this.

Delegates to xml4h.writer.write_node() applied to this node.

write_doc(*args, **kwargs)

Serialize to text the document containing this node, writing the output to a given writer or stdout.

Delegates to write()

xml(indent=4, **kwargs)
Returns:this node as XML text.

Delegates to write()

xml_doc(**kwargs)
Returns:the document containing this node as XML text.

Delegates to xml()

class xml4h.nodes.NodeAttrAndChildElementLookupsMixin

Perform “magical” lookup of a node’s attributes via dict-style keyword reference, and child elements via class attribute reference.

__getattr__(child_name)

Retrieve this node’s child element by tag name regardless of the element’s namespace, assuming the name given doesn’t match an existing attribute or method.

Parameters:child_name (string) – tag name of the child element to look up. To avoid name clashes with class attributes the child name may include a trailing underscore (_) character, which is removed to get the real child tag name. The child name must not begin with underscore characters.
Returns:the type of the return value depends on how many child elements match the name:
  • a single Element node if only one child element matches
  • a list of Element nodes if there is more than 1 match.
Raise:AttributeError if the node has no child element with the given name, or if the given name does not match the required pattern.
__getitem__(attr_name)

Retrieve this node’s attribute value by name using dict-style keyword lookup.

Parameters:attr_name (string) – name of the attribute. If the attribute has a namespace prefix that must be included, in other words the name must be a qname not local name.
Raise:KeyError if the node has no such attribute.
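
The lookup rules for both special methods can be sketched with a small wrapper around an xml.dom.minidom element. The class MagicElement is hypothetical and is not how xml4h implements the mixin; it follows the rules stated above (trailing underscore removed, leading underscore rejected, one match returns a node, several return a list).

```python
from xml.dom import minidom


class MagicElement:
    """Hypothetical wrapper illustrating the documented lookup rules."""

    def __init__(self, impl_element):
        self._impl = impl_element

    def __getattr__(self, child_name):
        # Only reached when normal attribute lookup has failed.
        if child_name.startswith("_"):
            raise AttributeError(child_name)
        # A trailing underscore is removed to get the real tag name.
        tag = child_name[:-1] if child_name.endswith("_") else child_name
        matches = [
            MagicElement(node)
            for node in self._impl.childNodes
            if node.nodeType == node.ELEMENT_NODE
            and node.tagName.split(":")[-1] == tag
        ]
        if not matches:
            raise AttributeError(child_name)
        return matches[0] if len(matches) == 1 else matches

    def __getitem__(self, attr_name):
        if not self._impl.hasAttribute(attr_name):
            raise KeyError(attr_name)
        return self._impl.getAttribute(attr_name)


doc = minidom.parseString(
    '<Film year="1971"><Title>A</Title><class>B</class>'
    "<Actor/><Actor/></Film>"
)
film = MagicElement(doc.documentElement)

print(film["year"])
print(film.class_._impl.tagName)  # trailing underscore avoids the keyword
print(len(film.Actor))
```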
class xml4h.nodes.NodeList

Custom implementation for Node lists that provides additional functionality, such as node filtering.

__call__(local_name=None, name=None, ns_uri=None, node_type=None, filter_fn=None, first_only=False)

Apply filters to the set of nodes in this list.

Parameters:
  • local_name (string or None) – a local name used to filter the nodes.
  • name (string or None) – a name used to filter the nodes.
  • ns_uri (string or None) – a namespace URI used to filter the nodes. If None all nodes are returned regardless of namespace.
  • node_type (int node type constant, class, or None) – a node type definition used to filter the nodes.
  • filter_fn (function or None) –

    an arbitrary function to filter nodes in this list. This function must accept a single Node argument and return a bool indicating whether to include the node in the filtered results.

    Note

    If filter_fn is provided all other filter arguments are ignored.

Returns:

the type of the return value depends on the value of the first_only parameter and how many nodes match the filter:

  • if first_only=False return a NodeList of filtered nodes, which will be empty if there are no matching nodes.
  • if first_only=True and at least one node matches, return the first matching Node
  • if first_only=True and there are no matching nodes, return None

filter(local_name=None, name=None, ns_uri=None, node_type=None, filter_fn=None, first_only=False)

Apply filters to the set of nodes in this list.

Parameters:
  • local_name (string or None) – a local name used to filter the nodes.
  • name (string or None) – a name used to filter the nodes.
  • ns_uri (string or None) – a namespace URI used to filter the nodes. If None all nodes are returned regardless of namespace.
  • node_type (int node type constant, class, or None) – a node type definition used to filter the nodes.
  • filter_fn (function or None) –

    an arbitrary function to filter nodes in this list. This function must accept a single Node argument and return a bool indicating whether to include the node in the filtered results.

    Note

    If filter_fn is provided all other filter arguments are ignored.

Returns:

the type of the return value depends on the value of the first_only parameter and how many nodes match the filter:

  • if first_only=False return a NodeList of filtered nodes, which will be empty if there are no matching nodes.
  • if first_only=True and at least one node matches, return the first matching Node
  • if first_only=True and there are no matching nodes, return None
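
The return-type rules, and the way filter_fn overrides the other arguments, can be sketched over a plain list of xml.dom.minidom nodes. The function filter_nodes is hypothetical and supports only a subset of the documented arguments.

```python
from xml.dom import minidom


def filter_nodes(nodes, name=None, node_type=None, filter_fn=None,
                 first_only=False):
    """Hypothetical helper mirroring the documented filter() return rules."""

    def matches_constraints(node):
        if node_type is not None and node.nodeType != node_type:
            return False
        if name is not None and node.nodeName != name:
            return False
        return True

    # When filter_fn is given, the other filter arguments are ignored.
    predicate = filter_fn if filter_fn is not None else matches_constraints
    matches = [node for node in nodes if predicate(node)]
    if first_only:
        return matches[0] if matches else None
    return matches


doc = minidom.parseString("<R><A/>text<B/><A/></R>")
children = doc.documentElement.childNodes

print(len(filter_nodes(children, name="A")))
print(filter_nodes(children, name="B", first_only=True).nodeName)
print(filter_nodes(children, name="Z", first_only=True))
print(filter_nodes(children, name="Z"))
```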

first
Returns:the first of the available children nodes, or None if there are no children.
class xml4h.nodes.Notation(node, adapter)

Node representing a notation in an XML document.

class xml4h.nodes.ProcessingInstruction(node, adapter)

Node representing a processing instruction in an XML document.

data

Get or set the value of a node.

target
Get or set the name of a node, possibly including prefix and local components.
class xml4h.nodes.Text(node, adapter)

Node representing text content in an XML document.

class xml4h.nodes.XPathMixin

Provide xpath() method to nodes that support XPath searching.

xpath(xpath, **kwargs)

Perform an XPath query on the current node.

Parameters:
  • xpath (string) – XPath query.
  • kwargs (dict) – Optional keyword arguments that are passed through to the underlying XML library implementation.
Returns:

results of the query as a list of Node objects, or a list of base type objects if the XPath query does not reference node objects.

XML Library Adapters

class xml4h.impls.interface.XmlImplAdapter(document)

Base class that defines how xml4h interacts with an underlying XML library that the adapter “wraps” to provide additional (or at least different) functionality.

This class should be treated as an abstract class. It provides some common implementation code used by all xml4h adapter implementations, but mostly it sketches out the methods the real implementation subclasses must provide.

clear_caches()

Clear any in-adapter cached data, for cases where cached data could become outdated e.g. by making DOM changes directly outside of xml4h.

This is a no-op if the implementing adapter has no cached data.

find_node_elements(node, name='*', ns_uri='*')
Returns:

element node descendants of the given node that match the search constraints.

Parameters:
  • node – a node object from the underlying XML library.
  • name (string) – only elements with a matching name will be returned. If the value is * all names will match.
  • ns_uri (string) – only elements with a matching namespace URI will be returned. If the value is * all namespaces will match.
get_ns_info_from_node_name(name, impl_node)

Return a three-element tuple with the prefix, local name, and namespace URI for the given element/attribute name (in the context of the given node’s hierarchy). If the name has no associated prefix or namespace information, None is returned for those tuple members.

classmethod has_feature(feature_name)
Returns:True if a named feature is supported by this adapter.
classmethod ignore_whitespace_text_nodes(wrapped_node)

Find and delete any text nodes containing nothing but whitespace in the given node and its descendants.

This is useful for cleaning up excess low-value text nodes in a document DOM after parsing a pretty-printed XML document.

classmethod is_available()
Returns:True if this adapter’s underlying XML library is available in the Python environment.
class xml4h.impls.lxml_etree.LXMLAdapter(document)

Adapter to the lxml XML library implementation.

find_node_elements(node, name='*', ns_uri='*')
Returns:

element node descendants of the given node that match the search constraints.

Parameters:
  • node – a node object from the underlying XML library.
  • name (string) – only elements with a matching name will be returned. If the value is * all names will match.
  • ns_uri (string) – only elements with a matching namespace URI will be returned. If the value is * all namespaces will match.
xpath_on_node(node, xpath, **kwargs)

Return result of performing the given XPath query on the given node.

All known namespace prefix-to-URI mappings in the document are automatically included in the XPath invocation.

If an empty/default namespace (i.e. None) is defined, this is converted to the prefix name ‘_’ so it can be used despite empty namespace prefixes being unsupported by XPath.

class xml4h.impls.xml_etree_elementtree.ElementTreeAdapter(document)

Adapter to the ElementTree XML library.

This code must work with either the base ElementTree pure python implementation or the C-based cElementTree implementation, since it is reused in the cElementTree class defined below.

find_node_elements(node, name='*', ns_uri='*')
Returns:

element node descendants of the given node that match the search constraints.

Parameters:
  • node – a node object from the underlying XML library.
  • name (string) – only elements with a matching name will be returned. If the value is * all names will match.
  • ns_uri (string) – only elements with a matching namespace URI will be returned. If the value is * all namespaces will match.
xpath_on_node(node, xpath, **kwargs)

Return result of performing the given XPath query on the given node.

All known namespace prefix-to-URI mappings in the document are automatically included in the XPath invocation.

If an empty/default namespace (i.e. None) is defined, this is converted to the prefix name ‘_’ so it can be used despite empty namespace prefixes being unsupported by XPath.
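
The default-namespace substitution can be illustrated with the limited XPath support in the standard library's ElementTree. This sketch builds the prefix mapping by hand and is not the adapter's own code; the URI urn:example:default and the ns_map dictionary are made up for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<Root xmlns="urn:example:default"><Item>1</Item><Item>2</Item></Root>'
)

# Known prefix-to-URI mappings; None stands for the default namespace.
ns_map = {None: "urn:example:default"}

# XPath has no syntax for an empty prefix, so None is mapped to '_'.
xpath_ns = {
    ("_" if prefix is None else prefix): uri
    for prefix, uri in ns_map.items()
}

items = root.findall("_:Item", namespaces=xpath_ns)
print([item.text for item in items])

# Without a prefix the query looks for Item in no namespace and finds nothing.
print(root.findall("Item"))
```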

class xml4h.impls.xml_etree_elementtree.cElementTreeAdapter(document)

Adapter to the C-based implementation of the ElementTree XML library.

class xml4h.impls.xml_dom_minidom.XmlDomImplAdapter(document)

Adapter to the minidom XML library implementation.

get_node_text(node)

Return the concatenated value of all text node children of this element.

set_node_text(node, text)

Set text value as sole Text child node of element; any existing Text nodes are removed

Custom Exceptions

Custom xml4h exceptions.

exception xml4h.exceptions.FeatureUnavailableException

User has attempted to use a feature that is available in some xml4h implementations/adapters, but is not available in the current one.

exception xml4h.exceptions.IncorrectArgumentTypeException(arg, expected_types)

Richer flavour of a ValueError that describes exactly what argument types are expected.

exception xml4h.exceptions.UnknownNamespaceException

User has attempted to refer to an unknown or undeclared namespace by prefix or URI.

exception xml4h.exceptions.Xml4hException

Base exception class for all non-standard exceptions raised by xml4h.

exception xml4h.exceptions.Xml4hImplementationBug

xml4h implementation has a bug, probably.