XML
Parsing
Parse XML strings into Element trees with configurable options
Basic Parsing
import { xml2js } from "@office-open/xml";
const root = xml2js(`<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:r><w:t>Hello</w:t></w:r>
</w:p>`);
// root.name === "w:p"
// root.elements[0].name === "w:r"
The xml2js function converts an XML string into an Element tree following the xml-js format.
xml2js(xml, options?)
const root = xml2js(xmlString, {
trim: true,
ignoreComment: true,
});
Options
| Option | Type | Default | Description |
|---|---|---|---|
| trim | boolean | false | Trim whitespace in text nodes |
| ignoreDeclaration | boolean | false | Skip XML declaration (<?xml ...?>) |
| ignoreComment | boolean | false | Skip XML comments (<!-- -->) |
| ignoreCdata | boolean | false | Skip CDATA sections |
| ignoreDoctype | boolean | false | Skip DOCTYPE declarations |
| ignoreText | boolean | false | Skip text nodes |
| nativeTypeAttributes | boolean | false | Convert attribute values to native types |
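To make the nativeTypeAttributes row concrete, the sketch below shows one plausible conversion rule of the kind that option implies. The function name toNativeType and the exact rules (numbers first, then true/false, everything else left as a string) are assumptions for illustration, not the library's implementation, so check the actual behaviour before relying on any edge case.

```typescript
// Illustration only: a hypothetical conversion rule, not code from
// @office-open/xml. Attribute values arrive as strings when parsing XML.
function toNativeType(value: string): string | number | boolean {
  const trimmed = value.trim();
  // Leave empty or whitespace-only values untouched (Number("") would be 0).
  if (trimmed === "") return value;

  // Anything Number() can parse becomes a number.
  const asNumber = Number(trimmed);
  if (!isNaN(asNumber)) return asNumber;

  // Case-insensitive booleans.
  const lower = trimmed.toLowerCase();
  if (lower === "true") return true;
  if (lower === "false") return false;

  // Everything else stays a string.
  return value;
}

console.log(toNativeType("240"));      // 240
console.log(toNativeType("true"));     // true
console.log(toNativeType("center"));   // "center"
console.log(toNativeType("00123456")); // 123456
```

Under a rule like this, an all-digit identifier such as 00123456 loses its leading zeros, which is one reason to leave the option off when attribute values need to round-trip unchanged.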
Element Structure
Each parsed node is an Element object. An element typically uses name, attributes, and elements:
interface Element {
type?: string;
name?: string;
attributes?: Attributes;
elements?: Element[];
text?: string | number | boolean;
// ... plus cdata, comment, declaration, etc.
}
Text content lives on the element's text field, not in a separate node:
// parsed <w:t>Hello World</w:t>
{ type: "element", name: "w:t", text: "Hello World" }
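Because text sits on the element itself, gathering the text of a subtree is a short recursive walk. The sketch below assumes a trimmed-down local copy of the Element shape (named ParsedElement here to avoid clashing with the DOM's global Element type) and a hypothetical helper, collectText. Neither name comes from @office-open/xml.

```typescript
// Local, trimmed-down copy of the Element shape described above. This is an
// assumption made so the sketch stands alone; use the library's type in practice.
interface ParsedElement {
  type?: string;
  name?: string;
  elements?: ParsedElement[];
  text?: string | number | boolean;
}

// Hypothetical helper: concatenate the text of an element and all of its
// descendants, in document order.
function collectText(node: ParsedElement): string {
  const own = node.text === undefined ? "" : String(node.text);
  const nested = (node.elements ?? []).map(collectText).join("");
  return own + nested;
}

// Hand-built tree equivalent to <w:r><w:t>Hello </w:t><w:t>World</w:t></w:r>
const run: ParsedElement = {
  type: "element",
  name: "w:r",
  elements: [
    { type: "element", name: "w:t", text: "Hello " },
    { type: "element", name: "w:t", text: "World" },
  ],
};

console.log(collectText(run)); // "Hello World"
```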
Namespace Handling
Namespaces are preserved as regular attributes:
const root = xml2js(`<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:r><w:t>Hello</w:t></w:r>
</w:p>`);
// root.attributes["xmlns:w"] === "http://schemas.openxmlformats.org/..."
All element names include their prefix, so you query using "w:p", "w:r", etc.
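Querying by prefixed name can be sketched as a depth-first search over elements. The helpers findAll and namespaceUri below are hypothetical, as is the local ParsedElement type that mirrors the Element shape; none of them is exported by @office-open/xml.

```typescript
// Local stand-in for the Element shape (an assumption so the sketch is
// self-contained; named ParsedElement to avoid the DOM's global Element).
interface ParsedElement {
  type?: string;
  name?: string;
  attributes?: Record<string, string | number | boolean | undefined>;
  elements?: ParsedElement[];
  text?: string | number | boolean;
}

// Hypothetical helper: return every element (root included) whose prefixed
// name matches exactly, in document order.
function findAll(root: ParsedElement, qualifiedName: string): ParsedElement[] {
  const matches: ParsedElement[] = root.name === qualifiedName ? [root] : [];
  for (const child of root.elements ?? []) {
    matches.push(...findAll(child, qualifiedName));
  }
  return matches;
}

// Hypothetical helper: read the URI bound to a prefix on a single element.
function namespaceUri(el: ParsedElement, prefix: string): string | undefined {
  const value = el.attributes?.[`xmlns:${prefix}`];
  return typeof value === "string" ? value : undefined;
}

const W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

// Hand-built tree equivalent to the parsed <w:p> example.
const paragraph: ParsedElement = {
  type: "element",
  name: "w:p",
  attributes: { "xmlns:w": W },
  elements: [
    {
      type: "element",
      name: "w:r",
      elements: [{ type: "element", name: "w:t", text: "Hello" }],
    },
  ],
};

console.log(findAll(paragraph, "w:t").length);   // 1
console.log(namespaceUri(paragraph, "w") === W); // true
```

The match is on the literal prefix, so a document that binds the same namespace URI to a different prefix will not match "w:t".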
xml2js / xml2json
For xml-js compatibility, the package exports functions under the same names that xml-js uses:
import { xml2js, xml2json } from "@office-open/xml";
const element = xml2js(xmlString);
const jsonString = xml2json(xmlString);
Reading from ZIP Archives
Combine with @office-open/core to parse XML from OOXML files:
import { readFileSync } from "node:fs";
import { parseArchive } from "@office-open/core";
const archive = parseArchive(readFileSync("document.docx"));
const document = archive.get("word/document.xml");
// document is already an Element tree