DOCX

Parsing

Parse and inspect existing .docx files with parseDocx

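The parseDocx function reads an existing .docx file and returns a DocxDocument object that exposes its parts: the body, styles, numbering, settings, font table, and references to headers, footers, and notes.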

Basic Usage

import { parseDocx } from "@office-open/docx";
import { readFileSync } from "node:fs";

const data = new Uint8Array(readFileSync("input.docx"));
const doc = parseDocx(data);

// Access document body
console.log(doc.body);

// Access styles (if present)
console.log(doc.styles);

// Access numbering (if present)
console.log(doc.numbering);

// Access settings (if present)
console.log(doc.settings);

DocxDocument API

The returned DocxDocument object contains:

Property    Type                  Description
doc         ParsedDocument        Full parsed document (all parts)
body        Element               Document body element (w:body)
styles      Element | undefined   Styles element
numbering   Element | undefined   Numbering definitions
settings    Element | undefined   Document settings
fontTable   Element | undefined   Font table
partRefs    DocxPartRefs          References to headers, footers, notes
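
Because several of these properties may be undefined, it can help to check which optional parts a document actually contains before reading them. The following is a minimal sketch, not part of the library: the OptionalParts interface is a structural stand-in that only mirrors the property names in the table above, and presentParts is a hypothetical helper name.

```typescript
// Structural stand-in for the optional DocxDocument properties listed above.
// This is an assumption for illustration, not the library's own type.
interface OptionalParts {
    styles?: unknown;
    numbering?: unknown;
    settings?: unknown;
    fontTable?: unknown;
}

// Hypothetical helper: returns the names of the optional parts that are present.
function presentParts(doc: OptionalParts): string[] {
    const names = ["styles", "numbering", "settings", "fontTable"] as const;
    return names.filter((name) => doc[name] !== undefined);
}
```

If the object returned by parseDocx has the shape described in the table, it should be possible to pass it to presentParts directly.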

Accessing Parts

const doc = parseDocx(data);

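// Look up individual parts by path
const body = doc.doc.get("word/document.xml");
const styles = doc.doc.get("word/styles.xml");

// A relationship ID may not refer to a header, so check the path before using it
const headerPath = doc.partRefs.headers.get("rId1");
const header = headerPath ? doc.doc.get(headerPath) : undefined;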

// Part references map relationship IDs to paths
for (const [rId, path] of doc.partRefs.headers) {
    const header = doc.doc.get(path);
}
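
The loop above can be wrapped in a small helper that resolves every referenced part at once and skips references whose path is not in the package. This is a sketch under assumptions: resolveParts is a hypothetical name, and it only assumes that the references behave like a Map of relationship ID to path and that the parts container has a get(path) method, as in the examples above.

```typescript
// Hypothetical helper: resolves every relationship ID in `refs` to its part,
// skipping IDs whose path is missing from `parts`.
function resolveParts<T>(
    refs: Map<string, string>,
    parts: { get(path: string): T | undefined },
): Map<string, T> {
    const resolved = new Map<string, T>();
    refs.forEach((path, rId) => {
        const part = parts.get(path);
        if (part !== undefined) {
            resolved.set(rId, part);
        }
    });
    return resolved;
}
```

Under those assumptions, a call such as resolveParts(doc.partRefs.headers, doc.doc) would give a map from relationship ID to header part.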

Working with XML Elements

The parsed elements use the @office-open/xml library's Element type:

import { attr } from "@office-open/xml";

// Access element attributes
const tagName = attr(doc.body, "tagName");

// Iterate child elements
for (const child of doc.body.elements ?? []) {
    console.log(child.name);
}
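
A quick way to get an overview of a parsed body is to count its direct children by element name. The sketch below does not import the library: XmlNode is a minimal stand-in that assumes only the name and elements fields used in the loop above, and countChildNames is a hypothetical helper.

```typescript
// Minimal element shape assumed for this sketch; only `name` and `elements`
// are taken from the example above.
interface XmlNode {
    name?: string;
    elements?: XmlNode[];
}

// Hypothetical helper: counts the direct children of `parent` by element name.
function countChildNames(parent: XmlNode): Map<string, number> {
    const counts = new Map<string, number>();
    for (const child of parent.elements ?? []) {
        if (child.name === undefined) continue; // skip nodes without a name
        counts.set(child.name, (counts.get(child.name) ?? 0) + 1);
    }
    return counts;
}
```

For a typical body this would report how many w:p (paragraph) and w:tbl (table) children it contains.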

Use Cases

  • Extract text — Walk the body elements to extract paragraph text
  • Merge documents — Parse multiple .docx files and combine their content
  • Inspect formatting — Read style and numbering definitions
  • Transform — Modify parsed elements and rebuild a document
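
As an illustration of the first use case, the sketch below walks a body element and returns one string per paragraph. It relies on standard WordprocessingML names (w:p for paragraphs, w:t for text runs). It also assumes that text content appears as child nodes carrying a string text field, which may not match the actual Element type. XmlNode, collectText, and extractParagraphText are hypothetical names defined here, not library exports. Tabs and line breaks are ignored.

```typescript
// Minimal element shape assumed for this sketch (not the library's Element type).
interface XmlNode {
    type?: string;
    name?: string;
    text?: string;
    elements?: XmlNode[];
}

// Concatenates the text of every w:t descendant of `node`.
function collectText(node: XmlNode): string {
    let out = "";
    for (const child of node.elements ?? []) {
        if (child.name === "w:t") {
            // Assumption: text content is stored on child nodes as `text`.
            for (const t of child.elements ?? []) {
                if (typeof t.text === "string") out += t.text;
            }
        } else {
            out += collectText(child);
        }
    }
    return out;
}

// Returns one string per w:p found under `body`, in document order.
// Paragraphs nested in tables are included because the walk is recursive.
function extractParagraphText(body: XmlNode): string[] {
    const paragraphs: string[] = [];
    for (const child of body.elements ?? []) {
        if (child.name === "w:p") {
            paragraphs.push(collectText(child));
        } else {
            paragraphs.push(...extractParagraphText(child));
        }
    }
    return paragraphs;
}
```

If the assumed shape holds, extractParagraphText(doc.body).join("\n") would give the plain text of the document.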