Parser

Parse and modify existing OOXML documents with ParsedDocument

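The parser reads .docx and .pptx files into a ParsedDocument, which exposes the package as a key-value store: each key is a part path inside the ZIP archive (for example word/document.xml), and each value is that part's XML tree or raw bytes.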

Parse a Document

import { readFileSync } from "node:fs";
import { parseDocument } from "@office-open/core";

const data = readFileSync("document.docx");
const doc = parseDocument(data);

ParsedDocument API

Read Parts

// Read an XML part as an Element tree
const styles = doc.get("word/styles.xml");

// Read binary data (images, media)
const image = doc.getRaw("word/media/image1.png");

// Check if a part exists
doc.has("word/document.xml"); // true

// List all paths
doc.keys(); // ["[Content_Types].xml", "word/document.xml", ...]
doc.keys("word/media/"); // ["word/media/image1.png"]
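
The read methods above are enough to run simple checks on a package before editing it. The following is a minimal sketch, not part of the library: PartReader, missingParts, and groupByExtension are hypothetical names, and the sketch assumes that has() returns a boolean and that keys() returns an array of path strings, as the comments above suggest.

```typescript
// Hypothetical structural type covering only has() and keys() as shown above.
// A ParsedDocument is assumed to satisfy it; the real signatures may differ.
interface PartReader {
  has(path: string): boolean;
  keys(prefix?: string): string[];
}

// Return the required part paths that are absent from the package.
function missingParts(doc: PartReader, required: string[]): string[] {
  return required.filter((path) => !doc.has(path));
}

// Group part paths by lowercase file extension, e.g. { xml: [...], png: [...] }.
// Parts whose file name has no extension are grouped under the empty string.
function groupByExtension(doc: PartReader, prefix?: string): Record<string, string[]> {
  const paths = prefix === undefined ? doc.keys() : doc.keys(prefix);
  const groups: Record<string, string[]> = {};
  for (const path of paths) {
    const name = path.slice(path.lastIndexOf("/") + 1);
    const dot = name.lastIndexOf(".");
    const ext = dot === -1 ? "" : name.slice(dot + 1).toLowerCase();
    if (!groups[ext]) groups[ext] = [];
    groups[ext].push(path);
  }
  return groups;
}
```

For example, missingParts(doc, ["word/document.xml", "word/styles.xml"]) would list whichever of the two parts the package lacks, which is useful before calling doc.get() on a part that may not exist.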

Write Parts

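import { Element } from "@office-open/xml";

// Write an XML part.
// modifiedStylesElement is a placeholder for an Element, e.g. the tree returned by doc.get() after editing.
doc.set("word/styles.xml", modifiedStylesElement);

// Write binary data.
// imageBuffer is a placeholder for the raw bytes, e.g. the result of readFileSync().
doc.setRaw("word/media/image2.png", imageBuffer);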
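
When adding media, it helps to avoid overwriting an existing part. The sketch below picks the next unused word/media/imageN path before writing. MediaStore and addMedia are hypothetical names introduced for illustration, and the sketch assumes setRaw() accepts a Uint8Array (a Node.js Buffer is one).

```typescript
// Hypothetical structural type covering only has() and setRaw() as shown above.
// The real ParsedDocument signatures may differ.
interface MediaStore {
  has(path: string): boolean;
  setRaw(path: string, data: Uint8Array): unknown;
}

// Store data under the first unused word/media/image<N>.<extension> path
// and return the path that was chosen.
function addMedia(doc: MediaStore, data: Uint8Array, extension: string): string {
  let index = 1;
  let path = `word/media/image${index}.${extension}`;
  while (doc.has(path)) {
    index += 1;
    path = `word/media/image${index}.${extension}`;
  }
  doc.setRaw(path, data);
  return path;
}
```

Writing the bytes only adds the part to the package. For the image to appear in the document, the package also needs a relationship pointing at the new part and a matching content type entry in [Content_Types].xml.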

Remove Parts

doc.remove("word/settings.xml"); // true if removed
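
Combining keys() with remove() allows removing a whole group of parts. This is a minimal sketch under the assumption that keys(prefix) returns an array and remove() returns true when a part was deleted; PartRemover and removeByPrefix are hypothetical names, not library API.

```typescript
// Hypothetical structural type covering only keys() and remove() as shown above.
interface PartRemover {
  keys(prefix?: string): string[];
  remove(path: string): boolean;
}

// Remove every part whose path starts with prefix and return the removed paths.
function removeByPrefix(doc: PartRemover, prefix: string): string[] {
  const removed: string[] = [];
  // Copy the path list first so that removal cannot disturb the iteration.
  for (const path of doc.keys(prefix).slice()) {
    if (doc.remove(path)) removed.push(path);
  }
  return removed;
}
```

Removing a part does not touch the relationships or content that refer to it, so a document that still references a removed image may fail to open cleanly in Word or PowerPoint.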

Save

import { writeFileSync } from "node:fs";

// Serialize back to a ZIP buffer
const modified = doc.save();
writeFileSync("modified.docx", modified);

Complete Example

import { readFileSync, writeFileSync } from "node:fs";
import { parseDocument } from "@office-open/core";

const doc = parseDocument(readFileSync("input.docx"));

// Read the main document body
const documentXml = doc.get("word/document.xml");

// List all media files
const mediaFiles = doc.keys("word/media/");
console.log("Media files:", mediaFiles);

// Save a copy
writeFileSync("output.docx", doc.save());

Use with @office-open/xml

Combine ParsedDocument with @office-open/xml query utilities to inspect and modify XML content:

import { readFileSync } from "node:fs";
import { parseDocument } from "@office-open/core";
import { findChild, children, childText } from "@office-open/xml";

const doc = parseDocument(readFileSync("input.docx"));
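const root = doc.get("word/document.xml"); // root <w:document> element

// Paragraphs (<w:p>) are children of <w:body>, not of the root element.
// This assumes findChild(element, name) returns the first matching direct child.
const body = findChild(root, "w:body");
const paragraphs = children(body, "w:p");

// Text lives in <w:t> inside a run (<w:r>); read it from the first run of the first paragraph
const run = findChild(paragraphs[0], "w:r");
const text = childText(run, "w:t");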
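
In WordprocessingML, paragraph text is split across runs: <w:document> contains <w:body>, which contains <w:p> paragraphs, each holding <w:r> runs with <w:t> text nodes. The sketch below shows how a full paragraph string could be assembled from that nesting. XmlNode is a hypothetical node shape used only for illustration; the real Element type in @office-open/xml may expose names, text, and children differently, so treat this as a model of the traversal rather than library code.

```typescript
// Hypothetical node shape for illustration only; not the @office-open/xml Element type.
interface XmlNode {
  name: string;
  text?: string;
  children: XmlNode[];
}

// Concatenate the text of every <w:t> descendant, in document order.
function collectText(node: XmlNode): string {
  if (node.name === "w:t") return node.text ?? "";
  return node.children.map(collectText).join("");
}

// Return one string per <w:p> found directly under <w:body>.
// Paragraphs nested deeper (for example inside tables) are not visited.
function paragraphTexts(root: XmlNode): string[] {
  const body = root.children.find((child) => child.name === "w:body");
  if (!body) return [];
  return body.children.filter((child) => child.name === "w:p").map(collectText);
}
```

Joining all runs matters because editors often split a single sentence into several runs when formatting changes mid-sentence, so reading only the first <w:t> can return a fragment.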
Copyright © 2026