
Archive

Read and write OOXML ZIP archives and parse relationship files.

OOXML files (.docx, .pptx) are ZIP archives that contain XML parts. The archive module provides low-level utilities for reading and writing these archives.
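
Because an OOXML file is an ordinary ZIP archive, one quick sanity check before unzipping is to look for the ZIP local file header signature (the bytes `PK\x03\x04`) at the start of the data. The helper below is a hypothetical sketch for illustration, not part of `@office-open/core`:

```typescript
// Hypothetical helper (not part of @office-open/core): checks whether a buffer
// starts with the ZIP local file header signature "PK\x03\x04".
const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04];

function looksLikeZip(data: Uint8Array): boolean {
    if (data.length < ZIP_SIGNATURE.length) return false;
    return ZIP_SIGNATURE.every((byte, i) => data[i] === byte);
}

// Sample inputs: a buffer that starts like a ZIP file, and plain UTF-8 text.
const zipLike = new Uint8Array([0x50, 0x4b, 0x03, 0x04, 0x14, 0x00]);
const plainText = new TextEncoder().encode("hello");

console.log(looksLikeZip(zipLike)); // true
console.log(looksLikeZip(plainText)); // false
```

A check like this only rules out obviously wrong input; it does not verify that the archive is a valid OOXML package.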

Reading archives

unzipToMap

Extracts an OOXML file into a Map<string, Uint8Array>:

import { readFileSync } from "node:fs";
import { unzipToMap } from "@office-open/core";

const zip = unzipToMap(readFileSync("document.docx"));

Read functions

import { readTextFromZip, readXmlFromZip, readBinaryFromZip } from "@office-open/core";

// Read a part as text
const contentTypes = readTextFromZip(zip, "[Content_Types].xml");

// Read and parse an XML part
const documentXml = readXmlFromZip(zip, "word/document.xml");

// Read binary data (images, etc.)
const imageData = readBinaryFromZip(zip, "word/media/image1.png");
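
Conceptually, reading a text part from the unzipped map amounts to looking up the entry and decoding its bytes as UTF-8. The following is a minimal sketch of that idea, not the library's implementation; `readTextSketch` and its choice to return `undefined` for a missing entry are assumptions made for illustration.

```typescript
// Hypothetical stand-in for a text read. Returning undefined for a missing
// entry is an assumption, not documented behavior of readTextFromZip.
function readTextSketch(zip: Map<string, Uint8Array>, path: string): string | undefined {
    const bytes = zip.get(path);
    return bytes === undefined ? undefined : new TextDecoder("utf-8").decode(bytes);
}

// A tiny in-memory map standing in for an unzipped archive.
const sampleZip = new Map<string, Uint8Array>([
    ["[Content_Types].xml", new TextEncoder().encode("<Types/>")],
]);

const found = readTextSketch(sampleZip, "[Content_Types].xml");
const missing = readTextSketch(sampleZip, "word/missing.xml");
console.log(found, missing);
```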

readAllXmlParts

Parses every XML part in the archive, skipping binary files:

import { readAllXmlParts } from "@office-open/core";

const parts = readAllXmlParts(zip);
// { "[Content_Types].xml": Element, "word/document.xml": Element, ... }
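
How `readAllXmlParts` tells XML parts from binary ones is not stated here. One plausible approach, sketched below purely as an assumption, is to go by file extension and treat `.xml` and `.rels` entries as XML. `isXmlPartName` is a hypothetical helper.

```typescript
// Assumption: a part is treated as XML when its name ends in .xml or .rels.
// The actual rule used by readAllXmlParts may differ.
function isXmlPartName(path: string): boolean {
    const lower = path.toLowerCase();
    return lower.endsWith(".xml") || lower.endsWith(".rels");
}

const partNames = [
    "[Content_Types].xml",
    "_rels/.rels",
    "word/document.xml",
    "word/media/image1.png",
];

// Keeps the three XML entries and drops the PNG.
const xmlPartNames = partNames.filter(isXmlPartName);
console.log(xmlPartNames);
```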

listFiles

Lists files by path prefix:

import { listFiles } from "@office-open/core";

const mediaFiles = listFiles(zip, "word/media/");
// ["word/media/image1.png", "word/media/image2.jpg"]
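
Listing by prefix can be pictured as a filter over the map's keys. The sketch below shows that idea with a hypothetical `listByPrefix` helper; the result order (map insertion order) is an assumption and may not match what `listFiles` returns.

```typescript
// Hypothetical equivalent of a prefix listing over an unzipped archive map.
// An empty prefix matches every entry.
function listByPrefix(zip: Map<string, Uint8Array>, prefix: string): string[] {
    return Array.from(zip.keys()).filter((name) => name.startsWith(prefix));
}

const empty = new Uint8Array(0);
const archive = new Map<string, Uint8Array>([
    ["word/document.xml", empty],
    ["word/media/image1.png", empty],
    ["word/media/image2.jpg", empty],
]);

const media = listByPrefix(archive, "word/media/");
const everything = listByPrefix(archive, "");
console.log(media, everything.length);
```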

Writing archives

zipToBuffer

Creates a ZIP buffer from a file map:

import { zipToBuffer } from "@office-open/core";

// xmlString (string) and imageBuffer (Uint8Array) are assumed to be defined elsewhere
const files = new Map<string, Uint8Array | string>();
files.set("word/document.xml", xmlString);
files.set("word/media/image.png", imageBuffer);

const zipBuffer = zipToBuffer(files);
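
The file map accepts both string and `Uint8Array` values, so string entries have to become bytes at some point before compression. The sketch below shows one way to normalize such a map with `TextEncoder`, assuming UTF-8 encoding. `normalizeEntries` is a hypothetical helper; whether `zipToBuffer` does the same thing internally is not stated here.

```typescript
// Hypothetical helper: converts string entries to UTF-8 bytes and passes
// Uint8Array entries through unchanged.
function normalizeEntries(files: Map<string, Uint8Array | string>): Map<string, Uint8Array> {
    const encoder = new TextEncoder();
    const out = new Map<string, Uint8Array>();
    files.forEach((value, name) => {
        out.set(name, typeof value === "string" ? encoder.encode(value) : value);
    });
    return out;
}

const mixed = new Map<string, Uint8Array | string>([
    ["word/document.xml", "<w:document/>"],
    ["word/media/image.png", new Uint8Array([0x89, 0x50, 0x4e, 0x47])],
]);

const normalized = normalizeEntries(mixed);
console.log(normalized.size);
```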

Relationship files

OOXML uses .rels files to define the relationships between parts.
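
By OPC convention, the relationships of a part live in a `_rels` folder next to it, in a file named after the part with `.rels` appended; package-level relationships live in `_rels/.rels`. The sketch below derives that path. `relsPathFor` is a hypothetical helper introduced here for illustration, not part of `@office-open/core`.

```typescript
// Hypothetical helper: builds the .rels path for a part, following the
// convention <dir>/_rels/<file>.rels. An empty part path stands for the
// package root.
function relsPathFor(partPath: string): string {
    const slash = partPath.lastIndexOf("/");
    const dir = partPath.slice(0, slash + 1); // "" when there is no directory
    const file = partPath.slice(slash + 1);
    return `${dir}_rels/${file}.rels`;
}

const documentRels = relsPathFor("word/document.xml");
const slideRels = relsPathFor("ppt/slides/slide1.xml");
const packageRels = relsPathFor("");
console.log(documentRels, slideRels, packageRels);
```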

parseRels

Parses a relationship file:

import { parseRels } from "@office-open/core";

const rels = parseRels(zip, "word/_rels/document.xml.rels");
// [{ id: "rId1", target: "styles.xml", type: "..." }, ...]

Relationship interface

interface Relationship {
    id: string;
    target: string;
    type: string;
    targetMode?: string;
}
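
A relationship's `target` is usually relative to the part that owns the .rels file, so it has to be resolved before it can be used as a key into the archive map. The sketch below shows one way to do that under the usual OPC rules: external targets are returned unchanged, targets starting with `/` are package-absolute, and everything else is resolved against the source part's directory. `resolveTarget` is a hypothetical helper, and the `Relationship` interface is repeated only so the block is self-contained.

```typescript
interface Relationship {
    id: string;
    target: string;
    type: string;
    targetMode?: string;
}

// Hypothetical helper: turns rel.target into an archive path.
function resolveTarget(sourcePart: string, rel: Relationship): string {
    if (rel.targetMode === "External") return rel.target; // e.g. a hyperlink URL
    if (rel.target.startsWith("/")) return rel.target.slice(1); // package-absolute

    const segments = sourcePart.split("/");
    segments.pop(); // drop the file name, keep the directory segments
    for (const seg of rel.target.split("/")) {
        if (seg === "..") segments.pop();
        else if (seg !== "." && seg !== "") segments.push(seg);
    }
    return segments.join("/");
}

const REL_BASE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

const stylesPath = resolveTarget("word/document.xml", {
    id: "rId1", target: "styles.xml", type: `${REL_BASE}/styles`,
});
const imagePath = resolveTarget("ppt/slides/slide1.xml", {
    id: "rId2", target: "../media/image1.png", type: `${REL_BASE}/image`,
});
const linkTarget = resolveTarget("word/document.xml", {
    id: "rId3", target: "https://example.com/", type: `${REL_BASE}/hyperlink`, targetMode: "External",
});
console.log(stylesPath, imagePath, linkTarget);
```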

Complete example

import { readFileSync, writeFileSync } from "node:fs";
import { unzipToMap, zipToBuffer, readXmlFromZip, listFiles } from "@office-open/core";

// Read the archive
const zip = unzipToMap(readFileSync("input.docx"));

// List all parts
const allFiles = listFiles(zip, "");
console.log("Files:", allFiles);

// Read an XML part
const document = readXmlFromZip(zip, "word/document.xml");

// Modify the map
zip.delete("word/settings.xml");

// Write the modified archive
writeFileSync("output.docx", zipToBuffer(zip));

Utility functions

Function                  Description
uint8ToBase64(data)       Converts a Uint8Array to a base64 string
getImageType(fileName)    Determines the image type from the file extension
elementToXml(el)          Serializes an Element to an XML string
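
To illustrate what extension-based image detection can look like, here is a minimal sketch. Everything in it is an assumption: `imageTypeFromName` is a hypothetical helper, and the set of recognized extensions, the lowercase return values, and the `undefined` result for unknown extensions are illustrative choices rather than the documented behavior of `getImageType`.

```typescript
// Hypothetical lookup table; the formats covered by getImageType may differ.
const IMAGE_TYPES: Record<string, string> = {
    png: "png",
    jpg: "jpeg",
    jpeg: "jpeg",
    gif: "gif",
    bmp: "bmp",
};

// Returns an image type based on the file extension, or undefined if the
// name has no extension or the extension is not in the table.
function imageTypeFromName(fileName: string): string | undefined {
    const dot = fileName.lastIndexOf(".");
    if (dot === -1) return undefined;
    return IMAGE_TYPES[fileName.slice(dot + 1).toLowerCase()];
}

const pngType = imageTypeFromName("word/media/image1.png");
const jpgType = imageTypeFromName("word/media/photo.JPG");
const unknownType = imageTypeFromName("word/document.xml");
console.log(pngType, jpgType, unknownType);
```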