The most famous redaction failures in recent memory have not been about black bars over text. They've been about tracked changes. A document is drafted, edited, revised by multiple people, finalized, "cleaned up" by Accepting All Changes, and sent out. Then someone unzips the .docx file (it's a ZIP archive), opens word/document.xml in a text editor, and reads whatever revision markup survived the cleanup, sometimes including text that was supposed to be gone.
What "Accept All" actually does
When you click "Accept All Changes" in Word, the application replaces the insertion or deletion markup with the resolved text. What it does not always do is remove the metadata about who made those edits, when, or what alternative was rejected.
Specifically, these XML elements can survive Accept All:
- w:rPrChange: a record that a run's (character-level) formatting was changed, with the previous formatting preserved
- w:pPrChange: a record that a paragraph property (alignment, spacing) was changed
- w:moveFrom / w:moveTo: records that text was moved within the document
- Author and date attributes inside revision markers: often preserved even after content resolution
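One way to check whether any of these survived in a given file is to scan word/document.xml yourself. The following is a minimal Python sketch using only the standard library. The function name find_revision_markers is made up for illustration, and the sketch only looks at word/document.xml, not at headers, footers, or footnotes.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
REVISION_TAGS = ("ins", "del", "rPrChange", "pPrChange", "moveFrom", "moveTo")


def find_revision_markers(docx):
    """Count revision elements in word/document.xml and collect author names.

    `docx` is a path or a binary file-like object. (Hypothetical helper,
    not part of any library.)
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    wanted = {W + tag: tag for tag in REVISION_TAGS}
    counts = Counter()
    authors = set()
    for el in root.iter():
        tag = wanted.get(el.tag)
        if tag is None:
            continue
        counts[tag] += 1
        author = el.get(W + "author")
        if author:
            authors.add(author)
    return counts, authors
```

If the returned counter is non-empty for a file you believed was clean, the file still carries revision history.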
The case files
The pattern of "redaction by Accept All" failures is consistent across organizations:
- A government agency releases a Word document with sensitive paragraphs marked "redacted" by deleting them as tracked changes
- Someone clicks Accept All before sending
- Someone else opens the .docx, looks at the XML, and finds the rejected or deleted content sitting in w:del elements with author and date
- The "redacted" content is now public
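To make the third step concrete, here is a rough Python sketch of how little effort it takes to read deleted text out of a file that still contains w:del elements. It uses only the standard library. The function name deleted_text is hypothetical, and the sketch assumes the deleted text lives in w:delText runs inside word/document.xml.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def deleted_text(docx):
    """Return a list of (author, date, text) for each w:del in word/document.xml.

    `docx` is a path or a binary file-like object. (Hypothetical helper.)
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    results = []
    for deletion in root.iter(W + "del"):
        # Deleted runs store their text in w:delText rather than w:t.
        text = "".join(t.text or "" for t in deletion.iter(W + "delText"))
        results.append((deletion.get(W + "author"), deletion.get(W + "date"), text))
    return results
```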
This has happened to law firms (releasing draft contracts with rejected counterproposals visible), to government agencies (releasing FOIA-responsive documents with redacted-by-deletion content), and to corporations (publishing official statements that show every internal edit).
Why Word's Document Inspector doesn't always catch this
Microsoft's "Inspect Document" feature has improved over the years, but it operates at the application layer. It removes the things Word's API exposes — comments, document properties — but doesn't always strip every revision artifact from the XML. And it definitely doesn't strip EXIF from JPEGs embedded inside the document.
Even Document Inspector's most thorough mode often leaves behind:
- Custom XML parts: used by CRM systems, document management platforms, and enterprise templates to inject metadata into every document
- Template paths in docProps/app.xml: sometimes revealing a full network share path
- Embedded image EXIF: every JPEG in word/media/ retains its full EXIF
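A quick audit for these three leftovers can be sketched in Python with the standard library alone. Treat it as a rough check under stated assumptions: the function name audit_leftovers is invented for this example, custom XML parts are assumed to live under customXml/, and the EXIF test is a crude byte search for the "Exif" header rather than a real JPEG parse.

```python
import zipfile
import xml.etree.ElementTree as ET

APP_NS = "{http://schemas.openxmlformats.org/officeDocument/2006/extended-properties}"


def audit_leftovers(docx):
    """Report custom XML parts, the template name, and JPEGs that look like
    they carry EXIF. `docx` is a path or binary file-like object.
    (Hypothetical helper, heuristic only.)
    """
    findings = {"custom_xml": [], "template": None, "jpegs_with_exif": []}
    with zipfile.ZipFile(docx) as zf:
        names = zf.namelist()

        # Assumption: custom XML parts are stored under customXml/.
        findings["custom_xml"] = [n for n in names if n.startswith("customXml/")]

        if "docProps/app.xml" in names:
            root = ET.fromstring(zf.read("docProps/app.xml"))
            template = root.find(APP_NS + "Template")
            if template is not None:
                findings["template"] = template.text

        for name in names:
            is_jpeg = name.lower().endswith((".jpg", ".jpeg"))
            if name.startswith("word/media/") and is_jpeg:
                # Crude heuristic: the EXIF APP1 payload starts with "Exif\0\0".
                if b"Exif\x00\x00" in zf.read(name):
                    findings["jpegs_with_exif"].append(name)
    return findings
```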
What proper scrubbing looks like
To actually clean a DOCX, you need to:
- Unzip the archive (it's a ZIP)
- Rewrite docProps/core.xml and docProps/app.xml with empty values
- Delete docProps/custom.xml if it exists
- Parse word/document.xml and resolve every w:ins, w:del, w:rPrChange, w:pPrChange, w:moveFrom, and w:moveTo element: accept insertions by keeping the content and dropping the wrapper, accept deletions by removing the element and its content entirely
- Delete word/comments.xml and any extended-comments files
- Walk through word/media/ and strip EXIF from every embedded JPEG
- Repackage the archive
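Part of this list can be sketched in a few dozen lines of Python. The example below is an illustration under stated assumptions, not a description of how any particular tool works. It covers only the unzip, tracked-change, EXIF, and repackage steps. It does not blank docProps or delete parts, because removing a part cleanly also means removing its entries from [Content_Types].xml and the relevant .rels files. The function names are invented here, and edge cases such as deleted paragraph marks, deleted table rows, and move range markers are ignored. It uses xml.dom.minidom so that namespace declarations in the original XML are written back unchanged.

```python
import zipfile
from xml.dom import minidom

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DROP = ("del", "moveFrom", "rPrChange", "pPrChange")  # remove element and content
UNWRAP = ("ins", "moveTo")                            # keep content, remove wrapper


def resolve_revisions(xml_bytes):
    """Accept tracked changes in one WordprocessingML part (sketch only)."""
    doc = minidom.parseString(xml_bytes)
    for name in DROP:
        for el in list(doc.getElementsByTagNameNS(W_NS, name)):
            el.parentNode.removeChild(el)
    for name in UNWRAP:
        for el in list(doc.getElementsByTagNameNS(W_NS, name)):
            parent = el.parentNode
            while el.firstChild is not None:
                # insertBefore detaches the child from el before re-attaching it.
                parent.insertBefore(el.firstChild, el)
            parent.removeChild(el)
    return doc.toxml(encoding="UTF-8")


def strip_exif(jpeg):
    """Drop APP1 segments that start with the EXIF header (sketch only)."""
    if jpeg[:2] != b"\xff\xd8":          # not a JPEG: leave untouched
        return jpeg
    out = bytearray(b"\xff\xd8")
    i = 2
    while i + 4 <= len(jpeg) and jpeg[i] == 0xFF:
        marker = jpeg[i + 1]
        if marker == 0xDA:               # start of scan: image data follows
            break
        length = int.from_bytes(jpeg[i + 2:i + 4], "big")
        segment = jpeg[i:i + 2 + length]
        if not (marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"):
            out += segment
        i += 2 + length
    out += jpeg[i:]
    return bytes(out)


def scrub_docx(src, dst):
    """Copy src to dst, resolving revisions and stripping EXIF on the way."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            name = info.filename
            if name == "word/document.xml":
                data = resolve_revisions(data)
            elif name.startswith("word/media/") and name.lower().endswith((".jpg", ".jpeg")):
                data = strip_exif(data)
            zout.writestr(name, data)
```

Calling scrub_docx("draft.docx", "clean.docx") (both file names are placeholders) writes a new archive and leaves the original untouched, which makes it easy to diff the two before anything is sent out.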
This is what our DOCX metadata removal tool does in your browser, in about a second per file. No upload, no signup.
Beyond metadata
Metadata removal handles the hidden stuff. Visible-content redaction is a separate problem — if you wrote sensitive text into the body of the document, you need to delete and re-save, not just strip metadata. Always do a final visual review before sending.