Introducing AnonymizeReviewer: Remove Author Metadata from Word Docs Without Losing Edits
When you’re collaborating on a Microsoft Word document, tracked changes and comments are essential tools. However, when it’s time to share those files outside your team, another concern often arises: the names behind every comment and edit are embedded directly within the document.
There are times when anonymity is important. Whether you’re simplifying client reviews, preparing materials for a broader audience, or just removing unnecessary distractions, Word doesn’t make it easy to cleanly strip names without accepting all changes or removing valuable context.
To solve that problem, I created AnonymizeReviewer, a lightweight Python script that lets you anonymize Word files while preserving everything that matters.
Why the Built-In Options in Word Aren’t Enough
Microsoft Word includes a feature called “Remove personal information on save.” While helpful in theory, it behaves inconsistently depending on your version and settings. It also doesn’t catch all the places author names can appear.
Here are some of the areas where Word stores author names:
- Tracked changes, including insertions, deletions, and formatting edits
- Comment threads and replies
- Style definitions, which retain metadata on who last modified them
- Document properties, like Author and Last Modified By
- Headers, footers, and sometimes embedded fields
Scrubbing each of these by hand is time-consuming and error-prone, and it is easy to miss one. That’s why I built a more dependable solution.
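To see where names actually live in a given file, you can open the .docx as a zip archive and search its XML parts. The following is a minimal sketch, not the code AnonymizeReviewer uses: the function name `find_author_names` and the pattern list are my own assumptions for illustration, and the patterns cover only the `w:author` attribute plus the `dc:creator` and `cp:lastModifiedBy` properties, so other locations would need additional patterns.

```python
import re
import zipfile
from collections import Counter

# Patterns that commonly carry a person's name in a .docx package.
# Illustrative only; this list is an assumption and is not exhaustive.
AUTHOR_PATTERNS = [
    re.compile(r'w:author="([^"]*)"'),                # tracked changes and comments
    re.compile(r"<dc:creator>([^<]*)</dc:creator>"),  # docProps/core.xml: Author
    re.compile(r"<cp:lastModifiedBy>([^<]*)</cp:lastModifiedBy>"),  # Last Modified By
]


def find_author_names(docx_path):
    """Return a Counter mapping (part name, author name) to occurrence count."""
    found = Counter()
    with zipfile.ZipFile(docx_path) as archive:
        for part in archive.namelist():
            if not part.endswith(".xml"):
                continue  # skip images and other binary parts
            text = archive.read(part).decode("utf-8", errors="replace")
            for pattern in AUTHOR_PATTERNS:
                for name in pattern.findall(text):
                    found[(part, name)] += 1
    return found
```

Iterating over `find_author_names("Contract.docx").items()` gives a quick per-part report of which names appear and how often, which is useful for checking a file before and after anonymizing it.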
What AnonymizeReviewer Does
AnonymizeReviewer reads the .docx file as a zip archive, scans its internal XML files, and replaces all instances of the original author name with a placeholder or alternate name of your choosing. It does this without altering the tracked changes or comments themselves.
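The general technique can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the script's actual implementation: `replace_author` is a hypothetical helper, and it does a plain byte-level substitution in every `.xml` and `.rels` part, which means it would also change the name if it appears in the body text.

```python
import zipfile
from xml.sax.saxutils import escape


def replace_author(src_path, dst_path, old_name, new_name):
    """Copy src_path to dst_path, swapping old_name for new_name in XML parts.

    Returns the number of replacements made. Hypothetical helper for
    illustration; a blunt substitution that does not parse the XML.
    """
    # Names are stored XML-escaped inside the parts (for example "&" as "&amp;").
    old_bytes = escape(old_name).encode("utf-8")
    new_bytes = escape(new_name).encode("utf-8")
    replaced = 0
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.endswith((".xml", ".rels")):
                replaced += data.count(old_bytes)
                data = data.replace(old_bytes, new_bytes)
            # Reusing the original ZipInfo keeps each part's name and timestamp.
            dst.writestr(info, data)
    return replaced
```

Because only the name strings change, the surrounding revision and comment markup is written back untouched, which is why the edits themselves survive. Binary parts such as images are copied through as-is.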
Key features include:
- Replaces author names in tracked changes and comment metadata
- Strips identifying information from document properties
- Scans headers, footers, styles, and related files
- Supports single-file processing or entire folders
- Offers both an interactive prompt and command-line argument support
- Runs entirely offline and is open source under the MIT License
How to Get Started
First, clone the repository and install the Python dependency:
git clone https://github.com/gregvarghese/AnonymizeReviewer.git
cd AnonymizeReviewer
pip install -r requirements.txt
To run in interactive mode with file picker prompts:
python anonymize_docx.py
To run with arguments:
python anonymize_docx.py \
  --input "/Users/you/Documents/Contract.docx" \
  --output "/Users/you/Documents/Contract - Anonymized.docx" \
  --old-name "John Doe" \
  --new-name "Reviewer"
To batch-process all .docx files in a folder:
python anonymize_docx.py \
  --folder "/Users/you/Documents/Contracts" \
  --old-name "John Doe" \
  --new-name "Reviewer"
Files whose names already end in “ - Anonymized.docx” are skipped automatically, so rerunning a batch does not process its own output.
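The skip rule is simple to picture. Here is a rough sketch of how a folder scan with that rule could look; `plan_batch` and `SUFFIX` are hypothetical names, the suffix is assumed to be the hyphenated form used in the --output example, and the extra check for Word's `~$` lock files is my own addition rather than something the script is known to do.

```python
from pathlib import Path

# Assumed to match the naming in the --output example; hypothetical constant.
SUFFIX = " - Anonymized"


def plan_batch(folder):
    """Return (source, destination) path pairs for .docx files still to process."""
    jobs = []
    for src in sorted(Path(folder).glob("*.docx")):
        if src.stem.endswith(SUFFIX):
            continue  # already an anonymized copy from an earlier run
        if src.name.startswith("~$"):
            continue  # temporary lock file Word creates while a document is open
        jobs.append((src, src.with_name(src.stem + SUFFIX + src.suffix)))
    return jobs
```

Building the list of jobs before touching any file also makes it easy to print a preview of what a batch run would do.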
Why I Wrote It
This tool originated from a real-world use case. I needed a way to hand off tracked-change documents for client review without exposing internal names. Word did not provide a clean solution, and most online options were either unreliable or required uploading sensitive files.
By targeting the actual structure of a .docx file and addressing all areas where names can be stored, AnonymizeReviewer fills that gap, giving you control over what you share.
Open Source and Available Now
The project is available on GitHub at https://github.com/gregvarghese/AnonymizeReviewer
Pull requests and feedback are welcome. If you have ideas for additional features, such as redacting specific comment content or cleaning embedded metadata beyond names, I would love to hear them.
License
This project is licensed under the MIT License.
The software is provided as is, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and noninfringement.