Docx Proofreader with OpenAI

This repository provides a Python script for proofreading .docx documents using OpenAI models. The script checks for spelling, grammar, and agreement errors in any language, splits large texts into manageable chunks, and saves a detailed error report in plain text format.

Features

- Proofreads .docx files for spelling, typographical, grammar, and agreement errors using OpenAI models.
- Handles large documents by automatically splitting them into chunks that fit the model's context limit.
- Lists and explains each error it finds in the output report.
- Lets you choose the model with the --model command-line flag.
- Records the model name, date, and source file in the report header, and includes the model name and a timestamp in the report filename.
- Retries API calls on rate limits and network issues.
- Takes a .docx input file and produces a full report as a .txt file.
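
The chunk-splitting feature can be pictured with a minimal sketch. This is not the script's actual code: the function name `split_into_chunks`, the greedy paragraph packing, and the whitespace word count used as a stand-in tokenizer are all assumptions made for illustration.

```python
from typing import Callable, List


def approx_token_count(text: str) -> int:
    """Stand-in tokenizer (assumption): counts whitespace-separated words."""
    return len(text.split())


def split_into_chunks(
    paragraphs: List[str],
    max_tokens: int,
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Greedily pack paragraphs into chunks of at most max_tokens each.

    A single paragraph larger than max_tokens is kept whole in its own
    chunk rather than being cut mid-sentence.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for para in paragraphs:
        n = count_tokens(para)
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and current_tokens + n > max_tokens:
            chunks.append("\n".join(current))
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += n
    if current:
        chunks.append("\n".join(current))
    return chunks


paragraphs = ["one two three", "four five", "six seven eight nine", "ten"]
chunks = split_into_chunks(paragraphs, max_tokens=5)
print(chunks)
```

For real token counts, a tiktoken-based counter could be passed in instead, for example `count_tokens=lambda s: len(enc.encode(s))` with `enc = tiktoken.encoding_for_model(model_name)`. Which encoding the script itself uses is not stated here.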

Requirements

Python 3.7 or higher
Required libraries:
- openai
- python-docx
- tiktoken
A valid OpenAI API key.

To install the requirements, run:

pip install openai python-docx tiktoken

Setup

Configuration File

Create a config.json file in the same directory as the script and add your OpenAI API key:

{
    "OPENAI_API_KEY": "your-api-key-here"
}
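
As a rough sketch of how such a file might be read, the snippet below loads the key with the standard library. The function name `load_api_key` and the check that rejects the placeholder value are assumptions for illustration, not necessarily what the script does.

```python
import json
from pathlib import Path


def load_api_key(config_path: str = "config.json") -> str:
    """Read OPENAI_API_KEY from a JSON config file and return it."""
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"Config file not found: {path}")
    with path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    key = config.get("OPENAI_API_KEY", "")
    # Assumption: treat an empty or untouched placeholder value as an error.
    if not key or key == "your-api-key-here":
        raise ValueError("OPENAI_API_KEY is missing or still set to the placeholder")
    return key
```

Since config.json holds a secret, it is usually worth keeping it out of version control.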

Usage

Place your input .docx file (e.g., input.docx) in the script directory.
Run the script from the command line. Basic usage:
```
python proofreader.py
```
To use a different model or input file, pass the --model and --input flags:
```
python proofreader.py --model o3 --input myfile.docx
```
The script will create an output file with a name like:
```
proofreading_report_gpt-4o_20250702_1334.txt
```
The report contains the detailed error list along with the model name, the date, and the name of the original file.
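
A filename of this shape could be produced as in the sketch below. It assumes the trailing digits are a date followed by hours and minutes (YYYYMMDD_HHMM); the helper name `build_report_name` is hypothetical.

```python
from datetime import datetime
from typing import Optional


def build_report_name(model: str, when: Optional[datetime] = None) -> str:
    """Build a report filename from the model name and a timestamp."""
    when = when or datetime.now()
    # Assumed layout: proofreading_report_<model>_<YYYYMMDD>_<HHMM>.txt
    return f"proofreading_report_{model}_{when.strftime('%Y%m%d_%H%M')}.txt"


print(build_report_name("gpt-4o", datetime(2025, 7, 2, 13, 34)))
# prints: proofreading_report_gpt-4o_20250702_1334.txt
```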

Output

The report is saved as a plain text .txt file.
The file header shows the date, model name, and source file.
Each section corresponds to a chunk of your document, with errors listed and explained.
If no errors are found, the report will state so.
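
To make the report structure concrete, here is a small sketch that assembles a header and one section per chunk. The exact labels, separators, and the "No errors found." wording are assumptions; the real report layout may differ.

```python
from datetime import datetime
from typing import List


def format_report(
    model: str,
    source_file: str,
    chunk_findings: List[str],
    when: datetime,
) -> str:
    """Assemble a plain-text report: a header, then one section per chunk."""
    lines = [
        f"Date: {when.strftime('%Y-%m-%d %H:%M')}",
        f"Model: {model}",
        f"Source file: {source_file}",
        "",
    ]
    for index, findings in enumerate(chunk_findings, start=1):
        lines.append(f"--- Chunk {index} ---")
        # An empty result for a chunk is reported explicitly.
        lines.append(findings.strip() or "No errors found.")
        lines.append("")
    return "\n".join(lines)


report = format_report(
    "gpt-4o",
    "input.docx",
    ["'teh' -> 'the' (typo)", ""],
    datetime(2025, 7, 2, 13, 34),
)
print(report)
```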

Customization

To change which model is used by default, edit the DEFAULT_MODEL variable at the top of the script.
To adjust chunk size or API retry behavior, modify the CHUNK_TOKENS or RETRIES variables.
You can further refine the proofreading prompt for other languages or criteria in the SYSTEM_PROMPT variable in the script.
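
The role of a RETRIES setting can be illustrated with a generic retry helper that uses exponential backoff. This is a sketch under assumptions: the value of RETRIES, the helper name `call_with_retries`, the backoff schedule, and the built-in exception types used as defaults are illustrative and not taken from the script.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

RETRIES = 5  # variable name from this README; the value is an assumption


def call_with_retries(
    fn: Callable[[], T],
    retries: int = RETRIES,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[Exception], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying with exponential backoff on the given exceptions."""
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            # Wait about 1s, 2s, 4s, ... plus a small random jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
    raise ValueError("retries must be at least 1")
```

With the 1.x openai library, one might pass `retry_on=(openai.RateLimitError, openai.APIConnectionError)` and wrap the API call in a lambda. The `sleep` parameter exists only so the helper can be exercised without real waiting.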

Example Command

python proofreader.py --model o3 --input document_to_check.docx --output my_report.txt
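
For orientation, the three flags in this command could be declared with argparse roughly as follows. The default values are assumptions: "gpt-4o" is inferred from the sample report filename, "input.docx" from the Usage section, and a missing --output is assumed to mean an auto-generated report name.

```python
import argparse
from typing import List, Optional

DEFAULT_MODEL = "gpt-4o"  # assumption, inferred from the sample report filename


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the --model, --input, and --output command-line flags."""
    parser = argparse.ArgumentParser(
        description="Proofread a .docx file with an OpenAI model."
    )
    parser.add_argument("--model", default=DEFAULT_MODEL,
                        help="OpenAI model to use")
    parser.add_argument("--input", default="input.docx",
                        help="path to the .docx file to check")
    parser.add_argument("--output", default=None,
                        help="report path (assumed: auto-generated if omitted)")
    return parser.parse_args(argv)


args = parse_args(["--model", "o3", "--input", "document_to_check.docx",
                   "--output", "my_report.txt"])
print(args.model, args.input, args.output)
# prints: o3 document_to_check.docx my_report.txt
```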
