A web application that extracts numerical data points from graph images using the OpenAI Vision API. Upload an image of a graph (line plot, scatter plot, etc.) and export the extracted data to CSV and Excel formats.
- Image Upload: Support for PNG, JPG, JPEG, and PDF graph images
- AI-Powered Extraction: Uses OpenAI GPT-4o-mini via OpenRouter to analyze graphs and extract data points
- Accurate Coordinate System: Extracts data using graph coordinates (axes intersection as origin) rather than image pixel positions
- Multiple Output Formats: Downloads data as CSV and Excel files
- Web Interface: Simple drag-and-drop upload interface
- API Endpoint: REST API for programmatic access
- Graph Type Detection: Automatically detects graph types or accepts manual specification
- Axis Labeling: Supports custom X and Y axis labels
- Python 3.8+
- OpenRouter API key (for OpenAI Vision access)
- Flask
- OpenAI Python client
- Pandas
- Other dependencies listed in req.txt
- Clone or download this repository:
    git clone <repository-url>
    cd graph_data_extractor
- Install dependencies:
    pip install -r req.txt
- Create a .env file in the root directory with your API keys:
    OPENROUTER_API_KEY=your_openrouter_api_key_here
    FLASK_SECRET_KEY=your_secret_key_here
- Run the application:
    python app.py
- Open your browser and navigate to http://localhost:5000
- Open the application in your browser
- Upload a graph image by dragging and dropping or clicking to select
- Optionally specify graph type, X-axis label, and Y-axis label
- Click "Extract Data" to process the image
- Download the extracted data as CSV or Excel files
Send a POST request to /api/extract with a JSON payload:
{
"image": "data:image/png;base64,<base64-encoded-image>",
"graph_type": "linear",
"x_label": "Time",
"y_label": "Value"
}
Response format:
{
"graph_type": "linear",
"x_axis": {"label": "Time", "unit": "s"},
"y_axis": {"label": "Value", "unit": "V"},
"data_points": [
{"x": 1.0, "y": 2.5, "uncertainty": 0.1},
{"x": 2.0, "y": 3.2, "uncertainty": 0.1}
],
"summary": {
"total_points": 2,
"x_range": [1.0, 2.0],
"y_range": [2.5, 3.2],
"confidence": 0.95,
"notes": []
}
}
- GET /: Main web interface
- POST /upload: File upload endpoint (used by web interface)
- POST /api/extract: Direct API for data extraction
- GET /download/<filename>: Download extracted data files
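As a minimal sketch of calling POST /api/extract from Python, the snippet below builds the request body in the shape shown above and sends it with the standard library. The helper names `build_payload` and `extract` are hypothetical, and treating the three hint fields as optional is an assumption based on the web interface describing them as optional.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes, mime="image/png", graph_type=None,
                  x_label=None, y_label=None):
    """Build the JSON body for POST /api/extract from raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    payload = {"image": f"data:{mime};base64,{encoded}"}
    # Assumption: the hint fields may be left out when not known.
    if graph_type is not None:
        payload["graph_type"] = graph_type
    if x_label is not None:
        payload["x_label"] = x_label
    if y_label is not None:
        payload["y_label"] = y_label
    return payload


def extract(payload, base_url="http://localhost:5000"):
    """POST the payload to /api/extract and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}/api/extract",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the application running locally, reading a PNG file in binary mode and passing its bytes through `build_payload` and then `extract` should return a dictionary in the response format shown above.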
- Upload Folder: Files are temporarily stored in the uploads/ directory
- Max File Size: 16MB limit per upload
- Supported Formats: PNG, JPG, JPEG, PDF
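The limits above could be checked along the following lines. This is a sketch, not the application's actual code: the constant and function names are hypothetical, and only the values (uploads folder, 16MB, the four extensions) come from the list above.

```python
import os

# Values taken from the configuration list; the names are illustrative.
UPLOAD_FOLDER = "uploads"
MAX_FILE_SIZE = 16 * 1024 * 1024  # 16 MB, in bytes
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "pdf"}


def allowed_file(filename):
    """Return True if the filename ends in one of the supported extensions."""
    _, ext = os.path.splitext(filename)
    return ext.lower().lstrip(".") in ALLOWED_EXTENSIONS


def within_size_limit(num_bytes):
    """Return True if a non-empty upload fits within the size limit."""
    return 0 < num_bytes <= MAX_FILE_SIZE
```

In a Flask application, a size limit like this is usually enforced by setting `app.config["MAX_CONTENT_LENGTH"]`, which makes Flask reject larger request bodies with a 413 response. Whether this project does so is an assumption.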
- OPENROUTER_API_KEY: Required for OpenAI Vision API access
- FLASK_SECRET_KEY: Secret key for Flask sessions (optional, defaults to dev key)
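A sketch of how these two variables might be read at startup, with the required one failing fast and the optional one falling back to a default. The function name `load_settings` and the fallback string `"dev-secret-key"` are placeholders; the application's real default value is not documented here.

```python
import os


def load_settings(environ=os.environ):
    """Read the API key and session secret from an environment mapping."""
    api_key = environ.get("OPENROUTER_API_KEY")
    if not api_key:
        # The key is required, so stop early with a clear message.
        raise RuntimeError("OPENROUTER_API_KEY is required")
    # Placeholder default; the actual dev key used by the app is unknown.
    secret_key = environ.get("FLASK_SECRET_KEY", "dev-secret-key")
    return {"api_key": api_key, "secret_key": secret_key}
```

Values in a .env file are not read by `os.environ` on their own; they are commonly loaded first with a package such as python-dotenv, though whether this project uses it is an assumption.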
- Image Processing: Uploaded images are encoded and sent to the OpenAI Vision API
- AI Analysis: The AI analyzes the graph, identifies axes, and extracts data points using graph coordinates
- Data Extraction: Points are measured relative to the graph origin (axes intersection)
- Output Generation: Extracted data is formatted into CSV and Excel files for download
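To illustrate the output step, the sketch below turns an extraction result (in the response format from the API section) into CSV text. It uses the standard-library `csv` module to stay dependency-free rather than Pandas, which the project lists as a dependency; the function name and the column layout (axis labels plus an uncertainty column) are assumptions.

```python
import csv
import io


def data_points_to_csv(result):
    """Render the data_points of an extraction result as CSV text."""
    x_name = result["x_axis"]["label"]
    y_name = result["y_axis"]["label"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: axis labels, then the per-point uncertainty.
    writer.writerow([x_name, y_name, "uncertainty"])
    for point in result["data_points"]:
        writer.writerow([point["x"], point["y"], point.get("uncertainty", "")])
    return buffer.getvalue()
```

With Pandas, a comparable result could be had by building `pandas.DataFrame(result["data_points"])` and calling `to_csv` or `to_excel` on it; `to_excel` additionally needs an Excel engine such as openpyxl installed.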
- API Key Issues: Ensure your OpenRouter API key is valid and has sufficient credits
- Large Files: Reduce image size if uploads fail due to size limits
- Inaccurate Extraction: Try specifying graph type and axis labels manually for better results
- Origin Detection: Coordinates are measured from the graph origin (the axes intersection), not from the image corners
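The difference between image pixel positions and graph coordinates can be illustrated with a small linear mapping. This is only an illustration: the application delegates extraction to the vision model, and nothing here is taken from its code. The function name is hypothetical, and the sketch assumes linear (not logarithmic) axes with two known reference ticks per axis.

```python
def make_axis_mapper(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel position to a data value on one linear axis.

    (pixel_a, value_a) and (pixel_b, value_b) are two reference ticks,
    for example the origin and the last labelled tick.
    """
    if pixel_a == pixel_b:
        raise ValueError("reference pixels must differ")
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        return value_a + (pixel - pixel_a) * scale

    return to_value


# X axis: pixel column 100 is x = 0, pixel column 500 is x = 10.
to_x = make_axis_mapper(100, 0.0, 500, 10.0)
# Y axis: image rows grow downward, so the origin row (400) is larger
# than the row of the top tick (50, y = 7).
to_y = make_axis_mapper(400, 0.0, 50, 7.0)
```

A point drawn at pixel (300, 225) then maps to roughly (5.0, 3.5) in graph coordinates, whereas reading the pixel position directly would give values tied to the image corner and resolution.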
This project is open source. Please check the license file for details.
Contributions are welcome! Please submit issues and pull requests on the project repository.