An AI-powered document management system with a RESTful API and a web frontend for uploading, analyzing, and managing documents.
- 🔐 Secure Authentication - JWT-based user authentication
- 📄 Multi-format Support - PDF, DOCX, CSV, Excel, JSON, TXT
- 🤖 AI-Powered Processing - OpenAI integration for document analysis
- 📦 Smart Fragmentation - Automatic chunking for large documents
- ☁️ Cloud Storage - S3 integration for scalable file storage
- 🔄 Background Processing - Asynchronous job queue with Redis
- 📊 Admin Dashboard - Web interface for system management
- 🏷️ Template System - Tag and organize documents by templates
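The "Smart Fragmentation" feature above splits large documents into smaller pieces before processing. The project's actual chunking strategy is not described here, so the following is only a minimal sketch of one common approach: fixed-size character windows with a small overlap. The function name `chunk_text` and the parameters `max_chars` and `overlap` are hypothetical and do not come from the codebase.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into windows of at most max_chars characters.

    Consecutive chunks share `overlap` characters so that content cut at a
    boundary still appears intact in one of the two neighbouring chunks.
    Hypothetical helper for illustration; not the project's implementation.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < max_chars")

    chunks: List[str] = []
    step = max_chars - overlap  # how far the window advances each time
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the current window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", max_chars=4, overlap=1):
        print(piece)
```

A real pipeline would more likely split on token counts or on paragraph boundaries; character windows are used here only because they need no extra dependencies.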
IMPORTANT: Always create files in the correct directory:
- /logs/ - ALL log files (*.log)
- /data/ - Database files (*.db)
- /docs/ - Documentation files (*.md, *.html)
- /tests/ - Test files (test_*.py)
- /uploads/ - User uploaded documents
- /app/ - Application source code
- /alembic/ - Database migrations
rapidoc_021891240361152688586/
├── app/ # Main application code
│ ├── core/ # Core configuration
│ ├── models/ # Database models
│ ├── routes/ # API endpoints
│ ├── schemas/ # Request/response schemas
│ ├── services/ # Business logic
│ ├── utils/ # Utility functions
│ ├── frontend/ # Web interface
│ │ ├── templates/ # HTML templates
│ │ └── static/ # CSS/JS files
│ ├── scripts/ # Utility scripts
│ └── workers/ # Background workers
├── alembic/ # Database migrations
│ └── versions/ # Migration files
├── data/ # Database files
├── docs/ # Documentation
├── logs/ # Log files
├── tests/ # Test files
├── uploads/ # User uploads
└── venv/ # Virtual environment
- Python 3.8 or higher
- PostgreSQL (for production)
- Redis (for background jobs)
- AWS S3 account (optional, for cloud storage)
For detailed setup instructions, see our Development Setup Guide.
- Setup Environment
  python3 -m venv venv
  source venv/bin/activate
  pip install -r requirements.txt
- Run Development Server
  ./run_dev.sh # API only
  ./run_dev_with_worker.sh # API + Worker (recommended)
- Access Application
  - Web Interface: http://localhost:8000/
  - API Docs: http://localhost:8000/docs
- requirements.txt - Python dependencies
- sample.env - Environment variables template
- run_dev.sh - Development server script
- run_dev_with_worker.sh - Server + worker script
- run_worker.sh - Worker process script
- Procfile - Heroku deployment
- alembic.ini - Database migration config
- Copy sample.env to .env
- Set required environment variables:
  - DATABASE_URL - Database connection
  - JWT_SECRET_KEY - Authentication secret
  - OPENAI_API_KEY - OpenAI API key
  - USE_S3_STORAGE - Enable S3 storage (true/false)
  - S3_BUCKET_NAME - S3 bucket name
  - AWS_ACCESS_KEY_ID - AWS access key
  - AWS_SECRET_ACCESS_KEY - AWS secret key
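As a rough illustration of how these variables might be validated at startup, here is a minimal sketch using only the standard library. The `Settings` class and `load_settings` function are hypothetical names, not the project's actual configuration module. The sketch also assumes that the three S3/AWS variables are only needed when USE_S3_STORAGE is "true", which this README implies (S3 is listed as optional) but does not state outright.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variable names come from this README; which ones are strictly required
# is an assumption made for this sketch.
ALWAYS_REQUIRED = ("DATABASE_URL", "JWT_SECRET_KEY", "OPENAI_API_KEY")
S3_REQUIRED = ("S3_BUCKET_NAME", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


@dataclass(frozen=True)
class Settings:
    database_url: str
    jwt_secret_key: str
    openai_api_key: str
    use_s3_storage: bool
    s3_bucket_name: Optional[str] = None
    aws_access_key_id: Optional[str] = None
    aws_secret_access_key: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env and fail early, listing every missing variable."""
    use_s3 = env.get("USE_S3_STORAGE", "false").strip().lower() == "true"
    required = ALWAYS_REQUIRED + (S3_REQUIRED if use_s3 else ())
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
        openai_api_key=env["OPENAI_API_KEY"],
        use_s3_storage=use_s3,
        s3_bucket_name=env.get("S3_BUCKET_NAME"),
        aws_access_key_id=env.get("AWS_ACCESS_KEY_ID"),
        aws_secret_access_key=env.get("AWS_SECRET_ACCESS_KEY"),
    )
```

Passing the environment as a mapping keeps the function easy to test with a plain dictionary instead of the real process environment.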
Deploy to Heroku:
heroku create your-app-name
heroku addons:create heroku-postgresql:mini
heroku addons:create heroku-redis:hobby-dev
git push heroku main
heroku run python -m alembic upgrade head
heroku ps:scale web=1 worker=1
- User authentication (JWT)
- Document upload/download
- Multiple file formats (PDF, DOCX, CSV, Excel, JSON, TXT)
- Document fragmentation for large files
- Template tagging system
- Background job processing
- S3 storage support
- Admin dashboard
Run tests:
python -m pytest tests/
For detailed documentation, see the /docs/ directory:
- API_DOCUMENTATION.md - API reference
- DEPLOYMENT_GUIDE.md - Deployment instructions
- WORKER_GUIDE.md - Background worker details
- Follow PEP 8 guidelines
- Use type hints where appropriate
- Write docstrings for all functions and classes
- Keep functions small and focused
- Always create log files in /logs/
- Database files go in /data/
- Documentation goes in /docs/
- Test files go in /tests/
- User uploads go in /uploads/
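The file placement rules can also be checked mechanically, for example in a pre-commit hook. The sketch below maps a file name to its expected top-level directory using the patterns listed earlier in this README (*.log, *.db, *.md, *.html, test_*.py). The `PLACEMENT_RULES` table and the `expected_directory` function are hypothetical and not part of the project. Files that match no pattern, such as application source or user uploads, return `None` because their location cannot be inferred from the name alone.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional

# (pattern, directory) pairs taken from the README's directory rules.
# More specific patterns are listed first.
PLACEMENT_RULES = [
    ("test_*.py", "tests"),
    ("*.log", "logs"),
    ("*.db", "data"),
    ("*.md", "docs"),
    ("*.html", "docs"),
]


def expected_directory(path: str) -> Optional[str]:
    """Return the top-level directory a file should live in, or None if no rule applies."""
    name = PurePosixPath(path).name  # ignore any leading directories
    for pattern, directory in PLACEMENT_RULES:
        if fnmatchcase(name, pattern):
            return directory
    return None


if __name__ == "__main__":
    for candidate in ("worker.log", "app.db", "test_upload.py", "main.py"):
        print(candidate, "->", expected_directory(candidate))
```

Note that the *.html rule follows the README literally; HTML templates under app/frontend/templates/ would need an exception in a real check.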
We welcome contributions! Please see our Contributing Guidelines for details.
This project is licensed under the MIT License - see the LICENSE file for details.
For questions or issues:
- Open an issue on GitHub
- Check the documentation for detailed guides
- Review the troubleshooting guide
