A TypeScript-based web scraper that extracts supplement information from Examine.com. This tool uses Puppeteer for web scraping and provides structured data about various supplements, including their benefits, side effects, and scientific names. The output is printed to the console as JSON and contains a limited subset of each supplement page's data.
- Web scraping of supplement information from Examine.com
- Input validation using Zod
- Structured data output with TypeScript interfaces
- Headless browser automation using Puppeteer
- Error handling and logging
The main and nutrish-scraper branches implement the minimum requirements.
The nutrish-ai-chatbot branch adds initial steps toward the preferred requirements: an AI chatbot application, in particular exposing the scraper as an API via Vercel. This part is not fully completed.
- Node.js (LTS version recommended)
- npm or yarn package manager
- TypeScript knowledge for development
- Clone the repository:

```bash
git clone https://github.com/FCimendere/nutrish-case-implementation.git
cd nutrish-case-implementation
```

- Install dependencies:

```bash
npm install
```
You can run the scraper using the following command:

```bash
npm run scrape
```

Or directly with ts-node:

```bash
npm run start
```
The scraper returns data in the following format:

```typescript
interface SupplementData {
  name: string;
  scientificName: string;
  overview: string;
  benefits: string[];
  sideEffects: string[];
  otherNames: string[];
}
```
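In the project itself, Zod validates data like this at runtime. As a rough illustration of what such a check does, here is a hand-rolled type guard for the same shape; the `isSupplementData` helper is hypothetical and not part of the repository:

```typescript
// Illustrative runtime check mirroring the SupplementData interface.
// The real project uses Zod for this; this is only a sketch.
interface SupplementData {
  name: string;
  scientificName: string;
  overview: string;
  benefits: string[];
  sideEffects: string[];
  otherNames: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

function isSupplementData(value: unknown): value is SupplementData {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === "string" &&
    typeof v.scientificName === "string" &&
    typeof v.overview === "string" &&
    isStringArray(v.benefits) &&
    isStringArray(v.sideEffects) &&
    isStringArray(v.otherNames)
  );
}
```

A Zod schema expresses the same constraints declaratively and additionally produces detailed error messages on failure.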
├── src/
│ ├── interfaces/
│ │ └── supplement.interface.ts
│ ├── services/
│ │ └── scraper.service.ts
│ └── index.ts
├── package.json
└── tsconfig.json
- Built with TypeScript for type safety
- Uses Puppeteer for browser automation
- Implements Zod for input validation
- Configured with strict TypeScript settings
- Includes proper error handling and logging
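The project's actual error handling lives in `scraper.service.ts`. As a hedged sketch of the general pattern (retry with logging) and not the repository's implementation, a wrapper around a flaky async task might look like this:

```typescript
// Illustrative retry helper with logging; not taken from the repository.
// Retries a failing async task up to `attempts` times, waiting `delayMs`
// between attempts, and rethrows the last error if all attempts fail.
async function withRetry<T>(
  task: () => Promise<T>,
  attempts = 3,
  delayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      console.error(`Attempt ${i}/${attempts} failed:`, err);
      if (i < attempts) await new Promise((r) => setTimeout(r, delayMs));
    }
  }
  throw lastError;
}
```

Wrapping each Puppeteer page visit in such a helper makes transient network failures survivable without crashing the whole scrape.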
To build the project:

```bash
npm run build
```

The compiled JavaScript files will be output to the `dist` directory.
- puppeteer: Web scraping and browser automation
- zod: Runtime type checking and validation
- typescript: Static type checking
- ts-node: TypeScript execution environment
- This scraper is designed for educational purposes
- Please respect Examine.com's terms of service and rate limiting
- Ensure proper error handling when using in production
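One simple way to respect rate limits is to space out page visits. The throttle helper below is a hypothetical sketch, not code from the repository; it guarantees a minimum interval between consecutive requests:

```typescript
// Hypothetical throttle: ensures at least `minIntervalMs` elapse
// between consecutive calls, sleeping when called too soon.
function createThrottle(minIntervalMs: number) {
  let last = 0;
  return async function throttle(): Promise<void> {
    const now = Date.now();
    const wait = Math.max(0, last + minIntervalMs - now);
    if (wait > 0) await new Promise((r) => setTimeout(r, wait));
    last = Date.now();
  };
}
```

Calling `await throttle()` before each `page.goto(...)` keeps the scraper from hammering the site.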
The project runs the Puppeteer/TypeScript scraper as a backend service. For better usability, the backend can be turned into an API endpoint deployed on Vercel as a backend project. The user can then append the word they want to search to the Vercel URL and perform a query. The Vercel AI chatbot starter project at https://github.com/vercel/ai-chatbot can be run locally or deployed on Vercel, and the API response can then be sent to the chatbot.
- Convert to Vercel Serverless Functions
- CORS Configuration
- API Authentication
- Vercel Configuration
- Required Dependencies
- Frontend Integration
ISC
[Fulya Cimendere]

Note: This project is not affiliated with Examine.com and is for educational purposes only.