An automated tool that monitors all NBA team career pages for job opportunities matching your specified keywords and sends email notifications with the results.
- Smart Scraping: Handles different website structures and pagination across team sites
- Keyword Filtering: Filters jobs based on customizable keywords/phrases
- Email Notifications: Sends formatted email reports with job details and application links
- Duplicate Detection: Avoids sending duplicate job notifications (a minimal sketch of this idea follows the list)
- Logging: Comprehensive logging for monitoring and debugging
- Periodic Execution: Can run as a one-time check or on a schedule
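One way duplicate detection can work is to fingerprint every posting and remember which fingerprints have already been emailed. The sketch below illustrates that idea; the file `data/seen_jobs.json` and the helper names are assumptions for illustration, not necessarily what the project actually does.

```python
# Minimal sketch of duplicate detection: fingerprint each job by title + URL
# and skip anything already recorded. File name and helpers are hypothetical.
import hashlib
import json
from pathlib import Path

SEEN_FILE = Path("data/seen_jobs.json")  # assumed location

def job_key(title: str, url: str) -> str:
    """Stable fingerprint for one job posting."""
    return hashlib.sha256(f"{title}|{url}".encode("utf-8")).hexdigest()

def filter_new(jobs: list) -> list:
    """Return only jobs that have not been reported before, and record them."""
    seen = set(json.loads(SEEN_FILE.read_text())) if SEEN_FILE.exists() else set()
    fresh = [j for j in jobs if job_key(j["title"], j["url"]) not in seen]
    seen.update(job_key(j["title"], j["url"]) for j in fresh)
    SEEN_FILE.parent.mkdir(parents=True, exist_ok=True)
    SEEN_FILE.write_text(json.dumps(sorted(seen)))
    return fresh
```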
nba-career-scraper/
├── src/
│ ├── main.py # Entry point and orchestration
│ ├── scraper.py # Core scraping logic for NBA career pages
│ ├── email_sender.py # Email composition and sending
│ ├── config.py # Configuration management
│ └── utils.py # Utility functions and helpers
├── data/
│ ├── team_urls.json # Auto-discovered NBA team career URLs
│ └── keywords.json # Job keywords to search for
├── logs/ # Application logs
├── tests/ # Unit tests
├── .env.example # Environment configuration template
├── requirements.txt # Python dependencies
├── run.sh # Convenient run script
└── README.md
# Clone or download the project
cd nba-career-scraper
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Copy the environment template
cp .env.example .env
# Edit .env with your email credentials
# For Gmail, use an App Password: https://support.google.com/accounts/answer/185833
Example .env configuration:
EMAIL_SENDER=[email protected]
EMAIL_PASSWORD=your_app_password
EMAIL_RECIPIENT=[email protected]
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587
Edit data/keywords.json to specify the job types you're interested in (or want to exclude):
{
"keywords": [
"analytics",
"data science",
"marketing",
"coaching",
"scouting",
"operations",
"management",
"player development",
"sports performance",
"communications"
],
"exclude": ["finance", "PR", "human resources"]
}
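As a rough illustration of how this file might be applied, the snippet below keeps a job only if it mentions at least one include keyword and none of the exclude keywords. The function names and the job fields (`title`, `description`) are assumptions about the scraper's internals.

```python
# Sketch of keyword filtering against data/keywords.json.
# Matching is naive substring matching, so very short exclude terms
# (e.g. "PR") can over-match; the real project may be stricter.
import json

def load_keywords(path="data/keywords.json"):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def matches(job, cfg):
    """True if the job text hits an include keyword and no exclude keyword."""
    text = f"{job.get('title', '')} {job.get('description', '')}".lower()
    wanted = any(k.lower() in text for k in cfg.get("keywords", []))
    blocked = any(k.lower() in text for k in cfg.get("exclude", []))
    return wanted and not blocked

# Usage: hits = [job for job in scraped_jobs if matches(job, load_keywords())]
```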
# One-time check
./run.sh
# Or run directly with Python
python src/main.py
# Periodic checking (every 24 hours)
./run.sh --periodic 24
# Or with custom interval (every 12 hours)
./run.sh --periodic 12
- Team Links: Team career page URLs are stored in data/team_urls.json
- Content Scraping: For each team, it navigates to their career page and extracts job listings
- Smart Parsing: Uses multiple strategies to handle different website structures and find job information (see the sketch after this list)
- Keyword Matching: Filters jobs based on title and description keyword matches
- Email Reports: Formats and sends email notifications with matched jobs
- Logging: Records all activities for monitoring and troubleshooting
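The "multiple strategies" in the Smart Parsing step can be pictured as trying progressively more generic CSS selectors until one of them yields job links. This is only an illustrative sketch; the selectors here are made up and the real fallback logic may differ.

```python
# Illustrative fallback parsing: try specific selectors first, generic ones last.
# The selector list is invented for this example.
from bs4 import BeautifulSoup

FALLBACK_SELECTORS = [
    "a.job-title",       # dedicated job boards often label links like this
    "li.opening a",      # common careers-page markup
    "a[href*='job']",    # last resort: any link whose URL mentions 'job'
]

def extract_jobs(html):
    soup = BeautifulSoup(html, "html.parser")
    for selector in FALLBACK_SELECTORS:
        links = soup.select(selector)
        if links:
            return [{"title": a.get_text(strip=True), "url": a.get("href")}
                    for a in links]
    return []  # nothing matched; caller should log a parsing warning
```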
The scraper sends well-formatted email reports including:
- Total number of jobs found
- Jobs grouped by team
- For each job:
- Job title
- Direct application link
- Timestamp when scraped
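A report of that shape can be built with the standard library alone. The sketch below groups jobs by team and sends the message over SMTP with STARTTLS using the variables from .env; the job dictionary keys (`team`, `title`, `url`, `scraped_at`) are assumptions about the scraper's internal format, not guaranteed field names.

```python
# Sketch of the email report: plain-text body grouped by team, sent via SMTP.
import os
import smtplib
from email.message import EmailMessage
from itertools import groupby

def send_report(jobs):
    lines = [f"Found {len(jobs)} matching jobs", ""]
    for team, team_jobs in groupby(sorted(jobs, key=lambda j: j["team"]),
                                   key=lambda j: j["team"]):
        lines.append(team)
        lines.extend(f"  - {j['title']} ({j['url']}) scraped {j['scraped_at']}"
                     for j in team_jobs)
        lines.append("")

    msg = EmailMessage()
    msg["Subject"] = "NBA career page matches"
    msg["From"] = os.environ["EMAIL_SENDER"]
    msg["To"] = os.environ.get("EMAIL_RECIPIENT", os.environ["EMAIL_SENDER"])
    msg.set_content("\n".join(lines))

    with smtplib.SMTP(os.environ.get("SMTP_SERVER", "smtp.gmail.com"),
                      int(os.environ.get("SMTP_PORT", "587"))) as smtp:
        smtp.starttls()
        smtp.login(os.environ["EMAIL_SENDER"], os.environ["EMAIL_PASSWORD"])
        smtp.send_message(msg)
```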
Feel free to fork and scrape more info!
| Variable | Description | Default |
|---|---|---|
| `EMAIL_SENDER` | Your email address | Required |
| `EMAIL_PASSWORD` | Email password/app password | Required |
| `EMAIL_RECIPIENT` | Recipient email address | Same as sender |
| `SMTP_SERVER` | SMTP server address | smtp.gmail.com |
| `SMTP_PORT` | SMTP server port | 587 |
| `CHECK_INTERVAL_HOURS` | Hours between checks | 24 |
| `MAX_RETRIES` | Max retry attempts | 3 |
| `REQUEST_TIMEOUT` | Request timeout in seconds | 10 |
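For reference, this is roughly how config.py could resolve those variables with the defaults above. Reading them through python-dotenv is an assumption; plain `os.environ` works the same way.

```python
# Sketch of settings resolution matching the defaults in the table above.
import os

try:
    from dotenv import load_dotenv  # assumed dependency; optional
    load_dotenv()
except ImportError:
    pass

def get_settings():
    sender = os.environ["EMAIL_SENDER"]                  # required
    return {
        "email_sender": sender,
        "email_password": os.environ["EMAIL_PASSWORD"],  # required
        "email_recipient": os.environ.get("EMAIL_RECIPIENT", sender),
        "smtp_server": os.environ.get("SMTP_SERVER", "smtp.gmail.com"),
        "smtp_port": int(os.environ.get("SMTP_PORT", "587")),
        "check_interval_hours": int(os.environ.get("CHECK_INTERVAL_HOURS", "24")),
        "max_retries": int(os.environ.get("MAX_RETRIES", "3")),
        "request_timeout": int(os.environ.get("REQUEST_TIMEOUT", "10")),
    }
```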
- data/keywords.json: Customize job search keywords
- data/team_urls.json: Team URLs (can be manually edited)
Logs are automatically created in the logs/ directory with detailed information about:
- Scraping progress and results
- Email sending status
- Errors and warnings
- Performance metrics
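A typical way to get that output is to log to both a file under logs/ and the console; the exact format and file name used by the project may differ from this sketch.

```python
# Sketch of logging setup: timestamped records to logs/scraper.log and stdout.
import logging
from pathlib import Path

def setup_logging(log_dir="logs"):
    Path(log_dir).mkdir(exist_ok=True)
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        handlers=[
            logging.FileHandler(f"{log_dir}/scraper.log", encoding="utf-8"),
            logging.StreamHandler(),
        ],
    )
    return logging.getLogger("nba-career-scraper")
```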
- Email Authentication Errors
  - Use App Passwords for Gmail instead of regular passwords
  - Enable 2-factor authentication first
  - Check SMTP settings for other email providers
- No Jobs Found
  - Verify team URLs are still valid
  - Check if keywords match available jobs
  - Review logs for scraping errors
- Website Structure Changes
  - NBA team sites may update their layouts
  - Check logs for parsing errors
  - The scraper uses multiple fallback strategies
If a link is stale, you can manually update data/team_urls.json:
{
"Boston Celtics": "https://www.nba.com/celtics/careers",
"Los Angeles Lakers": "https://www.teamworkonline.com/basketball-jobs/los-angeles-lakers/los-angeles-lakers-jobs"
}
# Run daily at 9 AM
0 9 * * * cd /path/to/nba-career-scraper && ./run.sh
# Run twice daily at 9 AM and 6 PM
0 9,18 * * * cd /path/to/nba-career-scraper && ./run.sh
On Windows, create a scheduled task that runs:
Program: python
Arguments: src/main.py
Start in: C:\path\to\nba-career-scraper
# Run tests
python -m pytest tests/
# Run specific test
python -m pytest tests/test_scraper.py
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
- Rate Limiting: The scraper includes delays between requests to be respectful to NBA team websites (see the sketch after this list)
- Terms of Service: Ensure your usage complies with each team's website terms of service
- Data Accuracy: Job listings are scraped from public pages; always verify information on the official site
- Privacy: Store email credentials securely and never commit them to version control
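The rate-limiting and retry behaviour described above can be approximated with a small helper like this one. It reuses the MAX_RETRIES and REQUEST_TIMEOUT settings from the configuration table, while the 2-second delay and the function name are assumed values for illustration only.

```python
# Polite fetching sketch: bounded retries with a pause between attempts.
import os
import time
from typing import Optional

import requests

MAX_RETRIES = int(os.environ.get("MAX_RETRIES", "3"))
REQUEST_TIMEOUT = int(os.environ.get("REQUEST_TIMEOUT", "10"))
DELAY_SECONDS = 2  # assumed pause; also sleep this long between team pages

def polite_get(url: str) -> Optional[requests.Response]:
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            resp = requests.get(url, timeout=REQUEST_TIMEOUT)
            resp.raise_for_status()
            return resp
        except requests.RequestException:
            if attempt == MAX_RETRIES:
                return None                      # give up; caller logs and moves on
            time.sleep(DELAY_SECONDS * attempt)  # simple linear backoff
    return None
```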
This project is for educational and personal use. Please respect website terms of service and use responsibly.
Happy job hunting! 🏀