NBA Career Opportunities Scraper

An automated tool that monitors all NBA team career pages for job opportunities matching your specified keywords and sends email notifications with the results.

🏀 Features

Smart Scraping: Handles different website structures and pagination across team sites
Keyword Filtering: Filters jobs based on customizable keywords/phrases
Email Notifications: Sends formatted email reports with job details and application links
Duplicate Detection: Avoids sending duplicate job notifications
Logging: Comprehensive logging for monitoring and debugging
Periodic Execution: Can run as a one-time check or on a schedule

📁 Project Structure

nba-career-scraper/
├── src/
│   ├── main.py            # Entry point and orchestration
│   ├── scraper.py         # Core scraping logic for NBA career pages
│   ├── email_sender.py    # Email composition and sending
│   ├── config.py          # Configuration management
│   └── utils.py           # Utility functions and helpers
├── data/
│   ├── team_urls.json     # Auto-discovered NBA team career URLs
│   └── keywords.json      # Job keywords to search for
├── logs/                  # Application logs
├── tests/                 # Unit tests
├── .env.example          # Environment configuration template
├── requirements.txt      # Python dependencies
├── run.sh               # Convenient run script
└── README.md

🚀 Quick Start

1. Installation

# Clone or download the project
cd nba-career-scraper

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

2. Configuration

# Copy the environment template
cp .env.example .env

# Edit .env with your email credentials
# For Gmail, use an App Password: https://support.google.com/accounts/answer/185833

Example .env configuration:

EMAIL_SENDER=[email protected]
EMAIL_PASSWORD=your_app_password
EMAIL_RECIPIENT=[email protected]
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587

3. Customize Keywords

Edit data/keywords.json to specify the job types you're interested (or not interested) in:

{
  "keywords": [
    "analytics",
    "data science",
    "marketing",
    "coaching",
    "scouting",
    "operations",
    "management",
    "player development",
    "sports performance",
    "communications"
  ],
  "exclude": ["finance", "PR", "human resources"]
}

4. Run the Scraper

# One-time check
./run.sh

# Or run directly with Python
python src/main.py

# Periodic checking (every 24 hours)
./run.sh --periodic 24

# Or with custom interval (every 12 hours)
./run.sh --periodic 12

🔧 How It Works

Team Links: Stored in data/team_urls.json
Content Scraping: For each team, it navigates to their career page and extracts job listings
Smart Parsing: Uses multiple strategies to handle different website structures and find job information
Keyword Matching: Filters jobs based on title and description keyword matches
Email Reports: Formats and sends email notifications with matched jobs
Logging: Records all activities for monitoring and troubleshooting

📧 Email Format

The scraper sends well-formatted email reports including:

Total number of jobs found
Jobs grouped by team
For each job:
- Job title
- Direct application link
- Timestamp when scraped

Feel free to fork and scrape more info!

⚙️ Configuration Options

Environment Variables

Variable	Description	Default
`EMAIL_SENDER`	Your email address	Required
`EMAIL_PASSWORD`	Email password/app password	Required
`EMAIL_RECIPIENT`	Recipient email address	Same as sender
`SMTP_SERVER`	SMTP server address	smtp.gmail.com
`SMTP_PORT`	SMTP server port	587
`CHECK_INTERVAL_HOURS`	Hours between checks	24
`MAX_RETRIES`	Max retry attempts	3
`REQUEST_TIMEOUT`	Request timeout in seconds	10

Data Files

data/keywords.json: Customize job search keywords
data/team_urls.json: Team URLs (can be manually edited)

🔍 Monitoring

Logs are automatically created in the logs/ directory with detailed information about:

Scraping progress and results
Email sending status
Errors and warnings
Performance metrics

🛠️ Troubleshooting

Common Issues

Email Authentication Errors
- Use App Passwords for Gmail instead of regular passwords
- Enable 2-factor authentication first
- Check SMTP settings for other email providers
No Jobs Found
- Verify team URLs are still valid
- Check if keywords match available jobs
- Review logs for scraping errors
Website Structure Changes
- NBA team sites may update their layouts
- Check logs for parsing errors
- The scraper uses multiple fallback strategies

Manual Team URL Updates

If a link is stale, you can manually update data/team_urls.json:

{
  "Boston Celtics": "https://www.nba.com/celtics/careers",
  "Los Angeles Lakers": "https://www.teamworkonline.com/basketball-jobs/los-angeles-lakers/los-angeles-lakers-jobs"
}

🔄 Scheduling

Using Cron (Linux/macOS)

# Run daily at 9 AM
0 9 * * * cd /path/to/nba-career-scraper && ./run.sh

# Run twice daily at 9 AM and 6 PM
0 9,18 * * * cd /path/to/nba-career-scraper && ./run.sh

Using Task Scheduler (Windows)

Create a scheduled task that runs:

Program: python
Arguments: src/main.py
Start in: C:\path\to\nba-career-scraper

🧪 Testing

# Run tests
python -m pytest tests/

# Run specific test
python -m pytest tests/test_scraper.py

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

Important Notes

Rate Limiting: The scraper includes delays between requests to be respectful to NBA team websites
Terms of Service: Ensure your usage complies with each team's website terms of service
Data Accuracy: Job listings are scraped from public pages; always verify information on the official site
Privacy: Store email credentials securely and never commit them to version control

License

This project is for educational and personal use. Please respect website terms of service and use responsibly.

Happy job hunting! 🏀

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NBA Career Opportunities Scraper

🏀 Features

📁 Project Structure

🚀 Quick Start

1. Installation

2. Configuration

3. Customize Keywords

4. Run the Scraper

🔧 How It Works

📧 Email Format

⚙️ Configuration Options

Environment Variables

Data Files

🔍 Monitoring

🛠️ Troubleshooting

Common Issues

Manual Team URL Updates

🔄 Scheduling

Using Cron (Linux/macOS)

Using Task Scheduler (Windows)

🧪 Testing

Contributing

Important Notes

License

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
data		data
logs		logs
src		src
tests		tests
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
run.sh		run.sh

jmoletteire/nba-career-scraper

Folders and files

Latest commit

History

Repository files navigation

NBA Career Opportunities Scraper

🏀 Features

📁 Project Structure

🚀 Quick Start

1. Installation

2. Configuration

3. Customize Keywords

4. Run the Scraper

🔧 How It Works

📧 Email Format

⚙️ Configuration Options

Environment Variables

Data Files

🔍 Monitoring

🛠️ Troubleshooting

Common Issues

Manual Team URL Updates

🔄 Scheduling

Using Cron (Linux/macOS)

Using Task Scheduler (Windows)

🧪 Testing

Contributing

Important Notes

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages