Add flight number, airport codes, and connecting airports extraction #67
- Add flight_number field to Flight dataclass in schema.py
- Implement flight number extraction from data-travelimpactmodelwebsiteurl attribute
- Support extraction for multiple airlines including Delta, JetBlue, and Frontier
- Add debug output for Delta and Frontier flights to help with development
- Update test scripts to display flight numbers in output
- Fix extraction logic to search within individual flight items instead of entire document
- Add departure_airport and arrival_airport fields to Flight dataclass
- Extract airport codes from data-travelimpactmodelwebsiteurl attribute
- Support extraction for all airlines (Delta, JetBlue, American, Frontier, etc.)
- Update test script to display airport codes in output
- Airport codes are extracted from URL patterns like 'itinerary=JFK-LAX-F9-2503-20250801'
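A minimal sketch of how a single-segment itinerary string like the one above could be parsed. It assumes the segment layout is ORIGIN-DEST-AIRLINE-NUMBER-DATE, as the example suggests. The names `parse_itinerary` and `Segment`, and the "F9 2503" flight number format, are hypothetical and are not taken from the PR.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout of one segment: ORIGIN-DEST-AIRLINE-NUMBER-YYYYMMDD,
# based only on the example 'itinerary=JFK-LAX-F9-2503-20250801'.
_SEGMENT_RE = re.compile(
    r"itinerary=([A-Z]{3})-([A-Z]{3})-([A-Z0-9]{2})-(\d{1,4})-(\d{8})"
)


class Segment(NamedTuple):
    """Hypothetical container for the values pulled from one segment."""
    departure_airport: str
    arrival_airport: str
    flight_number: str


def parse_itinerary(url: str) -> Optional[Segment]:
    """Return the first segment found in the URL, or None if none matches."""
    match = _SEGMENT_RE.search(url)
    if match is None:
        return None
    origin, dest, airline, number, _date = match.groups()
    # Joining airline code and number with a space is a formatting assumption.
    return Segment(origin, dest, f"{airline} {number}")


segment = parse_itinerary("itinerary=JFK-LAX-F9-2503-20250801")
```

Returning `None` for a non-matching string lets the caller leave the new fields empty instead of failing the whole parse.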
What's the point of |
Manouchehri
left a comment
TypeError: Flight.__init__() got an unexpected keyword argument 'connecting_airports'
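This is the error a Python dataclass raises when its constructor receives a keyword that is not declared as a field. The sketch below reproduces it with two stand-in classes. `FlightWithoutField` and `FlightWithField` are hypothetical and carry only a `name` field for illustration; they are not the project's actual `Flight` definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FlightWithoutField:
    """Stand-in for a Flight dataclass that lacks connecting_airports."""
    name: str


@dataclass
class FlightWithField:
    """Stand-in with the field declared; defaults to an empty list."""
    name: str
    connecting_airports: List[str] = field(default_factory=list)


error_message = ""
try:
    # The parser passes a keyword the dataclass does not declare.
    FlightWithoutField(name="Frontier", connecting_airports=["DEN"])
except TypeError as exc:
    error_message = str(exc)

# With the field declared, the same call succeeds.
flight = FlightWithField(name="Frontier", connecting_airports=["DEN"])
```

Using `default_factory=list` keeps the field optional for callers that do not pass it, without sharing one mutable list between instances.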
Would be nice if this could get the connecting flight numbers too. :)
I fixed this in #68.
Thanks! The airport codes may not be the same because some searches span multiple airports (e.g. NYC covers several, so it's helpful to know which specific airport a particular flight uses).
This comment was marked as outdated.
Doesn't |
- …field to Flight dataclass
- Improve error handling to show relevant HTML parts instead of full page
- Fix connecting airports extraction logic for multi-segment flights
Submitting this as a PR in case others find it useful. Note that this has been vibecoded with a fair amount of testing and is working well, but it may have unanticipated bugs. Happy to keep updating it if it's helpful.
Summary
Enhanced the flight data extraction capabilities to include flight numbers, departure/arrival airport codes, and connecting airports for multi-segment flights.
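As a rough illustration of how the three airport values relate for a multi-segment flight: given the ordered list of airports a flight touches, the departure is the first, the arrival is the last, and the connecting airports are everything in between. The helper name `split_airports` is hypothetical and not part of this PR.

```python
from typing import List, Tuple


def split_airports(airports: List[str]) -> Tuple[str, str, List[str]]:
    """Split an ordered airport list into (departure, arrival, connections)."""
    if len(airports) < 2:
        raise ValueError("an itinerary needs at least two airports")
    # First airport is the departure, last is the arrival,
    # and any airports in between are connections.
    return airports[0], airports[-1], airports[1:-1]


nonstop = split_airports(["JFK", "LAX"])
one_stop = split_airports(["JFK", "DEN", "LAX"])
```

A nonstop flight yields an empty connections list, so callers do not need a separate code path for it.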
Changes Made
New Fields Added to Flight Dataclass
- flight_number: Extracted from itinerary URL data attributes
- departure_airport: First airport in the itinerary
- arrival_airport: Last airport in the itinerary
- connecting_airports: List of intermediate airports for connecting flights
Enhanced HTML Parsing Logic
- data-travelimpactmodelwebsiteurl attribute