A personal project that scrapes metadata from GraphicAudio and exposes a lightweight lookup API that can also serve as an Audiobookshelf Custom Metadata Provider.
⚠️ Note: While there is a public instance of this API, it's hosted on a free plan with a very low data cap. If you'd like access, please send me a message and I can provide it.
This project is a personal hobby project.
You may use this project for personal archival or library metadata.
This project is not affiliated with or endorsed by GraphicAudio.
All trademarks, cover images, metadata, and intellectual property belong to their respective owners.
This project contains two components:
| Component | Language | Purpose |
|---|---|---|
| `index.js` | Node.js | Scrapes GraphicAudio product pages and saves results to `results.json` |
| `index.php` | PHP | Serves metadata via HTTP APIs, including the ABS custom metadata provider |
The scraper produces a structured JSON file, `results.json`. The PHP API loads that JSON (cached locally or via APCu) and exposes endpoints such as:
- `/isbn/{isbn}`
- `/asin/{asin}`
- `/series/{series-name}`
- `/search/{query}`
- `/audiobookshelf/search?query={isbn|asin|text}`

The scraper requires Node.js 20. Install its dependencies with `npm i`.
| File | Purpose |
|---|---|
| `index.js` | Scrapes the entire GraphicAudio catalog |
| `urls.json` | Cached product URLs (improves resume) |
| `results.json` | Output metadata JSON from scraping |
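The resume behavior that `urls.json` and `results.json` enable can be pictured with a small sketch. It assumes `urls.json` holds an array of URL strings and that each entry in `results.json` records its page in `link`; the helper name `pendingUrls` is hypothetical and not taken from `index.js`.

```javascript
// Hypothetical helper (not from index.js): return only the product URLs
// that have no matching entry in the already-scraped results.
function pendingUrls(urls, results) {
  // Assumption: each scraped entry stores its source page in `link`.
  const done = new Set(results.map((entry) => entry.link));
  return urls.filter((url) => !done.has(url));
}

// Example with placeholder URLs.
const cachedUrls = [
  "https://www.graphicaudio.net/a.html",
  "https://www.graphicaudio.net/b.html",
];
const scraped = [{ link: "https://www.graphicaudio.net/a.html", title: "A" }];

console.log(pendingUrls(cachedUrls, scraped));
```

Keying on the page URL keeps the check cheap and avoids re-visiting pages that already produced an entry.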
Run the scraper with `node index.js`. The script will:
- Download the GraphicAudio product list
- Extract each product URL
- Visit each product page
- Save scraped data into `results.json`

Scraper features:
- Resumable scraping: will not duplicate previously scraped entries
- Cleans ISBN, title, series numbering, etc.
- Detects multipart episodes (example: `4.5` from `4 : Rhythm of War (5 of 6)`)
- Saves covers only when valid (ignores `tempcover.jpg`)
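The multipart detection can be illustrated with a minimal sketch. The regular expression and the function name `parseRawTitle` are assumptions made for illustration, not the scraper's actual code; the output field names follow the sample entry in this README, where a single-part episode gets part `"1"`.

```javascript
// Hypothetical parser: split a raw title such as
// "4 : Rhythm of War (5 of 6)" into episode number, title, and part info.
function parseRawTitle(rawTitle) {
  // Optional "Episode number " prefix, then "<n> : <title>",
  // then an optional "(<part> of <total>)" suffix.
  const pattern =
    /^(?:Episode number\s+)?(\d+)\s*:\s*(.+?)(?:\s*\((\d+)\s+of\s+(\d+)\))?\s*$/i;
  const match = rawTitle.match(pattern);
  if (!match) return null; // title does not follow the assumed pattern

  // Unmatched optional groups are undefined, so the defaults apply.
  const [, number, title, part = "1", total = "1"] = match;
  return {
    title: title.trim(),
    episodeNumber: Number(number),
    episodePart: part,
    totalParts: total,
    episodeCode: `${number}.${part}`,
  };
}

console.log(parseRawTitle("4 : Rhythm of War (5 of 6)").episodeCode);
console.log(parseRawTitle("Episode number 4 : Lion in the Valley"));
```

Titles that do not match the assumed pattern return `null`, so a caller can fall back to storing the raw title unchanged.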
Metadata captured per entry includes:
{
"link": "https://www.graphicaudio.net/amelia-peabody-4-lion-in-the-valley.html",
"cover": "https://www.graphicaudio.net/media/catalog/product/cache/0164cd528593768540930b5b640a411b/a/m/amelia_peabody_4_lion_in_the_valley.jpg",
"seriesName": "Amelia Peabody",
"title": "Lion in the Valley",
"rawtitle": "Episode number 4 : Lion in the Valley",
"episodeNumber": 4,
"episodePart": "1",
"episodeCode": "4.1",
"totalParts": "1",
"subtitle": "[Dramatized Adaptation]",
"author": "Elizabeth Peters",
"releaseDate": "2025-11-17T00:00:00.000Z",
"isbn": "9798896520030",
"genre": "Mystery",
"description": "The 1895-96 season promises to be an exceptional one ...",
"copyright": "Copyright © 1986 Elizabeth Peters. All rights reserved...",
"cast": [
"Ken Jackson",
"Nanette Savard",
"Amelia Peabody",
"Michael Glenn",
"Radcliffe Emerson",
...
]
}

The PHP API requires:
- PHP 8.1+
- Optional: APCu extension (improves caching performance)
| File | Purpose |
|---|---|
| `index.php` | Main API router |
| `cache.json` | Cached version of `results.json` (created automatically) |
| `/covers` | Cached cover images |
Edit these constants:
define("JSON_URL", "https://raw.githubusercontent.com/USERNAME/REPO/main/results.json");
define("REFRESH_KEY", "CHANGE_ME");
define("AUDIOBOOKSHELF_KEY", "abs"); // "abs" = no auth required

If you want ABS to require an API key, set:
define("AUDIOBOOKSHELF_KEY", "MYSECRETKEY123");

Available endpoints:
- `/isbn/{isbn}`
- `/isbn/{isbn}/cover` (get cover)
- `/search/{query}`
- `/series/{series-name}`
- `/audiobookshelf/search?query=stormlight`

The Audiobookshelf search endpoint auto-detects the query type:
| Query type | Handled as |
|---|---|
| `9781234567890` | ISBN |
| `B09C4Y7T1Q` | ASIN |
| `Stormlight` | fuzzy search |
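The query auto-detection can be pictured with a short sketch. The API itself is written in PHP and its exact rules are not shown here, so the patterns below (10 or 13 digit ISBNs, 10-character ASINs starting with `B0`) and the function name `classifyQuery` are assumptions chosen to match the three example queries.

```javascript
// Hypothetical classifier: decide whether a query looks like an ISBN,
// an ASIN, or free text for fuzzy search.
function classifyQuery(query) {
  // Ignore hyphens and spaces, which are common in printed ISBNs.
  const cleaned = query.replace(/[-\s]/g, "").toUpperCase();

  // Assumption: ISBN-13 starts with 978/979; ISBN-10 may end in X.
  if (/^(?:97[89])?\d{9}[\dX]$/.test(cleaned)) return "isbn";

  // Assumption: ASINs are 10 alphanumeric characters starting with "B0".
  if (/^B0[A-Z0-9]{8}$/.test(cleaned)) return "asin";

  return "text";
}

for (const q of ["9781234567890", "B09C4Y7T1Q", "Stormlight"]) {
  console.log(q, "->", classifyQuery(q));
}
```

Checking the stricter patterns first means anything that fails both falls through to fuzzy search.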
ABS receives results formatted like:
{
"matches": [
{
"title": "Rhythm of War",
"series": [{ "series": "Stormlight Archive", "sequence": "4.5" }],
"author": "Brandon Sanderson",
"publishedYear": "2020",
"cover": "https://yourdomain/isbn/9781427280583/cover",
"narrator": "Narrator One"
}
]
}

To refresh the cached data, send `PUT /refresh?key=YOURKEY`.

Covers are downloaded automatically and cached in `/covers/`.
Once cached, they serve instantly without hitting GraphicAudio again.
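As an illustration of how a scraped entry could map onto the Audiobookshelf match format shown earlier, here is a minimal sketch. The real mapping lives in `index.php`; the function name `toAbsMatch`, the `baseUrl` parameter, and the choice to use `episodeCode` as the series sequence are assumptions. The `narrator` field is left out because this README does not say how it is derived from `cast`.

```javascript
// Hypothetical mapping from one results.json entry to an ABS match object.
function toAbsMatch(entry, baseUrl) {
  const match = {
    title: entry.title,
    author: entry.author,
    // Assumption: publishedYear is the year of the scraped releaseDate.
    publishedYear: String(new Date(entry.releaseDate).getUTCFullYear()),
  };
  if (entry.seriesName) {
    // Assumption: the series sequence is the episodeCode, e.g. "4.5".
    match.series = [{ series: entry.seriesName, sequence: entry.episodeCode }];
  }
  if (entry.isbn) {
    // Point ABS at the cached cover endpoint rather than GraphicAudio.
    match.cover = `${baseUrl}/isbn/${entry.isbn}/cover`;
  }
  return match;
}

const sample = {
  title: "Lion in the Valley",
  seriesName: "Amelia Peabody",
  episodeCode: "4.1",
  author: "Elizabeth Peters",
  releaseDate: "2025-11-17T00:00:00.000Z",
  isbn: "9798896520030",
};
console.log(JSON.stringify({ matches: [toAbsMatch(sample, "https://yourdomain")] }, null, 2));
```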
| Feature | Status |
|---|---|
| Full catalog scraping | ✅ |
| ISBN lookup | ✅ |
| ASIN lookup | ✅ |
| Series fuzzy detection | ✅ |
| Audiobookshelf metadata provider | ✅ |
| Cached covers | ✅ |
- ASINs are not available on the GraphicAudio website. The scraper cannot retrieve them directly from GraphicAudio pages.
- If you want ASINs, you must manually match GraphicAudio titles with Audible or another source.
- Once you add an ASIN to a product entry in `results.json`, the PHP API can serve it via:
  - `/asin/{asin}`
  - `/asin/{asin}/cover`
- Example JSON with ASIN field added:
{
"link": "https://www.graphicaudio.net/amelia-peabody-4-lion-in-the-valley.html",
"cover": "https://www.graphicaudio.net/media/catalog/product/cache/0164cd528593768540930b5b640a411b/a/m/amelia_peabody_4_lion_in_the_valley.jpg",
"seriesName": "Amelia Peabody",
"title": "Lion in the Valley",
"rawtitle": "Episode number 4 : Lion in the Valley",
"episodeNumber": 4,
"episodePart": "1",
"episodeCode": "4.1",
"totalParts": "1",
"subtitle": "[Dramatized Adaptation]",
"author": "Elizabeth Peters",
"releaseDate": "2025-11-17T00:00:00.000Z",
"isbn": "9798896520030",
"asin": "B08EXAMPLE", // <- Add this manually
"genre": "Mystery",
"description": "The 1895-96 season promises to be an exceptional one ...",
"copyright": "Copyright © 1986 Elizabeth Peters. All rights reserved...",
"cast": [
"Ken Jackson",
"Nanette Savard",
"Amelia Peabody",
"Michael Glenn",
"Radcliffe Emerson",
...
]
}

- Once added, the PHP API's `findByField()` will recognize it automatically.
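To make the ASIN lookup concrete, here is a JavaScript sketch of what a field lookup of this kind might do. The actual `findByField()` is a PHP function in `index.php` and its signature and matching rules are not shown in this README, so the parameters and the case-insensitive comparison below are assumptions.

```javascript
// Hypothetical JavaScript rendering of a field lookup: return the first
// entry whose `field` equals `value`, ignoring case and surrounding spaces.
function findByField(entries, field, value) {
  const wanted = String(value).trim().toLowerCase();
  const hit = entries.find(
    (entry) =>
      entry[field] != null &&
      String(entry[field]).trim().toLowerCase() === wanted
  );
  return hit ?? null; // null when nothing matches
}

const entries = [
  { title: "Lion in the Valley", isbn: "9798896520030", asin: "B08EXAMPLE" },
  { title: "Entry without ASIN", isbn: "9781234567890" },
];

console.log(findByField(entries, "asin", "b08example")?.title);
console.log(findByField(entries, "asin", "B000000000"));
```

Entries without an `asin` field are simply skipped, which is why adding the field by hand is enough for the lookup to start working.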
To re-scrape from scratch (for example, after improving the scraper), delete:
- `urls.json`
- `results.json`

Then run the scraper again with `node index.js`.

To force the PHP endpoint to refresh:
curl -X PUT "https://yourdomain/refresh?key=SECRET"

PRs welcome, especially improvements to scraper logic or metadata mapping.
MIT License.