Article Content Scraper Service #200

vimscientist69 · 2025-10-16T03:53:44Z

Reason

AutomaWebCore provides a way to scrape the HTML of websites. We need to use the AutomaWebCore API to scrape the HTML of an article, parse the HTML to extract the text, and use an LLM to format that text into a dictionary. This is needed because the dictionary form of an article will be used to generate Tweets.
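The three steps described above (fetch HTML, extract text, have an LLM format it into a dictionary) could be sketched roughly as follows. This is an illustration, not the code in this PR: the `HTMLFetching` and `ArticleFormatting` protocols, the `ArticleScraperSketch` type, the error cases, and the `[String: String]` result shape are all hypothetical names chosen for this example, and the calls are synchronous for brevity. The regex-based tag stripping is a naive stand-in for whatever HTML parsing the service actually uses.

```swift
import Foundation

// Hypothetical abstractions: the real AutomaWebCore client and LLM client
// are not shown here, so these protocols are assumptions for illustration.
protocol HTMLFetching {
    func fetchHTML(url: String) throws -> String
}

protocol ArticleFormatting {
    // Expected to return a JSON object string describing the article.
    func formatAsJSON(articleText: String) throws -> String
}

enum ArticleScrapeError: Error, Equatable {
    case emptyContent
    case invalidLLMOutput
}

struct ArticleScraperSketch {
    let fetcher: HTMLFetching
    let formatter: ArticleFormatting

    // Single public entry point: fetch, extract text, format, decode.
    func scrapeArticle(url: String) throws -> [String: String] {
        let html = try fetcher.fetchHTML(url: url)
        let text = Self.extractText(fromHTML: html)
        guard !text.isEmpty else { throw ArticleScrapeError.emptyContent }
        let json = try formatter.formatAsJSON(articleText: text)
        return try Self.decodeDictionary(json)
    }

    // Naive tag stripping with regular expressions: drop script/style
    // blocks, drop remaining tags, then collapse whitespace.
    static func extractText(fromHTML html: String) -> String {
        let patterns = ["<(script|style)[^>]*>[\\s\\S]*?</\\1>", "<[^>]+>", "\\s+"]
        var result = html
        for pattern in patterns {
            result = result.replacingOccurrences(
                of: pattern, with: " ",
                options: [.regularExpression, .caseInsensitive])
        }
        return result.trimmingCharacters(in: .whitespacesAndNewlines)
    }

    // Decode the formatter output into a flat string dictionary.
    static func decodeDictionary(_ json: String) throws -> [String: String] {
        guard let data = json.data(using: .utf8),
              let dict = try? JSONDecoder().decode([String: String].self, from: data)
        else { throw ArticleScrapeError.invalidLLMOutput }
        return dict
    }
}

// Stubs standing in for the network and LLM calls (assumptions).
struct StubFetcher: HTMLFetching {
    func fetchHTML(url: String) throws -> String {
        "<html><body><h1>Title</h1><script>var x = 1;</script><p>Hello   world</p></body></html>"
    }
}

struct StubFormatter: ArticleFormatting {
    func formatAsJSON(articleText: String) throws -> String {
        let data = try JSONEncoder().encode(["body": articleText])
        return String(decoding: data, as: UTF8.self)
    }
}

let service = ArticleScraperSketch(fetcher: StubFetcher(), formatter: StubFormatter())
let article = try service.scrapeArticle(url: "https://example.com/article")
print(article["body"] ?? "")
```

Keeping the fetcher and formatter behind protocols lets the pipeline be exercised with stubs, while an integration test can plug in the real clients.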

Tech Details List

Implement ArticleContentScraperService with proper error handling, metrics and logs
Refactor/clean scraper service code
Write a simple integration test for the service, testing the only public method

Tasks:

Implement ArticleContentScraperService with proper error handling, metrics and logs
Refactor/clean scraper service code
Write a simple integration test for the service, testing the only public method

Un-important tasks:

Links:

Testing

Steps:

Run required commands to test all the code modified by this PR

# unit tests
# integration tests
swift test --filter "ArticleContentScraperServiceIntegrationTests"
# procs / testing logic

Further testing can be done by following the notes in TESTING-QA

Output:

Testing QA


vimscientist69 added 6 commits October 13, 2025 06:30

feat(AutomaWebCoreClient): created client to interact with automa web core api. Made method to get website html. Updated AutomaUtilities to have the web core api endpoint payload type.

e389fde

Merge branch 'develop' into article-content-scraper

30db05c

test(AutomaWebCoreClientIntegrationTests): created test for web core client get website html

0a903f0

feat(ArticleContentScraperService): scrape content of article and use llm to format article into json format

67f788c

refactor(ArticleContentScraperService): converted long public method to use multiple helper methods at same abstraction level to improve cleanliness

67da912

fix(ArticleContentScraperService): better error handling, logs and metrics

ccc4f07

vimscientist69 self-assigned this Oct 16, 2025

vimscientist69 requested a review from AdonisCodes October 16, 2025 03:53

vimscientist69 merged commit ec87189 into develop Oct 16, 2025
3 of 4 checks passed

vimscientist69 deleted the article-content-scraper branch October 16, 2025 04:01
