Signed-off-by: OEvortex <[email protected]>
…h and localization Signed-off-by: OEvortex <[email protected]>
…nguage and localization Signed-off-by: OEvortex <[email protected]>
…I capabilities; remove unused ChatZAI provider and line counter script; introduce docstring and reverse engineering agents Signed-off-by: OEvortex <[email protected]>
…tor Brave engine; update documentation and usage instructions Signed-off-by: OEvortex <[email protected]>
…d parsing
- Updated the unified web search endpoint to include video search capabilities and enhanced suggestions handling.
- Added Brave search engines for text, images, news, suggestions, and videos with comprehensive parsing logic.
- Implemented unit tests for Brave text search parsing to ensure accuracy and reliability.
- Introduced a new BraveSearch class to unify search functionalities across different media types.
- Enhanced error handling in search engines to manage varying method signatures and improve robustness.
Signed-off-by: OEvortex <[email protected]>
Signed-off-by: OEvortex <[email protected]>
Pull request overview
This pull request introduces comprehensive Brave search engine support across multiple modalities (videos, news, images, suggestions) and adds a new BraveAI search provider. The implementation follows a consistent architecture pattern with a base class and specialized search engines for each content type, along with enhanced CLI formatting capabilities and server-side improvements.
Changes:
- Added five new Brave search engines: BraveTextSearch, BraveVideos, BraveNews, BraveImages, and BraveSuggestions, each with HTML/API parsing capabilities
- Introduced BraveAI provider for AI-powered search with streaming support
- Enhanced CLI with specialized formatting functions for videos, news, suggestions, and images
- Updated server routes with improved suggestions parameter handling
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| webscout/search/engines/brave/base.py | New base class for all Brave search implementations with session management |
| webscout/search/engines/brave/videos.py | Complete video search implementation with view count parsing and metadata extraction |
| webscout/search/engines/brave/news.py | News search with date detection and thumbnail handling |
| webscout/search/engines/brave/suggestions.py | Autocomplete API with rich entity support |
| webscout/search/engines/brave/images.py | Image search with base64 URL extraction |
| webscout/search/engines/brave/text.py | Text search with fallback retry logic |
| webscout/Provider/AISEARCH/brave_search.py | New AI search provider with async support |
| webscout/cli.py | Added specialized print functions and formatting utilities |
| webscout/server/routes.py | Improved suggestions endpoint parameter handling |
| tests/providers/test_brave_images.py | Comprehensive unit and integration tests |
| tests/providers/test_brave_text.py | Text search parsing tests |
| Multiple files | Removed # type: ignore comments (type checking cleanup) |
| Uses server-rendered HTML and parses result containers with CSS selectors. | ||
| """ | ||
| from typing import cast |
cast is imported inside this method but never used. Remove the import, or move it to the top of the file if it is needed elsewhere.
| self.country = country | ||
| self.ui_lang = ui_lang | ||
| self.geoloc = geoloc | ||
| self.last_response = {} |
The last_response attribute is initialized as an empty dict but is later assigned SearchResponse objects (lines 248 and 279). This inconsistent typing could cause issues for type checkers and for callers that expect a single type. Consider initializing it as self.last_response: Optional[SearchResponse] = None or self.last_response: Union[SearchResponse, Dict] = SearchResponse("") for type consistency.
| self.last_response = {} | |
| self.last_response: Optional[SearchResponse] = None |
| # Suggestions method might have different signature | ||
| try: | ||
| results = method(q, region=region, max_results=max_results) | ||
| except TypeError: | ||
| # Fallback for engines that don't accept region | ||
| results = method(q, max_results=max_results) |
The except TypeError clause catches any TypeError raised by the call, not only the one caused by an unexpected region argument. This could mask unrelated TypeErrors raised inside the method and silently retry without region. Consider checking the method signature before calling it, or logging the error details, to distinguish parameter signature mismatches from other failures.
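One way to avoid relying on TypeError at all is to inspect the method's signature and pass region only when it is accepted. The following is a minimal sketch under that assumption; accepts_keyword, call_search, and the three sample engine functions are hypothetical names for illustration and are not part of webscout.

```python
import inspect
from typing import Any, Callable, List


def accepts_keyword(method: Callable[..., Any], name: str) -> bool:
    """Return True if `method` can be called with the keyword argument `name`."""
    try:
        params = inspect.signature(method).parameters
    except (TypeError, ValueError):
        # No introspectable signature (e.g. some builtins): assume unsupported.
        return False
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def call_search(method: Callable[..., Any], q: str, region: str, max_results: int) -> Any:
    """Call a search method, passing `region` only if its signature accepts it."""
    kwargs = {"max_results": max_results}
    if accepts_keyword(method, "region"):
        kwargs["region"] = region
    return method(q, **kwargs)


# Hypothetical engines used only to exercise the helper.
def text_engine(q: str, region: str = "us-en", max_results: int = 10) -> List[str]:
    return [f"{q}:{region}:{max_results}"]


def suggest_engine(q: str, max_results: int = 10) -> List[str]:
    return [f"{q}:{max_results}"]


def broken_engine(q: str, region: str = "us-en", max_results: int = 10) -> List[str]:
    raise TypeError("bug inside engine")
```

With this approach a TypeError raised inside an engine propagates to the caller instead of triggering a second call without region.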
| first_item = data[0] | ||
|
|
||
| # Handle dataclass objects by converting to dict | ||
| if hasattr(first_item, "__dataclass_fields__"): | ||
| # Convert dataclass to dict for each item | ||
| data = [ | ||
| {k: getattr(item, k, "") for k in first_item.__dataclass_fields__} | ||
| for item in data | ||
| ] | ||
| first_item = data[0] |
The dataclass-to-dict conversion uses getattr(item, k, "") which defaults to an empty string for missing fields. This could hide actual missing attributes. Consider using getattr(item, k) without a default, or use dataclasses.asdict() for more robust conversion that will raise AttributeError for truly missing fields.
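A conversion along the lines suggested here could look like the sketch below. VideoResult and to_rows are hypothetical stand-ins, not names from the PR. Note that dataclasses.asdict recurses into nested dataclasses and copies field values, which may or may not be wanted for table output.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any, Dict, List


@dataclass
class VideoResult:
    """Hypothetical result type used only for this example."""

    title: str
    url: str
    views: int = 0


def to_rows(data: List[Any]) -> List[Dict[str, Any]]:
    """Convert a list of dataclass instances or dicts into a list of dicts."""
    rows: List[Dict[str, Any]] = []
    for item in data:
        # is_dataclass is also True for dataclass types, so exclude classes.
        if is_dataclass(item) and not isinstance(item, type):
            rows.append(asdict(item))
        elif isinstance(item, dict):
            rows.append(item)
        else:
            raise TypeError(f"unsupported result type: {type(item).__name__}")
    return rows
```

Each item is converted using its own fields rather than the fields of the first item, so a mixed list does not silently pick up empty-string placeholders.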
| if __name__ == "__main__": | ||
| ai = BraveAI() | ||
| res = ai.search("What is Python?", stream=True, enable_research=True) | ||
| from collections.abc import Iterable | ||
|
|
||
| if isinstance(res, Iterable): | ||
| for chunk in res: | ||
| print(chunk, end="", flush=True) | ||
| else: | ||
| print(res) |
The import from collections.abc import Iterable sits in the middle of the if __name__ == "__main__" block, after the ai.search(...) call. This works, but it is unconventional. Move the import to the top of the block (line 290), before the ai instance is created, for better readability.
| from __future__ import annotations | ||
|
|
||
| from time import sleep | ||
| from typing import List, Optional |
Import of 'Optional' is not used.
| except (ValueError, IndexError): | ||
| pass | ||
| if "--height:" in style: | ||
| try: | ||
| height = int(style.split("--height:")[1].split(";")[0].strip()) | ||
| except (ValueError, IndexError): |
'except' clause does nothing but pass and there is no explanatory comment.
| except (ValueError, IndexError): | |
| pass | |
| if "--height:" in style: | |
| try: | |
| height = int(style.split("--height:")[1].split(";")[0].strip()) | |
| except (ValueError, IndexError): | |
| except (ValueError, IndexError): | |
| # Ignore malformed or missing width value; default width=0 is acceptable. | |
| pass | |
| if "--height:" in style: | |
| try: | |
| height = int(style.split("--height:")[1].split(";")[0].strip()) | |
| except (ValueError, IndexError): | |
| # Ignore malformed or missing height value; default height=0 is acceptable. |
| if "--height:" in style: | ||
| try: | ||
| height = int(style.split("--height:")[1].split(";")[0].strip()) | ||
| except (ValueError, IndexError): |
'except' clause does nothing but pass and there is no explanatory comment.
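An alternative to commenting each empty except clause is to move the parsing into a small helper that returns a default, so no exception handling is needed at the call site. This is a sketch only; css_int is a hypothetical helper, and it assumes the style string carries custom properties such as --width and --height with bare integer values, as in the snippet above.

```python
import re


def css_int(style: str, name: str, default: int = 0) -> int:
    """Return the integer value of the CSS custom property --<name>, or `default`.

    Values that are missing or not a bare integer (e.g. "200px") fall back to
    `default`, matching the behaviour of int(...) inside try/except.
    """
    pattern = rf"--{re.escape(name)}\s*:\s*(\d+)\s*(?:;|$)"
    match = re.search(pattern, style)
    return int(match.group(1)) if match else default
```

Usage would then reduce to width = css_int(style, "width") and height = css_int(style, "height"), with the fallback documented in one place.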
| decoded = base64.b64decode(normalized).decode("utf-8", errors="ignore") | ||
| if decoded.startswith("http"): | ||
| return decoded | ||
| except Exception: |
'except' clause does nothing but pass and there is no explanatory comment.
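For the base64 case, the broad except Exception could be narrowed to the errors that decoding can actually raise, with a comment stating why they are ignored. The sketch below assumes the token uses the URL-safe alphabet and may have its padding stripped; decode_base64_url is a hypothetical name, and the normalization in the PR may differ.

```python
import base64
import binascii
from typing import Optional


def decode_base64_url(token: str) -> Optional[str]:
    """Decode a base64 token and return it only if it looks like an http(s) URL."""
    # Restore any padding that was stripped from the token.
    padded = token + "=" * (-len(token) % 4)
    try:
        raw = base64.urlsafe_b64decode(padded)
    except (binascii.Error, ValueError):
        # Not valid base64: treat the token as carrying no URL.
        return None
    decoded = raw.decode("utf-8", errors="ignore")
    return decoded if decoded.startswith("http") else None
```

Returning None explicitly makes the "no URL found" outcome visible to the caller instead of falling through a silent pass.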
| sib = fn("div", class_="snippet") | ||
| if sib: | ||
| candidates.append(sib.select_one(".generic-snippet")) | ||
| except Exception: |
'except' clause does nothing but pass and there is no explanatory comment.
This pull request introduces major new functionality and improvements centered around Brave search engine support across multiple modalities, including videos, news, images, and suggestions. It also adds a new AI search provider, refines CLI output, and improves documentation. The most important changes are grouped by theme below.
Brave Search Engine Integration
- Added BraveVideos, BraveNews, BraveSuggestions, and BraveImages, each capable of parsing HTML or API responses to extract rich results (videos, news articles, suggestions, images) with metadata and pagination support. ([changelog.mdR5-R31](https://github.com/OEvortex/Webscout/pull/113/files#diff-3bd14d078188074c410028847113ceae68865d0ad5b844a27183ef87fbe2fcc3R5-R31))
- Introduced the BraveSearch interface, enabling seamless access to all Brave search types. ([changelog.mdR5-R31](https://github.com/OEvortex/Webscout/pull/113/files#diff-3bd14d078188074c410028847113ceae68865d0ad5b844a27183ef87fbe2fcc3R5-R31))

CLI and UI Improvements
- Updated webscout/cli.py to introduce specialized print functions for videos, news, suggestions, and images, improving result formatting and user experience. Helper functions for formatting and truncation were also added. ([changelog.mdR5-R31](https://github.com/OEvortex/Webscout/pull/113/files#diff-3bd14d078188074c410028847113ceae68865d0ad5b844a27183ef87fbe2fcc3R5-R31))

Provider and API Enhancements
- Added the BraveAI.py provider for AI-powered search and deep research, supporting both streaming and non-streaming modes. Updated exports and documentation for the new provider. ([changelog.mdR5-R31](https://github.com/OEvortex/Webscout/pull/113/files#diff-3bd14d078188074c410028847113ceae68865d0ad5b844a27183ef87fbe2fcc3R5-R31))

Testing and Quality Assurance
- Added tests for BraveImages in tests/providers/test_brave_images.py, covering initialization, payload building, pagination, result extraction, and real web searches. ([tests/providers/test_brave_images.pyR1-R215](https://github.com/OEvortex/Webscout/pull/113/files#diff-b33b8220d17bb1b75eff4d256e0d0c7af22c83c7821a2da48b5b62d47b380efdR1-R215))
- Added lol.py to exercise all new Brave engines and display their results using rich formatting. ([lol.pyR1-R63](https://github.com/OEvortex/Webscout/pull/113/files#diff-ee804a83ab10a6e0c63438659b6c7f76c9a7cf5dbcf4805424c65d8bb7a0240cR1-R63))

Documentation Updates
- Updated changelog.md. ([changelog.mdR5-R31](https://github.com/OEvortex/Webscout/pull/113/files#diff-3bd14d078188074c410028847113ceae68865d0ad5b844a27183ef87fbe2fcc3R5-R31))

These changes collectively provide robust Brave search capabilities, improved developer experience, and higher code quality across the project.