Add terrascope module for Terrascope STAC API integration#1294
- Add leafmap/terrascope.py with OAuth2 auth, token caching, and STAC search
- Add docs/notebooks/115_terrascope.ipynb example notebook
- Add API documentation
- Update mkdocs.yml
for more information, see https://pre-commit.ci
Pull request overview
Adds a new leafmap.terrascope module to integrate with the Terrascope STAC API, including OAuth2 token management, STAC search helpers, and visualization utilities, along with documentation and an example notebook.
Changes:
- Added leafmap.terrascope with OAuth2 login/token caching/refresh and STAC helper utilities.
- Added API docs page and a new example notebook demonstrating authentication, search, and visualization.
- Updated MkDocs navigation to include the new module docs and notebook.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| mkdocs.yml | Adds the Terrascope module doc page and notebook to the site navigation. |
| leafmap/terrascope.py | Implements Terrascope auth/token handling, STAC search helpers, and visualization utilities. |
| docs/terrascope.md | Adds mkdocstrings-based API reference page for leafmap.terrascope. |
| docs/notebooks/115_terrascope.ipynb | Provides an end-to-end usage example (login, search, map viz, basic analysis). |
| def _background_refresher() -> None: | ||
| """Background thread that refreshes token periodically.""" | ||
| while not _refresh_stop.wait(REFRESH_INTERVAL): |
The background refresh thread calls get_token() while other code can also call get_token()/logout(), but _token_cache and the token/header files are mutated without any synchronization. This can lead to races (e.g., logout() deleting files while the refresher rewrites them). Add a threading.Lock (or similar) around token cache + file read/write operations, and coordinate shutdown with the refresher.
leafmap/terrascope.py
Outdated
| """Get new tokens using password grant.""" | ||
| _check_dependencies() | ||
| response = requests.post( | ||
| TOKEN_URL, | ||
| data={ |
Token requests use requests.post(...) without a timeout. If the network hangs, login()/get_token() can block indefinitely. Please provide a reasonable timeout (and consider surfacing a clearer error on timeout).
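One way to address this is sketched below, assuming the `requests` library already used by the module. The helper name `post_token_request` is hypothetical, and the 30-second `REQUEST_TIMEOUT` mirrors the value named in a later commit message in this thread.

```python
import requests

REQUEST_TIMEOUT = 30  # seconds; assumed value, taken from the follow-up commit message


def post_token_request(token_url: str, data: dict) -> dict:
    """POST to the token endpoint with a timeout and a clearer timeout error."""
    try:
        response = requests.post(token_url, data=data, timeout=REQUEST_TIMEOUT)
    except requests.exceptions.Timeout as exc:
        # Re-raise as the built-in TimeoutError with an actionable message.
        raise TimeoutError(
            f"Token request to {token_url} timed out after {REQUEST_TIMEOUT}s"
        ) from exc
    response.raise_for_status()
    return response.json()
```

Without the `timeout` argument, `requests` waits indefinitely, which is the hang described in the comment.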
| """Get new access token using refresh token.""" | ||
| _check_dependencies() | ||
| response = requests.post( | ||
| TOKEN_URL, | ||
| data={ |
Token refresh requests use requests.post(...) without a timeout, which can block the refresh thread (and potentially get_token()) indefinitely on network hangs. Please add a timeout here as well.
| cached expired tokens. | ||
| """ | ||
| try: | ||
| subprocess.run(["pkill", "-f", "localtileserver"], capture_output=True) |
cleanup_tile_servers() runs pkill -f localtileserver, which will terminate any matching process on the machine (not just ones started by this session) and is OS-specific. Consider narrowing the kill criteria (e.g., track PIDs started by leafmap/localtileserver, or use psutil to filter by current user/parent) and returning a status so callers can tell whether anything was stopped.
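The PID-tracking option could be sketched as follows, using only the standard library rather than psutil. The registry `_started` and the helpers `start_tracked` and `stop_tracked` are hypothetical; how localtileserver processes are actually launched is not shown in this thread.

```python
import subprocess
import sys

# Hypothetical registry of processes started by this session.
_started: list = []


def start_tracked(cmd: list) -> subprocess.Popen:
    """Start a process and remember it so cleanup can target it precisely."""
    proc = subprocess.Popen(cmd)
    _started.append(proc)
    return proc


def stop_tracked(timeout: float = 5.0) -> int:
    """Terminate tracked processes still running; return how many were stopped."""
    stopped = 0
    for proc in _started:
        if proc.poll() is None:  # still running
            proc.terminate()
            try:
                proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
            stopped += 1
    _started.clear()
    return stopped
```

Returning the count gives callers the status the comment asks for, and only processes started through `start_tracked` are ever touched.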
leafmap/terrascope.py
Outdated
| from pystac import ItemCollection | ||
| except ImportError: | ||
| Client = None | ||
| ItemCollection = None |
ItemCollection is imported but never used in this module, which will trigger a linter error (e.g., flake8 F401). Please remove the unused import (and the corresponding ItemCollection = None fallback) or start using it.
| from pystac import ItemCollection | |
| except ImportError: | |
| Client = None | |
| ItemCollection = None | |
| except ImportError: | |
| Client = None |
leafmap/terrascope.py
Outdated
| _update_header_file(token) | ||
login() overwrites process-wide GDAL environment variables (GDAL_HTTP_HEADER_FILE, GDAL_DISABLE_READDIR_ON_OPEN) without preserving previous values. This can unexpectedly affect other raster/GDAL usage in the same process; consider saving the prior values and restoring them in logout() (or using a scoped context manager).
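A scoped context manager along the lines suggested might look like this sketch. `scoped_env` is a hypothetical helper, and the example values for the two GDAL variables are assumptions, since the values set by login() are not shown here.

```python
import os
from contextlib import contextmanager


@contextmanager
def scoped_env(**overrides: str):
    """Temporarily set environment variables, restoring prior values on exit."""
    previous = {name: os.environ.get(name) for name in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for name, old in previous.items():
            if old is None:
                os.environ.pop(name, None)  # was unset before: unset again
            else:
                os.environ[name] = old  # restore the earlier value
```

Usage would be `with scoped_env(GDAL_HTTP_HEADER_FILE=path, GDAL_DISABLE_READDIR_ON_OPEN="EMPTY_DIR"): ...`; the same save/restore logic could instead be split across login() and logout().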
| os.remove(TOKEN_CACHE_PATH) | ||
| if os.path.exists(HEADER_FILE_PATH): | ||
| os.remove(HEADER_FILE_PATH) | ||
logout() removes the header file but does not unset/restore GDAL_HTTP_HEADER_FILE (and GDAL_DISABLE_READDIR_ON_OPEN). After logout, GDAL may still try to read a non-existent header file, causing confusing failures elsewhere. Please unset or restore these environment variables when logging out.
| # Unset GDAL environment variables configured in login() | |
| os.environ.pop("GDAL_HTTP_HEADER_FILE", None) | |
| os.environ.pop("GDAL_DISABLE_READDIR_ON_OPEN", None) |
leafmap/terrascope.py
Outdated
| while not _refresh_stop.wait(REFRESH_INTERVAL): | ||
| try: | ||
| token = get_token() | ||
| _update_header_file(token) | ||
| except Exception: |
The background refresher suppresses all exceptions (except Exception: pass), which makes token refresh failures very hard to diagnose and can leave the GDAL header token stale. Consider logging the exception (at least at debug/warn level) and/or exposing the last refresh error to callers.
| def get_token( | ||
| username: str | None = None, | ||
| password: str | None = None, | ||
| ) -> str: | ||
| """ | ||
| Get a valid Terrascope access token. | ||
|
|
||
| Attempts to get a token in this order: |
This new module introduces non-trivial auth/token caching + refresh behavior, but there are no unit tests covering it. Consider adding tests that mock requests.post and validate: cache read/write behavior, refresh vs password fallback logic, and that logout() cleans up state (thread stop + env vars).
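The kind of test being requested can be illustrated with a self-contained sketch. The `get_token` below is a simplified stand-in that follows the documented order (cached token, then refresh, then password login) and takes its collaborators as arguments; it is not the module's function, and real tests would more likely patch `requests.post`.

```python
import time
from unittest import mock


def get_token(cache: dict, refresh, password_login) -> str:
    """Stand-in mirroring the documented order: cache, refresh, password."""
    now = time.time()
    if cache.get("access_expires_at", 0) > now:
        return cache["access_token"]
    if cache.get("refresh_expires_at", 0) > now:
        try:
            return refresh(cache["refresh_token"])
        except Exception:
            pass  # fall through to password login
    return password_login()


def test_cached_token_skips_network():
    refresh, login = mock.Mock(), mock.Mock()
    cache = {"access_token": "cached", "access_expires_at": time.time() + 60}
    assert get_token(cache, refresh, login) == "cached"
    refresh.assert_not_called()
    login.assert_not_called()


def test_failed_refresh_falls_back_to_password():
    refresh = mock.Mock(side_effect=RuntimeError("refresh rejected"))
    login = mock.Mock(return_value="fresh")
    cache = {"refresh_token": "r", "refresh_expires_at": time.time() + 60}
    assert get_token(cache, refresh, login) == "fresh"
    refresh.assert_called_once_with("r")
```

Both tests run without network access, which is the point of mocking the HTTP layer.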
| """ | ||
| try: | ||
| subprocess.run(["pkill", "-f", "localtileserver"], capture_output=True) | ||
| except Exception: |
'except' clause does nothing but pass and there is no explanatory comment.
| except Exception: | |
| except Exception: | |
| # Ignore errors from pkill: this cleanup is best-effort and may fail | |
| # if the command is unavailable or no matching processes are found. |
🚀 Deployed on https://69882daa18e0635e029bcf76--opengeos.netlify.app |
- Add threading.Lock for token cache thread safety
- Add REQUEST_TIMEOUT (30s) to all requests.post calls
- Remove unused ItemCollection import
- Add logging for background refresh errors
- Unset GDAL env vars in logout()
- Add explanatory comments for exception handling
- Document OAuth2 password grant security in docstrings
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| def get_token( | ||
| username: str | None = None, | ||
| password: str | None = None, | ||
| ) -> str: | ||
| """ | ||
| Get a valid Terrascope access token. | ||
|
|
||
| Attempts to get a token in this order: | ||
| 1. Return cached token if still valid | ||
| 2. Refresh using refresh token if available | ||
| 3. Login with username/password | ||
|
|
||
| Args: | ||
| username: Terrascope username. Defaults to TERRASCOPE_USERNAME env var. | ||
| password: Terrascope password. Defaults to TERRASCOPE_PASSWORD env var. | ||
|
|
||
| Returns: | ||
| Valid access token string. | ||
|
|
||
| Raises: | ||
| ValueError: If credentials are not provided and not in environment. | ||
| requests.HTTPError: If authentication fails. | ||
| """ | ||
| global _token_cache | ||
| _check_dependencies() | ||
|
|
||
| with _token_lock: | ||
| if not _token_cache: | ||
| _load_cached_tokens() | ||
|
|
||
| now = time.time() | ||
|
|
||
| # Check if access token is still valid | ||
| if _token_cache.get("access_expires_at", 0) > now: | ||
| return _token_cache["access_token"] | ||
|
|
||
| # Try to refresh if refresh token is still valid | ||
| if _token_cache.get("refresh_expires_at", 0) > now: | ||
| try: | ||
| return _refresh_access_token(_token_cache["refresh_token"]) | ||
| except Exception: | ||
| # Fall through to password auth | ||
| pass | ||
|
|
||
| # Need fresh login with credentials | ||
| username = username or os.environ.get("TERRASCOPE_USERNAME") | ||
| password = password or os.environ.get("TERRASCOPE_PASSWORD") | ||
| if not username or not password: | ||
| raise ValueError( | ||
| "Terrascope credentials required. Either pass username/password " | ||
| "or set TERRASCOPE_USERNAME and TERRASCOPE_PASSWORD environment " | ||
| "variables." | ||
| ) | ||
| return _get_token_with_password(username, password) | ||
This module introduces substantial new behavior (OAuth token caching/refresh, STAC search helpers). The repo has an existing test suite, but there are no tests added for these new functions. Add unit tests that cover token cache load/save behavior, refresh vs password flow selection, and search() query construction (e.g., datetime ranges and cloud-cover filtering) using mocked HTTP/STAC client responses.
| _refresh_thread.join(timeout=1) | ||
| _refresh_thread = None |
logout() joins the refresh thread with timeout=1 and then unconditionally sets _refresh_thread=None. If the thread is in the middle of a network refresh (or otherwise takes >1s), it may still be running while tokens/header files are being deleted, causing races and confusing state. Consider joining without a timeout, or checking is_alive() and only clearing the reference once the thread has actually stopped (and/or increasing the timeout).
| _refresh_thread.join(timeout=1) | |
| _refresh_thread = None | |
| # Give the refresh thread a chance to terminate cleanly. | |
| _refresh_thread.join(timeout=5) | |
| if _refresh_thread.is_alive(): | |
| logging.warning( | |
| "Terrascope token refresh thread did not stop within the logout timeout." | |
| ) | |
| else: | |
| _refresh_thread = None |
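The suggested shutdown pattern can be exercised in isolation with the sketch below. `stop_thread` is a hypothetical helper, and the 5-second default matches the suggestion above rather than any confirmed value in the module.

```python
import logging
import threading


def stop_thread(thread: threading.Thread, stop_event: threading.Event,
                timeout: float = 5.0) -> bool:
    """Signal a worker to stop and report whether it actually exited."""
    stop_event.set()
    thread.join(timeout=timeout)
    if thread.is_alive():
        logging.warning("Worker thread did not stop within %.1f s", timeout)
        return False
    return True
```

A caller would clear its thread reference, and delete token or header files, only when this returns True.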
| with _token_lock: | ||
| token = get_token() |
_background_refresher() acquires _token_lock and then calls get_token(), which also acquires _token_lock. This will deadlock the background refresh thread the first time it runs. Remove the outer lock and rely on get_token()'s internal locking, or refactor to fetch the token outside the lock and only lock around updating shared state/files.
| with _token_lock: | |
| token = get_token() | |
| token = get_token() | |
| with _token_lock: |
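The deadlock comes from `threading.Lock` not being reentrant. The sketch below demonstrates this with a non-blocking acquire so the example reports the failure instead of hanging. It also shows `threading.RLock`, which is an alternative not proposed in the review, named here only for comparison.

```python
import threading

lock = threading.Lock()
with lock:
    # A second acquire on a plain Lock from the same thread cannot succeed.
    # With the default blocking=True this call would wait forever.
    reacquired = lock.acquire(blocking=False)

rlock = threading.RLock()
with rlock:
    # An RLock may be acquired again by the thread that already holds it.
    nested = rlock.acquire(blocking=False)
    rlock.release()  # balance the nested acquire
```

Fetching the token before taking the lock, as in the suggestion above, avoids the nested acquire entirely and keeps the critical section short.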
| try: | ||
| import leafmap | ||
| except ImportError: | ||
| raise ImportError("leafmap is required: pip install leafmap") | ||
|
|
||
| layers = {} | ||
| for item in items: | ||
| if asset_key not in item.assets: | ||
| continue | ||
| date_str = item.datetime.strftime("%Y-%m-%d") | ||
| tile_layer = leafmap.get_local_tile_layer( | ||
| item.assets[asset_key].href, | ||
| layer_name=date_str, | ||
| colormap=colormap, | ||
| vmin=vmin, | ||
| vmax=vmax, | ||
| ) |
create_time_layers() calls leafmap.get_local_tile_layer(), but leafmap.__init__ only re-exports get_local_tile_layer in some backends (e.g., ipyleaflet via leafmap/leafmap.py). In folium/mkdocs mode it isn’t exported, so this call will raise AttributeError. Import and call get_local_tile_layer from leafmap.common (or use a relative import from .common) to make this backend-independent.
Summary
Adds a new leafmap.terrascope module for seamless integration with the Terrascope STAC API.
Features
- ~/.terrascope_tokens.json for persistence across sessions
- search(), search_ndvi(), list_collections()
- create_time_layers() for time slider
- cleanup_tile_servers(), get_asset_urls(), get_item_dates()
Usage
Files Added
- leafmap/terrascope.py - Main module
- docs/notebooks/115_terrascope.ipynb - Example notebook
- docs/terrascope.md - API documentation
Reference