41 commits
da8cbfa
update figure 8
iamdonovan Oct 25, 2025
ce76a61
Update README.md
iamdonovan Oct 25, 2025
2ade854
doi version
iamdonovan Oct 25, 2025
48f532c
add explanation of auto_term
iamdonovan Oct 25, 2025
373034a
swap categories 1, 3
iamdonovan Oct 28, 2025
9ebddf6
add table 4 script
iamdonovan Oct 28, 2025
b3bba56
correct number of conflicts
iamdonovan Oct 28, 2025
6a3b679
add fig. 9 script, examples
iamdonovan Oct 29, 2025
4c2bde1
update example figures
iamdonovan Oct 29, 2025
774a893
add fig. 9 to essd readme
iamdonovan Oct 29, 2025
4909a0d
one last level -> category
iamdonovan Oct 29, 2025
ae985f6
fix inventory dois
iamdonovan Oct 29, 2025
cf8ba08
add a verb
iamdonovan Nov 1, 2025
0fe44b9
move environment file
iamdonovan Nov 2, 2025
4146fbd
add delayed-assert
iamdonovan Nov 2, 2025
b5cda57
add test for new csv files and gpkg files
iamdonovan Nov 2, 2025
f8c22fd
test for lake_cat values
iamdonovan Nov 2, 2025
aa1061c
add pytest dependency
iamdonovan Nov 2, 2025
3d8de11
add workflow action for running pytest
iamdonovan Nov 2, 2025
4403d94
use v5, checkout all tags, branches
iamdonovan Nov 2, 2025
8b5be26
undo checkout all tags, branches, compare to origin/main
iamdonovan Nov 2, 2025
d6d4848
re-do checkout all tags/branches
iamdonovan Nov 2, 2025
6f31bb5
Merge pull request #15 from GLIMS-RGI/action
iamdonovan Nov 2, 2025
0484086
remove essd_review branch (was a test)
iamdonovan Nov 2, 2025
30bfc34
update contributing section
iamdonovan Nov 2, 2025
f052b89
fix image date, lake category
iamdonovan Nov 2, 2025
69660d0
move list of regions to tools.py
iamdonovan Nov 4, 2025
93eb742
move generate_geopackage.py into tools.py
iamdonovan Nov 4, 2025
0d8fd1a
add docstrings for functions in tools.py
iamdonovan Nov 4, 2025
f768372
move summary_table.py to tools.py
iamdonovan Nov 4, 2025
33eec8f
update with new dataset layout
iamdonovan Nov 4, 2025
ece2004
add script to update dataset
iamdonovan Nov 4, 2025
2bff612
update gpkg files for regions 13, 15
iamdonovan Nov 5, 2025
5fcc7a6
add tqdm to test environment
iamdonovan Nov 5, 2025
98d31bb
remove script (incorporated in update_dataset.py)
iamdonovan Nov 5, 2025
411d283
update readme
iamdonovan Nov 5, 2025
991e245
always treat contributor as str
iamdonovan Nov 6, 2025
6874e63
patch to update imja/lhotse shar
iamdonovan Nov 6, 2025
1b3600e
add rgi_id to missing lakes
iamdonovan Nov 6, 2025
2c091fb
update figures, table after patching region 15
iamdonovan Nov 6, 2025
509638e
add global comparison figure (Fig. 10)
iamdonovan Dec 2, 2025
35 changes: 35 additions & 0 deletions .github/workflows/dataset_checks.yml
@@ -0,0 +1,35 @@
# This workflow will install Python dependencies, run tests and lint with a single version of Python
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

name: dataset checks

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

permissions:
  contents: read

jobs:
  test:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v5
      with:
        fetch-depth: 0
    - name: Set up Python 3.10
      uses: actions/setup-python@v3
      with:
        python-version: "3.10"
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install pytest pandas geopandas delayed-assert gitpython tqdm
        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
    - name: Run Dataset Checks
      run: |
        pytest
72 changes: 65 additions & 7 deletions README.md
@@ -1,5 +1,7 @@
# Global lake-terminating glacier classification: a community effort for the Randolph Glacier Inventory (RGI) and beyond

[![DOI](https://zenodo.org/badge/764141248.svg)](https://zenodo.org/badge/latestdoi/764141248)

Written by William Armstrong and Tobias Bolch with contributions from Robert McNabb, Rodrigo Aguayo, Fabien Maussion,
Jakob Steiner, and Will Kochtitzky

@@ -62,16 +64,72 @@ Each of the files in the dataset folders contains the following fields:
- `contributor` - the ID of the contributor(s) who classified the glacier
- `notes` - any additional notes from each contributor

Additionally, the files in `contributor_files/` may have an additional field, `auto_term`. When
`scripts/finalize_csv.py` is run, the column `term_type` (if included in the CSV) is renamed to `auto_term` to indicate
that `scripts/assign_lake_flag.py` was run; a value of `auto_term == 2` means that there was an inventory lake within
the buffer around the terminus.
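
As a rough illustration of the renaming described above, the following minimal sketch uses only the Python standard
library. It is not the code in `scripts/finalize_csv.py`; the function name `rename_term_type` and the three-column
input are assumptions made for the example.

```python
import csv
import io


def rename_term_type(rows):
    """Rename the 'term_type' key to 'auto_term' in each row, if it is present."""
    renamed = []
    for row in rows:
        renamed.append(
            {("auto_term" if key == "term_type" else key): value for key, value in row.items()}
        )
    return renamed


# Hypothetical input: a CSV exported after running assign_lake_flag.py
raw = io.StringIO(
    "rgi_id,term_type,lake_cat\n"
    "RGI2000-v7.0-G-01-08604,2,3\n"
)
rows = rename_term_type(list(csv.DictReader(raw)))

# csv.DictReader yields strings, so the flag is compared against "2"
flagged = [row["rgi_id"] for row in rows if row.get("auto_term") == "2"]
print(flagged)
```

Rows without a `term_type` column pass through unchanged, which mirrors the "if included in the CSV" condition.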

The `.gpkg` files contained in the `lakeflags/` and `outlines/` directories are generated from the `.csv` files and the
RGI outlines using `scripts/generate_geopackage.py`.

## Scripts

The repository has a number of scripts that can be used to update or work with the dataset. To be able to run any of the
scripts, you will first need to create a [conda]() environment using the `environment.yml` file found in the repository.

The following scripts are found in the `scripts/` folder:

- `test_submission.py`: this script contains a number of tests that can be run with `pytest` to check that any new or
updated files in `dataset/contributor_files` or `dataset/csv` match the formatting of the dataset.
- `update_dataset.py`: this script can be used to update the dataset with any new or revised files found in
`dataset/contributor_files`. The script first checks for any conflicts (glaciers with multiple `lake_cat` values). If
there are conflicts, resolve them by editing the `{region}_conflicts.csv` file(s) that the script creates, changing
the conflicting `lake_cat` values to an agreed value, and then run the script again. Once all conflicts
have been resolved, the script updates the geopackage files and re-creates `dataset/summary_table.csv` with the
updated number and area of lake-terminating glaciers. At this point, the updated files can be committed and merged
using a [pull request](https://github.com/GLIMS-RGI/lake_terminating/pulls). For more information about how to use
the script, run `python scripts/update_dataset.py -h` from within the `conda` environment.
- `assign_lake_flag.py`: can be used with a lake inventory to identify glaciers that have a lake within some buffer
around the terminus. For more information about how to use the script, run `python scripts/assign_lake_flag.py -h`
from within the `conda` environment, or see the "Workflow using existing lake inventory" section below.
- `assign_term_type.py`: this script can be used to update the RGI v7.0 `term_type` attribute, using the files found
in `dataset/csv`. Glaciers with a `lake_cat` of 2 or 3 will be assigned a `term_type` of 2 (lake-terminating), while
glaciers with a `lake_cat` of 0 or 1 will be assigned a `term_type` of 0 (land-terminating). The updated .csv files
can then be joined to the RGI shapefiles.
- `generate_geopackage.py`: re-generates the geopackage files for each region.
- `summary_table.py`: re-creates `dataset/summary_table.csv` based on the files in `dataset/csv`.
- `finalize_csv.py`: converts the attribute table from a shapefile used for mapping into a .csv file that is compatible
with the dataset. For more information about how to use the script, run `python scripts/finalize_csv.py -h`
from within the `conda` environment.
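
Two of the rules described in the list above can be sketched in a few lines of plain Python: the conflict check (a
glacier with more than one distinct `lake_cat` value) and the `lake_cat` to `term_type` mapping. This is an
illustration under stated assumptions, not the code in `update_dataset.py` or `assign_term_type.py`; the function names
and the sample records are hypothetical.

```python
from collections import defaultdict

# Mapping as described above: lake_cat 2 or 3 -> term_type 2 (lake-terminating),
# lake_cat 0 or 1 -> term_type 0 (land-terminating).
LAKE_CAT_TO_TERM_TYPE = {0: 0, 1: 0, 2: 2, 3: 2}


def term_type_from_lake_cat(lake_cat):
    """Return the term_type value for a given lake_cat (raises KeyError if unknown)."""
    return LAKE_CAT_TO_TERM_TYPE[int(lake_cat)]


def find_conflicts(records):
    """Return {rgi_id: sorted lake_cat values} for glaciers with more than one distinct lake_cat."""
    seen = defaultdict(set)
    for rgi_id, lake_cat in records:
        seen[rgi_id].add(int(lake_cat))
    return {rgi_id: sorted(cats) for rgi_id, cats in seen.items() if len(cats) > 1}


# Hypothetical classifications submitted by two contributors
records = [
    ("RGI2000-v7.0-G-01-08604", 3),
    ("RGI2000-v7.0-G-01-08604", 1),
    ("RGI2000-v7.0-G-01-10928", 0),
]
conflicts = find_conflicts(records)
print(conflicts)
```

In this sketch, a glacier only reaches the `term_type` mapping once it has a single agreed `lake_cat`, which is why
the conflict check comes first.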


## Contributing

The classifications provided here are the result of a community effort, which means that there may be disagreement
between analysts. If you find any such issues, we invite you to submit your updates/corrections by first forking this
repository and submitting a Pull Request (https://github.com/GLIMS-RGI/lake_terminating/pulls).
between analysts. If you find any such issues, we invite you to submit your updates/corrections in one of the following
ways, depending on your level of comfort with git/GitHub:

### through GitHub

1. Open an issue (https://github.com/GLIMS-RGI/lake_terminating/issues) for further discussion;
2. Fork this repository, make your changes/updates, and open a Pull Request
(https://github.com/GLIMS-RGI/lake_terminating/pulls). **Please Note**: if you choose this option, ensure
that your updated files are placed in the relevant regional folder in `dataset/contributor_files`, rather than
updating the files in `dataset/csv` directly.

As part of the PR process, any updated CSV files will be checked for formatting consistency. Once those checks have
passed, the updates can be merged using the process outlined by `scripts/update_dataset.py`.

You are welcome to make these changes yourself; alternatively, one of the maintainers will be able to do the final
merge/update.

### outside of GitHub

Outside of GitHub, you are welcome to contact one or more of the maintainers to discuss any issues or to send along
your proposed updates/corrections.

Alternatively, you are welcome to open an issue for further discussion (https://github.com/GLIMS-RGI/lake_terminating/issues).
Please ensure that your submitted CSV file uses the same format as the `lake_term_data_template.csv` file in this
repository, as this will greatly simplify the integration process.
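
Before sending a file, a quick local check along the following lines may catch common formatting problems. This is a
minimal sketch, not the repository's `test_submission.py`: the function name `check_submission` is hypothetical, the
required column names are taken from the example table later in this README (the actual template may contain more),
and the `YYYY-MM-DD` date format and the `lake_cat` range of 0 to 3 are assumptions based on that table and the
category values mentioned above.

```python
import csv
import io
from datetime import datetime

# Assumed required columns, copied from the example table in this README
REQUIRED_COLUMNS = ["rgi_id", "lake_cat", "image_id", "image_date", "inventory_doi", "contributor"]


def check_submission(handle):
    """Return a list of human-readable problems found in a submitted CSV."""
    reader = csv.DictReader(handle)
    fieldnames = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in REQUIRED_COLUMNS if name not in fieldnames]
    if problems:
        return problems
    # Data rows start on line 2, after the header
    for line_number, row in enumerate(reader, start=2):
        if row["lake_cat"] not in {"0", "1", "2", "3"}:
            problems.append(f"line {line_number}: unexpected lake_cat {row['lake_cat']!r}")
        if row["image_date"]:
            try:
                datetime.strptime(row["image_date"], "%Y-%m-%d")
            except ValueError:
                problems.append(f"line {line_number}: image_date not in YYYY-MM-DD format")
    return problems


header = "rgi_id,lake_cat,image_id,image_date,inventory_doi,contributor\n"
good = io.StringIO(
    header
    + "RGI2000-v7.0-G-01-08604,3,LT05_L1TP_066017_19990927_20200907_02_T1,"
    + "1999-09-27,https://doi.org/10.18739/A2MK6591G,Armstrong\n"
)
# Deliberately broken row: invalid category and a slash-separated date
bad = io.StringIO(header + "RGI2000-v7.0-G-01-08604,5,,1999/09/27,,Armstrong\n")

good_problems = check_submission(good)
bad_problems = check_submission(bad)
print(good_problems)
print(bad_problems)
```

An empty list means the file passed these particular checks; the automated checks run on a Pull Request remain the
authoritative test.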

## 1. Definition and categories of lake-terminating glaciers

@@ -136,12 +194,12 @@ that have not amalgamated to form one lake spanning the majority of the glacier
Similarly, we did not consider glaciers with proglacial water bodies smaller than 0.01 km<sup>2</sup> to be
lake-terminating.

![Fig4_Level_0_Examples.png](essd/figures/Fig4_Category_0_Examples.png)
![](essd/figures/Fig4_Category_0_Examples.png)
*Background images are Landsat 7 ETM+ false-color composites (bands 5, 4, 3). RGI7 outlines are shown in red, while lake
outlines are shown in white. (a) Eklutna Glacier (RGI2000-v7.0-G-01-10928) in Alaska (region 01). Landsat image acquired 1999-07-31.
(b) Unnamed glacier (RGI2000-v7.0-G-01-11048) in Alaska (region 01). Landsat image acquired 1999-07-31. (c) Harris
Glacier (RGI2000-v7.0-G-01-08628) in Alaska (region 01). Landsat image acquired 2000-08-09. (d) Hispar Glacier, with
numerous supraglacial ponds (RGI2000-v7.0-G-14-21670) in South Asia West (region 14). Landsat image acquired 2000-09-11.
numerous supraglacial ponds (RGI2000-v7.0-G-14-21670) in South Asia West (region 14). Landsat image acquired 2000-09-11.*

### Ambiguous lake termini

@@ -175,7 +233,7 @@ includes exclusively glaciers that are definitely not lake-terminating is helpfu

### Workflow using existing lake inventory

We have provided a Python script (`scripts/assign_flag.py`) that utilizes an existing ice-marginal
We have provided a Python script (`scripts/assign_lake_flag.py`) that utilizes an existing ice-marginal
lake inventory to produce a limited subset of RGI glaciers that should be manually verified for lake-terminating status.

We have compiled a list of known [datasets here](https://github.com/GLIMS-RGI/lake_terminating/blob/main/Lake_databases_termini.csv).
@@ -228,7 +286,7 @@ The contributors to the lake inventory should provide a csv file with the follow

| `rgi_id` | `lake_cat` | `image_id` | `image_date` | `inventory_doi` | `contributor` |
|-------------------------|:----------:|:----------------------------------------:|:------------:|:----------------------------------:|:-------------:|
| RGI2000-v7.0-G-01-08604 | 1 | LT05_L1TP_066017_19990927_20200907_02_T1 | 1999/09/27 | https://doi.org/10.18739/A2MK6591G | Armstrong |
| RGI2000-v7.0-G-01-08604 | 3 | LT05_L1TP_066017_19990927_20200907_02_T1 | 1999-09-27 | https://doi.org/10.18739/A2MK6591G | Armstrong |

The fields are defined as:

@@ -0,0 +1,2 @@
rgi_id,auto_term,lake_cat,image_id,notes,image_date,contributor,inventory_doi
RGI2000-v7.0-G-15-06763,9,3,,,,,doi:10.5194/essd-12-2169-2020