Subgroups
Goal: We want a tool that lets us take photographic collections of individual wild insects and iteratively pass them through something like BioCLIP, so we can rapidly ID insects and refine those IDs taxonomically.
Ideally the program also runs offline, as it will be used by field techs gathering data in places where internet might not be great.
Members:
- Elizabeth Campolongo
- Ernie Parke
- Andy Quitmeyer
- Matt Thompson
Project Code: https://github.com/Digital-Naturalism-Laboratories/bucket-o-bugs
Goal: The broad goal is to generate species data with computer vision and use those data for ecological applications. Specifically, we want to develop species distribution models and predict suitable habitat for beetle species under various climate scenarios, using species data identified by computer vision.
Task: This goal requires two steps.
- The first task is to build a convolutional neural network (CNN) model from NEON beetle image data and use it to identify the species of unknown beetle samples.
- The second task is to get climate data associated with the identified species and fit species distribution models that map each species' suitable habitat.
Members:
- Khum Thapa-Magar, INSTAAR, University of Colorado
- Sarwan Ali, Georgia State University
- Hsunyi Hsieh, Michigan State University
- Feel free to join the group if you like
Project Code: https://github.com/Imageomics/sdm-beetlepalooza
Goal: We want to see how far taxonomically BioClip can get in identifying individual NEON beetles.
- Members:
- Sydne Record
- Hilmar Lapp
- Evan Waite
- Laura Nagel
- Kim Landsbergen
- Isa Betancourt
- Elizabeth Campolongo
- Run BioCLIP: run 1 on segmented images, run 2 on unsegmented images, then compare
- Open classification versus list of known taxa
Hilmar:
run 1 - BioCLIP on 6 known images individually, per barcode; samples run are listed below (08984, 08914, 08980, 08976, 40688, 40713...);
taxonomists' assessment - in one case (08914) the runs converged to the same tribe (a group of genera); not so for 08984;
run 2 - to the level of a rank, 'bioclip predict --rank genus [range of images]'; outcome - tribes not correct
Evan = going from subfamily to tribe is a huge leap
Evan = Can we train BioCLIP to assess each image and stop at the tribe level? Laura = a goal would be to get to genus; that would be a time-saver
Imagining an example in-person tech workflow: AI sorts to tribe, then a human tech works on identification below tribe. Being able to get to tribe would eliminate a lot of work (keys would still be needed).
Samples are from Wisconsin - same domain 05, 2 different locations (UNDE, STEI)
Laura provided a file with all species found within Domain 05 (D05); every unique species ID returned has been included in the list
^ list to be used in BioCLIP to limit identification to that domain-specific list; filename: D05_TaxaList.txt
The NEON Domain 05 list represents specimens already found and expert verified. It is not the list of what could be there (which is larger).
Hilmar reran w/ the D05 list; Elizabeth helped w/ formatting table code
efforts below all include D05 list as part of BioClip
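As a rough illustration of what limiting identification to the domain list means, the sketch below filters a set of scores down to the taxa named in a list file and rescales them. It is not the pybioclip API: `load_taxa_list`, `restrict_to_taxa`, the one-name-per-line file layout, and the name-to-score dictionary are all assumptions made for this example. For softmax-style scores, rescaling over the allowed names gives the same result as scoring against only those names, provided every allowed name was among the original candidates.

```python
from pathlib import Path


def load_taxa_list(path):
    """Read a taxa list file into a set of names.

    Assumes one taxon name per line; blank lines are skipped.
    The real layout of D05_TaxaList.txt may differ.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def restrict_to_taxa(predictions, allowed):
    """Keep only predictions whose name is in `allowed`, rescaled to sum to 1.

    `predictions` is a hypothetical {taxon name: score} mapping.
    Returns an empty dict if none of the predicted names are allowed.
    """
    kept = {name: score for name, score in predictions.items() if name in allowed}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {name: score / total for name, score in kept.items()}
```

Because the list only covers what has already been found in the domain, a specimen of an unlisted species will still be forced onto one of the listed names.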
Evan - these two are different species
- AI found them to be different
- but they are in the vial as the same species
A00000008980-06
A00000008980-08
40688 - ran 10 subsets from this vial - correct ID is Synuchus impunctatus
40713 - correct ID for Bembidion transparens
40688 - using the full image, with all the beetles in the image
running it beetle by beetle - the ability to ID to the correct taxon is variable
running it as a full image with all beetles included - the correct ID is in the top 3
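One way to put a number on "variable" versus "in the top 3" is a top-k hit rate over runs. This is a minimal sketch, assuming each run is available as a list of taxon names ordered from highest to lowest score; the function names and that input shape are hypothetical, not something taken from the group's code.

```python
def in_top_k(ranked_names, correct_name, k=3):
    """Return True if correct_name is among the first k ranked names."""
    return correct_name in ranked_names[:k]


def top_k_hit_rate(runs, correct_name, k=3):
    """Fraction of runs whose top k contains the correct name.

    `runs` holds one ranked list per beetle crop (or per full image).
    """
    if not runs:
        raise ValueError("need at least one run")
    hits = sum(in_top_k(ranked, correct_name, k) for ranked in runs)
    return hits / len(runs)
```

Calling `top_k_hit_rate` once on the per-beetle crops and once on the full-vial image would give two comparable numbers for the same vial.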
conversation about how to optimize photos; now running on Evan's high-res images of their specimens: EWIC_00001460, EWIC_0000353, EWIC_0000799, EWIC_0000801, EWIC_0001164
Day 3 wrap-up
Sydne re-ran what we did yesterday and got rid of sub-species data. Created summaries at the tribe and subfamily levels. What were the scores for each image?
Laura - has been data wrangling to evaluate what the cumulative scores were at each of those taxonomic levels, and assigning flags for what was right or wrong at each level
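The cumulative scores and right/wrong flags could be computed along these lines. This is only a sketch under assumptions: each prediction is taken to be a dict carrying a "score" plus one key per rank, and the rank names and function names are hypothetical. The idea is that scores of all predictions sharing the same tribe (or subfamily, or genus) are summed, and the highest total at each rank is compared with the expert ID.

```python
from collections import defaultdict


def cumulative_scores(predictions, rank):
    """Sum the scores of all predictions that share the same name at `rank`."""
    totals = defaultdict(float)
    for pred in predictions:
        totals[pred[rank]] += pred["score"]
    return dict(totals)


def flag_ranks(predictions, truth, ranks=("subfamily", "tribe", "genus")):
    """For each rank, report the top cumulative name and whether it is right.

    `truth` maps each rank to the expert-verified name for this image.
    """
    flags = {}
    for rank in ranks:
        totals = cumulative_scores(predictions, rank)
        best = max(totals, key=totals.get)
        flags[rank] = {
            "predicted": best,
            "score": totals[best],
            "correct": best == truth[rank],
        }
    return flags
```

An image can be flagged wrong at genus but right at tribe when several low-scoring genera from the correct tribe add up to more than the single top genus.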
Isa - it would be interesting to evaluate the number of training images against the right/wrong flag
Elizabeth put together a script on CyVerse that summarizes the training images for BioCLIP: for the 36 genera, how many training images were used in the BioCLIP runs
Hilmar is battling to get the individually segmented images ready so that each image can be run with its own reference domain list. The file structure needs shuffling and wrangling.
Goal for tomorrow - to run BioClip on all of the segmented images with the newly wrangled dataset (thank you Hilmar!)
Members: Sydne Record, Isabelle Betancourt, Evan Waite, Laura Nagel, Kim Landsbergen, Hilmar Lapp, Elizabeth Campolongo
Group 3 code is in a group 3 folder in this repository.
Subgroup 4: EcoPalette: Integration of environmental data into species images to improve model accuracy
Members: Alyson East, Nicholas Gunner, Brennan Hays, Daniel Lopez, Isabella Viney
Subgroup goals:
- Represent ecosystem metadata visually on beetle image
- Improve AI model's classification confidence of beetle species using visualized metadata
- Assess the importance of image-encoded metadata in model's accuracy
Workflow:
- Segment NEON vial-level images of ground beetles into individual beetle images (thanks to Sarwan Ali and Michelle Ramirez)
- Subset beetle image dataset to include only 5 beetle species for proof-of-concept simplicity
- Identify abiotic and biotic ecosystem features of interest based on relevance to beetle niche
- Extract NEON ecosystem data of interest for year 2018 and link to beetle images
- Train and test AI models in identifying beetle species from (1) beetle image subset including image-encoded metadata and (2) beetle image subset NOT including image-encoded metadata
- Compare AI model accuracy from (1) and (2) above
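The final comparison step can be sketched as below, assuming both models are evaluated on the same held-out beetle images and that their predictions are available as lists of species labels in the same order as the true labels. The function names and the returned keys are hypothetical and are not taken from the EcoPalette repository.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of images whose predicted species matches the true species."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one labelled image")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def compare_models(true_labels, preds_with_metadata, preds_without_metadata):
    """Accuracy of model (1) and model (2), plus the difference (1) - (2)."""
    acc_with = accuracy(true_labels, preds_with_metadata)
    acc_without = accuracy(true_labels, preds_without_metadata)
    return {
        "with_metadata": acc_with,
        "without_metadata": acc_without,
        "difference": acc_with - acc_without,
    }
```

With only 5 species and a small subset, a difference of a few images can move accuracy noticeably, so the raw counts are worth reporting alongside the fractions.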
Project code: https://github.com/Imageomics/EcoPalette/tree/main
Members: Isadora Fluck, Michelle Ramirez, Jennifer Girón, S M Rayeed, Ekaterina Nepovinnykh, Dhanyapriya Somasundaram, Hojin Yoo, Sydne Record
Goal: automate trait measurements from images
Workflow: 577 images of the beetles + code (Python + R):
Code:
Input: image with multiple individuals
Output: a data table with columns:
- pictureID (that is linked to speciesID, plotID, siteID, etc.);
- individualID (that can be linked to the individual images);
- elytra area;
- elytra width;
- elytra length;
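To make the three measurement columns concrete, here is a minimal sketch that derives them from a binary elytra mask. It rests on several assumptions that are not from the group's code: the mask is a list of rows of 0/1 values, the beetle is oriented head-up so that length runs along rows and width along columns, the extents are taken from the bounding box, and `pixels_per_mm` is a known scale for the photo.

```python
def measure_elytra(mask, pixels_per_mm=1.0):
    """Return elytra area, length and width from a binary mask.

    Area is the count of mask pixels; length and width are the vertical
    and horizontal extents of the mask's bounding box (head-up assumption).
    Units are mm and mm^2 when pixels_per_mm is the true image scale.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        raise ValueError("mask contains no elytra pixels")
    area_px = sum(1 for row in mask for value in row if value)
    length_px = max(rows) - min(rows) + 1
    width_px = max(cols) - min(cols) + 1
    return {
        "elytra_area": area_px / pixels_per_mm ** 2,
        "elytra_length": length_px / pixels_per_mm,
        "elytra_width": width_px / pixels_per_mm,
    }
```

Each result can be combined with the pictureID and individualID of the segmented beetle to form one row of the output table. A beetle photographed at an angle would need its mask rotated first, since bounding-box extents overstate the width of a tilted specimen.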
Group5_b: Project Code: https://github.com/yoohj0416/predictbeetle
Members: Nathan, Blair, Alec, Parkash
Grad-CAM for BioCLIP: https://github.com/mirkab/BeetlePalooza_2024_Mirka
Grad-CAM for ResNet-50: https://github.com/parkash-ps/Imageomics-Beetlepalooza-2024
Goals:
- Identify where and why current CV models misidentify beetle species.
This event is sponsored by the Imageomics Institute and supported by the National Science Foundation under Awards No. OAC-2118240 and AWD-111317. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.