Brief notes from the iDigBio workshop

Advances in Digital Media Workshop Series: Yale

Here are just some of my very brief notes (pretty much just keywords).

LightningBug:
- digitizing specimen labels using ML
- Meta’s segmentation tool, Segment Anything Model (SAM), works well and is faster than R-CNN
- 200k images, 6.9k specimens
Heritage Science
- NSF Mid-scale research program
MorphoSource: 3D, 2D, AV media data repository
- Maybe a good place to look for exemplary sites for PhenoBase
Audiovisual Core
Expanding LeafMachine2: new training data, models, and methods for processing herbarium specimens; Will Weaver, PhD Candidate, University of Michigan
Detectron by Facebook for detecting objects in images
Imageomics ?
Phylogeny-guided neural networks (phylo-NNs), Elhamod et al., KDD 2023
IIIF
Phenotypic diversity
- Phenological diversity
- Phenome space
- Segment Anything Model (SAM) + Grounding DINO
- t-SNE visualization for clustering data
More training data is not always better for ML models
- E.g., if the added images are of related items that are not actually present
Multimodal AI models
- CLIP
- LMMs as effective rerankers
- INQUIRE: text-to-image search of iNaturalist images
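To make the "rerankers" keyword concrete for myself, here is a minimal sketch of a two-stage retrieve-then-rerank search. Everything in it is an assumption for illustration: the three-number vectors stand in for CLIP-style text/image embeddings, the file names are made up, and `expensive_scores` is a hypothetical lookup standing in for whatever score a large multimodal model would give each shortlisted image. It is not how INQUIRE or any talk at the workshop implemented it.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, image_vecs, k):
    """Stage 1: cheap ranking of all images by embedding similarity; keep the top k."""
    ranked = sorted(
        image_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

def rerank(candidates, score_fn):
    """Stage 2: reorder only the shortlist using a slower, more careful scorer."""
    return sorted(candidates, key=score_fn, reverse=True)

# Toy embeddings (made up), standing in for CLIP-style image vectors.
image_vecs = {
    "frog_on_leaf.jpg": [0.9, 0.1, 0.0],
    "frog_in_pond.jpg": [0.8, 0.3, 0.1],
    "bird_on_branch.jpg": [0.1, 0.9, 0.2],
    "mushroom.jpg": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]  # made-up embedding of a text query

# Hypothetical scores from a stronger model, computed only for the shortlist.
expensive_scores = {
    "frog_on_leaf.jpg": 0.4,
    "frog_in_pond.jpg": 0.9,
    "bird_on_branch.jpg": 0.1,
}

shortlist = retrieve(query_vec, image_vecs, k=3)
final = rerank(shortlist, lambda name: expensive_scores.get(name, 0.0))
print(shortlist)
print(final)
```

The point of the split is cost: the embedding comparison touches every image, while the expensive scorer only sees the `k` shortlisted ones.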
2D to 3D reconstruction
- Surface-to-volume ratio seems to be well preserved in sharks and snakes
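
For reference, a rough sketch of how a surface-to-volume ratio could be computed from a reconstructed 3D model, assuming the reconstruction is a closed triangle mesh with consistently oriented faces. The notes do not record how the speakers actually measured it, so this is only one plausible approach; the function names and the unit-cube test mesh are my own. Surface area is the sum of the triangle areas, and volume is the sum of signed tetrahedron volumes between each triangle and the origin.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def surface_area(vertices, faces):
    """Sum of triangle areas (half the length of each edge cross product)."""
    total = 0.0
    for i, j, k in faces:
        n = cross(sub(vertices[j], vertices[i]), sub(vertices[k], vertices[i]))
        total += 0.5 * math.sqrt(dot(n, n))
    return total

def volume(vertices, faces):
    """Enclosed volume via signed tetrahedra; needs a closed, consistently oriented mesh."""
    total = 0.0
    for i, j, k in faces:
        total += dot(vertices[i], cross(vertices[j], vertices[k])) / 6.0
    return abs(total)

def surface_to_volume(vertices, faces):
    return surface_area(vertices, faces) / volume(vertices, faces)

# Unit cube as a sanity check: 8 corners, 12 outward-facing triangles.
cube_vertices = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),
]
cube_faces = [
    (0, 2, 1), (0, 3, 2),  # bottom
    (4, 5, 6), (4, 6, 7),  # top
    (0, 1, 5), (0, 5, 4),  # front
    (3, 7, 6), (3, 6, 2),  # back
    (0, 4, 7), (0, 7, 3),  # left
    (1, 2, 6), (1, 6, 5),  # right
]

ratio = surface_to_volume(cube_vertices, cube_faces)
print(ratio)
```

For the unit cube the area is 6 and the volume is 1, so the ratio should come out at about 6. Note that the ratio has units of 1/length, so comparing specimens of different sizes would need some normalization.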