Dominica Sperm Whale Project Archive
Long-term acoustic dataset of sperm whale codas collected off the coast of Dominica. The foundational dataset for sperm whale communication research.
A browsable, searchable directory of models, datasets, and tools, with generation lineage, provenance chains, and an AI clerk that takes submissions in natural language.
Language models trained on non-human phonetic data. Each entry tracks architecture, size, human/animal dataset ratios, injection method, and generation lineage.
Audio recordings, video, and phonetic transcriptions. Raw or model-prepared, with full provenance: which generation of model prepared the data, and which generation it's intended to feed.
Annotation pipelines, tokenizers, phonetic alphabets, and evaluation frameworks. Each entry lists supported species and links directly to the source.
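As a rough illustration of the metadata described above, here is a minimal Python sketch of what a model entry and a dataset entry might carry. The class names, field names, and example values are all hypothetical; the registry's actual schema is not published here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelEntry:
    """Hypothetical record for a model listed in the registry."""
    name: str
    architecture: str
    parameter_count: int
    human_data_ratio: float   # assumed: fraction of training data that is human language
    injection_method: str
    generation: int           # position in the generation lineage


@dataclass
class DatasetEntry:
    """Hypothetical record for a dataset listed in the registry."""
    name: str
    species: str
    media_type: str                        # e.g. "audio", "video", "transcription"
    prepared_by_generation: Optional[int]  # None means the data is raw
    feeds_generation: Optional[int]        # generation the data is intended to feed

    def is_raw(self) -> bool:
        # Raw data has not been prepared by any model generation.
        return self.prepared_by_generation is None


def feeds(dataset: DatasetEntry, model: ModelEntry) -> bool:
    """True if the dataset is declared as input for this model's generation."""
    return dataset.feeds_generation == model.generation


# Placeholder values for illustration only, not real registry entries.
example_model = ModelEntry(
    name="example-model",
    architecture="transformer",
    parameter_count=1_000_000,
    human_data_ratio=0.5,
    injection_method="example-method",
    generation=2,
)
example_dataset = DatasetEntry(
    name="example-dataset",
    species="sperm whale",
    media_type="audio",
    prepared_by_generation=1,
    feeds_generation=2,
)
```

Keeping both generation fields on the dataset makes the provenance chain explicit: a reader can see which model generation prepared the data and which generation it is meant to train.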
Instead of dropdown menus, you'll chat with an AI research clerk. It asks clarifying questions, categorises your submission, and checks it for completeness, so submitting stays free of friction.
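The completeness check could look something like the sketch below: given a category and the details collected so far, it returns the fields that are still missing, which the clerk could then ask about. The required-field lists and the function name are assumptions made for illustration, not the registry's actual rules.

```python
# Hypothetical required fields per submission category.
REQUIRED_FIELDS = {
    "model": ["name", "architecture", "size", "dataset_ratio",
              "injection_method", "generation"],
    "dataset": ["name", "species", "media_type", "provenance"],
    "tool": ["name", "supported_species", "source_url"],
}


def missing_fields(category: str, submission: dict) -> list:
    """Return the required fields that are absent or empty in a submission."""
    if category not in REQUIRED_FIELDS:
        raise ValueError(f"unknown category: {category!r}")
    return [
        field
        for field in REQUIRED_FIELDS[category]
        if submission.get(field) in (None, "")
    ]


# Example: a tool submission that has no source link yet.
draft = {"name": "example-annotator", "supported_species": ["sperm whale"]}
print(missing_fields("tool", draft))
```

A submission counts as complete when the returned list is empty; otherwise each missing field maps naturally to one follow-up question.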
Transformer-based pipeline that automatically detects, segments, and annotates sperm whale codas using the sperm whale phonetic alphabet. Runs on public datasets.
First generative model for dolphin vocalizations. Predicts and generates realistic whistles, clicks, and burst pulses. Trained on 40+ years of Atlantic spotted dolphin recordings.
Leave your email to be notified when the registry opens for submissions.