More ocean data has been collected in the last two years than in all previous years combined, and we are on track to keep breaking that record. More than ever, we need a solid foundation for processing this ceaseless stream of data. This is especially true for visual data, where ocean-going platforms are beginning to integrate multi-camera feeds for observation and navigation. Techniques for efficiently processing and using visual datasets with machine learning exist and continue to be transformative, but they have had limited success in the ocean world due to:
- Lack of dataset standardization;
- Sparse annotation tools for the wider oceanographic community; and
- Insufficient formatting of existing, expertly curated imagery for use by data scientists.
Building on the successes of the machine learning community, we are developing a public platform that makes use of existing (and future) expertly curated data. Our efforts will establish a new baseline dataset, optimized to directly accelerate the development of modern, intelligent, automated analysis of underwater visual data. This effort will ultimately enable scientists, explorers, policymakers, storytellers, and the public to know what’s in the ocean and where it is, in support of effective and responsible marine stewardship.