Project

Wandering Through Bird Image Archives

Copyright

Jessica Stringham

I usually work with code that creates images, which means I can easily explore the system using more code. Here, I wanted to see what I could do when exploring an existing set of images.

I collected two image archives: one on public domain images of birds in visual art, and another on my personal photographs of birds. Then I experimented with different ways to explore the data.

Image Archives

I assembled two bird image archives: my personal bird photographs and birds in visual culture.

Photos of the Birds I have met

Part of my birdwatching practice is trying to snap good photos of the birds I see. I've ended up with around 10,000 photos, most of which are not really of birds: they are out of focus, or focused on branches where a bird used to be.

Birds in Visual Culture

I assembled (public domain and permissively licensed) images of visual representations of birds from Wikimedia, Open Access at the Met, and the Biodiversity Heritage Library. What a delightful view of hundreds of years of depicting birds!

Copyright

public domain

Gathering the birds

This was mostly a data pipeline problem. After gathering the original images, I ran YOLOv8m over the datasets to draw bounding boxes around things it recognized as birds. I manually drew bounding boxes for some images, especially for the birds in art.

I was trying out the new AI agents quite a bit, and it was nice to be able to have them create custom interfaces to process and correct this data.

Choosing one winner from YOLO bounding box overlaps

Sometimes YOLO would draw boxes around the same bird multiple times. This overlap detector would show the intersecting boxes and let me choose which one to keep.
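
A minimal sketch of how overlapping detections might be flagged for review, assuming boxes are stored as (x_min, y_min, x_max, y_max) pixel tuples. The box layout, the `overlapping_pairs` helper, and the 0.5 threshold are illustrative assumptions, not the tool's actual code. It uses intersection-over-union (IoU), a common measure of how much two boxes overlap.

```python
from itertools import combinations

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes (0.0 when disjoint)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overlapping_pairs(boxes: list[Box], threshold: float = 0.5) -> list[tuple[int, int]]:
    """Index pairs whose IoU meets the threshold: candidates for manual review."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(boxes), 2)
        if iou(a, b) >= threshold
    ]


# Two boxes around the same (hypothetical) bird, plus one unrelated box.
boxes = [(10, 10, 110, 110), (20, 15, 115, 112), (300, 40, 380, 120)]
print(overlapping_pairs(boxes))  # [(0, 1)]
```

Each flagged pair could then be shown side by side so a person picks the winner, rather than automatically keeping the highest-confidence box.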

Removing backgrounds

I ran code to remove the background from each bird image, which sometimes worked great. Other times, I needed to correct it.

The background cleaning tools grew to include tools for drawing straight lines (good for capturing thin bird legs), and even something to make edges magnetic. I thought about switching to an established photo-editing tool, but vibecoding an interface straight into the web app worked fine and was convenient for the rest of the workflow.

Embeddings

I embedded the bird images using the image foundation models CLIP and BioCLIP (a model trained with the CLIP objective on photos of birds and other lifeforms). This gives each image a location in a space where semantically similar things sit near each other.

So, for example, if I embedded a list of bird names with BioCLIP and then did the same with these images of chipping sparrows, ideally each image would show up near its species name, and I would get a way to automatically label my images with the bird species. Nice!
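
The labeling idea above can be sketched as a nearest-label lookup by cosine similarity. The three-dimensional vectors below are made-up stand-ins for real embeddings (a real pipeline would get them from the model), and the `best_label` helper is a hypothetical name used only for illustration.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def best_label(image_vec: list[float], label_vecs: dict[str, list[float]]) -> str:
    """Return the label whose embedding is most similar to the image embedding."""
    return max(label_vecs, key=lambda name: cosine(image_vec, label_vecs[name]))


# Toy vectors standing in for text embeddings of bird names (assumed values).
label_vecs = {
    "chipping sparrow": [0.9, 0.1, 0.2],
    "Canada goose": [0.1, 0.9, 0.3],
}

# Toy vector standing in for the embedding of one photo (assumed values).
photo_vec = [0.8, 0.2, 0.1]

print(best_label(photo_vec, label_vecs))  # chipping sparrow
```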

But for my purposes, I'm less interested in labeling the photos accurately, and more interested in similar things being near each other.

For visualization, it helps to temporarily reduce the dimensions from CLIP's 512 numbers down to two dimensions. I use UMAP, which tries to keep similar things near each other, while spreading things out in a way that's easier to read.

A Nature Walk through an Embedding Space

Here are a few ways I visualized this space.

Snug

I wrote a physics simulator to nudge bird outlines around so they fit snugly near each other.

Copyright

Jessica Stringham + public domain images

To do this, I run the physics engine on outlines of the birds, which makes its own nice visualization. (The lines connect bird vertices to one another, and are mostly for show. The actual forces act between a corner and a line segment.)
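
A rough sketch of what a corner-to-segment force could look like: find the closest point on the segment to the corner, then push the corner away when it gets too close. The linear falloff, the `radius` and `strength` parameters, and the function names are all assumptions made for illustration; the simulator's real force law is not described in the text.

```python
import math

Point = tuple[float, float]


def closest_point_on_segment(p: Point, a: Point, b: Point) -> Point:
    """Closest point to p on the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a  # degenerate segment: a and b coincide
    # Project p onto the line, then clamp to the segment's ends.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (a[0] + t * dx, a[1] + t * dy)


def repulsion(p: Point, a: Point, b: Point,
              radius: float = 10.0, strength: float = 1.0) -> Point:
    """Force pushing corner p away from segment ab; zero at or beyond radius."""
    qx, qy = closest_point_on_segment(p, a, b)
    dx, dy = p[0] - qx, p[1] - qy
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist >= radius:
        return (0.0, 0.0)
    # Assumed linear falloff: strongest when touching, fading out at radius.
    scale = strength * (radius - dist) / dist
    return (dx * scale, dy * scale)


# A corner 4 units above the middle of a horizontal edge.
print(repulsion((5.0, 4.0), (0.0, 0.0), (10.0, 0.0)))  # (0.0, 6.0)
```

Summing such forces over every corner of one outline against every edge of its neighbors, then moving each outline a small step per frame, is one way outlines could settle against each other without overlapping.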

I also realized that since I had removed the background from the birds, I could also remove the birds from the background. I did this to my photographs and arranged them by the CLIP embedding of their backgrounds, which nicely grouped the sky, water, trees, and ground.

Beak Embeddings

Embedding using one of these semantic embeddings is fine, but you can use anything to embed. I tried one where I labeled the direction of the birds' beaks. I had AI write an interface that lets me draw the beak direction and label the top-down direction the beak is pointing.

Then, most naturally, I can arrange the birds on the surface of a sphere so their beaks point outward from (or inward toward) the center of the sphere.
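
One way to picture the sphere layout, as a sketch: if each bird's beak direction is available as a 3D vector, the bird's position is that direction scaled to the sphere's radius (or its negative, for beaks pointing inward). How the two labeled directions combine into a single 3D vector is not specified in the text, so the input here is assumed to already be a 3D direction, and `place_on_sphere` is a hypothetical helper.

```python
import math

Vec3 = tuple[float, float, float]


def place_on_sphere(beak_dir: Vec3, radius: float = 1.0, outward: bool = True) -> Vec3:
    """Position on a sphere centered at the origin for a bird with this beak direction.

    outward=True: the beak points away from the center, so position = +radius * dir.
    outward=False: the beak points at the center, so position = -radius * dir.
    """
    length = math.sqrt(sum(c * c for c in beak_dir))
    if length == 0:
        raise ValueError("beak direction must be non-zero")
    sign = 1.0 if outward else -1.0
    x, y, z = (sign * radius * c / length for c in beak_dir)
    return (x, y, z)


# A beak direction of length 7 placed on a sphere of radius 7.
print(place_on_sphere((2.0, 3.0, 6.0), radius=7.0))                 # (2.0, 3.0, 6.0)
print(place_on_sphere((2.0, 3.0, 6.0), radius=7.0, outward=False))  # (-2.0, -3.0, -6.0)
```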

Alternatively, I can navigate the dataset by showing birds whose beaks are pointing at the mouse.
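
One possible reading of this interaction, sketched under assumptions: each bird has a screen position and a 2D beak angle, and a bird is shown when the direction from the bird to the mouse falls within a small tolerance of its beak angle. The 15-degree tolerance, the angle convention (radians, measured with `atan2` in screen coordinates), and the `points_at` helper are illustrative choices, not details from the project.

```python
import math

Point = tuple[float, float]


def points_at(bird_xy: Point, beak_angle: float, mouse_xy: Point,
              tolerance: float = math.radians(15)) -> bool:
    """True if the beak angle is within tolerance of the direction to the mouse."""
    to_mouse = math.atan2(mouse_xy[1] - bird_xy[1], mouse_xy[0] - bird_xy[0])
    # Wrap the difference into [-pi, pi) so 350 degrees and 0 degrees count as close.
    diff = (beak_angle - to_mouse + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= tolerance


# Hypothetical birds: (screen position, beak angle in radians).
birds = {
    "facing_right": ((0.0, 0.0), 0.0),
    "facing_left": ((0.0, 0.0), math.pi),
}
mouse = (10.0, 1.0)

visible = [name for name, (xy, angle) in birds.items() if points_at(xy, angle, mouse)]
print(visible)  # ['facing_right']
```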

Silverspot

I also used Silverspot, the interface I previously developed for exploring generative systems.

(Since Silverspot picks points in space to generate from, I needed to use nearest neighbors on the image archive to find the closest image. That's why you end up with regions of geese!)
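
A small sketch of the nearest-neighbor step, assuming each archive image has a coordinate in the space and the interface hands over an arbitrary picked point. The file names and 2D coordinates are made up for illustration, and a brute-force search is used for clarity. It also shows why regions appear: every picked point closest to the same image returns that image.

```python
import math

Point = tuple[float, ...]


def nearest_image(point: Point, embeddings: dict[str, Point]) -> str:
    """Name of the archive image whose embedding is closest to the picked point."""
    return min(embeddings, key=lambda name: math.dist(point, embeddings[name]))


# Hypothetical archive: image name -> position in the (here 2D) space.
embeddings = {
    "goose_01.jpg": (0.0, 0.0),
    "sparrow_07.jpg": (5.0, 5.0),
}

# Points an interface might pick; the first two land in the goose's region.
picks = [(0.5, 0.2), (1.0, -0.5), (4.0, 4.5)]
print([nearest_image(p, embeddings) for p in picks])
# ['goose_01.jpg', 'goose_01.jpg', 'sparrow_07.jpg']
```

For a large archive, a spatial index or an approximate nearest-neighbor library would likely replace the linear scan.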

Bonus

As a bonus, here is something I made last year when I was first working with my bird photography, using custom symmetries and repetition commands. This also used a basic script to run YOLO on my photos, but I used Photoshop to remove the backgrounds and just kept the few birds where it worked best. It's fun to see how it changed when I picked it back up again!

Research Topics
#design #art