I close my eyes and hear a flock of birds
November 5, 2020 11:21 AM

BirdNET: automated bird identification based on audio recordings.

The system, developed by scientists at the Cornell Lab of Ornithology and Chemnitz University of Technology, is based on machine learning over soundscapes, trained to recognise "984 of the most common species of North America and Europe".

The website showcases an Android app, but on other systems the "Upload Recording" link should work in most browsers. After a successful upload, a clickable spectrogram is shown alongside the playable recording, with the list of candidate species displayed below. A spectrogram is like sheet music, but for all kinds of signals: it shows the presence and strength of "notes" at certain pitches (vertical axis) at a given time (horizontal axis).

What I find coolest is that you can drag across a time span in the spectrogram, which shows up as a "passage", and have just that section analysed. By default, results are shown for each 2.5-second chunk.
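The spectrogram axes and the 2.5-second chunking can be sketched in a few lines (a toy example using scipy, not BirdNET's actual code; the `chunk_audio` helper and the 4 kHz test tone are made up for illustration):

```python
import numpy as np
from scipy.signal import spectrogram

# Hypothetical helper: split a recording into fixed-length chunks,
# mirroring the 2.5-second analysis window described above.
def chunk_audio(samples, rate, chunk_seconds=2.5):
    step = int(rate * chunk_seconds)
    return [samples[i:i + step] for i in range(0, len(samples), step)]

rate = 22050
t = np.linspace(0, 5, 5 * rate, endpoint=False)
samples = np.sin(2 * np.pi * 4000 * t)  # stand-in for a 4 kHz bird tone

# freqs = the "pitch" (vertical) axis, times = the horizontal axis,
# power = the strength of each "note" at each moment
freqs, times, power = spectrogram(samples, fs=rate)

chunks = chunk_audio(samples, rate)
print(len(chunks))  # 5 s of audio -> 2 chunks of 2.5 s
```

In the `power` matrix, the row nearest 4 kHz dominates, which is exactly the bright horizontal band you would see in the website's spectrogram view.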

Personally, I became aware of BirdNET because I wanted to identify the invisible little birb singing very loudly in a nearby tree. It failed, because that African species isn't in the training data yet. However, the research [303-page PDF] and the tech behind it are really impressive. For European and North American species it should fare much better, although in the words of the primary researcher, "the lack of a 'gold standard' prevents a fully objective result estimation." (p. 167)

Also previously mentioned on Ask.
posted by runcifex (18 comments total) 25 users marked this as a favorite
 
Warning: fooled by mimics. BirdNet is about the only one who falls for a Blue Jay's imitation of a Red-tailed Hawk.
posted by Freyja at 11:42 AM on November 5, 2020 [4 favorites]


Birds are great! That is all.
posted by Going To Maine at 11:47 AM on November 5, 2020 [3 favorites]


I have wanted something like this for YEARS
posted by cheshyre at 11:59 AM on November 5, 2020


This is great and as someone who works on a bird reference app, I’ve been trying it out for a couple days. Was told a couple years back that “Shazam for Birds” is an elusive holy grail, due to a number of obstacles. Interested to read in the paper about what progress has been made.
posted by johngoren at 12:01 PM on November 5, 2020


Birds are great! That is all.
posted by Going To Maine


Preach!

Also, one I've been using for a while now. Merlin Bird ID.
posted by Splunge at 12:03 PM on November 5, 2020 [2 favorites]


> Warning: fooled by mimics.
I really hope more data for worldwide species will become available in the future, although that is certainly going to be very costly. Even better, if high-quality data for mimics can be obtained (i.e. labelled with the correct species, preferably with "THIS PASSAGE IS MIMICRY" tags), it will be really interesting to see how the model performs. But again, that kind of data is even harder to secure.

Near here in the residential green space, there are four regularly seen mimic species. Possibly mimicking each other. It can get really chaotic.

(Metafilter: Fooled by mimics.)
posted by runcifex at 12:07 PM on November 5, 2020


Warning: fooled by mimics

I dunno - it was mostly confident about this recording of a Northern Mockingbird I made back in April. Unless it's reading the mp3 tags … What I found neat was it hazarded a guess at the species the birb was trying to be. But I guess the vocal tract of a mockingbird can't quite “do” other birds properly.

I must go and feed it the famous 19th May 1942 recording and see what it makes of it, even though that one always makes me tear up.

(and yeah, if you've never heard a mockingbird, this one's quite a performer. The car alarm was a nice touch, I thought.)
posted by scruss at 12:23 PM on November 5, 2020 [2 favorites]


Warning: fooled by mimics. BirdNet is about the only one who falls for a Blue Jay's imitation of a Red-tailed Hawk.

Ah! I have way too many ML stories from my short time not making money at a failed ML startup. A lot of people try to trick ML classification in entirely the wrong ways. It came up in a meeting where we were trying to identify beer based on its packaging, and luckily all consumer brands have intentionally recognizable logos no matter the color scheme or the environment. Perfect for computer vision. Stores run promotions by geography that print NFL-branded beer packaging, so Pittsburgh's market might have Steelers-branded Bud Light boxes, but the Bud Light logo is so recognizable you can identify it while ignoring the color values.

Somehow this led to a woman who was involved with a luxury brand company, wanted to know how to spot fakes, and was willing to pay a lot of money to do so. For simplicity's sake: she was selling white t-shirts that were Givenchy or some other name brand, $800 instead of $8, and not surprisingly they had a problem with fakes. I had the most surreal, Don DeLillo, people-taking-pictures-of-pictures-of-a-barn conversation. I had to explain that ML models can detect patterns we might not recognize as people, but they're not magic: if there's no information encoded in the picture, there's nothing to differentiate. A good example: for lighter-skinned faces you can tell deepfakes because our skin tends to blush as we speak in a way that corresponds with our heart rate. It's really subtle, and you can only tell visually by contrasting the right colors. It's not like reading a heart rate; it's just noticing the skin tone change, ever so slightly, in correspondence with it.

She was insistent that I could somehow pick an expensive white t-shirt out from another white t-shirt, without seeing the label. I asked how their current fraud detection system worked: basically, did they have people going to Chinatown picking out fake white t-shirts? Could you put a fake white t-shirt and a Givenchy white t-shirt next to each other and have someone trained pick out the fake? Maybe the stitching was different; I would assume that regardless of price there might be irregularities among manufacturers, or even manufacturing runs, that could be used to visually identify the difference. There were none. So I explained that in the art world there's the common concept of provenance: some imitations are so good that you can only really rely on provenance, and even then it's only as good as the word and research behind it. There's a certain amount of trust, and any uncertainty is factored into the price.

I think she thought that putting a higher price tag on an arbitrary item somehow transformed the composition of that item, in some sort of capitalistic alchemy. She could not get over the fact that the luxury t-shirt could not be distinguished from a regular t-shirt, which I think might've been the designer's point? I guess she eventually found someone who could do it, or who could at least convince her that their program could. So, fucking birds and their imitations!
posted by geoff. at 12:51 PM on November 5, 2020 [2 favorites]


I use Chirp-o-matic for iPhone and it has done well at my location.
posted by terrapin at 1:59 PM on November 5, 2020


I used BirdNET a lot this spring and found it to be a fun and useful little widget. I am really shit at remembering bird calls; it's just information that doesn't slot right into my brain for some reason. One early evening I had it on as I was listening and it picked up the sound of a scarlet tanager, and I was able to trace the call to the tree it was sitting in, though I was only able to see it from quite a distance. First time I'd ever seen one!
posted by drlith at 7:36 PM on November 5, 2020


Hey, I work with these guys!

We recently finished hosting a Kaggle competition together. I can talk about this stuff endlessly, so, uh, feel free to AMA.
posted by kaibutsu at 8:12 PM on November 5, 2020 [2 favorites]


How do you deal with mimics? Is there a way, even in a controlled setting, to detect mimics with your methodology?
posted by geoff. at 8:22 PM on November 5, 2020


The training is pretty much exactly what the Kaggle competition involved: it's mainly on Xeno-Canto, which includes some labeled mimic calls but is by no means exhaustive. The problem space is hard for a whole host of reasons. There's also geographic variance within species, and for some species flock-level variance as well, which are also tough to deal with. The app basically just comes up with a score, and has a threshold for saying it's not sure.
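The score-plus-threshold decision described above might look something like this (a minimal sketch, not the app's actual code; the threshold value, species names, and scores are made up):

```python
# Illustrative only: pick the top-scoring species, but fall back to
# "not sure" when even the best score is below the threshold.
def label_with_threshold(scores, threshold=0.5):
    """Return the top species if its score clears the threshold."""
    species, score = max(scores.items(), key=lambda kv: kv[1])
    return species if score >= threshold else "not sure"

print(label_with_threshold({"Blue Jay": 0.82, "Red-tailed Hawk": 0.11}))
# -> Blue Jay
print(label_with_threshold({"Blue Jay": 0.34, "Red-tailed Hawk": 0.31}))
# -> not sure
```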

For the work with scientists at the Cal Academy, we annotate some field data and check the model performance on the actual birds in the area we want to understand, and then go about figuring out what kinds of thresholds for detections to set, and figuring out a policy for if/when we want to investigate specific calls. They're interested in 'occupancy' over pretty long periods of time (ie, does this species appear at all?) for modeling ecosystem makeup and health, so surfacing just the strongest examples for human evaluation is a great place to start. A partial picture (eg, we are explicitly ignoring these species that the model sucks at) is also better than nothing.

And there are still a lot of species for which the models just suck in one way or another, TBH. (Thus the Kaggle competition.)
posted by kaibutsu at 8:57 PM on November 5, 2020 [3 favorites]


Hello kaibutsu, that's impressive! I wonder how the model distinguishes between "not sure about which species among the catalogued species" vs. "not sure if this is among the catalogued species at all"? My experience with an uncatalogued bird is that it would even assign a species to it with probability 1 (whatever that means), which was way greater than the distant second. It didn't "seem" confused when it should have been.
posted by runcifex at 11:47 PM on November 5, 2020


Yeah, handling (and registering) uncertainty is definitely a big problem. For training we use data augmentation, adding noises from other sources to help the models learn what's a bird and what's not. For work in a specific geographic area, I typically train on the entire possible species list for the area of interest, which helps cut down the problem of uncatalogued species. For the BirdNET app, Stefan has also done a lot of work incorporating recording metadata to help narrow things down.
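The noise-mixing part of that augmentation can be sketched like so (an illustrative toy, not the real training pipeline; the `add_noise` helper, the test tone, and the Gaussian "background" are all stand-ins):

```python
import numpy as np

# Hypothetical augmentation step: mix background noise into a training
# clip at a chosen signal-to-noise ratio, so the model sees bird calls
# against varied non-bird sound.
def add_noise(clip, noise, snr_db):
    noise = noise[:len(clip)]
    clip_power = np.mean(clip ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so the clip-to-noise power ratio hits the target SNR.
    scale = np.sqrt(clip_power / (noise_power * 10 ** (snr_db / 10)))
    return clip + scale * noise

rng = np.random.default_rng(0)
clip = np.sin(2 * np.pi * 4000 * np.arange(22050) / 22050)  # 1 s test tone
noise = rng.normal(size=22050)                              # stand-in for wind/traffic
augmented = add_noise(clip, noise, snr_db=10)
```

Training on many such mixes, at varying SNRs and with varying noise sources, is one common way to teach a model "what's a bird and what's not."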

(fun fact: there's a north american red tailed hawk and a south american red tailed hawk that are baaaaasically the same bird, but with non-overlapping geographies, and considered different species. but really, geolocation is the only way to tell them apart.)

Finally, one of the really nice ideas to come out of the Kaggle competition is using different model outputs to 'vote' for a particular species, and only accept the label if the votes line up. Eg, make a guess for a particular 5-second segment, and another guess for the 30-second segment that it sits in. If both guesses say that bird X is present, accept the label, otherwise reject. I haven't worked this into any of my own models just yet, but I'm looking forward to seeing how it helps.
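That voting rule reduces to a few lines (a toy sketch of the idea, not competition code; the species names and guesses are stand-ins for model output):

```python
# Accept a 5-second segment's label only when it agrees with the guess
# for the surrounding 30-second window; otherwise reject (None).
def vote(segment_guesses, window_guess):
    """Keep each short-window label only if the long-window guess matches."""
    return [g if g == window_guess else None for g in segment_guesses]

segment_guesses = ["Blue Jay", "Blue Jay", "Red-tailed Hawk"]
print(vote(segment_guesses, "Blue Jay"))
# -> ['Blue Jay', 'Blue Jay', None]
```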
posted by kaibutsu at 12:21 AM on November 6, 2020 [1 favorite]


This reminds me that I'm crossing my fingers for a Norwegian localization of Wingspan (a board game that looks absolutely stunning).
posted by Harald74 at 4:07 AM on November 6, 2020 [1 favorite]


I wrote an email to the company handling the localized versions of the game and got a reply in 9 minutes (!) telling me that a Norwegian version is coming in Q2 2021!
posted by Harald74 at 4:24 AM on November 6, 2020 [1 favorite]


I love bird.net, been using it a lot this year after moving to a forested area and wondering what on earth all the bird calls were. Really fantastic tool!
posted by chrispy at 4:29 AM on November 6, 2020




This thread has been archived and is closed to new comments