the slashdot crowd is discussing [sic] tuneprint. one of the creators speaks up to improve the signal-to-noise ratio:

“The general idea is pretty simple. We take the input audio. We condition it (adjust it to a known sampling rate and volume.) We pass it through the psychoacoustic model (it’s about a notch more complicated than what you’d see in a mp3 encoder, which ain’t saying much. This is all stuff that was mostly hashed out decades ago.) This model effectively strips the parts of the sound you can’t hear — the desired result being that even if the audio has been compressed or manipulated subaudibly, the
result is still the same. Okay, so the net result of all of this is a vector that covers a very small segment (fraction of a second) of audio. We stack several of these vectors (possibly separated in time by a bit) side-by-side to get a big vector. Then we do completely boring and standard and well-understood statistical and pattern-matching stuff on the vector to make it smaller and more palatable for the server — think of it as lossy compression. Then it goes off to the server. The server is about equal in
complexity to a text search engine. (I say this fully realizing that I have only a vague impression how Google works. It’s certainly a lot more complicated than the obvious hash-table-of-sorted-lists stuff.) It finds the database vector that’s the best match in a fairly boring but efficient way. (No, it does not involve searching through all tracks one by one, no more than Altavista searches through all web pages one by one every time you want to find some porn.) Call the result a submatch. Back at the client, the whole process is repeated a bunch more times, generating a stream of submatches (“Radiohead offset 0.. Radiohead offset 1024 or 16384.. Slashdot’s Gr34test Hits 5262324.. Radiohead offset 3072..”) from the input audio stream. Then, the client looks at the submatches and tries to figure out what the input audio was and where the song boundaries are (did somebody really stick in a sample from Slashdot’s Gr34test Hits, or was that just an unlucky match?)

See? Not magic. It’s a challenging problem, but not an impossible problem. The reason that this doesn’t exist right now is not that generations of scientists have tried and failed, but rather that people didn’t care too much until lately and nobody’s gotten off their ass and done anything about it yet. I like big but approachable problems, which is one of the reasons I’m excited about this.

FOR ALL OF YOU WHO FELL ASLEEP THROUGH THAT: YOU CANNOT ADD AN INAUDIBLE TONE TO THE MUSIC AND BREAK TUNEPRINT. THE FINGERPRINT IS BASED ON THE LARGE-SCALE PSYCHOACOUSTIC FEATURES OF THE MUSIC. IF MP3 ENCODERS CAN DO IT, SO CAN WE. Maybe not perfectly, but enough to have a fighting chance. THAT’S THE WHOLE POINT HERE. ”

so take that, naysayers.
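
for the curious, here's a rough sketch of that last client-side step, where the stream of submatches gets boiled down to a guess. this is not tuneprint's actual code. the `identify` function, the 1024-sample hop between queries, and the vote threshold are all my own assumptions, loosely based on the offsets in the quote. the idea: if the audio really is one song, each submatch offset should advance by one hop per query, so subtracting the expected advance gives the same start position every time, and an unlucky match stands out as a lone vote.

```python
from collections import Counter

HOP = 1024  # assumed spacing between consecutive queries, in samples


def identify(submatches, hop=HOP, min_votes=2):
    """Pick the track whose submatch offsets line up over time.

    submatches is a list of (track, offset) pairs in query order.
    Returns (track, start, votes), or None if nothing is consistent enough.
    """
    votes = Counter()
    for i, (track, offset) in enumerate(submatches):
        # For a genuine match, offset - i * hop is the same for every query.
        votes[(track, offset - i * hop)] += 1
    if not votes:
        return None
    (track, start), count = votes.most_common(1)[0]
    if count < min_votes:
        return None
    return track, start, count


# hypothetical stream, simplified from the example in the quote
stream = [
    ("Radiohead", 0),
    ("Radiohead", 1024),
    ("Slashdot's Gr34test Hits", 5262324),
    ("Radiohead", 3072),
]
print(identify(stream))
```

the stray slashdot hit gets a single vote and loses. a real client would also have to find song boundaries, which this sketch ignores entirely.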

so a funny thing happened on my way through a davenet piece. i stumbled upon tuneprint. and it made my head spin:

“Tuneprint is an audio fingerprinting algorithm. It takes the unique ‘fingerprint’ of a sound clip, which can then be compared to a fingerprint database to get more information about the clip, like title and artist, lyrics, URLs, related music, copyright status, or almost anything else. The fingerprint doesn’t change even if the sound is compressed, converted to a different file format, broadcast over the radio, and so on.

Artists: You can use it to stop people from putting their name on your band’s mp3’s and distributing them as their own, or you can use it to embed lyrics, links to your homepage, and stupid banner ads in mp3’s.

Haxxors: You can use it to stop warez kiddies from uploading copyrighted mp3’s to your webserver, or you can use it to build the ultimate mp3 search engine.

Terrorists: You can use it as the foundation of an international fascist copy protection enforcement network, or you can use it to automatically rip, separate, categorize, and save to disk all songs played on all radio stations everywhere!!”

made a few enhancements to ol’ virtual homestead today. amazing what you’ll get done when you’re trying to avoid breaking up a concrete sidewalk [don’t ask]. mostly minor color and css changes, but you’ll also notice that you can now search via google. the best general purpose search engine on the web now lets you easily set up the option to search a specified set of domains or the whole web. i’m leaving the old atomz search box up because, although it’s harder to get relevant site-specific searches, atomz indexes my site more frequently – but i don’t like having two search boxes since it introduces clutter, so it’ll probably come down sooner rather than later. if you’re keeping score at home, that would be: relevancy 1, frequency 0.

duck! incoming entries to the annotated bookmark bin. i’ve had some rss/rdf links piling up and it’s time to shovel them somewhere where there is at least a tiny chance i’ll happen upon them. i imagine most of these are snarfed from you know who, but i can’t really remember for sure. first, a semi-interesting thread on definitions, then an older, but still gooder bit on what rdf is good for from tim bray and finally a more recent piece on xml and rdf as enablers for the so-called “semantic web”.

zeldman examines the inevitability of the natural life cycle of mailing lists on dreamless:

“Everyone loves online communities, until they start hating them. Communities always evolve and change, and as they do, the earliest members begin complaining.

A few months back, half the comments at metafilter.com seemed to be about how metafilter had changed.

Before Dreamless was two weeks old, one or two people were worrying that it might go downhill.

Yesterday Astoundingweb.org got its first (well written and intelligent) complaint that the community was no longer fulfilling its mission.

Online communities are always changing. That is their nature. Is this change always negative? Or is it something we should expect and learn to appreciate?”

The usability lifecycle by Jakob Nielsen:

“Doing things right will only add a few percent to the cost of a development project. You will save many times this cost by not having to make expensive adjustments and dot releases. Plus, the resulting user interface will probably be around 50% to 100% easier to use, reducing training budgets dramatically and increasing user productivity. If you happen to be running an e-commerce site, you will have the sweetest gains of all: customers will finally start making purchases now that they can find what they want on the site. Rule #1 of e-commerce: if you can’t find it, you can’t buy it.”


[words and images via mersault*thinking]

{ intertwingled since 2000 }