i found metacrap on the always informative webvoice [ which incidentally is where i found the taxonomies post yesterday ]. i know cory doctorow is a Smart Man and although there’s a kernel of truth in much of metacrap, most of it is “trivially obvious” [ as one of my logic professors used to be fond of repeating over and over. and over. ]. after a long rant about how arbitrary most forms of metadata ultimately can be, cory concludes that, by gosh:
“Metadata can be quite useful, if taken with a sufficiently large pinch of salt. The meta-utopia will never come into being, but metadata is often a good means of making rough assumptions about the information that floats through the Internet.
“Certain kinds of implicit metadata is awfully useful, in fact. Google exploits metadata about the structure of the World Wide Web: by examining the number of links pointing at a page (and the number of links pointing at each linker), Google can derive statistics about the number of Web-authors who believe that that page is important enough to link to, and hence make extremely reliable guesses about how reputable the information on that page is.
“This sort of observational metadata is far more reliable than the stuff that human beings create for the purposes of having their documents found. It cuts through the marketing bullshit, the self-delusion, and the vocabulary collisions.”
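to make the link-counting idea concrete, here is a toy sketch in python. it is not google's actual algorithm, just the two things the quote mentions: count the links pointing at a page, and give each linker extra weight for the links pointing at it. the `links` graph, the page names and the scoring rule are all made up for illustration.

```python
# hypothetical toy web: page -> pages it links to
links = {
    "a": ["c"],
    "b": ["c"],
    "c": ["d"],
    "d": [],
    "e": ["d"],
}


def inbound(links):
    """Map each page to the set of pages that link to it."""
    incoming = {}
    for source, targets in links.items():
        for target in targets:
            if target != source:  # a page linking to itself proves nothing
                incoming.setdefault(target, set()).add(source)
    return incoming


def score(links):
    """Assumed scoring rule: one point per linker, plus one point
    for every page that links to that linker."""
    incoming = inbound(links)
    pages = set(links) | set(incoming)
    return {
        page: sum(
            1 + len(incoming.get(linker, set()))
            for linker in incoming.get(page, set())
        )
        for page in pages
    }


scores = score(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, scores[page])
```

"c" and "d" each have two inbound links, but "d" comes out ahead because one of its linkers is itself linked to. note that nobody wrote a keyword or a category anywhere; the numbers fall out of who linked to whom.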
so human categorization can be capricious and filled with uncertainty. this has been a well observed fact since aristotle and the scholastics decided it would be big fun to walk around and bin everything into not-so-tidy categories that people have been debating since about five minutes after they started saying that’s a dog and that’s a platypus.
speaking of platypuses, programmers have been dealing with the problem of Encapsulation, Inheritance and the Platypus effect as well:
“A number of programmers have described their class hierarchies as being ‘brittle’. Class hierarchies are often used to represent taxonomies. In the ‘real world’, the term ‘taxonomy’ refers to the system of biological classification into phyla, genus, species and so forth. In the software domain, the term is sometimes used to refer to a hierarchical categorization of a diverse set of objects. An example would be the various flavors of widgets in a modern GUI environment.
However, real object collections aren’t always hierarchical.”
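a minimal python sketch of what that brittleness can look like. the class names, traits and helper below are my own illustration, not taken from the quoted article: a strict tree forces the platypus under one parent and patches the mismatch with an override, while a flat set of traits lets it sit in several bins at once.

```python
# a strict tree: every animal gets exactly one parent
class Animal:
    lays_eggs = False
    nurses_young = False


class Mammal(Animal):
    nurses_young = True


class Bird(Animal):
    lays_eggs = True


class Platypus(Mammal):
    # the tree has no natural slot for an egg-laying mammal,
    # so the exception gets bolted on by hand
    lays_eggs = True


# an alternative, assumed for illustration: flat traits, no tree
TRAITS = {
    "dog": {"nurses_young", "fur"},
    "duck": {"lays_eggs", "feathers", "bill"},
    "platypus": {"nurses_young", "fur", "lays_eggs", "bill"},
}


def having(trait):
    """Return the names of all animals that carry the given trait."""
    return sorted(name for name, traits in TRAITS.items() if trait in traits)


print(having("lays_eggs"))
print(having("nurses_young"))
```

either way, somebody still had to decide which traits were worth writing down.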
yup. more than laziness and hubris – things are complicated. and it’s certainly a craw in metadata’s jaw. and yet – it moves [ with apologies to galileo ], as yahoo and dmoz prove. there is, in certain cases, value in a good ol’ fashioned human intervention.
in fact, the author seems to be arguing that automated “implicit metadata” is the way to go. but who decided that there was a complicated and reliable relationship between linking and relevancy?
a warm body.
of course, cory has an agenda to convince you that things like openfolders need to be done. and that’s fine. i hear openfolders is superfine and i look forward to playing with it. my only point is that this isn’t the first time in the history of computing that somebody has claimed we’d all be better off if humans weren’t involved to muck things up because we’re just too damn arbitrary and illogical.
and it won’t be the last.