First of all, I know Clay Shirky, and he's a good fellow. But he's simply wrong in his claim that "tagging" (of the flavor that is appearing on del.icio.us -- what I call "social tagging") is inherently better than the use of formal ontologies. Clay favors the tagging approach because it is bottom-up and emergent in nature, and he argues against ontologies because pre-specification cannot anticipate the future. But this is a simplistic view of both approaches. One could just as easily argue against tagging systems because they don't anticipate the future -- they are shortsighted, now-oriented systems that fail to capture the "big picture" or to optimally organize resources for the long term. Their saving grace is that over time they do (hopefully) self-organize and prune out the chaff, but that depends on both the level of participation and the quality of that participation.
Tagging is certainly useful -- and indeed collaborative authoring, editing and filtering are powerful paradigms -- but folksonomies (at least present-day ones) suffer from having too little formal structure -- tagging systems easily result in "metadata soup." Ontologies are on the other end of the spectrum -- they are particularly useful for accurately modeling the actual structure of the world, or of conceptual domains -- but admittedly in some cases their formal structure can be overly rigid and specific. The benefit of tagging is primarily the adaptive nature of the resulting taxonomies. The benefit of ontologies is the rich, and unambiguous, semantics they define. Tagging systems are useful when all that is needed is the ability to link items to topics; ontologies are useful when what is needed is to rigorously define or understand what is meant, or not meant, by particular classes, fields and relationships -- something that is essential for good machine-processing of data.
One point that Clay makes, which I think is very interesting, is his
view that perhaps the world is moving from a graph-theory information
model (ontologies) to a set-theory model (folksonomies) -- but in
fact, under the surface this argument falls apart. OWL is nothing other
than a language for enabling extremely sophisticated set-theoretic
operations on information. In fact, if you actually look at the OWL
language itself, it is composed primarily of set-theoretic statements.
I don't really view graph-theory and set-theory as mutually exclusive
-- in fact, they are highly connected, if not equivalent at a deep
level. But expressing information in graph form or set form does have
different benefits for certain types of information processing. In
particular, graphs can be beneficial when associative reasoning is
important -- for example, when traversing links or networks between
nodes is key. Sets on the other hand are useful when relevance or
mutual membership are most important.
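To make the distinction a little more concrete, here is a minimal Python sketch. Everything in it is invented for illustration (the animals, the links, the `reachable` helper), and it is not how any OWL reasoner is actually implemented. The first half treats classes as plain sets of members, which is roughly how OWL constructs such as owl:intersectionOf, owl:unionOf, owl:complementOf and rdfs:subClassOf are interpreted. The second half treats similar information as a graph and walks its links.

```python
from collections import deque

# Set view: each class (or tag) is simply the set of things that belong to it.
# All names here are hypothetical examples.
mammals = {"dog", "cat", "whale"}
swimmers = {"whale", "salmon", "duck"}
animals = mammals | swimmers | {"sparrow"}

swimming_mammals = mammals & swimmers      # comparable to owl:intersectionOf
mammals_or_swimmers = mammals | swimmers   # comparable to owl:unionOf
non_mammals = animals - mammals            # comparable to owl:complementOf, relative to animals
is_subclass = mammals <= animals           # comparable to rdfs:subClassOf

# Graph view: nodes joined by links, where what matters is what you can reach.
links = {
    "dog": ["mammal"],
    "whale": ["mammal", "ocean"],
    "mammal": ["animal"],
    "ocean": ["water"],
    "animal": [],
    "water": [],
}


def reachable(start, graph):
    """Return every node reachable from start by following links (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)
    return seen
```

The set operations answer membership questions ("what is both a mammal and a swimmer?"), while `reachable("whale", links)` answers an associative one ("what is whale connected to, directly or indirectly?"). Both are views onto the same kind of data, which is the point: the two models are not mutually exclusive.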
Clay discounts ontologies for many reasons. He has many arguments, most of which have some merit, but fall short of convincing me (or anyone in the field of knowledge representation). Indeed, tagging systems are just special, highly simplistic cases of ontologies -- namely, they are ontologies with extremely basic semantics and almost no constraints -- they are even lower on the spectrum than taxonomies. In fact, we could graph the spectrum of knowledge management as follows:
<--------------------------------------------------------------->
Tags      Folders      Taxonomies      Databases      Ontologies
One of Clay's early arguments against ontologies was that they are merely systems for syllogistic logic -- but in fact, that is simply not the case. While the formal semantics of OWL does support logical inferencing and reasoning, that is not the only value of ontologies. In fact, I think a much more important benefit of ontologies is simply that they make the semantics of data structures explicit -- which makes it much easier both to process information and to integrate it across different applications and representations. Ontologies are, in my opinion, simply the next evolution of database schemas. Surely, Clay would not argue that database schemas have no place in the world!
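As a rough illustration of what "explicit semantics" buys you for integration, consider the Python sketch below. It is only a toy under stated assumptions: the two applications, their field names, and the "ex:" property names are all hypothetical, and a plain dictionary stands in for what a real ontology would declare.

```python
# Two hypothetical applications store the same fact under different field names.
crm_record = {"fname": "Ada", "surname": "Lovelace", "mail": "ada@example.org"}
blog_record = {"first": "Ada", "last": "Lovelace", "email": "ada@example.org"}

# A tiny stand-in for an ontology: each local field is declared to mean
# a shared property. The "ex:" names are invented for this sketch.
FIELD_MEANINGS = {
    "crm": {"fname": "ex:givenName", "surname": "ex:familyName", "mail": "ex:email"},
    "blog": {"first": "ex:givenName", "last": "ex:familyName", "email": "ex:email"},
}


def to_shared_terms(source, record):
    """Rewrite a record's keys into the shared vocabulary declared for its source."""
    mapping = FIELD_MEANINGS[source]
    return {mapping[field]: value for field, value in record.items()}


# Once both records are expressed in shared terms, they can be compared directly.
same_person = to_shared_terms("crm", crm_record) == to_shared_terms("blog", blog_record)
```

Without the declared meanings, a program has no way to know that "mail" and "email" refer to the same thing; with them, the comparison is trivial. Tags alone do not carry that kind of information.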
Another way of looking at ontologies and the semantic web is that they do for the meaning of data what other markup languages have done for the layout and structure of data. HTML provided a way to mark up the formatting of content. XML provided a way to mark up the structure of content. RDF and OWL provide a way to mark up the meaning of information. This is a logical progression, and it is something that will really make the Web, desktop and enterprise easier to cope with. Ontologies are not panaceas -- but they are incredibly powerful when used appropriately. And that is the operative word -- they are not for everything. Indeed, in cases where social tagging is sufficient, ontologies may simply be overkill. But there are many, many cases where social tagging simply does not, and cannot, have the semantic rigour that is needed.
So what's next? I think that ultimately we will see a synthesis of these two approaches emerge. Imagine a folksonomy combined with an ontology -- a "folktology." In a folktology, users could instantly propose or modify ontological classes and properties in the same manner that they do with tags in tagging systems. The most popular ontological constructs (the most-instantiated classes, or slots on classes, for example) would "rise to the top" and self-amplify, while the less-instantiated ones would "fall to the bottom" over time. In this way a self-organizing and self-pruning ontology could take shape within a community. Such a system would have the ease and adaptability of a folksonomy plus the semantic richness and formal structure of an ontology. I think ultimately a folktology approach will be better than either folksonomies or ontologies on their own.
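Here is one way the core mechanic of a folktology might be sketched in Python. This is purely a thought experiment: the `Folktology` class, its method names, and the pruning threshold are all assumptions of mine, not a description of any existing system. Anyone can propose a class or a property just by using it, usage counts decide what rises to the top, and rarely used classes are pruned.

```python
from collections import Counter


class Folktology:
    """Toy model: anyone can propose a class or property; usage decides what survives."""

    def __init__(self):
        self.properties = {}           # class name -> set of proposed property names
        self.instances = Counter()     # class name -> number of instances created
        self.property_use = Counter()  # (class, property) -> times that property was filled in

    def propose_class(self, name):
        self.properties.setdefault(name, set())

    def propose_property(self, cls, prop):
        self.propose_class(cls)
        self.properties[cls].add(prop)

    def instantiate(self, cls, **values):
        """Create an instance; unknown classes and properties are proposed on the fly."""
        self.propose_class(cls)
        self.instances[cls] += 1
        for prop in values:
            self.propose_property(cls, prop)
            self.property_use[(cls, prop)] += 1

    def top_classes(self, n=3):
        """The most-instantiated classes 'rise to the top'."""
        return [cls for cls, _ in self.instances.most_common(n)]

    def top_properties(self, cls, n=3):
        """The most-used properties (slots) of one class."""
        counts = Counter({p: c for (k, p), c in self.property_use.items() if k == cls})
        return [p for p, _ in counts.most_common(n)]

    def prune(self, min_instances=2):
        """Drop classes that attracted fewer than min_instances instances."""
        for cls in list(self.properties):
            if self.instances[cls] < min_instances:
                del self.properties[cls]
                del self.instances[cls]
                for key in [k for k in self.property_use if k[0] == cls]:
                    del self.property_use[key]


# Example usage with made-up data.
ft = Folktology()
ft.instantiate("Book", title="Ulysses", author="James Joyce")
ft.instantiate("Book", title="Dubliners")
ft.instantiate("Person", name="James Joyce")
ft.instantiate("Person", name="Nora Barnacle")
ft.instantiate("Person", name="Ada Lovelace")
ft.instantiate("Gadget", name="Widget")
ft.prune(min_instances=2)
```

After pruning, "Gadget" is gone because only one person ever used it, while "Person" and "Book" remain, each with the properties the community actually filled in. A real system would of course need much more (merging near-duplicate classes, handling property types, decay over time), but the basic self-amplifying, self-pruning loop is this simple.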