
December 10, 2003

From the Metaweb to the Semantic Web: A Roadmap

In previous articles on this Weblog I have suggested that we name the new evolution of the Web that is emerging from the confluence of Weblogging and RSS "The Metaweb." The Metaweb is a metadata-driven Web of microcontent. We can see it emerging and chart its growth by looking at Technorati and Daypop, for example. The Metaweb is happening today; it is real. You are browsing it now by reading this page.


I believe that the Metaweb is the first step in the evolution of the coming Semantic Web. The Semantic Web is a Web of ontologically-defined information. Ontologies are formal systems of concepts that can be used to rigorously define what things mean and how they relate. So for example, an ontology about cameras would define basic concepts about cameras such as "lens," "viewfinder," "film," "tripod," "zoom lens," "shutter speed," etc.

By linking content about cameras to the appropriate definitions in the camera ontology, it then becomes possible for software to do a better job of understanding what the content means. That's the first goal of the Semantic Web -- simply adding more semantics to information so that it can be understood better by machines (and people). This can be done today.
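
As a rough illustration of that first goal, here is a minimal Python sketch. The concept identifiers (such as "camera:ZoomLens"), the tiny ontology, and the link_concepts helper are all hypothetical, invented for this example; a real system would use a proper ontology language and far better matching than substring search.

```python
# Hypothetical mini-ontology: concept id -> label and broader concept.
# None of these identifiers come from a real published ontology.
CAMERA_ONTOLOGY = {
    "camera:Lens": {"label": "lens", "broader": "camera:Component"},
    "camera:ZoomLens": {"label": "zoom lens", "broader": "camera:Lens"},
    "camera:Viewfinder": {"label": "viewfinder", "broader": "camera:Component"},
    "camera:Tripod": {"label": "tripod", "broader": "camera:Accessory"},
}


def link_concepts(text, ontology):
    """Return the ids of concepts whose label appears in the text.

    Naive substring matching, so "zoom lens" also matches "lens".
    """
    lowered = text.lower()
    return sorted(cid for cid, entry in ontology.items() if entry["label"] in lowered)


post = "Review: this zoom lens pairs well with a sturdy tripod."
links = link_concepts(post, CAMERA_ONTOLOGY)
print(links)
```

Once a post carries links like these, software no longer has to guess what the word "lens" refers to: it can look up the linked definition instead.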

The second goal of the Semantic Web is to enable software to think more intelligently about information by providing a formal means to express and derive abstract logical relationships, inferences and proofs, and arbitrary formal statements about information. This can be done today too, but to do it well requires artificial intelligence. The first goal of the Semantic Web -- semantic metadata -- is near-term; the second goal -- intelligent information processing -- is long-term. The point of this article is that the Metaweb is the first step towards achieving both these goals.
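
To make the idea of deriving relationships a little more concrete, here is a small Python sketch of the simplest possible inference: following "is a kind of" links to conclude facts that were never stated directly. The concept names and the ancestors function are assumptions made up for this example, and real Semantic Web reasoning goes far beyond this.

```python
# Hypothetical "is a kind of" facts: concept -> its direct parent concept.
BROADER = {
    "ZoomLens": "Lens",
    "Lens": "CameraComponent",
    "CameraComponent": "PhysicalObject",
}


def ancestors(concept, broader):
    """Derive every concept that `concept` is a kind of by following parent links."""
    found = []
    seen = {concept}  # guards against accidental cycles in the facts
    current = broader.get(concept)
    while current is not None and current not in seen:
        found.append(current)
        seen.add(current)
        current = broader.get(current)
    return found


derived = ancestors("ZoomLens", BROADER)
print(derived)
```

Nothing in the facts says that a zoom lens is a physical object, yet the software can conclude it. Doing this kind of derivation well, at scale and over messy data, is the long-term part.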

It all starts with RSS, in my opinion.

RSS is a metadata format for publishing and subscribing to microcontent objects, the units of the Metaweb. RSS (in various flavors and soon in Atom, a new open standard based on RSS) is already in wide use on Weblogs and content syndication sites today. Numerous large and small organizations and content providers publish and subscribe to RSS as a means to exchange and track ideas.

The first step in the process of evolving the Semantic Web is to bring about widespread adoption of the Metaweb -- of weblogs, RSS, Atom and other emerging microcontent media. As microcontent begins to play an increasingly important role on the Web, and in our personal and work lives, it will set the stage for the gradual introduction of ever-richer microcontent formats and protocols, eventually leading to full Semantic Web microcontent.

Existing microcontent standards such as RSS are extremely barebones, and the emerging Atom spec looks to be no less lightweight so far. There are many problems with RSS and Atom. Two stand out, in my opinion: first, while the formats are extensible, there is really no easy way to make use of extensions; second, they do not provide semantically defined metatags. Anyone is free to extend such formats with whatever custom metatags they want to put in, but currently there is no way to instantly make those metatags useful in applications that were not written specifically to recognize them, nor is there a way to semantically define the meaning of those tags so that software can understand how to interpret them without human intervention.
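
The extension problem can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree. The feed, the "ex" namespace URL, and the ex:shutterSpeed tag below are all hypothetical, and the KNOWN_TAGS set simply stands in for whatever a given reader was written to recognize.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 item carrying one custom, namespaced extension tag.
FEED = """<rss version="2.0" xmlns:ex="http://example.org/ns/hypothetical#">
  <channel>
    <title>Example feed</title>
    <item>
      <title>New zoom lens review</title>
      <link>http://example.org/posts/1</link>
      <ex:shutterSpeed>1/250</ex:shutterSpeed>
    </item>
  </channel>
</rss>"""

# The only tags this imaginary reader was written to handle.
KNOWN_TAGS = {"title", "link", "description", "pubDate"}


def read_item(item):
    """Split an item's children into tags the reader understands and tags it skips."""
    understood, ignored = {}, []
    for child in item:
        if child.tag in KNOWN_TAGS:
            understood[child.tag] = child.text
        else:
            ignored.append(child.tag)  # parsed fine, but meaningless to this reader
    return understood, ignored


item = ET.fromstring(FEED).find("channel/item")
understood, ignored = read_item(item)
print(understood)
print(ignored)
```

The extension parses without error, but the reader has no idea that it describes a shutter speed, or what a shutter speed is. All it sees is an unfamiliar tag name, which it drops.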

So the next step after the widespread adoption of microcontent standards like RSS and Atom is to add support for extending the formats pervasively and for putting more semantics into microcontent. We are working on these problems at Radar Networks.

In order to add rich semantics to microcontent (or any content for that matter), there needs to be a formally defined semantics in the first place. This brings us to the subject of ontologies. Ontologies, as I have explained earlier, are formal conceptual models. They define systems of concepts.

There are a number of ontologies in existence today; however, for the most part they are either too high-level and abstract or too vertical and specific to be of much use to the average Web surfer. For example, the SUMO ontology provides a good Upper Ontology that defines abstract concepts such as various units of measurement, various types of common entities, and common relationships among them. OpenCyc is another ontology that focuses mainly on "common sense knowledge" -- such as concepts related to shopping or social relationships, places, etc. Other ontologies are more vertical -- for instance, DARPA has funded the development of a number of ontologies that provide knowledge related to warfare. The NIH has funded work on ontologies related to medicine. But there is no ontology that provides good semantic definitions of the kinds of things that ordinary consumers and knowledge workers deal with.

At Radar Networks we have been working to define this ontology -- which we call "The Infoworker Ontology" -- with a goal of eventually contributing it to a standards body. The Infoworker Ontology is a mid-level horizontal ontology that defines the semantics of common entities and relationships in the domain of knowledge work -- things like documents, events, projects, tasks, people, groups, etc. The development and adoption of an open, extensible, and widely-used Infoworker Ontology is a necessary step towards making the Semantic Web useful to ordinary mortals (as opposed to academic researchers).

By connecting microcontent objects to the Infoworker Ontology a new generation of semantic-microcontent (what we call "metacontent") is enabled. With the right tools even non-technical consumers will be able to author and use metacontent.

It is at this point that the Metaweb begins to evolve into the Semantic Web: The moment when someone adds semantics to microcontent in a manner that everyone can use. This is what we have done at Radar Networks. But to do it right is non-trivial: a number of incredibly complex and subtle issues must be solved.

After 3 years of working on this problem we are confident that we have the right approach. In future months I will begin to describe our approach on this Weblog. Stay tuned!

Comments

We are interested in a collaboration.

Dr Paul Prueitt
Founding Committee
BCNGroup

We do not have funding right now, but have some good contacts. The question is about whether the architecture we have specified solves certain problems that have been with AI


See RoadMap (Microsoft Word doc)

http://www.bcngroup.org/area1/2005beads/GIF/RoadMap.doc

My respect! Very interesting site - a good resource for everybody!

You are zeroing in on the problem when you say:
“…there is really no easy way to make use of extensions, and secondly, they do not provide semantically defined metatags. Anyone is free to extend such formats with whatever custom metatags they want to put in, but currently there is no way to instantly make those metatags useful in applications that were not written specifically to recognize them, nor is there a way to semantically define the meaning of those tags so that software can understand how to interpret them without human intervention.”

Ontologies may provide part of the answer, but only part and anyway how do you make ontologies interoperable and automatable by an undetermined application? You might want to take a look at what OMG is doing with MOF (Meta Object Facility). You may find concepts there that can be adapted to the MetaWeb.
As you hint, what’s data and what’s metadata depends on the point of view. In MOF Level 1 Metadata is a model of real world data (Level 0). (I’m using metadata in the sense of information that defines and controls the meaning and structure of data – not descriptive metadata which is really just another kind of data.) Just as metadata makes data “smart”, there needs to be a mechanism to make metadata smart – i.e. to model the metadata – to define and control its meaning and structure so it can be automated. This is where the Level 2 Meta Model comes in – a way to define the language of the metadata.
Finally you need a way to model the meta model (Level 3 or Meta Meta Model) so that your models and meta models can be interchanged with people and applications using a different modeling approach (e.g. relational database).

This (or a similar – maybe with additional dimensions) 4 level meta-structure concept (combined with additional ideas from XML registries) might be partial keys to realizing the full potential for interoperability across metaweb, databases, applications (services) and unstructured content. (stealth radar will detect big blue whales in this vicinity.)

It seems to me that the greatest benefit of a standard "metaweb" ontology is simply the fact that it is a standard, that everyone is speaking the same language.

I think that the ontological infrastructure that wins will be the one that can bootstrap itself. By this I mean an ontology that contains data pertinent to consensus building, such as opinions on a specific topic, polls, reviews of standards, etc. For this to work, I think a distributed reputation system would be mandatory, and not just a good idea.

PS. It just occurred to me that your ontology may not be RDF/OWL-based. It also occurred to me that there are existing ontologies that cover the areas you mention: documents [Dublin Core], events [iCal], projects, tasks [http://purl.org/stuff/project#], people, groups [FOAF].
How then does it differ from these?

PPS. I've just had my agents noseying around, and spoken to a few bots and it doesn't look like our paths have crossed before. So I'd better formally say "hi!", I've been working around the same area for the past few years.

Great stuff!

Any examples of the kind of terms in "The Infoworker Ontology"?
