October 27, 2005

Towards a World Wide Database (WWDB)

I believe the next big leap for the Web is what I am calling "The World Wide Database." The World Wide Database is a globally distributed network of data records, residing on millions of nodes around the network, that collectively behave as a giant, virtual, decentralized database system. Google Base is an attempt to build such a database on a single node. But I don't think that approach will ultimately become the WWDB. At best it will be a huge data silo, or many silos in one place.

I think that for the WWDB to emerge it has to be distributed, just like the Web itself. Think about it. Would the Web have spread as it did in 1995-1996 if all Web sites had to be hosted on Yahoo? I don't think so. Not only would such a restriction have stifled innovation and competition, it simply would not have scaled. There is no way that today's Web could live on a single node! This means that Google Base, whatever it intends to be, is not a candidate for becoming the WWDB, at least not if Google intends to host the whole thing. Ultimately, what we really need is a system that enables anyone to run their own node in the WWDB as easily as they can run their own Web server today.

I think the WWDB will evolve through several levels:

  1. Level 1: The Document Web. This is sometimes called "Web 1.0." It is a Web of HTML-formatted documents connected by hyperlinks. We have this already. The content on the Document Web is unstructured or semi-structured. It is mostly flat text and images.
  2. Level 2: The Data Web. This is a Web of structured data, defined and expressed in XML. XML does for content structure what HTML does for content formatting. (For the purists out there, this analogy is simplistic, but I still think it's useful.) The content on the Data Web is mostly structured data records of one form or another. The Data Web is one component of "Web 2.0," but not the whole story: Web 2.0 also includes technologies and methods besides XML. The Data Web makes it possible to publish and consume data on the Web, but it doesn't solve the problem of data interoperability. The data created on the Data Web is largely non-interoperable: applications must be explicitly coded to work with each data schema.
  3. Level 3: The Semantic Web. The Semantic Web -- what we might call "Web 3.0" -- takes the Data Web one step further by providing formal languages (RDF and OWL) for defining the semantics of data structures, mapping between them, publishing data records, and searching across them (using SPARQL, a new query language). The Semantic Web solves the problem of data interoperability by providing open standards for defining and integrating data schemas using formal ontologies. Ontologies may be used to define top-level schemas, and/or to map between lower-level schemas, making it possible to integrate data schemas at a meta-level.
  4. Level 4: The World Wide Database. This is when it all comes together. The Semantic Web combined with the Data Web and the Document Web enables the Web to function as a vast, decentralized database. A core set of upper and mid-level ontologies defines common concepts, data types and relationships. These ontologies in turn are used to map between thousands of lower-level domain ontologies about specific subject areas. On the basis of this ontological fabric, all data is integrated and accessible. Applications can add records to this database at any node on the Web; it has no center. Agents roam autonomously within it, discovering knowledge, adding content, and making inferences and links. Search engines syndicate distributed queries across millions of nodes in order to scan billions of data records at once. Within this network, services aggregate, remix, and organize subsets of the data into virtual databases about various subjects, so that the same data records can be referenced in multiple applications and contexts.
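
To make the Level 4 idea a bit more concrete, here is a toy sketch in Python of a query syndicated across nodes whose local schemas differ. Everything in it is an assumption made up for illustration: the `Node` class, the `federated_query` function, the `core:name` and `core:price` terms, and the sample shop data. A real system would use RDF, OWL and SPARQL rather than plain dictionaries; the point is only that each node keeps its own field names and publishes a mapping to shared terms.

```python
from dataclasses import dataclass

# Hypothetical shared vocabulary terms (not taken from any real ontology).
SHARED_NAME = "core:name"
SHARED_PRICE = "core:price"


@dataclass
class Node:
    """One hypothetical WWDB node: local records plus a mapping to shared terms."""
    name: str
    records: list   # list of dicts that use the node's own field names
    mapping: dict   # local field name -> shared vocabulary term

    def query(self, shared_field, predicate):
        # Translate each local record into shared terms, then filter on the shared field.
        results = []
        for record in self.records:
            translated = {
                self.mapping[key]: value
                for key, value in record.items()
                if key in self.mapping
            }
            if shared_field in translated and predicate(translated[shared_field]):
                results.append({"node": self.name, **translated})
        return results


def federated_query(nodes, shared_field, predicate):
    """Send the same query to every node and merge the answers."""
    merged = []
    for node in nodes:
        merged.extend(node.query(shared_field, predicate))
    return merged


# Two nodes that describe the same kind of thing with different schemas.
nodes = [
    Node("shop-a",
         [{"title": "Bike", "cost": 120}, {"title": "Helmet", "cost": 30}],
         {"title": SHARED_NAME, "cost": SHARED_PRICE}),
    Node("shop-b",
         [{"label": "Scooter", "price_usd": 90}],
         {"label": SHARED_NAME, "price_usd": SHARED_PRICE}),
]

cheap = federated_query(nodes, SHARED_PRICE, lambda price: price < 100)
for row in cheap:
    print(row)
```

The calling application never sees `cost` or `price_usd`. It asks in shared terms, and each node does its own translation, which is roughly the difference between the Data Web and the Semantic Web layers described above.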

The WWDB cannot function with the Data Web alone -- it requires the Semantic Web. Without the Semantic Web, the data on the Data Web is still siloed -- it cannot behave as a single database. By adding the Semantic Web layer to the Data Web we can dissolve these silos, making data and applications more interoperable. Only once this happens will it be possible to treat the entire Web as a single virtual database. Until we have the Semantic Web, the Data Web will continue to be a complex system of thousands or millions of databases at best.

I think Google Base is an attempt to create a large, centralized Data Web. But even within Google Base itself, I see huge potential obstacles to data interoperability, and I wonder whether it will behave as a single database or as millions of little database silos that don't work together. It doesn't seem to be a candidate for becoming the WWDB. But who knows, maybe Google will gradually embrace semantics over time (although their past statements have been very opposed to the Semantic Web).

For the WWDB to emerge, we need a more decentralized approach, and we probably also need a new kind of server for hosting WWDB nodes. In addition, we will probably need a core ontology, or a set of core ontologies, that everyone can start using for high-level data interoperability. It's very difficult (probably impossible, in fact) to come up with one ontology that covers everyone's perspectives and needs. But I think we can do a pretty good job of coming up with a simple ontology that covers common concepts, if we carefully restrict its domain and purpose. According to my own research, there are really only a few core concepts that we all need to share in order to achieve very high degrees of data interoperability for most of our data. Once we agree on these, special interest groups can develop branch ontologies for particular vertical domains of data. These can then be mapped to the common upper and middle ontology layers, as well as laterally to alternative ontologies within their own domains and to vertical ontologies in related domains. This is a fair amount of work and won't happen overnight. I think it will take place in both a top-down and bottom-up manner simultaneously. Gradually, islands will emerge and form bridges to one another. Meanwhile, here at Radar Networks we are working on this problem from several angles, and we hope to make a useful contribution to the evolution of the WWDB.
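
Here is a minimal sketch of how branch ontologies might hang off shared middle and upper layers. It is a toy, not a proposal: the `SUBCLASS_OF` table, the prefixes (`core:`, `mid:`, `music:`, `books:`, `med:`) and the helper functions are all hypothetical names invented for this example. In practice this would be expressed in OWL with `rdfs:subClassOf` and handled by a reasoner.

```python
# Hypothetical class hierarchy, child -> parent.
# "core:" is the upper layer, "mid:" the middle layer, the rest are branch ontologies.
SUBCLASS_OF = {
    "music:Album": "mid:CreativeWork",
    "books:Novel": "mid:CreativeWork",
    "med:Clinic": "mid:Organization",
    "mid:CreativeWork": "core:Thing",
    "mid:Organization": "core:Thing",
}


def ancestors(term):
    """Return the chain of parents of a term, nearest first."""
    chain = []
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        chain.append(term)
    return chain


def is_a(term, target):
    """True if term is the target class or a (transitive) subclass of it."""
    return term == target or target in ancestors(term)


def common_ancestor(a, b):
    """Return the most specific shared class at which two terms can be integrated."""
    b_chain = set([b] + ancestors(b))
    for candidate in [a] + ancestors(a):
        if candidate in b_chain:
            return candidate
    return None


print(common_ancestor("music:Album", "books:Novel"))  # mid:CreativeWork
print(common_ancestor("music:Album", "med:Clinic"))   # core:Thing
```

Two branch ontologies written by different groups never need to know about each other. They interoperate at whatever shared layer they both map to: the middle layer when their domains are close, and the small upper layer when they are not.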
