78 posts categorized "The Metaweb"

February 13, 2009

Video: My Talk on the Evolution of the Global Brain at the Singularity Summit

If you are interested in collective intelligence, consciousness, the global brain and the evolution of artificial intelligence and superhuman intelligence, you may want to see my talk at the 2008 Singularity Summit. The videos from the Summit have just come online.

(Many thanks to Hrafn Thorisson who worked with me as my research assistant for this talk).

February 06, 2009

Twine's Explosive Growth

Twine has been growing at 50% per month since launch in October. We've been keeping that quiet while we wait to see if it holds. VentureBeat just noticed and did an article about it. It turns out our January numbers are higher than Compete.com estimates and February is looking strong too. We also have a slew of cool viral features coming out in the next few months as we start to integrate with other social networks. Should be an interesting season.

July 02, 2008

Most of My Blogging is Now in Twine

This is a note to readers of this blog. As many of you know, I'm the CEO of Radar Networks, the makers of a new service called Twine.

Twine is a service for "interest networking," which I believe is the next evolution of social media.

How are social networks and interest networks different?

  • Social networks are about connecting to people and messaging with them -- they are basically the next evolution of contact management and email.
  • Interest networks are about leveraging collective intelligence to discover and share great content around your interests -- they are the next evolution of social media (discussion forums, wikis, blogs, social news aggregation, and social bookmarking). Interest networks are for making sense of information and discovering new information that matters to you.

I now use Twine as my main place for authoring and sharing content on the Web. (I also use Twine as my main place for keeping up with my many interests. The Twine community does a great job of scouring the Web to find the content that I want to know about. Generally if there is an article that matters to me, it shows up in Twine very quickly. I no longer have to read as many RSS feeds. This is the power of collective intelligence at its best.)

However, although Twine can be used both to author and discover content around interests, in this article I will focus on the authoring side of the story.

Of course I am biased, but speaking from the perspective of a blogger, I can say that Twine is rapidly becoming the personal publishing environment I always dreamed of having. It's an ideal environment to author content and distribute it to highly relevant audiences.

In Twine, I have many different public and private microblogs on various topics that matter to me, and I also participate in microblogs that others have created. It's super easy to post to one or many of them at once.

Twine also has good support for discussions. It's very easy to have discussions around any piece of content -- and the discussions simply work better than they do in my Typepad blog. And of course, Twine has cool features such as automatic semantic tagging of all my posts, great content management features for finding all the content I have added, and powerful contextual recommendations of other interesting content that are attached to my content.

As a result of these benefits, in the last month, I have found that my blogging activity in Twine has become about 100X my blogging activity here in Typepad (no offense to Typepad, by the way -- I really like Typepad too, but as a means of distributing content, it just isn't as useful to me as Twine).

Posting in a traditional blog is a labor intensive process and in the end my post only appears to the readers of one blog. But in Twine it is as easy as bookmarking something, or authoring a note, and then sharing it across a bunch of different communities. And Twine helps me keep track of the discussion around each of my posts as it evolves.

So if you are interested in what I'm reading, what I'm thinking about, and what matters to me, you'll find a lot more of that in Twine.

If you are not yet a Twine member, register and you will be let in very quickly.

Here is where I hang out in Twine:

  • Nova Spivack's Public Twine -- This is my blog in Twine, for general posts.
  • Web 3.0 - Semantic Web -- This is a twine about, well, what the title says. There are thousands of participants.
  • Cool -- This is a twine about unusually cool things. It's the Twine equivalent of Boing Boing. But instead of a small elite group controlling what gets in, the entire community helps.
  • News of the Strange -- I admit it, I really like fringe news and odd news stories.
  • Science Discoveries -- A twine about emerging discoveries in science.
  • Web Industry Trends -- A twine about new ideas and trends in the Web biz.
  • And many, many more... You can see them on my Profile in Twine.

And if you want to track all my public posts in Twine, go to my profile and subscribe to my RSS feed in Twine.

Twine is still in invite only beta -- but in the second half of July we will be opening up all the public content in Twine to the open Web. Anyone will be able to read it and we will be letting people in faster as well.

I will still blog here when I have larger articles to share. But on a day-to-day basis, I will be posting a lot more in Twine. Hope to see you there!

(By the way, if you are a member of Twine and you are also finding that Twine is becoming the center of your social media life, feel free to copy and paste this post and adapt it into your own blog)

June 03, 2008

Video of my Presentation at The Next Web 2008 Conference

Here is the full video of my talk on the Semantic Web at The Next Web 2008 Conference. Thanks to Boris and the NextWeb gang!

April 12, 2008

A Few Predictions for the Near Future

This is a five minute video in which I was asked to make some predictions for the next decade about the Semantic Web, search and artificial intelligence. It was done at the NextWeb conference and was a fun interview.


Learning from the Future with Nova Spivack from Maarten on Vimeo.

March 26, 2008

My Visit to DERI -- World's Premier Semantic Web Research Institute

Earlier this month I had the opportunity to visit, and speak at, the Digital Enterprise Research Institute (DERI), located in Galway, Ireland. My hosts were Stefan Decker, the director of the lab, and John Breslin who is heading the SIOC project.

DERI has become the world's premier research institute for the Semantic Web. Everyone working in the field should know about them, and if you can, you should visit the lab to see what's happening there.

DERI is part of the National University of Ireland, Galway. With over 100 researchers focused solely on the Semantic Web, and very significant financial backing, DERI has, to my knowledge, the highest concentration of Semantic Web expertise on the planet today. Needless to say, I was very impressed with what I saw there. Here is a brief synopsis of some of the projects that I was introduced to:

  • Semantic Web Search Engine (SWSE) and YARS, a massively scalable triplestore.  These projects are concerned with crawling and indexing the information on the Semantic Web so that end-users can find it. They have done good work on consolidating data and also on building a highly scalable triplestore architecture.
  • Sindice -- An API and search infrastructure for the Semantic Web. This project is focused on providing a rapid indexing API that apps can use to get their semantic content indexed, and that can also be used by apps to do semantic searches and retrieve semantic content from the rest of the Semantic Web. Sindice provides Web-scale semantic search capabilities to any semantic application or service.
  • SIOC -- Semantically Interlinked Online Communities. This is an ontology for linking and sharing data across online communities in an open manner, that is getting a lot of traction. SIOC is on its way to becoming a standard and may play a big role in enabling portability and interoperability of social Web data.
  • JeromeDL is developing technology for semantically enabled digital libraries. I was impressed with the powerful faceted navigation and search capabilities they demonstrated.
  • notitio.us is a project for personal knowledge management of bookmarks and unstructured data.
  • SCOT, OpenTagging and Int.ere.st. These projects are focused on making tags more interoperable, and on generating social networks and communities from tags. They provide a richer tag ontology and framework for representing, connecting and sharing tags across applications.
  • Semantic Web Services.  One of the big opportunities for the Semantic Web that is often overlooked by the media is Web services. Semantics can be used to describe Web services so they can find one another and connect, and even to compose and orchestrate transactions and other solutions across networks of Web services, using rules and reasoning capabilities. Think of this as dynamic semantic middleware, with reasoning built-in.
  • eLite. I was introduced to the eLite project, a large e-learning initiative that is applying the Semantic Web.
  • Nepomuk.  Nepomuk is a large effort supported by many big industry players. They are making a social semantic desktop and a set of developer tools and libraries for semantic applications that are being shipped in the Linux KDE distribution. This is a big step for the Semantic Web!
  • Semantic Reality. Last but not least, and perhaps one of the most eye-opening demos I saw at DERI, is the Semantic Reality project. They are using semantics to integrate sensors with the real world. They are creating an infrastructure that can scale to handle trillions of sensors eventually. Among other things I saw, you can ask things like "where are my keys?" and the system will search a network of sensors and show you a live image of your keys on the desk where you left them, and even give you a map showing the exact location. The service can also email you or phone you when things happen in the real world that you care about -- for example, if someone opens the door to your office, or a file cabinet, or your car, etc. Very groundbreaking research that could seed an entire new industry.

In summary, my visit to DERI was really eye-opening and impressive. I recommend that major organizations that want to really see the potential of the Semantic Web, and get involved on a research and development level, should consider a relationship with DERI -- they are clearly the leader in the space.

March 06, 2008

Insightful Article About Twine

Carla Thompson, an analyst for Guidewire Group, has written what I think is a very insightful article about her experience participating in the early-access wave of the Twine beta.

We are now starting to let the press in and next week we will begin to let waves of people in from our over 30,000 user wait list. We will be letting people into the beta in waves every week going forward.

As Carla notes, Twine is a work in progress and we are mainly focused on learning from our users now. We have lots more to do, but we're very excited about the direction Twine is headed in, and it's really great to see Twine getting so much active use.

Continue reading "Insightful Article About Twine" »

March 03, 2008

How about Web 3G?

I'm here at the BlogTalk conference in Cork, Ireland with a range of bloggers and technologists discussing the emerging social Web. Besides myself and Ian Davis and Paul Miller from Talis, there are a bunch of other Semantic Web folks here, including Dan Brickley and a group from DERI Galway.

Over dinner a few of us were discussing the terms "Semantic Web" versus "Web 3.0" and we all felt a better term was needed. After some thinking, Ian Davis suggested "Web 3G." I like this term better than Web 3.0 because it loses the "version number" aspect that so many objected to. It has a familiar ring to it as well, reminding me of the 3G wireless phone initiative. It also suggests Tim Berners-Lee's "Giant Global Graph" or GGG -- a synonym for the Semantic Web. Ian stayed up late and put together a nice blog post about the term, echoing many of my own sentiments about how this term should apply to a decade (the third decade of the Web), rather than to a particular technology.

January 14, 2008

A Nice Video Intro to The Semantic Web for Non-Geeks

Question: What do you do if you're not a computer scientist but you are interested in understanding what all this Semantic Web stuff is about?

Answer: Watch this video!

November 21, 2007

Powerpoint Deck: Making Sense of the Semantic Web, and Twine

Now that I have been asked by several dozen people for the slides from my talk on "Making Sense of the Semantic Web," I guess it's time to put them online. So here they are, under the Creative Commons Attribution License (you can share them with attribution to this site).

You can download the Powerpoint file at the link below:

Download nova_spivack_semantic_web_talk.ppt


Or you can view it right here:

Enjoy! And I look forward to your thoughts and comments.

November 09, 2007

Quick Video Preview of Twine

The New Scientist just posted a quick video preview of Twine to YouTube. It only shows a tiny bit of the functionality, but it's a sneak peek.

We've been letting early beta testers into Twine and we're learning a lot from all the great feedback, and also starting to see some cool new uses of Twine. There are around 20,000 people on the wait-list already, and more joining every day. We're letting testers in slowly, focusing mainly on people who can really help us beta test the software at this early stage, as we go through iterations on the app. We're getting some very helpful user feedback to make Twine better before we open it up to the world.

For now, here's a quick video preview:

November 07, 2007

True Knowledge is Cool

The most interesting and exciting new app I've seen this month (other than Twine of course!) is a new semantic search engine called True Knowledge. Go to their site and watch their screencast to see what the next generation of search is really going to look like.

True Knowledge is doing something very different from Twine -- whereas Twine is about helping individuals, groups and teams manage their private and shared knowledge, True Knowledge is about making a better public knowledgebase on the Web -- in a sense they are a better search engine combined with a better Wikipedia. They seem to overlap more with what is being done by natural language search companies like Powerset and companies working on public databases, such as Metaweb and Wikia.

I don't yet know whether True Knowledge is supporting W3C open-standards for the Semantic Web, but if they do, they will be well-positioned to become a very central service in the next phase of the Web. If they don't they will just be yet another silo of data -- but a very useful one at least. I personally hope they provide SPARQL API access at the very least. Congratulations to the team at True Knowledge! This is a very impressive piece of work.

October 29, 2007

The Next Big Thing: User-Contributed Metadata

Dan Farber has an interesting piece today about how user-contributed metadata will revolutionize online advertising. He mentions Facebook, Metaweb and Twine as examples. I agree, of course, with Dan's thoughts on this, since these are some of the underlying motivations of Twine. The rich user-generated metadata in Twine is not just about users however, it's about everything -- products, companies, events, places, web pages, etc. The "semantic graph" we are building is far richer than a graph that is just about people. I'll be blogging more about this in the future.

October 25, 2007

A Video and an Audio Cast About Twine

Last night I saw that the video of my presentation of Twine at the Web 2.0 Summit is online. My session, "The Semantic Edge," featured Danny Hillis of Metaweb demoing Freebase, Barney Pell demoing Powerset, and me demoing Twine (in that order), followed by a brief panel discussion with Tim O'Reilly. It's a good panel and I recommend the video. However, the folks at Web 2.0 only filmed the presenters; they didn't capture what we were showing on our screens, so you have to use your imagination as we describe our demos.

An audio cast of one of my presentations about Twine to a reporter was also put online recently, for a more in-depth description.

October 18, 2007

Radar Networks Announces Twine.com

My company, Radar Networks, has just come out of stealth. We've announced what we've been working on all these years: It's called Twine.com. We're going to be showing Twine publicly for the first time at the Web 2.0 Summit tomorrow. There's lots of press coming out where you can read about what we're doing in more detail. The team is extremely psyched and we're all working really hard right now so I'll be brief for now. I'll write a lot more about this later.

Continue reading "Radar Networks Announces Twine.com" »

October 15, 2007

Radar Networks Coming Out of Stealth - Friday, October 19

News Flash!

My company, Radar Networks, is coming out of stealth this Friday, October 19, 2007 at the Web 2.0 Summit, in San Francisco. I'll be speaking on "The Semantic Edge Panel" at 4:10 PM, and publicly showing our Semantic Web online service for the first time. If you are planning to come to Web 2.0, I hope to see you at my panel.

Here's the official Media Alert below:

               

(PRWEB) October 15, 2007 -- At the Web2.0 Summit on October 19th, Radar Networks will announce a revolutionary new service that uses the power of the emerging Semantic Web to enable a smarter way of sharing, organizing and finding information. Founder and CEO Nova Spivack will also give the first public preview of Radar’s application, which is one of the first examples of “Web 3.0” – the next-generation of the Web, in which the Web begins to function more like a database, and software grows more intelligent and helpful.

Join Nova as he participates in “The Semantic Edge” panel discussion with esteemed colleagues including Powerset’s Barney Pell and Metaweb’s Daniel Hillis, moderated by Tim O’Reilly.

Who:   
Radar Networks Founder and CEO Nova Spivack

When:   
Friday, October 19, 2007
4:10 – 4:55 p.m.
   
Where: 
Web2.0 Summit
Palace Hotel
Grand Ballroom
2 New Montgomery Street
San Francisco,  California  94105
   

September 18, 2007

The Semantic Web, Collective Intelligence and Hyperdata

I'm posting this in response to a recent post by Tim O'Reilly which focused on disambiguating what the Semantic Web is and is not, as well as the subject of Collective Intelligence. I generally agree with Tim's post, but I do have some points I would add by way of clarification. In particular, in my opinion,  the Semantic Web is all about collective intelligence, on several levels. I would also suggest that the term "hyperdata" is a possibly useful way to express what the Semantic Web is really all about.

What Makes Something a Semantic Web Application?

I agree with Tim that the term "Semantic Web" refers to the use of a particular set of emerging W3C open standards. These standards include RDF, OWL, SPARQL, and GRDDL. A key requirement for an application to have "Semantic Web inside," so to speak, is that it makes use of, or is compatible with, at the very least, basic RDF. An alternative definition is that for an application to be "Semantic Web" it must make at least some use of an ontology, using a W3C standard for doing so.

Semantic Versus Semantic Web

Many applications and services claim to be "semantic" in one manner or another, but that does not mean they are "Semantic Web." Semantic applications include any applications that can make sense of meaning, particularly in language such as unstructured text, or structured data in some cases. By this definition, all search engines today are somewhat "semantic" but few would qualify as "Semantic Web" apps.

The Difference Between "Data On the Web" and a "Web of Data"

The Semantic Web is principally about working with data in a new and hopefully better way, and making that data available on the Web if desired in an open fashion such that other applications can understand and reuse it more easily. We call this idea "The Data Web" -- the notion is that we are transforming the Web from a distributed file server into something that is more like a distributed database.

Instead of the basic objects being web pages, they are actually pieces of data (triples) and records formed from them (sets, trees, graphs or objects comprised of triples). There can be any number of triples within a Web page, and there can also be triples on the Web that do not exist within Web pages at all -- they can come directly from databases for example.

One might respond to this by noting that there is already a lot of data on the Web, in XML and other formats -- how is the Semantic Web different from that? What is the difference between "Data on the Web" and the idea of "The Data Web?"

The best answer to this question that I have heard was something that Dean Allemang said at a recent Semantic Web SIG in Palo Alto. Dean said, "Sure there is data on the Web, but it's not actually a web of data." The difference is that in the Semantic Web paradigm, the data can be linked to other data in other places. It's a web of data, not just data on the Web.

I call this concept of interconnected data, "Hyperdata." It does for data what hypertext did for text. I'm probably not the originator of this term, but I think it is a very useful term and analogy for explaining the value of the Semantic Web.

Another way to think of it is that the current Web is a big graph of interconnected nodes, where the nodes are usually HTML documents, but in the Semantic Web we are talking about a graph of interconnected data statements that can be as general or specific as you want. A data record is a set of data statements about the same subject, and they don't have to live in one place on the network -- they could be spread over many locations around the Web.

A statement to the effect of "Sue lives in Palo Alto" could exist on site A, refer to a URI for a statement defining Sue on site B, a URI for a statement that defines "lives in" on site C, and a URI for a statement defining "Palo Alto" on site D. That's a web of data. What's cool is that anyone can potentially add statements to this web of data, so it can be completely emergent.
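
To make the "Sue lives in Palo Alto" example concrete, here is a minimal Python sketch. It is only an illustration: the URIs and host names are made up (the hosts stand in for sites B, C and D), and a real application would more likely use an RDF library than plain tuples.

```python
from urllib.parse import urlparse

# Hypothetical URIs: each host stands in for one of the sites in the example.
SUE = "http://site-b.example/people/sue"
LIVES_IN = "http://site-c.example/terms/livesIn"
PALO_ALTO = "http://site-d.example/places/palo-alto"

# The statement itself would be published on site A as a
# (subject, predicate, object) triple made of links to the other sites.
statement = (SUE, LIVES_IN, PALO_ALTO)


def hosts_referenced(triple):
    """Return the set of hosts that a triple's URIs point to."""
    return {urlparse(uri).netloc for uri in triple}


print(sorted(hosts_referenced(statement)))
# ['site-b.example', 'site-c.example', 'site-d.example']
```

One small statement on one site ends up linking three other sites together, which is the whole idea of a web of data.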

The Semantic Web is Built by and for Collective Intelligence

This is where I think Tim and others who think about the Semantic Web may be missing an essential point. The Semantic Web is in fact highly conducive to "collective intelligence." It doesn't require that machines add all the statements using fancy AI. In fact, in a next-generation folksonomy, when tags are created by human users, manually, they can easily be encoded as RDF statements. And by doing this you get lots of new capabilities, like being able to link tags to concepts that define their meaning, and to other related tags.

Humans can add tags that become semantic web content. They can do this manually or software can help them. Humans can also fill out forms that generate RDF behind the scenes, just as filling out a blog posting form generates HTML, XML, ATOM etc. Humans don't actually write all that code, software does it for them, yet blogging and wikis for example are considered to be collective intelligence tools.
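
As a rough sketch of how a manually entered tag could become statements behind the scenes, consider the following Python. The namespaces and predicate names (`taggedWith`, `createdBy`, `means`) are hypothetical and chosen only for illustration; they are not taken from any real tag ontology.

```python
# Hypothetical namespaces, for illustration only.
TAGS = "http://tags.example/tag/"
CONCEPTS = "http://concepts.example/concept/"
VOCAB = "http://vocab.example/terms/"


def tag_to_triples(user_uri, resource_uri, tag, concept=None):
    """Turn one manual tagging action into (subject, predicate, object) statements."""
    tag_uri = TAGS + tag.strip().lower().replace(" ", "-")
    triples = [
        (resource_uri, VOCAB + "taggedWith", tag_uri),
        (tag_uri, VOCAB + "createdBy", user_uri),
    ]
    if concept is not None:
        # Optionally link the tag to a concept that defines its meaning.
        triples.append((tag_uri, VOCAB + "means", CONCEPTS + concept))
    return triples


result = tag_to_triples(
    "http://people.example/sue",
    "http://example.org/article/42",
    "Semantic Web",
    concept="SemanticWeb",
)
for triple in result:
    print(triple)
```

The person only typed a tag into a form; the software produced the statements, much as a blog tool produces HTML.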

So the concept of folksonomy and tagging is truly orthogonal to the Semantic Web. They are not mutually exclusive at all. In fact the Semantic Web -- or at least "Semantic Web Lite" (RDF + only basic use of OWL + basic SPARQL) is capable of modelling and publishing any data in the world in a more open way.

Any application that uses data could do everything it does using these technologies. Every single form of social, user-generated content and community could, and probably will, be implemented using RDF in one manner or another within the next decade or so. And in particular, RDF and OWL + SPARQL are ideal for social networking services -- the data model is a much better match for the structure of the data and the network of users and the kinds of queries that need to be done.

Folktologies

This notion that somehow the Semantic Web is not about folksonomy needs to be corrected. For example, take Metaweb's Freebase. Freebase is what I call a "folktology" -- it's an emergent, community generated ontology. Users collaborate to add to the ontology and the knowledge base that is populated within it. That's a wonderful example of collective intelligence, user generated content, and semantics (although, to my knowledge, they are not technically using RDF for this, their data model is, from what I can see, functionally equivalent, and I would expect at least a SPARQL interface from them eventually).

But that's not all -- check out TagCommons and this Tag Ontology discussion, and also the SKOS ontology -- all of which are working on semantic ways of characterizing simple tags in order to enrich folksonomies and enable better collective intelligence.

There are at least two other places where the Semantic Web naturally leverages and supports collective intelligence. The first is the fact that people and software can generate triples (people could do it by hand, but generally they will do it by filling out Web forms or answering questions or dialog boxes etc.) and these triples can live all over the Web, yet interconnect or intersect (when they are about the same subjects or objects).

I can create data about a piece of data you created, for example to state that I agree with it, or that I know something else about it. You can create data about my data. Thus a data-set can be generated in a distributed way -- it's not unlike a wiki for example. It doesn't have to work this way, but at least it can if people do this.
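
Here is one hedged way to picture "data about data" in Python. It assumes each statement gets an identifier derived from its contents so that other statements can point at it; the `urn:example:stmt:` scheme and the `agreesWith` predicate are invented for this sketch and are not part of any standard.

```python
import hashlib


def statement_id(triple):
    """Derive a stable, hypothetical identifier for a statement from its contents."""
    digest = hashlib.sha1("|".join(triple).encode("utf-8")).hexdigest()[:12]
    return "urn:example:stmt:" + digest


# A statement you created.
yours = (
    "http://site-b.example/people/sue",
    "http://site-c.example/terms/livesIn",
    "http://site-d.example/places/palo-alto",
)

# A statement I created about your statement.
mine = (
    "http://people.example/nova",
    "http://vocab.example/terms/agreesWith",
    statement_id(yours),
)

graph = {yours, mine}


def statements_about(graph, triple):
    """Find every statement in the graph whose object is the given statement."""
    target = statement_id(triple)
    return [t for t in graph if t[2] == target]


print(statements_about(graph, yours))
```

Nothing requires the two statements to live in the same place; they only need to agree on how the first one is identified.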

The second point is that OWL, the ontology language, is designed to support an infinite number of ontologies -- there doesn't have to be just one big ontology to "rule them all." Anyone can make a simple or complex ontology and start to then make data statements that refer to it. Ontologies can link to or include other ontologies, or pieces of them, to create bigger distributed ontologies that cover more things.

This is kind of like not only mashing up the data, but also mashing up the schemas too. Both of these are examples of collective intelligence. In the case of ontologies, this is already happening, for example many ontologies already make use of other ontologies like the Dublin Core and FOAF.
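
The idea of ontologies building on other ontologies can be sketched in a few lines of Python. This is a heavily simplified model, not how OWL imports actually work: each "ontology" here is just a set of term names plus a list of other ontologies it includes, and `my-blog-ontology` is a made-up example.

```python
# Heavily simplified stand-ins for ontologies: a few terms each, plus imports.
ontologies = {
    "dublin-core": {"terms": {"title", "creator"}, "imports": []},
    "foaf": {"terms": {"Person", "knows"}, "imports": []},
    # Hypothetical ontology that reuses the two above.
    "my-blog-ontology": {"terms": {"BlogPost"}, "imports": ["dublin-core", "foaf"]},
}


def all_terms(name, seen=None):
    """Collect an ontology's own terms plus those of everything it imports."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against circular imports
        return set()
    seen.add(name)
    terms = set(ontologies[name]["terms"])
    for imported in ontologies[name]["imports"]:
        terms |= all_terms(imported, seen)
    return terms


print(sorted(all_terms("my-blog-ontology")))
# ['BlogPost', 'Person', 'creator', 'knows', 'title']
```

The small ontology only defines one term of its own, yet data that refers to it can use the vocabulary of everything it includes.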

The point here is that there is in fact a natural and very beneficial fit between the technologies of the Semantic Web and what Tim O'Reilly defines Web 2.0 to be about (essentially collective intelligence). In fact the designers of the underlying standards of the Semantic Web specifically had "collective intelligence" in mind when they came up with these ideas. They were specifically trying to rectify several problems in the closed, data-silo world of old fashioned databases. The big motivation was to make data more integrated, to enable applications to share data more easily, and to be able to build data with other data, and to build schemas with other schemas. It's all about enabling connections and network effects.

Now, whether people end up using these technologies to do interesting things that enable human-level collective intelligence (as opposed to just software level collective intelligence) is an open question. At least some companies such as my own Radar Networks and Metaweb, and Talis (thanks, Danny), are directly focused on this, and I think it is safe to say this will be a big emerging trend. RDF is a great fit for social and folksonomy-based applications.

Web 3.0 and the concept of "Hyperdata"

Where Tim defines Web 2.0 as being about collective intelligence generally, I would define Web 3.0 as being about "connective intelligence." It's about connecting data, concepts, applications and ultimately people. The real essence of what makes the Web great is that it enables a global hypertext medium in which collective intelligence can emerge. In the case of Web 3.0, which begins with the Data Web and will evolve into the full-blown Semantic Web over a decade or more, the key is that it enables a global hyperdata medium (not just hypertext).

As I mentioned above, hyperdata is to data what hypertext is to text. Hyperdata is a great word -- it is so simple and yet makes a big point. It's about data that links to other data. It does for data what hypertext does for text. That's what RDF and the Semantic Web are really all about. Reasoning is NOT the main point (but is a nice future side-effect...). The main point is about growing a web of data.

Just as the Web enabled a huge outpouring of collective intelligence via an open global hypertext medium, the Semantic Web is going to enable a similarly huge outpouring of collective knowledge and cognition via a global hyperdata medium. It's the Web, only better.

September 08, 2007

DBpedia.org is Among the Coolest Semantic Web Datasets I've Seen

I've been poking around in DBpedia, and I'm amazed at the progress. It is definitely one of the coolest (launched) examples of the Semantic Web I've seen. It's going to be a truly useful resource to everyone. If you haven't heard of it yet, check it out!

July 18, 2007

Radar Networks Progress Update

I'm sitting in the Dynasty Lounge in Taipei, en route to Singapore where I will be addressing ministers of the government there on the potential of the Semantic Web. Singapore is a very forward-looking country and they have some very exciting new initiatives in the works there. After that I hope to have a little time for a vacation and then I'm heading back to San Francisco, returning on August 1.

I should have email for all or most of the time here, so that is the best way to reach me directly. And of course you can comment on this blog too.

As for the company -- lots of good news here at Radar Networks.

First of all, the team has gotten the next version of our alpha up (our hosted Web service for the Semantic Web) and it's getting awesome! We're on track for an invite-only launch in the fall timeframe as planned.

We also chose a brand for our product, with help from the mad geniuses at Igor International. The new brand is secret until launch but we love it. We'll be announcing the brand close to launch.

If you want to be invited to our launch and be one of the first to see how useful the Semantic Web really can be -- sign up for our mailing list at http://www.radarnetworks.com/ -- and feel free to invite your friends to sign up too. Only people who sign up will get on our waiting list. We already have around 2000 bloggers and other influencers pre-registered, and more are coming every day, so don't wait -- it will be on a first-come, first-served basis. We'll be letting people into the service in waves.

Another exciting development: Several of the world's big media empires have started approaching me to see how they can get involved in the network we are building here at Radar Networks. They are interested in the potential of the Semantic Web for adding new capabilities to their content and new services for their audiences. That's an exciting direction to explore for us. If you have large collections of interesting, useful, content of value to particular audiences, or if you have large audiences that need a better way to do stuff on the Web, feel free to drop me a line and we can discuss how you might be able to get involved with the Semantic Web in partnership with us.

In other news, I am still inundated with hundreds of emails from interesting people who read the articles about us in this month's Business 2.0 and BusinessWeek. It's been very interesting to connect with so many other thinkers and businesses. Forgive me in advance if it takes me a while to write back -- I promise I will.

I can't wait to come back to San Francisco and start playing with our alpha -- it's really getting there. All the credit should go to our awesome development team. They've been writing tons of code and it's starting to really pay off.

July 03, 2007

Enriching the Connections of the Web -- Making the Web Smarter

Web 3.0 -- aka The Semantic Web -- is about enriching the connections of the Web. By enriching the connections within the Web, the entire Web may become smarter.

I believe that collective intelligence primarily comes from connections -- this is certainly the case in the brain, where the connections between neurons far outnumber the neurons themselves; certainly there is more "intelligence" encoded in the brain's connections than in the neurons alone. There are several kinds of connections on the Web:

  1. Connections between information (such as links)
  2. Connections between people (such as opt-in social relationships, buddy lists, etc.)
  3. Connections between applications (web services, mashups, client server sessions, etc.)
  4. Connections between information and people (personal data collections, blogs, social bookmarking, search results, etc.)
  5. Connections between information and applications (databases and data sets stored or accessible by particular apps)
  6. Connections between people and applications (user accounts, preferences, cookies, etc.)

Are there other kinds of connections that I haven't listed? If so, please let me know!

I believe that the Semantic Web can actually enrich all of these types of connections, adding more semantics not only to the things being connected (such as representations of information or people or apps) but also to the connections themselves.

In the Semantic Web approach, connections are represented with statements of the form (subject, predicate, object) where the elements have URIs that connect them to various ontologies where their precise intended meaning can be defined. These simple statements are sometimes called "triples" because they have three elements. In fact, many of us are working with statements that have more than three elements ("tuples"), so that we can represent not only subject, predicate, object of statements, but also things like provenance (where did the data for the statement come from?), timestamp (when was the statement made), and other attributes. There really is no limit to what kind of metadata can be stored in these statements. It's a very simple, yet very flexible and extensible data model that can represent any kind of data structure.
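To make the statement model above concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the `Statement` type, the `http://example.org/` URIs, and the helper names are hypothetical and are not taken from any RDF library or from Radar Networks' system.

```python
from typing import NamedTuple, Optional

EX = "http://example.org/"  # hypothetical namespace, for illustration only


class Statement(NamedTuple):
    """A triple (subject, predicate, object) plus optional extra metadata."""
    subject: str
    predicate: str
    obj: str
    provenance: Optional[str] = None  # where did the data come from?
    timestamp: Optional[str] = None   # when was the statement made?


statements = [
    # A plain triple: no extra metadata attached.
    Statement(EX + "alice", EX + "employeeOf", EX + "acme"),
    # A longer tuple: the same three elements plus provenance and timestamp.
    Statement(EX + "alice", EX + "friendOf", EX + "bob",
              provenance=EX + "alice-homepage", timestamp="2007-07-03"),
]


def as_triples(stmts):
    """Drop the metadata and keep only (subject, predicate, object)."""
    return [(s.subject, s.predicate, s.obj) for s in stmts]


def from_source(stmts, source):
    """Select the statements that claim a given provenance."""
    return [s for s in stmts if s.provenance == source]
```

Because every statement has the same shape, adding another attribute later (a confidence score, say) means adding a field rather than redesigning a schema, which is the flexibility the paragraph above describes.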

The important point for this article however is that in this data model rather than there being just a single type of connection (as is the case on the present Web which basically just provides the HREF hotlink, which simply means "A and B are linked" and may carry minimal metadata in some cases), the Semantic Web enables an infinite range of arbitrarily defined connections to be used.  The meaning of these connections can be very specific or very general.

For example one might define a type of connection called "friend of" or a type of connection called "employee of" -- these have very different meanings (different semantics) which can be made explicit and also machine-readable using OWL. By linking a page about a person with the "employee of" link to another page about a different person, we can express that one of them employs the other. That is a statement that any application which can read OWL is able to see and correctly interpret, by referencing the underlying definition of "employee of" which is defined in some ontology and might for example specify that an "employee of" relation connects a person to a person or organization who is their employer. In other words, rather than just linking things with the generic "hotlink" we are all used to, they can now be linked with specific kinds of links that have very particular and unambiguous meaning and logical implications.
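As a rough sketch of what "referencing the underlying definition" could look like, the Python below treats each link type as an entry in a tiny ontology that says what kinds of things it may connect. Everything here is an assumption made for illustration: the URIs, the `ONTOLOGY` and `TYPES` tables, and the function name are hypothetical. Real OWL reasoners typically use domain and range declarations to infer types rather than to reject statements, so this is a deliberate simplification.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Hypothetical ontology: each link type declares what it may connect.
ONTOLOGY = {
    EX + "employeeOf": {"domain": {"Person"},
                        "range": {"Person", "Organization"}},
    EX + "friendOf": {"domain": {"Person"},
                      "range": {"Person"}},
}

# Hypothetical type assignments for the things being linked.
TYPES = {
    EX + "alice": "Person",
    EX + "bob": "Person",
    EX + "acme": "Organization",
}


def link_is_consistent(subject, predicate, obj):
    """Check a typed link against the definition of its link type."""
    definition = ONTOLOGY.get(predicate)
    if definition is None:
        return False  # unknown link type: nothing to interpret it with
    return (TYPES.get(subject) in definition["domain"]
            and TYPES.get(obj) in definition["range"])


# "alice employeeOf acme" fits the definition; the reverse direction does not,
# and neither does a friendship between a person and an organization.
ok = link_is_consistent(EX + "alice", EX + "employeeOf", EX + "acme")
reversed_ok = link_is_consistent(EX + "acme", EX + "employeeOf", EX + "alice")
friend_ok = link_is_consistent(EX + "alice", EX + "friendOf", EX + "acme")
```

A generic hotlink could not support any of these checks, because it carries no link type to look up.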

This has the potential at least to dramatically enrich the information-carrying capacity of connections (links) on the Web. It means that connections can carry more meaning, on their own. It's a new place to put meaning in fact -- you can put meaning between things to express their relationships. And since connections (links) far outnumber objects (information, people or applications) on the Web, this means we can radically improve the semantics of the structure of the Web as a whole -- the Web can become more meaningful, literally. This makes a difference, even if all we do is just enrich connections between gross-level objects (in other words, connections between Web pages or data records, as opposed to connections between concepts expressed within them, such as for example, people and companies mentioned within a single document).

Even if the granularity of this improvement in connection technology is relatively gross level it could still be a major improvement to the Web. The long-term implications of this have hardly been imagined let alone understood -- it is analogous to upgrading the dendrites in the human brain; it could be a catalyst for new levels of computation and intelligence to emerge.

It is important to note that, as illustrated above, there are many types of connections that involve people. In other words the Semantic Web, and Web 3.0, are just as much about people as they are about other things. Rather than excluding people, they actually enrich their relationships to other things. The Semantic Web, should, among other things, enable dramatically better social networking and collaboration to take place on the Web. It is not only about enriching content.

Now where will all these rich semantic connections come from? That's the billion dollar question. Personally I think they will come from many places: from end-users as they find things, author content, bookmark content, share content and comment on content (just as hotlinks come from people today), as well as from applications which mine the Web and automatically create them. Note that even when mining the Web a lot of the data actually still comes from people -- for example, mining Wikipedia or a social network yields lots of great data that was ultimately extracted from user contributions. So mining and artificial intelligence do not always imply "replacing people" -- far from it! In fact, mining is often best applied as a means to effectively leverage the collective intelligence of millions of people.

These are subtle points that are very hard for non-specialists to see -- without actually working with the underlying technologies such as RDF and OWL they are basically impossible to see right now. But soon there will be a range of Semantically-powered end-user-facing apps that will demonstrate this quite obviously. Stay tuned!

Of course these are just my opinions from years of hands-on experience with this stuff, but you are free to disagree or add to what I'm saying. I think there is something big happening though. Upgrading the connections of the Web is bound to have a significant effect on how the Web functions. It may take a while for all this to unfold however. I think we need to think in decades about big changes of this nature.

Web 3.0 -- Next-Step for Web?

The Business 2.0 Article on Radar Networks and the Semantic Web just came online. It's a huge article. In many ways it's one of the best popular articles written about the Semantic Web in the mainstream press. It also goes into a lot of detail about what Radar Networks is working on.

One point of clarification, just in case anyone is wondering...

Web 3.0 is not just about machines -- it's actually all about humans -- it leverages social networks, folksonomies, communities and social filtering AS WELL AS the Semantic Web, data mining, and artificial intelligence. The combination of the two is more powerful than either one on its own. Web 3.0 is Web 2.0 + 1. It's NOT Web 2.0 - people. The "+ 1" is the addition of software and metadata that help people and other applications organize and make better sense of the Web. That new layer of semantics -- often called "The Semantic Web" -- will add to and build on the existing value provided by social networks, folksonomies, and collaborative filtering that are already on the Web.

So at least here at Radar Networks, we are focusing much of our effort on facilitating people to help them help themselves, and to help each other, make sense of the Web. We leverage the amazing intelligence of the human brain, and we augment that using the Semantic Web, data mining, and artificial intelligence. We really believe that the next generation of collective intelligence is about creating systems of experts not expert systems.

June 29, 2007

Business 2.0 and BusinessWeek Articles About Radar Networks

It's been an interesting month for news about Radar Networks. Two significant articles came out recently:

Business 2.0 Magazine published a feature article about Radar Networks in their July 2007 issue. This article is perhaps the most comprehensive article to date about what we are working on at Radar Networks. It's also one of the better articulations of the value proposition of the Semantic Web in general. It's a fun read, with gorgeous illustrations, and I highly recommend reading it.

BusinessWeek posted an article about Radar Networks on the Web. The article covers some of the background that led to my interests in collective intelligence and the creation of the company. It's a good article and covers some of the bigger issues related to the Semantic Web as a paradigm shift. I would add one or two points of clarification in addition to what was stated in the article: Radar Networks is not relying solely on software to organize the Internet -- in fact, the service we will be launching combines human intelligence and machine intelligence to start making sense of information, and helping people search and collaborate around interests more productively. One other minor point related to the article -- it mentions the story of EarthWeb, the Internet company that I co-founded in the early 1990s: EarthWeb's content business actually was sold after the bubble burst, and the remaining lines of business were taken private under the name Dice.com. Dice is the leading job board for techies and was one of our properties. Dice has been highly profitable all along and recently filed for a $100M IPO.

March 23, 2007

A Bunch of New Press About Radar Networks

We had a bunch of press hits today for my startup, Radar Networks...

PC World Article on Web 3.0 and Radar Networks

Entrepreneur Magazine interview

We're also proud to announce that Jim Hendler, one of the founding gurus of the Semantic Web, has joined our technical advisory board.

March 12, 2007

Radar Networks Profiled in Technology Review

The MIT Technology Review just published a large article on the Semantic Web and Web 3.0, in which Radar Networks, Metaweb, Joost, RealTravel and other ventures are profiled.

March 09, 2007

Metaweb and Radar Networks

This is just a brief post because I am actually slammed with VC meetings right now. But I wanted to congratulate our friends at Metaweb for their pre-launch announcement. My company, Radar Networks, is the only other major venture-funded play working on the Semantic Web for consumers so we are thrilled to see more action in this sector.

Metaweb and Radar Networks are working on two very different applications (fortunately!). Metaweb is essentially making the Wikipedia of the Semantic Web. Here at Radar Networks we are making something else -- but equally big -- and in a different category. Just as Metaweb is making a semantic analogue to something that exists and is big, so are we: but we're more focused on the social web -- we're building something that everyone will use. But we are still in stealth so that's all I can say for now.

This is now an exciting two-horse space. We look forward to others joining the excitement too. Web 3.0 is really taking off this year.

An interesting side note: Danny Hillis (founder of Metaweb), Lew Tucker (CTO of Radar Networks) and I (founder of Radar Networks) all worked together at Thinking Machines (an early AI massively parallel computer company). It's fascinating that we've all somehow come to think that the only practical way to move machine intelligence forward is by having us humans and applications start to employ real semantics in what we record in the digital world.

February 13, 2007

Web 3.0 Roundup: Radar Networks, Powerset, Metaweb and Others...

It's been a while since I posted about what my stealth venture, Radar Networks, is working on. Lately I've been seeing growing buzz in the industry around the "semantics" meme -- for example at the recent DEMO conference, several companies used the word "semantics" in their pitches. And of course there have been some fundings in this area in the last year, including Radar Networks and other companies.

Clearly the "semantic" sector is starting to heat up. As a result, I've been getting a lot of questions from reporters and VCs about how what we are doing compares to other companies such as Powerset, Textdigger, and Metaweb. There was even a rumor that we had already closed our series B round! (That rumor is not true; in fact the round hasn't started yet, although I am getting very strong VC interest and we will start the round pretty soon).

In light of all this I thought it might be helpful to clarify what we are doing, how we understand what other leading players in this space are doing, and how we look at this sector.

Indexing the Decades of the Web

First of all, before we get started, there is one thing to clear up. The Semantic Web is part of what is being called "Web 3.0" by some, but it is in my opinion really just one of several converging technologies and trends that will define this coming era of the Web. I've written here about a proposed definition of Web 3.0, in more detail.

For those of you who don't like terms like Web 2.0 and Web 3.0, I also want to mention that I agree -- we all want to avoid a rapid series of such labels or an arms-race of companies claiming to be > x.0. So I have a practical proposal: Let's use these terms to index decades since the Web began. This is objective -- we can all agree on when decades begin and end, and if we look at history each decade is characterized by various trends.

I think this is a reasonable proposal and actually useful (and also avoids endless new x.0's being announced every year). Web 1.0 was therefore the first decade of the Web: 1990 - 2000. Web 2.0 is the second decade, 2000 - 2010. Web 3.0 is the coming third decade, 2010 - 2020 and so on. Each of these decades is (or will be) characterized by particular technology movements, themes and trends, and these indices, 1.0, 2.0, etc. are just a convenient way of referencing them. This is a useful way to discuss history, and it's not without precedent. For example, various dynasties and historical periods are also given names and this provides a shorthand way of referring to those periods and their unique flavors. To see my timeline of these decades, click here.
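The indexing rule is simple enough to write down as arithmetic. The sketch below is just one reading of it: the function name is hypothetical, and it assumes each decade includes its starting year and excludes its ending year, so that 2010 belongs to Web 3.0.

```python
WEB_EPOCH = 1990  # the proposal counts decades from 1990


def web_version(year: int) -> str:
    """Map a calendar year to its Web x.0 label under the decade proposal."""
    if year < WEB_EPOCH:
        raise ValueError("the proposal only covers years from 1990 onward")
    decade_index = (year - WEB_EPOCH) // 10  # 0 for 1990-1999, 1 for 2000-2009, ...
    return f"Web {decade_index + 1}.0"
```

Under this reading, `web_version(2007)` falls in the second decade and any year from 2010 through 2019 falls in the third.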

So with that said, what is Radar Networks actually working on? First of all, Radar Networks is still in stealth, although we are planning to go beta in 2007. Until we get closer to launch what I can say without an NDA is still limited. But at least I can give some helpful hints for those who are interested. This article provides some hints, as well as what I hope is a helpful tutorial about natural language search and the Semantic Web, and how they differ. I'll also discuss how Radar Networks compares to some of the key startup ventures working with semantics in various ways today (there are many other companies in this sector -- if you know of any interesting ones, please let me know in the comments; I'm starting to compile a list).

 

(click the link below to keep reading the rest of this article...)

Continue reading "Web 3.0 Roundup: Radar Networks, Powerset, Metaweb and Others..." »

February 09, 2007

How the WebOS Evolves?

Here is my timeline of the past, present and future of the Web. Feel free to put this meme on your own site, but please link back to the master image at this site (the URL that the thumbnail below points to) because I'll be updating the image from time to time.

[Image: Radar Networks -- Towards a WebOS (timeline thumbnail)]

This slide illustrates my current thinking here at Radar Networks about where the Web (and we) are heading. It shows a timeline of technology leading from the prehistoric desktop era to the possible future of the WebOS...

Note that as well as mapping a possible future of the Web, here I am also proposing that the Web x.0 terminology be used to index the decades of the Web since 1990. Thus we are now in the tail end of Web 2.0 and are starting to lay the groundwork for Web 3.0, which fully arrives in 2010.

This makes sense to me. Web 2.0 was really about upgrading the "front-end" and user-experience of the Web. Much of the innovation taking place today is about starting to upgrade the "backend" of the Web and I think that will be the focus of Web 3.0 (the front-end will probably not be that different from Web 2.0, but the underlying technologies will advance significantly enabling new capabilities and features).

See also: This article I wrote redefining what the term "Web 3.0" means.

See also: A Visual Graph of the Future of Productivity

Please note: This is a work in progress and is not perfect yet. I've been tweaking the positions to get the technologies and dates right. Part of the challenge is fitting the text into the available spaces. If anyone out there has suggestions regarding where I've placed things on the timeline, or if I've left anything out that should be there, please let me know in the comments on this post and I'll try to readjust and update the image from time to time. If you would like to produce a better version of this image, please do so and send it to me for inclusion here, with the same Creative Commons license, ideally.

November 13, 2006

Web 3.0 Versus Web 2.0

Wow -- there has been quite a firestorm over the term Web 3.0 on the blogosphere today and yesterday. While I am remaining neutral, I also have an open mind regarding what it could be defined to represent. Here are some random thoughts towards defining the term:

Continue reading "Web 3.0 Versus Web 2.0" »

November 12, 2006

What is the Semantic Web, Actually?

I've read several blog posts reacting to John Markoff's article today. There seem to be some misconceptions in those posts about what the Semantic Web is and is not. Here I will try to succinctly correct a few of the larger misconceptions I've run into:

  • The Semantic Web is not just a single Web. There won't be one Semantic Web, there will be thousands or even millions of them, each in their own area. They will all be part of one Semantic Web in that they will use the same open-standard languages and their data will be universally accessible, but they won't all be run by any single company. They will connect together over time, forming a tapestry. But nobody will own this or run this as a single service. It will be just as decentralized as the Web already is.
  • The Semantic Web is not separate from the existing Web. The Semantic Web won't be a new Web apart from the Web we already have. It simply adds new metadata and data to the existing Web. It merges right into the existing HTML Web just like XML does, except this new metadata is in RDF (since RDF can in fact be expressed in XML).
  • The Semantic Web is not just about unstructured data. In fact, the Semantic Web is really about structured data: it provides a means (RDF) to turn any content or data into structured data that other software can make use of. This is really what RDF enables.
  • The Semantic Web does not require complex ontologies. Even without making use of OWL and more sophisticated ontologies, powerful data-sharing and data-integration can be enabled on the existing Web using even just RDF alone.
  • The Semantic Web does not only exist on Web pages. RDF works inside of applications and databases, not just on Web pages. Calling it a "Web" is a misnomer of sorts -- it's not just about the Web, it's about all information, data and applications.
  • The Semantic Web is not only about AI, and doesn't require it. There are huge benefits from the Semantic Web without ever using a single line of artificial intelligence code. While the next-generation of AI will certainly be enabled by richer semantics, AI is not the only benefit of RDF. Making data available in RDF makes it more accessible, integratable, and reusable -- regardless of any AI. The long-term future of the Semantic Web is AI for sure -- but to get immediate benefits from RDF no AI is necessary.
  • The Semantic Web is not only about mining, search engines and spidering. Application developers and content providers, and end-users, can benefit from using the Semantic Web (RDF) within their own services, regardless of whether they expose that RDF metadata to outside parties. RDF is useful without doing any data-mining -- it can be baked right into content within authoring tools and created transparently when information is published. RDF makes content more manageable and frees developers and content providers from having to look at relational data models. It also gives end-users better ways to collect and manage content they find.
  • The Semantic Web is not just research. It's already in use and starting to reach the market. The government uses it of course. But also so do companies like Adobe, and more recently Yahoo (Yahoo Food has started to use some Semantic Web technologies now). And one flavor of RSS is defined with RDF. Oracle has released native RDF support in their products. The list goes on...
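Several of the points above come down to one claim: RDF-style statements alone, with no ontology and no AI, already make data from different sources easy to combine. Here is a minimal sketch of that idea in plain Python. The two "sites", their URIs, and the `describe` helper are all hypothetical, invented only for illustration, and no RDF library is used.

```python
from collections import defaultdict

EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Two independent sources describing the same resource with the same URI.
recipes_site = {
    (EX + "pad-thai", EX + "name", "Pad Thai"),
    (EX + "pad-thai", EX + "cuisine", "Thai"),
}
reviews_site = {
    (EX + "pad-thai", EX + "name", "Pad Thai"),
    (EX + "pad-thai", EX + "rating", "4.5"),
}

# Integration is just set union: shared URIs line up and duplicates collapse.
merged = recipes_site | reviews_site


def describe(resource, triples):
    """Collect everything known about one resource, grouped by predicate."""
    properties = defaultdict(set)
    for subject, predicate, obj in triples:
        if subject == resource:
            properties[predicate].add(obj)
    return dict(properties)


pad_thai = describe(EX + "pad-thai", merged)
```

No schema mapping was needed to combine the two sources; the shared identifier did the work. That is the kind of immediate benefit the list above has in mind, although real deployments still have to agree on which URIs to share.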

Learning more:

November 11, 2006

New York Times Article About the Emerging Semantic Web

A New York Times article came out today about the Semantic Web -- in which I was quoted, speaking about my company Radar Networks. Here's an excerpt:

Referred to as Web 3.0, the effort is in its infancy, and the very idea has given rise to skeptics who have called it an unobtainable vision. But the underlying technologies are rapidly gaining adherents, at big companies like I.B.M. and Google as well as small ones. Their projects often center on simple, practical uses, from producing vacation recommendations to predicting the next hit song.

But in the future, more powerful systems could act as personal advisers in areas as diverse as financial planning, with an intelligent system mapping out a retirement plan for a couple, for instance, or educational consulting, with the Web helping a high school student identify the right college.

The projects aimed at creating Web 3.0 all take advantage of increasingly powerful computers that can quickly and completely scour the Web.

“I call it the World Wide Database,” said Nova Spivack, the founder of a start-up firm whose technology detects relationships between nuggets of information by mining the World Wide Web. “We are going from a Web of connected documents to a Web of connected data.”

Web 2.0, which describes the ability to seamlessly connect applications (like geographical mapping) and services (like photo-sharing) over the Internet, has in recent months become the focus of dot-com-style hype in Silicon Valley. But commercial interest in Web 3.0 — or the “semantic Web,” for the idea of adding meaning — is only now emerging.

November 06, 2006

Minding The Planet -- The Meaning and Future of the Semantic Web

NOTES

 

Prelude

Many years ago, in the late 1980s, while I was still a college student, I visited my late grandfather, Peter F. Drucker, at his home in Claremont, California. He lived near the campus of Claremont College where he was a professor emeritus. On that particular day, I handed him a manuscript of a book I was trying to write, entitled, "Minding the Planet" about how the Internet would enable the evolution of higher forms of collective intelligence.

My grandfather read my manuscript and later that afternoon we sat together on the outside back porch and he said to me, "One thing is certain: Someday, you will write this book." We both knew that the manuscript I had handed him was not that book, a fact that was later verified when I tried to get it published. I gave up for a while and focused on college, where I was studying philosophy with a focus on artificial intelligence. And soon I started working in the fields of artificial intelligence and supercomputing at companies like Kurzweil, Thinking Machines, and Individual.

A few years later, I co-founded one of the early Web companies, EarthWeb, where among other things we built many of the first large commercial Websites and later helped to pioneer Java by creating several large knowledge-sharing communities for software developers. Along the way I continued to think about collective intelligence. EarthWeb and the first wave of the Web came and went. But this interest and vision continued to grow. In 2000 I started researching the necessary technologies to begin building a more intelligent Web. And eventually that led me to start my present company, Radar Networks, where we are now focused on enabling the next-generation of collective intelligence on the Web, using the new technologies of the Semantic Web. 

But ever since that day on the porch with my grandfather, I remembered what he said: "Someday, you will write this book." I've tried many times since then to write it. But it never came out the way I had hoped. So I tried again. Eventually I let go of the book form and created this weblog instead. And as many of my readers know, I've continued to write here about my observations and evolving understanding of this idea over the years. This article is my latest installment, and I think it's the first one that meets my own standards for what I really wanted to communicate. And so I dedicate this article to my grandfather, who inspired me to keep writing this, and who gave me his prediction that I would one day complete it.

This is an article about a new generation of technology that is sometimes called the Semantic Web, and which could also be called the Intelligent Web, or the global mind. But what is the Semantic Web, and why does it matter, and how does it enable collective intelligence? And where is this all headed? And what is the long-term far future going to be like? Is the global mind just science-fiction? Will a world that has a global mind be a good place to live in, or will it be some kind of technological nightmare?

Continue reading "Minding The Planet -- The Meaning and Future of the Semantic Web" »

September 01, 2006

Excellent Feedback from Om Malik

Today A-List blogger and emerging "media 2.0" mogul, Om Malik, dropped by our offices to get a confidential demo of what we are building. We've asked Om to keep a tight lid on what we showed him, but he may be releasing at least a few hints in the near future.

Om was there in the early days of the Web and really understands the industry and the content ecosystem. I remember running into him in NYC when I was a co-founder of EarthWeb. He's seen a lot of technologies come and go, and he has a huge knowledgebase in his head. So he was an excellent person to speak to about what we are doing.

He gave us some of the most useful user-feedback about our product that we've ever gotten. One of our target audiences is content creators, and what Om is building over at Gigaom is a perfect example. He is a hard-core content creator. So he really understands deeply the market pain that we are addressing. And he had some incredibly useful comments, tweaks and suggestions for us. During the meeting there were quite a few Aha's for me personally -- several new angles and benefits of our product. Meeting with folks like Om, who represent potential users of what we are building, is really helpful to us in understanding what the needs and preferences of content creators are today. I'm really excited to start doing some design around some of the suggestions he made.

Of course, the needs of content providers are only one half of the equation. We're also addressing the needs of content consumers with our product. In order to really solve the problems facing content creators we also have to address the problems faced by their readers. It's a full ecosystem, a virtuous cycle -- a whole new dimension of the Web.

August 31, 2006

The Ontology Integration Problem

The OWL language, and tools such as Protege and TopBraid Composer make it easy to design ontologies. But what about the problem of integrating disparate ontologies? I haven't really found a good solution for this yet.

In my own experience designing a number of OWL ontologies (500 classes - 3000 classes on average) it has often been easier to create my own custom ontology branches to cover various concepts than to try to integrate other ontologies of those concepts into my own.

One of the reasons for this is that each ontology has its own naming conventions, philosophical orientation, domain nuances, design biases and tradeoffs, often guided by particular people and needs that drove their creation. Integrating across these different worldviews and underlying constraints is often hard. Simply stating that various classes or properties are equivalent is not necessarily a solution because their inheritance may not in fact be equivalent and thus they may actually be semantically quite different in function, regardless of expressions of equivalence. OWL probably needs to be a lot more expressive in defining mappings between ontologies to truly resolve such subtle problems.
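A toy example may help show why a bare equivalence statement falls short. The Python sketch below uses two invented single-inheritance class hierarchies; the class names, the `ONTOLOGY_A` and `ONTOLOGY_B` tables, and the helper functions are hypothetical and are not drawn from any real ontology or OWL tool.

```python
# Two hypothetical ontologies, each mapping a class to its parent class.
ONTOLOGY_A = {"Author": "Person", "Person": "Agent", "Agent": None}
ONTOLOGY_B = {"Author": "Role", "Role": "Abstraction", "Abstraction": None}


def ancestors(cls, parents):
    """Walk up a single-inheritance hierarchy and list the superclasses."""
    chain = []
    parent = parents.get(cls)
    while parent is not None:
        chain.append(parent)
        parent = parents.get(parent)
    return chain


def shared_ancestors(cls_a, cls_b):
    """Superclasses that two supposedly equivalent classes have in common."""
    return set(ancestors(cls_a, ONTOLOGY_A)) & set(ancestors(cls_b, ONTOLOGY_B))


# Suppose someone declares A's "Author" equivalent to B's "Author".
# In A an author is a kind of person; in B an author is a kind of role.
# The two classes inherit nothing in common, so the declared equivalence
# papers over a real difference in meaning.
overlap = shared_ancestors("Author", "Author")
```

An empty overlap does not prove the mapping is wrong, but it is a cheap warning sign that the two classes were modeled from different points of view.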

The alternative to mapping -- importing external ontologies into your own -- is also not great because it usually results in redundancies, as well as inconsistent naming conventions and points of view. As you keep adding colors to your palette, it starts to become kind of brown. If the goal is to make ontologies that are elegant, easy to maintain, extend, understand and apply, importing ontologies into other ontologies doesn't seem to be the way to accomplish that. Different ontologies usually don't fit together well, or even at all in some cases.

Continue reading "The Ontology Integration Problem" »

Workin Hard and Making Progress

Sorry I didn't post much today. I pulled an all-nighter last night working on Web-mining algorithms and today we had back to back meetings all day.

I just came back from a really good product team meeting facilitated by Chris Jones on our product messaging. It's really getting simple, direct, clear and tangible. Very positive. It all makes sense.

It's pretty exciting around here these days -- a lot of pieces we have been working on for months and even years are falling into place and there's a whole-is-greater-than-the-sum-of-its-parts effect kicking in. The vision is starting to become real -- we really are making a new dimension of the Web, and it's not just an idea, it's something that actually works and we're playing with it in the lab. It's visual, tangible, and useful.

Another cool thing today was a presentation by Peter Royal, about the work he and Bob McWhirter have done architecting our distributed grid. For those of you who don't know, part of our system is a homegrown distributed grid server architecture for massive-scale semantic search. It's not the end-product, but it's something we need for our product. It's kind of our equivalent of Google's backend -- only semantically aware. Like Google, our distributed server architecture is designed to scale efficiently to large numbers of nodes and huge query loads. What's hard, and what's new about what we have done, is that we've accomplished this for much more complex data than the simple flat files that Google indexes. In a way you could say that what this enables is the database equivalent of what Google has done for files. All of us in the presentation were struck by how elegantly designed the architecture is.

I couldn't help grinning a few times in the meeting because there is just so much technology there -- I'm really impressed by what the team has built. This is deep tech at its best. And it's pretty cool that a small company like ours can actually build the kind of system that can hold its own against the backends of the major players out there. We're talking hundreds of thousands of lines of Java code.

It's really impressive to see how much my team has built. It just goes to show that a small team of really brilliant engineers can run circles around much larger teams.

And to think, just a few years ago there were only three of us with nothing but a dream.

August 30, 2006

Good Meeting With Shel Israel

Today our product team met with Shel Israel to show him the alpha version of what we are building here at Radar Networks and get his feedback. Shel had a lot of good insights. We showed him our full product and explained the vision, and gave him a tour of the new dimension of the Web that we are building. We also showed him how content providers such as bloggers and other site creators, and content consumers, can benefit by joining this system. Then we asked him how he would describe it.

Shel suggested that one way to express the benefit of our product is that it helps content creators, like bloggers, become part of more conversations. "Conversation" is a key word for Shel, as many of you know. He views the Web as a network of conversations, not just a network of content. In a sense, content is a means to an end -- conversation -- rather than an end in itself. So from that perspective we are advancing the state-of-the-art in conversations (broadly speaking, not just in the sense of discussions, but in the sense of connecting people and information together in smarter ways). That's an interesting take on what we are doing that I hadn't really thought about.

Shel also suggested that even though we are still a ways from being ready to launch the beta, he thought what we had was "so much better than anything he has seen" that we should start talking about it more -- without getting into the actual details of how we are doing it (gotta save something for later, after all!).

I'll explain more in future posts.

August 29, 2006

Radar Networks is Seeking Search Engineers for Large-Scale Web Mining Initiative

My company, Radar Networks, is building a very large dataset by crawling and mining the Web. We then apply a range of new algorithms to the data (part of our secret sauce) to generate some very interesting and useful new information about the Web. We are looking for a few experienced search engineers to join our team -- specifically people with hands-on experience designing and building large-scale, high-performance Web crawling and text-mining systems. If you are interested, or you know anyone who is interested or might be qualified for this, please send them our way. This is your chance to help architect and build a really large and potentially important new system. You can read more specifics about our open jobs here.

August 27, 2006

Great News for Radar Networks

I'm very pleased to announce that two distinguished Silicon Valley veterans, Lew Tucker Ph.D. and Mike Clary, have joined Radar Networks (http://www.radarnetworks.com).

In addition, we have just launched a new version of the Radar Networks corporate website with these details and more. It's been a great few weeks at Radar: As well as Lew and Mike, we've made a number of great new hires at other levels of the company, including several new senior engineers, a search architect, an additional UI designer, and our first office manager. On top of that, over the last few weeks we've come up with several very interesting new algorithms related to what we are doing, and our alpha is making solid progress. We're now around 15 people and growing and it really feels like the company has shifted into a new stage of growth. And we're having a lot of fun!

August 26, 2006

I'm Going to Start Blogging About Radar Networks Here

I haven't blogged very much about my stealth startup, Radar Networks, yet. At the most, I've made a few cryptic posts and announcements in the past, but we've been keeping things pretty quiet. That's been a conscious decision because we have been working intensively on R&D and we just weren't ready to say much yet.

Unlike some companies which have done massive and deliberate hype about unreleased vapor software, we really felt it would be better to just focus on our work and let it speak for itself when we release it.

The fact is we have been working quietly for several years on something really big, and really hard. It hasn't always been easy -- there have been some technical challenges that took a long time to overcome. And it took us a long time to find VCs daring enough to back us.

The thing is, what we are making is not a typical Web 2.0 "build it and flip it in 6 months" kind of project. It's deep technology that has long-term infrastructure-level implications for the Web and the future of content. And until recently we really didn't even have a good way to describe it to non-techies. So we just focused on our work and figured we would talk about it someday in the future.

But perhaps I've erred on the side of caution -- being so averse to gratuitous hype that I have literally said almost nothing publicly about the company. We didn't even issue a press release about our Series A round (which happened last April; for historical purposes, I'll be adding one to our new corporate site, which launches on Sunday night), and until today, our site at Radar has been just a one-page placeholder with no info at all about what we are doing.

But something happened that changed my mind about this recently. I had lunch with my friend Munjal Shah, the CEO of Riya. Listening to Munjal tell his stories about how he has blogged so openly about Riya's growth, even from way before their launch, and how that has provided him and his team with amazingly valuable community feedback, support, critiques, and new ideas, really got me thinking. Maybe it's time Radar Networks started telling a little more of its story? It seems like the team at Riya really benefitted from being so open. So although we're still in stealth mode and there are limits to what we can say at this point, I do think there are some aspects we can start to talk about, even before we've launched. And besides that, our story itself is interesting -- it's the story of what it's like to build and work in a deep-technology play in today's venture economy.

So that's what I'm going to start doing here -- I'm going to start telling our story on this blog, Minding the Planet. I already have around 500 regular readers, and most of them are scientists and hard-core techies and entrepreneurs. I've been writing mainly about emerging technologies that are interesting enough to inspire me to post about them, and once in a while about ideas I have been thinking about. These are also subjects that are of interest to the people who read this blog. But now I'm also going to start blogging more about Radar Networks and what we are doing and how it's going. I'll post about our progress, the questions we have, the achievements on our team, and of course news about our launch plans. And I hope to hear from people out there who are interested in joining us when we do our private invite-only beta tests.

We're still quite a ways from a public launch, but we do have something working in the lab and it's very exciting. Our VCs want us to launch it now, but it's still an early alpha and we think it needs a lot more work (and testing) before our baby is ready to step out into the big world out there. But it looks promising. I do think, all modesty aside for a moment, that it has the potential to really advance the Web on a broad scale. And it's exciting to work on.

This post is already long enough, so I'll finish here for the moment. In my upcoming posts I will start to talk a little bit more about the new category that Radar Networks is going to define, and some of the technologies we're using, and challenges we've overcome along the way. And I'll share some insights, and stories, and successes we've had.

But I'm getting ahead of myself, and besides that, my dinner's ready. More later.

August 25, 2006

Radar Networks News...Coming Soon

My company, Radar Networks, will be announcing some news next week. Stay tuned. We'll be issuing some press releases along with several new sections on our Website, and more clues about what we are building...

August 05, 2006

What is Radar Networks up to?

Shel Israel and I just finished up working together for 10 days. I needed Shel's perspective on what we are working on at Radar Networks. Shel lived up to his reviews as a brilliant thinker on strategic messaging, branding and positioning. So what are the 15 people at Radar Networks working on? It's still a secret, but yes, it's related to the Semantic Web, and yes, Shel has hinted on his blog at some of it. But it's probably not what you think. And, no, it's not semantic video blogging either. More hints later on. For now, if you are a blogger and you have a wish-list for what wikis or blogs could do next, feel free to submit your list in the comments on this post: I'm doing some informal market research...

[Corrected due to typo.]

July 29, 2006

Microsoft Photosynth is Incredible

Check out this video demo of Microsoft Photosynth -- an experimental technology that combines multiple photos of the same thing into a 3-D model that can then be navigated and explored -- it's beautiful, visionary and well... just awesome.

February 12, 2006

Open IRIS - Semantic Desktop PIM Released!

Yesterday, the first public open-source release of Open IRIS was announced. IRIS is a Java-based desktop semantic personal information manager developed by SRI (with help from my own company, Radar Networks -- we provided some of our early semantic object libraries and a native triplestore, and some work on UI; note that our own upcoming products, and our semantic applications platform, are quite different from IRIS and focused on different needs, however), as part of the DARPA CALO program. IRIS provides a rich semantic web based environment for desktop personal knowledge management across activities, applications and types of information. This release is primarily for semantic web and AI researchers for now -- in other words, it's still early-stage software (not intended for end-user consumers...yet) -- but for researchers IRIS provides what may be the most comprehensive, robust development platform for building next-generation learning applications that help people work with their desktop information more productively. If you're interested in a practical example of how the semantic web looks and feels on the desktop, see the information on the Open IRIS site, or if you're a bit more of a geek, download it and try it yourself. Congratulations to the IRIS team at SRI on this release!

January 24, 2006

Collective Intelligence 2.0

Introduction:

This article proposes the creation of a new open, nonprofit service on the Web that will provide something akin to “collective self-awareness” back to the Web. This service is like a "Google Zeitgeist" on steroids, but with a lot more real-time, interactive, participatory data, technology and features in it. The goal is to measure and visualize the state of the collective mind of humanity, and provide this back to humanity in as close to real-time as is possible, from as many data sources as we can handle -- as a web service.

By providing this service, we will enable higher levels of collective intelligence to emerge and self-organize on the Web. The key to collective intelligence (or any intelligence in fact) is self-awareness. Self-awareness is, in essence, a feedback loop in which a system measures its own internal state and the state of its environment, then builds a representation of that state, and then reasons about and reacts to that representation in order to generate future behavior. This feedback loop can be provided to any intelligent system -- even the Web, even humanity as-a-whole. If we can provide the Web with such a service, then the Web can begin to “see itself” and react to its own state for the first time. And this is the first step to enabling the Web, and humanity as-a-whole, to become more collectively intelligent.
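To make the loop concrete, here is a minimal sketch in Python of the sense, represent, react cycle described above. It is only an illustration: the toy robot, the `SelfModel` class, the thresholds and the function names are all assumptions made up for this example, not part of any real system.

```python
from dataclasses import dataclass


@dataclass
class SelfModel:
    """Hypothetical self-representation: internal state plus nearby environment."""
    battery: int   # internal state, percent charge
    distance: int  # environment, meters to the nearest obstacle


def decide(model: SelfModel) -> str:
    # Reason about the representation to choose the next behavior.
    if model.battery < 20:
        return "recharge"
    if model.distance < 1:
        return "turn"
    return "advance"


def act(model: SelfModel, action: str) -> SelfModel:
    # The chosen behavior changes the state that will be sensed next,
    # which is what closes the feedback loop.
    if action == "recharge":
        return SelfModel(battery=100, distance=model.distance)
    if action == "turn":
        return SelfModel(battery=model.battery - 10, distance=5)
    return SelfModel(battery=model.battery - 10, distance=model.distance - 1)


def run(model: SelfModel, steps: int) -> list:
    """Run the measure -> represent -> react cycle for a number of steps."""
    actions = []
    for _ in range(steps):
        action = decide(model)
        actions.append(action)
        model = act(model, action)
    return actions


actions = run(SelfModel(battery=40, distance=2), steps=5)
print(actions)
```

The point of the sketch is only that behavior is driven by a representation of the system's own state, and that each action changes the state that gets measured on the next pass.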

It should be noted that by "self-awareness" I don’t mean consciousness or sentience – I think that the consciousness comes from humans at this point and we are not trying to synthesize it (we don't need to; it's already there). Instead, by "self-awareness" I mean a specific type of feedback loop -- a specific Web service -- that provides a mirror of the state of the whole back to its parts. The parts are the conscious elements of the system – whether humans and/or machines – and can then look at this meta-mirror to understand the whole as well as their place in it. By simply providing this meta-level mirror, along with ways that the individual parts of the system can report their state to it, and get the state of the whole back from it, we can enable a richer feedback loop between the parts and the whole. And as soon as this loop exists the entire system suddenly can and will become much more collectively intelligent.

What I am proposing is something quite common in artificial intelligence, for example in the field of robotics, when building an autonomous robot. Until a robot is provided with a means by which it can sense its own internal state and the state of its nearby environment, it cannot behave intelligently or very autonomously. But once this self-representation and feedback loop is provided, it can then react to its own state and environment and suddenly can behave far more intelligently. All cybernetic systems rely on this basic design pattern. I’m simply proposing we implement something like this for the entire Web and the mass of humanity that is connected to it. It's just a larger application of an existing pattern. Currently people get their views of “the whole” from the news media and the government – but these views suffer from bias, narrowness, lack of granularity, lack of real-time data, and the fact that they are one-way, top-down services with no feedback loop capabilities. Our global collective self-awareness -- in order to be truly useful and legitimate -- really must be two-way, inclusive, comprehensive, real-time and democratic. In the global collective awareness, unlike traditional media, the view of the whole is created in a bottom-up, emergent fashion from the sum of the reports from all the parts (instead of just a small pool of reporters or publishers, etc.).

The system I envision would visualize the state of the global mind on a number of key dimensions, in real-time, based on what people and software and organizations that comprise its “neurons” and “regions” report to it (or what it can figure out by mining artifacts they create). For example, this system would discover and rank the current most timely and active topics, current events, people, places, organizations, events, products, articles, websites, in the world right now. From these topics it would link to related resources, discussions, opinions, etc. It would also provide a real-time mass opinion polling system, where people could start polls, vote on them, and see the results in real-time. And it would provide real-time statistics about the Web, the economy, the environment, and other key indicators.
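As a rough illustration of how "most timely and active topics" might be ranked, the sketch below scores each topic by its mentions, with older mentions counting for less. The exponential decay, the one-hour half-life and the `rank_topics` name are assumptions chosen for the example; the proposal itself does not specify a ranking method.

```python
from collections import defaultdict


def rank_topics(mentions, now, half_life=3600.0):
    """Rank topics by time-decayed activity.

    mentions: iterable of (topic, timestamp_in_seconds) pairs.
    A mention loses half of its weight every `half_life` seconds.
    """
    scores = defaultdict(float)
    for topic, timestamp in mentions:
        age = max(0.0, now - timestamp)
        scores[topic] += 0.5 ** (age / half_life)
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Three two-hour-old mentions versus one mention that just arrived.
mentions = [
    ("election", 0.0),
    ("election", 0.0),
    ("election", 0.0),
    ("earthquake", 7200.0),
]
ranked = rank_topics(mentions, now=7200.0)
print(ranked)
```

With these sample numbers a single fresh mention outranks three older ones, which is the "right now" bias the description calls for. A real system would need to tune the decay rate and guard against manipulation.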

The idea is to try to visualize the global mind – to make it concrete and real for people, to enable them to see what it is thinking, what is going on, and where they fit in it – and to enable them to start adapting and guiding their own behavior to it. By giving the parts of the system more visibility into the state of the whole, they can begin to self-organize collectively, which in turn makes the whole system function more intelligently.

Essentially I am proposing the creation of the largest and most sophisticated mirror ever built – a mirror that can reflect the state of the collective mind of humanity back to itself. This will enable an evolutionary process which eventually will result in humanity becoming more collectively self-aware and intelligent as-a-whole (instead of what it is today -- just a set of separate interacting intelligent parts). By providing such a service, we can catalyze the evolution of higher-order meta-intelligence on this planet -- the next step in human evolution. Creating this system is a grand cultural project of profound social value to all people on earth, now and in the future.

This proposal calls for creating a nonprofit organization to build and host this service as a major open-source initiative on the Web, like the Wikipedia, but with a very different user-experience and focus. It also calls for implementing the system with a hybrid central and distributed architecture. Although this vision is big, the specific technologies, design patterns, and features that are necessary to implement it are quite specific and already exist. They just have to be integrated, wrapped and rolled out. This will require an extraordinary and multidisciplinary team. If you're interested in getting involved and think you can contribute resources that this project will need, let me know (see below for details).


Further Thoughts

Today I re-read this beautiful, visionary article by Kevin Kelly, about the birth of the global mind, in which he states:

The planet-sized "Web" computer is already more complex than a human brain and has surpassed the 20-petahertz threshold for potential intelligence as calculated by Ray Kurzweil. In 10 years, it will be ubiquitous. So will superintelligence emerge on the Web, not a supercomputer?

Kevin's article got me thinking once again about an idea that has been on my mind for over a decade. I have often thought that the Web is growing into the collective nervous system of our species. This will in turn enable the human species to function increasingly as an intelligent superorganism, for example, like a beehive, or an ant colony -- but perhaps even more intelligent. But the key to bringing this process about is self-awareness. In short, the planetary supermind cannot become truly intelligent until it evolves a form of collective self-awareness. Self-awareness is the most critical component of human intelligence -- the sophistication of human self-awareness is what makes humans different from dumb machines, and from less intelligent species.

The Big Idea that I have been thinking about for over a decade is that if we can build something that functions like a collective self-awareness, then this could catalyze a huge leap in collective intelligence that would essentially "wake up" the global supermind and usher in a massive evolution in its intelligence and behavior. As the planetary supermind becomes more aware of its environment, its own state, and its own actions and plans, it will then naturally evolve higher levels of collective intelligence around this core. This evolutionary leap is of unimaginable importance to the future of our species.

In order for the collective mind to think and act more intelligently it must be able to sense itself and its world, and reason about them, with more precision -- it must have a form of self-awareness. The essence of self-awareness is self-representation -- the ability to sense, map, reason about, and react to, one's own internal state and the state of one's nearby environment. In other words, self-awareness is a feedback loop by which a system measures and reacts to its own self-representations. Just as is the case with the evolution of individual human intelligence, the evolution of more sophisticated collective human intelligence will depend on the emergence of better collective feedback loops and self-representations. By enabling a feedback loop in which information can flow in both directions between the self-representations of individuals and a meta-level self-representation for the set of all individuals, the dynamics of the parts and the whole become more closely coupled. And when this happens, the system can truly start to adapt to itself intelligently, as a single collective intelligence instead of a collection of single intelligences.

In summary, in order to achieve higher levels of collective intelligence and behavior, the global mind will first need something that functions as its collective self-awareness -- something that enables the parts to better sense and react to the state of the whole, and the whole to better sense and react to the state of its parts. What is needed essentially is something that functions as a collective analogue to a self -- a global collective self.

Think of the global self as a vast mirror, reflecting the state of the global supermind back to itself. Mirrors are interesting things. At first they merely reflect, but soon they begin to guide decision-making. By simply providing humanity with a giant virtual mirror of what is going on across the minds of billions of individuals, and millions of groups and organizations, the collective mind will crystallize, see itself for the first time, and then it will begin to react to its own image. And this is the beginning of true collective cognition. When the parts can see themselves as a whole and react in real-time, then they begin to function as a whole instead of just a collection of separate parts. As this shift transpires the state of the whole begins to feed back into the behavior of the parts, and the state of the parts in turn feeds back to the state of the whole. This cycle of bidirectional feedback between the parts and whole is the essence of cognition in all intelligent systems, whether individual brains, artificial intelligences, or entire worlds.

I believe that the time has come for this collective self to emerge on our planet. Like a vast virtual mirror, it will function as the planetary analogue to our own individual self-representations -- that capacity of our individual minds which represents us back to ourselves. It will be comprised of maps that combine real-time periodic data updates, and historical data, from perhaps trillions of data sources (one for each person, group, organization and software agent on the grid). The resulting visualizations will be something like a vast fluid flow, or a many particle simulation. It will require a massive computing capability to render it -- perhaps a distributed supercomputer comprised of the nodes on the Web themselves, each hosting a part of the process. It will require new thinking about how to visualize trends in such vast amounts of data and dimensions. This is a great unexplored frontier in data visualization and knowledge discovery.


How It Might Work

I envision the planetary self functioning as a sort of portal -- a Web service that aggregates and distributes all kinds of current real-time and historical data about the state of the whole, as well as its past states and future projected states. This portal would collect opinions, trends, and statistics about the human global mind, the environment, the economy, society, geopolitical events, and other indicators, and would map them graphically in time, geography, demography, and subject space -- enabling everyone to see and explore the state of the global mind from different perspectives, with various overlays, and at arbitrary levels of magnification.

I think this system should provide an open data model, and open API for adding and growing data sets, querying, remixing, visualizing, and subscribing to the data. All services that provide data sets, analysis or visualizations (or other interpretations) of potential value to understanding the state of the whole would be able to post data into our service for anyone to find and use. Search engines could post in the top search query terms. Sites that create tag clouds could post in tags and tag statistics. Sites that analyze the blogosphere could post in statistics about blogs, bloggers, and blog posts. Organizations that do public opinion polling, market and industry research, trend analysis, social research, or economic research could post in statistics they are generating. Academic researchers could post in statistics generated by projects they are doing to analyze trends on the Web, or within our data-set itself.

As data is pushed to us, or pulled by us, we would grow the largest central data repository about the state of the whole. Others could then write programs to analyze and remix our data, and then post their results back into the system for others to use as well. We would make use of our data for our own analysis, but anyone else could also do research and share their analysis through our system. End users and others could also subscribe to particular data, reports, or visualizations from our service, and could post in their own individual opinions, attention data feeds, or other inputs. We would serve as a central hub for search, analysis, and distribution of collective self-awareness.
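The post, query and subscribe operations described above could look something like the following in-memory sketch. Everything here is hypothetical: the `DataHub` class, its method names and the sample records are invented for illustration, and a real service would sit behind a network API with persistent storage.

```python
from collections import defaultdict


class DataHub:
    """Hypothetical in-memory stand-in for the central data repository."""

    def __init__(self):
        self._datasets = defaultdict(list)     # dataset name -> records
        self._subscribers = defaultdict(list)  # dataset name -> callbacks

    def post(self, dataset, record):
        """A contributor pushes one record into a named data set."""
        self._datasets[dataset].append(record)
        # Notify everyone subscribed to this data set.
        for callback in self._subscribers[dataset]:
            callback(record)

    def query(self, dataset, predicate=lambda record: True):
        """Return the records in a data set that satisfy the predicate."""
        return [r for r in self._datasets.get(dataset, []) if predicate(r)]

    def subscribe(self, dataset, callback):
        """Register a callback to run for each new record in a data set."""
        self._subscribers[dataset].append(callback)


hub = DataHub()
received = []
hub.subscribe("search_terms", received.append)

# A search engine posts in its top query terms (sample data).
hub.post("search_terms", {"term": "semantic web", "count": 120})
hub.post("search_terms", {"term": "photosynth", "count": 45})

popular = hub.query("search_terms", lambda r: r["count"] > 100)
print(popular)
```

Anyone who remixes the data can post the result back as a new data set through the same `post` call, which is what lets analysis accumulate in the hub.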

The collective self would provide a sense of collective identity: who are we, how do we appear, what are we thinking about, what do we think about what we are thinking about, what are we doing, how well are we doing it, where are we now, where have we been, where are we going next. Perhaps it could be segmented by nation, or by age group, or by other dimensions as well to view various perspectives on these questions within it. It could gather its data by mining for it, as well as through direct push contributions from various data-sources. Individuals could even report on their own opinions, state, and activities to it if they wanted to, and these votes and data points would be reflected back in the whole in real time. Think of it as a giant emergent conversation comprised of trillions of participants, all helping to make sense of the same subject -- our global self identity -- together. It could even have real-time views that are animated and alive -- like a functional brain image scan -- so that people could see the virtual neurons and pathways in the global brain firing as they watch.

If this global self-representation existed, I would want to subscribe to it as a data feed on my desktop. I would want to run it in a dashboard in the upper right corner of my monitor -- that I could expand at any time to explore further. It would provide me with alerts when events transpired that matched my particular interests, causes, or relationships. It would solicit my opinions and votes on issues of importance and interest to me. It would simultaneously function as my window to the world, and the world's window to me. It would be my way of participating in the meta-level whole, whenever I wanted to. I could tell it my opinions about key issues, current events, problems, people, organizations, or even legislative proposals. I could tell it about the quality of life from my perspective, where I am living, in my industry and demographic niche. I could tell it about my hopes and fears for the future. I could tell it what I think is cool, or not cool, interesting or not interesting, good or bad, etc. I could tell it what news I was reading and what I think is noteworthy or important. And it would listen and learn, and take my contributions into account democratically along with those of billions of other people just like me all around the world. From this would emerge global visualizations and reports about what we are all thinking and doing, in aggregate, that I could track and respond to. Linked from these flows I could then find relevant news, conversations, organizations, people, products, services, events, and knowledge. And from all of this would emerge something greater than anything I can yet imagine -- a thought process too big for any one human mind to contain.

I want to build this. I want to build the planetary Self. I am not suggesting that we build the entire global mind, I am just suggesting that we build the part of the system that functions as its collective self-awareness. The rest of the global mind is already there, as raw potential at least, and doesn't have to be built. The Web, human minds, software agents, and organizations already exist. Their collective state just needs to be reflected in a single virtual mirror. As soon as this mirror exists they can begin to collectively self-organize and behave more intelligently, simply because they will have, for the first time, a way of measuring their collective state and behavior. Once there is a central collective self-awareness loop, the intelligence of the global mind will emerge and self-organize naturally over time. This collective self-awareness infrastructure is the central enabling technology that has to be there first for the next-leap in intelligence of the global mind to evolve.

Project Structure

I think this should be created as a non-profit open-source project. In fact, that is the only way that it can have legitimacy -- it must be independent of any government, cultural or commercial perspective. It must be by and for the people, as purely and cleanly as possible. My guess is that to build this properly we would need to create a distributed grid computing system to collect, compute, visualize and distribute the data -- it could be similar to SETI@Home; everyone could help host it. At the center of this grid, or perhaps in a set of supernodes, would be a vast supercomputing array that would manage the grid, do focused computations and data fusion operations. There would also need to be some serious money behind this project as well -- perhaps from major foundations and donors. This system would be a global resource of potential incalculable value to the future of human evolution. It would be a project worth funding.

My Past Writing On This Topic

A Physics of Ideas: Measuring the Physical Properties of Memes
Towards a Worldwide Database
The Metaweb: A Graph of the Future
From Semantic Web to Global Mind
The Birth of the Metaweb
Are Organizations Organisms?
From Application-Centric to Data-Centric Computing
The Human Menome Project

Other Noteworthy Projects

Principia Cybernetica -- the Global Mind Group
The Global Consciousness Project
W3C - The Semantic Web Working Group
Amazon's Mechanical Turk
CHI -- Harnessing Networks of Humans

January 11, 2006

New Text-Mining Project Aims to Help Scientists

A new project applies text-mining to help scientists in the UK discover knowledge in large collections of research articles and data (Found in: KurzweilAI):

Julie Nightingale
Tuesday January 10, 2006
The Guardian

Scientific research is being added to at an alarming rate: the Human Genome Project alone is generating enough documentation to "sink battleships". So it's not surprising that academics seeking data to support a new hypothesis are getting swamped with information overload. As data banks build up worldwide, and access gets easier through technology, it has become easier to overlook vital facts and figures that could bring about groundbreaking discoveries.

The government's response has been to set up the National Centre for Text Mining, the world's first centre devoted to developing tools that can systematically analyse multiple research papers, abstracts and other documents, and then swiftly determine what they contain.

The article above also cites some recent discoveries that have been enabled by text-mining approaches:

The more breathtaking results have included the discovery of new therapeutic uses for the drug Thalidomide to treat conditions such as chronic hepatitis C and acute pancreatitis and that chlorpromazine may reduce cardiac hypertrophy - enlargement of the heart leading to heart failure.

November 03, 2005

Amazon Launches new Service that Harnesses Networks of Human Minds to Do Tasks

Amazon has launched a new service that seeks to create a marketplace for human intelligence on the Net. The idea is to utilize humans like one might utilize intelligent agents, to help complete tasks that humans do better than computers -- for example image adjustments, formatting, tagging and marking up content, adding metadata to documents, filing and filtering, etc. The idea is that people can sign up to do these tasks and make money. People who need tasks done can farm them out to the marketplace. It's like a big army of "human agents" who can use "human intelligence" to do stuff for you.

The name of the service is "Amazon Mechanical Turk" -- quite bizarre. But OK. It's a cool idea. I think the combination of human and machine intelligence is ultimately going to be smarter than either form of intelligence on its own. This system is at least a start -- it harnesses groups of human intelligence to help do things.

But think about where this could go: For example, the system could actually be built right into applications --  for example, imagine if in Photoshop there was a new menu command for "fix this image" that charged you a dollar and farmed the image out to 2 or 3 humans who each attempted to improve the image. It would function just like a filter, but instead of software doing the work it would be humans. For you, the end-user, it would be functionally equivalent. You would get 3 versions of your adjusted image back in a few minutes and could choose the best one or use them all.

The idea of building in menu options into software and services that actually trigger behaviors among networks of humans is very interesting.

But to do this well you really need an API that all applications can use to harness "human intelligence" and "human functions" in their apps. One of the best proposals for how to do this is here. And an update about that is here.
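To make the "fix this image" idea a bit more concrete, here is a rough sketch of what such an API might look like from an application's point of view. Everything in it is hypothetical: the class names, the fields and the in-memory "market" are invented for illustration and are not Amazon's actual Mechanical Turk interface. A real system would be a network service with payment and quality control.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HumanTask:
    """A unit of work offered to human workers (hypothetical structure)."""
    description: str
    payload: str
    reward_cents: int
    max_workers: int = 3
    results: List[str] = field(default_factory=list)


class HumanTaskMarket:
    """In-memory stand-in for a marketplace of human workers."""

    def __init__(self) -> None:
        self.open_tasks: List[HumanTask] = []

    def post(self, task: HumanTask) -> HumanTask:
        # An application farms a task out to the marketplace.
        self.open_tasks.append(task)
        return task

    def submit(self, task: HumanTask, result: str) -> None:
        # A human worker hands back one attempt at the task.
        if len(task.results) >= task.max_workers:
            raise ValueError("task already has enough results")
        task.results.append(result)
        if len(task.results) == task.max_workers:
            self.open_tasks.remove(task)  # enough attempts collected


def fix_image(market: HumanTaskMarket, image_name: str) -> HumanTask:
    """What a 'fix this image' menu command might do behind the scenes."""
    return market.post(
        HumanTask("fix this image", image_name, reward_cents=100, max_workers=3)
    )


market = HumanTaskMarket()
task = fix_image(market, "vacation.jpg")
for worker in ("ann", "raj", "mei"):
    market.submit(task, f"vacation-fixed-by-{worker}.jpg")
print(task.results)  # three candidate versions for the user to choose from
```

From the application's side the call looks like any other filter; the only difference is that the results arrive later and come from people.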

November 01, 2005

Turing's Cathedral

George Dyson wrote a nice piece on his impressions from a visit to Google, and some speculations about the future of AI on the Net.

October 27, 2005

Towards a World Wide Database (WWDB)

I believe the next big leap for the Web is what I am calling "The World Wide Database." The World Wide Database is a globally distributed network of data records that reside on millions of nodes around the network which collectively behaves as a giant virtual, decentralized database system. Google Base is an attempt to try to build such a database on a single node. But I don't think that approach will ultimately become the WWDB. At best it will be a huge data silo, or many silos in one place.

I think that for the WWDB to emerge it has to be distributed, just like the Web itself. Think about it. Would the Web have spread as it did in 1995-1996 if all Web sites had to be hosted on Yahoo? I don't think so. Not only would such a restriction have stifled innovation and competition, it simply would not have scaled. There is no way that today's Web could live on a single node! This means that Google Base -- whatever it intends to be -- is not a candidate for becoming the WWDB -- at least not if Google intends to host the whole thing. Ultimately what we are really going to need is a system that enables anyone to run their own node in the WWDB as easily as they can run their own Web server today.

I think there are several steps necessary to evolving the WWDB:

  1. Level 1: The Document Web. This is sometimes called "Web 1.0." It is a Web of HTML formatted documents connected by hyperlinks. We have this already. The content on the Document Web is unstructured or semi-structured. It is mostly flat text and images.
  2. Level 2: The Data Web.  This is a Web of structured data, defined and expressed in XML. XML does for content structure what HTML does for content formatting. OK, for the purists out there this analogy is simplistic, but I still think it's useful. The content on the Data Web is mostly structured data records of one form or another. The Data Web is one component of "Web 2.0" but not all of the story (Web 2.0 also includes other technologies and methods besides just XML). The Data Web makes it possible to publish and consume data on the Web, but it doesn't solve the problem of data interoperability. The data created on the Data Web is largely non-interoperable. Applications must be explicitly coded to work with each data schema.
  3. Level 3: The Semantic Web. The Semantic Web -- what we might call "Web 3.0" -- takes the Data Web one step further by providing formal languages (RDF and OWL) for defining the semantics of data structures, mapping between them, publishing data records, and searching across them (using SPARQL, a new query language). The Semantic Web solves the problem of data interoperability by providing open standards for defining and integrating data schemas using formal ontologies. Ontologies may be used to define top-level schemas, and/or to map between lower-level schemas, making it possible to integrate data schemas at a meta-level.
  4. Level 4: The World Wide Database. This is when it all comes together. The Semantic Web combined with the Data Web and the Document Web enables the Web to function as a vast, decentralized database. A core set of upper and mid-level ontologies define common concepts, data types and relationships. These ontologies in turn are used to map between thousands of lower level domain ontologies about specific subject areas. On the basis of this ontological fabric, all data is integrated and accessible. Applications can add records to this database at any node on the Web, it has no center. Agents roam autonomously within it, discovering knowledge, adding content, and making inferences and links. Search engines syndicate distributed queries across millions of nodes in order to scan billions of data records at once. Within this network, services aggregate, remix, and organize subsets of the data into virtual databases about various subjects such that the same data records can be referenced in multiple different applications and contexts.
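As a toy illustration of Level 4, the sketch below fans one query out across several independent nodes and merges the answers, which is the basic shape of a syndicated, distributed query. The `Node` class, the record fields and the node names are assumptions made up for this example; a real WWDB node would speak a network protocol and a query language such as SPARQL rather than take Python predicates.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


class Node:
    """One independently hosted store of records (hypothetical)."""

    def __init__(self, name: str, records: Iterable[Record]) -> None:
        self.name = name
        self.records = list(records)

    def query(self, predicate: Callable[[Record], bool]) -> List[Record]:
        # Each node answers only from its own local data.
        return [r for r in self.records if predicate(r)]


def federated_query(nodes: Iterable[Node],
                    predicate: Callable[[Record], bool]) -> List[Record]:
    """Send the same query to every node and merge the results."""
    merged: List[Record] = []
    for node in nodes:
        for record in node.query(predicate):
            # Remember where each record came from; there is no central copy.
            merged.append({**record, "_node": node.name})
    return merged


nodes = [
    Node("node-a", [{"type": "Event", "city": "Boston"},
                    {"type": "Person", "city": "Boston"}]),
    Node("node-b", [{"type": "Event", "city": "Paris"}]),
    Node("node-c", [{"type": "Event", "city": "Boston"}]),
]
boston_events = federated_query(
    nodes, lambda r: r["type"] == "Event" and r["city"] == "Boston"
)
print(boston_events)
```

Note that this only works because all three nodes happen to use the same field names. That shared vocabulary is exactly what the ontology layer is supposed to provide.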

The WWDB cannot function with the Data Web alone -- it requires the Semantic Web. Without the Semantic Web, the data on the Data Web is still siloed -- it cannot behave as a single database. By adding the Semantic Web layer to the Data Web we can dissolve these silos, making data and applications more interoperable. Only once this happens will it be possible to treat the entire Web as a single virtual database. Until we have the Semantic Web, the Data Web will continue to be a complex system of thousands or millions of databases at best.
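A small sketch of what "dissolving silos" means in practice: two sites describe the same kind of thing with different field names, and a mapping to a shared vocabulary lets one query run over both. The field names and the `core:` terms are invented for illustration, and plain dictionaries stand in here for what RDF and OWL would express far more rigorously.

```python
from typing import Dict

# Each site publishes its own schema (hypothetical field names).
site_a_record = {"full_name": "Ada Lovelace", "mail": "ada@example.org"}
site_b_record = {"displayName": "Alan Turing",
                 "emailAddress": "alan@example.org",
                 "shoeSize": "9"}

# Each site also publishes a mapping from its fields to shared terms.
SITE_A_TO_CORE = {"full_name": "core:name", "mail": "core:email"}
SITE_B_TO_CORE = {"displayName": "core:name", "emailAddress": "core:email"}


def to_core(record: Dict[str, str], mapping: Dict[str, str]) -> Dict[str, str]:
    """Translate a site-specific record into the shared vocabulary.

    Fields with no mapping are dropped rather than guessed at.
    """
    return {mapping[key]: value for key, value in record.items() if key in mapping}


merged = [
    to_core(site_a_record, SITE_A_TO_CORE),
    to_core(site_b_record, SITE_B_TO_CORE),
]
names = [record["core:name"] for record in merged]
print(names)
```

Without the two mapping tables, an application would have to know both schemas in advance, which is the Data Web situation described above.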

I think Google Base is an attempt to create a large, centralized Data Web -- But even within Google Base itself, I see huge potential data interoperability obstacles and I wonder whether it will behave as a single database or millions of little database silos that don't work together. It doesn't seem to be a candidate for becoming the WWDB. But who knows, maybe Google will gradually embrace semantics over time (their statements in the past have been very opposed to the Semantic Web however).

For the WWDB to emerge, we need a more decentralized approach, and we probably also need a new kind of server for hosting WWDB nodes. In addition, we probably will also need a core ontology or set of core ontologies that everyone can start using for high-level data interoperability. It's very difficult (probably impossible, in fact) to come up with one ontology that covers everyone's perspectives and needs -- But I think we can do a pretty good job of coming up with a simple ontology that covers common concepts -- if we carefully restrict the domain and purpose of this ontology. According to my own research there are really only a few core concepts that we all need to share in order to achieve very high degrees of data interoperability for most of our data. Once we agree on these, branch ontologies can be developed by special interest groups for particular vertical domains of data, and mappings can be made from these to the common upper and middle ontology layers, as well as laterally to other alternative mappings within their own domains, and other vertical ontologies in other related domains. This is a fair amount of work and won't happen overnight. I think it will take place in both a top-down and bottom-up manner simultaneously. Gradually, islands will emerge and form bridges to one another. Meanwhile, here at Radar Networks we are working on this problem from several angles and hopefully in the future we will be able to make a useful contribution to the evolution of the WWDB.

October 26, 2005

The Problem with Google Base and Ning

There is a hidden problem with open databases such as Google Base and Ning -- as presently designed -- a problem that I have not seen any discussion of yet.

Briefly stated: As the number of unique data schemas created in such systems grows, the probability of applications that use those schemas breaking also grows (perhaps exponentially).

Here's why:

Let's say that Sue creates a new schema in Ning (or Google Base) for a "Person." She makes an app that uses this record structure. Now Joe makes a calendar app that takes Sue's Person record and connects it with his own unique "Event" record schema. Joe's app relies on Sue's Person schema to work. Next, Bob makes a To-Do list app that uses Joe's Event schema and Sue's Person schema and pumps out "To-Do-Entry" records. Finally, Lisa creates a Project manager app that uses Sue's Person schema, Joe's Event schema, and Bob's To-Do-Entry schema, to pump out "Project" records.

So we have a network of apps that rely on data schemas from other apps. Next, let's say that Sue decides to change one of the attribute-value pairs in her Person schema -- perhaps changing it to map to a string instead of an integer value. That one simple change has huge ripple effects. First it causes Joe's app to break, which then causes Bob's app to break, which causes Lisa's app to break, etc. In other words, we have a chain reaction of broken apps.
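The chain reaction can be traced with a few lines of code. The sketch below models the example as a dependency graph, identifying each app with the schema it produces (a simplification), and walks outward from the schema that changed. The function name and data layout are made up for illustration.

```python
from collections import deque
from typing import Dict, List

# Which schemas each schema's app relies on, per the example above.
DEPENDS_ON: Dict[str, List[str]] = {
    "Person": [],                                   # Sue
    "Event": ["Person"],                            # Joe
    "To-Do-Entry": ["Event", "Person"],             # Bob
    "Project": ["Person", "Event", "To-Do-Entry"],  # Lisa
}


def affected_by(changed: str, depends_on: Dict[str, List[str]]) -> List[str]:
    """Return every schema that directly or indirectly depends on `changed`."""
    # Invert the graph: for each schema, who relies on it?
    dependents: Dict[str, List[str]] = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(name)

    # Breadth-first walk outward from the changed schema.
    seen = set()
    order: List[str] = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


print(affected_by("Person", DEPENDS_ON))
print(affected_by("To-Do-Entry", DEPENDS_ON))
```

A change to Person reaches every other schema in the example, while a change to To-Do-Entry reaches only Project. The deeper a schema sits in the graph, the more apps its author can break.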

As the number of unique schemas increases, the likelihood that a given schema will be modified in a given time frame also increases. At the extreme end of this curve, with large numbers of users, schemas and apps, the likelihood approaches 100% that at any given time some schema that is directly or indirectly required by a given app will have changed, causing that app to break. So in other words if such services are successful, apps within them will break ever more frequently, causing endless problems for developers.
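To put rough numbers on this, assume (purely for illustration) that each schema an app relies on changes independently with the same probability p in a given time frame. Then the chance that at least one of n such schemas changes is 1 - (1 - p)^n. Real schema changes are neither independent nor equally likely, so treat this as a back-of-the-envelope model only.

```python
def break_probability(p_change: float, n_schemas: int) -> float:
    """Chance that at least one of n schemas changes in the time frame.

    Assumes each schema changes independently with probability p_change.
    """
    return 1.0 - (1.0 - p_change) ** n_schemas


# Even a 1% chance per schema adds up quickly as dependencies grow.
for n in (1, 10, 100, 1000):
    print(n, round(break_probability(0.01, n), 4))
```

Under these assumptions the probability climbs toward 1 as n grows, which is the "approaches 100%" behavior described above.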

The only solution from a developer perspective is either to submit to constantly fixing your apps as they break, or to simply not make use of data produced by other apps on the platform. In the latter case, developers can protect their apps from breaking by simply "reinventing the wheel" and creating their own schemas for every data structure they wish to use, but the tradeoff is that they then will not be making use of existing content from other apps. The problem with this choice is that, at least in the case of Ning, "re-mixing" of data between apps is the very value proposition of using such a system. Without this capability why even use such a system instead of running your own database on your own server? So clearly neither pole of this tradeoff is optimal from a developer standpoint.

Systems such as Google Base and Ning present an N-squared integration challenge to developers. Every app has to be potentially continually re-integrated with up to every other app in the worst case. But even in the best case, they present unworkable challenges to developers because every app may have to be continually re-integrated with at least a few other apps.

This is the very problem that the Semantic Web was created to solve. The Semantic Web provides tools for data schema integration and interoperability. The base value of RDF and OWL is that they provide a means to define, publish and map between data schemas in an open way. So for example, application creators can map their unique schemas to centrally agreed upon ontologies enabling the best of both worlds: individual developer freedom and global standards.

Of course using ontologies isn't a magic bullet -- it simply pushes the problem to a higher level. If the ontologies are changed, then any apps that rely on them may break. But at least everyone can integrate their apps with one single ontology (or a few) instead of potentially millions of disparate schemas. From a developer standpoint this is a far more manageable problem.
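The difference in scale is easy to see with a quick count. In the worst case every schema has to be reconciled with every other schema, which is n(n-1)/2 pairings; with a shared ontology each schema needs only one mapping. The function names below are made up for this illustration, and the counts ignore how hard any individual mapping is.

```python
def pairwise_integrations(n_schemas: int) -> int:
    """Worst case: every schema reconciled with every other schema."""
    return n_schemas * (n_schemas - 1) // 2


def ontology_integrations(n_schemas: int) -> int:
    """Each schema is mapped once, to the shared ontology."""
    return n_schemas


for n in (10, 1000, 1_000_000):
    print(n, pairwise_integrations(n), ontology_integrations(n))
```

At a thousand schemas that is 499,500 potential pairings versus 1,000 mappings, and the gap widens quadratically from there.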

You can read more about my thoughts on the evolution of a World Wide Database (WWDB) here.

October 25, 2005

The World Wide Database -- Google Base Thoughts

I am playing around with the barely functional live beta of Google Base that just launched. There's not much there, but what I do see is interesting. At the very least this is going to be serious competition for Ning. Beyond that it may compete with Craigslist and other classifieds and events listing services. It's an interesting first step.

But I also see several potential major problems with the approach that Google Base is taking -- in particular there does not seem to be any notion of real semantics under the hood. Is the data at least available as RDF? But even if it is -- how will it be integrated as everyone starts creating their own types? From what I can see, without data type standards, Google Base is likely to develop into billions of non-integrated data record types -- an unusable "data soup." Searching across these non-normalized records will be next to impossible without an ontology or some form of higher-level data integration. I wonder if the folks at Google have thought this through? At my own startup, Radar Networks, we've spent several years exploring these issues in our own work and in our DARPA work -- all of which is centered around making use of richer semantics in applications. And we've built a working system that makes this much more practical.

We believe that "the world wide database" requires the Semantic Web as its key enabling infrastructure. The technologies of the Semantic Web (RDF/OWL principally, and perhaps XML Topic Maps as well) enable a truly interoperable, open data exchange layer. From what I can see neither Ning nor Google sees this, but they are interesting first steps at least. If anyone from Google or Yahoo is reading this, I would be interested in speaking further with you.

Update: It seems Google has taken Google Base offline for a while.

October 20, 2005

The Future of the Web is Semantic

Here is a good article from IBM that provides a decent, not overly technical overview of the technologies that make up the Semantic Web, and the value they offer.


Nova's Trip to Edge of Space

  • Stepsedgestratosphere
    In 1999 I flew to the edge of space with the Russian air force, with Space Adventures. I made it to an altitude of just under 100,000 feet and flew at Mach 3 in a MiG-25 piloted by one of Russia's best test pilots. These pics were taken by Space Adventures from similar flights to mine. I didn't take digital stills -- I got the whole flight on digital video, which was featured on the Discovery Channel.

Nova & Friends, Training For Space...

  • Img021
    In 1999 I was invited to Russia as a guest of the Russian Space Agency to participate in zero-gravity training on an Ilyushin-76 parabolic flight training aircraft. It was really fun!!!! Among other people on that adventure were Peter Diamandis (founder of the X-Prize and Zero-G Corporation), Bijal Trivedi (a good friend of mine, science journalist), and "Lord British" (creator of the Ultima games). Here are some pictures from that trip...

People I Like

  • Peter F. Drucker
    Peter F. Drucker was my grandfather. He was one of my principal teachers and inspirations all my life. My many talks with him really got me interested in organizations and society. He had one of the most impressive minds I've ever encountered. He died in 2005 at age 95. Here is what I wrote about his death. His foundation is at http://www.pfdf.org/
  • Mayer Spivack
    Mayer Spivack is my father; he's a brilliant inventor, cognitive scientist, sculptor, designer and therapist. He also builds carbon fiber trimarans in his spare time, and studies animal intelligence. He is working on several theories related to the origins of violence and ways to prevent it, new treatments for learning disabilities, and new theories of cognition. He doesn't have a Web site yet, but I'm working on him...
  • Marin Spivack
    Marin Spivack is my brother. He is one of the only western 20th generation lineage holders of the original Chen Family Tai Chi tradition in China. He's been practicing Tai Chi for about 6 to 10 hours a day for the last 10 years and is now one of the best and most qualified Tai Chi teachers in America. He just returned from 3 years in China studying privately with a direct descendant of the original Chen family that created Tai Chi. The styles that he teaches are mainly secret and are not known or taught in the USA. One thing is for sure, this is not your grandmother's Tai Chi: This is serious combat Tai Chi -- the original, authentic Tai Chi, not the "new age" form that is taught in the USA -- it's intense, physically-demanding, fast, powerful and extremely deadly. If you are serious about Tai Chi and want to learn the authentic style and applications, the way it was meant to be, you should study with my brother. He's located in Boston these days but also travels when invited to teach master classes.
  • Louise Freedman
    Louise specializes in art-restoration. She does really big projects like The Museum of Fine Arts in Boston, The Gardner Museum and Harvard University. She's also a psychotherapist and she's married to my dad. She likes really smart parrots and she knows how to navigate a large sailboat.
  • Kris Thorisson
    Kris has been working with me for years on the design of the Radar Networks software, a new platform for the Semantic Web. He has a PhD from the MIT Media Lab. He designs intelligent humanoids and virtual realities. He is from Iceland, which makes him pretty cool.
  • Kimberly Rubin
    Kim is my girlfriend and partner, and also a producer of 11 TV movies, and now an entrepreneur in the pet industry. She is passionate about animals. She has unusual compassion and a great sense of humor.
  • Kathleen Spivack
    Kathleen Spivack is my mother. She's a poet, novelist and creative writing teacher. She was a personal student of Robert Lowell and was in the same group of poets with Sylvia Plath, Elizabeth Bishop and Anne Sexton. She coaches novelists, playwrights and poets in France and the USA. She teaches privately and her students, as well as being published, have won many of the top writing prizes.
  • Josh Kirschenbaum
    Josh is a visual effects whiz, director and generalist hacker in LA. We have been pals and collaborators since the 1980's. Josh is probably going to be the next Jim Cameron. He's also a really good writer.
  • Joey Tamer
    Joey is a long-time friend and advisor. She is an expert on high-tech strategic planning.
  • Jim Wissner
    Jim is among the most talented software developers I've ever worked with. He's a prolific Java coder and an expert on XML. He's the lead engineer for Radar Networks.
  • Jerry Michalski
    I have been friends with Jerry for many years; he's been advising Radar Networks on social software technology.
  • Chris Jones
    Chris is a long-time friend and now works with me in Radar Networks, as our director of user-experience. He's a genius level product designer, GUI designer, and product manager.
  • Bram Boroson
    Bram is an astrophysicist and college pal of mine. We spend hours and hours brainstorming about cellular automata simulations of the universe. He's one of the smartest people I ever met.
  • Bari Koral
    Bari Koral is a really talented singer songwriter. We co-write songs together sometimes. She's getting some buzz these days -- she recently opened for India Arie. She worked at EarthWeb many years ago. Now she tours almost all year long and she just had a hit in Europe. Check out her video, on her site.
  • Adam Cohen
    Adam Cohen is a long-term friend; we were roommates in college. He is a really talented composer and film-scorer. He doesn't have a Web site but I like him anyway! He's in Hollywood living the dream.
