Saturday, November 20, 2010


The current technological landscape is obsessed with data. Open data, data APIs, walled gardens, data silos, data stores, the semantic web - the list is endless. Gov 2.0 is all about getting access to government data: the US has its own initiative, the UK assigned Tim Berners-Lee to kick one off, and similar efforts are underway elsewhere. Data, data, data. Indeed, Tim O'Reilly says the internet OS is a data OS. In reality, all operating systems are data operating systems, and the internet OS is no different.

Data or Process?
So what's wrong? Well, we seem to be confused: we seem to be separating data from the processes which operate on that data. Open data and open software are separate topics right now. Inert data as the next internet frontier is being heralded as a profound observation, and that's a mistake. Sometimes boiling things down so that they're simple and concise shows a superior grasp of both the subject matter and the communication medium. Sometimes it just means you've missed something important.

The processes which operate on data are data. If you look at the bits and bytes on your hard drive, it is impossible to distinguish between Photoshop the application and the Photoshop files. They're just data. Look at it another way - when a developer saves a code file, the code is data to the development environment. And the code for the development environment is data to whatever was used to develop it. Even more philosophically - which came first, data or process? Take transparency in government as a simple demonstration of how much easier this view makes things: we don't just want census 'data' to be made available, we also want the process of census taking to be open. In fact, the latter has significantly greater implications for our ability to participate in the government machine.
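To make the 'process is data' point concrete, here is a minimal Python sketch. It is only an illustration: the `double` function and the variable names are made up for this example, and the last step relies on how the standard CPython interpreter stores compiled functions.

```python
# Source code held as an ordinary string: at this point it is just data.
source = "def double(x):\n    return x * 2\n"

# Like any other data, it can be encoded to bytes, stored, or sent over a wire.
as_bytes = source.encode("utf-8")

# Handing those bytes back to the interpreter turns the data into a process.
namespace = {}
exec(compile(as_bytes.decode("utf-8"), "<data>", "exec"), namespace)
double = namespace["double"]  # hypothetical function, defined by the string above

# The running function is itself backed by data: its compiled bytecode
# (this attribute is a CPython detail, assumed here for illustration).
bytecode = double.__code__.co_code

print(double(21))
print(type(as_bytes).__name__, type(bytecode).__name__)
```

Nothing in the bytes themselves says whether they are 'a file' or 'a program'; that depends entirely on what reads them.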

From this perspective, the internet OS is just like any other - a magical structure that bootstraps itself from a singularity and delivers a universe of complexity and beauty. How we managed to use the term operating system and forget process is a mystery. We can observe the resulting damage quite plainly - what would an OS that didn't appreciate process look like? All the applications would be completely different: they would each require separate logins, have different controls and non-standard interfaces, install differently, fail differently, report differently, vary significantly in quality, and fail to integrate in most cases, or integrate in an ad-hoc manner in a few. We'd have silos and lack of transparency, lack of trust, poor resource usage, lock-in... what a nightmare! Oh wait... that's the internet - an OS that's way too focussed on a concept of inert data. We are starting to see the open data discussion extend to questions like 'who should maintain this data?', 'how should this data be analysed?' and 'what means were used to collect this data?' - Oops! Did we forget something? Time to apply our understanding of how an OS really works. Time to reboot with a new kernel version that better understands process.

Friday, November 19, 2010


A recent article in the Wall Street Journal highlights the emergence of new internet monopolies around points of control. Strangely, they aren't emerging due to clever positioning, supplier agreements, partnerships or high market entry costs. They are emerging because monopoly is the most effective configuration for delivering user benefit. A connective system delivers the greatest convenience and perceived benefit when it is universal. For example, the bigger and more connected the social graph, the more powerful it is. Ubiquity is inevitable. The internet operating system is emerging, not as loosely connected competing components, but as ubiquitous infrastructure.

Our power infrastructure is ubiquitous, as are our roads and the internet itself - all of them connective systems. There is no competition for the internet - what use would an alternative be? Its unconnected value is too low - no matter how brilliant its engineering. If we look at roads: sure, private companies build roads - but they don’t get to choose which side we drive on, what a stop sign looks like, or what the national speed limit is. The universal nature of the road infrastructure is what drives the incredible competition in the auto industry, and the user benefit is enormous. When such platforms are freely available, we reap the greatest benefit from competition. Ubiquitous infrastructure shouldn’t be what we compete for, but what we compete on. Of course, this doesn't stop companies trying to own the platform, and many succeed in doing so for long periods of time. However, without exception, the greater benefit is derived when the platform is the arena for competition, not the subject of it.

There’s an interesting conclusion to be drawn here - Facebook cannot own the social graph any more than Ford can own the road infrastructure. If Ford could control Toyota's access to the road infrastructure, you would expect a situation similar to that between Google and Facebook. Competition would be severely restricted. Facebook has 'won', but only something that will slip inevitably from its grasp. The social graph must be a platform for competition, not the target of it. Anti-competition litigation seems inevitable.

Facebook losing control of the social graph also highlights the ethereal and necessary companion of ubiquitous infrastructure - benevolent governance. Who should administer the social graph for the good of all? It's not something you're likely to get from a corporate monopoly, but something that is going to become increasingly necessary. Terry Jones observes the following when responding to Tim O’Reilly's question ‘Where is the Web 2.0 address book?’:
‘Relief does not lie in the direction of more applications behind more API’s. It lies instead in allowing related data to co-exist in the same place.’
This is a call for ubiquitous infrastructure, and the question of governance arises in the article's first comment:
‘But who owns and runs the central datastore? Why should they be trusted? Who foots the bill and how?’
A common shared database would make our lives easier - one might argue, in fact, that the social graph is simply a subset of this.

What we are seeing here is the emergence of new components that belong in the fabric of the web - things that should join HTTP and DNS and perhaps learn lessons from their governance. The social graph and the common database are just the beginning - we are witnessing the formation of the internet operating system - not as a loosely connected set of competing technologies (for that is just the chaotic state prior to equilibrium), but as an emergent, ubiquitous internet infrastructure upon which real competition can thrive. This is not a process that ends - new candidates for inclusion will appear continuously, and it may be the case that the natural emergence of monopolies highlights these candidates for us. The sooner this infrastructure is delivered as an open and level playing field, the sooner we will reap the true rewards of competition in this new age of connectivity.