For years now in the digital age I have heard the common refrain, “Content is King”. The idea is that the most important thing to have in the age of information is compelling content that brings users to my website from their PC, tablet, smartphone or whatever device the future holds. You need to have something exciting. I agree with most of the premise, but in my view content is not king; it’s just a prince. The real king is data, and what the companies or governments who own that data intend to do with it. We are seeking information. That is clearly evident in the current industry buzzword, “Big Data”. The amount of data has indeed become “big”. We now casually talk in terms of terabytes for the local hard drives in our desktops and laptops. In the cloud, storage is quoted in terabytes or petabytes, and we are starting to hear terms beyond exabytes: zettabytes and even yottabytes. We are entering a time where every movement we make is recorded and digitized, whether it is the websites we visit, when we shop at the mall, or where we drive on the freeway. The key question is whether we can turn all this Big Data into something useful and actionable. The view from afar is that we can and we will, and that it will cause a consumer revolution unlike anything we have seen before in human history.
The amazing thing is how readily available data is. The amount of data on the internet is estimated to hit 1 zettabyte in 2015. Many of you reading this article did not even know the term zettabyte existed (admit it). Once we add 999 more zettabytes we get to a yottabyte, which will probably happen sooner than we think. With this amount of data on the public web, and against the backdrop of the ongoing NSA leak scandal featuring Edward Snowden, should there be reason for concern? Yes and no. Our ability to record history will be unparalleled. In time every activity and event in a person’s lifetime will exist on a thumb drive. Historians of the future will have little left to research or ask. We are such a spontaneous society that when a little pop-up asks if we accept the terms of whatever website we are on, we say “yes” with little hesitation. I predict an entire human life, from birth until death, will be recorded and stored on that thumb drive. Every highly emotional event and every crime committed will be recorded for humanity to witness.
With this much data available, and an increasing array of information about user behavior, there remain challenges in getting from point A to point B. Though we can collect data across the web using technologies like Hadoop, we still rely on traditional databases to store and analyze it. It is one thing when you are talking gigabytes or terabytes, but it is a wholly different thing to capture, store and analyze data sets in the neighborhood of petabytes and exabytes on a traditional DBMS. I was fortunate enough to attend a presentation by the late, great Microsoft researcher Jim Gray in 2006, a year before he disappeared off the coast of California. At the time he was working on a project to move an exabyte of data from Geneva to his lab in the Bay Area (when you were Jim Gray you got this type of funding). The challenge he discovered was not moving the data across the wire, but getting the data on and off the wire. It turned out to be a limitation in the PCI bus architecture involving the southbridge and northbridge, which had 500 GB limitations. At this point I could only imagine Jim getting out the duct tape and adding some additional bridges. Jim Gray was Big Data before we had a term for it. If he were still alive he would be enjoying life more than ever. This is but one example of the technical issues with Big Data, and since Jim Gray did this project data sets have only become larger, while the internet, as highlighted above, continues on its abundant growth projections.
The reason we tackle Big Data is the promise it can deliver. There are examples for the future and examples that exist today. When you search the web, the ability of the search engine to quickly identify and recommend information to you is an example of Big Data. When you shop on a website and it recommends an additional purchase based on prior purchasing patterns, that is an example of Big Data. The providers of this type of detailed information are not satisfied, as they want to collect more information about each individual so they can better serve and sell to them. Recently Amazon filed for a patent on predictive user behavior: identifying what a user will purchase before they have purchased it, and shipping it to them. It sounds a bit far-fetched, but this is the reality we live in. We are accruing so much data, and building such an ability to analyze it, that these far-fetched scenarios are not so far-fetched.
As with any major trend in technology, those who make the big bets early stand to reap the rewards. It is still early in the game, as the ability to identify and collect the data is maturing, but the real value will be in analyzing, deciding and executing upon the data. The opportunities are there. As much as has been done in the open source community, the traditional database players, Oracle, Microsoft and IBM, will play a big role, and I am sure they recognize a big monetary opportunity that will please shareholders down the road. The one thing I would suggest to all players is extreme focus. Some of the biggest winners will be the consulting firms that can develop the IP and hire the talent to create robust Big Data practices, particularly in the short term as many companies struggle with what all this means.
We are still very early in the Big Data revolution. Though the ideas and vision are there, the tools and expertise necessary to make them happen are still in their infancy. As with everything in technology, the technical challenges will be overcome. In the early days of the PC it was things like memory management and disk compression. In search it was relevance. Each step along the way will offer an opportunity for some company or companies to fill a temporary void until a solution is developed. Those temporary voids are usually opportunities worth billions of dollars. But in the end the goal of turning Big Data into meaningful data will be met. Because of it, all our lives are set to change, yet again.
Good Night and Good Luck
Hans Henrik Hoffmann January 28, 2014