Sunnyvale, January 14th – The Hive got off to a great start for the New Year by hosting an event on “The Future of Data” featuring Doug Cutting, the founder of Hadoop. More than 600 people attended the talk at NetApp HQ, which sparked much debate.

Doug Cutting serves as Chief Architect of Cloudera, Inc. He is the founder of several successful open-source projects, including Lucene, Nutch, Avro, and most notably Hadoop. He presented the underlying causes of the current revolution in data processing methods and, based on these, predicted how the data world might evolve.

First, according to Doug, as hardware gets cheaper we will be able to process more data. This matters because data is incredibly valuable: in every industry, it is necessary to stay competitive. Second, he stated that platform technologies will be open-sourced. According to Doug, people are concerned about data security and prefer a vendor-independent platform. Doug then offered a definition of Hadoop: a software library that allows for the distributed processing of large data sets across clusters of computers using simple programming models or, in short, a platform that helps people work with data.

The Hadoop project is the result of years of work, and it has overcome many challenges. For instance, in 2008, after two years at Yahoo building Hadoop (reusing a lot of code from a 2002 project), Doug and his team had brought a web-scalable open-source engine to the level they wanted, but it was not yet very reliable. Every machine was linked: if one machine went down, everything went down. A lot of work has been done since then to solve this problem and improve the platform.

In conclusion, Hadoop has seen incredible growth and gained a large audience, well beyond mathematicians and computer scientists. Hadoop dominates Big Data. As the industry seems to have adopted it as a standard (Microsoft, soon IBM…), Doug claimed that it is becoming the central component for data management. He even coined a term for the entire ecosystem around Hadoop: “The Enterprise Data Hub”.