Sunday, March 30, 2014

You Can't Spell Big Data Without BI


When I first started attending conferences and meet-ups branded as "Big Data events" a few years ago, I have to be honest: as a person who tracks business intelligence (BI) technology, I felt a bit lost. Hip, new Hadoop-based startups would discuss the latest dot-zero releases of Pig, Mahout, and Hive in painful detail, while NoSQL vendors bragged about how many petabytes they could process and store. What was often ignored, however, was a plan for how a mere mortal (i.e., someone who doesn't understand MapReduce, HiveQL, or similar) would actually be able to access, analyze, and draw insights from all of this data in a straightforward manner.

Fast forward to today and I would argue that there has been a change in mentality in the Big Data camp. There is a much more mature and pragmatic look at businesses' needs and wants, as well as a healthy honesty about the limitations of Hadoop (and similar Big Data frameworks and platforms). Most importantly, there seems to be an ongoing effort to converge Big Data and BI technology - something I believe is crucial if we ever want to create real business value out of Big Data.

BI brings home the (Big Data) bacon
In contrast to Big Data, BI is already an integral part of many organizations' fabric. If implemented correctly, BI can empower organizations with data-driven (rather than gut-feel) decision-making. BI's ability to analyze, explore, and deliver business information in a timely and proactive way makes it more of a "must-have" than a "nice-to-have" in today's tight markets and economy. Merely being able to store Big Data, on the other hand, creates zero business value today.

The key is to pair Big Data with the right BI tools. One of the more interesting developments in the BI market during the last couple of years has been the addition of visual data discovery tools. These types of visual data-exploration tools allow users to see and discover patterns in large and complex datasets in an intuitive and visually compelling manner and can be low-hanging fruit for BI users who want to dabble with Big Data.

It should be noted that making BI and Big Data mesh together is still a work in progress, but there are some significant trends that point to continued and improved coupling of BI and Big Data technology:

1. BI vendors are scrambling to be 'Big Data ready' (and vice versa) - Big Data is a great opportunity for any BI vendor, and every vendor out there is already investing money to grab the mind share and market share that comes with it. But the problem BI vendors have bumped into is that directly querying something like Hive, which is batch-oriented, is not very interactive and can often be unreliable. To get around this, BI vendors are improving their integration with Hadoop, NoSQL, and NewSQL vendors and technology to improve access to Big Data stores.

2. SQL is the new black - Access to data in Big Data stores has often been coupled with complex querying and processing languages that are read-only and often lack ACID functionality (i.e., they lack transactional reliability). A couple of years ago, if you ever mentioned SQL to a Hadoop or non-relational Big Data vendor, they would look at you like you were from a different planet. However, during the last year there has been a change in mentality and SQL is back in vogue amongst Big Data vendors. In particular, interactive SQL on top of non-relational environments seems to be all the rage these days. For BI this means organizations will be able to leverage skills and tools they have already invested in.

3. Realization that non-relational environments will augment, not replace, EDWs - You might have heard that the enterprise data warehouse (EDW) is on its deathbed. More than likely, these types of statements started in the marketing department of a non-relational Big Data vendor, so please take them with a grain of salt. EDWs will continue to be a good way to deliver structured, recurring, and reliable data in a BI environment.

What a non-relational environment can add to the BI stack, however, is the ability for BI users to explore beyond aggregated and structured data sets. Functionality such as active archiving and massive exploration of raw data opens up innovative ways of getting new types of answers and insights. At the end of the day, EDWs and non-relational environments aren't enemies; they are friends, and both can be part of an organization's BI architecture.

4. BI startups born in a Big Data era - There is also a fresh wave of innovative BI and analytics startups that were born with one goal in mind: run efficiently on top of Big Data technology. Whereas traditional BI vendors have to either fork their codebases or create a new set of products, these vendors were built from the ground up to enable BI users to gain insights from large, complex, and disparate data stored in relational and non-relational environments.

Happy to hear your feedback - are you seeing similar trends? Are your organizations looking to couple Big Data with BI?
