The Doppler Quarterly Spring 2017 | Page 26

Why is “export to Excel” the most popular button in a BI tool?

A typical list of analytics activity in a large enterprise may look like this:

• Monthly data mining computation that involves running large-scale neural networks on a twenty-node cluster
• Filtering, joining and summarizing terabytes of data over the weekend for Monday’s CxO dashboard
• Nightly fuzzy de-duplication and record linkage process that crawls through multiple data feeds, connecting and grouping the data
• Full-text searches against terabytes of text that require sub-second response time

It is simply not possible to standardize on a small set of tools that gracefully serves all these masters without running into performance issues. If we constrain users with enterprise standards, they start generating hundreds of feeds out of the data warehouse to run specific workloads, mostly using Excel. We’ve seen a large enterprise use Business Objects mainly as a data feeder to Excel.

Dependence on IT grows, self-service business intelligence remains an aspiration and Excel worksheets proliferate at all levels of the organization. To enable innovation across the organization, analytics infrastructure should support a variety of front-end analysis patterns and a range of tools.

Polyglot Persistence Rather than Relational Models

James Serra defines polyglot persistence in one of his blog posts as follows:

“Polyglot Persistence is a fancy term to mean that when storing data, it is best to use multiple data storage technologies, chosen based upon the way data is being used by individual applications or components of a single application.”

[Figure 3: An Example E-commerce Application with Polyglot Persistence. The “Speculative Retailers Web Application” stores each kind of data in a different technology:
• User Sessions: Redis
• Financial Data: RDBMS
• Shopping Cart: Riak
• Recommendations: Neo4J
• Product Catalog: MongoDB
• Reporting: RDBMS
• Analytics: Cassandra
• User Activity Logs: Cassandra]