The Doppler Quarterly Spring 2017 | Page 32

Ensuring Quality Throughout Your Data Migration to the Cloud

Amit Dutta, Joey Jablonski & Seth Rao
Before any major analytical deployment in the cloud, it’s vital that your organization work to define key standards for data quality.
The cloud is empowering organizations to grow their data footprints faster than traditional data centers ever allowed. Concurrently, the rise of advanced data science tools is enabling broader audiences within organizations to analyze data and identify new insights. However, this growth can let small levels of error creep in as data is transferred between different IT systems, with or without the cloud. Even a single part-per-million (PPM) level of error could be significant, considering the hundreds of millions of records that are transferred every day. These errors have the potential to impact the business if not handled proactively.
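To make that scale concrete, here is a minimal sketch of the arithmetic. The volume of 300 million records per day is an assumed figure, chosen only to match "hundreds of millions", and the function name `expected_errors` is illustrative.

```python
def expected_errors(records_per_day: int, error_rate_ppm: float) -> float:
    """Expected number of erroneous records per day at a given PPM error rate."""
    return records_per_day * error_rate_ppm / 1_000_000


# Assumed volume: 300 million records per day (illustrative only).
daily = expected_errors(300_000_000, 1)
print(f"{daily:.0f} bad records per day, about {daily * 365:.0f} per year")
```

Under that assumption, a 1 PPM error rate still leaves hundreds of bad records every day, which adds up quickly over a year.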
The intersection of cloud adoption and citizen data scientists requires that organizations deliver high-quality, documented data sets for analysts to consume. The promise of cloud for data storage hinges on the certainty that the integrity of the data will be preserved as it’s moved between different IT systems and platforms. This article outlines the requirements needed to achieve business-driven data quality.
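One common way to check that integrity survives a move is to fingerprint every record on both sides and compare the results. The sketch below is an illustration rather than a method prescribed by this article: the function names, the sample records and the choice of SHA-256 over canonical JSON are all assumptions.

```python
import hashlib
import json
from collections import Counter


def record_digest(record: dict) -> str:
    """Hash a record's canonical JSON form so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(source: list[dict], target: list[dict]) -> dict:
    """Compare two record sets as multisets of digests."""
    src = Counter(record_digest(r) for r in source)
    tgt = Counter(record_digest(r) for r in target)
    return {
        "source_count": len(source),
        "target_count": len(target),
        # Digests present in the source but absent from the target.
        "missing_or_altered": sum((src - tgt).values()),
        # Digests present in the target but absent from the source.
        "unexpected": sum((tgt - src).values()),
    }


# Hypothetical sample data: the second record was altered in transit.
source = [{"id": 1, "amount": "10.00"}, {"id": 2, "amount": "25.50"}]
target = [{"amount": "10.00", "id": 1}, {"id": 2, "amount": "25.05"}]
print(reconcile(source, target))
```

Row counts alone would report this transfer as clean, since both sides hold two records. Comparing digests also catches a record whose content changed along the way.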
Tolerance for Errors
Cloud adopters must be on the lookout for poor data quality permeating through their ecosystems. For data focused on targeted marketing, a higher error level in post-analysis data is more acceptable, given the low cost of a targeted advertisement. However, when the data is critical from a PR, compliance or legal standpoint, errors can be expensive. For example, data required for corporate or industry governance at a financial services institution must maintain a 100% accuracy level to ensure that stock holdings are accurately attributed and taxes are calculated correctly.
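A tolerance like this can be written down explicitly and checked against observed error counts. In the sketch below, the category names and the marketing threshold are hypothetical; only the zero tolerance for governance data follows from the 100% accuracy requirement described above.

```python
# Hypothetical tolerances in parts per million (PPM).
TOLERANCE_PPM = {
    "targeted_marketing": 1_000.0,  # assumed value, for illustration only
    "governance": 0.0,              # 100% accuracy means no errors allowed
}


def within_tolerance(category: str, bad_records: int, total_records: int) -> bool:
    """Return True if the observed error rate is at or below the tolerance."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    observed_ppm = bad_records / total_records * 1_000_000
    return observed_ppm <= TOLERANCE_PPM[category]


# 40 bad records in a million is acceptable for marketing under the assumed limit.
print(within_tolerance("targeted_marketing", 40, 1_000_000))
# A single bad record in a billion still fails a zero-tolerance check.
print(within_tolerance("governance", 1, 1_000_000_000))
```

Agreeing on these thresholds with the business before a migration makes it clear which data sets can absorb small errors and which cannot.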
Cloud Challenges
The benefits of moving data to the cloud are very appealing. Major cloud providers have impressive offerings that are extraordinarily economical compared with on-premises costs. Cloud has introduced new opportunities, but with those benefits also come challenges to data integration. The biggest change when moving from on-premises data management to the cloud is the wide range of services available for the storage, processing and analysis of data. Yet each service introduces new challenges for data manipulation, resulting in output that may not meet the level of quality demanded by the business. So why move? Because leveraging cloud platforms offers the flexibility to store data in a variety of formats, while optimizing the platform technology to the query and analysis patterns of the users.
But cloud providers, while still in early stages of maturity, do not yet have the tools to ensure the data is fit