2) Poor data quality – “Garbage in equals garbage out.” Data scientists cannot extract accurate and
valuable insights without clean, de-duplicated data. In fact, they are being overwhelmed with messy
big data. According to a recent survey of data scientists:
“2/3 of data scientists surveyed stated that cleaning and organizing data is the least interesting and
most time-consuming task in their jobs.”
So sending a bunch of data filled with junk to your data science team will decrease their productivity,
skew and invalidate results, and aggravate them. It creates a poor work environment,
which also relates to culture.
Further, a dose of reality: Carl Thomas, the IG lead at JPMorganChase, stated that when their IG
efforts dug deep into the business units, the number one comment or complaint they
found was that business managers were unsure of, or dissatisfied with, information quality. Managers generally
did not have confidence in the underlying information used as a basis for decisions, and this is within
the highly automated, controlled, and managed environment of a market leader. In addition, various
studies from leading research firms confirm this to be widely true: a significant percentage of
information (roughly 25% or more) in organizations is flawed. IG efforts work to improve information quality.
3) Cost – First off, you have soaring e-discovery and regulatory costs. These are real. When digital
information is not well organized or easily found, it adds greatly to the costs of e-discovery and
meeting regulatory demands. The processes are costly and labor-intensive, rather than streamlined,
repeatable, routine, and automated. Also, when, on average, 40%-70% of the information most
organizations store is duplicate, resources are being wasted that could go to the bottom line or
be invested elsewhere. One client we have spends $40M/year on digital storage, and that figure is increasing by
40% per year. Cleaning up what they have can cut storage needs. Even stemming this growth will save
hard dollars in the future.
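To make the arithmetic behind that client example concrete, here is a minimal sketch in Python. It compounds the $40M/year spend at 40% per year and shows what the same year would cost with duplicates removed. The function names are hypothetical, and the cleanup figure rests on a simplifying assumption that storage spend scales linearly with the volume stored, which real pricing may not follow.

```python
def project_storage_spend(current_spend, annual_growth, years):
    """Compound the current annual spend forward; index 0 is the current year."""
    return [current_spend * (1 + annual_growth) ** year for year in range(years + 1)]


def spend_after_cleanup(current_spend, duplicate_share):
    """Spend once duplicates are removed.

    Assumption (not from the source): spend is proportional to stored volume.
    """
    return current_spend * (1 - duplicate_share)


# Figures from the client example: $40M/year, growing 40% per year.
for year, spend in enumerate(project_storage_spend(40.0, 0.40, 3)):
    print(f"Year {year}: ${spend:.1f}M")

# Low and high ends of the 40%-70% duplicate range cited above.
for share in (0.40, 0.70):
    print(f"{share:.0%} duplicates removed: ${spend_after_cleanup(40.0, share):.1f}M")
```

At that growth rate the bill reaches roughly $110M within three years, which is why even slowing the growth matters as much as the one-time cleanup.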
“Freedom ain’t free. And neither is storage.”
What many cloud providers are doing is simply cost-shifting, otherwise known as the old “bait and
switch” in the used car sales world. They are cutting the price they charge for storage but making it up
on the back end by charging for analytics and other services when you need to access that
information. They are willing to take losses to gain market share, although later they will be under
pressure to earn a profit for shareholders. But if your organization is going the cloud route, cleaner,
unique (de-duplicated) information as a result of your IG program means less cost, as many providers
charge per gigabyte.
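As a rough illustration of how de-duplication translates into fewer billable gigabytes, the sketch below fingerprints each file's content with SHA-256 and counts only one copy of each distinct fingerprint. It is an assumption-laden example rather than a description of any particular IG tool: the function names and sample files are hypothetical, and it only catches byte-for-byte duplicates, not near-duplicates such as reformatted or slightly edited copies.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Fingerprint the content so identical files get identical digests."""
    return hashlib.sha256(data).hexdigest()


def duplicate_share(documents: dict) -> float:
    """Fraction of stored bytes that belong to exact duplicate copies.

    `documents` maps a (hypothetical) file name to its raw bytes.
    """
    total_bytes = sum(len(data) for data in documents.values())
    if total_bytes == 0:
        return 0.0
    unique_sizes = {}
    for data in documents.values():
        # Keep the size of the first copy seen for each distinct content.
        unique_sizes.setdefault(content_digest(data), len(data))
    return 1 - sum(unique_sizes.values()) / total_bytes


# Hypothetical sample: the same contract saved in two places, plus one memo.
sample = {
    "legal/contract.txt": b"terms and conditions",
    "sales/contract_copy.txt": b"terms and conditions",
    "hr/memo.txt": b"holiday schedule",
}
print(f"Duplicate share of stored bytes: {duplicate_share(sample):.0%}")
```

Under per-gigabyte pricing, that duplicate share is an estimate of the portion of the storage bill that cleanup could remove.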
To be sure: Regardless of pricing model, digital storage operations have hard costs. There are servers,
disk drives, optical units, controllers, cables, tapes, software (master data management, file
management, compression, security, etc.) and such, and it all has to be housed in a secure,
air-conditioned facility with raised flooring. There are labor costs associated with storage operations, and
the hardware and software must be serviced and maintained.
Beware of cloud providers that offer nearly “free” storage. Once they have almost all of your digital
content, what’s to keep them from changing their business model in a few years and holding your
information hostage? Has any organization experienced this phenomenon when it sent boxes upon
boxes of paper files off to be stored in a warehouse for a cheap upfront fee? Isn’t it strange how the
complicated storage and retrieval fees add up over time? The same thing can happen in the digital world,
only the consequences will be worse. Your organization will not be able to easily or cheaply
migrate all that information over to a competing cloud provider once it is housed in a cloud
repository. The tools to do so are not available today. And it is not in your cloud provider’s
interest: they have no business motivation to make it easy for you to leave. You’re basically locked in.