[Figure 1 diagram. Labels shown: Data Sources; Structured Data; Semi-Structured Data;
External Data; Batch Movement; Ingest Data; Sqoop; SQL; Datalake; Meta-store;
DataWarehouse (HOT); DataWarehouse (COLD); Cosmos DB; SQL DB; Real Time Movement;
Bursts of Data; Streaming Events; Events; Event Uploads; Event Updates; ChangeLog
Trigger; Azure IoT Hub; Azure Functions; Azure Databricks (Python, Scala, Spark SQL,
Spark ML); Enterprise data catalog (EDC); Mapping, Transform, Quality (BDM/BDQ);
Interactive Query (HDInsight); Advanced Analytic layer; Business Intelligence;
Azure Analysis Services; Power BI.]
Figure 1: Azure-Based Data Movement Pipelines (Batch and Real-Time Process)
Flowing Through Different Pipelines
Data follows one of two distinct pipelines, depending on how it is captured.
Existing business processes usually rely on batch movements and the associated
extract, transform and load (ETL) processes, which ensure that the data is cleaned and
de-duplicated before it feeds on-premises capabilities and products. Batch movements
can bring in bundles of structured, semi-structured and external data. Big data systems
must handle the variety, velocity and volume of the data that needs to be collected,
processed, transformed and managed in order to derive relevant, meaningful insights.
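As a rough illustration of the batch path described above, the following Python sketch walks a small batch through extract, transform (clean), de-duplicate and load steps. The column names, the sample rows and the choice of email as the de-duplication key are all assumptions made for illustration; the in-memory list simply stands in for a real warehouse table.

```python
import csv
import io

# Hypothetical raw batch extract; the column names are illustrative only.
RAW_BATCH = """customer_id,email,country
1, Alice@Example.com ,US
2,bob@example.com,uk
1,alice@example.com,us
3,,DE
"""


def extract(raw_text):
    """Parse the CSV batch into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Clean each row: trim whitespace, normalise case, drop rows with no email."""
    cleaned = []
    for row in rows:
        email = row["email"].strip().lower()
        if not email:
            continue  # incomplete record, skipped in this sketch
        cleaned.append({
            "customer_id": int(row["customer_id"].strip()),
            "email": email,
            "country": row["country"].strip().upper(),
        })
    return cleaned


def dedupe(rows, key="email"):
    """Keep only the first occurrence of each key value."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def load(rows, target):
    """Append cleaned rows to the target store (a plain list in this sketch)."""
    target.extend(rows)
    return len(target)


warehouse = []  # stand-in for a warehouse table
load(dedupe(transform(extract(RAW_BATCH))), warehouse)
print(warehouse)
```

Cleaning runs before de-duplication on purpose: the two "Alice" rows only match once whitespace and letter case have been normalised.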
Then there are the new streams of data that need to move through the system in bursts.
Event-based, stream-based and IoT-based data capture and processing are growing rapidly
in the data ecosystem, along with the associated architectures and cloud services. Live
streams, interactive sessions and website clickstream logs can be processed in real
time to derive insights.
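To make the real-time path more concrete, here is a minimal Python sketch that counts page views per fixed (tumbling) time window as click events arrive one at a time. It is an illustration only, not how any particular Azure service works: the event shape (timestamp in seconds, page path), the 60-second window and the assumption that events arrive in timestamp order are all hypothetical simplifications.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Yield (window_start, Counter of pages) each time a window closes.

    Assumes events arrive ordered by timestamp (a simplification; real
    streams usually need handling for late or out-of-order events).
    """
    current_start = None
    counts = Counter()
    for timestamp, page in events:
        window_start = timestamp - (timestamp % window_seconds)
        if current_start is None:
            current_start = window_start
        if window_start != current_start:
            # The previous window is complete: emit it and start a new one.
            yield current_start, counts
            current_start, counts = window_start, Counter()
        counts[page] += 1
    if current_start is not None:
        yield current_start, counts  # flush the final, still-open window


# Hypothetical clickstream: (seconds since start, page path).
clicks = [(0, "/home"), (12, "/pricing"), (59, "/home"), (61, "/home"), (130, "/docs")]
results = list(tumbling_window_counts(clicks))
for window_start, page_counts in results:
    print(window_start, dict(page_counts))
```

Because the function is a generator, each window's result is available as soon as the window closes, rather than after the whole stream has been read, which is the essential difference from the batch path.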
Cloud-native warehouses are a breed of products that take advantage of decoupled
storage and compute in the cloud, which delivers scalability, elasticity and cost
effectiveness. The decoupled storage (e.g., Amazon S3 in AWS or Blob Storage in Azure)
can persist and grow independently, while the compute can autoscale, be paused or be
resumed. Cloud-native warehouses replicate across regions, providing reliability and
availability.
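The separation of storage and compute can be pictured with a toy Python model. The class and method names below (ObjectStore, ComputeCluster, scale_to and so on) are hypothetical and do not correspond to any vendor's API; the sketch only shows that data can keep landing in storage while compute is paused, and that compute can be resized without touching the stored data.

```python
class ObjectStore:
    """Stand-in for durable object storage; it persists independently of compute."""

    def __init__(self):
        self._objects = {}

    def put(self, key, rows):
        self._objects[key] = list(rows)

    def get(self, key):
        return self._objects[key]


class ComputeCluster:
    """Stand-in for warehouse compute that can be paused, resumed and resized."""

    def __init__(self, store, nodes=1):
        self.store = store
        self.nodes = nodes
        self.running = True

    def pause(self):
        self.running = False

    def resume(self):
        self.running = True

    def scale_to(self, nodes):
        if nodes < 1:
            raise ValueError("a cluster needs at least one node")
        self.nodes = nodes

    def total(self, key, column):
        """Sum one column of a stored object; only works while compute is running."""
        if not self.running:
            raise RuntimeError("compute is paused; resume it before querying")
        return sum(row[column] for row in self.store.get(key))


store = ObjectStore()
cluster = ComputeCluster(store, nodes=2)

cluster.pause()                                         # compute is off
store.put("sales", [{"amount": 10}, {"amount": 32}])    # storage still accepts data

cluster.resume()
cluster.scale_to(4)                                     # resize compute; data is unchanged
print(cluster.total("sales", "amount"))
```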
SPRING 2019 | THE DOPPLER | 23