The Building Blocks of Azure Data Factory

Getting started with Azure Data Factory

A plethora of tools exist to create data pipelines. Every major player in the cloud computing business provides one (Azure Data Factory, AWS Glue, Google Cloud Composer, Databricks Workflows), and various open-source alternatives exist as well (e.g., Airflow, Airbyte, or Prefect). Though they’re open source, fully managed versions of these do exist (e.g., Astronomer provides a managed version of Airflow called Astro). Previously, I’ve had enjoyable experiences building pipelines with Airflow (self-hosted), Databricks Workflows, and Delta Live Tables (though the latter is not really an orchestrator).

Ten (minus one) Key Principles For Data Quality

Why data quality is more than quality data

I was once asked a simple question during a job interview: “What constitutes data quality?” Surely that should have been an easy question to answer. After all, we all have a gut feeling for what quality data is, or at least most of us have a sense of what bad data looks like. And yet it stumped me. Why was I unable to give a comprehensive and cohesive answer?
