Overview
Databricks is the leading lakehouse platform, founded by the creators of Apache Spark. The platform unifies data engineering, data warehousing (Databricks SQL), machine learning (MLflow, Mosaic AI), and AI applications on a common storage layer built on Delta Lake, with Unity Catalog as the governance plane. Databricks runs as a managed service on AWS, Azure, and GCP, with deep partnerships including the strategic Azure Databricks integration co-engineered with Microsoft.
The 2023 acquisition of MosaicML positioned Databricks as a credible foundation model training and serving platform. The company has consistently driven open standards (Delta Lake, Iceberg interoperability via UniForm, MLflow), which lower switching costs. Pricing has two parts: Databricks consumption, metered in Databricks Units (DBUs), plus the underlying cloud compute billed by the cloud provider. Total cost can be material, so capacity planning and workload management are needed to keep spend predictable.
Key Features
- Delta Lake transactional storage layer over Parquet
- Databricks SQL for serverless data warehousing
- Unity Catalog for centralised data and AI governance
- Notebooks with collaborative editing (Python, SQL, Scala, R)
- Mosaic AI for foundation model fine-tuning and serving
- MLflow for experiment tracking and model management
- Delta Live Tables for declarative pipelines
- Workflows orchestration with task dependencies
- Photon vectorised query engine
- Genie (AI/BI Genie) natural-language analytics
- Lakeflow data integration (announced 2024)
- Marketplace for datasets, models, and apps
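To make "task dependencies" concrete, the following is a minimal, generic sketch of how a dependency graph resolves into an execution order. It uses only the Python standard library and does not call any Databricks API; the task names and the `execution_waves` helper are hypothetical and chosen purely for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "ingest_raw": set(),
    "clean": {"ingest_raw"},
    "build_features": {"clean"},
    "refresh_dashboard": {"clean"},
    "train_model": {"build_features"},
}


def execution_waves(deps):
    """Group tasks into waves; tasks inside one wave have no mutual dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose upstreams are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Tasks that land in the same wave could in principle run in parallel, which is the property an orchestrator exploits when it schedules independent branches of a pipeline.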
Pricing
| Workload | Billing Unit | Typical List Rate |
|---|---|---|
| Databricks SQL Serverless (Pro) | Per DBU | $0.55/DBU (us-east-1) |
| Jobs Compute | Per DBU | $0.15/DBU (Standard tier) |
| All-Purpose Compute | Per DBU | $0.55/DBU (Premium tier) |
| Model Serving (Provisioned) | Per DBU | $0.07–$0.95/DBU (model dependent) |
Pricing verified May 2026. DBU rates vary by cloud region, edition (Standard/Premium/Enterprise), and workload type. Underlying cloud compute and storage costs are separate from DBU consumption.
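Because the DBU charge and the cloud compute charge arrive on separate bills, a rough estimate has to add the two. The sketch below shows that arithmetic for a single workload. Only the $0.15/DBU Jobs Compute rate comes from the table above; the DBUs emitted per node-hour, the VM price, the cluster size, and the run hours are placeholder assumptions, and the `Workload` and `monthly_cost` names are hypothetical. Substitute figures from your own cloud and Databricks price lists before relying on the output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """One workload billed as DBUs plus cloud VMs; sample values are hypothetical."""
    name: str
    dbu_rate_usd: float        # USD per DBU (list rate for the workload type)
    dbus_per_node_hour: float  # assumed DBUs emitted by one node in one hour
    vm_rate_usd: float         # assumed cloud VM price per node-hour
    nodes: int
    hours_per_month: float


def monthly_cost(w: Workload) -> dict[str, float]:
    """Split the monthly estimate into its Databricks and cloud-provider parts."""
    node_hours = w.nodes * w.hours_per_month
    dbu_cost = node_hours * w.dbus_per_node_hour * w.dbu_rate_usd
    cloud_cost = node_hours * w.vm_rate_usd
    return {"dbu": dbu_cost, "cloud": cloud_cost, "total": dbu_cost + cloud_cost}


nightly_etl = Workload(
    name="nightly-etl",
    dbu_rate_usd=0.15,       # Jobs Compute, Standard tier (from the table)
    dbus_per_node_hour=2.0,  # assumption: depends on instance type
    vm_rate_usd=0.50,        # assumption: depends on cloud, region, instance type
    nodes=8,
    hours_per_month=100,
)

estimate = monthly_cost(nightly_etl)
print(f"{nightly_etl.name}: " + ", ".join(f"{k}=${v:,.2f}" for k, v in estimate.items()))
```

With these placeholder inputs the cloud portion exceeds the DBU portion, which illustrates why forecasting from DBU rates alone can understate total spend.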
Strengths
- Strongest unified platform for data engineering, ML, and AI workloads
- Open standards (Delta Lake, MLflow, Iceberg via UniForm) lower lock-in vs proprietary alternatives
- Mosaic AI is a credible foundation model training and serving platform
- Unity Catalog provides genuine fine-grained governance across data and AI assets
- Multi-cloud parity with deep co-engineering on Azure (Azure Databricks)
Limitations
- Steeper learning curve than Snowflake for SQL-only teams
- Two-part pricing (DBU + cloud compute) complicates cost forecasting
- Notebook-based workflows can entrench technical debt without engineering discipline
- Some governance and lineage features remain Premium/Enterprise-tier only
- Smaller ecosystem of BI tool integrations vs Snowflake, though the gap is closing
Buyer Considerations
Databricks succeeds in organisations with mature data engineering practices and clear governance ownership. Without those foundations, the platform's flexibility tends to turn into accumulated technical debt. Mature deployments typically combine Databricks for data engineering and ML with a dedicated BI platform (Power BI, Tableau) for last-mile consumption. Single-platform aspirations covering everything from raw ingest to executive dashboards are achievable but require sustained investment in a central data platform team.