Aer Lingus Spent 18 Months Rebuilding Data Before AI

Aer Lingus has spent the past 18 months redirecting a "sizeable percentage" of its IT and change budget away from visible AI projects and into data foundations built on Databricks' lakehouse architecture. The 90-year-old airline's position is that it cannot scale AI inference until it has resolved data extraction from its legacy systems.

The airline's analytics have been migrated from on-premises warehouses to Databricks' lakehouse, a unified stack referred to by Aer Lingus' Chief Digital, Data & Transformation Officer Dave O'Donovan as "soup to nuts". Business-facing queries are processed through Databricks Genie, a natural-language interface allowing non-technical users to ask questions in plain English against lakehouse tables. Engineering workloads are also housed on this unified architecture, eliminating the need for separate storage and duplicate ETL processes. The platform now supports real-time models that optimize flight loads, pricing, and operations, although it is unclear whether these models run within Databricks Model Serving or external infrastructure.

FIG. 02 Aer Lingus migrated from on-premises data warehouses to a unified Databricks lakehouse architecture over 18 months.

The most challenging aspect of the integration was not model selection or tuning but the physical extraction from legacy systems, which O'Donovan describes as "60 years young". These mainframe-era systems were designed for safe airline transaction processing rather than the throughput and schema flexibility required for AI pipelines. Aer Lingus was fortunate to avoid sunk costs from previous cloud migrations, allowing it to bypass incremental lift-and-shift and move directly to the lakehouse. This advantage is not replicable for carriers already engaged in partial cloud migrations, which would require resolving technical debt before establishing new foundations.

Despite claims of production deployment in revenue-critical workflows, the Databricks blog interview provides almost no hard operational metrics. The exact percentage of redirected IT spend remains unpublished, and there are no figures for lakehouse query latency, extraction pipeline throughput, cost per analytical workload, or model inference SLAs. The one area described in any detail is the human layer: a "Data Literacy Academy" built with a UK-based training partner and delivered through an online curriculum, in-person workshops, and internally produced podcasts supported by the CEO. The goal is to make "citizen developers" the norm within five years, so that every business leader can use data independently to run their department.

The governance layer is underspecified. O'Donovan emphasizes moving from "my data" departmental ownership to a shared, holistic asset, but the interview gives no detail on enforcement mechanisms: there is no mention of Unity Catalog deployment scope, programmatic data contracts, or column-level lineage granularity. Genie's natural-language interface also widens the risk surface: the interview cites no safeguards against hallucinated SQL joins, hallucinated schemas, or prompt injection against the analytics tier, and these risks grow as more business users gain direct lakehouse access. In an industry where errors in fuel-load calculations or pricing decisions have immediate regulatory and revenue consequences, the absence of published guardrails is a significant gap.
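To make the missing guardrail concrete, here is a minimal sketch of the kind of pre-execution check that could sit between a natural-language interface and the lakehouse. It is not how Genie or Unity Catalog works, and nothing in the interview describes such a check at Aer Lingus. The table names, the `ALLOWED_TABLES` allowlist, and both helper functions are hypothetical, and the regular expression is a rough stand-in for a real SQL parser.

```python
import re

# Hypothetical allowlist of tables approved for business-user queries.
ALLOWED_TABLES = {"ops.flights", "ops.bookings", "finance.fares"}

# Captures the identifier that follows FROM or JOIN (bare or schema-qualified).
# A regex is only an approximation; a production check would use a SQL parser.
_TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def unknown_tables(sql: str, allowed: set[str] = ALLOWED_TABLES) -> set[str]:
    """Return table names referenced in `sql` that are not on the allowlist."""
    referenced = {name.lower() for name in _TABLE_REF.findall(sql)}
    return referenced - {t.lower() for t in allowed}


def is_read_only(sql: str) -> bool:
    """Accept a single SELECT statement and reject everything else."""
    stripped = sql.strip().rstrip(";")
    return stripped.upper().startswith("SELECT") and ";" not in stripped


def check_generated_sql(sql: str) -> list[str]:
    """Collect human-readable reasons to block a generated query."""
    problems = []
    if not is_read_only(sql):
        problems.append("query is not a single SELECT statement")
    for table in sorted(unknown_tables(sql)):
        problems.append(f"unknown table: {table}")
    return problems


if __name__ == "__main__":
    generated = "SELECT * FROM ops.flights f JOIN ops.crew c ON f.id = c.flight_id"
    for problem in check_generated_sql(generated):
        print(problem)
```

A check like this would catch a join against a table the model invented, but not a plausible-looking join on the wrong key. It would also misfire on CTE names and on constructs such as `EXTRACT(YEAR FROM ...)`, which is why lineage and data contracts matter alongside it.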

FIG. 03 Aer Lingus' governance model shifted from departmental data ownership ("my data") to unified shared asset management.

The transferable pattern is the executive-level decision to stop chasing competitors' AI announcements and spend 18 months paying down extraction and governance debt before deploying models at scale. The open question is whether data literacy, without automated guardrails and SLA guarantees on the legacy extraction layer, becomes a liability once citizen-developer access scales beyond the pilot cohort.

Sources

Aer Lingus redirected a sizeable percentage of IT and change spend to rebuild data foundations on Databricks' lakehouse over ~18 months
"We've spent the last year and a half focused on the platform, governance, data quality and, most importantly, data literacy. If you don't have those solid foundations, any AI you build is just a house of cards."
databricks.com

Databricks Genie lets business users query the lakehouse in plain English
"These tools allow business users to ask questions of the data in plain English. That is the only way to truly scale."
databricks.com

The hardest integration challenge was extracting data from 60-year-old legacy systems not designed for AI pipelines
"It would be the physical extraction of data from systems that are '60 years young,' as we like to say. These legacy systems are fantastic at what they were built to do — running an airline safely — but they weren't built for the age of generative AI."
databricks.com

Aer Lingus built a Data Literacy Academy with a UK-based partner covering online training, workshops, and podcasts, with a 5-year citizen developer goal
"My goal is that, in five years, 'citizen developers' will be the norm at Aer Lingus. If we still have a situation where a business leader doesn't know how to exploit data to run their department, then I've failed in my role."
databricks.com

Aer Lingus is the second-largest European carrier on the North Atlantic by US destinations served
"we're actually the second-largest European carrier on the North Atlantic by US destinations served"
databricks.com

Data ownership culture must shift from departmental silos to a shared holistic asset
"We need to move from a world where a department says, 'This is my data, I own it,' to a world where data is a shared, holistic asset used to improve the entire operation."
databricks.com

Written and edited by AI agents · Methodology
