Ego-Centric Data: How a Broken Data Engineering Culture Slowed Everyone Down Except LLMs
For a decade, organizations have been told to become “data-driven.” They hired data engineers, built dashboards, introduced governance committees, and purchased enough SaaS platforms to fill an entire cloud region. Yet despite all of this effort, many companies found themselves trapped—moving slowly, arguing over metrics, and struggling to scale even basic analytics.
And then something strange happened.
Large Language Models began advancing at a speed that made every other function look sluggish by comparison. Training runs grew by orders of magnitude, model performance leapt forward, and research teams shipped breakthroughs at a pace that traditional businesses could not imagine.
It wasn’t simply compute. It wasn’t just talent. It wasn’t even the architecture.
The real difference was culture.
While most organizations were wrestling with ego-centric data cultures, where every team guarded its own version of the truth, AI groups were quietly building the opposite: a culture of shared data, disciplined pipelines, and infrastructure-first thinking.
This piece explores that divide: why ego-centric data cultures emerged, how they slowed down organizational learning, and why LLM teams avoided the trap entirely.
The Ego Problem: When Data Became Personal
In most companies, data does not function like an ecosystem. It behaves like territory.
Teams claim ownership over metrics: “our KPI,” “our dataset,” “our pipeline.” Definitions diverge, lineage disappears, and the meaning of even basic numbers becomes negotiable. Data engineering becomes less about architecture and more about arbitration: reconciling conflicting truths, patching brittle pipelines, and mediating between stakeholders with incompatible expectations.
The numbers tell the story. Research shows that cultural resistance is the dominant barrier to data transformation, yet organizations allocate only 10% of transformation budgets to change management. Poor data quality costs organizations an average of $12.9 million annually. When asked about the biggest challenges to becoming data-driven, 90% of executives point to cultural factors (people and process issues), while only 9% cite technology.
The result is predictable:
- Multiple versions of the same table
- Business logic scattered across dashboards, queries, and ad-hoc transformations
- Delays interpreted as incompetence rather than symptoms of structural dysfunction
- A culture where speed is rewarded, but maintainability is ignored
This is the essence of ego-centric data: information treated as an extension of team identity, rather than a shared asset.
Ego creates fragility. Fragility creates slowness. Slowness creates frustration. And when information is unevenly shared between teams, ideas stop flowing across the organization.
Why Data Engineering Broke Under Its Own Weight
Data engineering (DE) was originally envisioned as an enabling function: a capability that would allow organizations to understand themselves better, forecast more accurately, and make decisions grounded in reality.
But over time, DE became a reactive service desk. Stakeholders filed requests. Engineers triaged them. Pipelines grew incrementally, never strategically. The architecture was shaped not by design but by urgency—accidental complexity accumulating year after year.
In this environment:
- Documentation was optional
- Debt was normal
- Observability lagged behind ambition
- Definitions were political
- Quality checks were fragile
- Data contracts were nonexistent
- Firefighting replaced engineering
- Heroism replaced strategy
All of this happened despite massive investment. Modern data stack spending reached approximately $12 billion from 2022 to 2024, and organizations now manage 5-7+ specialized data tools on average, with 70% of data leaders reporting stack complexity challenges. The tools multiplied, but the fundamental problems remained.
And in the process, the data function lost the very thing it needed most: the ability to scale.
AI Teams Faced a Different Reality
LLM and AI groups never had the luxury of ego-centric data. Their work demanded something else entirely.
Models don’t care about politics. Models don’t tolerate ambiguous definitions. Models do not train on tribal knowledge.
To build a model with billions of parameters, teams needed:
- Centralized, clean, consistently versioned data
- Reproducible pipelines
- Strict schema discipline
- Experiment tracking and lineage
- Collaboration across teams and roles
- A culture that defaults to documentation, testing, and automation
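To make a few of these requirements concrete, here is a minimal Python sketch of strict schema discipline combined with content-based dataset versioning. It is an illustration under stated assumptions, not a description of how any particular AI team works: the `SCHEMA` fields and the `validate` and `dataset_version` helpers are hypothetical names invented for this example.

```python
import hashlib
import json

# Hypothetical schema for illustration: field name -> required Python type.
SCHEMA = {"doc_id": str, "text": str, "source": str}


def validate(record: dict) -> None:
    """Raise ValueError if the record does not match SCHEMA exactly."""
    if set(record) != set(SCHEMA):
        # Symmetric difference lists both missing and unexpected fields.
        mismatched = sorted(set(record) ^ set(SCHEMA))
        raise ValueError(f"schema mismatch on fields: {mismatched}")
    for field, expected in SCHEMA.items():
        if not isinstance(record[field], expected):
            raise ValueError(f"{field} must be {expected.__name__}")


def dataset_version(records: list[dict]) -> str:
    """Validate every record, then derive a version string from the content."""
    for record in records:
        validate(record)
    # Canonical JSON (sorted keys, records ordered by doc_id) so that
    # identical data always yields the same hash, whatever the input order.
    canonical = json.dumps(
        sorted(records, key=lambda r: r["doc_id"]), sort_keys=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
```

Because the version is derived from the data itself, two teams holding the same records get the same identifier, and any silent change to a record produces a new one. There is nothing left to negotiate.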
As one analysis noted, “ML systems are large ecosystems of which the model is just a single part.” The effectiveness of LLM-based systems is only as good as the data they’re fed, requiring teams to trace where sources come from and ensure data flows as expected.
In other words, AI teams were forced into a healthy data culture because unhealthy culture simply wouldn’t work.
The result was acceleration—continuous, compounding, undeniable. While traditional organizations were debating dashboard definitions, AI teams were shipping the next breakthrough.
The Paradox: AI Didn’t Move Fast Because It Was Special—It Moved Fast Because It Was Structured
LLMs became the fastest-moving part of the technology ecosystem not by bypassing data engineering principles but by embracing them far more rigorously than most industries ever did.
The paradox is clear:
Data engineering slowed down traditional organizations because it lacked structure.
Data engineering accelerated AI development because it enforced structure.
In most companies, the culture surrounding data is improvised. In AI, it is intentional. The difference in outcomes speaks for itself.
Recent research confirms this divide. Only 37.8% of companies have achieved data-driven cultures despite decades of effort. In contrast, successful AI implementations require comprehensive data strategies that prioritize culture and skills transformation from the start. The gap between demo and production has become what one analysis calls “the silent killer of ambitious AI initiatives”—and DataOps practices emerge as the critical missing piece.
The Lesson: Scale Is a Cultural Property, Not a Technical One
When organizations talk about becoming “data-driven,” they often reach for tools first—warehouses, orchestrators, governance platforms, AI copilots.
But tools only amplify the culture that already exists.
A fragmented culture produces fragmented systems. A reactive culture produces reactive pipelines. A territorial culture produces territorial metrics.
Likewise:
- A collaborative culture produces shared definitions
- A disciplined culture produces stable pipelines
- A humble culture produces scalable architecture
LLM teams didn’t scale because they had better tools. They scaled because they had fewer egos and more alignment.
If Organizations Want to Catch Up, They Must Start with Culture
The path forward is not mysterious. It’s simply uncomfortable.
Current research indicates that 40% of CIOs now prioritize fostering a data-driven culture, recognizing that success requires an entrepreneurial mindset with strong stakeholder management and communication strategies. Organizations that invest in data literacy programs see 35% higher productivity and 25% better decision quality.
Breaking free from ego-centric data culture requires:
- Establishing shared ownership of data assets — Breaking down the silos that prevent holistic views of operations
- Creating centralized definitions and lineage — Ensuring everyone speaks the same language
- Prioritizing documentation and observability — Making the invisible visible and maintainable
- Empowering data engineering to act strategically, not reactively — Moving from service desk to strategic partner
- Treating pipelines as products, not plumbing — Investing in quality, monitoring, and continuous improvement
- Building governance around collaboration instead of control — Enabling rather than constraining
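As one way to picture "centralized definitions and lineage," the sketch below shows a tiny shared metric registry in Python. It is a hypothetical illustration, not a reference to any existing tool: `MetricDefinition`, `MetricRegistry`, and the example metric are all invented for this purpose. The idea is simply that a metric name can have only one definition, and a conflicting redefinition fails loudly instead of quietly becoming a second version of the truth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    expression: str            # the single agreed formula, kept as text
    upstream: tuple[str, ...]  # lineage: the tables this metric reads from


class MetricRegistry:
    """Hypothetical shared registry: one name, one definition."""

    def __init__(self) -> None:
        self._metrics: dict[str, MetricDefinition] = {}

    def register(self, metric: MetricDefinition) -> None:
        existing = self._metrics.get(metric.name)
        # Re-registering an identical definition is harmless;
        # a different definition under the same name is rejected.
        if existing is not None and existing != metric:
            raise ValueError(f"conflicting definition for {metric.name!r}")
        self._metrics[metric.name] = metric

    def lineage(self, name: str) -> tuple[str, ...]:
        """Return the upstream tables recorded for a registered metric."""
        return self._metrics[name].upstream


registry = MetricRegistry()
registry.register(
    MetricDefinition(
        name="active_users",
        expression="COUNT(DISTINCT user_id)",
        upstream=("events.sessions",),
    )
)
```

A second team that tries to register `active_users` with a different formula gets an error and has to have the conversation up front, which is exactly the point.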
This is the deeper truth behind the AI divide: the acceleration of LLMs is less a technical triumph and more a cultural one.
While traditional organizations optimized for autonomy, speed, and siloed execution, AI teams optimized for coherence, alignment, and reproducibility.
One treated data as a resource to defend. The other treated data as a foundation to share.
Only one of those cultures scales.
Conclusion: The Future Belongs to Organizations That Abandon Data Ego
Ego-centric data cultures slow organizations down because they force teams to negotiate meaning instead of generating insight. They turn engineers into firefighters and pipelines into riddles. They make progress dependent on individuals rather than processes.
LLM teams didn’t magically avoid these problems—they simply built a culture where such problems couldn’t survive.
The statistics are sobering: 95% of AI projects fail, and only 35% of digital transformations meet value targets globally. But the organizations that succeed share a common characteristic—they invested early in upskilling, communication, and internal change agents who champion adoption across departments. They treated AI literacy as a shared competency, not a technical specialization.
If companies want the speed, clarity, and compounding learning that AI teams enjoy, they won’t get there by buying more tools or hiring more data engineers.
They’ll get there by letting go of the ego that keeps their data fragmented.
The breakthroughs in AI are a reminder: alignment is a force multiplier. And when teams align around shared data, shared definitions, and shared purpose, everything accelerates.
Not just AI. Everyone.
***The data shows the path clearly: while 83% of leaders say data literacy is critical for all roles, only 28% achieve it. The gap between knowing what needs to be done and actually doing it is culture. And culture, unlike technology, cannot be purchased or installed. It must be built, one collaborative decision at a time.***
Sources:
Navigating the Data Landscape: Five Challenges Facing Chief Data Officers in 2024
hunton-lewis.com
Why Culture Is the Greatest Barrier to Data Success | MIT Sloan Management Review
mit.edu
Cultivating a Data-Driven Culture — 7 Proven Strategies for Chief Data Officers
cdomagazine.tech
Data Transformation Challenge Statistics — 50 Statistics Every Technology Leader Should Know in 2025 | Integrate.io
integrate.io
Building Product-Focused AI Teams | deepset Blog
deepset.ai
Stop Confusing the LLM for the Product Itself | Built In
builtin.com
Until Next time 🙂
@pplcallmetat