Data Value Lineage, meaning at last? | by Marc Delbaere | Aug, 2024


Maximise the business value of your data

Picture by the author (some of these I have read!)

I have always had a soft spot for words that perfectly capture the essence of a concept. During one of my trips to Japan, I discovered the word Tsundoku. It refers to the habit of acquiring books and letting them pile up without reading them. I immediately fell in love with the word because, like many, I have a habit of buying more books than I can read. Some I eventually get to, while others simply accumulate.

There’s something amusing and oddly satisfying about these piles of unread books — they symbolise potential knowledge and the joy of collecting. They stand as a testament to our intellectual aspirations, even if we don’t always fulfil them.

As founder of Kindata.io and a consultant for large companies on data value, I often encounter usage of data catalogues that make me want to scream, “Tsundoku!”. These catalogues, like stacks of unread books, are filled with detailed descriptions of data that sits idle, giving a false sense of accomplishment. A book not read is knowledge wasted; similarly, data not exploited for business value is potential wasted.

Nobody likes waste, and the professionals I work with are no exception. They are eager to see their data used to its full potential, and they understand that getting there will require a change in mindset. Working in data, they also appreciate the power of well-named concepts. I have heard many terms used for this work of aligning data with business value, but I must say that I have become fond of one that is gaining traction: Data Value Lineage. This term resonates deeply with me because it perfectly captures what I am championing. It highlights the need to turn idle data into actionable insights by making sure that everything the data teams do is firmly linked to value creation.

At first glance, choosing to name a concept one way instead of another might seem inconsequential. However, naming concepts is powerful, especially if you need to bring an entire organisation with you.

In the sections that follow, we will delve deeper into the specifics of Data Value Lineage. After defining the concept and what it entails, we will explore how to get started with a pragmatic implementation approach within your data organisation.

Data value lineage is the combination of two terms commonly used among data professionals: data lineage and data value. Let’s get back to the roots of these concepts.

Data Lineage

Data Lineage is defined by the IBM Knowledge Center:

“Data lineage is the process of tracking the flow of data over time, providing a clear understanding of where the data originated, how it has changed, and its ultimate destination within the data pipeline.”

Data Value

“Data Value is the economic worth that an organisation can derive from its data. This includes both tangible benefits like revenue generation and cost savings, as well as intangible benefits like improved decision-making, enhanced customer experiences, and competitive advantage.”

This definition is synthesised from commonly accepted industry principles, as a direct authoritative source is not currently available.

Reconciling two different worlds

The two previous definitions take us in different directions. Data lineage is all about understanding dependencies, bias, and potential quality issues. It provides explainability and helps data engineering teams repair broken pipelines. It is effectively a formal description of your data supply chains. Yet, for most people, it stops at the point where the data is actually used by an application.

Data value, on the other hand, focuses on something completely different: the contribution of data to the organisation’s business objectives. The underlying assumption is that this data value can be formally measured. We are firmly in the world of strategy, finance and business cases, not in the world of pipeline debugging.

I am now picturing Jean-Claude Van Damme performing one of his legendary splits on two high piles of unread books and thinking very hard about unifying these two concepts…

A formal definition of data value lineage

Here is the first formal definition of data value lineage.

“Data value lineage is the process of reconciling at all times the enterprise data assets (including their maintenance costs) and the main delivery tasks performed within a data organisation with the measured contribution to enterprise value drivers”

There we are, we have a formal definition! I would like to insist on a couple of aspects of this definition that I think are fundamental:

  • Process: This is not just about providing after-the-fact documentation. It is about making sure that the organisation considers the link to business value as part of its standard ways of getting things done.
  • At all times: Things change and contribution to business objectives is not insulated from that. Data providing elements of value at one point in time is just that, no more, no less and there is no guarantee that this value will hold through time.
  • Enterprise data assets: I am purposefully using a wide notion. Of course, when prioritising data value lineage initiatives, you might want to start smaller. My own pragmatic recommendation, based on our first projects with Kindata.io, would be to start with data products and business use cases (more on this later).
  • Maintenance costs: Keeping data assets available for consumption has a cost. Part of it is cloud resources (where this overlaps with FinOps practices), but a lot of it is data engineers’ time allocated to maintaining, repairing and evolving the assets.
  • The main delivery tasks: This is pushing the definition a little but ideally, any significant task performed within the data organisation should one way or another be linked to value drivers. If time is spent improving timeliness of a data set for example, how does this provide measurable business value?
  • Measured contribution: We want to formalise the contribution to business value as much as possible. It is critical that these measurements are performed by the business sponsors, not within the data organisation. We also have to be pragmatic about these measurements: they are a means, not an end.
  • Enterprise value drivers: We are aligned with Gartner’s definition of business value. Anything that the business values counts, even more so if it can be measured.

As you can see, data value lineage, while inherently a simple concept, covers a very wide scope. Inevitably, this involves bringing together individuals within your organisation with very different profiles, mandates and concerns. Let’s explore how to get things moving pragmatically in the right direction.

Some change management considerations

Getting people to think and act across silos and mindsets is not trivial. Inertia is a powerful force, and putting blinders on is sometimes the only way to get anything done in large corporations.

Through the many change management projects that I have driven or observed, I have found three critical success factors to be invariably useful:

  • Meaning and worthiness: When changes “just make sense,” they are more easily adopted. This ultimately leads to the sentiment, “Why haven’t we always done it like that?” When the new approach feels intuitive and obviously better, resistance diminishes significantly.
  • Natural extension of existing practices: Change is more likely to succeed if it requires minimal deviation from current practices. By allowing people to keep doing most of what they were doing before and changing small things consistently, you integrate the new practices smoothly into their routines.
  • Immediate perception of value: For any change to be embraced, individuals need to see immediate benefits. If things that were hard or out of reach become easy and natural, people are more likely to adopt the new practices enthusiastically.

The good news is that it is relatively easy to tick all three boxes with data value lineage.

  • Meaning and worthiness: It does not take much research to find plenty of inefficiencies in resource allocation and in the collaboration between data producers and consumers. Also, the concept of data value lineage comes quite naturally to people with very different profiles.
  • Natural extension of existing practices: You do not fundamentally change what you do; you just add a thin extra layer on top to make connections that are otherwise hard to establish. You leverage your existing investments in data governance and financial control and activate them to focus on business value generation.
  • Immediate perception of value: By connecting the dots, you can very quickly identify the untapped potential of your data, increase the business throughput of your data resources, improve the data discovery process and promote data democratisation.

Getting started

What I have found particularly useful in rolling out a data value lineage approach across an organisation is to start with the three basic questions: what, how, why. This might sound simplistic but in reality most organisations are constantly mixing these three questions together and end up not getting any clear answers.

What refers to the data initiatives that are valued by the business. In the projects that we run, this covers both traditional analytics projects (dashboards, reports…) as well as data science / AI initiatives (predictive models, recommendation engines, chatbots…). The term that we use for a data-driven project that provides tangible business value is a business use case.

How covers many aspects, but from the perspective of data value lineage, the most important one is data sourcing. More and more organisations, inspired by the principles of data mesh, are adopting the concept of a data product or productised data set. In this terminology, and contrary to business use cases, data products do not contribute directly to business value, yet they come with maintenance costs.

Why is about the contribution to business value. How do you expect each business use case to contribute to one or several value drivers? Once the projects are delivered and enter into maintenance mode, is that contribution sustained?

Once we have a clear understanding of these three questions, we can start documenting the basic building blocks:

  • The value drivers and their metrics
  • The portfolio of business use cases
  • The catalogue of data products

The first pragmatic level of data value lineage is to define the connections between these three levels.

Connecting the dots — Image created by Gilles Lenoble for Kindata.io

Let’s take a very simple example:

You want to optimise your energy consumption through a data-driven approach. The business use case (energy cost reduction) sources data from two data products (Company Energy Consumption and Utility Bills and Tariffs). It contributes to two value drivers: Cost Reduction and Sustainability.

An example of data value lineage — Image by the author from screenshots of Kindata.io
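
For readers who think in code, the energy example can be written down as a tiny data structure. This is only a minimal sketch in Python, not a description of how Kindata.io models it: the class names (`DataProduct`, `BusinessUseCase`, `ValueDriver`), the field names and the `lineage_paths` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class DataProduct:
    """A productised data set (part of the 'how')."""
    name: str


@dataclass(frozen=True)
class ValueDriver:
    """Something the business values (the 'why')."""
    name: str


@dataclass
class BusinessUseCase:
    """A data-driven project that provides business value (the 'what')."""
    name: str
    sources: List[DataProduct] = field(default_factory=list)
    contributes_to: List[ValueDriver] = field(default_factory=list)


# The energy example from the text.
energy_use_case = BusinessUseCase(
    name="Energy cost reduction",
    sources=[
        DataProduct("Company Energy Consumption"),
        DataProduct("Utility Bills and Tariffs"),
    ],
    contributes_to=[
        ValueDriver("Cost Reduction"),
        ValueDriver("Sustainability"),
    ],
)


def lineage_paths(use_case: BusinessUseCase) -> List[Tuple[str, str, str]]:
    """List every (data product, use case, value driver) chain by name."""
    return [
        (product.name, use_case.name, driver.name)
        for product in use_case.sources
        for driver in use_case.contributes_to
    ]


for path in lineage_paths(energy_use_case):
    print(" -> ".join(path))
```

Two data products and two value drivers give four chains, each passing through the single business use case. Keeping the use case in the middle is a deliberate choice in this sketch: it mirrors the idea that data products only reach value drivers through the use cases that consume them.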

The arrows between the data products, business use case and value drivers are the backbone of data value lineage. They make the link between the three fundamental questions. You will notice that the arrows are bi-directional:

  • When you navigate from data products to value drivers, you get a clear view of the real usefulness of your data products. Each individual data product is typically part of many similar consumption chains, and you can easily get the big picture of generated value at all times. If the generated value does not live up to your expectations, you can take corrective actions such as internal promotion, refactoring or decommissioning.
  • When you navigate from value drivers to data products, you get an up-to-date, top-down view of your data-driven contribution. Again, if the picture is not aligned with your expectations, you can initiate strategic decisions ranging from investing in data supply chains to delivering specific new business use cases or boosting the usage of existing ones.
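
The two navigation directions described above can be sketched with a pair of lookup tables. This is a hedged illustration in Python, not a reference implementation: the function names are invented here, the links are those of the energy example, and the "Legacy Meter Archive" data product is a hypothetical entry added only to show what an unconnected data product looks like.

```python
from collections import defaultdict

# Links taken from the energy example in the text.
SOURCES = [  # (data product, business use case)
    ("Company Energy Consumption", "Energy cost reduction"),
    ("Utility Bills and Tariffs", "Energy cost reduction"),
]
CONTRIBUTIONS = [  # (business use case, value driver)
    ("Energy cost reduction", "Cost Reduction"),
    ("Energy cost reduction", "Sustainability"),
]
# Hypothetical catalogue: the last entry has no consumer (illustration only).
CATALOGUE = [
    "Company Energy Consumption",
    "Utility Bills and Tariffs",
    "Legacy Meter Archive",
]


def index(pairs):
    """Build forward and backward lookups so links can be walked both ways."""
    forward, backward = defaultdict(set), defaultdict(set)
    for left, right in pairs:
        forward[left].add(right)
        backward[right].add(left)
    return forward, backward


uses_of_product, products_of_use = index(SOURCES)
drivers_of_use, uses_of_driver = index(CONTRIBUTIONS)


def drivers_for_product(product):
    """Bottom-up: value drivers a data product ultimately serves."""
    return {
        driver
        for use_case in uses_of_product.get(product, set())
        for driver in drivers_of_use.get(use_case, set())
    }


def products_for_driver(driver):
    """Top-down: data products that feed a given value driver."""
    return {
        product
        for use_case in uses_of_driver.get(driver, set())
        for product in products_of_use.get(use_case, set())
    }


def idle_products(catalogue):
    """Data products with no path to any value driver."""
    return [p for p in catalogue if not drivers_for_product(p)]


print(sorted(drivers_for_product("Utility Bills and Tariffs")))
print(sorted(products_for_driver("Sustainability")))
print(idle_products(CATALOGUE))
```

The last call is where the Tsundoku check happens in this sketch: any data product that reaches no value driver is a candidate for the corrective actions mentioned above, such as internal promotion, refactoring or decommissioning.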

Data value lineage is a powerful concept as it offers a structured approach to ensuring that every piece of data in your organisation contributes to business value. By reconciling data assets, tasks, and business outcomes, you can maximise the impact of your data initiatives.

It is also a great name that can rally people across the data, business and financial control organisations.

Don’t just find contentment in creating dozens of underused data products, fooling yourself with the illusion of value generation. Avoid the zen contemplation of unused piles of data, Tsundoku-style. Instead, take action and harness the full power of your data through data value lineage, ensuring that every effort directly contributes to business value generation.


