Friday, August 23, 2024

Data Monetization Estimation

William Thomson, Lord Kelvin

"When you can measure what you are speaking about, and express it in numbers, you know something about it; when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science."

Lecture on "Electrical Units of Measurement" (3 May 1883), published in Popular Lectures Vol. I, p. 73

Estimating Data Value

Introduction

Interest in data monetization has grown over the last decade or so. This discussion offers a way of thinking about estimating the potential value of data, in contradistinction to realizing that value. Think of it as an analogue of a geological survey, which carries no guarantee of finding economically viable mines: an attempt at a quantitative analysis of the value that data have, along the lines that Lord Kelvin suggested.

Column Value Model

We assume a company's value to be a combination of human, physical, and data capital: a fully automated company with no human resources is a pipe dream, a company must have physical assets to conduct business (even if they are rented), and it must make decisions based on some information or other. (For simplicity, goodwill is excluded from this model.)

So we start with the following formula for the company's valuation and then proceed to estimate its data assets in a more defined manner.

Company Valuation = Human Capital + Physical Capital + Data Capital

Assume all three parts have the same value and the company's valuation is $3 billion:

Data Capital = Company Valuation / 3 = $1 billion

In this model, we would like to estimate the average value of each column of data in all the relational schemas that the company uses, i.e.

Average Column Value = Data Capital / Total Number of Columns

Let us further assume that there are 50 relational schemas in this company and that each has 50 tables of 20 columns. This gives 50 × 50 × 20 = 50,000 columns in total, which yields an average value of $1 billion / 50,000 = $20,000 per column. In this approach, all columns are considered to be of equal value, be they customer names (let us say) or such ubiquitous columns as "Last Updated Date".

We can then proceed to estimate an average value for each schema:

Average Schema Value = Number of Columns in Schema × Average Column Value

In this discussion, since all schemas are assumed to be identical in the number of tables and columns (50 × 20 = 1,000 columns each), the average data value of each is $20 million. In practice, however, schemas differ from one another in size, and this type of model serves to identify the most valuable schemas that a company has.
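The arithmetic of the Column Value Model can be sketched in a few lines of Python. This is only an illustration under the assumptions stated above; the function name `column_value_model` and the schema names in the second example are hypothetical and not part of the model itself.

```python
# Column Value Model: every column is assumed to carry the same value.
DATA_CAPITAL = 1_000_000_000  # $1 billion, i.e. Company Valuation / 3


def column_value_model(data_capital, columns_by_schema):
    """Return (value per column, value per schema) under the equal-value assumption."""
    total_columns = sum(columns_by_schema.values())
    per_column = data_capital / total_columns
    per_schema = {name: n * per_column for name, n in columns_by_schema.items()}
    return per_column, per_schema


# Uniform case from the text: 50 schemas, each with 50 tables of 20 columns.
uniform = {f"schema_{i}": 50 * 20 for i in range(50)}
uniform_per_column, uniform_per_schema = column_value_model(DATA_CAPITAL, uniform)
print(uniform_per_column)              # 20000.0
print(uniform_per_schema["schema_0"])  # 20000000.0

# Hypothetical non-uniform inventory (schema name -> number of columns),
# to show how the model ranks schemas of different sizes.
mixed = {"sales": 1_500, "billing": 1_000, "logistics": 500}
_, mixed_per_schema = column_value_model(DATA_CAPITAL, mixed)
most_valuable = max(mixed_per_schema, key=mixed_per_schema.get)
print(most_valuable)                   # sales
```

Because every column is weighted equally, the ranking of schemas in this model is simply a ranking by column count.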

Alternative Models

Data Volume Model

In this model, the value of each schema is estimated based on its data volume. That is:

Total Data Volume = Sum of the data volumes (in Gigabytes) of all schemas

The total Data Capital is divided by this number to obtain an average value per Gigabyte, and the value of each schema is then computed by multiplying its data volume by that average.
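As a minimal sketch of the Data Volume Model, the following Python computes an average value per Gigabyte and then a value per schema. The schema names and volumes are made-up numbers chosen for illustration, and the $1 billion figure is the Data Capital assumed earlier.

```python
def data_volume_model(data_capital, gigabytes_by_schema):
    """Return (value per GB, value per schema), allocating value by data volume."""
    total_gb = sum(gigabytes_by_schema.values())
    value_per_gb = data_capital / total_gb
    per_schema = {name: gb * value_per_gb for name, gb in gigabytes_by_schema.items()}
    return value_per_gb, per_schema


# Hypothetical data volumes in Gigabytes for three schemas.
schema_gigabytes = {"sales": 6_000, "billing": 3_000, "logistics": 1_000}

value_per_gb, schema_values = data_volume_model(1_000_000_000, schema_gigabytes)
print(value_per_gb)            # 100000.0
print(schema_values["sales"])  # 600000000.0
```

Note that this model rewards sheer volume, so a large but rarely used schema would be valued above a small but critical one.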

Time Dependent Model

Another model assigns time-dependent weights to each part of the company's valuation, i.e.:

Valuation = h(t) Human Capital + p(t) Physical Capital + d(t) Data Capital

With constraints:

h(t) + p(t) + d(t) = 1 and h(t) ≥ 0, p(t) ≥ 0, d(t) ≥ 0
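The text does not specify the form of h(t), p(t), and d(t). The sketch below assumes, purely for illustration, that the raw importance of data grows exponentially with time while the other two stay fixed, and then normalizes the three so that the constraints hold. The function names and the `growth` parameter are hypothetical.

```python
import math


def weights(t, growth=0.1):
    """Hypothetical weights (h, p, d): non-negative and summing to 1 at every t."""
    raw_h, raw_p, raw_d = 1.0, 1.0, math.exp(growth * t)  # assumed shapes
    total = raw_h + raw_p + raw_d
    return raw_h / total, raw_p / total, raw_d / total


def valuation(t, human_capital, physical_capital, data_capital):
    """Weighted valuation at time t, following the formula above."""
    h, p, d = weights(t)
    return h * human_capital + p * physical_capital + d * data_capital


for t in (0, 5, 10):
    h, p, d = weights(t)
    print(t, round(h, 3), round(p, 3), round(d, 3))
```

At t = 0 the three weights are equal, which matches the equal-parts assumption of the Column Value Model; as t grows, the weight of data capital rises at the expense of the other two.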

Weighted Schema Model

In this model, we capture the importance of each schema via the factor a(n), and the historical data volume available for each schema via the second sum and the factor b(m).

The factor exp(1-m) is intended to model the aging and staleness of the data.
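One possible reading of this model, offered only as an assumption, is a double sum: an outer sum over schemas n weighted by a(n), and an inner sum over historical periods m weighted by b(m) and the decay factor exp(1-m), with m = 1 taken as the most recent period so that its decay factor is 1. The Python below sketches that reading; the function name, the argument layout, and the use of data volume as the summed quantity are all hypothetical.

```python
import math


def weighted_schema_score(a, b, volumes):
    """
    Assumed form: sum over n of a[n] * sum over m of b[m] * exp(1 - m) * volume[n][m].

    a[n]         : importance factor of schema n
    b[m - 1]     : weight of historical period m (m = 1 is the most recent)
    volumes[n]   : data volumes of schema n, one entry per period, newest first
    """
    total = 0.0
    for a_n, schema_volumes in zip(a, volumes):
        inner = sum(
            b[m - 1] * math.exp(1 - m) * v
            for m, v in enumerate(schema_volumes, start=1)
        )
        total += a_n * inner
    return total


# Two hypothetical schemas with three periods of history each (Gigabytes).
a = [0.7, 0.3]
b = [1.0, 1.0, 1.0]
volumes = [[100.0, 80.0, 60.0], [500.0, 400.0, 300.0]]
print(weighted_schema_score(a, b, volumes))
```

With this decay, data from period m = 2 counts for about 37% of its volume and data from m = 3 for about 14%, so older history contributes progressively less.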

Crimson Reason
