Monday, June 9, 2025

Data as Story - A Proposal

Below is the intellectual property I developed in 2014 on transforming data into stories. It was disclosed here:

Babak Makkinejad, “Data as Story”, database number 617040, "Research Disclosure", Published in the September 2015 paper journal, Published digitally 21 August 2015 14:06 UT

With the advent of LLMs, my proposal would be easier to implement now.

Subject Matter & Problem

The main idea in this disclosure is the transformation of relational data, of the kind found in Relational Database Management Systems (RDBMS), into something resembling a story; that is, a textual representation of the relational data that tells a story in a natural language.

This disclosure does not discuss a fully automated system; rather, it describes a semi-automated process in which a human expert invokes software tools to transform data into a story.

What is presented in this disclosure is akin to report creation from available relational data using tools such as MS SQL Server Reporting Services or the Java-based Eclipse BIRT. There, a human user designs a report, which consumes relational data, using a variety of software tools, after which an automated system publishes that report or otherwise makes it available.

This is not meant as a replacement for other modalities of data presentation such as charts and graphs but is meant as a complementary modality.  However, most people are likely to find the presentation of the data as a story more engaging than looking at reams of form-based data or columnar data as represented by database extracts or MS Excel files.

Once the data is turned into a textual, human language story, it could be read out to a human being mechanically, or it could be automatically translated to a different human language. 

Solution

This solution crucially and fundamentally relies on the prior art embodied in the software tool called Dramatica (www.dramatica.org) and its Theory of Story (http://dramatica.com/theory/book).  The Theory of Story is briefly sketched out below.

Introduction to Dramatica’s Theory of Story

The Theory of Story embodied in Dramatica models a story as a single mind at work finding a solution to a single problem.  This is closely analogous to the situation in the Business Intelligence (BI) arena, where different users, trying to answer different questions, ask for different reports out of a database system.  The BI users, in other words, are trying to solve a problem.

We have 4 main areas:

  1. The overall story
  2. The main character, through whom we see everything
  3. The impact character
  4. The dynamics of the impact character vs. the main character

In each of the above areas one has to answer these essential questions:

  1. The main character’s resolve: will he change or remain steadfast? (There is no story if none of the characters in the story change.)
  2. What drives the story: actions or decisions?
  3. What is the main problem class of the story: Fixed Attitude, Manipulation, Situational, or Activities?
  4. What is the main concern of the story: Past, Present, Future, or Dynamic (how things are changing)?
  5. What is the overall story issue: Openness, Delay, Choice, or Pre-conceptions?
  6. What is the overall story problem: Control, Help, Hinder, or Uncontrolled?

Additionally, there could be multiple development lines in each story, each with its own thematic argument.  Themes are perspectives and thus could represent data as seen from the viewpoints of other story characters.

An argument’s topic may be further explored through dialogue, images, charts, pictures, music, etc. that complement the story.

The latter supporting materials, such as charts and graphs, tie the story to the common data representations via the application of statistical algorithms and standard charting techniques.

Dramatica’s Theory of Story posits the existence of an overall story symptom and an overall story response; each story consists of a Problem, a Direction, a Focus, and a Solution.  The Problem is finally recognized some time near the climax of the story.  “Success” means the problem is replaced with a “Solution”.  “Failure” means that the problem is persisting.

Dramatica’s Theory of Story further posits that each story could contain up to 8 archetypes:

  • Protagonist vs. Antagonist
  • Guardian vs. Contagonist
  • Reason vs. Emotion
  • Sidekick vs. Skeptic

The Dramatica Structural Matrix

This is a framework for holding dramatic topics pertinent to Genre, Plot, Theme, and Character in relationships that describe their effect upon one another.  There are 4 Classes: Universe, Physics, Psychology, and Mind.  Each Class contains 4 Types, and each Type contains 4 Variations, for 16 Variations per Class.  Each of those 16 Variations, in turn, contains 4 Elements, for a total of 64 Elements per Class.
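The size of the matrix can be checked with a short sketch. It assumes the nesting is 4 Types per Class, 4 Variations per Type, and 4 Elements per Variation; only the Class names come from the text, so Types, Variations, and Elements are addressed by index rather than by name.

```python
from itertools import product

# Class names come from the text. Types, Variations, and Elements are
# addressed by index only, since their names are not listed here.
CLASSES = ("Universe", "Physics", "Psychology", "Mind")
TYPES_PER_CLASS = 4
VARIATIONS_PER_TYPE = 4
ELEMENTS_PER_VARIATION = 4


def matrix_bins():
    """Return every (class, type, variation, element) address in the matrix."""
    return list(
        product(
            CLASSES,
            range(TYPES_PER_CLASS),
            range(VARIATIONS_PER_TYPE),
            range(ELEMENTS_PER_VARIATION),
        )
    )


bins = matrix_bins()
variations_per_class = TYPES_PER_CLASS * VARIATIONS_PER_TYPE
elements_per_class = variations_per_class * ELEMENTS_PER_VARIATION
print(variations_per_class, elements_per_class, len(bins))  # 16 64 256
```

Each address in `bins` is one slot to which a data field could later be assigned.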

During the process of story-forming, these topics (called "themantics") are re-arranged much as a Rubik's cube might be scrambled, all in response to the author's choices regarding the impact they wish to have on their audience. As a story unfolds, the matrix unwinds, scene by scene and act by act, until all dramatic potentials, both large and small, have been completely explored and have fully interacted.

It is during this phase of story-forming that the relational data, based on their semantics (i.e. the meaning of the data columns in the database), are mapped into the 64 Elements of each of the 4 Dramatica Classes.

Approach to Story Construction

Enterprises, commercial, governmental or non-profit, internally execute a set of (business) processes.  This is where the work for the story creation starts.  Examples of such processes are Human Resources, In-patient Management, Out-patient management, Manufacturing Quality Management and very many more.

One selects an existing business process that is being executed and for which one wishes to tell a story.  That is, a specific business problem or question would be addressed via the story that is being developed.

For this process, or indeed any process, one then tries to find the answers to the following questions:

  1. How
  2. What
  3. When
  4. Where (to/from)
  5. Who
  6. Whom
  7. Whose
  8. Why

Not all of these questions will have answers within an arbitrary business process, but some of them will have answers by necessity.

For example, for a Human Resources Management process, the questions could be:

What: role, title

When: hired, left the company, promoted, demoted, reprimanded, recognized, rewarded

Which: salary, rewards, taxes, expenses

Who: The Specific Employee (the Protagonist)

Where: Headquarters, Working from Home, Branch Office

This step may be automated via software Wizards that guide the user in determining the answers.  Such an automated system would consume the relational data that supports that business process.  This identification may be based on automated inference or on data dictionaries available for the targeted process.

[A data dictionary contains the semantics of the data elements in the database; it may be viewed as an Ontology for that process – or it could be a subset of the larger Ontology of the Entire Enterprise.]
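As a rough illustration of how a data dictionary could drive this identification, the sketch below groups columns under question words using naive keyword rules. The column names, descriptions, and rules are all hypothetical; a real tool would need far richer semantics than substring matching.

```python
from collections import defaultdict

QUESTIONS = ("How", "What", "When", "Where", "Who", "Whom", "Whose", "Why")

# Hypothetical data dictionary for an HR table: column name -> description.
DATA_DICTIONARY = {
    "employee_name": "name of the employee",
    "title": "job title",
    "hired_on": "date the employee was hired",
    "promoted_on": "date of the latest promotion",
    "office": "location of the office where the employee works",
}

# Hypothetical keyword rules, checked in order; the first match wins.
RULES = (
    ("When", ("date",)),
    ("Where", ("location", "office")),
    ("Who", ("name of the employee",)),
    ("What", ("title", "role")),
)


def infer_questions(dictionary):
    """Group column names under the first question whose keywords match."""
    grouped = defaultdict(list)
    for column, description in dictionary.items():
        text = description.lower()
        for question, keywords in RULES:
            if any(keyword in text for keyword in keywords):
                grouped[question].append(column)
                break
    return dict(grouped)


def unanswered(grouped):
    """List the questions for which no column was found."""
    return [question for question in QUESTIONS if question not in grouped]


answers = infer_questions(DATA_DICTIONARY)
print(answers)
print(unanswered(answers))
```

The list of unanswered questions is what a Wizard could present to the human expert for manual resolution.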

Next, with the answers to the above questions, and in conjunction with the data dictionary for the database tables, the dominant Class of the story may be selected.  It might be that the story to be developed does not have a dominant Class and all Classes need to be included to present the data.  An automated “Semantic Extraction” tool may be used to facilitate the assignment of the data fields in the database tables to these 4 Classes and to the 4 Types, 16 Variations, and 64 Elements within each.  Alternatively, this step could be performed manually.

This is the step that ties the RDBMS data to Dramatica’s structures.

In practice, the meta-data from the data dictionary may not be sufficiently numerous to cover all 256 bins (Elements) that are available for all the 4 Classes.  Or, alternatively, there could be multiple meta-data elements (concepts) that are mapped to the same Element.  It is a judgment call by the story-teller, looking at the requirements of the story, to decide which meta-data elements to keep and which ones to discard.
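A minimal sketch of this bookkeeping follows. The field names and their bin addresses are invented for illustration and do not reflect any actual Dramatica assignment; the point is only to show how contested and empty bins could be surfaced for the storyteller's judgment call.

```python
from collections import defaultdict

TOTAL_BINS = 4 * 4 * 4 * 4  # 4 Classes x 4 Types x 4 Variations x 4 Elements

# Hypothetical assignments: field -> (class, type index, variation index, element index).
assignments = {
    "admitted_at": ("Universe", 0, 0, 0),
    "discharged_at": ("Universe", 0, 0, 0),
    "diagnosis": ("Mind", 1, 2, 3),
}


def coverage(field_to_bin):
    """Return the bins claimed by several fields and the number of empty bins."""
    by_bin = defaultdict(list)
    for field, address in field_to_bin.items():
        by_bin[address].append(field)
    contested = {
        address: fields for address, fields in by_bin.items() if len(fields) > 1
    }
    empty = TOTAL_BINS - len(by_bin)
    return contested, empty


contested, empty = coverage(assignments)
print(contested)
print(empty)
```

Here two fields compete for one bin, so the storyteller would have to keep one and discard the other, while most bins stay empty.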

The story, ultimately, is a report and must supply answers to the business questions/problems that are posed by its consumers/users.

At this point, the storyteller is in a position to utilize a system based on Dramatica and its Wizards to guide him through the construction of the story.

For example, the story might be one that is required to tell what happened to a (heart) patient admitted to the emergency room.  Within the Theory of Story of Dramatica, the storyteller would identify the patient (at this stage an unknown person) as the Protagonist, (Heart) Disease as the Antagonist, the (Heart) Surgeon as the Guardian, and Pre-existing Medical Conditions as the Contagonist.  The story would tell who was admitted, how, when, and where, followed by the initial diagnosis, the climax (open heart surgery), and the recovery and discharge, as well as other details per the requirements (members of the surgical staff, length of the operation, type of procedures, etc.).

Another example could be a Manufacturing Quality Management process in which a type of widget is identified as the Protagonist, the Manufacturing Process is considered the Antagonist, the Quality Engineer is identified as the Guardian, and the Contagonist could be a production worker or the production machinery.

It should be noted that the stories discussed in this disclosure are all generic: the identity of the patient, the disease, or the widget type is left undefined.  In the sense of RDBMS Reports, these stories may be understood as parameterized reports.
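The two castings above can be written down as plain data, which makes the "generic" nature of the stories concrete: each archetype is filled by a role, not by a named individual. This is only an illustrative sketch; the dictionary keys and the `validate_cast` helper are hypothetical and not part of Dramatica.

```python
# The eight archetype names as listed in this disclosure.
ARCHETYPES = frozenset(
    {
        "Protagonist", "Antagonist", "Guardian", "Contagonist",
        "Reason", "Emotion", "Sidekick", "Skeptic",
    }
)

# Roles taken from the two examples above; identities stay unspecified.
CASTS = {
    "emergency_admission": {
        "Protagonist": "patient",
        "Antagonist": "heart disease",
        "Guardian": "heart surgeon",
        "Contagonist": "pre-existing medical conditions",
    },
    "manufacturing_quality": {
        "Protagonist": "widget type",
        "Antagonist": "manufacturing process",
        "Guardian": "quality engineer",
        "Contagonist": "production worker or machinery",
    },
}


def validate_cast(cast):
    """Reject unknown archetype names and return the archetypes left unfilled."""
    unknown = set(cast) - ARCHETYPES
    if unknown:
        raise ValueError(f"unknown archetypes: {sorted(unknown)}")
    return sorted(ARCHETYPES - set(cast))


for name, cast in CASTS.items():
    print(name, validate_cast(cast))
```

Both example casts leave four archetypes unfilled, which is consistent with a story containing up to, but not necessarily all, 8 archetypes.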

Dramatica already has Wizards that guide one in the construction of one’s story.  However, for this disclosure, it is envisioned that Dramatica’s Wizards, and perhaps its engine, would be augmented in such a manner as to facilitate the construction of stories in which the 8 archetypes are not necessarily human beings but could be things or processes.

For example, SQL queries that automatically pull data for an unknown patient with an unknown disease could be formulated via Wizards during the story-forming stage.  The (augmented) Dramatica would then generate the story per the relational data, its meta-data, and the storyteller’s decisions.
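A parameterized query of this kind might look like the following sketch, which uses Python's built-in `sqlite3` module and an in-memory database. The `admissions` table, its columns, and the sample rows are assumptions made for illustration; the disclosure does not prescribe a schema.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be read by column name
conn.execute(
    """
    CREATE TABLE admissions (
        patient TEXT, disease TEXT, admitted_at TEXT,
        surgeon TEXT, procedure_name TEXT, discharged_at TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO admissions VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Mr. Smith", "heart disease", "2025-08-03 23:15", "Dr. Jones",
         "open heart surgery", "2025-08-12 10:00"),
        ("Ms. Lee", "asthma", "2025-08-04 09:30", None, None,
         "2025-08-04 15:00"),
    ],
)

# The query is fixed at story-forming time; patient and disease stay open.
STORY_QUERY = """
    SELECT patient, disease, admitted_at, surgeon, procedure_name, discharged_at
    FROM admissions
    WHERE patient = :patient AND disease = :disease
    ORDER BY admitted_at
"""


def fetch_story_rows(connection, patient, disease):
    """Run the parameterized query and return each matching row as a dict."""
    cursor = connection.execute(
        STORY_QUERY, {"patient": patient, "disease": disease}
    )
    return [dict(row) for row in cursor]


rows = fetch_story_rows(conn, "Mr. Smith", "heart disease")
print(rows)
```

Binding the values as named parameters, rather than pasting them into the SQL text, keeps the story definition reusable for any patient and disease.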

The user could then access an online system and request a story that tells what happened to Mr. Smith, with heart disease, who entered the emergency room of the General Hospital between 8:00 PM and 8:00 AM during the month of August, if any such admission exists.  The system would generate a textual story as discussed in this disclosure, which could be printed, emailed, turned into an audio file, or exported to a suitable format for printing such as MS Word or Adobe PDF.
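The final rendering step can be sketched as a simple template filled from one row of data. This stands in for the (augmented) Dramatica engine, which is not specified here; the field names, the sample values (including the year), and the wording of the template are all hypothetical. The night-time check treats the 8:00 PM to 8:00 AM window as wrapping around midnight and excludes 8:00 AM itself.

```python
from datetime import datetime


def in_night_window(moment):
    """True from 8:00 PM up to, but not including, 8:00 AM."""
    return moment.hour >= 20 or moment.hour < 8


def render_story(row):
    """Return a short story for a night-time August admission, else None."""
    admitted = datetime.strptime(row["admitted_at"], "%Y-%m-%d %H:%M")
    if admitted.month != 8 or not in_night_window(admitted):
        return None
    return (
        f"{row['patient']} arrived at the emergency room of {row['hospital']} "
        f"at {admitted:%H:%M} on day {admitted.day} of the month, "
        f"suffering from {row['disease']}. "
        f"{row['surgeon']} performed {row['procedure_name']}, "
        f"and {row['patient']} was discharged on {row['discharged_at']}."
    )


# Hypothetical row, as it might come back from the story's SQL query.
night_row = {
    "patient": "Mr. Smith",
    "hospital": "the General Hospital",
    "disease": "heart disease",
    "admitted_at": "2025-08-03 23:15",
    "surgeon": "Dr. Jones",
    "procedure_name": "open heart surgery",
    "discharged_at": "2025-08-12",
}
day_row = dict(night_row, admitted_at="2025-08-03 12:00")

print(render_story(night_row))
print(render_story(day_row))  # None: admitted outside the requested window
```

With LLMs available, the fixed template could plausibly be replaced by a model prompted with the same row and the storyteller's structural choices.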

Possible Modifications

Extension to non-SQL, unstructured data.

Extension to telling multiple stories at once, by following different business processes within the same story.  For example, while telling the story of the treatment of a heart disease patient, which is within the In-Patient Management process, one can also tell the story of the specific surgeon who operated on him, which is within the Human Resource Management process.

A story may be incorporated into a more customary report which contains tabular data and charts.  Alternatively, the story itself, in its published form when its parameters are specified, may contain such tabular data and charts.

The Dramatica Documentation
