Summary:
As the application of Artificial Intelligence (AI) and Machine Learning (ML) algorithms spreads in the automotive domain (as well as in other areas of business and industry), it is very important to be able to assess the risks inherent in those algorithms in order to avoid or minimize them. Especially in the automotive domain, it is critical to mitigate or eliminate threats to passenger safety, security, and comfort.
AI/ML algorithms are often presented in pseudo-code and/or implemented in a high-level programming language such as Python, Java, C#, or C++. It is desirable to have a software system that automatically assesses the robustness of an AI/ML algorithm against several risk scenarios, identifies and categorizes those risks, and supplies remediation approaches.
Risk Score Calculation
The invention is envisioned as an automated process that analyzes an algorithm for the risks inherent in it and recommends remediation steps. The invention processes different data sets sequentially for the same trained model. That is, it keeps the trained algorithm constant and exercises it against different data sets. The invention performs the following steps (per Figure 1):
- It will perform a variety of pre-processing steps, such as checking the dimension of the input vector against the dimension of the output vector (feature space). In such an example, if dimension_of_input < dimension_of_output, the model is risky and the system will so advise the user. Other such pre-processing steps could be added for different AI/ML algorithms. These pre-processing checks are configurable by the user of the system.
- It will compute a data difference score by a suitable method of data subtraction (there could be many such methods: simple XOR, CRC difference, etc. These methods, for each data set and algorithm, can be defined via the Data Difference Editor feature of the invention).
- The run-time component, the Process Orchestrator, will execute the algorithm and compute its Risk Score.
- It will then subtract the new Risk Score from the Trained Model's Risk Score.
- Did it increase, and by what percentage?
- Conceivably, given a large number of different data sets, one can even generate good statistics for the Risk Score changes.
- So, we can compute an ensemble average of the change in Risk Score and then proceed to the TRIZ steps (see the sketch after this list).
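A minimal sketch of the loop above, in Python. All helper names (preprocess_check, risk_score_delta, ensemble_average) are hypothetical placeholders; the actual checks and the Risk Score formula are user-configurable via the editors described later.

```python
# Hypothetical sketch of the per-dataset Risk Score loop; not a fixed API.
import statistics

def preprocess_check(input_dim: int, output_dim: int) -> bool:
    """Pre-processing check from step 1: flag the model as risky
    if the input dimension is smaller than the output dimension."""
    return input_dim >= output_dim

def risk_score_delta(trained_risk: float, new_risk: float) -> float:
    """Change in Risk Score relative to the trained model's score."""
    return (new_risk - trained_risk) / trained_risk

def ensemble_average(trained_risk: float, new_risks: list[float]) -> float:
    """Ensemble average of the Risk Score change over many data sets."""
    return statistics.mean(risk_score_delta(trained_risk, r) for r in new_risks)

if not preprocess_check(input_dim=10, output_dim=3):
    print("Model is risky: input dimension < output dimension")
print(f"Mean Risk Score change: {ensemble_average(0.2, [0.25, 0.22, 0.30]):+.1%}")
```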
The Orchestrator is programmed to process a list of data sets until the list is exhausted. As the Workflow Orchestrator, we will use a tool such as MLflow, since it provides a pipeline for AI/ML algorithms. It executes the algorithms and gathers performance metrics. Suitable software components then analyze those performance parameters for risks, as in the sketch below. The invention is not dependent on MLflow; any other Workflow Orchestrator would work as well.
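A hedged sketch of the orchestration loop using MLflow only for run tracking and metric logging. The functions evaluate_model and compute_risk_score are hypothetical stand-ins, and the dataset names are illustrative.

```python
import mlflow

def evaluate_model(model, dataset_name):
    """Hypothetical stand-in: run the trained model on a data set and
    return its performance parameters."""
    return {"accuracy": 0.92, "recall": 0.88}  # placeholder values

def compute_risk_score(metrics):
    """Hypothetical stand-in for the user-defined Risk Score formula."""
    return 1.0 - min(metrics.values())

trained_model = None  # the trained model under test (kept constant)
datasets = ["golden_set", "noisy_set_1", "noisy_set_2"]  # illustrative names

for name in datasets:
    with mlflow.start_run(run_name=name):
        metrics = evaluate_model(trained_model, name)
        mlflow.log_metrics(metrics)                       # performance metrics
        mlflow.log_metric("risk_score", compute_risk_score(metrics))
```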
Figure 2
A trained AI model (to be analyzed) is fed with a specific dataset on which the model has been trained and validated. The following is a partial list of performance parameters that could be evaluated and combined, via the Risk Score Editor, into a Risk Score (a scikit-learn sketch follows the list).
- Accuracy
- True Positive Rate (TPR)
- False Positive Rate (FPR)
- Recall
- Precision
- F1-Measure
- Area under the Receiver Operating Characteristic (ROC)
curve
- Area under the Precision-Recall (PR) curve
- Logarithmic loss
- C-Measure
- R-Measure
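As a sketch, most of these parameters can be computed with scikit-learn for a binary classifier; the C-Measure and R-Measure are less standard and are omitted here. The y_true and y_prob arrays are illustrative.

```python
# Sketch: computing the listed performance parameters with scikit-learn.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, average_precision_score,
                             log_loss, confusion_matrix)

y_true = [0, 1, 1, 0, 1, 0, 1, 1]                   # ground-truth labels
y_prob = [0.1, 0.8, 0.7, 0.3, 0.9, 0.4, 0.6, 0.2]   # predicted P(class = 1)
y_pred = [int(p >= 0.5) for p in y_prob]            # thresholded predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
params = {
    "accuracy": accuracy_score(y_true, y_pred),
    "tpr_recall": recall_score(y_true, y_pred),         # Recall == TPR
    "fpr": fp / (fp + tn),                              # False Positive Rate
    "precision": precision_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    "roc_auc": roc_auc_score(y_true, y_prob),           # area under ROC curve
    "pr_auc": average_precision_score(y_true, y_prob),  # area under PR curve
    "log_loss": log_loss(y_true, y_prob),
}
print(params)
```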
An example of performance parameters for classical binary classification is displayed below; analogues of such parameters can be defined for other AI/ML algorithm types.
Figure 3
Those are captured in
the Algorithm Types & Performance Parameters data store.
This invention is intended for
mitigating the risks of the trained model and not the data. The main
objective of this tool is to advise the modeler to mitigate model-specific
issues before going to Deployment phase of the model. Its output could be
used to indicate the robustness of a trained model with respect to data
differences.
Generating Risk Remediation Hints
Risk is estimated based on a combination of performance parameters. The system goes through the performance parameters, based on the "strength" of each parameter in the calculation of the combined Risk Score, and supplies hints to the algorithm designer.
The way this is done is that the Risk Score is disambiguated by looking at its parts, per its definition via the Risk Score Editor (please see below). The system utilizes the validated TRIZ approach.
Figure 4
TRIZ principles are not changed, but their analogue realizations in the AI/ML domain are used. There are 39 technical parameters (or features) in TRIZ. To use the TRIZ contradiction matrix, the performance parameters of AI/ML algorithms are mapped to their closest TRIZ equivalent features. They are then placed at the same positions as their equivalents in the TRIZ contradiction matrix, and the system uses the suggested TRIZ inventive principles to resolve the tradeoff between pairs of AI/ML features, as sketched below.
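A hedged illustration of this mapping and lookup as plain dictionaries. The AI/ML-to-TRIZ pairings and the single matrix cell shown are placeholders for illustration, not the real 39x39 contradiction matrix.

```python
# AI/ML performance parameter -> (TRIZ parameter number, name); assumed mapping.
AIML_TO_TRIZ = {
    "accuracy": (28, "Measurement accuracy"),
    "inference_speed": (9, "Speed"),
    "model_reliability": (27, "Reliability"),
}

# (improving parameter, worsening parameter) -> suggested inventive principles.
CONTRADICTION_MATRIX = {
    (28, 9): [17, 5],   # placeholder cell: accuracy vs. speed of execution
}

def suggest_principles(improve: str, worsen: str) -> list[int]:
    """Look up the TRIZ principles for a pair of conflicting AI/ML features."""
    key = (AIML_TO_TRIZ[improve][0], AIML_TO_TRIZ[worsen][0])
    return CONTRADICTION_MATRIX.get(key, [])

print(suggest_principles("accuracy", "inference_speed"))  # -> [17, 5]
```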
AI/ML Analogue of TRIZ General
Attributes
Below is a table of TRIZ general system attributes/parameters with their AI/ML analogues.
Attribute | Interpretation | Possible Contrivances | AI/ML Example(s)
Speed | | |
Strength | | |
Loss of Information | | |
Quantity of Substance / Matter | | |
Reliability | | |
Accuracy | | |
Adaptability / Versatility | | |
Productivity | | |
Once the TRIZ principle is identified, a hint or a number of hints can be generated from the TRIZ-Hints data.
A partial realization of the TRIZ-supplied Hints Table is presented below:
TRIZ Principle | Name | AI/ML Hint
1 | Segmentation / Division | Refactor the algorithm into many independent algorithms.
3 | Local Quality | Non-uniform data sampling method
5 | Consolidation / Combination / Merging | Data dimension reduction (PCA / autoencoding)
6 | Universality / Multi-function | Parallelize the computation
17 | Transition into a New Dimension / Another Dimension | Change the number of layers (ANN); one-hot encoding; PCA analysis
The system will use a pre-populated Risks Map to identify the corresponding record in a TRIZ-like Contradiction Matrix, which in turn yields hints (a lookup sketch follows the examples below). Examples of TRIZ-like contradictions are:
- performance parameters vs. speed of execution
- precision vs. numerical stability
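A sketch of the last step, turning suggested TRIZ principles into AI/ML hints, using the partial hints table above; the function name is hypothetical.

```python
# Partial hints table from above, keyed by TRIZ principle number.
TRIZ_HINTS = {
    1:  "Refactor the algorithm into many independent algorithms.",
    3:  "Use a non-uniform data sampling method.",
    5:  "Apply data dimension reduction (PCA / autoencoding).",
    6:  "Parallelize the computation.",
    17: "Change the number of layers (ANN); try one-hot encoding or PCA.",
}

def hints_for(principles: list[int]) -> list[str]:
    """Map TRIZ principle numbers to remediation hints, skipping unknowns."""
    return [TRIZ_HINTS[p] for p in principles if p in TRIZ_HINTS]

# e.g. for the 'performance parameters vs. speed of execution' contradiction:
print(hints_for([17, 5]))
```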
The invention contains the following "editors":
Data Difference Editor
This enables a user of the system to define how to compute the difference between the training data set for a model and the data sets that will be used during the Risk Assessment process by the Process Orchestrator. There are many well-known methods in the prior art, such as cyclic redundancy checks (CRC) for each data set, the XOR operation, image subtraction, comparing topological invariants of each data set, and very many others; two are sketched below.
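A hedged sketch of two of the named methods, a CRC comparison and a byte-wise XOR score, applied to serialized data sets; the payloads are illustrative.

```python
import zlib

def crc_differs(a: bytes, b: bytes) -> bool:
    """CRC difference: do the two serialized data sets have different CRC32s?"""
    return zlib.crc32(a) != zlib.crc32(b)

def xor_score(a: bytes, b: bytes) -> float:
    """Simple XOR: fraction of differing bits over the common length."""
    n = min(len(a), len(b))
    diff_bits = sum(bin(x ^ y).count("1") for x, y in zip(a[:n], b[:n]))
    return diff_bits / (8 * n) if n else 1.0

golden = b"example serialized training set"      # illustrative payloads
candidate = b"example serialized testing  set"
print(crc_differs(golden, candidate), round(xor_score(golden, candidate), 3))
```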
It should be noted that the set of data types one deals with is, in practice, finite. We have:
- Relational Data (structured data; binary subtraction could be used, attribute by attribute).
- Textual Data
- Different types of images (enabling the usage of Image
Analysis techniques)
- Camera
- Lidar
- Radar
- Different Types of Sensors (time series, analyzed using mathematical and statistical techniques)
- Temperature Sensor (infrared)
- Proximity Sensors
- Infrared Sensor (IR Sensor)
- Ultrasonic Sensor
- Light Sensor
- Smoke and Gas Sensors
- Alcohol Sensor
- Touch Sensor
- Color Sensor
- Humidity Sensor
- Tilt Sensor
Furthermore, the number of sensor types is limited by the laws of physics and chemistry and is therefore finite as well.
Given that the data types are finite, we can use existing measures (known in the art) to characterize the difference between a training dataset and other datasets during the execution of an AI/ML algorithm by the Process Orchestrator.
For example, for time-series sensor
data, we have metrics such as the mean, the standard deviation of the mean, the
median, min/max, and the power spectrum to utilize in order to compute the
difference between the reference dataset used in training and those fed to the
algorithm via the process orchestrator.
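A sketch of such a per-metric difference for time-series sensor data, using the statistics named above; the way the per-statistic differences are combined into one number is a placeholder choice.

```python
import numpy as np

def series_summary(x: np.ndarray) -> np.ndarray:
    """Summary statistics: mean, std, median, min/max, mean spectral power."""
    power = np.abs(np.fft.rfft(x)) ** 2          # power spectrum
    return np.array([x.mean(), x.std(), np.median(x),
                     x.min(), x.max(), power.mean()])

def series_difference(train: np.ndarray, other: np.ndarray) -> float:
    """Mean relative difference of summary statistics between two series."""
    a, b = series_summary(train), series_summary(other)
    return float(np.mean(np.abs(a - b) / (np.abs(a) + 1e-12)))

rng = np.random.default_rng(0)
golden = rng.normal(size=1000)                    # illustrative sensor traces
noisy = golden + rng.normal(scale=0.1, size=1000)
print(round(series_difference(golden, noisy), 4))
```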
For image type sensor data (Camera,
Lidar, Radar), which occupy different parts of the electromagnetic spectrum,
many available image analysis techniques could be used to compute the dataset
difference.
In principle, a combination of
metrics could be applied to measure the differences between the training
dataset and those used during the execution of the model by the Process
Orchestrator.
A combined difference score is also
computed if the user so decides via this editor.
Risk Score Editor
This enables a user of the system to define the Risk Score for an algorithm based on its performance parameters. The user has a lot of flexibility in how to combine the performance parameters of a given model to create a Risk Score for that algorithm, e.g., giving each parameter a different weight and computing the weighted arithmetic average of those parameters. The system will output a Risk Score and a probability for that risk (its confidence level). The Risk Score will be a real number between 0 and 1.0, and the thresholding (low, medium, high) will be based on the criticality of the application (safety, money, health); the threshold is a user selection. The Risk probability will be computed as the ratio of occurrences of a Risk situation (defined as when small changes in data lead to large changes in the Risk Score) to the total number of Risk Scores calculated. A sketch follows.
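A minimal sketch of such a user-defined score: a weighted average of performance parameters mapped into [0, 1], a user-selected threshold, and the risk probability as the ratio just described. The weights, thresholds, and the 1-minus-parameter mapping are illustrative choices.

```python
def risk_score(params: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of (1 - parameter): better metrics mean lower risk."""
    total = sum(weights.values())
    return sum(w * (1.0 - params[k]) for k, w in weights.items()) / total

def risk_level(score: float, low: float = 0.2, high: float = 0.6) -> str:
    """Thresholds are a user selection, driven by application criticality."""
    return "low" if score < low else "medium" if score < high else "high"

def risk_probability(n_risk_situations: int, n_scores: int) -> float:
    """Ratio of observed Risk situations to total Risk Scores calculated."""
    return n_risk_situations / n_scores if n_scores else 0.0

score = risk_score({"accuracy": 0.92, "recall": 0.88},
                   {"accuracy": 2.0, "recall": 1.0})
print(round(score, 3), risk_level(score), risk_probability(3, 50))
```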
Algorithms List Editor
The system is fitted with a pre-populated data store of AI/ML Algorithms and their associated performance parameters. This list can be updated via the Algorithms List Editor in order to specify new algorithms and their corresponding performance parameters. The important point is that this is a finite list of AI/ML Algorithm types, which is extensible and maintainable.
Active Reinforcement Learning | Local Search in Continuous Spaces
Agents Based on Propositional Logic | Machine Translation
Alpha-Beta Pruning | Multiagent Planning
Artificial Neural Networks | Nonparametric Models
Augmented Grammars and Semantic Interpretation | Object Recognition by Appearance
Backtracking Search for Constraint Satisfaction Problems | Object Recognition from Structural Information
Bayesian Networks | Ontological Engineering
Complex Decisions - Policy Iteration | Optimal Decisions in Games
Complex Decisions - Value Iteration | Partially Observable Games
Constraint Propagation: Inference in Constraint Satisfaction Problems | Passive Reinforcement Learning
Decision Networks | Planning and Acting in Nondeterministic Domains
Dynamic Bayesian Networks | Planning Graphs as State-Space Search
Ensemble Learning | Problem-Solving Agents
Explanation-Based Learning | Propositional Theorem Proving
First-Order Logic with Backward Chaining | Reconstructing the 3D World
First-Order Logic with Forward Chaining | Regression and Classification with Linear Models
Heuristic Functions | Relational and First-Order Probability Models
Hidden Markov Models | Robotic Moving
Imperfect Real-Time Decisions | Robotic Perception
Inductive Logic Programming | Robotic Planning to Move
Information Extraction | Robotic Planning Uncertain Movements
Information Retrieval | Searching with Nondeterministic Actions
Informed (Heuristic) Search Strategies | Searching with Partial Observations
Kalman Filters | Sequential Decision Problems
Knowledge Engineering in First-Order Logic | Speech Recognition
Learning Decision Trees | Statistical Learning
Learning Using Relevance Information | Stochastic Games
Learning with Complete Data | Supervised Learning
Learning with Hidden Variables: The EM Algorithm | Support Vector Machines
Local Search Algorithms and Optimization Problems | Syntactic Analysis (Parsing)
Local Search for Constraint Satisfaction Problems | Text Classification
Uninformed Search Strategies |
TRIZ Attributes Editor
Mapping qualitative attributes to quantitative metrics is an application of the House of Quality concept to this invention.
This editor is intended as a tool to help map TRIZ attributes to quantitative measures for each type of AI/ML algorithm.
The user will select the TRIZ attributes that are relevant to a specific AI/ML algorithm.
The user thresholds the performance parameters; i.e., if a threshold is not met, find the relevant attribute, go back to the TRIZ contradiction matrix, and look for hints.
Please note that the Risk Score is a combination of the performance parameters shown below:
Figure 5
Key Idea
The key insight of the invention is to measure, in a suitable and configurable manner as specified by the AI/ML algorithm designer, the Risk Score of a trained AI/ML model by subjecting it to a plurality of real, simulated, or spiked data sets via an automated process.
The basic idea is that small differences between the training (golden) data set and a plurality of similar data sets must be accompanied by small differences in the model's Risk Score.
The word "small" is something to be defined per algorithm. Small and large changes do not mean changes in the pixels alone, but also in other factors, say color or other geometrical properties, e.g., the number of holes. The test data sets could be generated by adding noise or other objects to the data, rotating the data in the space of similar data, and so on (a generation sketch follows).
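A hedged sketch of generating such test data sets from a golden image by adding noise and rotating; the perturbation magnitudes are arbitrary illustrative choices.

```python
import numpy as np
from scipy.ndimage import rotate

def perturbations(image: np.ndarray, rng: np.random.Generator):
    """Yield noisy and rotated variants of a golden image."""
    yield image + rng.normal(scale=0.05, size=image.shape)   # additive noise
    for angle in (5, -5, 15):                                # small rotations
        yield rotate(image, angle, reshape=False, mode="nearest")

rng = np.random.default_rng(0)
golden = rng.random((64, 64))            # illustrative single-channel image
test_sets = list(perturbations(golden, rng))
print(len(test_sets), test_sets[0].shape)
```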
Another key novelty of this system
is its use of TRIZ principles as applied to AI/ML Algorithms.
A third novelty of this invention is
that it can also assess the stability of an AI/ML Algorithm by comparing
changes in the input data (from the training data) with corresponding changes
in the performance of that algorithm.
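As a final hedged sketch, this stability comparison can be expressed as a ratio of the Risk Score change to the data change; the ratio and its threshold are illustrative choices, not fixed by the invention.

```python
def stability_ratio(data_diff: float, risk_diff: float) -> float:
    """Risk Score change per unit of data change: small data differences
    should produce small Risk Score differences for a stable algorithm."""
    return abs(risk_diff) / max(data_diff, 1e-12)

# e.g. a 1% data difference causing a 30% Risk Score change signals instability.
print("unstable" if stability_ratio(0.01, 0.30) > 1.0 else "stable")
```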