Friday, December 13, 2024

Building an AI/ML Risk Analyzer - A Proposal

Summary:

As the application of Artificial Intelligence (AI) and Machine Learning (ML) algorithms spreads in the automotive domain (as well as other areas of business and industry), it is very important to be able to assess the risks inherent in those algorithms in order to avoid or minimize them.  Especially in the automotive domain, it is critical to mitigate or eliminate threats to passenger safety, security, and comfort.

AI/ML algorithms are often presented in pseudo-code and/or implemented in a high-level programming language such as Python, Java, C#, or C++.  It is desirable to have a software system that automatically assesses the robustness of an AI/ML algorithm against several risk scenarios, identifies and categorizes the risks, and supplies remediation approaches.

Risk Score Calculation

The invention is envisioned as an automated process which analyzes an algorithm for the risks inherent in it and recommends remediation steps.  The invention processes different data sets sequentially for the same trained model.  That is, it keeps the trained algorithm constant and exercises it against different data sets.  The invention performs the following steps (per Figure 1):

Figure 1

  1. It will perform a variety of pre-processing checks, such as comparing the dimension of the input vector with the dimension of the output vector (feature space).  For example, if dimension_of_input < dimension_of_output, the model is considered risky and the system will so advise the user.  Other such pre-processing checks could be added for different AI/ML algorithms.  These checks are configurable by the user of the system.
  2. It will compute a data difference score by a suitable method of data subtraction.  There could be many such methods (simple XOR, CRC difference, etc.); for each data set and algorithm, the method can be defined via the Data Difference Editor feature of the invention.
  3. The run-time component, the Process Orchestrator, will execute the algorithm and compute its Risk Score.
  4. It will then subtract the new Risk Score from the trained model's Risk Score.
  5. It will determine whether the Risk Score increased and, if so, by what percentage.
  6. Given many different data sets, it can also generate meaningful statistics for the Risk Score changes.
  7. Finally, it will compute an ensemble average of the change in Risk Score and then proceed to the TRIZ steps.
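The arithmetic behind steps 1 and 4 through 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample scores are hypothetical, and the change is computed as new minus baseline so that a positive value means the Risk Score increased.

```python
from statistics import mean


def preprocessing_ok(input_dim: int, output_dim: int) -> bool:
    """Step 1 example check: a model whose input dimension is smaller
    than its output dimension is flagged as risky (returns False)."""
    return input_dim >= output_dim


def percent_change(baseline: float, new: float) -> float:
    """Steps 4-5: signed percentage change relative to the baseline score."""
    if baseline == 0:
        raise ValueError("baseline Risk Score must be non-zero")
    return 100.0 * (new - baseline) / baseline


def ensemble_average_change(baseline: float, scores: list[float]) -> float:
    """Steps 6-7: average Risk Score change over all evaluated data sets."""
    return mean(score - baseline for score in scores)


baseline_score = 0.20            # hypothetical Risk Score on the training data
new_scores = [0.25, 0.30, 0.20]  # hypothetical scores on three other data sets

changes = [percent_change(baseline_score, s) for s in new_scores]
avg_change = ensemble_average_change(baseline_score, new_scores)
```

With many data sets, the same list of changes could also feed other statistics (variance, percentiles) before moving on to the TRIZ steps.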

The Orchestrator is programmed to process a list of data sets until the list is exhausted.  As a Workflow Orchestrator, we will be using a tool such as MLflow, since it provides a pipeline for AI/ML algorithms.  It executes the algorithms and gathers performance metrics.  Suitable software components would then analyze those performance parameters for risks.  The invention is not dependent on MLflow; any other Workflow Orchestrator would work as well.


Figure 2

A trained AI model (to be analyzed) is fed a specific dataset on which the model has been trained and validated. The following is a partial list of performance parameters that could be evaluated and then combined, via the Risk Score Editor, into a Risk Score.

  1. Accuracy
  2. True Positive Rate (TPR)
  3. False Positive Rate (FPR)
  4. Recall
  5. Precision
  6. F1-Measure
  7. Area under the Receiver Operating Characteristic (ROC) curve
  8. Area under the Precision-Recall (PR) curve
  9. Logarithmic loss
  10. C-Measure
  11. R-Measure
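As a rough illustration, several of the parameters listed above can be derived from the four counts of a binary confusion matrix. The sketch below is an assumption-laden example (the function name and the sample counts are hypothetical), not the system's implementation. Note that the True Positive Rate and Recall are the same quantity.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute common binary-classification metrics from confusion-matrix counts.

    Ratios with an empty denominator are reported as 0.0 (an assumption).
    """
    total = tp + fp + tn + fn
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # also the TPR
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "tpr": recall,
        "fpr": fpr,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for a validation run of 100 samples.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```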

An example of performance parameters for Classical Binary Classification is displayed below - analogues of such parameters can be defined for other AI/ML algorithm types.

 

Figure 3

These parameters are captured in the Algorithm Types & Performance Parameters data store.

This invention is intended for mitigating the risks of the trained model and not the data.  The main objective of this tool is to advise the modeler on mitigating model-specific issues before the model goes to the Deployment phase. Its output could be used to indicate the robustness of a trained model with respect to data differences.

Generating Risk Remediation Hints

Risk is estimated based on a combination of performance parameters.  The system will go through the performance parameters in order of the "strength" of each parameter in the calculation of the combined Risk Score, and supply hints to the algorithm designer.

This is done by decomposing the Risk Score into its parts, per its definition via the Risk Score Editor (please see below).
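One way to picture this decomposition, assuming the Risk Score is a weighted average of per-parameter risk terms (here taken as 1 minus a higher-is-better metric, which is an assumption and not something the proposal fixes), is to rank the parameters by their share of the total score. All names and numbers below are hypothetical.

```python
def rank_contributions(values: dict[str, float],
                       weights: dict[str, float]) -> list[tuple[str, float]]:
    """Return (parameter, share of the Risk Score), largest share first.

    Each parameter contributes weight * (1 - value) / total_weight.
    """
    total_weight = sum(weights.values())
    parts = {name: weights[name] * (1.0 - values[name]) / total_weight
             for name in weights}
    score = sum(parts.values())
    if score == 0:
        return [(name, 0.0) for name in parts]
    return sorted(((name, part / score) for name, part in parts.items()),
                  key=lambda item: item[1], reverse=True)


# Hypothetical metric values and user-chosen weights.
values = {"accuracy": 0.9, "precision": 0.6, "recall": 0.8}
weights = {"accuracy": 1.0, "precision": 2.0, "recall": 1.0}
ranking = rank_contributions(values, weights)
```

In this example precision carries the largest share, so hints related to precision would be offered first.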

The system utilizes the validated TRIZ approach. 

Figure 4

TRIZ principles are not changed, but their analogue realizations in the AI/ML domain are used.

There are 39 technical parameters (or features) in TRIZ. To use the TRIZ contradiction matrix, the performance parameters of AI/ML algorithms are mapped to the closest TRIZ equivalent features.  They are then placed at the same positions as their equivalents in the TRIZ contradiction matrix, and the system uses the suggested TRIZ inventive principles to resolve the tradeoff between pairs of AI/ML features.

AI/ML Analogue of TRIZ General Attributes

Below is a table of TRIZ general system attributes / parameters together with their AI/ML analogues.

Each entry gives the Attribute, followed by its Interpretation, Possible Contrivances, and AI/ML Example(s).

Speed
  Interpretation:
    • The velocity of an object
    • The rate of a process or action in time
  Possible Contrivances:
    • Productivity
  AI/ML Example(s):
    • The time to detect and read the speed limit from a sign

Strength
  Interpretation:
    • The extent to which the system is able to resist changing in response to force
    • Resistance to breaking
  Possible Contrivances:
    • Reliability
    • Adaptability / Versatility
  AI/ML Example(s):
    • Error catching / handling?
    • Outlier handling
    • Data drift handling
    • Adversarial attack handling

Loss of Information
  Interpretation:
    • Partial or complete, permanent or temporary, loss of data or access to data in or by the system
  AI/ML Example(s):
    • Data leakage, i.e. using information during training that is not known or available during operation
    • Information loss via dimension reduction / lossy compression

Quantity of Substance / Matter
  Interpretation:
    • The number or amount of a system's materials, substances, parts, or subsystems which might be changed fully or partially, permanently or temporarily
  AI/ML Example(s):
    • Amount of telemetry data needed to extract a representation of a road network
    • Length of a telemetry time series needed to infer who is driving
    • Number of model trainable parameters / hyperparameters

Reliability
  Interpretation:
    • Ability to perform intended functions in predictable ways and conditions
  Possible Contrivances:
    • Strength
    • Adaptability / Versatility
  AI/ML Example(s):
    • Reaction to dirty data / missing data / wrong data type / unrealistic data
    • Outcome sensitivity to internal stochastic processes
    • A maneuver labeling algorithm assumes a trajectory has multiple points; what happens if there is only one?

Accuracy
  Interpretation:
    • The closeness of a measured value to the actual value of a property of a system
  AI/ML Example(s):
    • Classification accuracy

Adaptability / Versatility
  Interpretation:
    • The extent to which a system positively responds to external changes
    • The extent to which a system can be used in multiple ways under a variety of circumstances
  Possible Contrivances:
    • Reliability
    • Strength
  AI/ML Example(s):
    • Performance change when using an algorithm designed with Region 1 data on Region 2 data
    • Sensitivity to data drift

Productivity
  Interpretation:
    • The number of functions / operations performed per unit time
    • The time for a unit function / operation
  Possible Contrivances:
    • Speed
  AI/ML Example(s):
    • Number of intersections that can be processed (e.g. maneuver labeling) in a minute
    • Number of seconds it takes to process (e.g. maneuver labeling) a single intersection

Once the TRIZ principle is identified, one or more hints can be generated from the TRIZ-Hints data.

A Partial Realization of the TRIZ-supplied Hints Table is presented below, listing each TRIZ Principle with its AI/ML Hint:

  • Principle 1, Segmentation / Division: Refactor the algorithm into many independent algorithms.
  • Principle 3, Local Quality: Non-uniform data sampling method
  • Principle 5, Consolidation / Combination / Merging: Data dimension reduction (PCA / autoencoding)
  • Principle 6, Universality / Multi-function: Parallelize the computation
  • Principle 17, Transition into New Dimension / Another Dimension:
    • Change the number of layers (ANN)
    • One-hot encoding
    • PCA analysis

The system will use a pre-populated Risks Map to identify the corresponding record in a TRIZ-like Contradiction Matrix and, from that record, the hints.

Examples of TRIZ-like contradictions are:

  • performance parameters vs speed of execution
  • precision vs numerical stability

The invention contains the following "editors":

Data Difference Editor

This enables a user of the system to define how to compute the difference between the training data set for a model and the data sets that will be used during Risk Assessment by the Process Orchestrator. There are many well-known methods in the prior art, such as comparing cyclic redundancy checks (CRC) of each data set, the XOR operation, image subtraction, comparing topological invariants of each data set, and many others.
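Two of the simplest methods mentioned here, a CRC comparison and an XOR-based difference, might look like the following sketch. The function names are hypothetical, and treating a data set as a raw byte string is an assumption made to keep the example short.

```python
import zlib


def crc_differs(a: bytes, b: bytes) -> bool:
    """Coarse check: True when the CRC-32 checksums of two data sets differ."""
    return zlib.crc32(a) != zlib.crc32(b)


def xor_bit_difference(a: bytes, b: bytes) -> float:
    """Fraction of differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if not a:
        return 0.0
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (8 * len(a))


# Two hypothetical 2-byte data sets that differ in every bit of the last byte.
difference = xor_bit_difference(b"\x00\x00", b"\x00\xff")
```

The CRC check only says whether two data sets differ at all, while the XOR fraction gives a graded score between 0 and 1.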

It should be noted that the set of data types one deals with is, in practice, finite.  We have:

  • Relational Data (structured data, possibly using attribute-by-attribute binary subtraction)
  • Textual Data
  • Different types of images (enabling the usage of Image Analysis techniques)
    • Camera
    • Lidar
    • Radar
  • Different Types of Sensors (time series - using mathematical statistical techniques)
    • Temperature Sensor (infrared)
    • Proximity Sensors
    • Infrared Sensor (IR Sensor)
    • Ultrasonic Sensor
    • Light Sensor
    • Smoke and Gas Sensors
    • Alcohol Sensor
    • Touch Sensor
    • Color Sensor
    • Humidity Sensor
    • Tilt Sensor

Furthermore, the number of sensor types is limited by the Laws of Physics and Chemistry and therefore is finite as well.

Given that the data types are finite, we can use existing measures (known in the art) to characterize the difference between a training dataset and the other datasets used during the execution of an AI/ML algorithm by the Process Orchestrator.

For example, for time-series sensor data, we have metrics such as the mean, the standard deviation of the mean, the median, min/max, and the power spectrum to utilize in order to compute the difference between the reference dataset used in training and those fed to the algorithm via the process orchestrator.
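A minimal sketch of such a comparison, using only the Python standard library, is shown below. The function names are hypothetical; the population standard deviation is used, the power spectrum is a naive DFT of the mean-removed signal, and both series are assumed to have the same length.

```python
import cmath
import math
import statistics


def summary(x: list[float]) -> dict[str, float]:
    """Basic descriptive statistics of a time series."""
    return {
        "mean": statistics.fmean(x),
        "std": statistics.pstdev(x),
        "median": statistics.median(x),
        "min": min(x),
        "max": max(x),
    }


def power_spectrum(x: list[float]) -> list[float]:
    """Naive DFT power spectrum of the mean-removed signal (non-negative bins)."""
    n = len(x)
    mu = statistics.fmean(x)
    centered = [v - mu for v in x]
    return [abs(sum(c * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, c in enumerate(centered))) ** 2 / n
            for k in range(n // 2 + 1)]


def time_series_difference(reference: list[float],
                           candidate: list[float]) -> dict[str, float]:
    """Absolute difference of each statistic, plus the mean spectral difference."""
    ref, cand = summary(reference), summary(candidate)
    diff = {name: abs(cand[name] - ref[name]) for name in ref}
    spec_ref, spec_cand = power_spectrum(reference), power_spectrum(candidate)
    diff["power_spectrum"] = statistics.fmean(
        abs(a - b) for a, b in zip(spec_ref, spec_cand))
    return diff


# Hypothetical sensor trace and a copy shifted by a constant offset.
reference = [math.sin(2 * math.pi * 4 * i / 64) for i in range(64)]
candidate = [v + 1.0 for v in reference]
diff = time_series_difference(reference, candidate)
```

Here the constant offset shows up in the mean, median, min, and max, while the standard deviation and the power spectrum are essentially unchanged.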

For image type sensor data (Camera, Lidar, Radar), which occupy different parts of the electromagnetic spectrum, many available image analysis techniques could be used to compute the dataset difference.

In principle, a combination of metrics could be applied to measure the differences between the training dataset and those used during the execution of the model by the Process Orchestrator.

A combined difference score is also computed if the user so decides via this editor.

Risk Score Editor

This enables a user of the system to define the Risk Score for an algorithm based on its performance parameters. The user has a lot of flexibility in how to combine the performance parameters of a given model into a Risk Score for that algorithm, e.g. by giving each parameter a different weight and computing the weighted arithmetic average of those parameters. The system will output a Risk Score and a probability of that risk (its confidence level). The Risk Score will be a real number between 0 and 1.0, and the thresholding (low, medium, high) would be based on the criticality of the application (safety, money, health). The thresholds are selected by the user.
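As one possible reading of this definition, the sketch below combines per-parameter risk terms (each assumed to already lie between 0 and 1, with higher meaning riskier) into a weighted average and then applies user-selected thresholds. The function names, the sample terms, and the threshold values 0.3 and 0.6 are all hypothetical.

```python
def risk_score(risk_terms: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted arithmetic average of risk terms, clamped to [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    score = sum(weights[name] * risk_terms[name] for name in weights) / total_weight
    return min(1.0, max(0.0, score))


def risk_level(score: float, medium: float = 0.3, high: float = 0.6) -> str:
    """Map a Risk Score to low / medium / high using user-chosen thresholds."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


# Hypothetical terms: the false positive rate is weighted three times as heavily.
terms = {"error_rate": 0.1, "fpr": 0.4}
weights = {"error_rate": 1.0, "fpr": 3.0}
score = risk_score(terms, weights)
level = risk_level(score)
```

For a safety-critical application the user would presumably lower the thresholds, so that the same score maps to a higher level.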

The Risk probability would be computed based on the ratio of occurrences of a Risk situation (defined as when small changes in data lead to large changes in the Risk Score) to the total number of Risk Scores that are calculated.
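This ratio can be sketched as follows, assuming each evaluated data set yields a data difference score and a Risk Score change. The cut-offs for "small" and "large" are hypothetical parameters that the user would configure.

```python
def risk_probability(data_diffs: list[float],
                     score_changes: list[float],
                     small_diff: float = 0.1,
                     large_change: float = 0.2) -> float:
    """Fraction of runs where a small data difference produced a large
    change in the Risk Score."""
    if len(data_diffs) != len(score_changes):
        raise ValueError("one score change is expected per data difference")
    if not data_diffs:
        return 0.0
    risky = sum(1 for d, c in zip(data_diffs, score_changes)
                if d <= small_diff and abs(c) >= large_change)
    return risky / len(data_diffs)


# Hypothetical results for four data sets.
diffs = [0.05, 0.08, 0.50, 0.02]
changes = [0.30, 0.01, 0.40, 0.25]
probability = risk_probability(diffs, changes)
```

The third run is not counted as a Risk situation because its data difference is large, so a large score change is expected there.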

Algorithms List Editor

The system is fitted with a pre-populated data store of AI/ML Algorithms and their associated performance parameters.  This list can be updated via the Algorithms List Editor in order to specify new algorithms and their corresponding performance parameters.  The important point is that this is a finite list of AI/ML Algorithm types which is extensible and maintainable.

  • Active Reinforcement Learning
  • Agents Based on Propositional Logic
  • Alpha-Beta Pruning
  • Artificial Neural Networks
  • Augmented Grammars and Semantic Interpretation
  • Backtracking Search for Constraint Satisfaction Problems
  • Bayesian Networks
  • Complex Decisions - Policy Iteration
  • Complex Decisions - Value Iteration
  • Constraint Propagation: Inference in Constraint Satisfaction Problems
  • Decision Networks
  • Dynamic Bayesian Networks
  • Ensemble Learning
  • Explanation-Based Learning
  • First-Order Logic with Backward Chaining
  • First-Order Logic with Forward Chaining
  • Heuristic Functions
  • Hidden Markov Models
  • Imperfect Real-Time Decisions
  • Inductive Logic Programming
  • Information Extraction
  • Information Retrieval
  • Informed (Heuristic) Search Strategies
  • Kalman Filters
  • Knowledge Engineering in First-Order Logic
  • Learning Decision Trees
  • Learning Using Relevance Information
  • Learning with Complete Data
  • Learning with Hidden Variables: The EM Algorithm
  • Local Search Algorithms and Optimization Problems
  • Local Search for Constraint Satisfaction Problems
  • Local Search in Continuous Spaces
  • Machine Translation
  • Multiagent Planning
  • Nonparametric Models
  • Object Recognition by Appearance
  • Object Recognition from Structural Information
  • Ontological Engineering
  • Optimal Decisions in Games
  • Partially Observable Games
  • Passive Reinforcement Learning
  • Planning and Acting in Nondeterministic Domains
  • Planning Graphs as State-Space Search
  • Problem-Solving Agents
  • Propositional Theorem Proving
  • Reconstructing the 3D World
  • Regression and Classification with Linear Models
  • Relational and First-Order Probability Models
  • Robotic Moving
  • Robotic Perception
  • Robotic Planning to Move
  • Robotic Planning Uncertain Movements
  • Searching with Nondeterministic Actions
  • Searching with Partial Observations
  • Sequential Decision Problems
  • Speech Recognition
  • Statistical Learning
  • Stochastic Games
  • Supervised Learning
  • Support Vector Machines
  • Syntactic Analysis (Parsing)
  • Text Classification
  • Uninformed Search Strategies

TRIZ Attributes Editor

Mapping qualitative attributes to quantitative metrics applies the House of Quality concept to this invention.

It is intended as a tool to help us map TRIZ attributes into quantitative measures for each type of AI/ML algorithm.

The user will select the TRIZ attributes that are relevant to a specific AI/ML algorithm.

The user sets thresholds on the performance parameters; if a threshold is not met, the system finds the relevant attribute, goes back to the TRIZ contradiction matrix, and looks for hints.
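The threshold step could be sketched like this, assuming every performance parameter is "higher is better" and that the user has mapped each parameter to a TRIZ attribute. The mapping, metric values, and thresholds below are hypothetical.

```python
def attributes_to_revisit(metrics: dict[str, float],
                          thresholds: dict[str, float],
                          attribute_of: dict[str, str]) -> list[str]:
    """Return the TRIZ attributes whose performance parameters miss
    their user-defined minimum thresholds."""
    failing = [name for name, minimum in thresholds.items()
               if metrics[name] < minimum]
    return sorted({attribute_of[name] for name in failing})


metrics = {"accuracy": 0.92, "recall": 0.70}
thresholds = {"accuracy": 0.90, "recall": 0.80}
# Hypothetical user-defined mapping from parameters to TRIZ attributes.
attribute_of = {"accuracy": "Accuracy", "recall": "Reliability"}

flagged = attributes_to_revisit(metrics, thresholds, attribute_of)
```

Each flagged attribute would then be looked up in the contradiction matrix to retrieve hints.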

Please note that the Risk Score is a combination of these performance parameters below:


Figure 5

Key Idea

The key insight of the invention is to measure, in a suitable and configurable manner as specified by the AI/ML Algorithm designer, the Risk Score of a trained AI/ML model by subjecting it to a plurality of real, simulated, or spiked data sets via an automated process.

The basic idea is that small differences between the training (golden) data set and a plurality of similar data sets must be accompanied by small differences in the model's Risk Score.

What counts as "small" must be defined per algorithm. Small and large changes do not mean changes in the pixels alone, but also in other factors such as color or other geometrical properties, e.g. the number of holes. The test data sets could be generated by adding noise or other objects to the data, rotating the data in the space of similar such data, and so on.
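Two of these perturbations, additive noise and rotation, can be sketched with the standard library alone. The function names are hypothetical, Gaussian noise is an assumed noise model, and a small list-of-rows grid stands in for an image.

```python
import random


def add_noise(values: list[float], sigma: float, seed: int = 0) -> list[float]:
    """Return a copy of the data with zero-mean Gaussian noise added."""
    rng = random.Random(seed)  # seeded so that test data sets are reproducible
    return [v + rng.gauss(0.0, sigma) for v in values]


def rotate_90(image: list[list[int]]) -> list[list[int]]:
    """Rotate a 2D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


golden_signal = [0.0, 0.5, 1.0, 0.5]
noisy_signal = add_noise(golden_signal, sigma=0.05)

golden_image = [[1, 2],
                [3, 4]]
rotated_image = rotate_90(golden_image)
```

Varying `sigma` or the number of rotations gives a family of data sets at increasing "distance" from the golden data set.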

Another key novelty of this system is its use of TRIZ principles as applied to AI/ML Algorithms.

A third novelty of this invention is that it can also assess the stability of an AI/ML Algorithm by comparing changes in the input data (from the training data) with corresponding changes in the performance of that algorithm.
