Summary:
As the application of Artificial Intelligence (AI) and Machine Learning (ML) algorithms spreads in the automotive domain (as well as in other areas of business and industry), it is very important to be able to assess the risks inherent in those algorithms in order to avoid or minimize them. Especially in the automotive domain, it is critical to mitigate or eliminate threats to passenger safety, security, and comfort.
AI/ML algorithms are often presented in pseudo-code and/or implemented in a high-level programming language such as Python, Java, C#, or C++. It is desirable to have a software system that automatically assesses the robustness of an AI/ML algorithm against several risk scenarios, identifies and categorizes those risks, and supplies remediation approaches.
Risk Score Calculation
The invention is envisioned as an automated process that analyzes an algorithm for the risks inherent in it and recommends remediation steps. The invention processes different data sets sequentially for the same trained model. That is, it keeps the trained algorithm constant and exercises it against different data sets. The invention performs the following steps (per Figure 1):
- It will perform a variety of pre-processing steps, such as checking the dimension of the input vector against the dimension of the output vector (feature space). In such an example, if dimension_of_input < dimension_of_output, the model is risky and the system will so advise the user. Other such pre-processing steps could be added for different AI/ML algorithms. These pre-processing checks are configurable by the user of the system.
- It will compute a data difference score by a suitable method of data subtraction (there could be many such methods: simple XOR, CRC difference, etc. These methods, for each data set and algorithm, can be defined via the Data Difference Editor feature of the invention).
- The run-time component, the Process Orchestrator, will execute the algorithm and compute its Risk Score.
- It will then subtract the new Risk Score from the Trained Model's Risk Score.
- Did it increase, and by what percentage?
- Conceivably, given a large number of different data sets, one can even generate good statistics for the Risk Score changes.
- So, we can compute an ensemble average of the change in Risk Score and then proceed to the TRIZ steps (see the sketch after this list).
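A minimal sketch of the loop above, in Python. All helper names (preprocess_check, risk_score_delta, ensemble_average) are hypothetical placeholders; the actual checks and the Risk Score formula are user-configurable via the editors described later.

```python
# Hypothetical sketch of the per-dataset Risk Score loop; not a fixed API.
import statistics

def preprocess_check(input_dim: int, output_dim: int) -> bool:
    """Pre-processing check from step 1: flag the model as risky
    if the input dimension is smaller than the output dimension."""
    return input_dim >= output_dim

def risk_score_delta(trained_risk: float, new_risk: float) -> float:
    """Change in Risk Score relative to the trained model's score."""
    return (new_risk - trained_risk) / trained_risk

def ensemble_average(trained_risk: float, new_risks: list[float]) -> float:
    """Ensemble average of the Risk Score change over many data sets."""
    return statistics.mean(risk_score_delta(trained_risk, r) for r in new_risks)

if not preprocess_check(input_dim=10, output_dim=3):
    print("Model is risky: input dimension < output dimension")
print(f"Mean Risk Score change: {ensemble_average(0.2, [0.25, 0.22, 0.30]):+.1%}")
```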
The Orchestrator is programmed to process a list of data sets until the list is exhausted. As the Workflow Orchestrator, we will use a tool such as MLflow, since it provides a pipeline for AI/ML algorithms. It executes the algorithms and gathers performance metrics. Suitable software components then analyze those performance parameters for risks, as in the sketch below. The invention is not dependent on MLflow; any other Workflow Orchestrator would work as well.
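A hedged sketch of the orchestration loop using MLflow only for run tracking and metric logging. The functions evaluate_model and compute_risk_score are hypothetical stand-ins, and the dataset names are illustrative.

```python
import mlflow

def evaluate_model(model, dataset_name):
    """Hypothetical stand-in: run the trained model on a data set and
    return its performance parameters."""
    return {"accuracy": 0.92, "recall": 0.88}  # placeholder values

def compute_risk_score(metrics):
    """Hypothetical stand-in for the user-defined Risk Score formula."""
    return 1.0 - min(metrics.values())

trained_model = None  # the trained model under test (kept constant)
datasets = ["golden_set", "noisy_set_1", "noisy_set_2"]  # illustrative names

for name in datasets:
    with mlflow.start_run(run_name=name):
        metrics = evaluate_model(trained_model, name)
        mlflow.log_metrics(metrics)                       # performance metrics
        mlflow.log_metric("risk_score", compute_risk_score(metrics))
```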
Figure 2
A trained AI model (to be analyzed) is fed with a specific dataset on which the model has been trained and validated. The following is a partial list of performance parameters that could be evaluated and combined, via the Risk Score Editor, into a Risk Score (a scikit-learn sketch follows the list).
- Accuracy
- True Positive Rate (TPR)
- False Positive Rate (FPR)
- Recall
- Precision
- F1-Measure
- Area under the Receiver Operating Characteristic (ROC)
curve
- Area under the Precision-Recall (PR) curve
- Logarithmic loss
- C-Measure
- R-Measure
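As a sketch, most of these parameters can be computed with scikit-learn for a binary classifier; the C-Measure and R-Measure are less standard and are omitted here. The y_true and y_prob arrays are illustrative.

```python
# Sketch: computing the listed performance parameters with scikit-learn.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, average_precision_score,
                             log_loss, confusion_matrix)

y_true = [0, 1, 1, 0, 1, 0, 1, 1]                   # ground-truth labels
y_prob = [0.1, 0.8, 0.7, 0.3, 0.9, 0.4, 0.6, 0.2]   # predicted P(class = 1)
y_pred = [int(p >= 0.5) for p in y_prob]            # thresholded predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
params = {
    "accuracy": accuracy_score(y_true, y_pred),
    "tpr_recall": recall_score(y_true, y_pred),         # Recall == TPR
    "fpr": fp / (fp + tn),                              # False Positive Rate
    "precision": precision_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    "roc_auc": roc_auc_score(y_true, y_prob),           # area under ROC curve
    "pr_auc": average_precision_score(y_true, y_prob),  # area under PR curve
    "log_loss": log_loss(y_true, y_prob),
}
print(params)
```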
An example of performance parameters for classical binary classification is displayed below; analogues of such parameters can be defined for other AI/ML algorithm types.
Figure 3
Those are captured in
the Algorithm Types & Performance Parameters data store.
This invention is intended for
mitigating the risks of the trained model and not the data. The main
objective of this tool is to advise the modeler to mitigate model-specific
issues before going to Deployment phase of the model. Its output could be
used to indicate the robustness of a trained model with respect to data
differences.
Generating Risk Remediation Hints
Risk is estimated based on a combination of performance parameters. The system goes through the performance parameters, based on the "strength" of each parameter in the calculation of the combined Risk Score, and supplies hints to the algorithm designer.
The way this is done is that the Risk Score is disambiguated by looking at its parts, per its definition via the Risk Score Editor (please see below). The system utilizes the validated TRIZ approach.
Figure 4
TRIZ principles are not changed, but their analogue realizations in the AI/ML domain are used. There are 39 technical parameters (or features) in TRIZ. To use the TRIZ contradiction matrix, the performance parameters of AI/ML algorithms are mapped to their closest TRIZ equivalent features. They are then placed at the same positions as their equivalents in the TRIZ contradiction matrix, and the system uses the suggested TRIZ inventive principles to resolve the tradeoff between pairs of AI/ML features, as sketched below.
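A hedged illustration of this mapping and lookup as plain dictionaries. The AI/ML-to-TRIZ pairings and the single matrix cell shown are placeholders for illustration, not the real 39x39 contradiction matrix.

```python
# AI/ML performance parameter -> (TRIZ parameter number, name); assumed mapping.
AIML_TO_TRIZ = {
    "accuracy": (28, "Measurement accuracy"),
    "inference_speed": (9, "Speed"),
    "model_reliability": (27, "Reliability"),
}

# (improving parameter, worsening parameter) -> suggested inventive principles.
CONTRADICTION_MATRIX = {
    (28, 9): [17, 5],   # placeholder cell: accuracy vs. speed of execution
}

def suggest_principles(improve: str, worsen: str) -> list[int]:
    """Look up the TRIZ principles for a pair of conflicting AI/ML features."""
    key = (AIML_TO_TRIZ[improve][0], AIML_TO_TRIZ[worsen][0])
    return CONTRADICTION_MATRIX.get(key, [])

print(suggest_principles("accuracy", "inference_speed"))  # -> [17, 5]
```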
AI/ML Analogue of TRIZ General
Attributes
Below is a table of TRIZ general system attributes/parameters with their AI/ML analogues.
Attribute | Interpretation | Possible Contrivances | AI/ML Example(s)
Speed | | |
Strength | | |
Loss of Information | | |
Quantity of Substance / Matter | | |
Reliability | | |
Accuracy | | |
Adaptability / Versatility | | |
Productivity | | |
Once the TRIZ principle is identified, a hint or a number of hints can be generated from the TRIZ-Hints data.
A partial realization of the TRIZ-supplied Hints Table is presented below:
TRIZ Principle | Name | AI/ML Hint
1 | Segmentation / Division | Refactor the algorithm into many independent algorithms.
3 | Local Quality | Non-uniform data sampling method
5 | Consolidation / Combination / Merging | Data dimension reduction (PCA / autoencoding)
6 | Universality / Multi-function | Parallelize the computation
17 | Transition into a New Dimension / Another Dimension | Change the number of layers (ANN); one-hot encoding; PCA analysis
The system will use a pre-populated Risks Map to identify the corresponding record in a TRIZ-like Contradiction Matrix, which in turn yields hints (a lookup sketch follows the examples below). Examples of TRIZ-like contradictions are:
- performance parameters vs. speed of execution
- precision vs. numerical stability
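A sketch of the last step, turning suggested TRIZ principles into AI/ML hints, using the partial hints table above; the function name is hypothetical.

```python
# Partial hints table from above, keyed by TRIZ principle number.
TRIZ_HINTS = {
    1:  "Refactor the algorithm into many independent algorithms.",
    3:  "Use a non-uniform data sampling method.",
    5:  "Apply data dimension reduction (PCA / autoencoding).",
    6:  "Parallelize the computation.",
    17: "Change the number of layers (ANN); try one-hot encoding or PCA.",
}

def hints_for(principles: list[int]) -> list[str]:
    """Map TRIZ principle numbers to remediation hints, skipping unknowns."""
    return [TRIZ_HINTS[p] for p in principles if p in TRIZ_HINTS]

# e.g. for the 'performance parameters vs. speed of execution' contradiction:
print(hints_for([17, 5]))
```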
The invention contains the following "editors":
Data Difference Editor
This enables a user of the system to define how to compute the difference between the training data set for a model and the data sets that will be used during the Risk Assessment process by the Process Orchestrator. There are many well-known methods in the prior art, such as cyclic redundancy checks (CRC) for each data set, the XOR operation, image subtraction, comparing topological invariants of each data set, and very many others; two are sketched below.
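A hedged sketch of two of the named methods, a CRC comparison and a byte-wise XOR score, applied to serialized data sets; the payloads are illustrative.

```python
import zlib

def crc_differs(a: bytes, b: bytes) -> bool:
    """CRC difference: do the two serialized data sets have different CRC32s?"""
    return zlib.crc32(a) != zlib.crc32(b)

def xor_score(a: bytes, b: bytes) -> float:
    """Simple XOR: fraction of differing bits over the common length."""
    n = min(len(a), len(b))
    diff_bits = sum(bin(x ^ y).count("1") for x, y in zip(a[:n], b[:n]))
    return diff_bits / (8 * n) if n else 1.0

golden = b"example serialized training set"      # illustrative payloads
candidate = b"example serialized testing  set"
print(crc_differs(golden, candidate), round(xor_score(golden, candidate), 3))
```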
It should be noted that the set of data types one deals with is, in practice, finite. We have:
- Relational Data (structured data; binary subtraction could be used, attribute by attribute).
- Textual Data
- Different types of images (enabling the usage of Image
Analysis techniques)
- Camera
- Lidar
- Radar
- Different Types of Sensors (time series, analyzed using mathematical and statistical techniques)
- Temperature Sensor (infrared)
- Proximity Sensors
- Infrared Sensor (IR Sensor)
- Ultrasonic Sensor
- Light Sensor
- Smoke and Gas Sensors
- Alcohol Sensor
- Touch Sensor
- Color Sensor
- Humidity Sensor
- Tilt Sensor
Furthermore, the number of sensor types is limited by the laws of physics and chemistry and is therefore finite as well.
Given that the data types are finite, we can use existing measures (known in the art) to characterize the difference between a training dataset and other datasets during the execution of an AI/ML algorithm by the Process Orchestrator.
For example, for time-series sensor
data, we have metrics such as the mean, the standard deviation of the mean, the
median, min/max, and the power spectrum to utilize in order to compute the
difference between the reference dataset used in training and those fed to the
algorithm via the process orchestrator.
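A sketch of such a per-metric difference for time-series sensor data, using the statistics named above; the way the per-statistic differences are combined into one number is a placeholder choice.

```python
import numpy as np

def series_summary(x: np.ndarray) -> np.ndarray:
    """Summary statistics: mean, std, median, min/max, mean spectral power."""
    power = np.abs(np.fft.rfft(x)) ** 2          # power spectrum
    return np.array([x.mean(), x.std(), np.median(x),
                     x.min(), x.max(), power.mean()])

def series_difference(train: np.ndarray, other: np.ndarray) -> float:
    """Mean relative difference of summary statistics between two series."""
    a, b = series_summary(train), series_summary(other)
    return float(np.mean(np.abs(a - b) / (np.abs(a) + 1e-12)))

rng = np.random.default_rng(0)
golden = rng.normal(size=1000)                    # illustrative sensor traces
noisy = golden + rng.normal(scale=0.1, size=1000)
print(round(series_difference(golden, noisy), 4))
```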
For image type sensor data (Camera,
Lidar, Radar), which occupy different parts of the electromagnetic spectrum,
many available image analysis techniques could be used to compute the dataset
difference.
In principle, a combination of
metrics could be applied to measure the differences between the training
dataset and those used during the execution of the model by the Process
Orchestrator.
A combined difference score is also
computed if the user so decides via this editor.
Risk Score Editor
This enables a user of the system to define the Risk Score for an algorithm based on its performance parameters. The user has a lot of flexibility in how to combine the performance parameters of a given model to create a Risk Score for that algorithm, e.g., giving each parameter a different weight and computing the weighted arithmetic average of those parameters. The system will output a Risk Score and a probability for that risk (its confidence level). The Risk Score will be a real number between 0 and 1.0, and the thresholding (low, medium, high) will be based on the criticality of the application (safety, money, health); the threshold is a user selection. The Risk probability will be computed as the ratio of occurrences of a Risk situation (defined as when small changes in data lead to large changes in the Risk Score) to the total number of Risk Scores calculated. A sketch follows.
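A minimal sketch of such a user-defined score: a weighted average of performance parameters mapped into [0, 1], a user-selected threshold, and the risk probability as the ratio just described. The weights, thresholds, and the 1-minus-parameter mapping are illustrative choices.

```python
def risk_score(params: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of (1 - parameter): better metrics mean lower risk."""
    total = sum(weights.values())
    return sum(w * (1.0 - params[k]) for k, w in weights.items()) / total

def risk_level(score: float, low: float = 0.2, high: float = 0.6) -> str:
    """Thresholds are a user selection, driven by application criticality."""
    return "low" if score < low else "medium" if score < high else "high"

def risk_probability(n_risk_situations: int, n_scores: int) -> float:
    """Ratio of observed Risk situations to total Risk Scores calculated."""
    return n_risk_situations / n_scores if n_scores else 0.0

score = risk_score({"accuracy": 0.92, "recall": 0.88},
                   {"accuracy": 2.0, "recall": 1.0})
print(round(score, 3), risk_level(score), risk_probability(3, 50))
```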
Algorithms List Editor
The system is fitted with a pre-populated data store of AI/ML Algorithms and their associated performance parameters. This list can be updated via the Algorithms List Editor in order to specify new algorithms and their corresponding performance parameters. The important point is that this is a finite list of AI/ML Algorithm types, which is extensible and maintainable.
Active Reinforcement Learning | Local Search in Continuous Spaces
Agents Based on Propositional Logic | Machine Translation
Alpha-Beta Pruning | Multiagent Planning
Artificial Neural Networks | Nonparametric Models
Augmented Grammars and Semantic Interpretation | Object Recognition by Appearance
Backtracking Search for Constraint Satisfaction Problems | Object Recognition from Structural Information
Bayesian Networks | Ontological Engineering
Complex Decisions - Policy Iteration | Optimal Decisions in Games
Complex Decisions - Value Iteration | Partially Observable Games
Constraint Propagation: Inference in Constraint Satisfaction Problems | Passive Reinforcement Learning
Decision Networks | Planning and Acting in Nondeterministic Domains
Dynamic Bayesian Networks | Planning Graphs as State-Space Search
Ensemble Learning | Problem-Solving Agents
Explanation-Based Learning | Propositional Theorem Proving
First-Order Logic with Backward Chaining | Reconstructing the 3D World
First-Order Logic with Forward Chaining | Regression and Classification with Linear Models
Heuristic Functions | Relational and First-Order Probability Models
Hidden Markov Models | Robotic Moving
Imperfect Real-Time Decisions | Robotic Perception
Inductive Logic Programming | Robotic Planning to Move
Information Extraction | Robotic Planning Uncertain Movements
Information Retrieval | Searching with Nondeterministic Actions
Informed (Heuristic) Search Strategies | Searching with Partial Observations
Kalman Filters | Sequential Decision Problems
Knowledge Engineering in First-Order Logic | Speech Recognition
Learning Decision Trees | Statistical Learning
Learning Using Relevance Information | Stochastic Games
Learning with Complete Data | Supervised Learning
Learning with Hidden Variables: The EM Algorithm | Support Vector Machines
Local Search Algorithms and Optimization Problems | Syntactic Analysis (Parsing)
Local Search for Constraint Satisfaction Problems | Text Classification
Uninformed Search Strategies |
TRIZ Attributes Editor
Mapping qualitative attributes to quantitative metrics is an application of the House of Quality concept to this invention.
This editor is intended as a tool to help map TRIZ attributes to quantitative measures for each type of AI/ML algorithm.
The user will select the TRIZ attributes that are relevant to a specific AI/ML algorithm.
The user thresholds the performance parameters; i.e., if a threshold is not met, find the relevant attribute, go back to the TRIZ contradiction matrix, and look for hints.
Please note that the Risk Score is a combination of the performance parameters shown below:
Figure 5
Key Idea
The key insight of the invention is to measure, in a suitable and configurable manner as specified by the AI/ML algorithm designer, the Risk Score of a trained AI/ML model by subjecting it to a plurality of real, simulated, or spiked data sets via an automated process.
The basic idea is that small differences between the training (golden) data set and a plurality of similar data sets must be accompanied by small differences in the model's Risk Score.
The word "small" is something to be defined per algorithm. Small and large changes do not mean changes in the pixels alone, but also in other factors, say color or other geometrical properties, e.g., the number of holes. The test data sets could be generated by adding noise or other objects to the data, rotating the data in the space of similar data, and so on (a generation sketch follows).
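A hedged sketch of generating such test data sets from a golden image by adding noise and rotating; the perturbation magnitudes are arbitrary illustrative choices.

```python
import numpy as np
from scipy.ndimage import rotate

def perturbations(image: np.ndarray, rng: np.random.Generator):
    """Yield noisy and rotated variants of a golden image."""
    yield image + rng.normal(scale=0.05, size=image.shape)   # additive noise
    for angle in (5, -5, 15):                                # small rotations
        yield rotate(image, angle, reshape=False, mode="nearest")

rng = np.random.default_rng(0)
golden = rng.random((64, 64))            # illustrative single-channel image
test_sets = list(perturbations(golden, rng))
print(len(test_sets), test_sets[0].shape)
```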
Another key novelty of this system
is its use of TRIZ principles as applied to AI/ML Algorithms.
A third novelty of this invention is
that it can also assess the stability of an AI/ML Algorithm by comparing
changes in the input data (from the training data) with corresponding changes
in the performance of that algorithm.
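As a final hedged sketch, this stability comparison can be expressed as a ratio of the Risk Score change to the data change; the ratio and its threshold are illustrative choices, not fixed by the invention.

```python
def stability_ratio(data_diff: float, risk_diff: float) -> float:
    """Risk Score change per unit of data change: small data differences
    should produce small Risk Score differences for a stable algorithm."""
    return abs(risk_diff) / max(data_diff, 1e-12)

# e.g. a 1% data difference causing a 30% Risk Score change signals instability.
print("unstable" if stability_ratio(0.01, 0.30) > 1.0 else "stable")
```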