fairAITools

assessBias

At a glance
The assessBias action in the fairAITools action set is a diagnostic step for predictive models deployed on SAS Viya. It quantifies performance gaps across the levels of a sensitive attribute, so that discriminatory patterns can be detected before a model affects real-world outcomes. This is useful both for meeting regulatory requirements and for improving model transparency. The FAQ further down lists the questions this page addresses, covering the action's parameters and bias detection in practice.

Description

The assessBias action calculates bias metrics for predictive models. This is a crucial step in ensuring fairness in artificial intelligence by identifying whether a model produces different outcomes for different subgroups, particularly those defined by sensitive variables like race or gender. The action can handle models saved as analytic stores (ASTORE) or as SAS DATA step code.
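To make "different outcomes for different subgroups" concrete, here is a minimal hand-rolled sketch in Base SAS. It does not call assessBias and is not how the action computes its metrics; the data set `work.toy_scored`, its columns (`grp`, `actual`, `p_event`), and the 0.5 cutoff with a `>=` comparison are all assumptions made for illustration.

```sas
/* Hypothetical toy data: subgroup label, actual outcome, predicted event probability */
data work.toy_scored;
   input grp $ actual p_event;
   datalines;
A 1 0.82
A 0 0.35
A 0 0.61
A 1 0.44
B 1 0.91
B 0 0.12
B 0 0.28
B 1 0.39
;
run;

proc sql;
   /* Classify each row as an event at a 0.5 cutoff, then summarize per subgroup. */
   /* In PROC SQL a comparison evaluates to 1 (true) or 0 (false), so MEAN gives a rate. */
   create table work.group_rates as
   select grp,
          count(*) as n,
          mean(p_event >= 0.5) as predicted_event_rate,
          sum(actual = 1 and p_event >= 0.5) / sum(actual = 1) as true_positive_rate
   from work.toy_scored
   group by grp;

   select * from work.group_rates;
quit;
```

A gap between the rows of `work.group_rates` for the same metric is the kind of subgroup difference a bias assessment looks for.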

fairAITools.assessBias {
   code="string",
   cutoff=double,
   event="string",
   frequency={casvardesc},
   modelTable={castable},
   modelTables={{castable-1} <, {castable-2}, ...>},
   modelTableType="ASTORE" | "DATASTEP" | "NONE",
   nBins=64-bit-integer,
   predictedVariables={{casvardesc-1} <, {casvardesc-2}, ...>},
   referenceLevel="string",
   response={casvardesc},
   responseLevels={"string-1" <, "string-2", ...>},
   rocStep=double,
   scoredTable={casouttable},
   selectionDepth=64-bit-integer,
   sensitiveVariable={casvardesc},
   table={castable},
   weight={casvardesc}
};
Settings
| Parameter | Description |
|---|---|
| code | Specifies the DATA step code that describes the model or the DS2 code used with an analytic store. |
| cutoff | Specifies the probability cutoff for classifying an observation as an event in the confusion matrix. Default is 0.5. |
| event | Specifies the formatted value of the response variable that represents the event of interest. |
| frequency | Specifies the variable that contains the frequency of occurrence for each observation. |
| modelTable | Specifies the input table containing the model to be assessed, which can be an analytic store or DATA step scoring code. |
| modelTables | Specifies multiple input tables containing model components, typically used with DS2 code. |
| modelTableType | Specifies the type of scoring model provided: ASTORE, DATASTEP, or NONE. Default is ASTORE. |
| nBins | Specifies the number of bins to use for lift calculations. Default is 20. |
| predictedVariables | Specifies the list of variables that contain the model's predictions. |
| referenceLevel | Specifies the reference level for the sensitive variable, which acts as the baseline for comparison. |
| response | Specifies the response (target) variable. |
| responseLevels | Specifies the list of formatted values for the response variable. |
| rocStep | Specifies the step size for Receiver Operating Characteristic (ROC) calculations. Default is 0.05. |
| scoredTable | Specifies the output table to store the scored results. |
| selectionDepth | Specifies the depth to use in lift calculations. Default is 10. |
| sensitiveVariable | Specifies the sensitive variable (e.g., gender, race) to use for bias assessment. |
| table | Specifies the input data table for assessment. |
| weight | Specifies the variable that contains observation weights. |
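The model-related parameters in this list (modelTable, modelTableType, predictedVariables, responseLevels) are meant to be used together when the action scores the data itself. The following is a sketch under assumptions, not a validated call: it uses only parameter names from the syntax above, `hmeq_gb_astore` is a hypothetical analytic-store table (for example, one saved by a SAVESTATE statement during training), `hmeq` is assumed to be an unscored table in the active caslib, and an active CAS session is required.

```sas
proc cas;
   /* Sketch only: 'hmeq_gb_astore' is a hypothetical analytic store table, */
   /* and 'hmeq' is assumed to be an unscored input table in the active caslib. */
   fairAITools.assessBias /
      table={name='hmeq'},
      modelTable={name='hmeq_gb_astore'},
      modelTableType='ASTORE',
      response={name='BAD'},
      responseLevels={'1', '0'},
      /* Assumed to line up one-to-one with responseLevels */
      predictedVariables={{name='P_BAD1'}, {name='P_BAD0'}},
      event='1',
      sensitiveVariable={name='JOB'};
run;
quit;
```

Passing the model rather than a pre-scored table keeps scoring and assessment in one step, which may matter when the input table is large.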
Data Preparation
Data Creation for Bias Assessment

This example first loads the `HMEQ` dataset, which contains home equity loan data. Then, a gradient boosting model is trained to predict loan defaults (`BAD`). The model's predictions are saved as `P_BAD1` and `P_BAD0`. This scored table, `HMEQ_SCORED`, will be used as input for the bias assessment.

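/* Start a CAS session and map the CASUSER caslib to the MYCAS libref */
CAS;
LIBNAME mycas CAS caslib=casuser;

/* Load the sample HMEQ data set into CAS */
PROC CASUTIL;
   LOAD DATA=sampsio.hmeq OUTCASLIB="casuser" CASOUT="hmeq" REPLACE;
QUIT;

/* Train the model; the OUTPUT table is expected to hold P_BAD1 and P_BAD0 */
PROC GRADBOOST DATA=mycas.hmeq seed=12345;
   INPUT LOAN MORTDUE VALUE YOJ DEROG DELINQ CLAGE NINQ CLNO DEBTINC / level=interval;
   INPUT REASON JOB / level=nominal;
   TARGET BAD / level=nominal;
   OUTPUT out=mycas.hmeq_scored copyvars=(_all_);
RUN;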

Examples

This example performs a basic bias assessment on a pre-scored table. It uses the `JOB` variable as the sensitive attribute and `BAD` as the response variable. The model's predicted probabilities for the event '1' are in the `P_BAD1` variable.

PROC CAS;
   /* modelTableType='NONE': the table already holds predictions, so no model is scored */
   fairAITools.assessBias /
      table={name='hmeq_scored'},
      modelTableType='NONE',
      response={name='BAD'},
      sensitiveVariable={name='JOB'},
      predictedVariables={{name='P_BAD1'}},
      event='1';
RUN;
QUIT;

This example demonstrates a more detailed bias assessment. It explicitly defines 'Other' as the reference level for the `JOB` sensitive variable, so the other job categories are compared against it. It also sets a custom probability cutoff of 0.6 for the confusion matrix and uses the scoredTable parameter to write the scored results to a CAS table named `BIAS_ASSESSMENT_RESULTS`.

PROC CAS;
   /* modelTableType='NONE': the table already holds predictions, so no model is scored */
   fairAITools.assessBias /
      table={name='hmeq_scored'},
      modelTableType='NONE',
      response={name='BAD'},
      sensitiveVariable={name='JOB'},
      predictedVariables={{name='P_BAD1'}},
      event='1',
      referenceLevel='Other',   /* baseline group for comparisons */
      cutoff=0.6,               /* event threshold for the confusion matrix */
      scoredTable={name='BIAS_ASSESSMENT_RESULTS', replace=true};
RUN;
QUIT;
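To show what a reference level and a custom cutoff do conceptually, here is a small Base SAS sketch that computes each group's predicted event rate at a 0.6 cutoff and its gap relative to 'Other'. It is an illustration only, not the computation assessBias performs; the data set `work.toy_jobs`, its values, and the `>=` comparison are assumptions.

```sas
/* Hypothetical toy data: job category and predicted probability of the event */
data work.toy_jobs;
   input job $ p_bad1;
   datalines;
Other 0.72
Other 0.41
Other 0.65
Other 0.20
Sales 0.81
Sales 0.66
Sales 0.58
Sales 0.63
Mgr 0.30
Mgr 0.62
Mgr 0.15
Mgr 0.44
;
run;

proc sql;
   /* Share of observations classified as an event at the 0.6 cutoff, per group */
   create table work.rates as
   select job, mean(p_bad1 >= 0.6) as pred_rate
   from work.toy_jobs
   group by job;

   /* Gap of each group relative to the reference level 'Other' */
   create table work.gaps as
   select r.job,
          r.pred_rate,
          r.pred_rate - base.pred_rate as gap_vs_reference
   from work.rates as r,
        (select pred_rate from work.rates where job = 'Other') as base
   order by r.job;

   select * from work.gaps;
quit;
```

With this toy data the rates are 0.5 for Other, 0.75 for Sales, and 0.25 for Mgr, so the gaps relative to the reference are +0.25 and -0.25. Changing the reference level shifts every gap but leaves the per-group rates unchanged.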

FAQ

What is the purpose of the fairAITools.assessBias action?
What is the 'code' parameter used for in the assessBias action?
How is the 'cutoff' parameter used in the assessBias action?
What does the 'event' parameter signify?
How can I specify frequency values for the analysis?
What is the purpose of the 'modelTable' parameter?
When should I use the 'modelTables' parameter?
What are the possible values for the 'modelTableType' parameter?
What does the 'nBins' parameter control?
How do I specify the model's prediction variables?
What is the 'referenceLevel' parameter for?
How is the response or target variable specified?
What is the 'responseLevels' parameter?
What does the 'rocStep' parameter do?
How can I save the scored outputs?
What is the 'selectionDepth' parameter?
Which parameter is required for specifying the sensitive variable?
How do I specify the input data table for the assessBias action?

Associated Scenarios

Use Case
Standard Case: Assessing Gender Bias in a Loan Approval Model

A retail bank has developed a machine learning model to predict the likelihood of loan default. To comply with fair lending regulations, the bank needs to assess whether the mod...

Use Case
Performance Case: Bias Assessment on a Large Dataset with an ASTORE Model

An insurance company uses a gradient boosting model (ASTORE) to flag potentially fraudulent claims. They need to ensure the model is not unfairly flagging claims from certain ge...

Use Case
Edge Case: Handling Missing Data and Weights in Patient Readmission Model

A healthcare provider wants to assess a model that predicts patient readmission within 30 days. The goal is to check for bias related to the patient's preferred language. The da...