assessBias - WeAreCAS

Q: What is the 'code' parameter used for in the assessBias action?

The 'code' parameter specifies the DATA step code that describes the model or specifies the DS2 code that is used along with an analytic store that you specify in the modelTable or modelTables parameter.

Q: How is the 'cutoff' parameter used in the assessBias action?

The 'cutoff' parameter specifies the cutoff for the confusion matrix. The default value is 0.5 and the range is (0, 1).

Q: What does the 'event' parameter signify?

The 'event' parameter specifies the formatted value of the response (target) variable that represents the event of interest.

Q: How can I specify frequency values for the analysis?

You can use the 'frequency' parameter to specify the variable that contains frequency values.

Q: What is the purpose of the 'modelTable' parameter?

The 'modelTable' parameter specifies the input table that contains the model to assess. This table must contain an analytic store or DATA step scoring code.

Q: When should I use the 'modelTables' parameter?

Use the 'modelTables' parameter to specify the input tables that contain the model to assess when the model is composed of several analytic stores and requires accompanying DS2 code, which you supply in the 'code' parameter.

Q: What are the possible values for the 'modelTableType' parameter?

The 'modelTableType' parameter specifies the type of scoring the model table contains. It can be 'ASTORE', 'DATASTEP', or 'NONE'. The default is 'ASTORE'.

Q: What does the 'nBins' parameter control?

The 'nBins' parameter specifies the number of bins to use in lift calculations. The default is 20, and the range is from 2 to 100.

Q: How do I specify the model's prediction variables?

Use the 'predictedVariables' parameter, which is a required list of variables that contain the model's predictions. The order of variables must match the order in the 'responseLevels' parameter.
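As a rough illustration of the ordering rule, the sketch below pairs each prediction variable with its response level. The table and variable names (`hmeq_scored`, `BAD`, `JOB`, `P_BAD1`, `P_BAD0`) are borrowed from the HMEQ example further down this page. The use of modelTableType="NONE" for an already scored table is an assumption, not something this page confirms.

```sas
proc cas;
   loadactionset "fairAITools";   /* make the action set available in this session */

   fairAITools.assessBias /
      table={name="hmeq_scored"},
      modelTableType="NONE",      /* assumption: table is already scored, no model table passed */
      response={name="BAD"},
      sensitiveVariable={name="JOB"},
      /* first entry holds P(BAD=1), second entry holds P(BAD=0) */
      predictedVariables={{name="P_BAD1"}, {name="P_BAD0"}},
      /* levels listed in the same order as the variables above */
      responseLevels={"1", "0"},
      event="1";
quit;
```

Swapping the two entries in only one of the lists would attach each probability to the wrong level, so it is safest to edit both lists together.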

At a glance

The assessBias action in the fairAITools action set is a diagnostic tool for predictive models deployed on SAS Viya. It quantifies performance gaps across the levels of a sensitive attribute, so that developers can detect discriminatory patterns before a model affects real-world outcomes. This is useful both for meeting regulatory requirements and for improving model transparency. The FAQ on this page walks through each parameter and how to use it.

Description

The assessBias action calculates bias metrics for predictive models. This is a crucial step in ensuring fairness in artificial intelligence by identifying whether a model produces different outcomes for different subgroups, particularly those defined by sensitive variables like race or gender. The action can handle models saved as analytic stores (ASTORE) or as SAS DATA step code.

fairAITools.assessBias {
    code="string",
    cutoff=double,
    event="string",
    frequency={casvardesc},
    modelTable={castable},
    modelTables={{castable-1} <, {castable-2}, ...>},
    modelTableType="ASTORE" | "DATASTEP" | "NONE",
    nBins=64-bit-integer,
    predictedVariables={{casvardesc-1} <, {casvardesc-2}, ...>},
    referenceLevel="string",
    response={casvardesc},
    responseLevels={"string-1" <, "string-2", ...>},
    rocStep=double,
    scoredTable={casouttable},
    selectionDepth=64-bit-integer,
    sensitiveVariable={casvardesc},
    table={castable},
    weight={casvardesc}
};

Settings

Parameter	Description
code	Specifies the DATA step code that describes the model or the DS2 code used with an analytic store.
cutoff	Specifies the probability cutoff for classifying an observation as an event in the confusion matrix. Default is 0.5; the range is (0, 1).
event	Specifies the formatted value of the response variable that represents the event of interest.
frequency	Specifies the variable that contains the frequency of occurrence for each observation.
modelTable	Specifies the input table containing the model to be assessed, which can be an analytic store or DATA step scoring code.
modelTables	Specifies multiple input tables containing model components, typically used with DS2 code.
modelTableType	Specifies the type of scoring model provided: ASTORE, DATASTEP, or NONE. Default is ASTORE.
nBins	Specifies the number of bins to use for lift calculations. Default is 20; the range is 2 to 100.
predictedVariables	Specifies the list of variables that contain the model's predictions.
referenceLevel	Specifies the reference level for the sensitive variable, which acts as the baseline for comparison.
response	Specifies the response (target) variable.
responseLevels	Specifies the list of formatted values for the response variable.
rocStep	Specifies the step size for Receiver Operating Characteristic (ROC) calculations. Default is 0.05.
scoredTable	Specifies the output table to store the scored results.
selectionDepth	Specifies the depth to use in lift calculations. Default is 10.
sensitiveVariable	Specifies the sensitive variable (e.g., gender, race) to use for bias assessment.
table	Specifies the input data table for assessment.
weight	Specifies the variable that contains observation weights.
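To show how the numeric settings in the table combine in one call, here is a minimal sketch that overrides the defaults for cutoff, nBins, rocStep and selectionDepth. The table and variable names follow the HMEQ example used later on this page. The chosen values are purely illustrative, and the valid ranges for rocStep and selectionDepth are not stated here, so treat those two values as assumptions. The modelTableType="NONE" setting for an already scored table is also an assumption.

```sas
proc cas;
   loadactionset "fairAITools";   /* make the action set available in this session */

   fairAITools.assessBias /
      table={name="hmeq_scored"},
      modelTableType="NONE",          /* assumption: table is already scored */
      response={name="BAD"},
      sensitiveVariable={name="JOB"},
      predictedVariables={{name="P_BAD1"}},
      event="1",
      cutoff=0.4,                     /* default 0.5, must lie in (0, 1) */
      nBins=10,                       /* default 20, allowed range 2 to 100 */
      rocStep=0.01,                   /* default 0.05; illustrative value */
      selectionDepth=20;              /* default 10; illustrative value */
quit;
```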

Data Preparation

Data Creation for Bias Assessment

This example first loads the `HMEQ` dataset, which contains home equity loan data. Then, a gradient boosting model is trained to predict loan defaults (`BAD`). The model's predictions are saved as `P_BAD1` and `P_BAD0`. This scored table, `HMEQ_SCORED`, will be used as input for the bias assessment.

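/* Start a CAS session (skip if one is already active) and bind a libref to CASUSER */
cas;
libname mycas cas caslib=casuser;

/* Load the sample HMEQ data set into CAS as casuser.hmeq */
proc casutil;
   load data=sampsio.hmeq outcaslib="casuser" casout="hmeq" replace;
quit;

/* Train the model; the scored output carries the default P_BAD1 and P_BAD0 columns */
proc gradboost data=mycas.hmeq seed=12345;
   input LOAN MORTDUE VALUE YOJ DEROG DELINQ CLAGE NINQ CLNO DEBTINC / level=interval;
   input REASON JOB / level=nominal;
   target BAD / level=nominal;
   output out=mycas.hmeq_scored copyvars=(_all_);
run;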

Examples

Basic Bias Assessment

This example performs a basic bias assessment on a pre-scored table. It uses the `JOB` variable as the sensitive attribute and `BAD` as the response variable. The model's predicted probabilities for the event '1' are in the `P_BAD1` variable.

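PROC CAS;
   loadactionset 'fairAITools';   /* load the action set before calling it */
   fairAITools.assessBias /       /* the slash separates the action name from its parameters */
      table={name='hmeq_scored'},
      modelTableType='NONE',      /* the table is already scored, so no model table is passed */
      response={name='BAD'},
      sensitiveVariable={name='JOB'},
      predictedVariables={{name='P_BAD1'}},
      event='1';
RUN;
QUIT;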
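The action's results can also be captured for inspection instead of only being printed to the output. The sketch below assumes the same `hmeq_scored` table and stores whatever the action returns in a CASL variable named `r` (an arbitrary name). Because this page does not list the names of the result tables, the example prints all of them rather than picking one.

```sas
proc cas;
   loadactionset "fairAITools";   /* make the action set available in this session */

   /* result=r stores the action's results in the CASL variable r */
   fairAITools.assessBias result=r /
      table={name="hmeq_scored"},
      modelTableType="NONE",      /* assumption: table is already scored */
      response={name="BAD"},
      sensitiveVariable={name="JOB"},
      predictedVariables={{name="P_BAD1"}},
      event="1";

   describe r;   /* show the structure of the returned results */
   print r;      /* print every returned table */
quit;
```

Running `describe` first is a convenient way to discover the table names before extracting a specific one.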

Detailed Bias Assessment with Reference Group and Output

This example demonstrates a more detailed bias assessment. It explicitly sets 'Other' as the reference level for the `JOB` sensitive variable, so the other job groups are compared against it. It also specifies a custom probability cutoff of 0.6 for the confusion matrix and saves the scored output to a CAS table named `BIAS_ASSESSMENT_RESULTS`.

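PROC CAS;
   loadactionset 'fairAITools';   /* load the action set before calling it */
   fairAITools.assessBias /       /* the slash separates the action name from its parameters */
      table={name='hmeq_scored'},
      modelTableType='NONE',      /* the table is already scored, so no model table is passed */
      response={name='BAD'},
      sensitiveVariable={name='JOB'},
      predictedVariables={{name='P_BAD1'}},
      event='1',
      referenceLevel='Other',     /* baseline group for the comparisons */
      cutoff=0.6,                 /* overrides the default of 0.5 */
      scoredTable={name='BIAS_ASSESSMENT_RESULTS', replace=true};
RUN;
QUIT;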
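Both examples above work on a table that already contains predictions. As the Description notes, the action can also handle a model saved as an analytic store. The following is a minimal sketch of that route, assuming the `mycas` libref and `hmeq` table from the data preparation step. The table name `hmeq_gb_astore` is hypothetical, and the sketch assumes the stored model emits the usual `P_BAD1` and `P_BAD0` columns when it scores the data.

```sas
/* Train the model and save it as an analytic store (table name is hypothetical) */
proc gradboost data=mycas.hmeq seed=12345;
   input LOAN MORTDUE VALUE YOJ DEROG DELINQ CLAGE NINQ CLNO DEBTINC / level=interval;
   input REASON JOB / level=nominal;
   target BAD / level=nominal;
   savestate rstore=mycas.hmeq_gb_astore;
run;

proc cas;
   loadactionset "fairAITools";   /* make the action set available in this session */

   fairAITools.assessBias /
      table={name="hmeq"},                    /* unscored input data */
      modelTable={name="hmeq_gb_astore"},     /* analytic store saved above */
      modelTableType="ASTORE",                /* the default, stated for clarity */
      response={name="BAD"},
      sensitiveVariable={name="JOB"},
      predictedVariables={{name="P_BAD1"}, {name="P_BAD0"}},
      responseLevels={"1", "0"},              /* same order as predictedVariables */
      event="1";
quit;
```

This keeps scoring and assessment in one step, which may be preferable when the scored table is not needed for anything else.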

FAQ

What is the purpose of the fairAITools.assessBias action?

What is the 'referenceLevel' parameter for?

How is the response or target variable specified?

What is the 'responseLevels' parameter?

What does the 'rocStep' parameter do?

How can I save the scored outputs?

What is the 'selectionDepth' parameter?

Which parameter is required for specifying the sensitive variable?

How do I specify the input data table for the assessBias action?

Associated Scenarios

Use Case

Standard Case: Assessing Gender Bias in a Loan Approval Model

A retail bank has developed a machine learning model to predict the likelihood of loan default. To comply with fair lending regulations, the bank needs to assess whether the mod...

Use Case

Performance Case: Bias Assessment on a Large Dataset with an ASTORE Model

An insurance company uses a gradient boosting model (ASTORE) to flag potentially fraudulent claims. They need to ensure the model is not unfairly flagging claims from certain ge...

Use Case

Edge Case: Handling Missing Data and Weights in Patient Readmission Model

A healthcare provider wants to assess a model that predicts patient readmission within 30 days. The goal is to check for bias related to the patient's preferred language. The da...

Related Actions

fairAITools

mitigateBias

The mitigateBias action uses the exponentiated gradient reduction algorithm t...
