Manufacturing Quality Control with Categorical Outcome (TMLE)

Business Context

A factory wants to determine whether the 'HighSpeed' machine setting causes more defects than the 'Normal' setting. The outcome is binary (Defect: Yes/No). They use Targeted Maximum Likelihood Estimation (TMLE), which starts from an outcome-model prediction and applies a targeting step to reduce bias in the estimated effect on this critical binary metric.

About the Action Set: causalanalysis

Causal inference analysis and effect estimation.

Data Preparation

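A small simulated dataset of 200 batches with a binary outcome (defect_flag, 1/0), a two-level categorical treatment (setting: 'HighSpeed' or 'Normal'), and two covariates (temp and humidity). The setting is assigned at random with probability 0.5, independently of the covariates.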

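/* mycas is assumed to be a CAS libref that is already assigned */
DATA mycas.factory_qc;
  call streaminit(777);                    /* fixed seed for reproducibility */
  DO batch = 1 to 200;
    temp = rand('NORMAL', 100, 5);         /* mean 100, standard deviation 5 */
    humidity = rand('UNIFORM', 30, 60);
    /* Randomized treatment: about half the batches run at HighSpeed */
    IF rand('UNIFORM') > 0.5 THEN setting = 'HighSpeed';
    ELSE setting = 'Normal';
    /* True outcome model: HighSpeed adds 2 to the log-odds of a defect */
    logit_defect = -5 + 0.05*temp + 2*(setting='HighSpeed');
    prob_defect = 1 / (1 + exp(-logit_defect));
    IF rand('UNIFORM') < prob_defect THEN defect_flag = 1;
    ELSE defect_flag = 0;
    OUTPUT;
  END;
RUN;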
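
For reference, the DATA step above draws defects from a logistic model, which can be written out as follows. Because setting is randomized and temp is symmetric around 100, the implied marginal defect risks can be worked out by hand, giving a rough benchmark for the estimate. The symbols Z and expit are notation introduced here for illustration and do not appear in the original code; the numeric values are approximations.

```latex
\begin{align*}
\operatorname{logit} P(\text{defect\_flag}=1 \mid \text{temp}, \text{setting})
  &= -5 + 0.05\,\text{temp} + 2\cdot\mathbf{1}\{\text{setting}=\text{HighSpeed}\} \\
% temp ~ N(100, 5^2), so Z = 0.05 temp - 5 ~ N(0, 0.25^2)
P(\text{defect} \mid \text{Normal}) &= E[\operatorname{expit}(Z)] = 0.5
  \quad \text{(by symmetry of } Z \text{ around } 0\text{)} \\
P(\text{defect} \mid \text{HighSpeed}) &= E[\operatorname{expit}(2 + Z)] \approx 0.88 \\
\text{true risk difference} &\approx 0.88 - 0.5 = 0.38
\end{align*}
```

With only 200 batches, any single estimate will scatter around this value, so the benchmark is a plausibility check rather than an exact target.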

Implementation Steps

Fit a gradient boosting model to predict the binary defect flag (standing in for a complex outcome model), then score the table so that each batch carries a predicted defect probability.

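PROC CAS;
  /* Train the outcome model. The numeric 0/1 target is listed in nominals
     so that it is modeled as a class rather than as an interval target. */
  decisionTree.gbtreeTrain /
    table={name='factory_qc'}, target='defect_flag',
    inputs={'setting', 'temp', 'humidity'},
    nominals={'setting', 'defect_flag'},
    casout={name='gb_model', replace=true};
  /* Score the same table; encodename=true yields names such as P_defect_flag1 */
  decisionTree.gbtreeScore /
    table={name='factory_qc'}, modelTable={name='gb_model'},
    casout={name='qc_scored', replace=true}, copyVars='ALL', encodename=true;
RUN;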

Run caEffect with method='TMLE' and a CATEGORICAL outcome.

PROC CAS;
  /* Estimate the potential outcome mean for each setting and their difference */
  causalanalysis.caEffect /
    table={name='qc_scored'},
    method='TMLE',
    treatVar={name='setting'},
    outcomeVar={name='defect_flag', type='CATEGORICAL', event='1'},
    outcomeModel={predName='P_defect_flag1'},
    pom={{trtLev='HighSpeed'}, {trtLev='Normal'}},
    difference={{evtLev='HighSpeed', refLev='Normal'}},
    inference=true;
RUN;

Expected Result

The action handles the 'CATEGORICAL' outcome type. Using TMLE, it estimates the probability of a defect (event='1') under each machine setting and reports the risk difference, HighSpeed minus Normal, as the causal effect. With inference=true, the estimates are reported with confidence intervals.
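
To make the reported quantities concrete, the estimand and a textbook form of the TMLE targeting step for a binary outcome can be sketched as below. This is a generic sketch under stated assumptions, not a description of caEffect's internals: Y stands for defect_flag, A for setting, W for the covariates (temp, humidity), Q-bar for the initial outcome-model prediction, and g for the treatment assignment probability. All of these symbols are introduced here for illustration.

```latex
\begin{align*}
% Estimand: difference of potential outcome means (a risk difference)
\psi &= E\big[Y^{\text{HighSpeed}}\big] - E\big[Y^{\text{Normal}}\big] \\
% Initial outcome model and treatment model
\bar{Q}(A, W) &= \hat{P}(Y = 1 \mid A, W), \qquad g(a \mid W) = P(A = a \mid W) \\
% Clever covariate for treatment level a
H_a(A, W) &= \frac{\mathbf{1}\{A = a\}}{g(a \mid W)} \\
% Targeting step: epsilon_a is fit by logistic regression of Y on H_a
% with logit(Q-bar) as a fixed offset
\operatorname{logit} \bar{Q}^{*}(A, W) &= \operatorname{logit} \bar{Q}(A, W) + \sum_{a} \varepsilon_a H_a(A, W) \\
% Plug-in estimates from the updated predictions
\hat{\psi}_a &= \frac{1}{n} \sum_{i=1}^{n} \bar{Q}^{*}(a, W_i), \qquad
\hat{\psi} = \hat{\psi}_{\text{HighSpeed}} - \hat{\psi}_{\text{Normal}}
\end{align*}
```

In this simulated dataset the setting is randomized, so g(a | W) is 0.5 for both levels by design.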
