bayesianNetClassifier bnet

Edge Case & Robustness: Predictive Maintenance with Missing Sensor Data

Test Scenario & Use Case

Business Context

An industrial manufacturing plant wants to predict equipment failure using sensor data. However, due to network issues and sensor malfunctions, the data is often incomplete. This scenario tests the model's robustness and its ability to handle missing values in both interval and nominal variables through imputation.

About the Set: bayesianNetClassifier

Classification using Bayesian networks.

Data Preparation

Create a sensor dataset with a substantial share of missing data. The interval variables ('Temperature', 'Pressure', 'Vibration') and the nominal variable ('SensorHealth') all contain missing values, written as '.' in numeric fields and as empty fields in character fields.

DATA casuser.sensor_data;
   /* 'High'/'Low' fit in 4 characters, 'Normal'/'Alert' in 6 */
   LENGTH Failure_Risk $4 SensorHealth $6;
   /* DSD reads two consecutive commas as a missing field instead of merging them */
   INFILE DATALINES DSD DELIMITER=',';
   INPUT MachineID $ Temperature Pressure Vibration SensorHealth $ Failure_Risk $;
   DATALINES;
M01,250.5,15.2,0.5,Normal,Low
M02,310.1,.,0.8,Normal,High
M03,245.8,14.9,1.2,,High
M04,.,15.5,0.4,Normal,Low
M05,295.0,19.8,0.7,Alert,High
M06,260.3,16.1,0.5,Normal,Low
M07,330.0,21.0,.,,High
M08,.,.,0.6,Alert,Low
;
RUN;
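Before training, it can help to confirm how much data is actually missing in each column. The following is a minimal sketch, assuming the `casuser` libref is already assigned to the CAS session (as the DATA step implies); it uses only the standard PROC MEANS and PROC FREQ procedures and is not part of the original scenario.

```sas
/* Non-missing count, missing count, and mean for each interval variable. */
/* The mean is the value that mean imputation would substitute.           */
proc means data=casuser.sensor_data n nmiss mean;
   var Temperature Pressure Vibration;
run;

/* Level frequencies for the nominal input. The MISSING option lists      */
/* blank values as their own row; the most frequent level is the mode.    */
proc freq data=casuser.sensor_data;
   tables SensorHealth / missing;
run;
```

If the data lines are read as intended, the counts should show two missing values each for 'Temperature' and 'Pressure', one for 'Vibration', and two for 'SensorHealth', with 'Normal' as the most frequent level.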

Implementation Steps

1
Train a Parent-Child (PC) structure model. Configure the action to impute missing interval variables with the mean and missing nominal variables with the mode.
PROC CAS;
   bayesianNetClassifier.bnet /
      table={name='sensor_data'},
      target='Failure_Risk',
      inputs={'Temperature', 'Pressure', 'Vibration', 'SensorHealth'},
      nominals={'SensorHealth', 'Failure_Risk'},
      structures={'PC'},                 /* parent-child structure */
      missingInt='IMPUTE',               /* interval inputs: impute with the mean */
      missingNom='IMPUTE',               /* nominal inputs: impute with the mode */
      alpha=0.1,
      output={casout={name='sensor_scored', replace=true}, copyVars={'MachineID'}},
      display={'MissInfo', 'NetInfo'};
RUN;
2
Check the 'MissInfo' table in the results to confirm that imputation was applied to the specified variables.
/* The 'MissInfo' table is displayed directly in the results of the previous step; review it manually. */
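The scored table can be checked in the same pass. This is a sketch only, assuming `sensor_scored` was written to the session's active caslib. It relies on the standard `table.recordCount` and `table.fetch` actions and makes no assumption about the names of the prediction columns, which the scenario does not list.

```sas
proc cas;
   /* The row count should equal the 8 input rows if every machine was scored */
   table.recordCount / table={name='sensor_scored'};

   /* Display the scored rows, including machines whose inputs had missing values */
   table.fetch / table={name='sensor_scored'}, to=8;
run;
quit;
```

Rows M02, M03, M04, M07, and M08 are the ones to look at, since each had at least one missing input in the source data.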

Expected Result


The action must run without errors, demonstrating that it can handle missing values in both interval and nominal inputs. The 'MissInfo' table in the results should confirm that the inputs containing missing values ('Temperature', 'Pressure', 'Vibration', and 'SensorHealth') were imputed. A scored output table 'sensor_scored' should be created with predictions for all eight input rows, including those that originally had missing values. Together, these checks confirm that the imputation settings were applied and that no rows were dropped because of missing data.