copula copulaFit

High-Volume Sensor Anomaly Detection with Clayton Copula

Test Scenario & Use Case

Business Context

A manufacturing plant monitors the health of 500,000 IoT sensors. The plant needs to detect synchronized failures, in which multiple sensor readings (temperature, vibration) drop simultaneously; statistically, this is lower-tail dependence. Because data streams in at high volume and the model must be refreshed frequently, the engineering team needs a fast estimation method (calibration, method='CAL') rather than the more computationally expensive maximum likelihood estimation (MLE).
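
For reference, the standard bivariate Clayton formulas show why this copula suits the "readings drop together" scenario. The last line is a sketch of a calibration estimator under the assumption that method='CAL' calibrates on Kendall's tau, as the same-named option does in PROC COPULA; the source does not spell this out. The symbols $\theta$, $\tau$, $\lambda_L$ and $\lambda_U$ are introduced here for illustration.

```latex
% Bivariate Clayton copula, \theta > 0
C_\theta(u, v) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}

% Tail dependence: present in the lower tail only
\lambda_L = 2^{-1/\theta}, \qquad \lambda_U = 0

% Kendall's tau and the resulting closed-form calibration estimate
\tau = \frac{\theta}{\theta + 2}
\quad\Longrightarrow\quad
\hat{\theta} = \frac{2\,\hat{\tau}}{1 - \hat{\tau}}
```

For example, $\theta = 2$ gives $\tau = 0.5$ and $\lambda_L = 2^{-1/2} \approx 0.707$. Under the stated assumption, calibration is cheap because it only needs a rank-correlation estimate and no iterative likelihood optimization.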

Data Preparation

Generate a large dataset (500,000 observations) of simulated sensor readings with induced lower-tail dependence.

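    /* Simulate 500,000 paired sensor readings with lower-tail dependence */
    data mycas.sensor_data;
       call streaminit(99);                /* fixed seed for reproducibility */
       theta = 2;                          /* dependence strength used by the generator */
       do i = 1 to 500000;
          u = rand('Uniform');
          v = rand('Uniform');
          if u > 0 then do;                /* guard log(u) */
             t    = (-log(u))**(1/theta);  /* shared component: large when u is small */
             temp = (1 + t)**(-1/theta);
             vib  = (1 + t + (-log(v))**(1/theta))**(-1/theta);
             output;
          end;
       end;
    run;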
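
The generator above induces dependence through the shared term t, but it is not an exact draw from a Clayton copula. If an exact sample is preferred, for instance to check that the fitted Theta recovers a known value, a minimal sketch using conditional inversion could look like the following. The table name sensor_data_exact and the variable w are hypothetical and do not appear in the original scenario.

```sas
/* Sketch: exact bivariate Clayton sample via conditional inversion. */
/* Assumption: sensor_data_exact is an illustrative table name.      */
data mycas.sensor_data_exact;
   call streaminit(99);
   theta = 2;                               /* true Clayton parameter */
   do i = 1 to 500000;
      u = rand('Uniform');
      w = rand('Uniform');                  /* conditional probability level */
      temp = u;
      /* Solve C(v | u) = w for v under the Clayton copula */
      vib = (u**(-theta) * (w**(-theta/(1 + theta)) - 1) + 1)**(-1/theta);
      output;
   end;
   keep temp vib;
run;
```

Both columns are uniform on (0,1) by construction, and the population Kendall's tau is theta/(theta + 2) = 0.5 for theta = 2.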

Implementation Steps

1. Run the copula fit with the 'CLAYTON' copula type (sensitive to lower-tail dependence) and the 'CAL' (calibration) method, chosen for speed on the large dataset.
    proc cas;
       /* Fit a Clayton copula to temp and vib using the calibration method */
       copula.copulaFit /
          table={name='sensor_data'},
          var={'temp', 'vib'},
          copulatype='CLAYTON',
          method='CAL',
          timingReport={summary=true};
    run;
    quit;

Expected Result


The calibration fit should complete significantly faster than an equivalent MLE fit. The output reports the estimated Theta parameter, which measures the strength of the lower-tail dependence between temp and vib. The Timing Report shows how long the calibration method took on the large dataset.
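
To check the speed claim directly, one option is to rerun the same action with the MLE method and compare the two timing reports. This is a sketch that assumes method='MLE' is accepted by copulaFit, as the comparison in the text implies; every other parameter is copied unchanged from the step above.

```sas
/* Sketch: same fit with MLE, for a timing comparison against method='CAL' */
proc cas;
   copula.copulaFit /
      table={name='sensor_data'},
      var={'temp', 'vib'},
      copulatype='CLAYTON',
      method='MLE',                      /* assumption: MLE counterpart of 'CAL' */
      timingReport={summary=true};
run;
quit;
```

Comparing the two Theta estimates alongside the timings also shows how much accuracy, if any, is traded for speed on this dataset.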