percentile boxPlot

Large-Scale IoT Sensor Data Monitoring using Iterative Method

Test Scenario & Use Case

Business Context

A smart factory uses thousands of IoT sensors to monitor machine temperatures in real time. To prevent overheating, the system must calculate the baseline temperature distribution for each production line across millions of data points, so performance is critical.
About the Action Set: percentile

Precise calculation of percentiles and quantiles.

Data Preparation

Creation of a large dataset (2 million rows) simulating temperature readings from sensors on different production lines.

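/* Assumes the casuser libref is already assigned to an active CAS session. */
data casuser.sensor_data (bufsize=1m);
    call streaminit(456);                      /* fixed seed for reproducibility */
    do line_id = 1 to 10;                      /* 10 production lines */
        do sensor_id = 1 to 100;               /* 100 sensors per line */
            do i = 1 to 2000;                  /* 2,000 readings per sensor: 2 million rows in total */
                base_temp = 70 + (line_id * 2.5);
                temperature = rand('NORMAL', base_temp, 1.5);
                output;
            end;
        end;
    end;
    keep line_id sensor_id temperature;
run;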
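
Before running the analysis, it can help to confirm that the simulated data looks as intended. The following is a minimal sketch, assuming the casuser libref is assigned to an active CAS session. For each production line, the reported N should equal 100 sensors times the number of readings per sensor, the mean should sit close to 70 + 2.5 × line_id, and the standard deviation should be close to 1.5.

```sas
/* Sanity check: row count, mean, and standard deviation per production line. */
/* Assumes the casuser libref is assigned to an active CAS session.           */
proc means data=casuser.sensor_data n mean std maxdec=2;
    class line_id;
    var temperature;
run;
```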

Implementation Steps

1. Load the large-scale sensor data into CAS.
proc casutil;
    load data=casuser.sensor_data outcaslib='casuser' casout='sensor_data' replace;
quit;
2. Run the boxPlot action with the 'ITERATIVE' method (the default) and percentile definition 5 (pctlDef=5, the same definition that Base SAS procedures such as PROC UNIVARIATE use by default) to test performance on large, grouped data.
proc cas;
    percentile.boxPlot /
        table={name='sensor_data', groupBy={'line_id'}},  /* one result set per production line */
        inputs={{name='temperature'}},
        method='ITERATIVE',                               /* approximate, scalable percentile method */
        pctlDef=5;
run;
quit;
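
Because the iterative method returns approximate percentiles, an exact reference can help judge how close the approximation is. The sketch below uses PROC UNIVARIATE with PCTLDEF=5 for that purpose. It assumes the casuser libref from the data preparation step is still assigned and that reading the table from the compute server is acceptable at this data size. The output data set name work.exact_quartiles and the variable names q1, median, and q3 are chosen here for illustration only.

```sas
/* Exact quartiles per production line, using percentile definition 5. */
/* work.exact_quartiles is an illustrative name, not part of the       */
/* original scenario.                                                  */
proc univariate data=casuser.sensor_data pctldef=5 noprint;
    class line_id;
    var temperature;
    output out=work.exact_quartiles q1=q1 median=median q3=q3;
run;

/* One row per line_id, for side-by-side comparison with the boxPlot results. */
proc print data=work.exact_quartiles noobs;
run;
```

Small differences between these values and the boxPlot output are to be expected, since one result is exact and the other is approximate.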

Expected Result


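The action should complete without errors and return approximate box-plot statistics, such as quartiles, for each of the 10 production lines. Because each line's readings are simulated around a mean of 70 + 2.5 × line_id with a standard deviation of 1.5, the medians should rise from about 72.5 for line 1 to about 95 for line 10. The results give a quick, scalable view of each line's operating temperature range and show how the iterative method behaves on large, grouped data.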
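
Since the readings are drawn from known normal distributions, the simulation parameters also give a theoretical reference for the quartiles. The following sketch computes those reference values with the QUANTILE function. The data set name work.expected_quartiles and its variable names are illustrative assumptions, and the values describe the underlying distributions rather than any particular sample. With a standard deviation of 1.5, the first and third quartiles lie roughly 1.01 degrees below and above each line's mean.

```sas
/* Theoretical quartiles implied by the simulation parameters:      */
/* mean = 70 + 2.5 * line_id, standard deviation = 1.5.             */
/* work.expected_quartiles is an illustrative name.                 */
data work.expected_quartiles;
    do line_id = 1 to 10;
        base_temp       = 70 + (line_id * 2.5);
        q1_expected     = quantile('NORMAL', 0.25, base_temp, 1.5);
        median_expected = base_temp;   /* the median of a normal distribution equals its mean */
        q3_expected     = quantile('NORMAL', 0.75, base_temp, 1.5);
        output;
    end;
run;

proc print data=work.expected_quartiles noobs;
run;
```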