Statistical CREATION_INTERNE

Descriptive Analysis of Numerical Variables

The script begins by creating a SAS dataset named `heights`. The dataset is populated directly in the SAS code with a `DATALINES` statement, which supplies 50 observations of seven numeric variables. `PROC UNIVARIATE` then produces a detailed descriptive analysis of the `lskew` variable from the `heights` dataset, including moments, quantiles, and extreme observations. The `PLOT` option adds a set of diagnostic plots: a stem-and-leaf plot or horizontal bar chart (a horizontal histogram when ODS Graphics is enabled), a box plot, and a normal probability plot. Together these support a visual assessment of the variable's distribution. The script also contains interpretive comments highlighting the non-normality and left-skewness of the `lskew` variable's distribution.
Data Analysis

Type : CREATION_INTERNE


The `heights` dataset is created directly within the SAS script via a `DATA` step using `datalines` for raw data input. It does not depend on any external data sources (files, databases) or default SAS libraries like SASHELP.

1 Code Block
DATA STEP Data
Explanation :
This code block uses a `DATA` step to create a new SAS dataset named `heights`. The `INPUT` statement defines the variables `id`, `uniform`, `actual`, `tails`, `middle`, `lskew`, and `rskew`. Data for these variables is provided directly in the script using the `DATALINES` statement, meaning the data is embedded within the SAS program itself.
/* Build the heights dataset from inline data; no external files are needed. */
DATA heights;
  /* Seven numeric variables, read with list input (blank-delimited). */
  INPUT id uniform actual tails middle lskew rskew;
  DATALINES;
01 71 75 62 62 65 67
02 71 72 62 63 65 68
03 71 72 62 65 65 70
04 71 72 62 66 64 68
05 71 72 62 67 62 67
06 71 74 62 69 69 69
07 71 69 62 69 65 68
08 71 69 62 69 68 70
09 71 75 62 69 68 69
10 71 75 62 70 70 70
11 71 69 62 70 66 68
12 71 72 62 70 70 67
13 71 72 62 70 67 69
14 71 75 62 70 68 68
15 71 67 62 70 65 67
16 71 67 62 71 72 70
17 71 73 62 71 70 70
18 71 69 62 71 72 70
19 71 67 62 71 72 71
20 71 67 62 71 72 70
21 71 75 64 71 71 72
22 71 72 64 71 71 71
23 71 72 64 71 72 72
24 71 75 66 71 70 71
25 71 70 66 71 71 72
26 71 70 76 71 70 70
27 71 72 76 71 71 72
28 71 73 78 71 72 70
29 71 68 78 71 72 71
30 71 71 78 71 70 72
31 71 67 80 71 71 72
32 71 70 80 71 71 71
33 71 69 80 71 71 71
34 71 64 80 71 71 71
35 71 66 80 71 72 71
36 71 66 80 72 74 75
37 71 65 80 72 71 73
38 71 64 80 72 72 76
39 71 63 80 72 73 72
40 71 66 80 72 73 77
41 71 78 80 72 74 75
42 71 76 80 73 74 75
43 71 78 80 73 74 77
44 71 76 80 73 72 72
45 71 78 80 73 74 74
46 71 77 80 74 74 79
47 71 77 80 74 72 78
48 71 71 80 75 74 79
49 71 80 80 76 71 78
50 71 62 80 79 73 80
;
RUN;
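Before analysing the data, it can help to confirm that the `DATA` step read the inline records as intended. The following is a minimal sketch, assuming the `heights` dataset was created in the `WORK` library by the step above; it is not part of the original script. `PROC CONTENTS` should report 50 observations and 7 numeric variables, and `PROC PRINT` lists the first few rows for a visual check.

```sas
/* Sketch only (not in the original script): sanity-check the new dataset. */

/* Variable names, types, and the observation count. */
PROC CONTENTS DATA=heights;
RUN;

/* First five observations; NOOBS hides the automatic row counter. */
PROC PRINT DATA=heights(OBS=5) NOOBS;
RUN;
```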
2 Code Block
PROC UNIVARIATE
Explanation :
This `PROC UNIVARIATE` step generates descriptive statistics and plots for the `lskew` variable from the `heights` dataset. The `PLOT` option requests a stem-and-leaf plot or horizontal bar chart (a horizontal histogram when ODS Graphics is enabled), a box plot, and a normal probability plot, allowing a visual assessment of the distribution's shape, symmetry, extreme values, and departure from normality.
/* Descriptive statistics and diagnostic plots for lskew. */
PROC UNIVARIATE DATA=heights PLOT;
  VAR lskew;
RUN;
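The `PLOT` option returns a fixed set of plots. When a standalone histogram, a Q-Q plot, and formal normality tests are wanted, they can be requested explicitly. The following is a minimal sketch, not part of the original script, assuming ODS Graphics is available in the session and the `heights` dataset already exists. The `NORMAL` option on the `PROC` statement adds tests for normality, and the `HISTOGRAM` and `QQPLOT` statements each overlay a normal reference using parameters estimated from the data.

```sas
/* Sketch only (not in the original script): explicit graphs and normality tests. */
ODS GRAPHICS ON;

PROC UNIVARIATE DATA=heights NORMAL;          /* NORMAL: tests for normality */
  VAR lskew;
  HISTOGRAM lskew / NORMAL;                   /* histogram with fitted normal curve */
  QQPLOT lskew / NORMAL(MU=EST SIGMA=EST);    /* Q-Q plot with estimated reference line */
RUN;
```

For a left-skewed variable such as `lskew`, the points on the Q-Q plot would be expected to bend away from the reference line in the lower tail, and the reported skewness would be expected to be negative.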
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : This code is posted for your benefit; however, I highly recommend that you practice typing your own SAS programs as well. With the SAS programming language, as with all new languages, immersion seems to be the best way to learn.