
Descriptive Analysis I - Measures of Central Tendency

The script creates a 'height_and_weight_20' dataset from inline 'datalines'. The dataset holds four variables: ID, sex, height in inches (ht_in), and weight in pounds (wgt_lbs). It then runs 'PROC MEANS' to calculate descriptive statistics for height and weight (number of nonmissing values, mean, median, mode, standard deviation, minimum, maximum), displayed to two decimal places. Comments in the original source code point out two 'red flags': a missing value for weight and an impossible value (-69) for height.
Data Analysis

Type : INTERNAL_CREATION


Data is created directly in the script via a DATA STEP with `datalines`.

1 Code Block
DATA STEP Data
Explanation :
This DATA STEP block creates the 'height_and_weight_20' dataset by reading raw data provided in the `datalines`. It defines four variables: `id` (character), `sex` (character), `ht_in` (numeric for height in inches), and `wgt_lbs` (numeric for weight in pounds). Missing values (periods) are present for sex and weight, and a clearly erroneous value (-69) is included for height, which is noted as a 'red flag' in the original comments.
DATA height_and_weight_20;
  /* id and sex are character ($); ht_in and wgt_lbs are numeric */
  INPUT id $ sex $ ht_in wgt_lbs;
  DATALINES;
001 Male 71 190
002 Male 69 175
003 Female 64 130
004 Female 65 154
005 . 73 173
006 Male 69 182
007 Female 68 .
008 . 73 185
009 Female 71 157
010 Male 66 155
011 Male 71 213
012 Female 69 151
013 Female 66 147
014 Female 68 196
015 Male 75 212
016 Female -69 190
017 Female 66 194
018 Female 65 176
019 Female 65 176
020 Female 65 102
;
RUN;
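Before summarizing, it can help to list the rows that carry the problems described above. The following is a minimal sketch, not part of the original script; the `ht_in <= 0` cutoff is an assumption chosen only to catch the obviously impossible height.

```sas
/* Sketch: list rows with a missing sex or weight, or a non-positive height. */
/* The ht_in <= 0 cutoff is an illustrative assumption, not from the source. */
PROC PRINT DATA=height_and_weight_20;
  WHERE MISSING(sex) OR MISSING(wgt_lbs) OR ht_in <= 0;
RUN;
```

With the data shown here, this should print the rows for IDs 005 and 008 (missing sex), 007 (missing weight), and 016 (height of -69).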
2 Code Block
PROC MEANS
Explanation :
This 'PROC MEANS' step calculates descriptive statistics for the variables `ht_in` and `wgt_lbs` in the `height_and_weight_20` dataset. The requested statistics are the number of nonmissing values (n), mean, median, mode, standard deviation (std), minimum (min), and maximum (max). The `maxdec=2` option limits the displayed statistics to two decimal places. Because `n` counts only nonmissing values, `wgt_lbs` reports 19 rather than 20, and the minimum of `ht_in` is -69; these are the two red flags in the output.
/* n counts nonmissing values only; maxdec=2 limits displayed decimals */
PROC MEANS DATA=height_and_weight_20 n mean median mode std min max maxdec=2;
  VAR ht_in wgt_lbs;
RUN;
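One possible follow-up, sketched here under stated assumptions, is to set the impossible height to missing and rerun the summary with `nmiss` added so the missing counts appear next to `n`. The dataset name `height_and_weight_clean` and the rule that a non-positive height is a data-entry error are both hypothetical; the original script does not include a cleaning step.

```sas
/* Sketch: copy the data, blanking out heights that cannot be real. */
/* Assumption: any non-positive height is a data-entry error.       */
DATA height_and_weight_clean;  /* hypothetical dataset name */
  SET height_and_weight_20;
  IF NOT MISSING(ht_in) AND ht_in <= 0 THEN ht_in = .;
RUN;

/* nmiss reports the number of missing values for each variable */
PROC MEANS DATA=height_and_weight_clean n nmiss mean median mode std min max maxdec=2;
  VAR ht_in wgt_lbs;
RUN;
```

Setting the value to missing, rather than guessing that -69 was meant to be 69, keeps the correction conservative: the statistics for `ht_in` are then based on 19 values, and the original dataset is left unchanged.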
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.