The script creates a dataset named 'Color' in a DATA step, reading the data in-line via DATALINES. It defines four variables: 'Region' (numeric), 'Eyes' (eye color, character), 'Hair' (hair color, character) and 'Count' (numeric). Descriptive labels are assigned to 'Eyes', 'Hair' and 'Region' so that the output is easier to read. Three PROC FREQ steps then run on the 'Color' dataset, all with ORDER=FREQ so that the levels of 'Region' are listed by descending frequency. The first two steps request binomial proportion statistics for 'Region' with Agresti-Coull, Wilson and exact confidence limits at ALPHA=0.1 (90% confidence), for the first level (`level=1`) and the second level (`level=2`) respectively. Both use 'Count' as the weight variable, include an `exact binomial` statement for an exact test of the proportion, and set the same title. The third step requests the default binomial statistics for 'Region' with no method options and no WEIGHT statement, so each row counts once. Although the dataset carries eye and hair color, every analysis in the script is a binomial proportion for 'Region'; 'Eyes' and 'Hair' do not appear in any TABLES statement.
Data Analysis
Type : INTERNAL_CREATION
The 'Color' dataset is created and populated directly within the script via a DATA step and the DATALINES statement. All data required for the analysis is provided internally.
1 Code Block
DATA STEP Data
Explanation : This DATA STEP block creates the 'Color' dataset by reading raw data provided in DATALINES. It defines four variables: 'Region' (numeric), 'Eyes' (character string), 'Hair' (character string) and 'Count' (numeric). Descriptive labels are assigned to the 'Eyes', 'Hair' and 'Region' variables to improve the readability of output reports.
data Color;
/* Trailing @@ holds the input line so that all three records on each data line are read */
input Region Eyes $ Hair $ Count @@;
label Eyes ='Eye Color'
Hair ='Hair Color'
Region='Geographic Region';
datalines;
1 blue fair 23 1 blue red 7 1 blue medium 24
1 blue dark 11 1 green fair 19 1 green red 7
1 green medium 18 1 green dark 14 1 brown fair 34
1 brown red 5 1 brown medium 41 1 brown dark 40
1 brown black 3 0 blue fair 46 0 blue red 21
0 blue medium 44 0 blue dark 40 0 blue black 6
0 green fair 50 0 green red 31 0 green medium 37
0 green dark 23 0 brown fair 56 0 brown red 42
0 brown medium 53 0 brown dark 54 0 brown black 13
;
run;
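A quick check of the resulting dataset can confirm that the records were read as intended. The sketch below is not part of the original script; it uses PROC SQL to count the rows and sum 'Count' for each 'Region', and the column names `NumRows` and `TotalCount` are illustrative.

```sas
/* Sanity check (not in the original script): rows and summed Count per region */
proc sql;
   select Region,
          count(*)   as NumRows,
          sum(Count) as TotalCount
   from Color
   group by Region;
quit;
```

The DATALINES block lists 27 records, three per line. If the row counts add up to fewer than 27, the INPUT statement is picking up only the first record on each data line.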
2 Code Block
PROC FREQ
Explanation : This block runs PROC FREQ on the 'Color' dataset and produces a one-way frequency table for 'Region'. `order=freq` lists the levels by descending frequency, so `level=1` refers to the most frequent level. The `binomial(ac wilson exact level=1)` option requests Agresti-Coull, Wilson (score) and exact (Clopper-Pearson) confidence limits for the proportion of that level, and `alpha=.1` makes them 90% limits. The `exact binomial` statement requests an exact test of the binomial proportion. 'Count' is the weight variable, so each row contributes its 'Count' value to the frequencies. A title is also set for the output.
proc freq data=Color order=freq;
tables region / binomial(ac wilson exact level=1) alpha=.1 ;
exact binomial;
weight Count;
title 'Hair and Eye Color of European Children';
run;
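To make the three interval methods concrete, the following DATA step computes them by hand from the standard textbook formulas. It is a sketch rather than a reproduction of PROC FREQ's internals, and the values of `x` and `n` are assumptions (516 of 762, the 'Count' totals if all 27 records are read); substitute the frequency and total shown in your own PROC FREQ table.

```sas
/* Hand computation of 90% limits for a binomial proportion (illustrative) */
data _null_;
   x = 516;                  /* assumed: frequency of the level of interest */
   n = 762;                  /* assumed: total frequency                    */
   alpha = 0.1;
   z = probit(1 - alpha/2);  /* standard normal quantile, about 1.645       */
   p = x / n;

   /* Wilson (score) limits */
   w_center = (p + z**2 / (2*n)) / (1 + z**2 / n);
   w_half   = z * sqrt(p*(1-p)/n + z**2 / (4*n**2)) / (1 + z**2 / n);
   w_lower  = w_center - w_half;
   w_upper  = w_center + w_half;

   /* Agresti-Coull limits: Wald-type interval around an adjusted estimate */
   n_adj    = n + z**2;
   p_adj    = (x + z**2 / 2) / n_adj;
   ac_half  = z * sqrt(p_adj*(1-p_adj) / n_adj);
   ac_lower = p_adj - ac_half;
   ac_upper = p_adj + ac_half;

   /* Exact (Clopper-Pearson) limits via beta quantiles; assumes 0 < x < n */
   cp_lower = quantile('BETA', alpha/2,     x,     n - x + 1);
   cp_upper = quantile('BETA', 1 - alpha/2, x + 1, n - x);

   put 'Wilson:          ' w_lower 8.4 ' to ' w_upper 8.4;
   put 'Agresti-Coull:   ' ac_lower 8.4 ' to ' ac_upper 8.4;
   put 'Clopper-Pearson: ' cp_lower 8.4 ' to ' cp_upper 8.4;
run;
```

The limits are written to the SAS log and can be compared with the confidence limits table that PROC FREQ prints for the same level.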
3 Code Block
PROC FREQ
Explanation : This block repeats the previous analysis with one change: `level=2` in `binomial(ac wilson exact level=2)` requests the confidence limits and the test for the second level of 'Region' in the table's order, which under `order=freq` is the less frequent of the two levels. The 90% limits (`alpha=.1`), the `exact binomial` statement, the 'Count' weight and the title are unchanged.
proc freq data=Color order=freq;
tables region / binomial(ac wilson exact level=2) alpha=.1 ;
exact binomial;
weight Count;
title 'Hair and Eye Color of European Children';
run;
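Because `level=1` and `level=2` refer to positions in the table, their meaning depends on the ordering that `order=freq` produces. LEVEL= also accepts a quoted level value, which ties the analysis to a specific 'Region' value regardless of ordering. The variation below is a sketch, not part of the original script, and the choice of Region value 1 is only for illustration.

```sas
/* Variation (not in the original script): select the level by its value */
proc freq data=Color order=freq;
   tables region / binomial(ac wilson exact level='1') alpha=.1;
   exact binomial;
   weight Count;
run;
```

Selecting by value keeps the result stable if the data change enough to reorder the levels.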
4 Code Block
PROC FREQ
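Explanation : This block runs PROC FREQ on 'Region' with the `binomial` option and no suboptions. By default PROC FREQ computes the binomial proportion for the first level only (here the first level under `order=freq`), together with its asymptotic standard error, asymptotic (Wald) and exact (Clopper-Pearson) confidence limits at the default 95% level, and an asymptotic test that the proportion equals 0.5. There is no WEIGHT statement, so 'Count' is ignored and each row of the dataset counts as one observation. The title set in the earlier steps remains in effect because TITLE statements are global.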
proc freq data=Color order=freq;
tables region / binomial;
run;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.