
Chi-Square Test on Survey Data

The program assigns the libref 'class' to a folder on a Windows file path that holds the 'classurv15' dataset, then adds that library to the SAS format search path. It first produces a cross-frequency table of 'persdoc' by 'genhealth' using the raw category codes. Next, it defines two custom formats ('fpersdoc' and 'fgenhealth') that collapse the categories of each variable into two groups. Applying these formats in the later PROC FREQ steps yields a 2x2 table. Finally, it runs a Chi-square test of independence between the two variables, displaying observed counts, expected counts, and Chi-square statistics.
Data Analysis

Type : EXTERNAL


The data comes from the 'classurv15' dataset, accessible via the 'class' library which is mapped to an external file system path specified by a LIBNAME statement.

1 Code Block
Configuration
Explanation :
This block assigns the libref 'class' to the folder that contains the data. The 'fmtsearch' option adds the 'class' library to the locations SAS searches for custom formats, in addition to the default WORK and LIBRARY locations.
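/* Assign the libref CLASS to the data folder (path kept on a single line) */
LIBNAME class "Z:\Dropbox\UNTHSC Admin and Teaching\Courses\5147-Fall 2014\BACH_EPID 5313\DATA\Day one survey\5147\";
/* Also search the CLASS library for custom formats */
options fmtsearch = (class);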
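Before running the analysis, it can help to confirm that the libref resolved and that the dataset is actually there. The following is a minimal sketch rather than part of the original program: it assumes the LIBNAME statement above has already been submitted, and the log messages are illustrative wording only.

```sas
/* Sketch: check the CLASS libref and the classurv15 dataset. */
data _null_;
  /* LIBREF returns 0 when the libref is currently assigned */
  if libref("class") ne 0 then
    put "WARNING: libref CLASS is not assigned. Check the LIBNAME path.";
  /* EXIST returns 1 when the named dataset exists */
  else if not exist("class.classurv15") then
    put "WARNING: class.classurv15 was not found in the CLASS library.";
  else
    put "NOTE: class.classurv15 is available.";
run;

/* List the variables and any formats permanently attached to them */
proc contents data=class.classurv15;
run;
```

The PROC CONTENTS listing is also a quick way to see whether 'persdoc' and 'genhealth' are stored as numeric codes, which the custom formats defined later in the program assume.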
2 Code Block
PROC FREQ
Explanation :
This procedure generates an initial cross-frequency table for the 'persdoc' and 'genhealth' variables from the 'class.classurv15' dataset, without applying specific formats, to display the original distributions.
/*
Revisit the persdoc by genhealth frequency table.
*/
PROC FREQ DATA=class.classurv15;
 tables persdoc*genhealth;
RUN;
3 Code Block
PROC FORMAT
Explanation :
This block uses 'PROC FORMAT' to define two custom formats: 'fpersdoc' and 'fgenhealth'. Each format collapses the original codes of its variable into two groups, which is what allows the later tables to be 2x2. Because no LIBRARY= option is given, the formats are stored in the temporary WORK.FORMATS catalog and last only for the current session.
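/*
Using formats to collapse categories of persdoc and genhealth in order to create a two-by-two
table
*/
PROC FORMAT;
 value fpersdoc 0 = "No Personal Doctor"
                1-2 = "At Least One Personal Doctor";
 /* Note: the first label names three categories, but the range 1-2 covers only two codes.
    Check the genhealth coding in the codebook before relying on this grouping. */
 value fgenhealth 1-2 = "Excellent, Very Good, or Good"
                  3-high = "Fair or Poor";
RUN;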
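To confirm which label each raw code receives, the format definitions can be printed and applied to a few test values. This is a sketch only, not part of the original program. It assumes the formats above were just created in WORK, and that 'genhealth' uses integer codes 1 through 5, which the source does not state.

```sas
/* Sketch: print the stored definitions of the two formats.
   Without LIBRARY=, PROC FORMAT reads from WORK.FORMATS. */
proc format fmtlib;
  select fpersdoc fgenhealth;
run;

/* Write the label assigned to each assumed genhealth code to the log.
   The 1-5 code range is an assumption for illustration. */
data _null_;
  length label $40;
  do code = 1 to 5;
    label = put(code, fgenhealth.);
    put code= label=;
  end;
run;
```

Reading the log output next to the survey codebook shows whether each code, in particular code 3, falls into the intended group.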
4 Code Block
PROC FREQ
Explanation :
This 'PROC FREQ' step produces the same cross-frequency table for 'persdoc' and 'genhealth', but this time applies the custom formats 'fpersdoc' and 'fgenhealth'. PROC FREQ groups observations by their formatted values, so the table collapses to the grouped categories without modifying the dataset.
PROC FREQ DATA=class.classurv15;
 tables persdoc*genhealth;
 FORMAT persdoc fpersdoc. genhealth fgenhealth.;
RUN;
5 Code Block
PROC FREQ
Explanation :
This final 'PROC FREQ' block performs a Chi-square test on the formatted cross-frequency table. The 'chisq' option requests Chi-square tests of association, including the Pearson and likelihood-ratio statistics; for a 2x2 table SAS also reports Fisher's exact test. The 'expected' option displays the cell counts expected under independence, and 'nocol' suppresses column percentages so that the output stays focused on the test.
/*
Chi-square test for the difference between distributions
*/
PROC FREQ DATA=class.classurv15;
 tables persdoc*genhealth / chisq expected nocol;
 FORMAT persdoc fpersdoc. genhealth fgenhealth.;
RUN;
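For reference, the expected counts and the Pearson Chi-square statistic in this output follow the standard definitions below. The notation is introduced here for illustration and does not appear in the original program: O_ij is the observed count in row i and column j, R_i and C_j are the row and column totals, and N is the total number of observations in the table.

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
\mathrm{df} = (r - 1)(c - 1)
```

For the 2x2 table produced by the formats, r = c = 2, so the test has one degree of freedom.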
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.