
Statistical and Graphical Analysis of Biomedical and Epidemiological Data

The script first creates the 'Athelate' dataset from inline DATALINES and computes a new variable, ABP (mean arterial pressure), from SBP and DBP. It prints the dataset and copies it to 'Practice.Athelate' for further analysis. PROC MEANS then computes means and standard deviations for the numeric variables of 'Athelate'. Several PROC UNIVARIATE runs follow on 'Athelate' and 'Practice.Athelate', mostly for the 'Age' variable, covering confidence limits and normality tests. For visualization, PROC SGPLOT draws vertical and horizontal boxplots of SBP and PROC PLOT draws a scatterplot of SBP against DBP, all from 'Practice.Athelate'. Finally, a second dataset, 'disease', is created from DATALINES, and PROC FREQ cross-tabulates 'Severity' by 'Herd_size', including chi-square tests.
Data Analysis

Type: CREATION_INTERNE (data created internally)


All main datasets ('Athelate' and 'disease') are created directly within the SAS script via DATALINES statements, integrating raw data into the code. The 'Practice.Athelate' and 'desp_athelate' datasets are derivations of this internal data. No external data sources (CSV files, databases, etc.) are directly read by this script.

1 Code Block
DATA STEP Data
Explanation :
Creates the SAS dataset 'Athelate' by reading the data provided via the DATALINES statement. It defines the numeric variables Id, Age, SBP, DBP, and HR and the character variable Race, and calculates a new variable 'ABP' (mean arterial pressure) as one third of SBP plus two thirds of DBP.
DATA Athelate;
  INPUT Id Age Race $ SBP DBP HR;
  /* Mean arterial pressure: one third systolic plus two thirds diastolic */
  ABP = 1/3*SBP + 2/3*DBP;
  DATALINES;
4101 18 W 130 80 60
4102 18 W 140 90 70
4103 19 B 120 70 64
4104 17 B 150 90 76
4105 18 B 124 86 72
4106 19 W 145 94 70
4107 23 B 125 78 68
4108 21 W 140 85 74
4109 18 W 150 82 65
4110 20 W 145 95 75
;
RUN;
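As a quick check of the ABP formula, the sketch below evaluates it for the first data line (SBP 130, DBP 80) and compares it with the algebraically equivalent form DBP + (SBP - DBP)/3. The variable name ABP_alt is introduced here for illustration only and is not part of the original script.

```sas
/* Sketch: verify the ABP formula on the first observation (SBP=130, DBP=80). */
data _null_;
  SBP = 130;
  DBP = 80;
  ABP     = 1/3*SBP + 2/3*DBP;    /* form used in the script */
  ABP_alt = DBP + (SBP - DBP)/3;  /* equivalent form; ABP_alt is a hypothetical name */
  put ABP= 8.2 ABP_alt= 8.2;      /* writes both values to the log */
run;
```

Both expressions work out to about 96.67 for this observation (130/3 + 160/3 = 290/3).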
2 Code Block
PROC PRINT
Explanation :
Displays the content of the 'Athelate' dataset. The 'noobs' option suppresses the display of the default numeric observation column.
PROC PRINT DATA=Athelate noobs;
RUN;
3 Code Block
DATA STEP Data
Explanation :
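Creates a new dataset named 'Practice.Athelate' in the 'Practice' library by copying all observations and variables from the 'Athelate' dataset. The libref 'Practice' must already be assigned with a LIBNAME statement; if it is not, the step fails with an error rather than falling back to WORK.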
DATA Practice.Athelate;
  SET Athelate;
RUN;
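The two-level name 'Practice.Athelate' only resolves once a libref called Practice exists. A minimal sketch of assigning it follows; the directory path is a placeholder chosen for illustration, not something the original script specifies.

```sas
/* Sketch: assign the Practice libref before the DATA step that writes to it. */
/* The path is a placeholder (assumption); replace it with a real, writable directory. */
libname Practice "/path/to/practice";

data Practice.Athelate;
  set Athelate;   /* copy every observation and variable from WORK.Athelate */
run;
```

Because the library points to a directory on disk, the copy persists after the SAS session ends, unlike datasets stored in WORK.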
4 Code Block
PROC MEANS Data
Explanation :
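Calculates the default descriptive statistics (N, mean, standard deviation, minimum, maximum) for the variables 'Age', 'SBP', 'DBP', and 'HR' from the 'Athelate' dataset. The OUTPUT statement lists only two names after each statistic keyword, so only the mean and standard deviation of the first two VAR variables, 'Age' and 'SBP', are saved to the new dataset 'desp_athelate'.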
PROC MEANS DATA=Athelate;
  var Age SBP DBP HR;
  /* Two names per statistic: only Age and SBP are written to desp_athelate */
  OUTPUT out=desp_athelate mean=av_Age av_SBP std=sd_Age sd_SBP;
RUN;
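If the statistics for all four variables are wanted in the output dataset, one option is to leave the name lists empty and let PROC MEANS generate the names with the AUTONAME option. This is a sketch of an alternative, not what the original script does; 'desp_all' is a hypothetical dataset name.

```sas
/* Sketch: save the mean and standard deviation of every VAR variable. */
/* desp_all is a hypothetical output dataset name. */
proc means data=Athelate noprint;
  var Age SBP DBP HR;
  output out=desp_all mean= std= / autoname;  /* names such as Age_Mean, Age_StdDev */
run;

proc print data=desp_all;
run;
```

NOPRINT suppresses the printed table, so the only visible result is the PROC PRINT listing of the saved statistics.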
5 Code Block
PROC PRINT
Explanation :
Displays the content of the 'desp_athelate' dataset, which contains the previously calculated descriptive statistics.
PROC PRINT DATA=desp_athelate;
RUN;
6 Code Block
PROC UNIVARIATE
Explanation :
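Performs a univariate analysis on the 'Age' variable of the 'Athelate' dataset. The 'cibasic(type=upper alpha=0.10)' option requests one-sided upper 90% confidence limits for the mean, standard deviation, and variance, assuming normally distributed data. The 'mu0=120' option sets the null value for the tests of location, so the null hypothesis is that the mean of 'Age' equals 120. Note that 120 lies far outside the ages in the data (17 to 23), so the tests reject this hypothesis by a wide margin.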
PROC UNIVARIATE DATA=athelate cibasic(type=upper alpha=0.10) mu0=120;
  var Age;
RUN;
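For comparison, a two-sided interval with a null value inside the plausible range of ages could be requested as sketched below. The value mu0=20 is an illustrative assumption and does not come from the original script.

```sas
/* Sketch: two-sided 95% confidence limits and a location test for Age. */
/* mu0=20 is an illustrative null value (assumption), not taken from the script. */
proc univariate data=Athelate cibasic(type=twosided alpha=0.05) mu0=20;
  var Age;
run;
```

The "Tests for Location" table then reports the Student's t, sign, and signed rank tests against 20 instead of 120.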
7 Code Block
PROC UNIVARIATE
Explanation :
Performs a univariate analysis on all numeric variables in the 'Athelate' dataset, since no VAR statement is given. It provides descriptive statistics and, through 'cibasic', the default two-sided 95% confidence limits for the mean, standard deviation, and variance.
PROC UNIVARIATE DATA=athelate cibasic;
RUN;
8 Code Block
PROC UNIVARIATE
Explanation :
Performs a univariate analysis on the 'Age' variable of the 'Practice.Athelate' dataset. The 'plots' option generates default graphs and 'normaltest' performs normality tests. The 'histogram' statement creates a histogram of the 'Age' variable.
PROC UNIVARIATE DATA=Practice.Athelate plots normaltest;
  var Age;
  histogram;
RUN;
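To judge normality visually as well as through the tests, a fitted normal curve and a Q-Q plot can be added. The following is a sketch under the assumption that the Practice libref is assigned and 'Practice.Athelate' exists.

```sas
/* Sketch: normality tests plus graphical checks for Age. */
proc univariate data=Practice.Athelate normal;
  var Age;
  histogram Age / normal;                 /* overlay a fitted normal density */
  qqplot Age / normal(mu=est sigma=est);  /* reference line from the estimated mean and SD */
run;
```

With only ten observations the normality tests have little power, so the plots are mainly a sanity check.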
9 Code Block
PROC SGPLOT
Explanation :
Generates a vertical boxplot ('vbox') of the 'SBP' variable from the 'practice.athelate' dataset. The grid is enabled on the Y-axis and a title is set for the graph.
PROC SGPLOT DATA=practice.athelate;
  vbox SBP;
  yaxis grid;
  title "Boxplot of SBP Variable From Athelate data";
RUN;
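The same VBOX statement can split the boxplot by a grouping variable. The sketch below draws one box per value of 'Race'; it is an optional extension, not part of the original script, and the title text is made up for the example.

```sas
/* Sketch: one SBP box per Race value (W, B). */
proc sgplot data=Practice.Athelate;
  vbox SBP / category=Race;
  yaxis grid label="SBP";
  title "Boxplot of SBP by Race";  /* illustrative title */
run;
title;  /* clear the title so it does not carry over to later output */
```

TITLE is a global statement, so clearing it afterwards keeps it from appearing on unrelated output.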
10 Code Block
PROC SGPLOT
Explanation :
Generates a horizontal boxplot ('hbox') of the 'SBP' variable from the 'Practice.Athelate' dataset, with a specific title.
PROC SGPLOT DATA=Practice.Athelate;
  hbox SBP;
  title "Horizontal Boxplot of SBP Variable From Athelate data";
RUN;
11 Code Block
PROC PLOT
Explanation :
Creates a scatterplot of the 'SBP' and 'DBP' variables from the 'Practice.Athelate' dataset, with 'SBP' on the Y-axis and 'DBP' on the X-axis, and a descriptive title.
PROC PLOT DATA=Practice.Athelate;
  plot SBP*DBP;
  title "Scatter plot of SBP and DBP Variable";
RUN;
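PROC PLOT produces a character-based plot in the listing output. If an ODS graphic is preferred, an equivalent scatterplot can be sketched with PROC SGPLOT, keeping SBP on the Y-axis and DBP on the X-axis. This is an alternative, not the procedure the original script uses.

```sas
/* Sketch: graphical equivalent of PLOT SBP*DBP using PROC SGPLOT. */
proc sgplot data=Practice.Athelate;
  scatter x=DBP y=SBP;
  title "Scatter plot of SBP and DBP Variable";
run;
title;  /* clear the title afterwards */
```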
12 Code Block
DATA STEP Data
Explanation :
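Creates the SAS 'disease' dataset by reading the data provided via the DATALINES statement. It defines the variables 'Severity' (character), 'Herd_size' (character) and 'Count' (numeric). Each data line holds three Severity/Herd_size/Count triplets, so the INPUT statement needs a trailing @@ to read all nine observations; without it, only the first triplet on each line is read.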
DATA disease;
  /* Trailing @@ holds the line so all three observations on it are read */
  INPUT Severity $ Herd_size $ Count @@;
  DATALINES;
a1 b1 11 a1 b2 88 a1 b3 136
a2 b1 18 a2 b2 4 a2 b3 19
a3 b1 9 a3 b2 5 a3 b3 9
;
RUN;
13 Code Block
PROC PRINT
Explanation :
Displays the content of the 'disease' dataset.
PROC PRINT DATA=disease;
RUN;
14 Code Block
PROC FREQ
Explanation :
Performs a frequency analysis for the variables 'Severity' and 'Herd_size' from the 'disease' dataset. The 'weight count' statement indicates that the 'Count' variable holds the number of observations in each cell. The first 'tables' statement prints the default cross-tabulation. The second one requests the chi-square tests ('chisq') and measures of association ('measures'), and suppresses the column, overall, and row percentages ('nocol', 'nopercent', 'norow').
PROC FREQ DATA=disease;
  weight count;
  tables Severity*Herd_size;
  tables Severity*Herd_size / chisq nocol nopercent norow measures;
RUN;
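The chi-square test compares each observed count with the count expected under independence (row total times column total, divided by the grand total). A sketch that prints those expected counts and each cell's contribution to the statistic follows, assuming 'disease' contains all nine Severity by Herd_size cells.

```sas
/* Sketch: show expected counts and per-cell chi-square contributions. */
proc freq data=disease;
  weight Count;
  tables Severity*Herd_size / chisq expected cellchi2 norow nocol nopercent;
run;
```

For example, with nine cells the a1 row total is 235, the b1 column total is 38, and the grand total is 299, so the expected count for cell (a1, b1) is 235 × 38 / 299, about 29.9, against an observed count of 11.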
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.