Power Analysis and Monte Carlo Simulation for Two-Means Test

This code is also available in: Deutsch Español Français
Difficulty Level
Beginner
Published on :
The script begins by using PROC POWER to calculate the theoretical power for various parameter combinations of a two-means test. It then generates graphs to visualize these relationships. A manual implementation of power calculation is also included for comparison. The second part of the script implements a Monte Carlo simulation to empirically estimate power: it generates thousands of datasets under different conditions, performs t-tests on each, then calculates the proportion of significant tests, thus providing a simulated power estimate.
Data Analysis

Type : CREATION_INTERNE


All data used in this script is generated internally, either by PROC POWER (power calculations) or by DATA STEPs (simulated data generation for t-test and manual power calculation).

1 Code Block
PROC POWER
Explanation :
This block uses the PROC POWER procedure to calculate statistical power for an independent two-means comparison test (test=diff). It explores different combinations of mean difference (meandiff), standard deviation (stddev), significance level (alpha), and total sample size (ntotal). The value `power = .` indicates that power is the value to be calculated.
Copied!
1PROC POWER;
2 twosamplemeans test=diff
3 meandiff = 5 6
4 stddev = 12 18
5 alpha = 0.05 0.1
6 ntotal = 100 200
7 power = .;
8RUN;
2 Code Block
PROC POWER
Explanation :
This block activates the Output Delivery System (ODS) to generate graphs. PROC POWER is executed with the `plotonly` option to produce only power plots, without the textual tables. The `plot` subcommand customizes the appearance of the graphs, allowing color, line type, and symbol to vary based on different variables. `ods output output=powdata;` saves the data used for the graphs into a dataset named `powdata`.
Copied!
1ods graphics on;
2 
3PROC POWER plotonly;
4 twosamplemeans test=diff
5 meandiff = 5 6
6 stddev = 12 18
7 alpha = 0.05 0.1
8 ntotal = 100 200
9 power = .;
10 plot;
11RUN;
12 
13ods graphics off;
14 
15 plot
16 min=60
17 yopts=(ref=0.9 crossref=yes)
18 vary(color BY stddev, linestyle BY meandiff, symbol BY alpha);
19 
20 ods OUTPUT OUTPUT=powdata;
3 Code Block
DATA STEP Data
Explanation :
This DATA STEP creates a dataset named `tpow` by calculating power for each combination of the specified parameters. `ncp` (non-centrality parameter) and `critval` (critical value of the F distribution) are calculated. Power is then determined using the `sdf` (survival density function) of the non-central F distribution.
Copied!
1DATA tpow;
2 DO meandiff = 5, 6;
3 DO stddev = 12, 18;
4 DO alpha = 0.05, 0.1;
5 DO ntotal = 100, 200;
6 ncp = ntotal * 0.5 * 0.5 * meandiff**2 / stddev**2;
7 critval = finv(1-alpha, 1, ntotal-2, 0);
8 power = sdf('f', critval, 1, ntotal-2, ncp);
9 OUTPUT;
10 END;
11 END;
12 END;
13 END;
14RUN;
4 Code Block
PROC PRINT
Explanation :
This block uses PROC PRINT to display the content of the `tpow` dataset, allowing visualization of the manually calculated power values.
Copied!
1PROC PRINT DATA=tpow;
2RUN;
5 Code Block
DATA STEP Data
Explanation :
This block initializes several macro variables (`meandiff`, `stddev`, `alpha`, `ntotal`, `nsim`) that will be used in the simulation. The DATA STEP `simdata` then generates a large number (`&nsim`) of simulated datasets. For each simulation (`isim`), it creates observations for two groups (`group` 1 and 2) where the variable `y` is drawn from a normal distribution. Group 1 has a mean of 0, and Group 2 has a mean equal to `&meandiff`, with a common standard deviation `&stddev`. `call streaminit(123)` ensures reproducibility of random number generation.
Copied!
1%let meandiff = 5;
2%let stddev = 12;
3%let alpha = 0.05;
4%let ntotal = 100;
5%let nsim = 10000;
6 
7DATA simdata;
8 call streaminit(123);
9 DO isim = 1 to ≁
10 DO i = 1 to floor(&ntotal/2);
11 group = 1;
12 y = rand('normal', 0 , &stddev);
13 OUTPUT;
14 group = 2;
15 y = rand('normal', &meandiff, &stddev);
16 OUTPUT;
17 END;
18 END;
19RUN;
6 Code Block
PROC TTEST
Explanation :
This block temporarily disables all standard ODS output to avoid excessive display. It then runs PROC TTEST on the simulated data (`simdata`). The `by isim` statement performs a t-test for each individual simulation. `class group` defines the two groups to be compared, and `var y` specifies the variable to be tested. `ods output ttests=tests` saves the t-test results to a new dataset named `tests`. Finally, ODS output is re-enabled.
Copied!
1ods exclude all;
2PROC TTEST DATA=simdata;
3 ods OUTPUT ttests=tests;
4 BY isim;
5 class group;
6 var y;
7RUN;
8ods exclude none;
7 Code Block
DATA STEP Data
Explanation :
This DATA STEP processes the `tests` dataset (from PROC TTEST). It filters observations to retain only results corresponding to the "Pooled" method (combined variance). A new variable `issig` is created, taking the value 1 if the p-value (`probt`) of the test is less than the alpha threshold (`&alpha`), indicating a significant test, and 0 otherwise.
Copied!
1DATA tests;
2 SET tests;
3 where method="Pooled";
4 issig = probt < α
5RUN;
8 Code Block
PROC FREQ
Explanation :
This block uses PROC FREQ on the `tests` dataset to analyze the frequency of the `issig` variable. The `ods select binomial` option allows displaying only the binomial test results. The `tables issig / binomial(level='1')` statement calculates the proportion of significant tests (`issig = 1`) and provides a binomial confidence interval for this proportion, which represents the simulated estimate of statistical power.
Copied!
1PROC FREQ DATA=tests;
2 ods select binomial;
3 tables issig / binomial(level='1');
4RUN;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.

Related Documentation

Aucune documentation spécifique pour cette catégorie.