
PROC MCMC Getting Started Example 2: Behrens-Fisher Problem

This script runs a Bayesian analysis of the Behrens-Fisher problem: comparing the means of two independent groups that are assumed to be normally distributed with unequal variances. It first creates the dataset in-line. It then uses PROC MCMC to estimate the model parameters (a mean and a variance for each group) along with the difference of means ('mudif'). Finally, it applies PROC FREQ to the posterior sample to estimate the probability that the difference of means is positive or negative.
Data Analysis

Type: internally created data

The data are entered directly in the script via a DATA step with the DATALINES statement.

1 Code Block
DATA STEP Data
Explanation :
Sets the title and creates the 'behrens' dataset containing the response variable 'y' and the group indicator 'ind'. Data is read in-line.
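title 'The Behrens-Fisher Problem';

DATA behrens;
   /* The double trailing @@ holds each input line across DATA step
      iterations, so every (y, ind) pair becomes its own observation.
      A single @ would yield only one observation per data line. */
   INPUT y ind @@;
   DATALINES;
121 1 94 1 119 1 122 1 142 1 168 1 116 1
172 1 155 1 107 1 180 1 119 1 157 1 101 1
145 1 148 1 120 1 147 1 125 1 126 2 125 2
130 2 130 2 122 2 118 2 118 2 111 2 123 2
126 2 127 2 111 2 112 2 121 2
;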
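Before fitting the model, it can help to confirm that the DATA step read every pair. The following is a minimal sketch that is not part of the original example: it summarizes 'y' by group with PROC MEANS. If all pairs in the data lines are read, the output should show 19 observations for ind=1 and 14 for ind=2.

```sas
/* Illustrative sanity check (an addition, not part of the original example):
   count, mean, and variance of y within each group. */
proc means data=behrens n mean var;
   class ind;
   var y;
run;
```

A much smaller count (for example, one observation per data line) would indicate that the INPUT statement is not holding the line between iterations.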
2 Code Block
PROC MCMC Data
Explanation :
Executes the Markov Chain Monte Carlo (MCMC) simulation. Defines the parameters (mu1, mu2, sig21, sig22), priors (non-informative here), and the conditional model structure (different means and variances depending on the 'ind' group). The difference of means 'mudif' is calculated at each iteration. Results are stored in the 'postout' dataset.
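PROC MCMC DATA=behrens outpost=postout seed=123
          nmc=40000 monitor=(_parms_ mudif)
          statistics(alpha=0.01);       /* alpha=0.01 requests 99% intervals */
   ods select PostSumInt;               /* show only the summary/interval table */
   parm mu1 0 mu2 0;                    /* group means, initial value 0 */
   parm sig21 1;                        /* variance of group 1 */
   parm sig22 1;                        /* variance of group 2 */
   prior mu: ~ general(0);              /* flat prior: log density is constant */
   prior sig21 ~ general(-log(sig21), lower=0);   /* density proportional to 1/sig21 */
   prior sig22 ~ general(-log(sig22), lower=0);   /* density proportional to 1/sig22 */
   mudif = mu1 - mu2;                   /* difference of means, monitored */
   IF ind = 1 THEN DO;
      mu = mu1;
      s2 = sig21;
   END;
   ELSE DO;
      mu = mu2;
      s2 = sig22;
   END;
   model y ~ normal(mu, var=s2);
RUN;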
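Written out, the PROC MCMC step above appears to encode the following model. This is a sketch in notation introduced here for illustration (the subscript i for observations and k for groups do not appear in the original), and it reads general(f) as supplying the logarithm of an unnormalized prior density:

```latex
\begin{align*}
y_i \mid \mathrm{ind}_i = 1 &\sim \mathcal{N}(\mu_1, \sigma_1^2) \\
y_i \mid \mathrm{ind}_i = 2 &\sim \mathcal{N}(\mu_2, \sigma_2^2) \\
\pi(\mu_k) &\propto 1, \qquad k = 1, 2 \\
\pi(\sigma_k^2) &\propto \frac{1}{\sigma_k^2}, \qquad \sigma_k^2 > 0,\ k = 1, 2 \\
\mathrm{mudif} &= \mu_1 - \mu_2
\end{align*}
```

Here sig21 and sig22 in the code correspond to \sigma_1^2 and \sigma_2^2. Both priors are improper, which is why they are specified through general() rather than a named distribution.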
3 Code Block
PROC FORMAT
Explanation :
Creates a custom format 'diffmt' to categorize numeric values into two groups: less than or equal to 0, and strictly greater than 0.
PROC FORMAT;
   value diffmt low-0   = 'mu1 - mu2 <= 0'
                0<-high = 'mu1 - mu2 > 0';
RUN;
4 Code Block
PROC FREQ
Explanation :
Uses PROC FREQ on the simulation output data ('postout') to calculate the frequency of the difference of means ('mudif') according to the defined format, allowing estimation of the probability that the difference is positive or negative.
PROC FREQ DATA = postout;
   tables mudif / nocum;
   FORMAT mudif diffmt.;
RUN;
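The percentages reported by PROC FREQ are Monte Carlo estimates: the share of posterior draws in which 'mudif' falls in each range. As a cross-check, the same quantity can be computed without a format. The sketch below is an illustrative alternative, not part of the original example; the dataset name 'postprob' and the variable 'positive' are hypothetical names chosen here.

```sas
/* Illustrative alternative (an addition, not part of the original example):
   estimate P(mu1 - mu2 > 0 | data) as the share of draws with mudif > 0. */
data postprob;                  /* hypothetical dataset name */
   set postout;
   positive = (mudif > 0);      /* 1 if the draw has mu1 > mu2, otherwise 0 */
run;

proc means data=postprob n mean;
   var positive;                /* the mean of the 0/1 indicator is the estimate */
run;
```

The mean of 'positive' should agree with the 'mu1 - mu2 > 0' percentage from PROC FREQ, expressed as a proportion rather than a percent.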
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.