The examples create their data inline with DATALINES, so each one is self-contained and can be run on its own.
1 Code Block
DATA STEP / PROC SORT Data
Explanation: This example shows how to group data using a single BY variable, `zipCode`, in a DATA step. The `zip` dataset contains street names, cities, states, and zip codes. PROC SORT first orders the observations by `zipCode`. The BY statement in the DATA step then treats consecutive observations that share the same `zipCode` value as one BY group. The figure shows five BY groups being created, one for each distinct zip code.
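/* Read the sample data. Street uses column input, so it must start in column 20. */
data zip;
   input zipCode State $ City $ Street $20-29;
   datalines;
85730 AZ Tucson    Domenic Ln
85730 AZ Tucson    Gleeson Pl
33133 FL Miami     Rice St
33133 FL Miami     Thomas Ave
33133 FL Miami     Surrey Dr
33133 FL Miami     Trade Ave
33146 FL Miami     Nervia St
33146 FL Miami     Corsica St
33801 FL Lakeland  French Ave
33809 FL Lakeland  Egret Dr
;
run;

/* BY-group processing requires the data to be sorted (or indexed) by the BY variable */
proc sort data=zip;
   by zipCode;
run;

/* Observations with the same zipCode value form one BY group */
data zip;
   set zip;
   by zipCode;
run;

proc print data=zip noobs;
   title 'BY-Group Using a Single Variable: ZipCode';
run;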
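The grouping above becomes visible once the DATA step acts on the group boundaries. The following is a minimal sketch, not part of the original example, that uses the automatic `FIRST.zipCode` and `LAST.zipCode` variables to count the observations in each BY group. The dataset names `zip_demo` and `zipCounts` and the variable `nStreets` are hypothetical names chosen for illustration.

```sas
/* Self-contained copy of the sample data; Street starts in column 20 */
data zip_demo;
   input zipCode State $ City $ Street $20-29;
   datalines;
85730 AZ Tucson    Domenic Ln
85730 AZ Tucson    Gleeson Pl
33133 FL Miami     Rice St
33133 FL Miami     Thomas Ave
33133 FL Miami     Surrey Dr
33133 FL Miami     Trade Ave
33146 FL Miami     Nervia St
33146 FL Miami     Corsica St
33801 FL Lakeland  French Ave
33809 FL Lakeland  Egret Dr
;
run;

proc sort data=zip_demo;
   by zipCode;
run;

/* Assumption: zipCounts and nStreets are illustrative names */
data zipCounts;
   set zip_demo;
   by zipCode;
   if first.zipCode then nStreets = 0;  /* reset at the start of each BY group */
   nStreets + 1;                        /* sum statement: value is retained across observations */
   if last.zipCode then output;         /* write one row per BY group */
   keep zipCode nStreets;
run;

proc print data=zipCounts noobs;
   title 'Observations per zipCode BY Group';
run;
```

With this sample data the output dataset has one row per distinct zip code, five rows in total. `FIRST.` and `LAST.` variables exist only during the DATA step and are not written to the output dataset, which is why the count is stored in a regular variable.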
2 Code Block
DATA STEP / PROC SORT Data
Explanation: This example shows the results of processing the `zip` dataset with two BY variables, State and City. The figure shows three BY groups. The dataset is displayed with the BY variables State and City printed on the left for easy reading. The position of the BY variables within an observation does not affect how values are grouped and ordered.
After sorting, the observations for Arizona appear first. Within each State value, observations are ordered by the City value. Each BY group has a unique combination of values for the State and City variables. For example, the BY value of the first BY group is `AZ Tucson`, and the BY value of the second BY group is `FL Lakeland`.
/* Read the sample data. Street uses column input, so it must occupy columns 13-22. */
data zip;
   input State $ City $ Street $13-22 ZipCode;
   datalines;
FL Miami    Nervia St  33146
FL Miami    Rice St    33133
FL Miami    Corsica St 33146
FL Miami    Thomas Ave 33133
FL Miami    Surrey Dr  33133
FL Miami    Trade Ave  33133
FL Lakeland French Ave 33801
FL Lakeland Egret Dr   33809
AZ Tucson   Domenic Ln 85730
AZ Tucson   Gleeson Pl 85730
;
run;

/* Sort by State first, then by City within each State */
proc sort data=zip;
   by State City;
run;

/* Each unique State/City combination forms one BY group */
data zip;
   set zip;
   by State City;
run;

proc print data=zip noobs;
   title 'BY Groups with Multiple BY Variables: State City';
run;
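To see where each of the BY groups described above begins and ends, the automatic `FIRST.` and `LAST.` variables can be copied into ordinary variables and printed. This is a minimal sketch under that assumption; the dataset names `zip_demo` and `zip_flags` and the four flag variables are hypothetical and do not appear in the original example.

```sas
/* Self-contained copy of the sample data; Street occupies columns 13-22 */
data zip_demo;
   input State $ City $ Street $13-22 ZipCode;
   datalines;
FL Miami    Nervia St  33146
FL Miami    Rice St    33133
FL Miami    Corsica St 33146
FL Miami    Thomas Ave 33133
FL Miami    Surrey Dr  33133
FL Miami    Trade Ave  33133
FL Lakeland French Ave 33801
FL Lakeland Egret Dr   33809
AZ Tucson   Domenic Ln 85730
AZ Tucson   Gleeson Pl 85730
;
run;

proc sort data=zip_demo;
   by State City;
run;

/* Assumption: zip_flags and the flag variable names are illustrative */
data zip_flags;
   set zip_demo;
   by State City;
   firstState = first.State;  /* 1 on the first observation of each State value */
   lastState  = last.State;   /* 1 on the last observation of each State value */
   firstCity  = first.City;   /* 1 on the first observation of each State/City group */
   lastCity   = last.City;    /* 1 on the last observation of each State/City group */
run;

proc print data=zip_flags noobs;
   title 'FIRST. and LAST. Flags for BY State City';
   var State City Street firstState lastState firstCity lastCity;
run;
```

Whenever `FIRST.State` is 1, `FIRST.City` is also 1, because a change in an earlier BY variable starts a new group for every BY variable that follows it.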
3 Code Block
DATA STEP / PROC FORMAT Data
Explanation: This example uses the FORMAT procedure, the GROUPFORMAT option, and the FORMAT statement to create and print a simple dataset. The input TEST dataset is sorted by increasing values of Score. The NEWTEST dataset is organized by the formatted values of the Score variable.
Key ideas:
- Processing BY groups in the DATA step with the GROUPFORMAT option works the same way as processing BY groups with formatted values in SAS procedures. The GROUPFORMAT option is useful when you define your own formats to display grouped data.
- Using the GROUPFORMAT option in the DATA step ensures that the BY groups used to create a dataset match the BY groups in the PROC steps that report grouped and formatted data. GROUPFORMAT also determines how the FIRST.variable and LAST.variable values are assigned.
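options linesize=80 pagesize=60;

/* Input data, already in increasing order of Score */
data test;
   input Name $ Score;
   datalines;
Jon 1
Anthony 3
Miguel 3
Joseph 4
Ian 5
Jan 6
;
run;

/* User-defined format that maps score ranges to categories */
proc format;
   value Range 1-2='Low'
               3-4='Medium'
               5-6='High';
run;

/* GROUPFORMAT forms BY groups from the formatted values of Score */
data newtest;
   set test;
   by groupformat Score;
   format Score Range.;
run;

/* The stored format makes PROC PRINT group by the same categories */
proc print data=newtest;
   title 'Score Categories';
   var Name Score;
   by Score;
run;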
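The second key idea, that GROUPFORMAT determines how FIRST.variable and LAST.variable are assigned, can be sketched by counting the observations in each formatted category. This is an illustrative sketch that mirrors the structure of the example above. The dataset names `test_demo` and `rangeCounts` and the variable `nNames` are hypothetical.

```sas
/* Same user-defined format as in the example above */
proc format;
   value Range 1-2='Low'
               3-4='Medium'
               5-6='High';
run;

/* Self-contained copy of the input data, in increasing order of Score */
data test_demo;
   input Name $ Score;
   datalines;
Jon 1
Anthony 3
Miguel 3
Joseph 4
Ian 5
Jan 6
;
run;

/* Assumption: rangeCounts and nNames are illustrative names */
data rangeCounts;
   set test_demo;
   by groupformat Score;
   format Score Range.;
   if first.Score then nNames = 0;  /* set when the formatted value changes */
   nNames + 1;                      /* sum statement: value is retained across observations */
   if last.Score then output;       /* write one row per formatted category */
   keep Score nNames;
run;

proc print data=rangeCounts noobs;
   title 'Observations per Score Category';
run;
```

Without GROUPFORMAT, `FIRST.Score` and `LAST.Score` would be set from the internal (unformatted) values of Score, so scores 3 and 4 would fall into separate BY groups even though both display as Medium.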
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.