
Automobile Data Analysis and Visualization

The script first creates a temporary SAS dataset named 'auto' with a DATA step that reads in-line data via CARDS. The dataset holds vehicle information: make, fuel economy in miles per gallon (mpg), reliability (rep78), weight (weight1), and origin (foreign1). PROC FREQ then produces the frequency distribution of 'mpg'. After the global graphics options are reset, two plots are generated: a simple PROC PLOT scatter plot of 'mpg' against 'weight1', and a customized PROC GPLOT graph. The PROC GPLOT graph shows the same relationship, separates the points by 'foreign1' (foreign or non-foreign vehicle), sets explicit axis ranges, and fits a regression curve whose equation is displayed on the graph through the 'regeqn' option.
Data Analysis

Type: CREATION_INTERNE (data created within the script)


The data is directly integrated into the script via a DATA statement with CARDS, creating the temporary 'auto' dataset.

1 Code Block
DATA STEP Data
Explanation :
This DATA step creates a temporary SAS dataset named 'auto'. The records that follow the CARDS statement are read in-line with list input into the variables 'make' (character), 'mpg', 'rep78', 'weight1', and 'foreign1' (numeric). 'mpg' is fuel economy in miles per gallon, 'weight1' is the vehicle weight, and 'foreign1' indicates whether the car is foreign (1) or not (0). A period in the data marks a missing value, as in two of the 'rep78' entries.
DATA auto;
   INPUT make $ mpg rep78 weight1 foreign1;
CARDS;
AMC 22 3 2930 0
AMC 17 3 3350 0
AMC 22 . 2640 0
Audi 17 5 2830 1
Audi 23 3 2070 1
BMW 25 4 2650 1
Buick 20 3 3250 0
Buick 15 4 4080 0
Buick 18 3 3670 0
Buick 26 . 2230 0
Buick 20 3 3280 0
Buick 16 3 3880 0
Buick 19 3 3400 0
Cad. 14 3 4330 0
Cad. 14 2 3900 0
Cad. 21 3 4290 0
Chev. 29 3 2110 0
Chev. 16 4 3690 0
Chev. 22 3 3180 0
Chev. 22 2 3220 0
Chev. 24 2 2750 0
Chev. 19 3 3430 0
Datsun 23 4 2370 1
Datsun 35 5 2020 1
Datsun 24 4 2280 1
Datsun 21 4 2750 1
;
RUN;
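The 0/1 coding of 'foreign1' can be made easier to read in printed output with a user-defined format. The sketch below is an optional addition and not part of the original program: the format name 'originfmt' and the labels 'Domestic' and 'Foreign' are illustrative assumptions. It assumes the 'auto' dataset created above and prints the first five rows as a quick check that the data was read as expected.

```sas
/* Hypothetical format: the name and labels are illustrative, not from the original program */
PROC FORMAT;
   VALUE originfmt 0 = 'Domestic'
                   1 = 'Foreign';
RUN;

/* Print the first five observations with the format applied to foreign1 */
PROC PRINT DATA=auto (OBS=5);
   FORMAT foreign1 originfmt.;
RUN;
```

The format is attached only for the duration of the PROC PRINT step, so the stored values of 'foreign1' remain 0 and 1.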
2 Code Block
PROC FREQ
Explanation :
This procedure generates a one-way frequency table for the 'mpg' variable in the 'auto' dataset. The table shows how often each fuel economy value (miles per gallon) occurs, along with percentages and cumulative totals.
PROC FREQ DATA=auto;
   TABLES mpg;
RUN;
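Because 'mpg' takes many distinct values, the default table can be long. As a sketch of two common variations (not part of the original program, and assuming the 'auto' dataset above): ORDER=FREQ lists the most frequent values first, NOCUM suppresses the cumulative columns, and the MISSING option treats the two missing 'rep78' ratings as a category of their own.

```sas
PROC FREQ DATA=auto ORDER=FREQ;
   /* mpg values sorted by descending frequency, without cumulative columns */
   TABLES mpg / NOCUM;
   /* include missing rep78 values as a level in the table */
   TABLES rep78 / MISSING;
RUN;
```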
3 Code Block
GOPTIONS
Explanation :
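This statement resets all global graphics options to their default values, cancels any global statements (such as TITLE, SYMBOL, or AXIS) still in effect from earlier steps, and adds a border around the generated graphs. This gives the subsequent graphs a clean starting point.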
goptions reset=all border;
4 Code Block
PROC PLOT
Explanation :
This procedure produces a simple text-based (line-printer) scatter plot of 'mpg' (Y-axis) against 'weight1' (X-axis) from the 'auto' dataset. It gives a quick first look at the relationship between the two variables.
PROC PLOT DATA=auto;
   plot mpg * weight1;
RUN;
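PROC PLOT can also mark each point with a character taken from a third variable. As a sketch that is not part of the original program and assumes the 'auto' dataset above, the request below uses the first nonblank character of the formatted value of 'foreign1' as the plotting symbol, so non-foreign cars appear as 0 and foreign cars as 1.

```sas
PROC PLOT DATA=auto;
   /* y * x = variable: plotting symbol comes from the formatted value of foreign1 */
   plot mpg * weight1 = foreign1;
RUN;
```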
5 Code Block
PROC GPLOT
Explanation :
This procedure produces a more elaborate graph with PROC GPLOT. The TITLE statement sets the title 'Study of MPG vs Weight'. The SYMBOL statement controls how the points and the fitted curve are drawn: interpol=rqcli95 requests a quadratic regression with 95% confidence limits for individual predicted values, value=circle draws the points as circles, the cv=, ci=, and co= options set the colors of the points, the regression line, and the confidence limit lines, and width=2 sets the line thickness. The PLOT statement graphs 'mpg' against 'weight1' and plots a separate series for each value of 'foreign1'. The haxis= and vaxis= options set the range and tick increments of the X and Y axes, and the 'regeqn' option displays the regression equation on the graph.
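PROC GPLOT DATA=auto;
   title "Study of MPG vs Weight";

   /* rq = quadratic regression, cli95 = 95% confidence limits for individual values */
   symbol interpol=rqcli95
          value=circle
          cv=crimson   /* point color */
          ci=black     /* regression line color */
          co=bib       /* confidence limit line color */
          width=2;

   /* vaxis runs to 36 rather than 35: "12 to 35 by 2" stops at 34 and would leave the 35 mpg observation off the axis */
   plot mpg*weight1=foreign1 / haxis=2000 to 4500 by 500
                               vaxis=12 to 36 by 2
                               regeqn;
RUN;
QUIT;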
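Where ODS Graphics is preferred over SAS/GRAPH, a roughly comparable plot can be sketched with PROC SGPLOT. This is an alternative technique and not the original author's method. It assumes the 'auto' dataset above. The REG statement fits a quadratic (DEGREE=2) with confidence limits for individual predicted values (CLI, 95% at the default alpha) separately for each level of 'foreign1', and the Y axis is extended to 36 so that the 35 mpg observation falls inside the tick range. The REG statement has no direct counterpart to 'regeqn', so the fitted equation would have to be obtained separately, for example from PROC REG.

```sas
PROC SGPLOT DATA=auto;
   TITLE "Study of MPG vs Weight";
   /* Quadratic fit with individual-value confidence limits, one fit per foreign1 level */
   REG X=weight1 Y=mpg / DEGREE=2 CLI GROUP=foreign1;
   /* Tick values mirroring the haxis= and vaxis= ranges of the GPLOT version */
   XAXIS VALUES=(2000 TO 4500 BY 500);
   YAXIS VALUES=(12 TO 36 BY 2);
RUN;
```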
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : Author - Anupama Rajaram Program Description - This program creates a simple gplot of 2 variables, draws the plot line and calculates regression equation. y-axis = mpg. x-axis = weight1.