The SVMACHINE procedure builds support vector machine (SVM) models for classification and regression. It fits a hyperplane that separates the classes while maximizing the margin to the closest data points (the support vectors). Kernel functions (linear, polynomial, RBF, sigmoid) let it model non-linear relationships in the data. The procedure runs on the CAS server, so the work on large tables is distributed and processed in parallel. It is particularly effective for binary classification problems and can be adapted to regression problems.
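For reference, the margin-maximization idea is usually written as the soft-margin problem below. This is the standard textbook form, given as a sketch only: this page does not state the exact objective or kernel parameterization that PROC SVMACHINE uses internally, so the symbols $w$, $b$, $\xi_i$, $\phi$, $d$, $\gamma$, $\kappa$, and $c_0$ are generic notation rather than procedure options. The labels are assumed to be coded as $y_i \in \{-1, +1\}$.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad \xi_i \ge 0 .

% Common kernel choices, with K(x, z) = \phi(x)^{\top}\phi(z):
K_{\text{linear}}(x,z)  = x^{\top}z, \qquad
K_{\text{poly}}(x,z)    = \bigl(x^{\top}z + 1\bigr)^{d}, \qquad
K_{\text{RBF}}(x,z)     = \exp\bigl(-\gamma\lVert x - z\rVert^{2}\bigr), \qquad
K_{\text{sigmoid}}(x,z) = \tanh\bigl(\kappa\,x^{\top}z + c_0\bigr).
```

A larger $C$ puts more weight on the slack terms $\xi_i$, so margin violations cost more and the fitted margin tends to be narrower. A smaller $C$ tolerates more violations in exchange for a wider margin.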
Data Analysis
Type: internally created data (CREATION_INTERNE)
Examples use generated data (datalines) or SASHELP data, loaded into the CAS session.
1 Code Block
PROC SVMACHINE Data
Explanation : This example illustrates a simple binary classification using the SVMACHINE procedure with a linear kernel. A subset of the SASHELP.IRIS dataset is created in CAS, keeping only two species. The 'target' variable is binarized (0 or 1). The SVMACHINE procedure is then executed to train a model, and fit statistics are displayed.
CASLIB _ALL_ ASSIGN; /* Make sure the CASLIB is assigned */
/* Keep two species and encode the target as 0/1 */
DATA mycas.iris_subset;
   set sashelp.iris;
   where species in ('Setosa', 'Versicolor');
   if species = 'Setosa' then target = 0;
   else target = 1;
   drop species;
RUN;
/* Train a linear-kernel SVM and capture the fit statistics */
PROC SVMACHINE data=mycas.iris_subset;
   input sepalwidth sepallength petalwidth petallength / level=interval;
   target target / level=binary;
   kernel linear;
   ods output FitStatistics=FitStat;
RUN;
PROC PRINT data=FitStat;
   title 'Fit statistics for the linear SVM model';
RUN;
2 Code Block
PROC SVMACHINE Data
Explanation : This example uses a simulated two-dimensional dataset with a non-linear decision boundary. The SVMACHINE procedure is configured with an RBF (Radial Basis Function) kernel and a cost parameter 'C' of 10. The 'C' parameter controls the penalty for misclassified data points, influencing the margin width and model complexity.
CASLIB _ALL_ ASSIGN;
/* Simulate 100 points; class 0 lies inside a circle of radius sqrt(10) centered at (5, 5) */
DATA mycas.simulated_data;
   call streaminit(123);
   do i = 1 to 100;
      x1 = rand('Uniform') * 10;
      x2 = rand('Uniform') * 10;
      if (x1 - 5)**2 + (x2 - 5)**2 < 10 then target = 0;
      else target = 1;
      output;
   end;
RUN;
/* The cost parameter is given as the C= option on the PROC statement */
PROC SVMACHINE data=mycas.simulated_data C=10;
   input x1 x2 / level=interval;
   target target / level=binary;
   kernel rbf;
   ods output FitStatistics=FitStat;
RUN;
PROC PRINT data=FitStat;
   title 'Fit statistics for the RBF SVM model';
RUN;
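Before fitting, it can help to look at the simulated classes and confirm that no straight line separates them. The sketch below assumes the `mycas.simulated_data` table from the example above exists and is readable through the `mycas` libref. The reference circle (center (5, 5), radius sqrt(10), about 3.1623) is taken from the rule in the DATA step, not from any model output.

```sas
/* Scatter plot of the simulated points, colored by class */
proc sgplot data=mycas.simulated_data aspect=1;
   scatter x=x1 y=x2 / group=target;
   /* True class boundary used to generate the data: (x1-5)**2 + (x2-5)**2 = 10 */
   ellipseparm semimajor=3.1623 semiminor=3.1623 / xorigin=5 yorigin=5;
   title 'Simulated data: class 0 lies inside the circle';
run;
title; /* Clear the title so it does not carry over to later output */
```

Because the boundary is a closed curve, a linear kernel cannot reproduce it, which is why the example above uses a non-linear kernel.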
3 Code Block
PROC SVMACHINE Data
Explanation : This example demonstrates the use of a polynomial kernel of degree 2 to capture more complex relationships in the data. It also incorporates a data partitioning step into training (70%) and validation (30%) sets to evaluate model generalization. The output includes fit statistics and a confusion matrix.
CASLIB _ALL_ ASSIGN;
/* Simulate 200 points; class 0 is the right half of a disk of radius 2 */
DATA mycas.complex_data;
   call streaminit(456);
   do i = 1 to 200;
      x1 = rand('Normal', 0, 2);
      x2 = rand('Normal', 0, 2);
      if x1**2 + x2**2 < 4 and x1 > 0 then target = 0;
      else target = 1;
      output;
   end;
RUN;
/* The cost parameter is given as the C= option on the PROC statement */
PROC SVMACHINE data=mycas.complex_data C=1;
   input x1 x2 / level=interval;
   target target / level=binary;
   kernel polynomial / degree=2; /* Polynomial kernel of degree 2 */
   /* Hold out 30% for validation; the remaining 70% is used for training */
   partition fraction(validate=0.3 seed=789);
   ods output FitStatistics=FitStat
              MisclassificationMatrix=MisClass;
RUN;
PROC PRINT data=FitStat;
   title 'Fit statistics for the polynomial SVM model';
RUN;
PROC PRINT data=MisClass;
   title 'Confusion matrix for the polynomial SVM model';
RUN;
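To see why a degree-2 polynomial kernel can capture curved boundaries, the sketch below evaluates the kernel for two points and compares it with a dot product of explicit degree-2 feature vectors. It assumes the common form (x·z + 1)**2; this page does not state the exact scaling or offset that PROC SVMACHINE applies, and the points (1, 2) and (3, 4) are hypothetical values chosen for illustration.

```sas
data _null_;
   /* Two hypothetical points, x = (1, 2) and z = (3, 4) */
   x1 = 1; x2 = 2;
   z1 = 3; z2 = 4;

   /* Degree-2 polynomial kernel, assuming the form (x.z + 1)**2 */
   k_poly = (x1*z1 + x2*z2 + 1)**2;

   /* Same quantity as a dot product of explicit feature vectors
      phi(v) = (1, sqrt(2)*v1, sqrt(2)*v2, v1**2, v2**2, sqrt(2)*v1*v2) */
   s2 = sqrt(2);
   k_explicit = 1
              + (s2*x1)*(s2*z1) + (s2*x2)*(s2*z2)
              + (x1**2)*(z1**2) + (x2**2)*(z2**2)
              + (s2*x1*x2)*(s2*z1*z2);

   /* Write both values to the SAS log for comparison */
   put k_poly=;
   put k_explicit=;
run;
```

Both expressions come out to 144, up to floating-point rounding. The kernel therefore works implicitly with the squared and cross-product terms of the inputs, so a hyperplane in that expanded space corresponds to a quadratic boundary in the original (x1, x2) plane.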
4 Code Block
PROC CAS / SVMACHINE Data
Explanation : This advanced example shows how to train and score an SVM model through CASL (the CAS scripting language) within PROC CAS. The training data and the new data to be scored are created directly in CAS tables. The model is trained with the `svm.svmTrain` action and its state is saved as an analytic store. The saved model is then applied to the new data with the `aStore.score` action, and the results are displayed.
CASLIB _ALL_ ASSIGN;
/* Create simulated training data directly in CAS.
   DATA steps cannot run inside PROC CAS, so both tables are built first. */
DATA casuser.score_data;
   input x1 x2 target;
   datalines;
1 1 0
9 9 1
5 5 0
2 8 1
7 3 0
;
RUN;
/* Prepare the new data to score */
DATA casuser.new_data;
   input x1 x2;
   datalines;
1.5 1.5
8.5 8.5
4.8 5.2
3.0 7.0
6.5 2.5
;
RUN;
PROC CAS;
   session casauto;
   loadactionset 'svm';
   loadactionset 'aStore';
   /* Train a simple model for demonstration and save its state */
   svm.svmTrain /
      table={name='score_data', caslib='casuser'},
      inputs={'x1', 'x2'},
      target='target',
      kernel='linear',
      savestate={name='my_svm_model', caslib='casuser', replace=true};
   /* Score the new data with the saved model */
   aStore.score /
      table={name='new_data', caslib='casuser'},
      rstore={name='my_svm_model', caslib='casuser'},
      out={name='predictions', caslib='casuser', replace=true},
      copyVars={'x1', 'x2'};
   /* Display the predictions */
   table.fetch / table={name='predictions', caslib='casuser'};
QUIT;
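The saved model state can also be applied from procedure syntax instead of CASL. The sketch below makes several assumptions: that the `my_svm_model` state table and the `new_data` table from the example above exist in the `casuser` caslib, that the state was saved as an analytic store, and that `CASLIB _ALL_ ASSIGN` has created a `casuser` libref. The output table name `predictions_astore` is hypothetical.

```sas
/* Score new data with a saved analytic store using PROC ASTORE */
proc astore;
   score data=casuser.new_data            /* table to score */
         rstore=casuser.my_svm_model      /* saved model state (assumed to be an analytic store) */
         out=casuser.predictions_astore   /* hypothetical output table name */
         copyvars=(x1 x2);                /* carry the inputs into the output */
quit;

/* Display the scored rows */
proc print data=casuser.predictions_astore;
   title 'Predictions scored with PROC ASTORE';
run;
title;
```

Keeping the scoring step separate from training means the stored model can be reused on later tables without refitting.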
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
« In SAS Viya, PROC SVMACHINE implements the Support Vector Machine (SVM) algorithm, a gold standard for high-accuracy binary classification. By finding the optimal hyperplane that maximizes the margin between classes, SVMs are uniquely robust against overfitting, even in high-dimensional feature spaces. »
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.