Macro for extracting the words in one string that are absent from another

Difficulty Level
Beginner
The `%mf_wordsInStr1ButNotStr2` macro returns the words that appear in one string but not in another. It accepts two keyword parameters, `Str1` and `Str2`, each a character string of space-separated words. The macro first checks that neither string is empty; if either one is, it writes both values to the log with `%put` and returns without producing any output.
Local variables (`count_base`, `count_extr`, `i`, `i2`, `extr_word`, `base_word`, `match`, `outvar`) are declared to prevent interference with global macro variables. The macro functions `%sysfunc(countw(...))` are used to determine the number of words in each string, stored in `count_extr` for `Str1` and `count_base` for `Str2`.
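The counting and extraction steps can be tried in isolation. The following is a minimal sketch, not part of the original macro: the variable names `sample` and `n_words` and the sample value are assumptions made for illustration. It passes a blank as the delimiter to both `countw` and `%scan` so that the count and the extraction agree on what a word is.

```sas
/* Hypothetical sample value, used only for this illustration */
%let sample=alpha beta gamma;

/* COUNTW is a DATA step function, so it is called through %SYSFUNC in macro code */
%let n_words=%sysfunc(countw(&sample,%str( )));
%put NOTE: The sample contains &n_words words.;

/* %SCAN returns the i-th blank-delimited word of the string */
%put NOTE: Word 2 is %scan(&sample,2,%str( )).;
```

With this sample value, the log should report 3 words and show `beta` as the second word.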
The core of the logic is a pair of nested loops. The outer loop (`%do i=1 %to &count_extr;`) iterates over each word in `Str1`, extracted with `%scan(&Str1,&i,%str( ))`. For each extracted word (`extr_word`), the `match` variable is reset to 0. The inner loop (`%do i2=1 %to &count_base;`) then iterates over each word in `Str2`, extracted the same way into `base_word`. If `extr_word` equals `base_word`, `match` is set to 1. The comparison is a macro `%if` test, so it is case-sensitive, and the inner loop always runs to the end of `Str2` rather than stopping at the first match.
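The behaviour of the word comparison can be checked with a small stand-alone macro. This is a sketch under stated assumptions: `cmp_demo` is a hypothetical helper that does not exist in the original code, and it only reproduces the `%if &extr_word=&base_word` test. Because `%if` evaluates its condition implicitly with `%eval`, operands that are both integers are compared as numbers rather than as text.

```sas
/* cmp_demo is a hypothetical helper used only for this illustration */
%macro cmp_demo(a, b);
  %if &a=&b %then %put NOTE: [&a] matches [&b];
  %else %put NOTE: [&a] does not match [&b];
%mend cmp_demo;

%cmp_demo(Age, Age)   /* same text: reported as a match                    */
%cmp_demo(Age, age)   /* different case: reported as not matching          */
%cmp_demo(01, 1)      /* both integers: compared numerically, so a match   */
```

For lists of variable or dataset names this rarely matters, but it is worth keeping in mind when the words are numeric codes or differ only by case.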
After the inner loop ends, if `match` is still 0 (no corresponding word was found in `Str2`), `extr_word` is appended to the macro variable `outvar`, which accumulates the result. Finally, the macro emits the content of `outvar` as text at the point of the call: the words from `Str1` that are not present in `Str2`, in their original order.
Data Analysis

Type: CREATION_INTERNE (data created internally)


The `%mf_wordsInStr1ButNotStr2` macro processes only the character strings passed to it as arguments (`Str1` and `Str2`). It does not read existing SAS datasets (including SASHELP) or external files. The data it works on are the text values supplied in the macro call, so no pre-existing data source is required.

1 Code Block
MACRO
Explanation :
This block defines the SAS macro `%mf_wordsInStr1ButNotStr2`. All working variables are declared with `%local`. The macro first checks that neither input string is empty, then counts the words in each string with `%sysfunc(countw())`. Nested `%do` loops walk through both strings with `%scan()`, comparing each word from `Str1` (`extr_word`) against every word in `Str2` (`base_word`). A word with no match in `Str2` is appended to the macro variable `outvar`. Finally, the macro emits the content of `outvar`: the words present in `Str1` but absent from `Str2`. The implementation is pure macro code and creates no intermediate SAS datasets.
%macro mf_wordsInStr1ButNotStr2(
    Str1= /* string containing words to extract */
    ,Str2= /* used to compare with the extract string */
)/*/STORE SOURCE*/;

  %local count_base count_extr i i2 extr_word base_word match outvar;

  /* if either input is empty, log both values and exit without output */
  %if %length(&str1)=0 or %length(&str2)=0 %then %do;
    %put base string (str1)= &str1;
    %put compare string (str2) = &str2;
    %return;
  %end;

  /* count blank-delimited words, using the same delimiter as %scan below */
  %let count_base=%sysfunc(countw(&Str2,%str( )));
  %let count_extr=%sysfunc(countw(&Str1,%str( )));

  /* outer loop: each word of Str1 */
  %do i=1 %to &count_extr;
    %let extr_word=%scan(&Str1,&i,%str( ));
    %let match=0;
    /* inner loop: compare against each word of Str2 */
    %do i2=1 %to &count_base;
      %let base_word=%scan(&Str2,&i2,%str( ));
      %if &extr_word=&base_word %then %let match=1;
    %end;
    /* keep the word only if it was not found in Str2 */
    %if &match=0 %then %let outvar=&outvar &extr_word;
  %end;

  /* emit the result as text at the point of the call */
  &outvar

%mend mf_wordsInStr1ButNotStr2;
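Because the macro emits plain text, it can be called anywhere a list of words is valid. The following usage sketch assumes the macro above has already been compiled in the current session. The sample word lists, the macro variables `all_vars` and `drop_vars`, and the output dataset `work.class_subset` are assumptions chosen for illustration; `sashelp.class` is read by the calling DATA step, not by the macro itself.

```sas
/* Assumes %mf_wordsInStr1ButNotStr2 has already been compiled in this session */

/* 1. Write the result to the log: the words of Str1 that are not in Str2 */
%put NOTE: %mf_wordsInStr1ButNotStr2(Str1=a b c d, Str2=b d);

/* 2. Use the result inline as a variable list (hypothetical example lists) */
%let all_vars=Name Sex Age Height Weight;
%let drop_vars=Height Weight;

data work.class_subset;
  set sashelp.class(keep=%mf_wordsInStr1ButNotStr2(Str1=&all_vars, Str2=&drop_vars));
run;
```

The first call should write `a c` to the log. In the second, the `keep=` option should resolve to `Name Sex Age`, so the macro call behaves like a function returning a list.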
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : Copyright information is compiled from the help section of the main macro and referenced files: 'Allan Bowe' (author or contributor mentioned in the main help). Additional copyrights from referenced files: 'Copyright (c) 2001-2006 Rodney Sparapani' (from `_version.sas` file, under GNU General Public License), 'Copyright 2010-2023 HMS Analytical Software GmbH' (from `macro_without_brief_tag.sas` file), 'Copyright © 2022, SAS Institute Inc.' (from `print_macro_parameters.sas` file, under Apache-2.0 license), and 'Original code by ChrisNZ' (from `datastep_infile_trick.json` file, under Apache-2.0 license). Excerpts from SAS Viya documentation are also included in the references.


Related Documentation

No specific documentation for this category.