
Extracting File Paths from Metadata

Attention: This code requires administrator privileges.
This program configures a connection to a SAS metadata server (the connection parameters are hardcoded) and inventories the physical files referenced in metadata. It uses the metadata DATA step functions (`metadata_*`) to query `Directory` objects, traversing their `Files` associations, and `SASFileRef` objects. The results are consolidated into a SAS table named `directories`.
Data Analysis

Type: INTERNAL_CREATION


Data is dynamically extracted from the metadata repository via SAS functions (metadata_resolve, metadata_getnobj, etc.).

1 Code Block
OPTIONS
Explanation :
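Sets the system options used to connect to the SAS metadata server: host, port, protocol, user and password (an administrative account), and repository. The values are hardcoded and must be replaced with those of your environment.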
options
  metaserver='meta.demo.sas.com'
  metaport=8561
  metaprotocol='bridge'
  metauser='sasadm@saspw'
  metapass='password'
  metarepository='Foundation'
  metaconnect='NONE';
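Before running the inventory, it can help to confirm that the options above actually reach the metadata server. The following is a minimal sketch, assuming the session is licensed to run `PROC METAOPERATE` and that the hardcoded host and credentials have been replaced with valid ones; it is not part of the original program.

```sas
/* Sketch: ask the metadata server for its status using the   */
/* connection options set by the OPTIONS statement above.     */
/* The result is written to the SAS log; a connection or      */
/* authentication problem shows up there as an error.         */
proc metaoperate action=status;
run;
```

If the log reports an error here, the `metadata_*` calls in the DATA step would fail for the same reason, so it is worth checking this first.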
2 Code Block
DATA STEP Data
Explanation :
Main DATA step that queries the metadata. `metadata_resolve` returns the number of objects matching each query, and the step then loops over that count, fetching each object's URI with `metadata_getnobj`. For each `Directory` object it reads the path (`DirectoryName`), then walks the `Files` association with `metadata_getnasn` to read each file name (`FileName`); the path and file name are joined with "/" into the `fqn` variable. For `SASFileRef` objects, the `Name` attribute is written to `fqn` as is. Only `fqn` is kept in the output table.
data directories;

  /* Initialize variables. */
  length type id dir_uri file_uri $ 50 path file_name fqn $ 255;
  call missing(of _character_);
  keep fqn;

  /* Define a query to find all Directory objects that have files.  */
  /* The attribute in the filter is reconstructed as Id (an         */
  /* assumption): every metadata Id contains a period, so the       */
  /* filter matches all objects.                                    */
  obj="omsobj:Directory?Directory[Files/File[@Id contains '.']]";

  /* Count the objects that match this query. */
  dir_count=metadata_resolve(obj,type,id);

  /* Proceed if any directories are found. */
  if dir_count > 0 then do i=1 to dir_count;
    rc=metadata_getnobj(obj,i,dir_uri);
    rc=metadata_getattr(dir_uri,"DirectoryName",path);

    /* Find the files associated with the path. */
    file_count=metadata_getnasn(dir_uri,"Files",1,file_uri);
    if file_count > 0 then do j=1 to file_count;
      rc=metadata_getnasn(dir_uri,"Files",j,file_uri);
      rc=metadata_getattr(file_uri,"FileName",file_name);
      fqn=catx("/",path,file_name);
      output;
    end;
  end;

  /* Define a search query to find all SASFileRef objects        */
  /* (same assumed Id filter as above).                          */
  obj2="omsobj:SASFileRef?@Id contains '.'";
  fileref_count=metadata_resolve(obj2,type,id);
  if fileref_count > 0 then do i=1 to fileref_count;
    rc=metadata_getnobj(obj2,i,file_uri);
    rc=metadata_getattr(file_uri,"Name",fqn);
    output;
  end;

run;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Expert Advice
Simon
SAS expert and founder.
« This inventory is the first step in identifying "ghost files"—metadata references pointing to files that have been deleted from the physical disk. To take this a step further, consider pairing this script with the fileexist() function to immediately validate the physical presence of every resource found. »
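The pairing with `fileexist()` suggested above could look like the following. This is a minimal sketch, assuming the `directories` table has already been built by the DATA step; the table name `directories_checked` and the variable `file_found` are hypothetical names chosen for illustration.

```sas
/* Sketch: flag each inventoried path by whether it exists on disk. */
data directories_checked;
  set directories;               /* table produced by the inventory step */
  /* FILEEXIST returns 1 if the physical file exists, 0 otherwise. */
  file_found = fileexist(strip(fqn));
run;

/* List the candidate "ghost files": referenced in metadata, not found. */
proc print data=directories_checked;
  where file_found = 0;
run;
```

The check runs on the host of the SAS session, so a path is reported as missing if that host cannot see it, even when the file exists elsewhere. Values taken from `SASFileRef` objects are `Name` attributes rather than assembled paths, so a 0 for those rows may only mean the value is not a usable physical path.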