Published on :
ETL EXTERNAL

Extracting Statistical Data from the KOSIS API

This code is also available in: Deutsch Español Français
Awaiting validation
This SAS© script defines and uses a '%json' macro to query a web API (kosis.kr) via PROC HTTP. It loops over a range of years to retrieve statistical data. Each JSON response from the API is read using the LIBNAME JSON engine, transformed with PROC TRANSPOSE, then appended to a final SAS© table. The script performs two extractions for different geographical levels (state and city) and ends by cleaning variables in the 'data_city' table.
Data Analysis

Type : EXTERNAL


Data is entirely retrieved from an external source via calls to the KOSIS (Korean Statistical Information Service) web API. The script does not depend on pre-existing data.

1 Code Block
%LET / LIBNAME
Explanation :
Initialization of global macro variables. 'lib' defines the output path, 'String01' contains an API key, 'var_want' lists the columns to keep. A 'json' libname is assigned to the output path to save SAS tables.
Copied!
1%let lib=C:\Json\;
2%let String01=NjdhZTg3ZTM1OGEzZGMyOGIyZWE0ZmIxZTBiMDg0ZTg=;/*apiKey*/
3%let var_want=PRD_DE/*연도*/ C1 C1_NM C3_NM/*연령*/ DT/*수*/;/*원하는 변수만 남기기, 전부 남기고 싶으면 공란*/
4LIBNAME json "&lib";
2 Code Block
Macro Data
Explanation :
Definition of the '%json' macro. This macro iterates over a date range, constructs a URL for the KOSIS API at each iteration, executes a PROC HTTP to retrieve JSON data, reads it with LIBNAME JSON, transforms it with PROC TRANSPOSE to pivot it, and accumulates it into a final SAS table.
Copied!
1%macro json(data_final, String02, String03, String04, String05, String06, String07, String08, String09, String10, String11, String12, String13, String14, date_s, date_e);
2DATA "&lib.&data_final";RUN;
3 %DO date_want=&date_s %to &date_e;
4 %let url=http://kosis.kr/openapi/Param/statisticsParameterData.DO?method=getList&apiKey=&String01&itmId=&String12&objL1=&String04&objL2=&String05&objL3=&String06&objL4=&String07&objL5=&String08&objL6=&String09&objL7=&String10&objL8=&String11&FORMAT=json&jsonVD=Y&prdSe=&String14&startPrdDe=&date_want&endPrdDe=&date_want&loadGubun=&String13&orgId=&String02&tblId=&String03;
5 
6 filename out temp;PROC HTTP url="&url" method="get" out=out;RUN;
7 LIBNAME raw json fileref=out;
8 
9 DATA "&lib.temp";SET raw.alldata;IF p1='TBL_NM' THEN group+1;/*자동으로 p1의 첫번째 값을 'TBL_NM'대신 들어가도록 하는 방법을 못 찾겠습니다.*/RUN;
10 
11 PROC TRANSPOSE DATA="&lib.temp" out="&lib.data_one"(drop=_:);BY group;id P1;var Value;RUN;
12 
13 DATA "&lib.data_one";SET "&lib.data_one"(keep=&var_want);RUN;
14 
15 DATA "&lib.&data_final";SET "&lib.&data_final" "&lib.data_one";RUN;
16 %END;
17DATA "&lib.&data_final";SET "&lib.&data_final";IF _n_=1 THEN delete;RUN;
18%mend;
3 Code Block
Macro Call Data
Explanation :
Call to the '%json' macro twice. The first call downloads statistical data at the state level ('data_state'). The second call downloads data at the city level ('data_city'). Both extractions cover the period from 1993 to 2017.
Copied!
1%json(data_final=data_state, /*orgId*/String02=101, /*tblId*/ String03=DT_1B040M1, /*objL1*/String04=00+11+21+22+23+24+25+26+29+31+32+33+34+35+36+37+38+39, /*objL2*/String05=0, /*objL3*/String06=ALL, /*objL4*/String07=,/*objL5*/String08=, /*objL6*/String09=, /*objL7*/String10=, /*objL8*/String11=, /*itmId*/String12=all, /*loadGubun*/String13=2, /*prdSe*/String14=Y, date_s=1993, date_e=2017);
2%json(data_final=data_city, /*orgId*/String02=101, /*tblId*/ String03=DT_1B040M1, /*objL1*/String04=all, /*objL2*/String05=0, /*objL3*/String06=ALL, /*objL4*/String07=,/*objL5*/String08=, /*objL6*/String09=, /*objL7*/String10=, /*objL8*/String11=, /*itmId*/String12=all, /*loadGubun*/String13=2, /*prdSe*/String14=Y, date_s=1993, date_e=2017);
3 
4 Code Block
DATA STEP Data
Explanation :
This final DATA STEP modifies the 'json.data_city' table. It renames columns with more explicit names (Year, Div, Old, Num), converts numeric variables stored as text to a numeric format, and selects the final columns for analysis.
Copied!
1DATA json.data_city;
2SET json.data_city;
3Year=PRD_DE+0;/*연도*/
4Div_num=C1;
5Div=C1_NM;
6Old=C3_NM;
7Num=DT+0;/*수*/
8keep Year Div_num Div Old Num;
9RUN;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.