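SAS Programming - Macros: Checking the Run Times of Source-Side and QC-Side Programs

SAS programming work today generally requires independent double programming. Like a double-blind trial, this reduces bias and improves the accuracy of the outputs. In double programming, the QC-side program must be run after the Source-side program. This is not only good programming practice: many companies write the requirement into their working manuals and use a dedicated system to check it before programs are submitted.

This article uses SAS to perform that run-time check. To make the check easy to call, the code is wrapped in a macro; the complete macro code is collected in Section 4.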

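1. Overall design of the macro

The macro has two main parts. The first part gets the last modified time of each SAS program and SAS log; the second part compares those times and draws the corresponding conclusion.

For the first part, see the article "SAS Programming: How to Get the Modification Time of All Files Under a Path?".

For the second part, I divide the comparison of last modified times into 4 categories:

  1. The program's modified time is missing: the missing program needs to be created;
  2. The log's modified time is missing: the program needs a Batch Run;
  3. The SAS program's modified time is later than the log's modified time: the program needs a Batch Run;
  4. The Source-side log's modified time is later than the QC-side log's modified time: the QC-side program needs a Batch Run.

The demo folders are shown below. There are two folders, one for the Source side and one for the QC side, and each SAS program's log file sits in the same folder as the program.

Source side
QC side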

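2. Getting the last modified time of SAS programs and logs

This part has two steps: first use the Dopen family of functions to get the names (file paths) of all files, then use the Fopen family of functions to get the modified time of each file.

The code is taken directly from the earlier article and combined into one macro, with 2 additional considerations.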

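First, the path separator differs between Windows and UNIX, and a folder path may be entered with or without a trailing slash. The macro infers the slash type from the input path, removes any trailing slash from the input, and adds the slash back explicitly where it is needed later.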
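As a rough illustration of this normalization, the DATA step below infers the separator and strips a trailing one. It is a sketch rather than the macro's own code: it locates the first slash with FINDC instead of the COMPRESS call used in the macro, and the second path and the variable names (path, slash, clean) are made up for the example.

```sas
data _null_;
  length path clean $200 slash $1;

  /* Hypothetical inputs: a Windows-style path with a trailing slash
     and a UNIX-style path without one */
  do path = "E:\99_Test\Test\test1\", "/home/user/study-01/prog";
    /* Position of the first forward slash or backslash; 0 if there is none */
    pos = findc(path, "/\");
    if pos > 0 then slash = substr(path, pos, 1);
    else slash = " ";

    /* Drop one trailing separator so it can be added back explicitly later */
    clean = path;
    if pos > 0 and substr(path, length(path), 1) = slash
      then clean = substr(path, 1, length(path) - 1);

    put slash= clean=;
  end;
run;
```

Because FINDC searches only for slash characters, a path that starts with a character the COMPRESS-based expression does not remove (for example a relative path beginning with a dot) would still yield the slash here.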

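Second, the information returned by the FINFO function depends on the SAS language setting, and a timestamp returned in Chinese is awkward to read in. So the SAS locale is set to English (options locale = EN_US) while FINFO is called, and the previous option value is restored afterwards. Isn't this the same idea as the pairs mentioned in earlier articles, ods listing close; with ods listing; and options mprint; with options nomprint;?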
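Since both the information-item names and the timestamp text depend on the host and the locale, it can help to look at what FINFO actually returns before relying on "Last Modified". A minimal sketch, assuming a file exists at the hypothetical path shown (dm.sas is a made-up file name), lists every available item with FOPTNUM and FOPTNAME:

```sas
data _null_;
  length name $40 value $200;

  /* Hypothetical file; replace with a program that exists on your system */
  rc  = filename("chk", "E:\99_Test\Test\test1\dm.sas");
  fid = fopen("chk");

  if fid > 0 then do;
    /* Loop over the information items available for this file */
    do i = 1 to foptnum(fid);
      name  = foptname(fid, i);
      value = finfo(fid, name);
      put name= value=;
    end;
    rc = fclose(fid);
  end;
  else put "Could not open the file.";
run;
```

If the log shows the timestamp in a form that the DATETIME19. informat cannot read, the informat used in the macro would need to be adjusted accordingly.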

The program for the first part is as follows:

%macro  get_last_mod_date( dirpath =, suffix =, outdt =  );

%if  "&dirpath." ne "" %then %do;

%local dirpath_tmp slash;

%let slash = %substr(%sysfunc(compress(&dirpath., : _ , a d)), 1, 1);

*Remove trailing slash;
%if "%substr(&dirpath.,%length(&dirpath.),1)" = "&slash." %then %let dirpath_tmp=%substr(&dirpath.,1,%length(&dirpath.)-1);
%else %let  dirpath_tmp = &dirpath.;


**Dopen--Get filepath;
data _tmp1;
  length direct filename filepath $200;

  fileres = filename("dirpath", "&dirpath_tmp.");
  dirid = dopen("dirpath");

  if dirid > 0 then do;
    num = dnum(dirid);

    do i = 1 to num;
      direct = "&dirpath_tmp.";
      filename = dread(dirid, i);
      filepath = catx("&slash.", direct, filename);

      *Keep only .sas and .log files;
      if strip(scan(filename, 2, "."))="sas" or strip(scan(filename, 2, "."))="log"  then output;
    end;

    *Close the directory handle opened by dopen;
    dirid_c = dclose(dirid);
  end;

  keep filename filepath;
run;

proc sort data = _tmp1;
  by filename;
run;

*Set SAS language;
 %local locale_sys ;

 %let locale_sys = %sysfunc(getoption(locale));
options locale = EN_US;

**Fopen--Get Last Modified date;
data _tmp2;
  set _tmp1;

  *Get fileID;
  fileres = filename("filepath", filepath);
  fileid = fopen("filepath");

  *Get Last Modified date;
  if fileid > 0 then do;
    length lmdtc $200;
    lmdtc = finfo(fileid, "Last Modified"); 
    if lmdtc ne "" then lmdtm = input(lmdtc, datetime19.);
  end;

  *Close fileID;
  fileid_c = fclose(fileid);
  
  format lmdtm e8601dt.;

  keep filename filepath lmdtc lmdtm;
run;

options locale = &locale_sys.;


**3. Combine lmdtm of .sas and .log file;
proc sql noprint;
  create table &outdt. as
    select scan(a.filename, -2, ".") as domain_&suffix., a.lmdtm as lmdtm_sas_&suffix.,   b.lmdtm as lmdtm_log_&suffix.
  from _tmp2 as a
    left join
    _tmp2 as b
  on scan(a.filename, 1, ".") = scan(b.filename, 1,  ".") and index(a.filename, ".sas") and index(b.filename, ".log")
  where index(a.filename, ".sas")
  ;
quit;

%end;

%else %put Dirpath is missing ! ;

%mend get_last_mod_date;

Pass the two folder paths above to the macro:

*Source;
%get_last_mod_date(
  dirpath = E:\99_Test\Test\test1\
  ,suffix = S
  ,outdt = Source_lmdtm
);

*QC;
%get_last_mod_date(
  dirpath = E:\99_Test\Test\test1\validation
  ,suffix = QC
  ,outdt = QC_lmdtm
);

The results are as follows:

Source_lmdtm
QC_lmdtm

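3. Comparing the last modified times of the files

After the last modified times of the Source-side and QC-side programs and logs have been obtained, the two resulting datasets are combined with a full join, and the comparison outputs the 4 categories of results described in Section 1.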
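To see what the full join contributes, here is a small self-contained sketch on toy data. The domain names and the v_ prefix on the QC side are assumptions for illustration only; the join condition is the same one the macro uses, which matches a QC name either directly or after its first two characters are dropped.

```sas
/* Toy inputs: domain names and the v_ prefix are hypothetical */
data src;
  length domain_S $8;
  input domain_S $;
  datalines;
ae
dm
;
run;

data qc;
  length domain_QC $10;
  input domain_QC $;
  datalines;
v_dm
v_vs
;
run;

/* Full join keeps unmatched rows from both sides */
proc sql;
  select a.domain_S, b.domain_QC
    from src as a
      full join
      qc as b
    on a.domain_S = substr(b.domain_QC, 3) or a.domain_S = b.domain_QC
  ;
quit;
```

With these inputs the result should have three rows: ae with no QC match, dm paired with v_dm, and v_vs with no Source match. The unmatched rows are what let the macro report a program that exists on only one side.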

%macro check_date(resdt=, SourcePath=, QCPath= );

**Get last modified dates of files in each folder;
*Source;
%get_last_mod_date(
  dirpath = &SourcePath.
  ,suffix = S
  ,outdt = Source_lmdtm
);

*QC;
%get_last_mod_date(
  dirpath = &QCPath.
  ,suffix = QC
  ,outdt = QC_lmdtm
);


**Combine source and QC results;
proc sql noprint;
  create table _tmp3 as
    select a.*, b.*
    from source_lmdtm as a
      full join
      QC_lmdtm as b
    on a.domain_S = substr(b.domain_QC, 3) or a.domain_S = b.domain_QC
  ;
quit;


**Create results dataset;
data &resdt.;
  retain domain side resultsn results;

  length domain $64 side $10 results $200;

  set _tmp3;

  *1. SAS missing;
  if missing(lmdtm_sas_s) then do;
    domain = strip(domain_QC);
    side = "Source";
    resultsn = 11;
    results = "Source program for **"||strip(domain)||"** is not created!";

    put "results = " results;
    output;
  end;

  if missing(lmdtm_sas_QC) then do;
    domain = strip(domain_S);
    side = "QC";
    resultsn = 12;
    results = "QC program for **"||strip(domain)||"** is not created!";

    put "results = " results;
    output;
  end;

  *2. Log missing;
if not missing(lmdtm_sas_s)  and missing(lmdtm_log_s) then do;
    domain = strip(domain_S);
    side = "Source";
    resultsn = 21;
    results = "Source program for **"||strip(domain)||"** does not putty run!";

    put "results = " results;
    output;
  end;

  if not missing(lmdtm_sas_QC)  and missing(lmdtm_log_QC) then do;
    domain = strip(domain_QC);
    side = "QC";
    resultsn = 22;
    results = "QC program for **"||strip(domain)||"** does not putty run!";

    put "results = " results;
    output;
  end;

  *3. SAS LM after LOG;
  if lmdtm_sas_s > lmdtm_log_s >. then do;
    domain = strip(domain_S);
    side = "Source";
    resultsn = 31;
    results = "Source program for **"||strip(domain)||"** does not putty run after code update!";

    put "results = " results;
    output;
  end;

  if  lmdtm_sas_QC > lmdtm_log_QC >.  then do;
    domain = strip(domain_QC);
    side = "QC";
    resultsn = 32;
    results = "QC program for **"||strip(domain)||"** does not putty run after code update!";

    put "results = " results;
    output;
  end;

 *4. Source log LM after QC log;
  if  lmdtm_log_S > lmdtm_log_QC >.  then do;
    domain = strip(domain_QC);
    side = "QC";
    resultsn = 41;
    results = "QC program for **"||strip(domain)||"** does not putty run after Source putty run!";

    put "results = " results;
    output;
  end;
run;

%mend check_date;
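The comparisons in categories 3 and 4 use a chained condition of the form a > b > ., which SAS evaluates as a > b and b > .. The guard matters because a missing numeric value sorts below every non-missing value. A minimal sketch with made-up timestamps (sas_dt and log_dt are hypothetical names):

```sas
data _null_;
  /* Hypothetical timestamps: program saved after the log was written */
  sas_dt = "22May2022:10:00:00"dt;
  log_dt = "22May2022:09:30:00"dt;
  flag1 = (sas_dt > log_dt > .);   /* both present, program is newer */

  /* Now pretend the log does not exist */
  log_dt = .;
  flag2 = (sas_dt > log_dt > .);   /* guarded comparison */
  flag3 = (sas_dt > log_dt);       /* unguarded comparison */

  put flag1= flag2= flag3=;
run;
```

Here flag1 is 1, flag2 is 0 and flag3 is 1: without the guard, a program whose log is missing would be reported under category 3 as well as category 2.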

The macro takes three parameters: the output results dataset, the Source folder path, and the QC folder path:

%check_date(
    resdt = check_date_SDTM
    ,SourcePath = E:\99_Test\Test\test1\
    ,QCPath = E:\99_Test\Test\test1\validation
);

The output is:

Check_date_SDTM

4. Complete macro code

%macro check_date(resdt=, SourcePath=, QCPath= );
**Author: Jihai;
**Date: 2022-05-22;

***1. Create a macro to get last modified dates of files in each folder;
%macro  get_last_mod_date( dirpath =, suffix =, outdt =  );

%if  "&dirpath." ne "" %then %do;

%local dirpath_tmp slash;

%let slash = %substr(%sysfunc(compress(&dirpath., : _ , a d)), 1, 1);

*Remove trailing slash;
%if "%substr(&dirpath.,%length(&dirpath.),1)" = "&slash." %then %let dirpath_tmp=%substr(&dirpath.,1,%length(&dirpath.)-1);
%else %let  dirpath_tmp = &dirpath.;


**1.1 Dopen--Get filepath;
data _tmp1;
  length direct filename filepath $200;

  fileres = filename("dirpath", "&dirpath_tmp.");
  dirid = dopen("dirpath");

  if dirid > 0 then do;
    num = dnum(dirid);

    do i = 1 to num;
      direct = "&dirpath_tmp.";
      filename = dread(dirid, i);
      filepath = catx("&slash.", direct, filename);

      *Keep only .sas and .log files;
      if strip(scan(filename, 2, "."))="sas" or strip(scan(filename, 2, "."))="log"   then output;
    end;

    *Close the directory handle opened by dopen;
    dirid_c = dclose(dirid);
  end;

  keep filename filepath;
run;

proc sort data = _tmp1;
  by filename;
run;

*Set SAS language;
 %local locale_sys ;

 %let locale_sys = %sysfunc(getoption(locale));
options locale = EN_US;


**1.2 Fopen--Get Last Modified date;
data _tmp2;
  set _tmp1;

  *Get fileID;
  fileres = filename("filepath", filepath);
  fileid = fopen("filepath");

  *Get Last Modified date;
  if fileid > 0 then do;
    length lmdtc $200;
    lmdtc = finfo(fileid, "Last Modified"); 
    if lmdtc ne "" then lmdtm = input(lmdtc, datetime19.);
  end;

  *Close fileID;
  fileid_c = fclose(fileid);
  
  format lmdtm e8601dt.;

  keep filename filepath lmdtc lmdtm;
run;

options locale = &locale_sys.;


**1.3 Combine lmdtm of .sas and .log file;
proc sql noprint;
  create table &outdt. as
    select scan(a.filename, -2, ".") as domain_&suffix., a.lmdtm as lmdtm_sas_&suffix.,   b.lmdtm as lmdtm_log_&suffix.
  from _tmp2 as a
    left join
    _tmp2 as b
  on scan(a.filename, 1, ".") = scan(b.filename, 1,  ".") and index(a.filename, ".sas") and index(b.filename, ".log")
  where index(a.filename, ".sas")
  ;
quit;

%end;

%else %put Dirpath is missing ! ;

%mend get_last_mod_date;

*Source;
%get_last_mod_date(
  dirpath = &SourcePath.
  ,suffix = S
  ,outdt = Source_lmdtm
);

*QC;
%get_last_mod_date(
  dirpath = &QCPath.
  ,suffix = QC
  ,outdt = QC_lmdtm
);


***2. Combine source and QC results;
proc sql noprint;
  create table _tmp3 as
    select a.*, b.*
    from source_lmdtm as a
      full join
      QC_lmdtm as b
    on a.domain_S = substr(b.domain_QC, 3) or a.domain_S = b.domain_QC
  ;
quit;


***3. Create results dataset;
data &resdt.;
  retain domain side resultsn results;

  length domain $64 side $10 results $200;

  set _tmp3;

  **3.1 SAS missing;
  if missing(lmdtm_sas_s) then do;
    domain = strip(domain_QC);
    side = "Source";
    resultsn = 11;
    results = "Source program for **"||strip(domain)||"** is not created!";

    put "results = " results;
    output;
  end;

  if missing(lmdtm_sas_QC) then do;
    domain = strip(domain_S);
    side = "QC";
    resultsn = 12;
    results = "QC program for  **"||strip(domain)||"** is not created!";

    put "results = " results;
    output;
  end;

  **3.2 Log missing;
if not missing(lmdtm_sas_s)  and missing(lmdtm_log_s) then do;
    domain = strip(domain_S);
    side = "Source";
    resultsn = 21;
    results = "Source program for **"||strip(domain)||"** does not putty run!";

    put "results = " results;
    output;
  end;

  if not missing(lmdtm_sas_QC)  and missing(lmdtm_log_QC) then do;
    domain = strip(domain_QC);
    side = "QC";
    resultsn = 22;
    results = "QC program for **"||strip(domain)||"** does not putty run!";

    put "results = " results;
    output;
  end;

  **3.3 SAS LM after LOG;
  if lmdtm_sas_s > lmdtm_log_s >. then do;
    domain = strip(domain_S);
    side = "Source";
    resultsn = 31;
    results = "Source program for **"||strip(domain)||"** does not putty run after code update!";

    put "results = " results;
    output;
  end;

  if  lmdtm_sas_QC > lmdtm_log_QC >.  then do;
    domain = strip(domain_QC);
    side = "QC";
    resultsn = 32;
    results = "QC program for **"||strip(domain)||"** does not putty run after code update!";

    put "results = " results;
    output;
  end;

 **3.4 Source log LM after QC log;
  if  lmdtm_log_S > lmdtm_log_QC >.  then do;
    domain = strip(domain_QC);
    side = "QC";
    resultsn = 41;
    results = "QC program for **"||strip(domain)||"** does not putty run after Source putty run!";

    put "results = " results;
    output;
  end;
run;

%mend check_date;

***Invoke the macro;
%check_date(
    resdt = check_date_SDTM
    ,SourcePath = E:\99_Test\Test\test1\
    ,QCPath = E:\99_Test\Test\test1\validation
);

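Summary

The key to this macro is getting the last modified time of every file in a given folder, which relies on the Dopen and Fopen families of functions.

Related reading:
SAS Programming: Introduction to the Dopen Family of Functions
SAS Programming: Introduction to the Fopen Family of Functions

Thanks for reading, and feel free to follow!
If you have any questions, please leave a comment.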
