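Background: Sometimes we want to see whether users in different score ranges behave differently across various features. When grouping users, besides defining the ranges by hand with PROC FORMAT, we can also use PROC RANK and PROC FORMAT together with quantiles and other statistics of the score (or of any other data) to group and rank automatically.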
PROC RANK
proc rank data=test out=r_test; /* out= names the output data set */
var spend; /* rank the values of spend */
ranks r_spend; /* store the ranks in a new variable named r_spend */
run;
PROC UNIVARIATE
proc univariate data=events noprint;
var neg_score;
output out=p pctlpre=P_ /* percentile variable names get the prefix P_ */
pctlpts=10 to 100 by 10;
weight SamplingWeight;
run;
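/* Reshape the one-row percentile data set to one row per percentile (value in COL1) */
proc transpose data=p out=pt;
run;
/* Sort the cutpoints and drop duplicated values */
proc sort data=pt
nodupkey force noequals;
by COL1;
run;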
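The percentile cutpoints can then be used to assign each observation to a group. The following is only a minimal sketch of one possible way to do this, not necessarily the approach intended above: it reads the one-row data set p directly rather than the transposed pt, and the names binned and bin are hypothetical, introduced for illustration.

```sas
/* Sketch: assign each row of events to a bin using the cutpoints in p. */
/* Assumes p holds one row with P_10, P_20, ..., P_100 in that order.   */
/* binned and bin are hypothetical names.                               */
data binned;
    if _n_ = 1 then set p;      /* load the cutpoints once; they are retained */
    set events;
    array cut{*} P_:;           /* all percentile variables, in data set order */
    bin = .;
    if not missing(neg_score) then do;
        /* stop at the first cutpoint that is >= the score */
        do i = 1 to dim(cut) while (missing(bin));
            if neg_score <= cut{i} then bin = i;
        end;
    end;
    drop i P_:;
run;
```

Rows with a missing neg_score keep a missing bin, so they can be reported separately instead of falling into the lowest group.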
PROC RANK can also generate deciles, quartiles, percentiles, or other groups from numeric variables. The GROUPS= option specifies the number of bins: GROUPS=10 creates deciles, GROUPS=4 creates quartiles, and GROUPS=100 creates percentiles.
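As a minimal sketch of the GROUPS= option, the step below splits spend into deciles. It reuses the test data set and the spend variable from the PROC RANK example; d_test and decile_spend are hypothetical names chosen for illustration.

```sas
/* Sketch: assign each observation to a decile of spend.      */
/* d_test and decile_spend are hypothetical names.            */
proc rank data=test out=d_test groups=10;
    var spend;
    ranks decile_spend;   /* group numbers run from 0 to 9 */
run;

/* Summarize spend within each decile group. */
proc means data=d_test n mean min max;
    class decile_spend;
    var spend;
run;
```

With GROUPS=n the group numbers start at 0, so the lowest decile is 0 and the highest is 9. Tied values are placed in the same group, so the groups may not be exactly equal in size.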