##[開源]HIVE數(shù)倉數(shù)據(jù)血緣分析工具-SQL解析

HIVE數(shù)倉數(shù)據(jù)血緣分析工具-SQL解析 - thomas0yang的專欄 - 博客頻道 - CSDN.NET http://blog.csdn.net/thomas0yang/article/details/49449723

sql解析代碼_bak - 下載頻道 - CSDN.NET http://download.csdn.net/download/thomas0yang/9369949

一、數(shù)倉經(jīng)常會碰到的幾類問題: 1、兩個數(shù)據(jù)報表進(jìn)行對比,結(jié)果差異很大,需要人工核對分析指標(biāo)的維度信息,比如從頭分析數(shù)據(jù)指標(biāo)從哪里來,處理條件是什么,最后才能分析出問題原因。 2、基礎(chǔ)數(shù)據(jù)表因某種原因需要修改字段,需要評估其對數(shù)倉的影響,費時費力,然后在做方案。
二、問題分析: 數(shù)據(jù)源長途跋涉,經(jīng)過大量的處理和組件來傳遞,呈現(xiàn)在業(yè)務(wù)用戶面前,對數(shù)據(jù)進(jìn)行回溯其實很難。元數(shù)據(jù)回溯在有效決策、策略制定、差異分析等過程中很重要。這兩類問題都屬于數(shù)據(jù)血緣分析問題,第一類叫做數(shù)據(jù)回溯、第二類叫做影響分析,是數(shù)據(jù)回溯的逆向。
三、解決方法: 自己實現(xiàn)了一套基于Hive數(shù)倉的數(shù)據(jù)血緣分析工具,來完成各個數(shù)據(jù)表、字段之間的關(guān)系梳理,進(jìn)而解決上面兩個問題。
工具主要目標(biāo):解析計算腳本中的HQL語句,分析得到輸入輸出表、輸入輸出字段和相應(yīng)的處理條件,進(jìn)行分析展現(xiàn)。
實現(xiàn)思路:對AST深度優(yōu)先遍歷,遇到操作的token則判斷當(dāng)前的操作,遇到子句則壓棧當(dāng)前處理,處理子句。子句處理完,棧彈出。處理字句的過程中,遇到子查詢就保存當(dāng)前子查詢的信息,判斷與其父查詢的關(guān)系,最終形成樹形結(jié)構(gòu); 遇到字段或者條件處理則記錄當(dāng)前的字段和條件信息、組成Block,嵌套調(diào)用。
關(guān)鍵點解析: 1、遇到TOK_TAB或TOK_TABREF則判斷出當(dāng)前操作的表 2、壓棧判斷是否是join,判斷join條件 3、定義數(shù)據(jù)結(jié)構(gòu)Block,遇到在where\select\join時獲得其下相應(yīng)的字段和條件,組成Block 4、定義數(shù)據(jù)結(jié)構(gòu)ColLine,遇到TOK_SUBQUERY保存當(dāng)前的子查詢信息,供父查詢使用 5、定義數(shù)據(jù)結(jié)構(gòu)ColLine,遇到TOK_UNION結(jié)束時,合并并截斷當(dāng)前的列信息 6、遇到select 或者未明確指出的字段,查詢元數(shù)據(jù)進(jìn)行輔助分析 7、解析結(jié)果進(jìn)行相關(guān)校驗

代碼如下: Block類
package com.xiaoju.products.parse;import java.util.ArrayList;import java.util.HashMap;import java.util.HashSet;import java.util.List;import java.util.Map;import java.util.Set;import java.util.Stack;import java.util.Map.Entry;import java.util.LinkedHashSet;import org.antlr.runtime.tree.Tree;import org.apache.hadoop.hive.ql.parse.ASTNode;import org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer;import org.apache.hadoop.hive.ql.parse.HiveParser;import org.apache.hadoop.hive.ql.parse.ParseDriver;import com.xiaoju.products.bean.Block;import com.xiaoju.products.bean.ColLine;import com.xiaoju.products.bean.QueryTree;import com.xiaoju.products.exception.SQLParseException;import com.xiaoju.products.exception.UnSupportedException;import com.xiaoju.products.util.Check;import com.xiaoju.products.util.MetaCache;import com.xiaoju.products.util.NumberUtil;import com.xiaoju.products.util.ParseUtil;import com.xiaoju.products.util.PropertyFileUtil;/** * hive sql解析類 * * 目的:實現(xiàn)HQL的語句解析,分析出輸入輸出表、字段和相應(yīng)的處理條件。為字段級別的數(shù)據(jù)血緣提供基礎(chǔ)。 * 重點:獲取SELECT操作中的表和列的相關(guān)操作。其他操作這判斷到字段級別。 * 實現(xiàn)思路:對AST深度優(yōu)先遍歷,遇到操作的token則判斷當(dāng)前的操作,遇到子句則壓棧當(dāng)前處理,處理子句。子句處理完,棧彈出。 * 處理字句的過程中,遇到子查詢就保存當(dāng)前子查詢的信息,判斷與其父查詢的關(guān)系,最終形成樹形結(jié)構(gòu); * 遇到字段或者條件處理則記錄當(dāng)前的字段和條件信息、組成Block,嵌套調(diào)用。 * 關(guān)鍵點解析 * 1、遇到TOK_TAB或TOK_TABREF則判斷出當(dāng)前操作的表 * 2、壓棧判斷是否是join,判斷join條件 * 3、定義數(shù)據(jù)結(jié)構(gòu)Block,遇到在where\select\join時獲得其下相應(yīng)的字段和條件,組成Block * 4、定義數(shù)據(jù)結(jié)構(gòu)ColLine,遇到TOK_SUBQUERY保存當(dāng)前的子查詢信息,供父查詢使用 * 5、定義數(shù)據(jù)結(jié)構(gòu)ColLine,遇到TOK_UNION結(jié)束時,合并并截斷當(dāng)前的列信息 * 6、遇到select  或者未明確指出的字段,查詢元數(shù)據(jù)進(jìn)行輔助分析 * 7、解析結(jié)果進(jìn)行相關(guān)校驗 * 試用范圍: * 1、支持標(biāo)準(zhǔn)SQL * 2、不支持transform using script * * @author yangyangthomas * /public class LineParser { private static final String SPLIT_DOT = "."; private static final String SPLIT_COMMA = ","; private static final String SPLIT_AND = "&"; private static final String TOK_EOF = "<EOF>"; private static final String CON_WHERE = "WHERE:"; private static final String TOK_TMP_FILE = "TOK_TMP_FILE"; private Map<String /table/, List<String/column/>> dbMap = new HashMap<String, List<String>>(); private List<QueryTree> queryTreeList = new ArrayList<QueryTree>(); //子查詢樹形關(guān)系保存 private Stack<Set<String>> conditionsStack = new Stack<Set<String>>(); private Stack<List<ColLine>> colsStack = new Stack<List<ColLine>>(); private Map<String, List<ColLine>> resultQueryMap = new HashMap<String, List<ColLine>>(); private Set<String> conditions = new HashSet<String>(); //where or join 條件緩存 private List<ColLine> cols = new ArrayList<ColLine>(); //一個子查詢內(nèi)的列緩存 private Stack<String> tableNameStack = new Stack<String>(); private Stack<Boolean> joinStack = new Stack<Boolean>(); private Stack<ASTNode> joinOnStack = new Stack<ASTNode>(); private Map<String, QueryTree> queryMap = new HashMap<String, QueryTree>(); private boolean joinClause = false; private ASTNode joinOn = null; private String nowQueryDB = "default"; //hive的默認(rèn)庫 private boolean isCreateTable = false; //結(jié)果 private List<ColLine> colLines = new ArrayList<ColLine>(); private Set<String> outputTables = new HashSet<String>(); private Set<String> inputTables = new HashSet<String>(); private List<ColLine> tmpColLines = new ArrayList<ColLine>(); private Set<String> tmpOutputTables = new HashSet<String>(); private Set<String> tmpInputTables = new HashSet<String>(); public List<ColLine> getColLines() { return colLines; } public Set<String> getOutputTables() { return outputTables; } public Set<String> getInputTables() { return inputTables; } private void parseIteral(ASTNode ast) { prepareToParseCurrentNodeAndChilds(ast); parseChildNodes(ast); parseCurrentNode(ast); endParseCurrentNode(ast); } /** * 解析當(dāng)前節(jié)點 * @param ast * @param set * @return / private void parseCurrentNode(ASTNode ast){ if (ast.getToken() != null) { switch (ast.getToken().getType()) { case HiveParser.TOK_CREATETABLE: //outputtable isCreateTable = true; String tableOut = fillDB(BaseSemanticAnalyzer.getUnescapedName((ASTNode) ast.getChild(0))); tmpOutputTables.add(tableOut); MetaCache.getInstance().init(tableOut); //初始化數(shù)據(jù),供以后使用 break; case HiveParser.TOK_TAB:// outputTable String tableTab = BaseSemanticAnalyzer.getUnescapedName((ASTNode) ast.getChild(0)); String tableOut2 = fillDB(tableTab); tmpOutputTables.add(tableOut2); MetaCache.getInstance().init(tableOut2); //初始化數(shù)據(jù),供以后使用 break; case HiveParser.TOK_TABREF:// inputTable ASTNode tabTree = (ASTNode) ast.getChild(0); String tableInFull = fillDB((tabTree.getChildCount() == 1) ? BaseSemanticAnalyzer.getUnescapedName((ASTNode) tabTree.getChild(0)) : BaseSemanticAnalyzer.getUnescapedName((ASTNode) tabTree.getChild(0)) + SPLIT_DOT + BaseSemanticAnalyzer.getUnescapedName((ASTNode) tabTree.getChild(1)) ); String tableIn = tableInFull.substring(tableInFull.indexOf(SPLIT_DOT) + 1); tmpInputTables.add(tableInFull); MetaCache.getInstance().init(tableInFull); //初始化數(shù)據(jù),供以后使用 queryMap.clear(); String alia = null; if (ast.getChild(1) != null) { //(TOK_TABREF (TOK_TABNAME detail usersequence_client) c) alia = ast.getChild(1).getText().toLowerCase(); QueryTree qt = new QueryTree(); qt.setCurrent(alia); qt.getTableSet().add(tableInFull); QueryTree pTree = getSubQueryParent(ast); qt.setpId(pTree.getpId()); qt.setParent(pTree.getParent()); queryTreeList.add(qt); if (joinClause && ast.getParent() == joinOn) { // TOK_SUBQUERY join TOK_TABREF ,此處的TOK_SUBQUERY信息不應(yīng)該清楚 for (QueryTree entry : queryTreeList) { //當(dāng)前的查詢范圍 if (qt.getParent().equals(entry.getParent())) { queryMap.put(entry.getCurrent(), entry); } } } else { queryMap.put(qt.getCurrent(), qt); } } else { alia = tableIn.toLowerCase(); QueryTree qt = new QueryTree(); qt.setCurrent(alia); qt.getTableSet().add(tableInFull); QueryTree pTree = getSubQueryParent(ast); qt.setpId(pTree.getpId()); qt.setParent(pTree.getParent()); queryTreeList.add(qt); if (joinClause && ast.getParent() == joinOn) { for (QueryTree entry : queryTreeList) { if (qt.getParent().equals(entry.getParent())) { queryMap.put(entry.getCurrent(), entry); } } } else { queryMap.put(qt.getCurrent(), qt); //此處檢查查詢 select app.t1.c1,t1.c1 from t1 的情況 queryMap.put(tableInFull.toLowerCase(), qt); } } break; case HiveParser.TOK_SUBQUERY: if (ast.getChildCount() == 2) { String tableAlias = BaseSemanticAnalyzer.unescapeIdentifier(ast.getChild(1).getText()); String aliaReal = ""; if(aliaReal.length() !=0){ aliaReal = aliaReal.substring(0, aliaReal.length()-1); } QueryTree qt = new QueryTree(); qt.setCurrent(tableAlias.toLowerCase()); qt.setColLineList(generateColLineList(cols, conditions)); QueryTree pTree = getSubQueryParent(ast); qt.setId(generateTreeId(ast)); qt.setpId(pTree.getpId()); qt.setParent(pTree.getParent()); qt.setChildList(getSubQueryChilds(qt.getId())); if (Check.notEmpty(qt.getChildList())) { for (QueryTree cqt : qt.getChildList()) { qt.getTableSet().addAll(cqt.getTableSet()); queryTreeList.remove(cqt); // 移除子節(jié)點信息 } } queryTreeList.add(qt); cols.clear(); queryMap.clear(); for (QueryTree _qt : queryTreeList) { if (qt.getParent().equals( _qt.getParent())) { //當(dāng)前子查詢才保存 queryMap.put(_qt.getCurrent(), _qt); } } } break; case HiveParser.TOK_SELEXPR: //輸入輸出字段的處理 /* * (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) * (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) * * (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) * (TOK_SELECT * (TOK_SELEXPR (. (TOK_TABLE_OR_COL p) datekey) datekey) * (TOK_SELEXPR (TOK_TABLE_OR_COL datekey)) * (TOK_SELEXPR (TOK_FUNCTIONDI count (. (TOK_TABLE_OR_COL base) userid)) buyer_count)) * (TOK_SELEXPR (TOK_FUNCTION when (> (. (TOK_TABLE_OR_COL base) userid) 5) (. (TOK_TABLE_OR_COL base) clienttype) (> (. (TOK_TABLE_OR_COL base) userid) 1) (+ (. (TOK_TABLE_OR_COL base) datekey) 5) (+ (. (TOK_TABLE_OR_COL base) clienttype) 1)) bbbaaa) / //解析需要插入的表 Tree tok_insert = ast.getParent().getParent(); Tree child = tok_insert.getChild(0).getChild(0); String tName = BaseSemanticAnalyzer.getUnescapedName((ASTNode) child.getChild(0)); String destTable = TOK_TMP_FILE.equals(tName) ? TOK_TMP_FILE : fillDB(tName); //select a.,* from t1 和 select * from (select c1 as a,c2 from t1) t 的情況 if (ast.getChild(0).getType() == HiveParser.TOK_ALLCOLREF) { String tableOrAlias = ""; if (ast.getChild(0).getChild(0) != null) { tableOrAlias = ast.getChild(0).getChild(0).getChild(0).getText(); } String[] result = getTableAndAlia(tableOrAlias); String _alia = result[1]; boolean isSub = false; //處理嵌套select * 的情況 if (Check.notEmpty(_alia)) { for (String string : _alia.split(SPLIT_AND)) { //迭代循環(huán)的時候查詢 QueryTree qt = queryMap.get(string.toLowerCase()); if (null != qt) { List<ColLine> colLineList = qt.getColLineList(); if (Check.notEmpty(colLineList)) { isSub = true; for (ColLine colLine : colLineList) { cols.add(colLine); } } } } } if (!isSub) { //處理直接select * 的情況 String nowTable = result[0]; String[] tableArr = nowTable.split(SPLIT_AND); //fact.test&test2 for (String tables : tableArr) { String[] split = tables.split("\."); if (split.length > 2) { throw new SQLParseException("parse table:" + nowTable); } List<String> colByTab = MetaCache.getInstance().getColumnByDBAndTable(tables); for (String column : colByTab) { Set<String> fromNameSet = new LinkedHashSet<String>(); fromNameSet.add(tables + SPLIT_DOT + column); ColLine cl = new ColLine(column, tables + SPLIT_DOT + column, fromNameSet, new LinkedHashSet<String>() , destTable, column); cols.add(cl); } } } } else { Block bk = getBlockIteral((ASTNode)ast.getChild(0)); String toNameParse = getToNameParse(ast, bk); Set<String> fromNameSet = filterData(bk.getColSet()); ColLine cl = new ColLine(toNameParse, bk.getCondition(), fromNameSet, new LinkedHashSet<String>() , destTable, ""); cols.add(cl); } break; case HiveParser.TOK_WHERE: //3、過濾條件的處理select類 conditions.add(CON_WHERE + getBlockIteral((ASTNode) ast.getChild(0)).getCondition()); break; default: /** * (or * (> (. (TOK_TABLE_OR_COL p) orderid) (. (TOK_TABLE_OR_COL c) orderid)) * (and (= (. (TOK_TABLE_OR_COL p) a) (. (TOK_TABLE_OR_COL c) b)) * (= (. (TOK_TABLE_OR_COL p) aaa) (. (TOK_TABLE_OR_COL c) bbb)))) / //1、過濾條件的處理join類 if (joinOn != null && joinOn.getTokenStartIndex() == ast.getTokenStartIndex() && joinOn.getTokenStopIndex() == ast.getTokenStopIndex()) { ASTNode astCon = (ASTNode)ast.getChild(2); conditions.add(ast.getText().substring(4) + ":" + getBlockIteral(astCon).getCondition()); break; } } } } /* * 查找當(dāng)前節(jié)點的父子查詢節(jié)點 * @param ast / private QueryTree getSubQueryParent(Tree ast) { Tree _tree = ast; QueryTree qt = new QueryTree(); while(!(_tree = _tree.getParent()).isNil()){ if(_tree.getType() == HiveParser.TOK_SUBQUERY){ qt.setpId(generateTreeId(_tree)); qt.setParent(BaseSemanticAnalyzer.getUnescapedName((ASTNode)_tree.getChild(1))); return qt; } } qt.setpId(-1); qt.setParent("NIL"); return qt; } private int generateTreeId(Tree tree) { return tree.getTokenStartIndex() + tree.getTokenStopIndex(); } /* * 查找當(dāng)前節(jié)點的子子查詢節(jié)點(索引) * @param ast / private List<QueryTree> getSubQueryChilds(int id) { List<QueryTree> list = new ArrayList<QueryTree>(); for (int i = 0; i < queryTreeList.size(); i++) { QueryTree qt = queryTreeList.get(i); if (id == qt.getpId()) { list.add(qt); } } return list; } /* * 獲得要解析的名稱 * @param ast * @param bk * @return / private String getToNameParse(ASTNode ast, Block bk) { String alia = ""; Tree child = ast.getChild(0); if (ast.getChild(1) != null) { //有別名 ip as alia alia = ast.getChild(1).getText(); } else if (child.getType() == HiveParser.DOT //沒有別名 a.ip && child.getChild(0).getType() == HiveParser.TOK_TABLE_OR_COL && child.getChild(0).getChildCount() == 1 && child.getChild(1).getType() == HiveParser.Identifier) { alia = BaseSemanticAnalyzer.unescapeIdentifier(child.getChild(1).getText()); } else if (child.getType() == HiveParser.TOK_TABLE_OR_COL //沒有別名 ip && child.getChildCount() == 1 && child.getChild(0).getType() == HiveParser.Identifier) { alia = BaseSemanticAnalyzer.unescapeIdentifier(child.getChild(0).getText()); } return alia; } /* * 獲得解析的塊,主要應(yīng)用在WHERE、JOIN和SELECT端 * 如: <p>where a=1 * <p>t1 join t2 on t1.col1=t2.col1 and t1.col2=123 * <p>select count(distinct col1) from t1 * @param ast * @return / private Block getBlockIteral(ASTNode ast) { if (ast.getType() == HiveParser.KW_OR ||ast.getType() == HiveParser.KW_AND) { Block bk1 = getBlockIteral((ASTNode)ast.getChild(0)); Block bk2 = getBlockIteral((ASTNode)ast.getChild(1)); bk1.getColSet().addAll(bk2.getColSet()); bk1.setCondition("(" + bk1.getCondition() + " " + ast.getText() + " " + bk2.getCondition() + ")"); return bk1; } else if (ast.getType() == HiveParser.NOTEQUAL //判斷條件 > < like in || ast.getType() == HiveParser.EQUAL || ast.getType() == HiveParser.LESSTHAN || ast.getType() == HiveParser.LESSTHANOREQUALTO || ast.getType() == HiveParser.GREATERTHAN || ast.getType() == HiveParser.GREATERTHANOREQUALTO || ast.getType() == HiveParser.KW_LIKE || ast.getType() == HiveParser.DIVIDE || ast.getType() == HiveParser.PLUS || ast.getType() == HiveParser.MINUS || ast.getType() == HiveParser.STAR || ast.getType() == HiveParser.MOD || ast.getType() == HiveParser.AMPERSAND || ast.getType() == HiveParser.TILDE || ast.getType() == HiveParser.BITWISEOR || ast.getType() == HiveParser.BITWISEXOR) { Block bk1 = getBlockIteral((ASTNode)ast.getChild(0)); if (ast.getChild(1) == null) { // -1 bk1.setCondition(ast.getText() + bk1.getCondition()); } else { Block bk2 = getBlockIteral((ASTNode)ast.getChild(1)); bk1.getColSet().addAll(bk2.getColSet()); bk1.setCondition(bk1.getCondition() + " " + ast.getText() + " " + bk2.getCondition()); } return bk1; } else if (ast.getType() == HiveParser.TOK_FUNCTIONDI) { Block col = getBlockIteral((ASTNode) ast.getChild(1)); String condition = ast.getChild(0).getText(); col.setCondition(condition + "(distinct (" + col.getCondition() +"))"); return col; } else if (ast.getType() == HiveParser.TOK_FUNCTION){ String fun = ast.getChild(0).getText(); Block col = ast.getChild(1) == null ? new Block() : getBlockIteral((ASTNode) ast.getChild(1)); if ("when".equalsIgnoreCase(fun)) { col.setCondition(getWhenCondition(ast)); Set<Block> processChilds = processChilds(ast, 1); col.getColSet().addAll(bkToCols(col, processChilds)); return col; } else if("IN".equalsIgnoreCase(fun)) { col.setCondition(col.getCondition() + " in (" + blockCondToString(processChilds(ast, 2)) + ")"); return col; } else if("TOK_ISNOTNULL".equalsIgnoreCase(fun) //isnull isnotnull || "TOK_ISNULL".equalsIgnoreCase(fun)){ col.setCondition(col.getCondition() + " " + fun.toLowerCase().substring(4)); return col; } else if("BETWEEN".equalsIgnoreCase(fun)){ col.setCondition(getBlockIteral((ASTNode) ast.getChild(2)).getCondition() + " between " + getBlockIteral((ASTNode) ast.getChild(3)).getCondition() + " and " + getBlockIteral((ASTNode) ast.getChild(4)).getCondition()); return col; } Set<Block> processChilds = processChilds(ast, 1); col.getColSet().addAll(bkToCols(col, processChilds)); col.setCondition(fun +"("+ blockCondToString(processChilds) + ")"); return col; } else if(ast.getType() == HiveParser.LSQUARE){ //map,array Block column = getBlockIteral((ASTNode) ast.getChild(0)); Block key = getBlockIteral((ASTNode) ast.getChild(1)); column.setCondition(column.getCondition() +"["+ key.getCondition() + "]"); return column; } else { return parseBlock(ast); } } private Set<String> bkToCols(Block col, Set<Block> processChilds) { Set<String> set = new LinkedHashSet<String>(processChilds.size()); for (Block colLine : processChilds) { if (Check.notEmpty(colLine.getColSet())) { set.addAll(colLine.getColSet()); } } return set; } private String blockCondToString(Set<Block> processChilds) { StringBuilder sb = new StringBuilder(); for (Block colLine : processChilds) { sb.append(colLine.getCondition()).append(SPLIT_COMMA); } if (sb.length()>0) { sb.setLength(sb.length()-1); } return sb.toString(); } /* * 解析when條件 * @param ast * @return case when c1>100 then col1 when c1>0 col2 else col3 end / private String getWhenCondition(ASTNode ast) { int cnt = ast.getChildCount(); StringBuilder sb = new StringBuilder(); for (int i = 1; i < cnt; i++) { String condition = getBlockIteral((ASTNode)ast.getChild(i)).getCondition(); if (i == 1) { sb.append("(case when " + condition); } else if (i == cnt-1) { //else sb.append(" else " + condition + " end)"); } else if (i % 2 == 0){ //then sb.append(" then " + condition); } else { sb.append(" when " + condition); } } return sb.toString(); } /* * 保存subQuery查詢別名和字段信息 * @param sqlIndex * @param tableAlias / private void putResultQueryMap(int sqlIndex, String tableAlias) { List<ColLine> list = generateColLineList(cols, conditions); String key = sqlIndex == 0 ? tableAlias : tableAlias + sqlIndex; //沒有重名的情況就不用標(biāo)記 resultQueryMap.put(key, list); } private List<ColLine> generateColLineList(List<ColLine> cols, Set<String> conditions) { List<ColLine> list = new ArrayList<ColLine>(); for (ColLine entry : cols) { entry.getConditionSet().addAll(conditions); list.add(ParseUtil.cloneColLine(entry)); } return list; } /* * 判斷正常列, * 正常:a as col, a * 異常:1 ,'a' //數(shù)字、字符等作為列名 / private boolean notNormalCol(String column) { return Check.isEmpty(column) || NumberUtil.isNumeric(column) || (column.startsWith(""") && column.endsWith(""")) || (column.startsWith("'") && column.endsWith("'")); } /* * 從指定索引位置開始解析子樹 * @param ast * @param startIndex 開始索引 * @param isSimple 是否簡寫 * @param withCond 是否包含條件 * @return / private Set<Block> processChilds(ASTNode ast,int startIndex) { int cnt = ast.getChildCount(); Set<Block> set = new LinkedHashSet<Block>(); for (int i = startIndex; i < cnt; i++) { Block bk = getBlockIteral((ASTNode) ast.getChild(i)); if (Check.notEmpty(bk.getCondition()) || Check.notEmpty(bk.getColSet())){ set.add(bk); } } return set; } /* * 解析獲得列名或者字符數(shù)字等和條件 * @param ast * @param isSimple * @return / private Block parseBlock(ASTNode ast) { if (ast.getType() == HiveParser.DOT && ast.getChild(0).getType() == HiveParser.TOK_TABLE_OR_COL && ast.getChild(0).getChildCount() == 1 && ast.getChild(1).getType() == HiveParser.Identifier) { String column = BaseSemanticAnalyzer.unescapeIdentifier(ast.getChild(1).getText()); String alia = BaseSemanticAnalyzer.unescapeIdentifier(ast.getChild(0).getChild(0).getText()); return getBlock(column, alia); } else if (ast.getType() == HiveParser.TOK_TABLE_OR_COL && ast.getChildCount() == 1 && ast.getChild(0).getType() == HiveParser.Identifier) { String column = ast.getChild(0).getText(); return getBlock(column, null); } else if (ast.getType() == HiveParser.Number || ast.getType() == HiveParser.StringLiteral || ast.getType() == HiveParser.Identifier) { Block bk = new Block(); bk.setCondition(ast.getText()); bk.getColSet().add(ast.getText()); return bk; } return new Block(); } /* * 根據(jù)列名和別名獲得塊信息 * @param column * @param alia * @param tree 當(dāng)前子查詢下的別名可以使用 * @return / private Block getBlock(String column, String alia) { String[] result = getTableAndAlia(alia); String tableArray = result[0]; String _alia = result[1]; for (String string : _alia.split(SPLIT_AND)) { //迭代循環(huán)的時候查詢 QueryTree qt = queryMap.get(string.toLowerCase()); if (Check.notEmpty(column)) { for (ColLine colLine : qt.getColLineList()) { if (column.equalsIgnoreCase(colLine.getToNameParse())) { Block bk = new Block(); bk.setCondition(colLine.getColCondition()); bk.setColSet(ParseUtil.cloneSet(colLine.getFromNameSet())); return bk; } } } } String _realTable = tableArray; int cnt = 0; //匹配字段和元數(shù)據(jù)字段相同數(shù)目,如果有多個匹配,即此sql有二義性 for (String tables : tableArray.split(SPLIT_AND)) { //初始化的時候查詢數(shù)據(jù)庫對應(yīng)表 String[] split = tables.split("\."); if (split.length > 2) { throw new SQLParseException("parse table:" + tables); } List<String> colByTab = MetaCache.getInstance().getColumnByDBAndTable(tables); for (String col : colByTab) { if (column.equalsIgnoreCase(col)) { _realTable = tables; cnt++; } } }// if (cnt == 0) { //此類沒有找到的檢查在Validater類中檢查// } if (cnt > 1) { //二義性檢查 throw new SQLParseException("SQL is ambiguity, column: " + column + " tables:" + tableArray); } Block bk = new Block(); bk.setCondition(_realTable + SPLIT_DOT + column); bk.getColSet().add(_realTable + SPLIT_DOT + column); return bk; } /* * 過濾掉無用的列:如col1,123,'2013',col2 ==>> col1,col2 * @param col * @return / private Set<String> filterData(Set<String> colSet){ Set<String> set = new LinkedHashSet<String>(); for (String string : colSet) { if (!notNormalCol(string)) { set.add(string); } } return set; } /* * 解析所有子節(jié)點 * @param ast * @return / private void parseChildNodes(ASTNode ast){ int numCh = ast.getChildCount(); if (numCh > 0) { for (int num = 0; num < numCh; num++) { ASTNode child = (ASTNode) ast.getChild(num); parseIteral(child); } } } /* * 準(zhǔn)備解析當(dāng)前節(jié)點 * @param ast / private void prepareToParseCurrentNodeAndChilds(ASTNode ast){ if (ast.getToken() != null) { switch (ast.getToken().getType()) { case HiveParser.TOK_SWITCHDATABASE: System.out.println("nowQueryDB changed " + nowQueryDB+ " to " +ast.getChild(0).getText()); nowQueryDB = ast.getChild(0).getText(); break; case HiveParser.TOK_TRANSFORM: throw new UnSupportedException("no support transform using clause"); case HiveParser.TOK_RIGHTOUTERJOIN: case HiveParser.TOK_LEFTOUTERJOIN: case HiveParser.TOK_JOIN: case HiveParser.TOK_LEFTSEMIJOIN: case HiveParser.TOK_MAPJOIN: case HiveParser.TOK_FULLOUTERJOIN: case HiveParser.TOK_UNIQUEJOIN: joinStack.push(joinClause); joinClause = true; joinOnStack.push(joinOn); joinOn = ast; break; } } } /* * 結(jié)束解析當(dāng)前節(jié)點 * @param ast / private void endParseCurrentNode(ASTNode ast){ if (ast.getToken() != null) { Tree parent = ast.getParent(); switch (ast.getToken().getType()) { //join 從句結(jié)束,跳出join case HiveParser.TOK_RIGHTOUTERJOIN: case HiveParser.TOK_LEFTOUTERJOIN: case HiveParser.TOK_JOIN: case HiveParser.TOK_LEFTSEMIJOIN: case HiveParser.TOK_MAPJOIN: case HiveParser.TOK_FULLOUTERJOIN: case HiveParser.TOK_UNIQUEJOIN: joinClause = joinStack.pop(); joinOn = joinOnStack.pop(); break; case HiveParser.TOK_QUERY: processUnionStack(ast, parent); //union的子節(jié)點 case HiveParser.TOK_INSERT: case HiveParser.TOK_SELECT: break; case HiveParser.TOK_UNION: //合并union字段信息 mergeUnionCols(); processUnionStack(ast, parent); //union的子節(jié)點 break; } } } private void mergeUnionCols() { validateUnion(cols); int size = cols.size(); int colNum = size / 2; List<ColLine> list = new ArrayList<ColLine>(colNum); for (int i = 0; i < colNum; i++) { //合并字段 ColLine col = cols.get(i); for (int j = i + colNum; j < size; j = j + colNum) { ColLine col2 = cols.get(j); list.add(col2); if (notNormalCol(col.getToNameParse()) && !notNormalCol(col2.getToNameParse())) { col.setToNameParse(col2.getToNameParse()); } col.getFromNameSet().addAll(col2.getFromNameSet()); col.setColCondition(col.getColCondition() + SPLIT_AND + col2.getColCondition()); Set<String> conditionSet = ParseUtil.cloneSet(col.getConditionSet()); conditionSet.addAll(col2.getConditionSet()); conditionSet.addAll(conditions); col.getConditionSet().addAll(conditionSet); } } cols.removeAll(list); //移除已經(jīng)合并的數(shù)據(jù) } private void processUnionStack(ASTNode ast, Tree parent) { boolean isNeedAdd = parent.getType() == HiveParser.TOK_UNION; if (isNeedAdd) { if (parent.getChild(0) == ast && parent.getChild(1) != null) {//有弟節(jié)點(是第一節(jié)點) //壓棧 conditionsStack.push(ParseUtil.cloneSet(conditions)); conditions.clear(); colsStack.push(ParseUtil.cloneList(cols)); cols.clear(); } else { //無弟節(jié)點(是第二節(jié)點) //出棧 if (!conditionsStack.isEmpty()) { conditions.addAll(conditionsStack.pop()); } if (!colsStack.isEmpty()) { cols.addAll(0, colsStack.pop()); } } } } private void parseAST(ASTNode ast) { parseIteral(ast); } public void parse(String sqlAll) throws Exception{ if (Check.isEmpty(sqlAll)) { return; } startParseAll(); //清空最終結(jié)果集 int i = 0; //當(dāng)前是第幾個sql for (String sql : sqlAll.split("(?<!\\);")) { ParseDriver pd = new ParseDriver(); String trim = sql.toLowerCase().trim(); if (trim.startsWith("set") || trim.startsWith("add") || Check.isEmpty(trim)) { continue; } ASTNode ast = pd.parse(sql); if ("local".equals(PropertyFileUtil.getProperty("environment"))) { System.out.println(ast.toStringTree()); } prepareParse(); parseAST(ast); endParse(++i); } } /* * 清空上次處理的結(jié)果 / private void startParseAll() { colLines.clear(); outputTables.clear(); inputTables.clear(); } private void prepareParse() { isCreateTable = false; dbMap.clear(); queryMap.clear(); queryTreeList.clear(); //結(jié)果 tmpColLines.clear(); tmpOutputTables.clear(); tmpInputTables.clear(); conditionsStack.clear(); //where or join 條件緩存 colsStack.clear(); //一個子查詢內(nèi)的列緩存 resultQueryMap.clear(); conditions.clear(); //where or join 條件緩存 cols.clear(); //一個子查詢內(nèi)的列緩存 tableNameStack.clear(); joinStack.clear(); joinOnStack.clear(); joinClause = false; joinOn = null; } /* * 所有解析完畢之后的后期處理 / private void endParse(int sqlIndex) { putResultQueryMap(sqlIndex, TOK_EOF); putDBMap(); setColLineList(); setOutInputTableSet(); } /** * 設(shè)置輸出表的字段對應(yīng)關(guān)系 / private void setColLineList() { Map<String, List<ColLine>> map = new HashMap<String, List<ColLine>>(); for (Entry<String, List<ColLine>> entry : resultQueryMap.entrySet()) { if (entry.getKey().startsWith(TOK_EOF)) { List<ColLine> value = entry.getValue(); for (ColLine colLine : value) { List<ColLine> list = map.get(colLine.getToTable()); if (Check.isEmpty(list)) { list = new ArrayList<ColLine>(); map.put(colLine.getToTable(), list); } list.add(colLine); } } } for (Entry<String, List<ColLine>> entry : map.entrySet()) { String table = entry.getKey(); List<ColLine> pList = entry.getValue(); List<String> dList = dbMap.get(table); int metaSize = Check.isEmpty(dList) ? 0 : dList.size(); for (int i = 0; i < pList.size(); i++) { //按順序插入對應(yīng)的字段 ColLine clp = pList.get(i); String colName = null; if (i < metaSize) { colName = table + SPLIT_DOT + dList.get(i); } if (isCreateTable && TOK_TMP_FILE.equals(table)) { for (String string : tmpOutputTables) { table = string; } } ColLine colLine = new ColLine(clp.getToNameParse(), clp.getColCondition(), clp.getFromNameSet(), clp.getConditionSet(), table, colName); colLines.add(colLine); } } } /** * 設(shè)置輸出表的字段對應(yīng)關(guān)系 / private void setOutInputTableSet() { outputTables.addAll(ParseUtil.cloneSet(tmpOutputTables)); inputTables.addAll(ParseUtil.cloneSet(tmpInputTables)); } private void putDBMap() { for (String table : tmpOutputTables) { List<String> list = MetaCache.getInstance().getColumnByDBAndTable(table); dbMap.put(table, list); } } /* * 補全db信息 * table1 ==>> db1.table1 * db1.table1 ==>> db1.table1 * db2.t1&t2 ==>> db2.t1&db1.t2 * @param tables / private String fillDB(String nowTable) { if (Check.isEmpty(nowTable)) { return nowTable; } StringBuilder sb = new StringBuilder(); String[] tableArr = nowTable.split(SPLIT_AND); //fact.test&test2&test3 for (String tables : tableArr) { String[] split = tables.split("\" + SPLIT_DOT); if (split.length > 2) { System.out.println(tables); throw new SQLParseException("parse table:" + nowTable); } String db = split.length == 2 ? split[0] : nowQueryDB ; String table = split.length == 2 ? split[1] : split[0] ; sb.append(db).append(SPLIT_DOT).append(table).append(SPLIT_AND); } if (sb.length()>0) { sb.setLength(sb.length()-1); } return sb.toString(); } /* * 根據(jù)別名查詢表明 * @param alia * @return / private String[] getTableAndAlia(String alia) { String _alia = Check.notEmpty(alia) ? alia : ParseUtil.collectionToString(queryMap.keySet(), SPLIT_AND, true) ; String[] result = {"" , _alia}; Set<String> tableSet = new HashSet<String>(); if (Check.notEmpty(_alia)) { String[] split = _alia.split(SPLIT_AND); for (String string : split) { //別名又分單獨起的別名 和 表名,即 select a.col,table_name.col from table_name a if (tmpInputTables.contains(string) || tmpInputTables.contains(fillDB(string))) { tableSet.add(fillDB(string)); } else if (queryMap.containsKey(string.toLowerCase())) { tableSet.addAll(queryMap.get(string.toLowerCase()).getTableSet()); } } result[0] = ParseUtil.collectionToString(tableSet, SPLIT_AND, true); result[1] = _alia; } return result; } /* * 校驗union * @param list */ private void validateUnion(List<ColLine> list){ int size = list.size(); if (size % 2 == 1) { throw new SQLParseException("union column number are different, size=" + size); } int colNum = size / 2; checkUnion(list, 0, colNum); checkUnion(list, colNum, size); } private void checkUnion(List<ColLine> list, int start, int end) { String tmp = null; for (int i = start; i < end; i++) { //合并字段 ColLine col = list.get(i); if (Check.isEmpty(tmp)) { tmp = col.getToTable(); } else if (!tmp.equals(col.getToTable())){ throw new SQLParseException("union column number/types are different,table1=" + tmp +",table2="+ col.getToTable()); } } }}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981

另,附上一AST語法樹格式化的Python腳本。

!/usr/bin/env python# -- coding: utf-8 --'''Created on 2012-5-20'''import sys# explain select key from kv mykv join test mytest on (mykv.key == mytest.id);original_str = """(TOK_QUERY (TOK_FROM (TOK_LEFTOUTERJOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME dim_city))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL city_name)) (TOK_SELEXPR (TOK_TABLE_OR_COL city_id)) (TOK_SELEXPR (TOK_TABLE_OR_COL pt))) (TOK_WHERE (AND (= (TOK_TABLE_OR_COL pt) '$yesday') (= (TOK_TABLE_OR_COL level) 2))) (TOK_GROUPBY (TOK_TABLE_OR_COL city_name) (TOK_TABLE_OR_COL city_id) (TOK_TABLE_OR_COL pt)))) b) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME dw_dri_wide_sheet))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL city_id)) (TOK_SELEXPR (TOK_TABLE_OR_COL pt)) (TOK_SELEXPR (TOK_FUNCTIONDI count (TOK_FUNCTION when (= (TOK_FUNCTION to_date (TOK_TABLE_OR_COL last_sucgrabord_time)) '$data_desc') (TOK_TABLE_OR_COL dri_id))) last1_dri_cnt) (TOK_SELEXPR (TOK_FUNCTIONDI count (TOK_FUNCTION when (and (> (TOK_FUNCTION to_date (TOK_TABLE_OR_COL last_sucgrabord_time)) (TOK_FUNCTION date_sub '$data_desc' 7)) (<= (TOK_FUNCTION to_date (TOK_TABLE_OR_COL last_sucgrabord_time)) '$data_desc')) (TOK_TABLE_OR_COL dri_id))) last7_dri_cnt)) (TOK_WHERE (and (= (TOK_TABLE_OR_COL pt) '$data_desc') (TOK_FUNCTION TOK_ISNOTNULL (TOK_TABLE_OR_COL last_sucgrabord_time)))) (TOK_GROUPBY (TOK_TABLE_OR_COL city_id) (TOK_TABLE_OR_COL pt)))) a) (= (. (TOK_TABLE_OR_COL a) city_id) (. (TOK_TABLE_OR_COL b) city_id)))) (TOK_INSERT (TOK_DESTINATION (TOK_TAB (TOK_TABNAME test kd_st_kpi_dri_active_day_city_bi))) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) city_name)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) city_id)) (TOK_SELEXPR (TOK_FUNCTION nvl (TOK_TABLE_OR_COL last1_dri_cnt) 0)) (TOK_SELEXPR (TOK_FUNCTION nvl (TOK_TABLE_OR_COL last7_dri_cnt) 0)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) pt))))) <EOF>"""tmp_str = original_str.strip().replace('\n', '')def my_print(mystr): sys.stdout.write(mystr)def print_indent(indent_level): for i in range(indent_level): my_print(' ' * 4)indent_level = 0for char in tmp_str: if char == '(': # 如果是左括號,先換行,然后打印縮進(jìn)+( my_print('\n') print_indent(indent_level) my_print(char) indent_level += 1 elif char == ')': # 如果是右括號,先打印),再換行,打印下一級別的縮進(jìn) indent_level -= 1 my_print(char) my_print('\n') print_indent(indent_level - 1) else: # 其他的直接打印出來 my_print(char)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42

參考文章: http://tech.meituan.com/hive-sql-to-mapreduce.html http://www.cnblogs.com/drawwindows/p/4595771.html https://cwiki.apache.org/confluence/display/Hive/LanguageManual

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時請結(jié)合常識與多方信息審慎甄別。
平臺聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點,簡書系信息發(fā)布平臺,僅提供信息存儲服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容