Notes on 《美團機器學習實踐》 (Meituan Machine Learning in Practice)

https://book.douban.com/subject/30243136/

Performance Metric

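  • F1 score: the harmonic mean of precision P and recall R, i.e., 2/F1 = 1/P + 1/R
  • Other interpretations of AUC:
    • Wilcoxon-Mann-Whitney rank statistic: the probability that a randomly chosen positive is scored above a randomly chosen negative
    • Gini index: Gini + 1 = 2*AUC
    • Insensitive to the absolute predicted scores; only their ranking matters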
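
The relations above can be checked numerically. The following is a minimal sketch in plain Python; the function names and the toy labels and scores are illustrative assumptions, not taken from the book. It computes AUC as the fraction of positive/negative pairs ranked correctly, derives Gini from it, and shows that a monotone rescaling of the scores leaves AUC unchanged.

```python
def auc_by_ranks(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as 1/2). O(n_pos * n_neg), for illustration only."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(precision, recall):
    """Harmonic mean of precision and recall: 2/F1 = 1/P + 1/R."""
    return 2.0 * precision * recall / (precision + recall)


# Toy data (hypothetical): 3 positives, 2 negatives.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.6]

auc = auc_by_ranks(labels, scores)
gini = 2.0 * auc - 1.0

# Cubing the scores changes their values but not their order,
# so the AUC stays the same.
auc_rescaled = auc_by_ranks(labels, [s ** 3 for s in scores])
```

The pairwise loop is quadratic and only meant to make the ranking interpretation explicit; a sort-based rank sum gives the same value on larger data.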

Feature Engineering and Feature Selection

Continuous Variables

  • Bucketing (discretizing) continuous variables for models such as logistic regression, by equal width or by percentile
  • Missing-value treatment: impute, or encode missingness as a dummy variable
  • Feed the leaf nodes of tree models (e.g., RF) to linear models as features
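
The first two bullets can be sketched as follows, using only the standard library. The helper names, the toy values, and the choice to give missing values their own bucket id are assumptions made for illustration; the tree-leaf bullet is not covered here.

```python
import bisect


def equal_width_edges(values, n_buckets):
    """Interior bucket boundaries that split [min, max] into equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_buckets
    return [lo + width * i for i in range(1, n_buckets)]


def percentile_edges(values, n_buckets):
    """Interior boundaries taken from the sorted data, so buckets hold
    roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // n_buckets] for i in range(1, n_buckets)]


def bucketize(value, edges):
    """Map a value to a bucket id; missing values get a dedicated id."""
    if value is None:
        return len(edges) + 1  # one past the last regular bucket
    return bisect.bisect_right(edges, value)


# Toy column (hypothetical) with one missing value and one outlier.
raw = [1, 2, None, 3, 4, 100]
observed = [v for v in raw if v is not None]

width_ids = [bucketize(v, equal_width_edges(observed, 4)) for v in raw]
pct_ids = [bucketize(v, percentile_edges(observed, 4)) for v in raw]
```

With the outlier 100 in the data, equal-width bucketing puts almost every value in the first bucket, while percentile bucketing spreads them out, which is one reason to consider the percentile variant for skewed features.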

Discrete Variables

  • Feature crosses (interactions between discrete variables)
  • Statistics (e.g., the number of unique values of B for each A)
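
Both ideas can be illustrated with a small sketch. The field names (`user`, `city`, `category`), the rows, and the helper functions are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Toy log rows (hypothetical).
rows = [
    {"user": "u1", "city": "beijing", "category": "food"},
    {"user": "u1", "city": "beijing", "category": "hotel"},
    {"user": "u2", "city": "shanghai", "category": "food"},
    {"user": "u2", "city": "shanghai", "category": "food"},
]


def cross(row, a, b):
    """Cross two discrete fields into a single categorical feature."""
    return f"{a}={row[a]}&{b}={row[b]}"


def unique_count_per_key(rows, a, b):
    """For each value of field a, count the distinct values of field b."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[a]].add(row[b])
    return {key: len(vals) for key, vals in seen.items()}


crossed = [cross(r, "city", "category") for r in rows]
categories_per_user = unique_count_per_key(rows, "user", "category")
```

The crossed string would typically be one-hot or hash encoded before it reaches a linear model, and the per-key count can be joined back onto each row as a numeric feature.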

Time, Space, Text Features

Popular Models

Logistic Regression:

  • Why not OLS: a least-squares fit on 0/1 labels is sensitive to outliers
  • How to solve: GD or stochastic GD (e.g., Google's FTRL)
  • Advantages: fast, scalable
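
As a rough illustration of the stochastic GD bullet, here is a minimal sketch of logistic regression trained with plain SGD on the log loss. It is not FTRL, which adds per-coordinate learning rates and L1 regularization; the learning rate, epoch count, and toy data are arbitrary assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_sgd(xs, ys, lr=0.1, epochs=200, seed=0):
    """Plain SGD: one example per update, gradient of log loss = (p - y) * x."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    b = 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            g = predict_proba(w, b, xs[i]) - ys[i]
            w = [wj - lr * g * xj for wj, xj in zip(w, xs[i])]
            b -= lr * g
    return w, b


# Toy 1-D data (hypothetical): larger x means positive class.
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [0, 0, 1, 1]
w, b = train_logistic_sgd(xs, ys)
p_low = predict_proba(w, b, [0.0])
p_high = predict_proba(w, b, [3.0])
```

Each update touches one example and costs time proportional to the number of features, which is where the speed and scalability noted above come from.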

FM

  • Motivation:
    • Feature interaction (not done manually)
    • Polynomial kernel (too many parameters, too sparse matrix)
  • Approach:
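    • Instead of learning a separate weight for every co-occurrence of features i and j, the weight w_ij is computed as the dot product of two k-dimensional latent vectors, v_i and v_j.
    • This imposes a low-rank assumption on the interaction matrix W so that it can be decomposed.
    • The parameters for different feature combinations are no longer independent, so rarely observed pairs still get informed weights.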
  • Improvement:
    • FFM (field-aware FM): group related features into fields, and learn a separate latent vector per feature for each field
  • Application:
    • The latent vectors can serve as embeddings for NNs (e.g., user and ad similarity)
    • Outperforms GBDT at learning complicated feature interactions when the feature combinations are sparse
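
The factorized interaction described under "Approach" can be sketched as a second-order FM prediction. The parameter values below are made up for illustration, and the function names are not from the book. The first function sums over all pairs explicitly; the second uses the standard identity sum_{i<j} <v_i, v_j> x_i x_j = 1/2 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ], which costs O(k*n) instead of O(k*n^2).

```python
def fm_predict_naive(x, w0, w, v):
    """Second-order FM: bias + linear term + explicit sum over feature pairs,
    where the pair weight is the dot product of the latent vectors."""
    n = len(x)
    y = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(n):
        for j in range(i + 1, n):
            w_ij = sum(a * c for a, c in zip(v[i], v[j]))
            y += w_ij * x[i] * x[j]
    return y


def fm_predict(x, w0, w, v):
    """Same model, with the pairwise term computed per latent dimension."""
    n, k = len(x), len(v[0])
    y = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        y += 0.5 * (s * s - s_sq)
    return y


# Toy parameters (hypothetical): 3 features, latent dimension k = 2.
x = [1.0, 0.0, 2.0]
w0 = 0.5
w = [0.1, 0.2, 0.3]
v = [[1.0, 2.0], [0.5, 0.5], [3.0, -1.0]]

y_naive = fm_predict_naive(x, w0, w, v)
y_fast = fm_predict(x, w0, w, v)
```

Both functions return 3.2 on this input. With n features the model stores n*k interaction parameters rather than n*(n-1)/2 independent pair weights, which is what makes it workable on sparse data.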

GBDT
  • Compared with linear models, GBDT copes better with: missing values, attributes with different ranges, outliers, interactions, and non-linear decision boundaries

Data Mining
