Original article: TensorFlow in a Nutshell, Part Two: Hybrid Learning

Original author: Camron Godbout
Translator: edvardhua
Proofreaders: marcmoore, futureshine

Make sure you have read Part One.

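In this article we demonstrate a Wide & Deep network, which trains a wide linear model jointly with a feed-forward network, to show that it can give more accurate predictions than some traditional machine learning techniques. We will use this hybrid learning approach to predict the survival probability of Titanic passengers.

Google applies hybrid learning techniques to recommend apps in the Play Store. YouTube uses a similar hybrid learning technique to recommend videos.

The code for this article can be found here.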

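Wide and Deep Network

A wide and deep network combines a linear model with a feed-forward neural network, so that our predictions benefit from both memorization and generalization. This type of model can be used for classification and regression problems. It gives relatively accurate predictions while reducing the amount of feature engineering required, killing two birds with one stone.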
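
As a rough illustration of how the two halves are combined, the sketch below adds the logit of a wide (linear) part to the logit of a deep part and passes the sum through a sigmoid, which is the formulation used in Google's Wide & Deep paper. The function names, weights, and feature values are hypothetical and chosen only for illustration; in the real model the weights are learned jointly during training.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def wide_and_deep_probability(wide_features, wide_weights,
                              deep_activations, deep_weights, bias):
    """Sum the wide (linear) logit and the deep logit, then apply a sigmoid."""
    wide_logit = sum(w * x for w, x in zip(wide_weights, wide_features))
    deep_logit = sum(w * a for w, a in zip(deep_weights, deep_activations))
    return sigmoid(wide_logit + deep_logit + bias)


# Hypothetical numbers, not taken from the Titanic model.
p = wide_and_deep_probability(
    wide_features=[1.0, 0.0, 1.0], wide_weights=[0.5, -0.2, 0.1],
    deep_activations=[0.3, 0.7], deep_weights=[1.0, -1.0], bias=0.0)
print(round(p, 3))
```

Because both parts feed a single output, a gradient step updates the wide weights and the deep weights at the same time, which is what "trained jointly" means here.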

[Figure: wide and deep network]

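Data

We will use the Titanic Kaggle data to predict whether a passenger survived based on certain attributes, such as name, sex, ticket, and cabin type. More information about this data can be found here.

First, we define every column as either continuous or categorical.

Continuous columns - any numeric value in a continuous range, such as money or age.

Categorical columns - part of a finite set, such as male or female, or a passenger's nationality.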

import pandas as pd
import tensorflow as tf
CATEGORICAL_COLUMNS = ["Name", "Sex", "Embarked", "Cabin"]
CONTINUOUS_COLUMNS = ["Age", "SibSp", "Parch", "Fare", "PassengerId", "Pclass"]

Because we only want to know whether a person survived, this is a binary classification problem. A prediction of 1 means the passenger survived and 0 means they did not. We therefore define a column to hold this outcome.

SURVIVED_COLUMN = "Survived"

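Network

Now we can create the columns and add embedding layers. When building the model we want to turn our categorical columns into sparse columns. For columns with few categories, such as Sex or Embarked (S, Q, or C), we convert them to sparse columns keyed by class name (sparse_column_with_keys).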

sex = tf.contrib.layers.sparse_column_with_keys(column_name="Sex",
                                                keys=["female", "male"])
embarked = tf.contrib.layers.sparse_column_with_keys(column_name="Embarked",
                                                     keys=["C", "S", "Q"])

For categorical columns with many categories, we have no vocab file that maps every possible category to an integer, so we hash the values into buckets instead (sparse_column_with_hash_bucket).

cabin = tf.contrib.layers.sparse_column_with_hash_bucket(
    "Cabin", hash_bucket_size=1000)
name = tf.contrib.layers.sparse_column_with_hash_bucket(
    "Name", hash_bucket_size=1000)

Our continuous columns use real values. PassengerId is included here because the IDs are continuous rather than categorical, and they are already integers rather than strings.

age = tf.contrib.layers.real_valued_column("Age")
passenger_id = tf.contrib.layers.real_valued_column("PassengerId")
sib_sp = tf.contrib.layers.real_valued_column("SibSp")
parch = tf.contrib.layers.real_valued_column("Parch")
fare = tf.contrib.layers.real_valued_column("Fare")
p_class = tf.contrib.layers.real_valued_column("Pclass")

We want to group passengers by age. Bucketization lets us find survival correlations within each age group rather than treating all ages as one large whole, which improves our accuracy.

age_buckets = tf.contrib.layers.bucketized_column(
    age, boundaries=[5, 18, 25, 30, 35, 40, 45, 50, 55, 65])
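
To make the bucketing concrete, here is a minimal plain-Python sketch that assigns an age to a bucket index using the same boundaries. It assumes each bucket includes its lower boundary and excludes its upper one, which is how TensorFlow's bucketized columns are documented to behave; `age_bucket` is a hypothetical helper, not part of TensorFlow.

```python
from bisect import bisect_right

AGE_BOUNDARIES = [5, 18, 25, 30, 35, 40, 45, 50, 55, 65]


def age_bucket(age, boundaries=AGE_BOUNDARIES):
    """Return the index of the bucket that contains `age`.

    Ten boundaries produce eleven buckets: below 5, [5, 18), ..., 65 and over.
    """
    return bisect_right(boundaries, age)


for sample_age in (4, 5, 22, 80):
    print(sample_age, age_bucket(sample_age))
```

The model then learns a separate weight per bucket, so a child under 5 and a passenger over 65 can influence the prediction in different directions, which a single linear weight on raw age cannot express.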

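Finally, we define our wide columns and deep columns. The wide columns effectively memorize interactions between our features. They do not generalize our features; that is what the deep columns are for.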

wide_columns = [sex, embarked, p_class, cabin, name, age_buckets,
                  tf.contrib.layers.crossed_column([p_class, cabin],
                                                   hash_bucket_size=int(1e4)),
                  tf.contrib.layers.crossed_column(
                      [age_buckets, sex],
                      hash_bucket_size=int(1e6)),
                  tf.contrib.layers.crossed_column([embarked, name],
                                                   hash_bucket_size=int(1e4))]
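
A crossed column treats a combination of feature values as a single new categorical feature. The sketch below illustrates the idea by joining the values and hashing the result into a bucket; the separator string, the MD5 hash, and the `crossed_bucket` helper are all assumptions made for illustration and do not reproduce TensorFlow's internal hashing.

```python
import hashlib


def crossed_bucket(values, hash_bucket_size):
    """Hash a combination of feature values into one bucket id."""
    joined = "_X_".join(str(v) for v in values)
    digest = hashlib.md5(joined.encode("utf-8")).hexdigest()
    return int(digest, 16) % hash_bucket_size


# Cross an age-bucket index with sex, as in the (age_buckets, sex) cross.
young_female = crossed_bucket([2, "female"], int(1e6))
young_male = crossed_bucket([2, "male"], int(1e6))
print(young_female, young_male)
```

Each combination gets its own weight in the linear model, which is how the wide part memorizes interactions such as "young and female" rather than only the two features separately.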

The benefit of the deep columns is that they take the high-dimensional sparse features we provide and reduce them to low dimensions for computation.

deep_columns = [
    tf.contrib.layers.embedding_column(sex, dimension=8),
    tf.contrib.layers.embedding_column(embarked, dimension=8),
    tf.contrib.layers.embedding_column(p_class, dimension=8),
    tf.contrib.layers.embedding_column(cabin, dimension=8),
    tf.contrib.layers.embedding_column(name, dimension=8),
    age,
    passenger_id,
    sib_sp,
    parch,
    fare,
]
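
An embedding column can be pictured as a lookup table with one dense vector per category id. The sketch below builds such a table with random starting values and looks up rows by id; the helper names, the value range, and the seed are hypothetical, and in the real model the table entries are learned during training rather than left random.

```python
import random


def make_embedding_table(num_ids, dimension, seed=0):
    """Create one small random vector per category id."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dimension)]
            for _ in range(num_ids)]


def embed(ids, table):
    """Replace each sparse id with its dense vector from the table."""
    return [table[i] for i in ids]


# Three Embarked categories, each embedded into 8 dimensions.
embarked_table = make_embedding_table(num_ids=3, dimension=8)
dense_rows = embed([1, 0, 2, 1], embarked_table)
print(len(dense_rows), len(dense_rows[0]))
```

With 1000 hash buckets for Cabin or Name, the same lookup turns a 1000-wide one-hot input into an 8-wide dense input for the feed-forward layers.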

We finish our function by creating the classifier from the deep columns and wide columns.

# Last statement of build_estimator(), which train_and_eval() calls.
return tf.contrib.learn.DNNLinearCombinedClassifier(
    linear_feature_columns=wide_columns,
    dnn_feature_columns=deep_columns,
    dnn_hidden_units=[100, 50])

The last thing we have to do before running the network is create mappings for our continuous and categorical columns. We create an input function for our dataframe, which converts the dataframe into something TensorFlow can operate on. The benefit is that we can change and tweak how our tensors are created. For example, we could pass feature columns into .fit .feature .predict as individually created columns, as we did above, but this is a much cleaner solution.

def input_fn(df, train=False):
  """Input builder function."""
  # Creates a dictionary mapping from each continuous feature column name (k) to
  # the values of that column stored in a constant Tensor.
  continuous_cols = {k: tf.constant(df[k].values) for k in CONTINUOUS_COLUMNS}
  # Creates a dictionary mapping from each categorical feature column name (k)
  # to the values of that column stored in a tf.SparseTensor.
  categorical_cols = {k: tf.SparseTensor(
    indices=[[i, 0] for i in range(df[k].size)],
    values=df[k].values,
    shape=[df[k].size, 1])
                      for k in CATEGORICAL_COLUMNS}
  # Merges the two dictionaries into one.
  feature_cols = dict(continuous_cols)
  feature_cols.update(categorical_cols)
  # Converts the label column into a constant Tensor.
  if train:
    label = tf.constant(df[SURVIVED_COLUMN].values)
    # Returns the feature columns and the label.
    return feature_cols, label
  else:
    # so we can predict our results that don't exist in the csv
    return feature_cols
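
To show what input_fn builds for each categorical column, the sketch below produces the same three pieces (indices, values, shape) using plain Python lists instead of tensors, and merges two feature dictionaries the same way. The helper names and sample values are hypothetical, and no TensorFlow objects are created.

```python
def sparse_components(values):
    """Describe a single column as an N x 1 sparse matrix.

    Row i holds one entry at position [i, 0], mirroring the indices,
    values, and shape passed to tf.SparseTensor in input_fn.
    """
    indices = [[i, 0] for i in range(len(values))]
    shape = [len(values), 1]
    return indices, list(values), shape


def merge_feature_dicts(continuous_cols, categorical_cols):
    """Combine both dictionaries into one, keyed by column name."""
    feature_cols = dict(continuous_cols)
    feature_cols.update(categorical_cols)
    return feature_cols


# Hypothetical sample rows.
sex_sparse = sparse_components(["male", "female", "female"])
features = merge_feature_dicts({"Age": [22.0, 38.0, 26.0]},
                               {"Sex": sex_sparse})
print(sorted(features))
```

The keys of the merged dictionary must match the column names given to the feature columns ("Age", "Sex", and so on), since that is how the estimator finds the data for each column.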

With that done, we can write the training function.

def train_and_eval():
  """Train and evaluate the model."""
  df_train = pd.read_csv(
      tf.gfile.Open("./train.csv"),
      skipinitialspace=True)
  df_test = pd.read_csv(
      tf.gfile.Open("./test.csv"),
      skipinitialspace=True)

  model_dir = "./models"
  print("model directory = %s" % model_dir)

  m = build_estimator(model_dir)
  m.fit(input_fn=lambda: input_fn(df_train, True), steps=200)
  print(m.predict(input_fn=lambda: input_fn(df_test)))
  results = m.evaluate(input_fn=lambda: input_fn(df_train, True), steps=1)
  for key in sorted(results):
    print("%s: %s" % (key, results[key]))

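We read in the preprocessed csv files, in which issues such as missing values have already been handled. To keep the article concise, more of the preprocessing code and details can be found in the code repository.

These csv files are converted to tensors by calling input_fn inside a lambda. We build the estimator, then print our predictions and evaluation results.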

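Results

[Figure: network results]

Running our code gives fairly good results without adding any extra columns or doing any feature engineering. With very little fine-tuning, this model can achieve relatively good results.

[Figure: comparison chart]

The ability to add an embedding layer alongside a traditional wide linear model allows accurate predictions by reducing sparse dimensions to low dimensions.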

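Conclusion

This part deviates from traditional deep learning to show that TensorFlow has many other uses and applications. The article is largely based on the paper and code Google provides for wide and deep learning. The research paper can be found here. Google uses this model as a product recommendation engine for the Google Play store, where it has helped increase sales from app suggestions. YouTube has also published a paper about its hybrid-model recommendation system. These models are being adopted for recommendation by more and more companies, and they are likely to grow in popularity thanks to their excellent embedding ability.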
