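Recurrent neural networks (RNNs) are an important class of neural network models, particularly well suited to sequence labeling problems. When first learning RNNs, it is easy to get lost among the many similar-looking schematic diagrams and to wonder: how does each kind of Cell work, how are Cells combined, what does the input data actually look like, how is it processed by a single Cell, how does it flow between Cells, how is all of this implemented in code, and how complex is it? This article addresses these questions from several angles, moving from simple to more involved. It divides the components of a neural network along three dimensions (RNN Cell type, encoder/decoder structure, and Seq2Seq model) and explains each component and how the components are combined. It draws on parts of Stanford's CS224D course. Each section explains the underlying principle, describes how sibling structures differ and how they evolved, and shows the corresponding code, in the hope of giving beginners both a detailed and a big-picture understanding.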

The figure below summarizes the components of a neural network along three dimensions:

Figure: The three dimensions of neural network components

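A single RNN cell:

The RNN cell is the most basic unit of a recurrent neural network. It can be thought of as a basic neuron (more precisely, a small fully connected layer with num_units outputs). Besides the usual input X(t), the cell takes an additional input H(t-1) that carries the memory from the previous step. It can be called the memory, or the previous step's hidden state, as shown below:

Figure: A basic RNN Cell

For the most basic RNN Cell (BasicRNNCell in rnn_cell_impl.py), H(t-1) and Y(t-1) are the same. Yes, exactly the same, as the source shows: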

class BasicRNNCell(RNNCell):
  def __init__(self, num_units, activation=None, reuse=None):
    super(BasicRNNCell, self).__init__(_reuse=reuse)
    self._num_units = num_units
    self._activation = activation or math_ops.tanh
    self._linear = None

  @property
  def state_size(self):
    return self._num_units

  @property
  def output_size(self):
    return self._num_units

  def call(self, inputs, state):
    """Most basic RNN: output = new_state = act(W * input + U * state + B)."""
    if self._linear is None:
      self._linear = _Linear([inputs, state], self._num_units, True)

    output = self._activation(self._linear([inputs, state]))
    return output, output

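The two values returned by call are the Cell's output and its hidden_state. They are the same tensor, so output_size equals state_size, and both equal num_units. The code also shows that the BasicRNNCell constructor has only one required parameter, num_units, which is the output dimension of the fully connected layer inside the cell. The input dimension is not fixed in the constructor: it is determined by the size of the inputs passed to call, so the output dimension is the only required argument.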

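Next, a simple example shows how the output Y(t), which is also the hidden_state, is computed from X(t) and H(t-1). Suppose we are building a translation model from Chinese to English. The training data consists of 1 million pairs, each a Chinese sentence and its English translation. We set batch_size=100, so 100 training examples are processed at a time. Each sentence is first segmented into words, and each word is converted to its word embedding. To keep the figure simple, assume the embedding dimension is 4, so each word is represented by a vector of length 4 (in practice the dimension is much larger, for example 256).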
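The data preparation described above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper `embed_batch`, the romanized placeholder tokens, the `<pad>` token, and the embedding values are all hypothetical, and a batch of 2 stands in for the batch of 100.

```python
def embed_batch(sentences, vocab, embeddings, pad_token="<pad>"):
    """Turn tokenized sentences into a [batch, time, embedding_dim] nested list."""
    max_len = max(len(words) for words in sentences)
    batch = []
    for words in sentences:
        # Pad shorter sentences so every row has the same number of time steps.
        padded = words + [pad_token] * (max_len - len(words))
        batch.append([embeddings[vocab[word]] for word in padded])
    return batch

# Hypothetical vocabulary and 4-dimensional embedding table (arbitrary values).
vocab = {"<pad>": 0, "wo": 1, "xihuan": 2, "mao": 3}
embeddings = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]
sentences = [["wo", "xihuan", "mao"], ["wo", "xihuan"]]
batch = embed_batch(sentences, vocab, embeddings)

# X(t) for the whole batch is the t-th word vector of every sentence.
x_0 = [sentence[0] for sentence in batch]
print(len(batch), len(batch[0]), len(batch[0][0]))  # 2 3 4
```

Each X(t) fed to the cell is therefore a [batch_size, embedding_dim] slice of this structure.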

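Figure: The BasicRNNCell computation

This is a minimal BasicRNNCell computation, and understanding it thoroughly is essential to understanding the whole RNN. A few points are worth stressing. First, this is a fully connected layer: X(t) is mapped to x_output by the matrix W, and H(t-1) is mapped to state_output by the matrix U. Second, x_output and state_output have the same dimension (num_units), which is what allows them to be added. Third, after x_output and state_output are summed, the bias is added, and the activation (tanh by default) is applied, one copy of the result becomes Y(t) and another becomes H(t), so for BasicRNNCell, Y(t)=H(t).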
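The three points above can be traced in a few lines of plain Python. This is a minimal sketch, not TensorFlow's implementation: the names `matmul` and `basic_rnn_step`, the tiny sizes (batch of 2, embedding dimension 4, num_units 3), and the constant weights are assumptions chosen so the arithmetic is easy to follow.

```python
import math

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def basic_rnn_step(x_t, h_prev, W, U, bias):
    """One step: Y(t) = H(t) = tanh(X(t) W + H(t-1) U + bias)."""
    x_output = matmul(x_t, W)          # shape [batch, num_units]
    state_output = matmul(h_prev, U)   # shape [batch, num_units]
    h_t = [[math.tanh(xo + so + b) for xo, so, b in zip(x_row, s_row, bias)]
           for x_row, s_row in zip(x_output, state_output)]
    return h_t, h_t                    # output and new state are the same

# Assumed toy sizes: batch = 2, embedding dimension = 4, num_units = 3.
W = [[0.1] * 3 for _ in range(4)]      # maps X(t) to x_output
U = [[0.2] * 3 for _ in range(3)]      # maps H(t-1) to state_output
bias = [0.0] * 3
x_t = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
h_prev = [[0.0] * 3 for _ in range(2)]

y_t, h_t = basic_rnn_step(x_t, h_prev, W, U, bias)
```

Because the previous state is all zeros here, every entry of `y_t` is simply tanh(0.1).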

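Training this network therefore means learning the matrices W and U and the bias. The initial state H(0) can also be learned, although it is commonly just initialized to zeros. With these concepts in place, the RNN unrolled over time is easy to read:

Figure: An RNN unrolled over time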
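Unrolling over time means applying the same cell, with the same W, U, and bias, once per word. The sketch below illustrates this for a single sentence; the function names `rnn_step` and `unroll`, the 2-dimensional embeddings, and the weight values are assumptions made for the example.

```python
import math

def rnn_step(x_t, h_prev, W, U, bias):
    """h_t[j] = tanh(sum_i x_t[i] W[i][j] + sum_k h_prev[k] U[k][j] + bias[j])."""
    return [
        math.tanh(
            sum(x * W[i][j] for i, x in enumerate(x_t))
            + sum(h * U[k][j] for k, h in enumerate(h_prev))
            + bias[j]
        )
        for j in range(len(bias))
    ]

def unroll(inputs, h0, W, U, bias):
    """Run the same cell over every time step, carrying the state forward."""
    h = h0
    outputs = []
    for x_t in inputs:
        h = rnn_step(x_t, h, W, U, bias)
        outputs.append(h)              # Y(t) equals H(t) for a basic cell
    return outputs, h

# A "sentence" of 3 words, each a 2-dimensional embedding; num_units = 2.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, -0.5], [0.3, 0.8]]
U = [[0.1, 0.0], [0.0, 0.1]]
bias = [0.0, 0.0]
outputs, final_state = unroll(sentence, [0.0, 0.0], W, U, bias)
```

The list `outputs` holds one Y(t) per time step, and `final_state` is the last of them.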

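BasicRNNCell is the most basic RNN Cell. Many more elaborate cells build on it, such as BasicLSTMCell, LSTMCell, and GRUCell, while MultiRNNCell stacks several cells into one multi-layer cell. Why improve on BasicRNNCell? Because when gradients are propagated over long distances, a basic RNN Cell is prone to vanishing and/or exploding gradients. The proof is beyond the scope of this article; see the Stanford lecture notes listed in the references, which give a detailed derivation. GRU and LSTM cells were introduced to mitigate this problem, in particular vanishing gradients.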
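A toy numerical experiment gives some intuition for the problem. It is not the proof from the lecture notes, only a sketch under strong assumptions: a scalar recurrence with a single weight w, no input, and hypothetical function names.

```python
import math

def gradient_through_time(w, steps, h0=0.5):
    """Return |dH(T)/dH(0)| for the scalar recurrence h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh^2).
        grad *= w * (1.0 - h * h)
    return abs(grad)

def linear_gradient_through_time(w, steps):
    """Same quantity without the tanh squashing: h_t = w * h_{t-1}."""
    return abs(w) ** steps

small = gradient_through_time(0.5, 50)          # every factor is at most 0.5
large = linear_gradient_through_time(1.5, 50)   # every factor is 1.5
print(small, large)
```

After 50 steps the first gradient has shrunk to practically zero while the second has grown by many orders of magnitude, which is the behavior that makes long-range dependencies hard to learn with a basic cell.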

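The GRUCell formulas and schematic are as follows:

Figure: GRUCell formulas

Figure: GRUCell schematic

The most notable feature of GRUCell is its two gates, the ResetGate and the UpdateGate. BasicRNNCell maps X(t) and H(t-1) through W and U and simply adds the results, whereas GRU uses the two gates to decide what to keep. The ResetGate determines how much of the previous memory H(t-1) is combined with the current input (the dashed box in the schematic); this produces a new candidate hidden state, that is, a new memory. The UpdateGate determines the proportions in which the new memory and the old memory are passed on to the next step. The subtle point is that the UpdateGate does not act only on the new memory computed in the dashed box: it acts on both the new memory and the old memory H(t-1), deciding how much to take from each, and their weighted sum becomes H(t). One way to read this: if the model considers the input at time t important for the final result, the new memory enters H(t) with a large weight; otherwise it enters with a small weight. The old memory is weighted in the complementary way.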
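For reference, the gate logic described above can be written out as equations. They are written here to match the GRUCell source quoted in this article rather than copied from the original figure; the symbol names (W_r, U_r, b_r, and so on) are notation assumed for this sketch, with r_t the ResetGate, u_t the UpdateGate, and \tilde{h}_t the new memory.

```latex
\begin{aligned}
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \\
u_t &= \sigma\left(W_u x_t + U_u h_{t-1} + b_u\right) \\
\tilde{h}_t &= \tanh\left(W_c x_t + U_c \left(r_t \odot h_{t-1}\right) + b_c\right) \\
h_t &= u_t \odot h_{t-1} + \left(1 - u_t\right) \odot \tilde{h}_t
\end{aligned}
```

In this convention a value of u_t close to 1 keeps the old memory, and a value close to 0 replaces it with the new memory.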

Like BasicRNNCell, the output and hidden_state that GRU emits at each time step t are the same, both H(t), so its state_size and output_size are equal. See the code:

class GRUCell(RNNCell):
  def __init__(self,
               num_units,
               activation=None,
               reuse=None,
               kernel_initializer=None,
               bias_initializer=None):
    super(GRUCell, self).__init__(_reuse=reuse)
    self._num_units = num_units
    self._activation = activation or math_ops.tanh
    self._kernel_initializer = kernel_initializer
    self._bias_initializer = bias_initializer
    self._gate_linear = None
    self._candidate_linear = None

  @property
  def state_size(self):
    return self._num_units

  @property
  def output_size(self):
    return self._num_units

  def call(self, inputs, state):
    """Gated recurrent unit (GRU) with nunits cells."""
    if self._gate_linear is None:
      bias_ones = self._bias_initializer
      if self._bias_initializer is None:
        bias_ones = init_ops.constant_initializer(1.0, dtype=inputs.dtype)
      with vs.variable_scope("gates"):  # Reset gate and update gate.
        self._gate_linear = _Linear(
            [inputs, state],
            2 * self._num_units,
            True,
            bias_initializer=bias_ones,
            kernel_initializer=self._kernel_initializer)

    value = math_ops.sigmoid(self._gate_linear([inputs, state]))
    r, u = array_ops.split(value=value, num_or_size_splits=2, axis=1)

    r_state = r * state
    if self._candidate_linear is None:
      with vs.variable_scope("candidate"):
        self._candidate_linear = _Linear(
            [inputs, r_state],
            self._num_units,
            True,
            bias_initializer=self._bias_initializer,
            kernel_initializer=self._kernel_initializer)
    c = self._activation(self._candidate_linear([inputs, r_state]))
    new_h = u * state + (1 - u) * c
    return new_h, new_h
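
The same update can be traced without TensorFlow. The following is a minimal sketch for a single example (no batch dimension), assuming separate weight matrices per gate stored in a hypothetical `params` dictionary and zero gate biases for easy arithmetic (the source above initializes gate biases to 1.0 by default).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(vec, weights, bias):
    """Return vec @ weights + bias for a vector and a list-of-rows matrix."""
    return [sum(v * weights[i][j] for i, v in enumerate(vec)) + bias[j]
            for j in range(len(bias))]

def gru_step(x_t, h_prev, params):
    xh = x_t + h_prev                                   # concatenate [inputs, state]
    r = [sigmoid(z) for z in affine(xh, params["W_r"], params["b_r"])]
    u = [sigmoid(z) for z in affine(xh, params["W_u"], params["b_u"])]
    r_state = [ri * hi for ri, hi in zip(r, h_prev)]    # reset gate scales old memory
    c = [math.tanh(z)
         for z in affine(x_t + r_state, params["W_c"], params["b_c"])]
    new_h = [ui * hi + (1.0 - ui) * ci for ui, hi, ci in zip(u, h_prev, c)]
    return new_h, new_h                                 # output equals new state

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

# Assumed toy sizes: input dimension 2, num_units 2, all weights zero.
params = {"W_r": zeros(4, 2), "b_r": [0.0, 0.0],
          "W_u": zeros(4, 2), "b_u": [0.0, 0.0],
          "W_c": zeros(4, 2), "b_c": [0.0, 0.0]}
output, state = gru_step([1.0, -1.0], [0.4, -0.2], params)

# A large update-gate bias pushes u toward 1, so the old memory is kept.
keep_params = dict(params, b_u=[20.0, 20.0])
kept, _ = gru_step([1.0, -1.0], [0.4, -0.2], keep_params)
```

With all-zero weights both gates are 0.5 and the candidate is 0, so the new state is half the old one; with the large update-gate bias the state passes through almost unchanged.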

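With GRUCell understood, here is a brief introduction to LSTMCell.
LSTMCell is somewhat more complex than GRUCell and adds more gates. It has three gates: the InputGate, the ForgetGate, and the OutputGate. Conceptually it is similar to GRUCell, only with finer-grained control. Finer control does not necessarily give better results, so in practice it is worth testing both and picking whichever performs better. The LSTMCell formulas and schematic are as follows:

Figure: LSTMCell formulas

Figure: LSTMCell schematic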
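As a reference for the three gates, the update can be written as equations. They follow the default path of the LSTMCell source quoted in this article (no peepholes, no projection, no clipping) and are not a transcription of the original figure. The symbols W and b are notation assumed here, b_forget stands for the forget_bias argument, and the stacked vector on the left is the 4 * num_units result that the code splits into i, j, f, and o.

```latex
\begin{aligned}
\begin{bmatrix} i_t \\ j_t \\ f_t \\ o_t \end{bmatrix}
  &= W \begin{bmatrix} x_t \\ m_{t-1} \end{bmatrix} + b \\
c_t &= \sigma\left(f_t + b_{\text{forget}}\right) \odot c_{t-1}
     + \sigma\left(i_t\right) \odot \tanh\left(j_t\right) \\
m_t &= \sigma\left(o_t\right) \odot \tanh\left(c_t\right)
\end{aligned}
```

Here c_t is the cell memory and m_t is the hidden state that is also emitted as the output.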

The LSTMCell source is as follows:

class LSTMCell(RNNCell):
  def __init__(self, num_units,
               use_peepholes=False, cell_clip=None,
               initializer=None, num_proj=None, proj_clip=None,
               num_unit_shards=None, num_proj_shards=None,
               forget_bias=1.0, state_is_tuple=True,
               activation=None, reuse=None):
    
    super(LSTMCell, self).__init__(_reuse=reuse)
    if not state_is_tuple:
      logging.warn("%s: Using a concatenated state is slower and will soon be "
                   "deprecated.  Use state_is_tuple=True.", self)
    if num_unit_shards is not None or num_proj_shards is not None:
      logging.warn(
          "%s: The num_unit_shards and proj_unit_shards parameters are "
          "deprecated and will be removed in Jan 2017.  "
          "Use a variable scope with a partitioner instead.", self)

    self._num_units = num_units
    self._use_peepholes = use_peepholes
    self._cell_clip = cell_clip
    self._initializer = initializer
    self._num_proj = num_proj
    self._proj_clip = proj_clip
    self._num_unit_shards = num_unit_shards
    self._num_proj_shards = num_proj_shards
    self._forget_bias = forget_bias
    self._state_is_tuple = state_is_tuple
    self._activation = activation or math_ops.tanh

    if num_proj:
      self._state_size = (
          LSTMStateTuple(num_units, num_proj)
          if state_is_tuple else num_units + num_proj)
      self._output_size = num_proj
    else:
      self._state_size = (
          LSTMStateTuple(num_units, num_units)
          if state_is_tuple else 2 * num_units)
      self._output_size = num_units
    self._linear1 = None
    self._linear2 = None
    if self._use_peepholes:
      self._w_f_diag = None
      self._w_i_diag = None
      self._w_o_diag = None

  @property
  def state_size(self):
    return self._state_size

  @property
  def output_size(self):
    return self._output_size

  def call(self, inputs, state):
    num_proj = self._num_units if self._num_proj is None else self._num_proj
    sigmoid = math_ops.sigmoid

    if self._state_is_tuple:
      (c_prev, m_prev) = state
    else:
      c_prev = array_ops.slice(state, [0, 0], [-1, self._num_units])
      m_prev = array_ops.slice(state, [0, self._num_units], [-1, num_proj])

    dtype = inputs.dtype
    input_size = inputs.get_shape().with_rank(2)[1]
    if input_size.value is None:
      raise ValueError("Could not infer input size from inputs.get_shape()[-1]")
    if self._linear1 is None:
      scope = vs.get_variable_scope()
      with vs.variable_scope(
          scope, initializer=self._initializer) as unit_scope:
        if self._num_unit_shards is not None:
          unit_scope.set_partitioner(
              partitioned_variables.fixed_size_partitioner(
                  self._num_unit_shards))
        self._linear1 = _Linear([inputs, m_prev], 4 * self._num_units, True)

    # i = input_gate, j = new_input, f = forget_gate, o = output_gate
    lstm_matrix = self._linear1([inputs, m_prev])
    i, j, f, o = array_ops.split(
        value=lstm_matrix, num_or_size_splits=4, axis=1)
    # Diagonal connections
    if self._use_peepholes and not self._w_f_diag:
      scope = vs.get_variable_scope()
      with vs.variable_scope(
          scope, initializer=self._initializer) as unit_scope:
        with vs.variable_scope(unit_scope):
          self._w_f_diag = vs.get_variable(
              "w_f_diag", shape=[self._num_units], dtype=dtype)
          self._w_i_diag = vs.get_variable(
              "w_i_diag", shape=[self._num_units], dtype=dtype)
          self._w_o_diag = vs.get_variable(
              "w_o_diag", shape=[self._num_units], dtype=dtype)

    if self._use_peepholes:
      c = (sigmoid(f + self._forget_bias + self._w_f_diag * c_prev) * c_prev +
           sigmoid(i + self._w_i_diag * c_prev) * self._activation(j))
    else:
      c = (sigmoid(f + self._forget_bias) * c_prev + sigmoid(i) *
           self._activation(j))

    if self._cell_clip is not None:
      # pylint: disable=invalid-unary-operand-type
      c = clip_ops.clip_by_value(c, -self._cell_clip, self._cell_clip)
      # pylint: enable=invalid-unary-operand-type
    if self._use_peepholes:
      m = sigmoid(o + self._w_o_diag * c) * self._activation(c)
    else:
      m = sigmoid(o) * self._activation(c)

    if self._num_proj is not None:
      if self._linear2 is None:
        scope = vs.get_variable_scope()
        with vs.variable_scope(scope, initializer=self._initializer):
          with vs.variable_scope("projection") as proj_scope:
            if self._num_proj_shards is not None:
              proj_scope.set_partitioner(
                  partitioned_variables.fixed_size_partitioner(
                      self._num_proj_shards))
            self._linear2 = _Linear(m, self._num_proj, False)
      m = self._linear2(m)

      if self._proj_clip is not None:
        # pylint: disable=invalid-unary-operand-type
        m = clip_ops.clip_by_value(m, -self._proj_clip, self._proj_clip)
        # pylint: enable=invalid-unary-operand-type

    new_state = (LSTMStateTuple(c, m) if self._state_is_tuple else
                 array_ops.concat([c, m], 1))
    return m, new_state

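LSTMCell returns two values: the output m, which is the hidden_state, and new_state, which packs together c (the cell's memory, called the final memory) and m, either as an LSTMStateTuple(c, m) or, when state_is_tuple is False, as their concatenation.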
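The default path through the code above (no peepholes, no projection, no clipping) can be condensed into a short plain-Python sketch for a single example. The function name `lstm_step`, the single combined weight matrix, and the all-zero toy weights are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, state, weights, bias, forget_bias=1.0):
    """One LSTM step; state is the pair (c_prev, m_prev)."""
    c_prev, m_prev = state
    n = len(c_prev)                                   # num_units
    xm = x_t + m_prev                                 # concatenate [inputs, m_prev]
    z = [sum(v * weights[k][col] for k, v in enumerate(xm)) + bias[col]
         for col in range(4 * n)]
    # i = input gate, j = new input, f = forget gate, o = output gate
    i, j, f, o = z[0:n], z[n:2 * n], z[2 * n:3 * n], z[3 * n:4 * n]
    c = [sigmoid(fk + forget_bias) * ck + sigmoid(ik) * math.tanh(jk)
         for fk, ck, ik, jk in zip(f, c_prev, i, j)]
    m = [sigmoid(ok) * math.tanh(ck) for ok, ck in zip(o, c)]
    return m, (c, m)                                  # output, new_state

# Assumed toy sizes: input dimension 1, num_units 1, all weights zero.
weights = [[0.0] * 4 for _ in range(2)]
bias = [0.0] * 4
output, (c, m) = lstm_step([1.0], ([1.0], [0.0]), weights, bias)
```

With zero weights the forget gate is sigmoid(forget_bias), so the old memory of 1.0 is scaled to about 0.73, and the output is half of tanh of that value.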

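Encoders and decoders

Depending on the use case, encoders include rnn_encoder for text, image_encoder for images, and conv_encoder/pooling_encoder, which use convolution or pooling. We focus on rnn_encoder for text. By direction, encoders are divided into UnidirectionalRNNEncoder and BidirectionalRNNEncoder. They only provide the framework logic; the Cell used inside can be any kind of RNNCell. UnidirectionalRNNEncoder and BidirectionalRNNEncoder call tf.nn.dynamic_rnn and tf.nn.bidirectional_dynamic_rnn respectively, both provided in python2.7/site-packages/tensorflow/python/ops/rnn.py, which implement the core logic of unidirectional and bidirectional RNNs.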

class UnidirectionalRNNEncoder(Encoder):
  """
  A unidirectional RNN encoder. Stacking should be performed as
  part of the cell.

  Args:
    cell: An instance of tf.contrib.rnn.RNNCell
    name: A name for the encoder
  """

  def __init__(self, params, mode, name="forward_rnn_encoder"):
    super(UnidirectionalRNNEncoder, self).__init__(params, mode, name)
    self.params["rnn_cell"] = _toggle_dropout(self.params["rnn_cell"], mode)

  @staticmethod
  def default_params():
    return {
        "rnn_cell": _default_rnn_cell_params(),
        "init_scale": 0.04,
    }

  def encode(self, inputs, sequence_length, **kwargs):
    scope = tf.get_variable_scope()
    scope.set_initializer(tf.random_uniform_initializer(
        -self.params["init_scale"],
        self.params["init_scale"]))

    cell = training_utils.get_rnn_cell(**self.params["rnn_cell"])
    outputs, state = tf.nn.dynamic_rnn(
        cell=cell,
        inputs=inputs,
        sequence_length=sequence_length,
        dtype=tf.float32,
        **kwargs)
    return EncoderOutput(
        outputs=outputs,
        final_state=state,
        attention_values=outputs,
        attention_values_length=sequence_length)

class BidirectionalRNNEncoder(Encoder):
  """
  A bidirectional RNN encoder. Uses the same cell for both the
  forward and backward RNN. Stacking should be performed as part of
  the cell.

  Args:
    cell: An instance of tf.contrib.rnn.RNNCell
    name: A name for the encoder
  """

  def __init__(self, params, mode, name="bidi_rnn_encoder"):
    super(BidirectionalRNNEncoder, self).__init__(params, mode, name)
    self.params["rnn_cell"] = _toggle_dropout(self.params["rnn_cell"], mode)

  @staticmethod
  def default_params():
    return {
        "rnn_cell": _default_rnn_cell_params(),
        "init_scale": 0.04,
    }

  def encode(self, inputs, sequence_length, **kwargs):
    scope = tf.get_variable_scope()
    scope.set_initializer(tf.random_uniform_initializer(
        -self.params["init_scale"],
        self.params["init_scale"]))

    cell_fw = training_utils.get_rnn_cell(**self.params["rnn_cell"])
    cell_bw = training_utils.get_rnn_cell(**self.params["rnn_cell"])
    outputs, states = tf.nn.bidirectional_dynamic_rnn(
        cell_fw=cell_fw,
        cell_bw=cell_bw,
        inputs=inputs,
        sequence_length=sequence_length,
        dtype=tf.float32,
        **kwargs)

    # Concatenate outputs and states of the forward and backward RNNs
    outputs_concat = tf.concat(outputs, 2)

    return EncoderOutput(
        outputs=outputs_concat,
        final_state=states,
        attention_values=outputs_concat,
        attention_values_length=sequence_length)
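
The idea behind the bidirectional encoder can be sketched without TensorFlow: run one RNN left to right, another right to left, re-align the backward outputs, and concatenate the two outputs at each time step. Everything below is illustrative: `toy_step` is a hypothetical stand-in for a real cell, both directions reuse it for brevity, and padding and sequence_length handling are ignored.

```python
import math

def run_rnn(inputs, step_fn, h0):
    """Apply step_fn over the sequence; return per-step outputs and final state."""
    h, outputs = h0, []
    for x_t in inputs:
        h = step_fn(x_t, h)
        outputs.append(h)
    return outputs, h

def toy_step(x_t, h_prev):
    """Hypothetical stand-in for an RNN cell with num_units = 1."""
    return [math.tanh(x_t[0] + 0.5 * h_prev[0])]

def bidirectional_encode(inputs, step_fw, step_bw, h0):
    out_fw, state_fw = run_rnn(inputs, step_fw, h0)
    out_bw_reversed, state_bw = run_rnn(inputs[::-1], step_bw, h0)
    out_bw = out_bw_reversed[::-1]   # re-align with the original time order
    # Concatenate forward and backward outputs at each time step.
    outputs_concat = [fw + bw for fw, bw in zip(out_fw, out_bw)]
    return outputs_concat, (state_fw, state_bw)

sequence = [[1.0], [2.0], [3.0]]
outputs_concat, (state_fw, state_bw) = bidirectional_encode(
    sequence, toy_step, toy_step, [0.0])
```

Each entry of `outputs_concat` has twice the width of a single cell's output, which is why the encoder above concatenates along axis 2.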

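Decoders are divided into BasicDecoder, AttentionDecoder, and BeamSearchDecoder. At each step, BasicDecoder passes inputs and state into the cell, producing cell_output and cell_state, which feed the next step. It also converts cell_output into a score (logit) for each possible output token through a fully connected layer (in compute_output). self.helper.sample then selects the token id for this step from those logits; how it selects (for example, taking the highest-scoring token) depends on the helper in use.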

class BasicDecoder(RNNDecoder):
  """Simple RNN decoder that performed a softmax operations on the cell output.
  """

  def __init__(self, params, mode, vocab_size, name="basic_decoder"):
    super(BasicDecoder, self).__init__(params, mode, name)
    self.vocab_size = vocab_size

  def compute_output(self, cell_output):
    """Computes the decoder outputs."""
    return tf.contrib.layers.fully_connected(
        inputs=cell_output, num_outputs=self.vocab_size, activation_fn=None)

  @property
  def output_size(self):
    return DecoderOutput(
        logits=self.vocab_size,
        predicted_ids=tf.TensorShape([]),
        cell_output=self.cell.output_size)

  @property
  def output_dtype(self):
    return DecoderOutput(
        logits=tf.float32, predicted_ids=tf.int32, cell_output=tf.float32)

  def initialize(self, name=None):
    finished, first_inputs = self.helper.initialize()
    return finished, first_inputs, self.initial_state

  def step(self, time_, inputs, state, name=None):
    cell_output, cell_state = self.cell(inputs, state)
    logits = self.compute_output(cell_output)
    sample_ids = self.helper.sample(
        time=time_, outputs=logits, state=cell_state)
    outputs = DecoderOutput(
        logits=logits, predicted_ids=sample_ids, cell_output=cell_output)
    finished, next_inputs, next_state = self.helper.next_inputs(
        time=time_, outputs=outputs, state=cell_state, sample_ids=sample_ids)
    return (outputs, next_state, next_inputs, finished)
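
One decoding step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: `toy_cell`, the greedy (argmax) choice of token, the weight values, and the embedding table are all hypothetical, and real helpers may select the next token differently.

```python
def fully_connected(cell_output, weights, bias):
    """Map cell_output to one logit per vocabulary entry (no activation)."""
    return [sum(h * weights[k][v] for k, h in enumerate(cell_output)) + bias[v]
            for v in range(len(bias))]

def greedy_sample(logits):
    """Pick the id of the highest-scoring token."""
    return max(range(len(logits)), key=lambda idx: logits[idx])

def decode_step(inputs, state, cell_fn, weights, bias, embeddings):
    cell_output, cell_state = cell_fn(inputs, state)
    logits = fully_connected(cell_output, weights, bias)
    sample_id = greedy_sample(logits)
    next_inputs = embeddings[sample_id]   # feed the chosen token back in
    return logits, sample_id, cell_state, next_inputs

def toy_cell(inputs, state):
    """Hypothetical stand-in cell: passes its input through unchanged."""
    return inputs, inputs

# Assumed toy sizes: cell output dimension 2, vocabulary of 3 tokens.
weights = [[1.0, 0.0, -1.0], [0.0, 2.0, 0.0]]
bias = [0.0, 0.0, 0.0]
embeddings = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
logits, sample_id, state, next_inputs = decode_step(
    [1.0, 1.0], [0.0, 0.0], toy_cell, weights, bias, embeddings)
```

Here the logits are [1.0, 2.0, -1.0], so token 1 is chosen and its embedding becomes the input to the next step.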

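Compared with BasicDecoder, AttentionDecoder adds an attention mechanism. BeamSearchDecoder keeps the best few candidate results at each step (whereas the other two decoders keep only the single most probable result at each step), combining pruning with a greedy strategy. This widens the set of candidates and raises the chance of finding the best overall sequence. The code is not reproduced here.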
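Since the beam search code is not shown, here is a minimal sketch of the idea in plain Python. It is not the BeamSearchDecoder implementation: it has no end-of-sequence handling or length penalty, and `toy_model` is a hypothetical two-token model constructed so that the locally best first choice is not the globally best one.

```python
import math

def beam_search(next_log_probs, beam_width, max_len):
    """Keep the beam_width best prefixes at every step."""
    beams = [((), 0.0)]   # (token prefix, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, logp in enumerate(next_log_probs(prefix)):
                candidates.append((prefix + (token,), score + logp))
        # Prune: keep only the highest-scoring hypotheses.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

def toy_model(prefix):
    """Hypothetical model returning log-probabilities for tokens 0 and 1."""
    if not prefix:
        return [math.log(0.6), math.log(0.4)]
    if prefix[0] == 0:
        return [math.log(0.5), math.log(0.5)]
    return [math.log(0.9), math.log(0.1)]

greedy = beam_search(toy_model, beam_width=1, max_len=2)[0]
best = beam_search(toy_model, beam_width=2, max_len=2)[0]
print(greedy[0], best[0])  # (0, 0) (1, 0)
```

With a beam width of 1 the search commits to token 0 first and ends with probability 0.3, while a beam width of 2 also keeps token 1 alive and finds the sequence (1, 0) with probability 0.36.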

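Seq2Seq models

A Seq2Seq model is built on top of an encoder and a decoder and combines the two; the encoder and decoder can in turn each use a different kind of RNNCell.
A Seq2Seq model is configured declaratively: the configuration specifies which encoder, which decoder, which attention mechanism, and so on to use: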

# Seq2SeqModel.default_params()
  def default_params():
    params = ModelBase.default_params()
    params.update({
        "source.max_seq_len": 50,
        "source.reverse": True,
        "target.max_seq_len": 50,
        "embedding.dim": 100,
        "embedding.init_scale": 0.04,
        "embedding.share": False,
        "inference.beam_search.beam_width": 0,
        "inference.beam_search.length_penalty_weight": 0.0,
        "inference.beam_search.choose_successors_fn": "choose_top_k",
        "optimizer.clip_embed_gradients": 0.1,
        "vocab_source": "",
        "vocab_target": "",
    })
    return params

# BasicSeq2Seq.default_params()
  def default_params():
    params = Seq2SeqModel.default_params().copy()
    params.update({
        "bridge.class": "seq2seq.models.bridges.InitialStateBridge",
        "bridge.params": {},
        "encoder.class": "seq2seq.encoders.UnidirectionalRNNEncoder",
        "encoder.params": {},  # Arbitrary parameters for the encoder
        "decoder.class": "seq2seq.decoders.BasicDecoder",
        "decoder.params": {}  # Arbitrary parameters for the decoder
    })
    return params

# AttentionSeq2Seq.default_params()
  def default_params():
    params = BasicSeq2Seq.default_params().copy()
    params.update({
        "attention.class": "AttentionLayerBahdanau",
        "attention.params": {}, # Arbitrary attention layer parameters
        "bridge.class": "seq2seq.models.bridges.ZeroBridge",
        "encoder.class": "seq2seq.encoders.BidirectionalRNNEncoder",
        "encoder.params": {},  # Arbitrary parameters for the encoder
        "decoder.class": "seq2seq.decoders.AttentionDecoder",
        "decoder.params": {}  # Arbitrary parameters for the decoder
    })
    return params
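
The layering in these default_params functions is plain dictionary merging: each model copies its parent's defaults and overrides a few keys. The sketch below reproduces that pattern with the values quoted above; the free functions and the `with_overrides` helper are hypothetical and only illustrate how a user-supplied configuration might be applied on top.

```python
def basic_seq2seq_params():
    return {
        "bridge.class": "seq2seq.models.bridges.InitialStateBridge",
        "encoder.class": "seq2seq.encoders.UnidirectionalRNNEncoder",
        "decoder.class": "seq2seq.decoders.BasicDecoder",
    }

def attention_seq2seq_params():
    params = basic_seq2seq_params().copy()
    params.update({
        "attention.class": "AttentionLayerBahdanau",
        "bridge.class": "seq2seq.models.bridges.ZeroBridge",
        "encoder.class": "seq2seq.encoders.BidirectionalRNNEncoder",
        "decoder.class": "seq2seq.decoders.AttentionDecoder",
    })
    return params

def with_overrides(defaults, overrides):
    """Hypothetical helper: reject unknown keys, then apply user overrides."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError("Unknown parameter(s): %s" % sorted(unknown))
    merged = dict(defaults)
    merged.update(overrides)
    return merged

# Start from the attention model's defaults and swap in a different decoder.
params = with_overrides(
    attention_seq2seq_params(),
    {"decoder.class": "seq2seq.decoders.BasicDecoder"})
```

The result keeps the bidirectional encoder from the attention defaults while using the decoder chosen by the override.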

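Customizing the encoder and decoder

As an alternative to calling a ready-made seq2seq model directly, you can build your own encoder and combine it freely with a decoder, then verify experimentally which combination works better. If the ready-made encoder-decoder models in seq2seq.py do not meet your needs, you can also write a custom decoder.

Summary: This article aims to help beginners cut through the many confusing descriptions and see, from a higher level, how Cells, the chain (unrolled) structure, and seq2seq models relate to one another.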

References:
https://github.com/tensorflow/nmt
https://cs224d.stanford.edu/lecture_notes/LectureNotes4.pdf
https://tensorflowkorea.files.wordpress.com/2017/03/cs224n-2017winter-notes-all.pdf

The Three Building Blocks of Neural Networks: Cell, Encoder/Decoder, and Seq2Seq Models
