WebRTC video frame rate control (Part 1): frame dropping based on target bitrate

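Preface

The frame dropping described in this article is decided by comparing the encoded bitrate with the target bitrate.
The next article covers frame dropping that is driven by the target frame rate.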

http://www.itdecent.cn/p/fe303bdabc26

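Frame-dropping strategies can therefore be classified as follows:
  • Dropping decided by the encoded bitrate versus the target bitrate
  • Dropping decided by the target frame rate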

An algorithm used repeatedly throughout frame rate control: exponentially weighted filtering (a provisional name)

In exp_filter.cc:

#include "webrtc/base/exp_filter.h"
#include <math.h>
namespace rtc {
const float ExpFilter::kValueUndefined = -1.0f;
void ExpFilter::Reset(float alpha) {
  alpha_ = alpha;
  filtered_ = kValueUndefined;
}
float ExpFilter::Apply(float exp, float sample) {
  if (filtered_ == kValueUndefined) {
    // Initialize filtered value.
    filtered_ = sample;
  } else if (exp == 1.0) {
    filtered_ = alpha_ * filtered_ + (1 - alpha_) * sample;
  } else {
    float alpha = pow(alpha_, exp);
    filtered_ = alpha  * filtered_ + (1 - alpha)  * sample;
  }
  if (max_ != kValueUndefined && filtered_ > max_) {
    filtered_ = max_;
  }
  return filtered_;
}
void ExpFilter::UpdateBase(float alpha) {
  alpha_ = alpha;
}
}  // namespace rtc

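The idea of this file is an exponentially weighted sum of the history value and the current sample. The first sample simply initializes the filter; after that the formula is:

f(x)=alpha*f(x-1)+(1-alpha)*sample;
alpha=pow(alpha_, exp);

Here alpha_ is a configured constant, exp is the exponent, and sample is the newest sample value.
The result is then clamped:

f(x)=min(f(x),max); that is, it never exceeds max (when max is defined).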
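One consequence of alpha=pow(alpha_, exp) is that applying a sample once with exp = 2 gives the same result as applying it twice with exp = 1. The following is a minimal sketch of that property, not the WebRTC class itself: `SimpleExpFilter`, `TwoUnitSteps` and `OneDoubleStep` are hypothetical names, the max clamp is left out, and a bool flag replaces the kValueUndefined sentinel.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for rtc::ExpFilter, without the max clamp.
class SimpleExpFilter {
 public:
  explicit SimpleExpFilter(float alpha) : alpha_(alpha) {
    assert(alpha >= 0.0f && alpha <= 1.0f);
  }

  // Same update rule as the quoted Apply(): the first sample initializes
  // the state, later samples are blended with weight 1 - alpha_^exp.
  float Apply(float exp, float sample) {
    if (!initialized_) {
      filtered_ = sample;
      initialized_ = true;
    } else {
      const float alpha = std::pow(alpha_, exp);
      filtered_ = alpha * filtered_ + (1.0f - alpha) * sample;
    }
    return filtered_;
  }

 private:
  float alpha_;
  float filtered_ = 0.0f;
  bool initialized_ = false;
};

// Initializes with `start`, then applies `sample` twice with exp = 1.
inline float TwoUnitSteps(float alpha, float start, float sample) {
  SimpleExpFilter f(alpha);
  f.Apply(1.0f, start);
  f.Apply(1.0f, sample);
  return f.Apply(1.0f, sample);
}

// Initializes with `start`, then applies `sample` once with exp = 2.
inline float OneDoubleStep(float alpha, float start, float sample) {
  SimpleExpFilter f(alpha);
  f.Apply(1.0f, start);
  return f.Apply(2.0f, sample);
}
```

With alpha = 0.9, a start value of 0 and a sample of 1, both paths land on 0.19 (0.1 after the first unit step, 0.19 after the second), which is why exp can be read as "how many update periods this sample covers".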

Calling the frame dropper

bool MediaOptimization::DropFrame() {
  CriticalSectionScoped lock(crit_sect_.get());
  UpdateIncomingFrameRate();
  // Leak appropriate number of bytes.
  frame_dropper_->Leak((uint32_t)(InputFrameRateInternal() + 0.5f));
  if (video_suspended_) {
    return true;  // Drop all frames when muted.
  }
  return frame_dropper_->DropFrame();
}

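Explanation:

  • UpdateIncomingFrameRate(); updates the captured (incoming) frame rate.
  • frame_dropper_->Leak((uint32_t)(InputFrameRateInternal() + 0.5f)); uses the captured frame rate to update the drop ratio and the other key frame-dropping state.
  • return frame_dropper_->DropFrame(); uses the drop ratio computed above to spread the drops evenly.
    The implementations of these functions are covered one by one below.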

Updating the captured frame rate

void MediaOptimization::UpdateIncomingFrameRate() {
  int64_t now = clock_->TimeInMilliseconds();
  if (incoming_frame_times_[0] == 0) {
    // No shifting if this is the first time.
  } else {
    // Shift all times one step.
    for (int32_t i = (kFrameCountHistorySize - 2); i >= 0; i--) {
      incoming_frame_times_[i + 1] = incoming_frame_times_[i];
    }
  }
  incoming_frame_times_[0] = now;
  ProcessIncomingFrameRate(now);
}
//framerate=n/t
void MediaOptimization::ProcessIncomingFrameRate(int64_t now) {
  int32_t num = 0;
  int32_t nr_of_frames = 0;
  for (num = 1; num < (kFrameCountHistorySize - 1); ++num) {
    if (incoming_frame_times_[num] <= 0 ||
        // don't use data older than 2 s
        now - incoming_frame_times_[num] > kFrameHistoryWinMs) {
      break;
    } else {
      nr_of_frames++;
    }
  }
  if (num > 1) {
    const int64_t diff = now - incoming_frame_times_[num - 1];
    incoming_frame_rate_ = 1.0;
    if (diff > 0) {
      incoming_frame_rate_ = nr_of_frames * 1000.0f / static_cast<float>(diff);
    }
  }
}

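Explanation:
This part is easy to follow. From the arrival time of each frame, over at most the last 2 seconds, the formula
incoming_frame_rate_ = nr_of_frames * 1000.0f / static_cast<float>(diff);
gives the captured frame rate for that period.
As for the history data,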

  for (int32_t i = (kFrameCountHistorySize - 2); i >= 0; i--) {
     incoming_frame_times_[i + 1] = incoming_frame_times_[i];
   }

this is clearly a sliding window: only the most recent kFrameCountHistorySize entries are ever used.
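The windowed estimate can be tried in isolation. The sketch below is a simplified rewrite under stated assumptions, not the MediaOptimization code: `EstimateFrameRate` is a hypothetical free function, the history is a std::vector with the newest time at index 0, the loop runs to the end of the vector rather than to kFrameCountHistorySize - 1, and returning 0 when no history is usable is this sketch's own choice (the quoted code leaves the previous value untouched in that case).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// times_ms[0] is the newest arrival time; larger indices are older.
// A value <= 0 marks an unused slot, as in the quoted code.
float EstimateFrameRate(const std::vector<int64_t>& times_ms, int64_t now_ms,
                        int64_t window_ms) {
  assert(window_ms > 0);
  std::size_t num = 1;
  int frames = 0;
  // Count older frames that are valid and still inside the window.
  for (; num < times_ms.size(); ++num) {
    if (times_ms[num] <= 0 || now_ms - times_ms[num] > window_ms) break;
    ++frames;
  }
  if (num <= 1) return 0.0f;  // no usable history (sketch-specific choice)
  const int64_t diff = now_ms - times_ms[num - 1];
  if (diff <= 0) return 1.0f;  // mirrors the 1.0 fallback in the quoted code
  return frames * 1000.0f / static_cast<float>(diff);
}
```

For frames arriving every 100 ms and a 2000 ms window, 20 older frames fall inside the window and the oldest of them is 2000 ms old, so the estimate is 20 * 1000 / 2000 = 10 fps.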

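Main implementation of the frame-dropping algorithm

The frame-dropping algorithm lives entirely in frame_dropper.cc. The code is read through first, and the algorithm is then discussed in detail.
The following is the content of frame_dropper.cc, with annotations: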

/*
 *  Copyright (c) 2011 The WebRTC project authors. All Rights Reserved.
 *
 *  Use of this source code is governed by a BSD-style license
 *  that can be found in the LICENSE file in the root of the source
 *  tree. An additional intellectual property rights grant can be found
 *  in the file PATENTS.  All contributing project authors may
 *  be found in the AUTHORS file in the root of the source tree.
 */

#include "webrtc/modules/video_coding/utility/include/frame_dropper.h"

#include "webrtc/system_wrappers/interface/trace.h"

namespace webrtc
{

const float kDefaultKeyFrameSizeAvgKBits = 0.9f;
const float kDefaultKeyFrameRatio = 0.99f;
const float kDefaultDropRatioAlpha = 0.9f;
const float kDefaultDropRatioMax = 0.96f;
const float kDefaultMaxTimeToDropFrames = 4.0f;  // In seconds.

FrameDropper::FrameDropper()
:
_keyFrameSizeAvgKbits(kDefaultKeyFrameSizeAvgKBits),
_keyFrameRatio(kDefaultKeyFrameRatio),
_dropRatio(kDefaultDropRatioAlpha, kDefaultDropRatioMax),
_enabled(true),
_max_time_drops(kDefaultMaxTimeToDropFrames)
{
    Reset();
}

FrameDropper::FrameDropper(float max_time_drops)
:
_keyFrameSizeAvgKbits(kDefaultKeyFrameSizeAvgKBits),
_keyFrameRatio(kDefaultKeyFrameRatio),
_dropRatio(kDefaultDropRatioAlpha, kDefaultDropRatioMax),
_enabled(true),
_max_time_drops(max_time_drops)
{
    Reset();
}

void
FrameDropper::Reset()
{
    _keyFrameRatio.Reset(0.99f);
    _keyFrameRatio.Apply(1.0f, 1.0f/300.0f); // 1 key frame every 10th second in 30 fps
    _keyFrameSizeAvgKbits.Reset(0.9f);
    _keyFrameCount = 0;
    _accumulator = 0.0f;
    _accumulatorMax = 150.0f; // assume 300 kb/s and 0.5 s window
    _targetBitRate = 300.0f;
    _incoming_frame_rate = 30;
    _keyFrameSpreadFrames = 0.5f * _incoming_frame_rate;
    _dropNext = false;
    _dropRatio.Reset(0.9f);
    _dropRatio.Apply(0.0f, 0.0f); // Initialize to 0
    _dropCount = 0;
    _windowSize = 0.5f;
    _wasBelowMax = true;
    _fastMode = false; // start with normal (non-aggressive) mode
    // Cap for the encoder buffer level/accumulator, in secs.
    _cap_buffer_size = 3.0f;
    // Cap on maximum amount of dropped frames between kept frames, in secs.
    _max_time_drops = 4.0f;
}

void
FrameDropper::Enable(bool enable)
{
    _enabled = enable;
}

//deltaFrame : 0:key frame 1:P frame
void
FrameDropper::Fill(size_t frameSizeBytes, bool deltaFrame)
{
    if (!_enabled)
    {
        return;
    }
    float frameSizeKbits = 8.0f * static_cast<float>(frameSizeBytes) / 1000.0f;
    if (!deltaFrame && !_fastMode) // fast mode does not treat key-frames any different; this branch: key frame and not fast mode
    {
        // With exp == 1.0: filtered_ = alpha_ * filtered_ + (1 - alpha_) * sample. With alpha_ = 0.8 or 0.9 the history outweighs the current sample.
        _keyFrameSizeAvgKbits.Apply(1, frameSizeKbits);
        _keyFrameRatio.Apply(1.0, 1.0); // _keyFrameRatio also favors history; the sample is 1 because this frame is a key frame
        if (frameSizeKbits > _keyFrameSizeAvgKbits.filtered()) // current size exceeds the average
        {
            // Remove the average key frame size since we
            // compensate for key frames when adding delta
            // frames.
            frameSizeKbits -= _keyFrameSizeAvgKbits.filtered(); // keep only the part above the average
        }
        else
        {
            // Shouldn't be negative, so zero is the lower bound.
            frameSizeKbits = 0;
        }
        if (_keyFrameRatio.filtered() > 1e-5 &&
            1 / _keyFrameRatio.filtered() < _keyFrameSpreadFrames)   //_keyFrameSpreadFrames = 0.5f * inputFrameRate;
        {
            // We are sending key frames more often than our upper bound for
            // how much we allow the key frame compensation to be spread
            // out in time. Therefor we must use the key frame ratio rather
            // than keyFrameSpreadFrames.
            _keyFrameCount =
                static_cast<int32_t>(1 / _keyFrameRatio.filtered() + 0.5); // author's guess: key frames per second? (1 / ratio is the average number of frames per key frame)
        }
        else
        {
            // Compensate for the key frame the following frames
            _keyFrameCount = static_cast<int32_t>(_keyFrameSpreadFrames + 0.5);
        }
    }
    else
    {
        // Decrease the keyFrameRatio
        _keyFrameRatio.Apply(1.0, 0.0); // delta (P) frame: sample = 0 pulls the filtered _keyFrameRatio down
    }
    // Change the level of the accumulator (bucket)
    _accumulator += frameSizeKbits; // _accumulator sums frameSizeKbits (for key frames, only the part above the average)
    CapAccumulator(); // cap at max_accumulator = _targetBitRate * _cap_buffer_size, i.e. 3 seconds' worth of the target bitrate
}

void
FrameDropper::Leak(uint32_t inputFrameRate)
{
    if (!_enabled)
    {
        return;
    }
    if (inputFrameRate < 1)
    {
        return;
    }
    if (_targetBitRate < 0.0f)
    {
        return;
    }
    _keyFrameSpreadFrames = 0.5f * inputFrameRate;
    // T is the expected bits per frame (target). If all frames were the same size,
    // we would get T bits per frame. Notice that T is also weighted to be able to
    // force a lower frame rate if wanted.
    float T = _targetBitRate / inputFrameRate; // T: expected kbits per frame. Judging from the code below, T is the budget of a delta (P) frame; key frames are compensated separately
    if (_keyFrameCount > 0)
    {
        // Perform the key frame compensation
        if (_keyFrameRatio.filtered() > 0 &&
            1 / _keyFrameRatio.filtered() < _keyFrameSpreadFrames)
        {
            T -= _keyFrameSizeAvgKbits.filtered() * _keyFrameRatio.filtered(); // average key-frame size amortized over every frame, in kbits
        }
        else
        {
            T -= _keyFrameSizeAvgKbits.filtered() / _keyFrameSpreadFrames; // average key-frame size spread over _keyFrameSpreadFrames frames
        }
        _keyFrameCount--; // one more frame has carried its share of the compensation
    }
    _accumulator -= T; // the accumulator grows after encoding (Fill) and is drained by the per-frame budget before encoding (Leak)
    if (_accumulator < 0.0f)
    {
        _accumulator = 0.0f;
    }
    UpdateRatio();
}

void
FrameDropper::UpdateNack(uint32_t nackBytes)
{
    if (!_enabled)
    {
        return;
    }
    _accumulator += static_cast<float>(nackBytes) * 8.0f / 1000.0f;
}

void
FrameDropper::FillBucket(float inKbits, float outKbits)
{
    _accumulator += (inKbits - outKbits);
}

void
FrameDropper::UpdateRatio()
{
    if (_accumulator > 1.3f * _accumulatorMax) // _accumulatorMax = bitRate * _windowSize; when the accumulator is far too full, lower alpha so _dropRatio weights the current sample more
    {
        // Too far above accumulator max, react faster
        _dropRatio.UpdateBase(0.8f);
    }
    else
    {
        // Go back to normal reaction
        _dropRatio.UpdateBase(0.9f);
    }
    if (_accumulator > _accumulatorMax)
    {
        // We are above accumulator max, and should ideally
        // drop a frame. Increase the dropRatio and drop
        // the frame later.
        if (_wasBelowMax) // _wasBelowMax = _accumulator < _accumulatorMax on the previous update
        {
            _dropNext = true; // drop the next frame
        }
        if (_fastMode)
        {
            // always drop in aggressive mode
            _dropNext = true;
        }

        _dropRatio.Apply(1.0f, 1.0f); // a drop is wanted, so the sample is 1
        _dropRatio.UpdateBase(0.9f);
    }
    else
    {
        _dropRatio.Apply(1.0f, 0.0f); // no drop wanted, so the sample is 0
    }
    _wasBelowMax = _accumulator < _accumulatorMax;
}

// This function signals when to drop frames to the caller. It makes use of the dropRatio
// to smooth out the drops over time.
bool
FrameDropper::DropFrame()
{
    if (!_enabled)
    {
        return false;
    }
    if (_dropNext)
    {
        _dropNext = false;
        _dropCount = 0;
    }

    if (_dropRatio.filtered() >= 0.5f) // Drops per keep: after a kept frame the next frame is dropped, so at least one of every two frames is dropped
    {
        // limit is the number of frames we should drop between each kept frame
        // to keep our drop ratio. limit is positive in this case.
        float denom = 1.0f - _dropRatio.filtered(); // denom: the denominator, i.e. the keep ratio
        if (denom < 1e-5)
        {
            denom = (float)1e-5;
        }
        int32_t limit = static_cast<int32_t>(1.0f / denom - 1.0f + 0.5f); // per the comment above, limit is the number of frames to drop: for each kept frame, the following limit frames are dropped
        // Put a bound on the max amount of dropped frames between each kept
        // frame, in terms of frame rate and window size (secs).
        int max_limit = static_cast<int>(_incoming_frame_rate *
                                         _max_time_drops); // 4 x frame rate: up to 4 seconds' worth of consecutive frames may be dropped, which seems too large (author's remark)
        if (limit > max_limit) {
          limit = max_limit;
        }
        if (_dropCount < 0) // _dropCount: frames already dropped in the current round
        {
            // Reset the _dropCount since it was negative and should be positive.
            if (_dropRatio.filtered() > 0.4f)
            {
                _dropCount = -_dropCount;
            }
            else
            {
                _dropCount = 0;
            }
        }
        if (_dropCount < limit) // keep dropping until limit frames have been dropped
        {
            // As long we are below the limit we should drop frames.
            _dropCount++;
            return true;
        }
        else
        {
            // Only when we reset _dropCount a frame should be kept.
            _dropCount = 0;
            return false;
        }
    }
    else if (_dropRatio.filtered() > 0.0f &&
        _dropRatio.filtered() < 0.5f) // Keeps per drop: the next frame may or may not be dropped, i.e. one frame is dropped every several frames
    {
        // limit is the number of frames we should keep between each drop
        // in order to keep the drop ratio. limit is negative in this case,
        // and the _dropCount is also negative.
        float denom = _dropRatio.filtered();
        if (denom < 1e-5)
        {
            denom = (float)1e-5;
        }
        int32_t limit = -static_cast<int32_t>(1.0f / denom - 1.0f + 0.5f);
        if (_dropCount > 0)
        {
            // Reset the _dropCount since we have a positive
            // _dropCount, and it should be negative.
            if (_dropRatio.filtered() < 0.6f)
            {
                _dropCount = -_dropCount;
            }
            else
            {
                _dropCount = 0;
            }
        }
        if (_dropCount > limit)
        {
            if (_dropCount == 0)
            {
                // Drop frames when we reset _dropCount.
                _dropCount--;
                return true; // drop; only one frame is dropped per round
            }
            else
            {
                // Keep frames as long as we haven't reached limit.
                _dropCount--;
                return false; // keep; once _dropCount reaches limit it is reset to 0 and a new round begins
            }
        }
        else
        {
            _dropCount = 0;
            return false;
        }
    }
    _dropCount = 0;
    return false;

    // A simpler version, unfiltered and quicker
    //bool dropNext = _dropNext;
    //_dropNext = false;
    //return dropNext;
}

void
FrameDropper::SetRates(float bitRate, float incoming_frame_rate)
{
    // Bit rate of -1 means infinite bandwidth.
    _accumulatorMax = bitRate * _windowSize; // bitRate * windowSize (in seconds)
    if (_targetBitRate > 0.0f && bitRate < _targetBitRate && _accumulator > _accumulatorMax)
    {
        // Rescale the accumulator level if the accumulator max decreases
        _accumulator = bitRate / _targetBitRate * _accumulator;
    }
    _targetBitRate = bitRate;
    CapAccumulator();
    _incoming_frame_rate = incoming_frame_rate;
}

float
FrameDropper::ActualFrameRate(uint32_t inputFrameRate) const
{
    if (!_enabled)
    {
        return static_cast<float>(inputFrameRate);
    }
    return inputFrameRate * (1.0f - _dropRatio.filtered()); // actual encoded frame rate
}

// Put a cap on the accumulator, i.e., don't let it grow beyond some level.
// This is a temporary fix for screencasting where very large frames from
// encoder will cause very slow response (too many frame drops).
void FrameDropper::CapAccumulator() {
  float max_accumulator = _targetBitRate * _cap_buffer_size;
  if (_accumulator > max_accumulator) {
    _accumulator = max_accumulator;
  }
}

}

1. The deciding factor for dropping is _dropRatio.Apply(1.0f, 1.0f); feeding _dropRatio a sample of 1 is what makes it nonzero. The call to _dropRatio.Apply(1.0f, 1.0f) in turn originates in

int32_t VCMEncodedFrameCallback::Encoded
->int32_t MediaOptimization::UpdateWithEncodedData
->FrameDropper::Fill(size_t frameSizeBytes, bool deltaFrame)

Fill updates _accumulator (the accumulator), and then

FrameDropper::Leak(uint32_t inputFrameRate)
->FrameDropper::UpdateRatio()

ends up calling either _dropRatio.Apply(1.0f, 1.0f) or _dropRatio.Apply(1.0f, 0.0f).
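Because _dropRatio is itself an exponential filter, a single over-limit update does not jump straight into heavy dropping. The sketch below counts how many consecutive over-limit updates (sample = 1) it takes for the smoothed ratio to reach a threshold, starting from 0 as Reset() does. `StepsUntilRatioReaches` is a hypothetical helper, and the 0.96 cap (kDefaultDropRatioMax) is deliberately ignored, so the threshold should stay below that value to match the real filter.

```cpp
#include <cassert>

// Counts filter updates with sample = 1 until the smoothed ratio
// reaches `threshold`. Assumes the ratio starts at 0 and has no max cap.
int StepsUntilRatioReaches(float alpha, float threshold) {
  assert(alpha > 0.0f && alpha < 1.0f);
  assert(threshold > 0.0f && threshold < 1.0f);
  float ratio = 0.0f;
  int steps = 0;
  while (ratio < threshold) {
    // Same blend as ExpFilter::Apply(1.0f, 1.0f).
    ratio = alpha * ratio + (1.0f - alpha) * 1.0f;
    ++steps;
  }
  return steps;
}
```

After n such updates the ratio is 1 - alpha^n. With the normal base of 0.9 it takes 7 updates to reach 0.5 (1 - 0.9^7 ≈ 0.52); with the faster base of 0.8, used when the accumulator is above 1.3 times its maximum, it takes 4 (1 - 0.8^4 ≈ 0.59).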

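2. How frames are dropped
This happens in FrameDropper::DropFrame(); the annotated code above explains it.

[Figure: drop.png]

When dropRatio >= 0.5, several frames may be dropped between two kept frames; when dropRatio < 0.5, at most one frame is dropped between two kept frames.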
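The two regimes use the same rounding formula with different denominators. The sketch below isolates just that arithmetic; `DropsPerKeep` and `KeepsPerDrop` are hypothetical names, the max_limit bound is omitted, and the sign convention is simplified (the quoted code stores the second limit as a negative number).

```cpp
#include <cassert>
#include <cstdint>

// For drop_ratio >= 0.5: frames dropped between two kept frames.
int32_t DropsPerKeep(float drop_ratio) {
  assert(drop_ratio >= 0.5f && drop_ratio <= 1.0f);
  float denom = 1.0f - drop_ratio;  // the keep ratio
  if (denom < 1e-5f) denom = 1e-5f;
  return static_cast<int32_t>(1.0f / denom - 1.0f + 0.5f);
}

// For 0 < drop_ratio < 0.5: frames kept between two dropped frames.
int32_t KeepsPerDrop(float drop_ratio) {
  assert(drop_ratio > 0.0f && drop_ratio < 0.5f);
  float denom = drop_ratio;
  if (denom < 1e-5f) denom = 1e-5f;
  return static_cast<int32_t>(1.0f / denom - 1.0f + 0.5f);
}
```

A ratio of 0.75 gives 3 drops per kept frame (3 of every 4 frames dropped), 0.5 gives 1 drop per kept frame, and 0.25 gives 3 kept frames per drop (1 of every 4 dropped), so in each case the pattern reproduces the requested ratio.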

3. Where the drop check is called

  • int32_t VideoSender::AddVideoFrame(), before the frame data is handed to the encoder

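4. How _accumulator controls the frame rate

  • In FrameDropper::Fill(), the size of each encoded frame is added to _accumulator. P frames are added in full; K frames contribute only the part above the average key-frame size.
  • For each captured frame that is about to go to the encoder, _targetBitRate / inputFrameRate; gives the expected size of one frame in kbits. The K-frame share is computed separately:
    _keyFrameSizeAvgKbits.filtered() * _keyFrameRatio.filtered();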
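The Fill/Leak pair behaves like a leaky bucket. The sketch below is a stripped-down model under stated assumptions, not the FrameDropper class: `LeakyBucket` and `FirstFrameOverLimit` are hypothetical names, every frame is treated as a delta frame (no key-frame compensation), there is no CapAccumulator step, and "over the limit" is reported directly instead of going through the smoothed drop ratio.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified bucket: levels are in kbits.
class LeakyBucket {
 public:
  LeakyBucket(float target_kbps, float window_s)
      : target_kbps_(target_kbps), max_kbits_(target_kbps * window_s) {
    assert(target_kbps > 0.0f && window_s > 0.0f);
  }

  // After encoding: add the encoded frame size.
  void Fill(std::size_t frame_bytes) {
    level_kbits_ += 8.0f * static_cast<float>(frame_bytes) / 1000.0f;
  }

  // Before encoding: drain one frame's budget, never below zero.
  void Leak(uint32_t input_fps) {
    if (input_fps < 1) return;
    level_kbits_ -= target_kbps_ / static_cast<float>(input_fps);
    if (level_kbits_ < 0.0f) level_kbits_ = 0.0f;
  }

  bool OverLimit() const { return level_kbits_ > max_kbits_; }

 private:
  float target_kbps_;
  float max_kbits_;
  float level_kbits_ = 0.0f;
};

// Feeds equally sized frames and returns the 1-based index of the first
// frame whose Leak() leaves the bucket over the limit, or -1 if none does.
int FirstFrameOverLimit(LeakyBucket bucket, uint32_t fps,
                        std::size_t frame_bytes, int max_frames) {
  for (int frame = 1; frame <= max_frames; ++frame) {
    bucket.Leak(fps);
    if (bucket.OverLimit()) return frame;
    bucket.Fill(frame_bytes);
  }
  return -1;
}
```

With the Reset() defaults (300 kbit/s, 0.5 s window, 30 fps) the per-frame budget is 10 kbits and the limit is 150 kbits. Frames of 2500 bytes (20 kbits) overshoot the budget by 10 kbits each, so the bucket first exceeds the limit at frame 17; frames of 1250 bytes (exactly 10 kbits) never do.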
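Open question:
Why does _accumulator add only the above-average part of a K frame, rather than the whole frame? The comment in Fill() appears to point at the answer: the average key-frame size is compensated in Leak(), where T is reduced for the following frames, so adding it again in Fill() would count it twice.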

5. When frames are dropped
_accumulator > _accumulatorMax;
where _accumulatorMax = bitRate * _windowSize; (_windowSize=0.5f)

## Updating _accumulator after encoding
This part only shows the flow that updates _accumulator once a frame has been encoded; it is fairly easy to follow.
```
int32_t VCMEncodedFrameCallback::Encoded(
    const EncodedImage& encodedImage,
    const CodecSpecificInfo* codecSpecificInfo,
    const RTPFragmentationHeader* fragmentationHeader) {
  post_encode_callback_->Encoded(encodedImage, NULL, NULL);

  if (_sendCallback == NULL) {
    return VCM_UNINITIALIZED;
  }

  RTPVideoHeader rtpVideoHeader;
  memset(&rtpVideoHeader, 0, sizeof(RTPVideoHeader));
  RTPVideoHeader* rtpVideoHeaderPtr = &rtpVideoHeader;
  CopyCodecSpecific(codecSpecificInfo, &rtpVideoHeaderPtr);

  int32_t callbackReturn = _sendCallback->SendData(
      _payloadType, encodedImage, *fragmentationHeader, rtpVideoHeaderPtr);
  if (callbackReturn < 0) {
    return callbackReturn;
  }

  if (_mediaOpt != NULL) {

    // Update the statistics after encoding
    _mediaOpt->UpdateWithEncodedData(encodedImage);

    if (_internalSource)
      return _mediaOpt->DropFrame();  // Signal to encoder to drop next frame.
  }
  return VCM_OK;
}
```

```
int32_t MediaOptimization::UpdateWithEncodedData(
    const EncodedImage& encoded_image) {
  size_t encoded_length = encoded_image._length;
  uint32_t timestamp = encoded_image._timeStamp;
  CriticalSectionScoped lock(crit_sect_.get());
  const int64_t now_ms = clock_->TimeInMilliseconds();
  PurgeOldFrameSamples(now_ms);
  if (encoded_frame_samples_.size() > 0 &&
      encoded_frame_samples_.back().timestamp == timestamp) {
    // Frames having the same timestamp are generated from the same input
    // frame. We don't want to double count them, but only increment the
    // size_bytes.
    encoded_frame_samples_.back().size_bytes += encoded_length;
    encoded_frame_samples_.back().time_complete_ms = now_ms;
  } else {
    encoded_frame_samples_.push_back(
        EncodedFrameSample(encoded_length, timestamp, now_ms));
  }
  UpdateSentBitrate(now_ms);
  UpdateSentFramerate();
  if (encoded_length > 0) {
    const bool delta_frame = encoded_image._frameType != kKeyFrame;//0:key 1:P

    // Feed the length of each encoded frame into frame_dropper via Fill
    frame_dropper_->Fill(encoded_length, delta_frame);

    if (max_payload_size_ > 0 && encoded_length > 0) {
      const float min_packets_per_frame =
          encoded_length / static_cast<float>(max_payload_size_);
      if (delta_frame) {
        loss_prot_logic_->UpdatePacketsPerFrame(min_packets_per_frame,
                                                clock_->TimeInMilliseconds());
      } else {
        loss_prot_logic_->UpdatePacketsPerFrameKey(
            min_packets_per_frame, clock_->TimeInMilliseconds());
      }

      if (enable_qm_) {
        // Update quality select with encoded length.
        qm_resolution_->UpdateEncodedSize(encoded_length);
      }
    }
    if (!delta_frame && encoded_length > 0) {
      loss_prot_logic_->UpdateKeyFrameSize(static_cast<float>(encoded_length));
    }

    // Updating counters.
    if (delta_frame) {
      delta_frame_cnt_++;
    } else {
      key_frame_cnt_++;
    }
  }

  return VCM_OK;
}
```

Explanation:
Encoded data always comes back through the callback:
```
int32_t VCMEncodedFrameCallback::Encoded
->int32_t MediaOptimization::UpdateWithEncodedData
->frame_dropper_->Fill(encoded_length, delta_frame);
```
Through this path, every encoded frame that is handed to the sender also updates frame_dropper_.

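Afterword:
The author does not fully understand the principle behind this algorithm and can only infer it from the code, so misreadings are likely. Better explanations and different views are very welcome.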