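Decoding flow overview
- Take a video packet from the pre-decode video queue
- Parse the NALUs to obtain the VPS, SPS, PPS, SEI and related information
- Use the extracted parameters to initialize the decoder's format description, a CMVideoFormatDescriptionRef
- Create the decoder
- Decode
Details
I. Take a video packet from the pre-decode video queue
What comes out of the queue is usually an AVPacket or a custom packet struct. For both H.264 and HEVC, one video packet holds exactly one video frame (an audio packet may hold several audio frames). The required fields are the frame data, the data length, the presentation timestamp (PTS), the decoding timestamp (DTS), and the duration.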
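As a rough illustration of the fields listed above, a packet could be modeled as follows. The struct, its field names and the helper are hypothetical and are not taken from FFmpeg or from the original code; timestamps are assumed to be in milliseconds, which matches a timescale of 1000.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packet type; the names are illustrative only.
struct VideoPacket {
    std::vector<uint8_t> data; // exactly one encoded video frame
    int64_t pts = 0;           // presentation timestamp, assumed milliseconds
    int64_t dts = 0;           // decoding timestamp, assumed milliseconds
    int64_t duration = 0;      // frame duration, assumed milliseconds

    size_t size() const { return data.size(); }
};

// A frame has to be decoded no later than it is shown, so dts <= pts.
inline bool hasSaneTimestamps(const VideoPacket &p) {
    return p.dts <= p.pts && p.duration >= 0;
}
```

A check like hasSaneTimestamps is optional; it is only one cheap way to catch broken packets before they reach the decoder.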
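II. Parse the NALUs to obtain the VPS, SPS, PPS, SEI and related information
The NALU type ID is extracted according to codecType, which is either H.264 or HEVC. In the code below, data is the first byte of the NALU header; masking out the type bits yields the naluId we need.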
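int naluId = -1;
if(codecType == VT_CODEC_TYPE_H264)
{
    // H.264: nal_unit_type is the low 5 bits of the 1-byte NALU header
    naluId = data & 0x1F;
}
else if(codecType == VT_CODEC_TYPE_H265)
{
    // HEVC: nal_unit_type is bits 1-6 of the first byte of the 2-byte NALU header
    naluId = (data & 0x7E) >> 1;
}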
The naluId maps to a naluType as follows:
int naluType = VT_NALU_TYPE_UNK;
if(codecType == VT_CODEC_TYPE_H264)
{
    switch (naluId)
    {
        case 7:
            naluType = VT_NALU_TYPE_SPS;
            break;
        case 8:
            naluType = VT_NALU_TYPE_PPS;
            break;
        case 5:
            naluType = VT_NALU_TYPE_IDR;
            break;
        case 1:
            naluType = VT_NALU_TYPE_NONIDR;
            break;
        default:
            break;
    }
}
else if(codecType == VT_CODEC_TYPE_H265)
{
    // 0-9: non-key slice types (TRAIL, TSA, STSA, RADL, RASL)
    if(naluId >= 0 && naluId <= 9)
        naluType = VT_NALU_TYPE_NONIDR;
    // 16-23: random access range (BLA, IDR, CRA), all treated as key frames here
    else if(naluId >= 16 && naluId <= 23)
        naluType = VT_NALU_TYPE_IDR;
    else if(naluId == 33)
        naluType = VT_NALU_TYPE_SPS;
    else if(naluId == 34)
        naluType = VT_NALU_TYPE_PPS;
    else if(naluId == 32)
        naluType = VT_NALU_TYPE_VPS;
}
For HEVC the parameter sets are parsed in the order VPS, SPS, PPS. Save them; they are needed to build the decoder's format description.
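The snippets above assume the NALU header byte is already at hand. As a minimal sketch of how the NALUs might be located first, the following splits a buffer on start codes. It assumes the stream is in Annex B form (00 00 01 or 00 00 00 01 before each NALU); packets that come from a container in length-prefixed form would need a different walk. NalUnit and splitAnnexB are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct NalUnit {
    size_t offset; // index of the first header byte, just past the start code
    size_t size;   // bytes up to the next start code or the end of the buffer
};

// Scans for 00 00 01; a 4-byte start code is the same pattern with one more leading zero.
std::vector<NalUnit> splitAnnexB(const uint8_t *buf, size_t len) {
    std::vector<NalUnit> out;
    size_t start = len; // len means "no NAL unit open yet"
    size_t i = 0;
    while (i + 3 <= len) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            if (start != len) {
                size_t end = i;
                // Zeros right before the pattern belong to the next start code,
                // because a NAL unit never ends in a zero byte.
                while (end > start && buf[end - 1] == 0) --end;
                out.push_back({start, end - start});
            }
            start = i + 3;
            i += 3;
        } else {
            ++i;
        }
    }
    if (start < len) out.push_back({start, len - start});
    return out;
}
```

The byte at each returned offset is the header byte that the naluId extraction above expects as data.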
III. Use the extracted parameters to initialize the decoder's format description, a CMVideoFormatDescriptionRef
CMVideoFormatDescriptionRef input_format = nullptr;
OSStatus status = -1;
if(codecType == VT_CODEC_TYPE_H264)
{
    // H.264: one SPS followed by every PPS
    int count = 1 + _ppsNums;
    uint8_t **parameterSetPointers = (uint8_t **)malloc(count * sizeof(uint8_t *));
    size_t *parameterSetSizes = (size_t *)malloc(count * sizeof(size_t));
    parameterSetPointers[0] = _sps;
    parameterSetSizes[0] = _spsSize;
    for(int i = 1; i < count; i++)
    {
        parameterSetSizes[i] = _ppsSize[i-1];
        parameterSetPointers[i] = _pps[i-1];
    }
    status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault,
                                                                 count,
                                                                 parameterSetPointers,
                                                                 parameterSetSizes,
                                                                 4, // NALUnitHeaderLength: each NALU fed later carries a 4-byte length prefix
                                                                 &input_format);
    free(parameterSetPointers);
    free(parameterSetSizes);
}
else if(codecType == VT_CODEC_TYPE_H265)
{
    // HEVC: VPS, then SPS, then every PPS
    int count = 2 + _ppsNums;
    uint8_t **parameterSetPointers = (uint8_t **)malloc(count * sizeof(uint8_t *));
    size_t *parameterSetSizes = (size_t *)malloc(count * sizeof(size_t));
    parameterSetPointers[0] = _vps;
    parameterSetPointers[1] = _sps;
    parameterSetSizes[0] = _vpsSize;
    parameterSetSizes[1] = _spsSize;
    for(int i = 2; i < count; i++)
    {
        parameterSetSizes[i] = _ppsSize[i-2];
        parameterSetPointers[i] = _pps[i-2];
    }
    status = CMVideoFormatDescriptionCreateFromHEVCParameterSets(kCFAllocatorDefault,
                                                                 count,
                                                                 parameterSetPointers,
                                                                 parameterSetSizes,
                                                                 4,    // NALUnitHeaderLength
                                                                 NULL, // no extensions
                                                                 &input_format);
    free(parameterSetPointers);
    free(parameterSetSizes);
}
IV. Create the decoder
Apple recommends decoding into kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange (NV12). Create the decoder session from the input_format built above and register the decode callback didDecompress.
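// Request NV12 output pixel buffers
uint32_t v = kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange;
CFNumberRef pixelFormat = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &v);
const void *keys[] = { kCVPixelBufferPixelFormatTypeKey };
const void *values[] = { pixelFormat };
// The CFType callbacks make the dictionary retain its key and value
CFDictionaryRef attrs = CFDictionaryCreate(kCFAllocatorDefault, keys, values, 1,
                                           &kCFTypeDictionaryKeyCallBacks,
                                           &kCFTypeDictionaryValueCallBacks);
CFRelease(pixelFormat); // attrs holds its own reference, so the number no longer leaks
VTDecompressionOutputCallbackRecord callBackRecord;
callBackRecord.decompressionOutputCallback = didDecompress;
callBackRecord.decompressionOutputRefCon = this; // handed back to didDecompress on every frame
OSStatus status = VTDecompressionSessionCreate(kCFAllocatorDefault,
                                               input_format,
                                               NULL,  // no specific decoder requested
                                               attrs, // destination image buffer attributes
                                               &callBackRecord,
                                               &_deocderSession);
CFRelease(attrs);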
V. Decode
Decode flags can be set dynamically, frame by frame, while decoding. Apple defines them as follows
// Decode flags
typedef CF_OPTIONS(uint32_t, VTDecodeFrameFlags) {
    kVTDecodeFrame_EnableAsynchronousDecompression = 1<<0,
    kVTDecodeFrame_DoNotOutputFrame = 1<<1,
    kVTDecodeFrame_1xRealTimePlayback = 1<<2,
    kVTDecodeFrame_EnableTemporalProcessing = 1<<3,
};
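The flags mean the following:
kVTDecodeFrame_EnableAsynchronousDecompression allows asynchronous decoding. By default the decoder is synchronous, that is, the callback fires before the decode call returns. With this flag set the decoder may work on several frames at once. The result is a shorter total decode time, but the callbacks for the first few frames may take longer to arrive.
kVTDecodeFrame_DoNotOutputFrame tells the decoder not to output a frame; the decode callback then receives NULL. There are cases where no output is wanted, for example after a decoder state error.
kVTDecodeFrame_1xRealTimePlayback tells the decoder that it may use a low-power mode. With it set, processor consumption drops and decoding gets slower. We normally leave it unset: hardware decoding runs on a dedicated processor and does not consume CPU, so faster is better.
kVTDecodeFrame_EnableTemporalProcessing tells the decoder that frame ordering has to be handled. With it set, the decode callback becomes slower, because frames must be reordered both for asynchronous decoding and whenever PTS and DTS differ, and that takes time. Apple's documentation advises against using this flag.
The decode flow is as follows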
void *sourceRef = NULL;
CMBlockBufferRef blockBuffer = NULL;
// pts, dts and duration are in milliseconds, hence the timescale of 1000
CMSampleTimingInfo timingInfo;
timingInfo.presentationTimeStamp = CMTimeMake(pts, 1000);
timingInfo.duration = CMTimeMake(duration, 1000);
timingInfo.decodeTimeStamp = CMTimeMake(dts, 1000);
// Wrap pBuffer without copying; kCFAllocatorNull means the block buffer never frees it,
// so pBuffer must stay valid until the frame has been decoded
OSStatus status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault,
                                                     (void*)pBuffer,
                                                     size,
                                                     kCFAllocatorNull,
                                                     NULL,
                                                     0,
                                                     size,
                                                     0,
                                                     &blockBuffer);
if(status == kCMBlockBufferNoErr)
{
    CMSampleBufferRef sampleBuffer = NULL;
    const size_t sampleSizeArray[] = {size};
    // _decoderFormatDescription is the format description created in step III
    status = CMSampleBufferCreateReady(kCFAllocatorDefault,
                                       blockBuffer,
                                       _decoderFormatDescription,
                                       1,
                                       1,
                                       &timingInfo,
                                       1,
                                       sampleSizeArray,
                                       &sampleBuffer);
    if (status == noErr && sampleBuffer)
    {
        VTDecodeFrameFlags flags = 0;
        VTDecodeInfoFlags flagOut = 0;
        if(m_useAsync)
            flags |= kVTDecodeFrame_EnableAsynchronousDecompression;
        // While refreshing or recovering, decode but suppress the output
        if(m_bNeedRefresh || m_bNeedRecovery)
        {
            flags |= kVTDecodeFrame_DoNotOutputFrame;
        }
        OSStatus decodeStatus = VTDecompressionSessionDecodeFrame(_deocderSession,
                                                                  sampleBuffer,
                                                                  flags,
                                                                  &sourceRef,
                                                                  &flagOut);
        if(decodeStatus == kVTInvalidSessionErr)
        {
            // The session is no longer usable and has to be recreated
            m_bNeedRefresh = 1;
        }
        else if(decodeStatus == kVTVideoDecoderBadDataErr)
        {
            NSLog(@"VT decoder failed kVTVideoDecoderBadDataErr \n");
        }
        CFRelease(sampleBuffer);
    }
    CFRelease(blockBuffer);
}
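Because the format description is created with a NAL unit header length of 4, every NALU handed to the decoder must start with a 4-byte big-endian length instead of an Annex B start code. The text does not show where pBuffer comes from, so whether it is already in that form is unknown. As a minimal sketch under that assumption, a single NALU payload could be wrapped like this; toLengthPrefixed is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Prepends a 4-byte big-endian length to one NAL unit payload (start code already removed).
std::vector<uint8_t> toLengthPrefixed(const uint8_t *nal, size_t nalSize) {
    std::vector<uint8_t> out;
    out.reserve(nalSize + 4);
    const uint32_t n = static_cast<uint32_t>(nalSize);
    out.push_back(static_cast<uint8_t>((n >> 24) & 0xFF));
    out.push_back(static_cast<uint8_t>((n >> 16) & 0xFF));
    out.push_back(static_cast<uint8_t>((n >> 8) & 0xFF));
    out.push_back(static_cast<uint8_t>(n & 0xFF));
    out.insert(out.end(), nal, nal + nalSize);
    return out;
}
```

When the input uses 4-byte start codes throughout, overwriting each start code in place with the length avoids the copy; the version above trades that for simplicity.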
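Key processing strategies
1. Decoder parameter update strategy
In live streaming, a viewer may join at any point in time, so the player has to be able to start decoding the stream from any point. This requires the publisher to put the VPS, SPS and PPS in front of every key frame. The client stores these parameter sets and compares against them every time new ones are parsed; if they have changed, it reinitializes the decoder.
For video on demand, the VPS, SPS and PPS usually appear only once, in the video header, and parsing them once is enough. This is not always true, though: the file may be a replay generated from a live stream, in which case the live strategy has to be applied, otherwise decoding fails or the picture is corrupted.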
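The compare-and-reinitialize step can be sketched as below. This is only an illustration: ParameterSets and ParameterSetTracker are hypothetical types, and a byte-for-byte comparison is assumed to be a sufficient test for "the parameters changed".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container for the most recently seen parameter sets.
struct ParameterSets {
    std::vector<uint8_t> vps;              // stays empty for H.264
    std::vector<uint8_t> sps;
    std::vector<std::vector<uint8_t>> pps; // a stream may carry several PPS

    bool operator==(const ParameterSets &o) const {
        return vps == o.vps && sps == o.sps && pps == o.pps;
    }
};

class ParameterSetTracker {
public:
    // Returns true when the decoder has to be created or recreated for `incoming`.
    bool update(const ParameterSets &incoming) {
        if (hasCurrent_ && current_ == incoming) return false; // unchanged, keep the session
        current_ = incoming;
        hasCurrent_ = true;
        return true;
    }

private:
    ParameterSets current_;
    bool hasCurrent_ = false;
};
```

Calling update in front of every key frame keeps the cost low, since the comparison only touches a few dozen bytes.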
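2. App background strategy
An iOS hardware decoding session does not survive the app going to the background: once the app is backgrounded, the decoder returns kVTInvalidSessionErr.
Note: this happens whether or not the app has background permissions. When it does, keep the data packets, recreate the decoder, and then decode again.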
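One possible way to "keep the data packets" is to cache everything since the most recent key frame, so that a freshly created session can restart from a decodable point. This is an assumption about how the retention might be done, not the original implementation; CachedPacket and GopCache are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct CachedPacket {
    std::vector<uint8_t> data;
    int64_t pts;
    bool isKeyFrame;
};

// Holds the packets of the current group of pictures, starting at its key frame.
class GopCache {
public:
    void push(CachedPacket p) {
        if (p.isKeyFrame) packets_.clear(); // older packets are no longer needed
        // Packets seen before the first key frame cannot be decoded, so skip them
        if (p.isKeyFrame || !packets_.empty()) packets_.push_back(std::move(p));
    }

    // Packets to feed again, in order, after the session has been recreated.
    const std::vector<CachedPacket> &packets() const { return packets_; }

private:
    std::vector<CachedPacket> packets_;
};
```

After kVTInvalidSessionErr, the cached packets could be fed to the new session with kVTDecodeFrame_DoNotOutputFrame set for frames that were already displayed, so that only new frames reach the renderer.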
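3. Frame reordering strategy
In live streaming we normally do not encode B-frames, because they increase latency. H.264 streams generally use the Baseline profile, which on iOS is kVTProfileLevel_H264_Baseline_AutoLevel. Every profile other than Baseline requires frames to be reordered after decoding. For HEVC, iOS exposes only two profiles, kVTProfileLevel_HEVC_Main_AutoLevel and kVTProfileLevel_HEVC_Main10_AutoLevel; both support B-frames, so frame reordering is required.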
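Reordering after decoding can be sketched with a small min-heap keyed on PTS: frames go in in decode order and come out in display order once enough of them are buffered. The depth is an assumption the caller has to choose to match the stream's B-frame structure; DecodedFrame and ReorderBuffer are hypothetical names, and the image handle is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

struct DecodedFrame {
    int64_t pts; // the decoded image handle is omitted in this sketch
};

class ReorderBuffer {
public:
    explicit ReorderBuffer(size_t depth) : depth_(depth) {}

    // Adds a frame in decode order. Returns true and fills `out` with the
    // frame that has the smallest PTS once more than `depth` frames are held.
    bool push(const DecodedFrame &f, DecodedFrame &out) {
        heap_.push(f);
        if (heap_.size() <= depth_) return false;
        out = heap_.top();
        heap_.pop();
        return true;
    }

    // Drains the remaining frames in display order at end of stream.
    bool flush(DecodedFrame &out) {
        if (heap_.empty()) return false;
        out = heap_.top();
        heap_.pop();
        return true;
    }

private:
    struct LaterPts {
        bool operator()(const DecodedFrame &a, const DecodedFrame &b) const {
            return a.pts > b.pts; // makes the priority queue a min-heap on pts
        }
    };
    size_t depth_;
    std::priority_queue<DecodedFrame, std::vector<DecodedFrame>, LaterPts> heap_;
};
```

A larger depth tolerates longer B-frame runs at the cost of extra latency, which is the same trade-off that keeps B-frames out of live streams.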
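4. Decode pause strategy
The decoder has no pause function; a frame that goes in is either decoded or dropped. The decoder management module therefore needs a mechanism that stops feeding data to the decoder. In our tests, no longer feeding the decoder once the decoded video queue holds more than 3 frames keeps memory usage low while playback stays smooth.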
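The feed gate described above can be sketched as a counter of decoded frames that have not been rendered yet. DecodeGate is a hypothetical name, the limit of 3 comes from the test result quoted above, and the class is not thread-safe as written; with asynchronous decoding the callback may run on another thread, so a real version would need a lock or atomics.

```cpp
#include <cassert>
#include <cstddef>

// Tracks how many decoded frames are waiting to be rendered.
class DecodeGate {
public:
    explicit DecodeGate(size_t maxDecoded) : maxDecoded_(maxDecoded) {}

    void onFrameDecoded() { ++decoded_; }  // call from the decode callback
    void onFrameRendered() {               // call when the renderer takes a frame
        if (decoded_ > 0) --decoded_;
    }

    // Feeding stops once the decoded queue holds more than maxDecoded frames.
    bool canFeed() const { return decoded_ <= maxDecoded_; }

private:
    size_t maxDecoded_;
    size_t decoded_ = 0;
};
```

The feeding loop would check canFeed() before taking the next packet from the pre-decode queue and simply wait when it returns false.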