iOS Audio & Video (4): AAC Audio Encoding and Decoding

Related articles:
iOS Audio & Video (1): Fundamentals
iOS Audio & Video (2): Video Encoding - H.264 Concepts and Principles
iOS Audio & Video (3): Video Encoding - Implementing H.264 Encoding and Decoding
iOS Audio & Video (4): AAC Audio Encoding and Decoding

What this article covers:
1. Audio fundamentals
2. Principles of audio encoding
3. Audio compression formats
4. Implementing AAC audio encoding
5. Implementing AAC audio decoding

I. Audio Fundamentals

1. Sound

What is sound?
Sound is a wave, produced by the vibration of objects.

The three elements of a sound wave are frequency, amplitude, and waveform (frequency determines pitch; amplitude determines loudness; waveform determines timbre).

  • The higher the frequency, the shorter the wavelength; low-frequency sounds have longer wavelengths.
    Longer wavelengths bend around obstacles more easily and lose less energy, so such sounds travel farther.

  • Loudness reflects the energy of the wave. Strike a desk with different amounts of force and the volume changes accordingly.
    In everyday life we describe loudness in decibels.

  • Timbre: at the same frequency and loudness, different objects still produce different sounds.
    A piano and a guzheng, for example, sound completely different. The shape of the waveform determines the timbre: different sound sources produce different waveforms, and therefore different timbres.

Sound propagation: sound can travel through air, liquids, and solids. The medium affects the speed at which it propagates.

  • Sound absorption: reflections create a sense of noise; sound-absorbing materials attenuate the reflected energy of incident sound, preserving the fidelity of the original sound. Recording-studio walls, for example, are covered with absorbing material.
  • Sound insulation: insulation reduces the noise that penetrates into the main space; insulating material attenuates the transmitted energy of incident sound, keeping the space quiet. KTV walls, for example, are fitted with sound-insulating material.
2. Digitizing an Analog Signal: PCM

The data obtained by converting an analog signal into a digital signal is PCM (Pulse Code Modulation) data.

Converting an analog signal into a digital one takes three steps: sampling, quantization, and encoding.

  • Sampling

Sampling digitizes the analog signal along the time axis.

According to the Nyquist sampling theorem, the signal must be sampled at more than twice its highest frequency; this process is called A/D conversion.

Humans hear frequencies from 20 Hz to 20 kHz, so the sampling rate is usually 44.1 kHz. This guarantees that sounds up to 20 kHz can still be digitized, and the digitized audio loses no audible quality. (44.1 kHz means 44,100 samples per second.)

  • Quantization

Quantization digitizes the signal along the amplitude axis.

It defines how many binary bits represent each sample of the waveform, measured in bits.
For example, with 16-bit samples, each sample takes a value in [-32768, 32767], 65,536 possible values in total.
Common bit depths are 16-bit and 24-bit; a 16-bit quantization level records each sample as a 16-bit binary number.

Bit depth is therefore also a key indicator of digital audio quality. Digital audio quality is usually described as, say, 24-bit (bit depth) / 48 kHz (sample rate); standard CD audio is 16-bit / 44.1 kHz.

  • Encoding

Encoding records the sampled and quantized data in a defined format.

Raw audio data is PCM (Pulse Code Modulation) data.

Describing a piece of PCM data requires three parameters:
1. Sample format (sampleFormat)
2. Sample rate (sampleRate)
3. Channel count (channel)

Example: CD quality.
The sample format is 16-bit, the sample rate 44,100 Hz, and the channel count 2.
These parameters describe CD quality. So what is the bit rate of CD-quality data?
44100 × 16 × 2 = 1,411,200 bit/s ≈ 1378.125 Kbit/s (dividing by 1024)

And how much storage does one minute of such CD-quality data occupy?
1378.125 × 60 / 8 / 1024 ≈ 10.09 MB

The more precise the sampleFormat and the denser the sampleRate, the more storage the data occupies, and the more accurately it captures the details of the sound.

Storing these binary values is what converting the analog signal into a digital one means.
Once converted to digital form, the data can be stored, played back, copied, or processed in any other way.

II. Principles of Audio Encoding

Why encode audio at all?
From the example above, one minute of audio needs roughly 10.1 MB of storage. Whether for storage or for real-time network transmission, that is far too much data.

What redundancy should audio encoding remove?
Compression works by removing redundant parts of the signal. Redundant signals are those the human ear cannot perceive, including audio outside the range of human hearing and audio masked by other sounds.

Components of a digital audio signal whose effect on human perception is negligible are called redundancy:
1. Frequency-domain redundancy
2. Time-domain redundancy
3. Auditory redundancy: audio outside the audible 20 Hz to 20 kHz frequency range
(simply drop what humans cannot hear)

The ear's masking effect appears mainly as frequency-domain masking and time-domain masking.

The masking effect means the human ear is sensitive only to the most prominent sounds and much less sensitive to the rest. For example, if one band in the frequency spectrum is strong, the ear becomes insensitive to sounds in the other bands.

1. Frequency-Domain Masking

Suppose the masker is a pure tone of a single frequency. As its volume increases, so does the range of frequencies it masks: a 1 kHz signal at an intensity of about 70 dB masks the neighboring groups of signals. For any audio signal, then, its intensity determines the range it can mask; the stronger the signal, the wider the masked range.

When a signal rises above the masking threshold, as in the example in the figure above where a 0.5 kHz signal reaches 48 dB, it becomes audible again. In general, the closer a weak pure tone's frequency is to a strong pure tone's, the more easily it is masked. Because the brain masks hearing in this way, and most noise is not a single frequency, whole ranges of frequencies end up masked as well.

2. Time-Domain Masking

Time-domain masking is masking between sounds that are adjacent in time. It is divided into pre-masking, simultaneous masking, and post-masking.

Time-domain masking arises mainly because the human brain needs a certain amount of time to process information. Pre-masking generally lasts only about 50 ms, while post-masking can persist for up to 200 ms.

If a loud sound is followed within 200 ms by a weak one, the weak sound is hard to hear; conversely, if a weak sound is followed within 50 ms by a strong one, the weak sound is also hard to hear. The level difference between the strong and weak sounds also affects the degree of masking.

III. Audio Compression Formats

1. WAV

One implementation of WAV (it has many implementations, but none of them compress) prepends 44 bytes to the raw PCM data, describing the PCM's sample rate, channel count, data format, and other information.

  • Strengths: excellent sound quality; playable by a huge range of software.
  • Best for: intermediate files in multimedia development; storing music and sound-effect assets.
2. MP3

MP3 offers a good compression ratio, and its sound is close to the WAV original; in different situations the parameters should be tuned to achieve better results.

  • Strengths: good quality at 128 Kbit/s and above, with a fairly high compression ratio; wide software and hardware support, so compatibility is high.
  • Best for: music listening at higher bit rates where compatibility matters.
3. AAC

AAC is currently a popular lossy codec, with three main derived profiles: LC-AAC, HE-AAC, and HE-AAC v2.

  • LC-AAC is the traditional AAC profile, used mainly at medium-to-high bit rates (>= 80 Kbit/s);
  • HE-AAC is used mainly at low bit rates (<= 48 Kbit/s)
  • Strengths: excellent below 128 Kbit/s; widely used for the audio track in video.
  • Best for: audio encoding below 128 Kbit/s, mostly audio tracks in video.
4. Ogg

Ogg is a very promising codec that performs well at every bit rate, especially low ones. Besides good quality, its encoding algorithm is excellent, achieving better quality at a smaller bit rate: Ogg at 128 Kbit/s sounds better than MP3 at 192 Kbit/s or even higher. Due to limited software and hardware support, however, it is nowhere near as widely used as MP3.

  • Strengths: better quality than MP3 at a smaller bit rate; good performance at high, medium, and low bit rates. Weaknesses: poor compatibility; no streaming support.
  • Best for: voice-message audio in chat applications.

AAC container types:

  1. ADIF (Audio Data Interchange Format): generally used when audio is written to a file on disk. It does not support random access; decoding cannot start in the middle of the file.

  2. ADTS (Audio Data Transport Stream): a format in which each AAC frame begins with a sync word, so a client can start decoding at any point in the stream. This makes it well suited to network transmission.

Note: the ADTS header may be 7 or 9 bytes long: 9 bytes when protection_absent=0, 7 bytes when protection_absent=1.

Converting PCM to an AAC stream:

  1. Configure the encoder (codec).
  2. Capture audio; AVFoundation has already converted the analog signal into a digital one, producing PCM data.
  3. Collect the PCM data and pass it to the encoder.
  4. When a frame is encoded, a callback fires; write the data to a file or send it over the network.

IV. Implementing AAC Audio Encoding

Encoder configuration:

#import <Foundation/Foundation.h>
@interface AudioAccConfig : NSObject
// Bit rate (e.g. 96000)
@property (nonatomic, assign) NSInteger bitrate;
// Channel count (1 or 2)
@property (nonatomic, assign) NSInteger channelCount;
// Sample rate (e.g. 44100)
@property (nonatomic, assign) NSInteger sampleRate;
// Bits per sample (e.g. 16)
@property (nonatomic, assign) NSInteger sampleSize;

+ (instancetype)defaultConfig;
@end
#import "AudioAccConfig.h"
@implementation AudioAccConfig

+ (instancetype)defaultConfig {
    return [[AudioAccConfig alloc] init];
}

- (instancetype)init {
    self = [super init];
    if (self) {
        self.bitrate = 96000;
        self.channelCount = 1;
        self.sampleRate = 44100;
        self.sampleSize = 16;
    }
    return self;
}
@end

AAC encoder implementation:

#import <Foundation/Foundation.h>
#import <AVFoundation/AVFoundation.h>
@class AudioAccConfig;

@interface AACEncoder: NSObject

// Initializers
- (instancetype)init;
- (instancetype)initWithConfig: (AudioAccConfig *)config;

// Feed audio data in continuously; encoded AAC data is returned through the block
- (void)encodeSampleBuffer:(CMSampleBufferRef)sampleBuffer completionBlock:(void (^)(NSData *encodedData, NSError* error))completionBlock;
@end
#import "AACEncoder.h"
#import <AudioToolbox/AudioToolbox.h>
#import "AudioAccConfig.h"

@interface AACEncoder()
@property (nonatomic) dispatch_queue_t encoderQueue;
@property (nonatomic) dispatch_queue_t callbackQueue;

@property (nonatomic, strong) AudioAccConfig *config;

// Audio converter object
@property (nonatomic, unsafe_unretained) AudioConverterRef audioConverter;
@property (nonatomic) uint8_t *aacBuffer;
@property (nonatomic) NSUInteger aacBufferSize;
@property (nonatomic) char *pcmBuffer;
@property (nonatomic) size_t pcmBufferSize;
@end

@implementation AACEncoder

- (void)dealloc {
    AudioConverterDispose(_audioConverter);
    free(_aacBuffer);
}

- (instancetype)init {
    return [self initWithConfig:nil];
}

- (instancetype)initWithConfig:(AudioAccConfig *)config {
    self = [super init];
    if (self) {
        // Encoder configuration
        _config = config;
        if (config == nil) {
            _config = AudioAccConfig.defaultConfig;
        }
        // Create the queues
        _encoderQueue = dispatch_queue_create("AAC Encoder Queue", DISPATCH_QUEUE_SERIAL);
        _callbackQueue = dispatch_queue_create("AAC Encoder Callback Queue", DISPATCH_QUEUE_SERIAL);
        // Converter object / PCM buffer / AAC buffer
        _audioConverter = NULL;
        _pcmBufferSize = 0;
        _pcmBuffer = NULL;
        _aacBufferSize = 1024;
        _aacBuffer = malloc(_aacBufferSize * sizeof(uint8_t));
        memset(_aacBuffer, 0, _aacBufferSize);
    }
    return self;
}

- (void)encodeSampleBuffer:(CMSampleBufferRef)sampleBuffer completionBlock:(void (^)(NSData * encodedData, NSError* error))completionBlock {
    CFRetain(sampleBuffer);
    dispatch_async(_encoderQueue, ^{
        if (!self->_audioConverter) {
            [self setupEncoderFromSampleBuffer:sampleBuffer];
        }
        // Get the CMBlockBuffer
        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
        CFRetain(blockBuffer);
        // Get the address and length of the raw PCM data
        OSStatus status = CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &self->_pcmBufferSize, &self->_pcmBuffer);
        NSError *error = nil;
        if (status != kCMBlockBufferNoErr) {
            error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        }
        //NSLog(@"PCM Buffer Size: %zu", _pcmBufferSize);
        // Point the output AudioBufferList at _aacBuffer
        memset(self->_aacBuffer, 0, self->_aacBufferSize);
        AudioBufferList outAudioBufferList = {0};
        outAudioBufferList.mNumberBuffers = 1;
        outAudioBufferList.mBuffers[0].mNumberChannels = 1;
        outAudioBufferList.mBuffers[0].mDataByteSize = (UInt32)self->_aacBufferSize;
        outAudioBufferList.mBuffers[0].mData = self->_aacBuffer;
        
        AudioStreamPacketDescription *outPacketDescription = NULL;
        UInt32 ioOutputDataPacketSize = 1;
        /**
         AudioConverterFillComplexBuffer keeps feeding data through the converter:
            _audioConverter: the audio encoder (converter)
            inInputDataProc: input callback, called repeatedly whenever the converter needs more PCM
            param 3: self, passed through to the callback
            ioOutputDataPacketSize: in: output capacity in packets; out: packets actually produced
            outAudioBufferList: receives the encoded audio data
            outPacketDescription: receives the output AAC packet descriptions
         */
        status = AudioConverterFillComplexBuffer(self->_audioConverter,
                                                 inInputDataProc,
                                                 (__bridge void *)(self),
                                                 &ioOutputDataPacketSize,
                                                 &outAudioBufferList,
                                                 outPacketDescription);
        //NSLog(@"ioOutputDataPacketSize: %d", (unsigned int)ioOutputDataPacketSize);
        NSData *data = nil;
        if (status == noErr) {
            // Copy out the encoded AAC data
            NSData *rawAAC = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
            /**
             If only the raw AAC stream is needed (for example, to feed straight into a decoder), no ADTS header is required;
             to write the data to a file, an ADTS header must be prepended first.
             */
            // Prepend the ADTS header
            NSData *adtsHeader = [self adtsDataForPacketLength:rawAAC.length];
            NSMutableData *fullData = [NSMutableData dataWithData:adtsHeader];
            [fullData appendData:rawAAC];
            data = fullData;
        } else {
            error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        }
        if (completionBlock) {
            dispatch_async(self->_callbackQueue, ^{
                completionBlock(data, error);
            });
        }
        CFRelease(sampleBuffer);
        CFRelease(blockBuffer);
    });
}


- (void)setupEncoderFromSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    // 獲取輸入?yún)?shù)
    AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));
    // 設(shè)置編碼輸出參數(shù)
    AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Always initialize the fields of a new audio stream basic description structure to zero, as shown here: ...
    outAudioStreamBasicDescription.mSampleRate = (Float64)_config.sampleRate;
    //outAudioStreamBasicDescription.mSampleRate = inAudioStreamBasicDescription.mSampleRate; // The number of frames per second of the data in the stream, when the stream is played at normal speed. For compressed formats, this field indicates the number of frames per second of equivalent decompressed data. The mSampleRate field must be nonzero, except when this structure is used in a listing of supported formats (see “kAudioStreamAnyRate”).
    outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC; // kAudioFormatMPEG4AAC_HE does not work. Can't find `AudioClassDescription`. `mFormatFlags` is set to 0.
    outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_LC; // Format-specific flags to specify details of the format. Set to 0 to indicate no format flags. See “Audio Data Format Identifiers” for the flags that apply to each format.
    outAudioStreamBasicDescription.mBytesPerPacket = 0; // The number of bytes in a packet of audio data. To indicate variable packet size, set this field to 0. For a format that uses variable packet size, specify the size of each packet using an AudioStreamPacketDescription structure.
    outAudioStreamBasicDescription.mFramesPerPacket = 1024; // The number of frames in a packet of audio data. For uncompressed audio, the value is 1. For variable bit-rate formats, the value is a larger fixed number, such as 1024 for AAC. For formats with a variable number of frames per packet, such as Ogg Vorbis, set this field to 0.
    outAudioStreamBasicDescription.mBytesPerFrame = 0; // The number of bytes from the start of one frame to the start of the next frame in an audio buffer. Set this field to 0 for compressed formats. ...
    outAudioStreamBasicDescription.mChannelsPerFrame = (uint32_t)_config.channelCount;
    //outAudioStreamBasicDescription.mChannelsPerFrame = 1; // The number of channels in each frame of audio data. This value must be nonzero.
    outAudioStreamBasicDescription.mBitsPerChannel = 0; // ... Set this field to 0 for compressed formats.
    outAudioStreamBasicDescription.mReserved = 0; // Pads the structure out to force an even 8-byte alignment. Must be set to 0.
    AudioClassDescription *description = [self
                                          getAudioClassDescriptionWithType:outAudioStreamBasicDescription.mFormatID
                                          fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
    /**
     創(chuàng)建編碼器:
        參數(shù)1:輸入音頻格式描述
        參數(shù)2:輸出音頻格式描述
        參數(shù)3: class dessc的數(shù)量
        參數(shù)4:class desc
        參數(shù)5:創(chuàng)建的編碼器
     */
    OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription,
                                                &outAudioStreamBasicDescription,
                                                1,
                                                description,
                                                &_audioConverter);
    if (status != noErr) {
        NSLog(@"setup converter: %d", (int)status);
    }
    
    // 設(shè)置編碼解碼質(zhì)量
    /**
     kAudioConverterQuality_Max                              = 0x7F,
     kAudioConverterQuality_High                             = 0x60,
     kAudioConverterQuality_Medium                           = 0x40,
     kAudioConverterQuality_Low                              = 0x20,
     kAudioConverterQuality_Min                              = 0
     */
    UInt32 temp = kAudioConverterQuality_High;
    //編解碼器呈現(xiàn)質(zhì)量
    status = AudioConverterSetProperty(_audioConverter, kAudioConverterCodecQuality, sizeof(temp), &temp);
    if (status != noErr) {
        NSLog(@"ConverterSetProperty error: %d", (int)status);
    }
    //設(shè)置比特率
    uint32_t audioBitrate = (uint32_t)_config.bitrate;
    status = AudioConverterSetProperty(_audioConverter, kAudioConverterEncodeBitRate, sizeof(audioBitrate), &audioBitrate);
    if (status != noErr) {
        NSLog(@"ConverterSetProperty error: %d", (int)status);
    }
}

// Find the AudioClassDescription for the requested encoder
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                           fromManufacturer:(UInt32)manufacturer
{
    static AudioClassDescription desc;

    UInt32 encoderSpecifier = type;
    OSStatus st;
    UInt32 size;
    /**
     param 1: property ID (the available encoders)
     param 2: size of the specifier
     param 3: the specifier (encoder type)
     param 4: out: size of the property data
     */
    st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size);
    if (st) {
        NSLog(@"error getting audio format property info: %d", (int)(st));
        return nil;
    }
    // Number of matching AAC encoders
    unsigned int count = size / sizeof(AudioClassDescription);
    // Array holding `count` encoder descriptions
    AudioClassDescription descriptions[count];
    // Fill the array with the descriptions of all AAC-capable encoders
    st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                sizeof(encoderSpecifier),
                                &encoderSpecifier,
                                &size,
                                descriptions);
    if (st) {
        NSLog(@"error getting audio format property: %d", (int)(st));
        return nil;
    }
    
    for (unsigned int i = 0; i < count; i++) {
        if ((type == descriptions[i].mSubType) &&
            (manufacturer == descriptions[i].mManufacturer)) {
            memcpy(&desc, &(descriptions[i]), sizeof(desc));
            return &desc;
        }
    }

    return nil;
}

// Input callback: keeps filling the AudioBufferList with the buffered PCM data
static OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
    AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
    UInt32 requestedPackets = *ioNumberDataPackets;
    //NSLog(@"Number of packets requested: %d", (unsigned int)requestedPackets);
    size_t copiedSamples = [encoder copyPCMSamplesIntoBuffer:ioData];
    if (copiedSamples < requestedPackets) {
        //NSLog(@"PCM buffer isn't full enough!");
        *ioNumberDataPackets = 0;
        return -1;
    }
    *ioNumberDataPackets = 1;
    //NSLog(@"Copied %zu samples into ioData", copiedSamples);
    return noErr;
}
- (size_t)copyPCMSamplesIntoBuffer:(AudioBufferList*)ioData {
    size_t originalBufferSize = _pcmBufferSize;
    if (!originalBufferSize) {
        return 0;
    }
    ioData->mBuffers[0].mData = _pcmBuffer;
    ioData->mBuffers[0].mDataByteSize = (UInt32)_pcmBufferSize;
    ioData->mBuffers[0].mNumberChannels = (uint32_t)_config.channelCount;
    _pcmBuffer = NULL;
    _pcmBufferSize = 0;
    return originalBufferSize;
}


/**
 *  Add an ADTS header at the beginning of each and every AAC packet.
 *  This is needed because the encoder emits packets of raw AAC data
 *  with no framing.
 *
 *  Note the packetLen must count in the ADTS header itself.
 *  See: http://wiki.multimedia.cx/index.php?title=ADTS
 *  Also: http://wiki.multimedia.cx/index.php?title=MPEG-4_Audio#Channel_Configurations
 **/
- (NSData *)adtsDataForPacketLength:(NSUInteger)packetLength {
    int adtsLength = 7;
    char *packet = malloc(sizeof(char) * adtsLength);
    // ADTS header fields
    int profile = 2;  // AAC LC
    int freqIdx = 4;  // 44.1 kHz
    int chanCfg = 1;  // MPEG-4 channel configuration: 1 = single channel, front-center
    NSUInteger fullLength = adtsLength + packetLength;
    // fill in ADTS data
    packet[0] = (char)0xFF; // 11111111     = syncword
    packet[1] = (char)0xF9; // 1111 1 00 1  = syncword MPEG-2 Layer CRC
    packet[2] = (char)(((profile-1)<<6) + (freqIdx<<2) +(chanCfg>>2));
    packet[3] = (char)(((chanCfg&3)<<6) + (fullLength>>11));
    packet[4] = (char)((fullLength&0x7FF) >> 3);
    packet[5] = (char)(((fullLength&7)<<5) + 0x1F);
    packet[6] = (char)0xFC;
    NSData *data = [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
    return data;
}


@end

V. Implementing AAC Audio Decoding

#import <Foundation/Foundation.h>
#import <AVFoundation/AVFoundation.h>
@class AudioAccConfig;

@interface AACDecoder : NSObject

// Initializers
- (instancetype)init;
- (instancetype)initWithConfig:(AudioAccConfig *)config;

// Feed AAC data in continuously; decoded raw audio is returned through the block
- (void)decodeAACData:(NSData *)data completionBlock:(void (^)(NSData *decodedData, NSError* error))completionBlock;

@end

#import "AACDecoder.h"
#import <AVFoundation/AVFoundation.h>
#import <AudioToolbox/AudioToolbox.h>
#import "AudioAccConfig.h"

typedef struct {
    char *data;
    UInt32 size;
    UInt32 channelCount;
    AudioStreamPacketDescription packetDesc;
}AudioUserData;

@interface AACDecoder()
@property (nonatomic, strong) NSCondition *converterCond;
@property (nonatomic) dispatch_queue_t decoderQueue;
@property (nonatomic) dispatch_queue_t callbackQueue;

@property (nonatomic, strong) AudioAccConfig *config;

// Audio converter object
@property (nonatomic, unsafe_unretained) AudioConverterRef audioConverter;
@property (nonatomic) char *aacBuffer;
@property (nonatomic) UInt32 aacBufferSize;
@property (nonatomic) AudioStreamPacketDescription *packetDesc;

@end

@implementation AACDecoder

- (instancetype)init {
    return [self initWithConfig:nil];
}

- (instancetype)initWithConfig:(AudioAccConfig *)config {
    self = [super init];
    if (self) {
        // Decoder configuration
        _config = config;
        if (config == nil) {
            _config = AudioAccConfig.defaultConfig;
        }
        // Create the queues
        _decoderQueue = dispatch_queue_create("AAC Decoder Queue", DISPATCH_QUEUE_SERIAL);
        _callbackQueue = dispatch_queue_create("AAC Decoder Callback Queue", DISPATCH_QUEUE_SERIAL);
        // Converter object / AAC buffer
        _audioConverter = NULL;
        _aacBufferSize = 0;
        _aacBuffer = NULL;
        // Heap-allocate the packet description; taking the address of a stack
        // variable here would leave a dangling pointer
        _packetDesc = calloc(1, sizeof(AudioStreamPacketDescription));
        [self setupDecoder];
    }
    return self;
}

- (void)decodeAACData:(NSData *)data completionBlock:(void (^)(NSData *, NSError *))completionBlock {
    if (!_audioConverter) {return;}
    dispatch_async(_decoderQueue, ^{
        // Record the AAC input; passed into the decode callback as user data
        AudioUserData userData = {0};
        userData.channelCount = (UInt32)self->_config.channelCount;
        userData.data = (char *)[data bytes];
        userData.size = (UInt32)data.length;
        userData.packetDesc.mDataByteSize = (UInt32)data.length;
        userData.packetDesc.mStartOffset = 0;
        userData.packetDesc.mVariableFramesInPacket = 0;
        
        // Output size and packet count
        UInt32 pcmBufferSize = (UInt32)(2048 * self->_config.channelCount);
        UInt32 pcmDataPacketSize = 1024;
        // Temporary PCM output buffer
        uint8_t *pcmBuffer = malloc(pcmBufferSize);
        memset(pcmBuffer, 0, pcmBufferSize);
        // Output buffer list
        AudioBufferList outAudioBufferList = {0};
        outAudioBufferList.mNumberBuffers = 1;
        outAudioBufferList.mBuffers[0].mNumberChannels = (uint32_t)self->_config.channelCount;
        outAudioBufferList.mBuffers[0].mDataByteSize = (UInt32)pcmBufferSize;
        outAudioBufferList.mBuffers[0].mData = pcmBuffer;
        
        // Output packet description
        AudioStreamPacketDescription outputPacketDesc = {0};
        // Run the fill function to obtain the decoded output
        NSError *error = nil;
        OSStatus status = AudioConverterFillComplexBuffer(self->_audioConverter, &AudioDecoderConverterComplexInputDataProc, &userData, &pcmDataPacketSize, &outAudioBufferList, &outputPacketDesc);
        if (status != noErr) {
            error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
            if (completionBlock) {
                dispatch_async(self->_callbackQueue, ^{
                    completionBlock(nil, error);
                });
            }
            free(pcmBuffer); // don't leak the buffer on failure
            return;
        }
        // If decoded data came back, hand it to the caller
        if (outAudioBufferList.mBuffers[0].mDataByteSize > 0) {
            NSData *rawData = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
            dispatch_async(self->_callbackQueue, ^{
                completionBlock(rawData, error);
            });
        }
        free(pcmBuffer);
    });
}

- (void)setupDecoder {
    // Output format: PCM
    AudioStreamBasicDescription outputAudioDes = {0};
    outputAudioDes.mSampleRate = (Float64)_config.sampleRate; // sample rate
    outputAudioDes.mChannelsPerFrame = (UInt32)_config.channelCount; // output channel count
    outputAudioDes.mFormatID = kAudioFormatLinearPCM; // output format
    outputAudioDes.mFormatFlags = (kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked); // signed integer samples, packed
    outputAudioDes.mFramesPerPacket = 1; // frames per packet
    outputAudioDes.mBitsPerChannel = 16; // bits per sample per channel
    outputAudioDes.mBytesPerFrame = outputAudioDes.mBitsPerChannel / 8 * outputAudioDes.mChannelsPerFrame; // bytes per frame (bits per sample / 8 * channels)
    outputAudioDes.mBytesPerPacket = outputAudioDes.mBytesPerFrame * outputAudioDes.mFramesPerPacket; // bytes per packet (frame size * frames per packet)
    outputAudioDes.mReserved = 0; // pads the struct to 8-byte alignment; must be 0
    
    // Input format: AAC
    AudioStreamBasicDescription inputAudioDes = {0};
    inputAudioDes.mSampleRate = (Float64)_config.sampleRate;
    inputAudioDes.mFormatID = kAudioFormatMPEG4AAC;
    inputAudioDes.mFormatFlags = kMPEG4Object_AAC_LC;
    inputAudioDes.mFramesPerPacket = 1024;
    inputAudioDes.mChannelsPerFrame = (UInt32)_config.channelCount;
    
    // Let Core Audio fill in the remaining fields of the input format
    UInt32 inDesSize = sizeof(inputAudioDes);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &inDesSize, &inputAudioDes);
    
    // Get the decoder's class description (only the software codec can be used here);
    // the lookup must use the input (AAC) format, since that is what we decode
    AudioClassDescription *audioClassDesc = [self getAudioClassDescriptionWithType:inputAudioDes.mFormatID
                                                                  fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
    
    /** Create the decoder
     param 1: input audio format description
     param 2: output audio format description
     param 3: number of class descriptions
     param 4: class descriptions
     param 5: the created decoder reference
     */
    OSStatus status = AudioConverterNewSpecific(&inputAudioDes, &outputAudioDes, 1, audioClassDesc, &_audioConverter);
    if (status != noErr) {
        NSLog(@"error: failed to create the AAC decoder, status=%d", (int)status);
        return;
    }
}

// Find the AudioClassDescription for the requested decoder
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                           fromManufacturer:(UInt32)manufacturer
{
    static AudioClassDescription desc;

    UInt32 decoderSpecifier = type;
    OSStatus st;
    UInt32 size;
    /**
     param 1: property ID (the available decoders)
     param 2: size of the specifier
     param 3: the specifier (format to decode)
     param 4: out: size of the property data
     */
    st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Decoders,
                                    sizeof(decoderSpecifier),
                                    &decoderSpecifier,
                                    &size);
    if (st) {
        NSLog(@"error getting audio format property info: %d", (int)(st));
        return nil;
    }
    // Number of matching AAC decoders
    unsigned int count = size / sizeof(AudioClassDescription);
    // Array holding `count` decoder descriptions
    AudioClassDescription descriptions[count];
    // Fill the array with the descriptions of all AAC-capable decoders
    st = AudioFormatGetProperty(kAudioFormatProperty_Decoders,
                                sizeof(decoderSpecifier),
                                &decoderSpecifier,
                                &size,
                                descriptions);
    if (st) {
        NSLog(@"error getting audio format property: %d", (int)(st));
        return nil;
    }
    
    for (unsigned int i = 0; i < count; i++) {
        if ((type == descriptions[i].mSubType) &&
            (manufacturer == descriptions[i].mManufacturer)) {
            memcpy(&desc, &(descriptions[i]), sizeof(desc));
            return &desc;
        }
    }

    return nil;
}

// Decode input callback: hands the buffered AAC packet to the converter
static OSStatus AudioDecoderConverterComplexInputDataProc(AudioConverterRef inAudioConverter,
                                                          UInt32 *ioNumberDataPackets,
                                                          AudioBufferList *ioData,
                                                          AudioStreamPacketDescription **outDataPacketDescription,
                                                          void *inUserData) {
    AudioUserData *audioDecoder = (AudioUserData *)inUserData;
    if (audioDecoder->size <= 0) {
        *ioNumberDataPackets = 0;   // no data left: tell the converter we're done
        return -1;
    }
    
    // Fill in the packet description and the data itself
    *outDataPacketDescription = &audioDecoder->packetDesc;
    (*outDataPacketDescription)[0].mStartOffset = 0;
    (*outDataPacketDescription)[0].mDataByteSize = audioDecoder->size;
    (*outDataPacketDescription)[0].mVariableFramesInPacket = 0;
    
    ioData->mBuffers[0].mData = audioDecoder->data;
    ioData->mBuffers[0].mDataByteSize = audioDecoder->size;
    ioData->mBuffers[0].mNumberChannels = audioDecoder->channelCount;
    *ioNumberDataPackets = 1;   // we supplied exactly one AAC packet
    
    return noErr;
}

@end
