Encoding AAC with AudioToolbox

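Preface

Hardware-encoding H.264 with VideoToolbox
Hardware-decoding H.264 with VideoToolbox
This time, while encoding the H.264 video stream, we also record audio and encode it into an AAC stream.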

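Introduction

Sounds in nature are complex and so are their waveforms. To digitize them we normally use pulse-code modulation, i.e. PCM encoding. PCM converts a continuously varying analog signal into a digital code in three steps: sampling, quantization, and encoding.

  • Sampling: scan the analog signal periodically, turning a signal that is continuous in time into one that is discrete in time;
  • Quantization: represent each instantaneous sample by the nearest of a fixed set of levels, usually expressed in binary;
  • Encoding: represent each quantized value, which now has a fixed level, by a group of binary code words.

PCM introduction: Baidu Baike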

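It follows that the data rate after sampling = sample rate × sample size × channel count, in bps.
For a PCM-encoded WAV file with a 44.1 kHz sample rate, 16-bit samples, and two channels, the data rate = 44.1 kHz × 16 bit × 2 = 1411.2 kbps = 176.4 KB/s.
That is roughly the data rate of a compressed video stream!
This is why audio is compressed too, which brings us to AAC (Advanced Audio Coding).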
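The data-rate formula above can be checked with a few lines of C. This is only a sketch of the arithmetic; the function names are made up for illustration and are not part of the demo.

```c
#include <stdint.h>

/* Uncompressed PCM data rate in bits per second:
 * sample rate (Hz) x bits per sample x channel count. */
uint64_t pcm_bits_per_second(uint32_t sample_rate_hz,
                             uint32_t bits_per_sample,
                             uint32_t channels)
{
    return (uint64_t)sample_rate_hz * bits_per_sample * channels;
}

/* The same rate expressed in bytes per second. */
uint64_t pcm_bytes_per_second(uint32_t sample_rate_hz,
                              uint32_t bits_per_sample,
                              uint32_t channels)
{
    return pcm_bits_per_second(sample_rate_hz, bits_per_sample, channels) / 8;
}
```

With 44100 Hz, 16 bits, and 2 channels this gives 1,411,200 bps and 176,400 bytes per second, matching the 1411.2 kbps and 176.4 KB/s figures in the text.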

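AAC (Advanced Audio Coding)

AAC (Advanced Audio Coding) appeared in 1997 as an audio coding technology based on MPEG-2. It was developed jointly by Fraunhofer IIS, Dolby Laboratories, AT&T, Sony, and other companies, with the goal of replacing the MP3 format.

AAC on Wikipedia
For the principles of audio compression coding, see here.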

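AAC audio formats

AAC audio comes in two formats, ADIF and ADTS:

  • ADIF: Audio Data Interchange Format. The start of the audio data can be located unambiguously, and decoding cannot begin in the middle of the stream; it must begin at that well-defined start. This format is therefore mostly used for files on disk.
  • ADTS: Audio Data Transport Stream. This is a bitstream with sync words, so decoding can begin at any position in the stream. In this respect it resembles the MP3 stream format.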
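To illustrate why an ADTS stream can be decoded from any position, here is a minimal C sketch that scans a byte buffer for the 12-bit ADTS syncword (0xFFF). The function name is hypothetical and the code is not taken from the demo.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the byte offset of the first ADTS syncword (twelve 1-bits,
 * 0xFFF) found in buf, or -1 if there is none. */
long find_adts_syncword(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++) {
        /* 0xFF followed by a byte whose high nibble is 0xF */
        if (buf[i] == 0xFF && (buf[i + 1] & 0xF0) == 0xF0) {
            return (long)i;
        }
    }
    return -1;
}
```

The same bit pattern can also occur by chance inside the AAC payload, so a robust parser would presumably confirm a candidate by reading the frame length from the header and checking that another syncword follows at that distance.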

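Encoding PCM audio into an AAC stream on iOS

  • 1. Set up the encoder (codec) and start recording;
  • 2. Collect the PCM data and pass it to the encoder;
  • 3. When encoding completes, the callback fires and the result is written to a file.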


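Detailed steps

1. Create and configure the AVCaptureSession

Create an AVCaptureSession, find the audio AVCaptureDevice, create an input from that device and add it to the session, and finally add the output to the session.

audioFileHandle is an NSFileHandle used to write the encoded AAC audio to a file.
In the demo this code also sets up video. The video-related configuration has been removed here to keep things short.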

- (void)startCapture {
    self.mCaptureSession = [[AVCaptureSession alloc] init];
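    // AVCaptureAudioDataOutput requires a serial delegate queue so that samples arrive in order.
    // The queue labels are illustrative.
    mCaptureQueue = dispatch_queue_create("capture.queue", DISPATCH_QUEUE_SERIAL);
    mEncodeQueue = dispatch_queue_create("encode.queue", DISPATCH_QUEUE_SERIAL);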
        
    AVCaptureDevice *audioDevice = [[AVCaptureDevice devicesWithMediaType:AVMediaTypeAudio] lastObject];
    self.mCaptureAudioDeviceInput = [[AVCaptureDeviceInput alloc] initWithDevice:audioDevice error:nil];
    if ([self.mCaptureSession canAddInput:self.mCaptureAudioDeviceInput]) {
        [self.mCaptureSession addInput:self.mCaptureAudioDeviceInput];
    }
    self.mCaptureAudioOutput = [[AVCaptureAudioDataOutput alloc] init];
    
    if ([self.mCaptureSession canAddOutput:self.mCaptureAudioOutput]) {
        [self.mCaptureSession addOutput:self.mCaptureAudioOutput];
    }
    [self.mCaptureAudioOutput setSampleBufferDelegate:self queue:mCaptureQueue];
       
    NSString *audioFile = [[NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject] stringByAppendingPathComponent:@"abc.aac"];
    [[NSFileManager defaultManager] removeItemAtPath:audioFile error:nil];
    [[NSFileManager defaultManager] createFileAtPath:audioFile contents:nil attributes:nil];
    audioFileHandle = [NSFileHandle fileHandleForWritingAtPath:audioFile];
    
    [self.mCaptureSession startRunning];
}

2. Create the converter

AudioStreamBasicDescription is the struct that describes the output stream.
Once outAudioStreamBasicDescription has been filled in,
AudioConverterNewSpecific is called with the AudioClassDescription (the encoder)
to create the converter.

/**
 *  Set up the encoder parameters
 *
 *  @param sampleBuffer the audio sample buffer
 */
- (void) setupEncoderFromSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));
    
    AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Zero-initialize the output stream description. This is important.
    outAudioStreamBasicDescription.mSampleRate = inAudioStreamBasicDescription.mSampleRate; // Frames per second of the stream during normal playback. For a compressed format this is the rate after decompression. Must not be 0.
    outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC; // Output encoding format
    outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_LC; // AAC Low Complexity (LC) object type, a lossy profile; 0 means no format flags
    outAudioStreamBasicDescription.mBytesPerPacket = 0; // Bytes of audio data per packet. Set to 0 for variable-size packets; their sizes are then given by AudioStreamPacketDescription.
    outAudioStreamBasicDescription.mFramesPerPacket = 1024; // Frames per packet. 1 for uncompressed audio. Variable-bit-rate formats use a larger fixed number, e.g. 1024 for AAC. Set to 0 for formats with a variable number of frames per packet (e.g. Ogg).
    outAudioStreamBasicDescription.mBytesPerFrame = 0; // Bytes from the start of one frame to the start of the next. Set to 0 for compressed formats.
    outAudioStreamBasicDescription.mChannelsPerFrame = 1; // Number of channels
    outAudioStreamBasicDescription.mBitsPerChannel = 0; // Set to 0 for compressed formats
    outAudioStreamBasicDescription.mReserved = 0; // Pads the struct to 8-byte alignment; must be 0.
    AudioClassDescription *description = [self
                                          getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
                                          fromManufacturer:kAppleSoftwareAudioCodecManufacturer]; // software encoder
    
    OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, description, &_audioConverter); // Create the converter
    if (status != 0) {
        NSLog(@"setup converter: %d", (int)status);
    }
}
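Because mFramesPerPacket is 1024, each AAC packet covers 1024 PCM frames. The following C sketch works out how many PCM bytes that is and how long one packet lasts. The helper names are made up for illustration, and the 16-bit sample size in the example is an assumption about the capture format, not something the demo states.

```c
#include <stddef.h>
#include <stdint.h>

#define AAC_FRAMES_PER_PACKET 1024u  /* same value as mFramesPerPacket */

/* PCM bytes the encoder consumes to produce one AAC packet. */
size_t pcm_bytes_per_aac_packet(uint32_t channels, uint32_t bits_per_sample)
{
    return (size_t)AAC_FRAMES_PER_PACKET * channels * (bits_per_sample / 8);
}

/* Duration of one AAC packet in milliseconds. */
double aac_packet_duration_ms(double sample_rate_hz)
{
    return 1000.0 * AAC_FRAMES_PER_PACKET / sample_rate_hz;
}
```

For mono 16-bit PCM one packet needs 2048 bytes of input, and at 44.1 kHz one packet lasts about 23.2 ms.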

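Method for obtaining the encoder


/**
 *  Get the codec
 *
 *  @param type         encoding format
 *  @param manufacturer software or hardware encoder
 *
 *  A codec is a device or program that transforms a signal or data stream. The transformation covers both encoding a signal or data stream (usually for transmission, storage, or encryption) and decoding an encoded stream back into a form suitable for viewing or processing. Codecs are widely used in applications such as video conferencing and streaming media.
 *  @return the requested encoder
 */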
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                           fromManufacturer:(UInt32)manufacturer
{
    static AudioClassDescription desc;
    
    UInt32 encoderSpecifier = type;
    OSStatus st;
    
    UInt32 size;
    st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size);
    if (st) {
        NSLog(@"error getting audio format property info: %d", (int)(st));
        return NULL;
    }
    
    unsigned int count = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[count];
    st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                sizeof(encoderSpecifier),
                                &encoderSpecifier,
                                &size,
                                descriptions);
    if (st) {
        NSLog(@"error getting audio format property: %d", (int)(st));
        return NULL;
    }
    
    for (unsigned int i = 0; i < count; i++) {
        if ((type == descriptions[i].mSubType) &&
            (manufacturer == descriptions[i].mManufacturer)) {
            memcpy(&desc, &(descriptions[i]), sizeof(desc));
            return &desc;
        }
    }
    
    return NULL; // no matching encoder found
}

3. Obtain the PCM data and pass it to the encoder

CMSampleBufferGetDataBuffer returns the CMBlockBufferRef inside the CMSampleBufferRef, and CMBlockBufferGetDataPointer then yields _pcmBufferSize and _pcmBuffer.
AudioConverterFillComplexBuffer is called to feed in the data, and the callback function calls the method that fills the buffer.

        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
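        CFRetain(blockBuffer); // keeps the block alive while _pcmBuffer points into it; balance with CFRelease once encoding is done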
        OSStatus status = CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &_pcmBufferSize, &_pcmBuffer);
        NSError *error = nil;
        if (status != kCMBlockBufferNoErr) {
            error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        }
        memset(_aacBuffer, 0, _aacBufferSize);
        
        AudioBufferList outAudioBufferList = {0};
        outAudioBufferList.mNumberBuffers = 1;
        outAudioBufferList.mBuffers[0].mNumberChannels = 1;
        outAudioBufferList.mBuffers[0].mDataByteSize = (int)_aacBufferSize;
        outAudioBufferList.mBuffers[0].mData = _aacBuffer;
        AudioStreamPacketDescription *outPacketDescription = NULL;
        UInt32 ioOutputDataPacketSize = 1;
        // Converts data supplied by an input callback function, supporting non-interleaved and packetized formats.
        // Produces a buffer list of output data from an AudioConverter. The supplied input callback function is called whenever necessary.
        status = AudioConverterFillComplexBuffer(_audioConverter, inInputDataProc, (__bridge void *)(self), &ioOutputDataPacketSize, &outAudioBufferList, outPacketDescription);

Callback function

/**
 *  A callback function that supplies audio data to convert. This callback is invoked repeatedly as the converter is ready for new input data.
 
 */
OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
    AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
    UInt32 requestedPackets = *ioNumberDataPackets;
    
    size_t copiedSamples = [encoder copyPCMSamplesIntoBuffer:ioData];
    if (copiedSamples < requestedPackets) {
        // The PCM buffer is not full yet
        *ioNumberDataPackets = 0;
        return -1;
    }
    *ioNumberDataPackets = 1;
    
    return noErr;
}

/**
 *  Fill the buffer with PCM data
 */
- (size_t) copyPCMSamplesIntoBuffer:(AudioBufferList*)ioData {
    size_t originalBufferSize = _pcmBufferSize;
    if (!originalBufferSize) {
        return 0;
    }
    ioData->mBuffers[0].mData = _pcmBuffer;
    ioData->mBuffers[0].mDataByteSize = (int)_pcmBufferSize;
    _pcmBuffer = NULL;
    _pcmBufferSize = 0;
    return originalBufferSize;
}

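4. Get the raw AAC stream, add the ADTS header, and write it to the file

AudioConverterFillComplexBuffer returns a raw AAC stream, so an ADTS header has to be prepended to every AAC frame. The header is generated by the adtsDataForPacketLength method, and the data is finally written to the file behind audioFileHandle.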

        if (status == 0) {
            NSData *rawAAC = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
            NSData *adtsHeader = [self adtsDataForPacketLength:rawAAC.length];
            NSMutableData *fullData = [NSMutableData dataWithData:adtsHeader];
            [fullData appendData:rawAAC];
            data = fullData;
        } else {
            error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        }
        if (completionBlock) {
            dispatch_async(_callbackQueue, ^{
                completionBlock(data, error);
            });
        }

An ADTS header generation method found online

/**
 *  Add ADTS header at the beginning of each and every AAC packet.
 *  This is needed as MediaCodec encoder generates a packet of raw
 *  AAC data.
 *
 *  Note the packetLen must count in the ADTS header itself.
 *  See: http://wiki.multimedia.cx/index.php?title=ADTS
 *  Also: http://wiki.multimedia.cx/index.php?title=MPEG-4_Audio#Channel_Configurations
 **/
- (NSData*) adtsDataForPacketLength:(NSUInteger)packetLength {
    int adtsLength = 7;
    char *packet = malloc(sizeof(char) * adtsLength);
    // Variables Recycled by addADTStoPacket
    int profile = 2;  //AAC LC
    //39=MediaCodecInfo.CodecProfileLevel.AACObjectELD;
    int freqIdx = 4;  //44.1KHz
    int chanCfg = 1;  //MPEG-4 Audio Channel Configuration. 1 Channel front-center
    NSUInteger fullLength = adtsLength + packetLength;
    // fill in ADTS data
    packet[0] = (char)0xFF; // 11111111     = syncword
    packet[1] = (char)0xF9; // 1111 1 00 1  = syncword MPEG-2 Layer CRC
    packet[2] = (char)(((profile-1)<<6) + (freqIdx<<2) +(chanCfg>>2));
    packet[3] = (char)(((chanCfg&3)<<6) + (fullLength>>11));
    packet[4] = (char)((fullLength&0x7FF) >> 3);
    packet[5] = (char)(((fullLength&7)<<5) + 0x1F);
    packet[6] = (char)0xFC;
    NSData *data = [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
    return data;
}
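The method above hard-codes freqIdx = 4 (44.1 kHz), while the converter takes its sample rate from the input. As a sketch under that observation, the plain-C variant below looks the index up from the sample rate using the MPEG-4 sampling frequency table and reads the 13-bit frame length back out of a header. The function names are hypothetical. The bit layout follows the method above: 7-byte header, no CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* MPEG-4 sampling frequency index for a sample rate, or -1 if the
 * rate is not in the standard table. */
int adts_freq_index(uint32_t sample_rate_hz)
{
    static const uint32_t rates[] = {
        96000, 88200, 64000, 48000, 44100, 32000, 24000,
        22050, 16000, 12000, 11025, 8000, 7350
    };
    for (int i = 0; i < (int)(sizeof rates / sizeof rates[0]); i++) {
        if (rates[i] == sample_rate_hz) return i;
    }
    return -1;
}

/* Writes a 7-byte ADTS header (no CRC) into out. payload_len is the
 * size of the raw AAC packet; the stored frame length includes the
 * header itself. Returns 0 on success, -1 on invalid arguments. */
int adts_write_header(uint8_t out[7], int profile, uint32_t sample_rate_hz,
                      int chan_cfg, size_t payload_len)
{
    int freq_idx = adts_freq_index(sample_rate_hz);
    size_t full_len = payload_len + 7;
    if (freq_idx < 0 || profile < 1 || profile > 4 ||
        chan_cfg < 0 || chan_cfg > 7 || full_len > 0x1FFF) {
        return -1;
    }
    out[0] = 0xFF;  /* syncword, high 8 bits */
    out[1] = 0xF9;  /* syncword low 4 bits, MPEG-2 ID, layer 0, no CRC */
    out[2] = (uint8_t)(((profile - 1) << 6) | (freq_idx << 2) | (chan_cfg >> 2));
    out[3] = (uint8_t)(((chan_cfg & 3) << 6) | (full_len >> 11));
    out[4] = (uint8_t)((full_len >> 3) & 0xFF);
    out[5] = (uint8_t)(((full_len & 7) << 5) | 0x1F);
    out[6] = 0xFC;
    return 0;
}

/* Reads the 13-bit frame length (header + payload) from a header. */
size_t adts_frame_length(const uint8_t h[7])
{
    return ((size_t)(h[3] & 0x03) << 11) | ((size_t)h[4] << 3) | (size_t)(h[5] >> 5);
}
```

For example, profile 2 (AAC LC), 44100 Hz, one channel, and a 100-byte payload give the header bytes FF F9 50 40 0D 7F FC, and adts_frame_length reads back 107.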

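Summary

The demo is mainly meant to get familiar with the AAC encoding format. It records audio from the microphone and encodes it into an AAC stream.
The next post covers how to decode and play back the AAC stream generated here.
The code is available here.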
