Hello! AudioStreamBasicDescription


How Apple defines audio

In Core Audio, the following definitions apply:

  • An audio stream is a continuous series of data that represents a sound, such as a song.
  • A channel is a discrete track of monophonic audio. A monophonic stream has one channel; a stereo stream has two channels.
  • A sample is a single numerical value for a single audio channel in an audio stream.
  • A frame is a collection of time-coincident samples. For instance, a linear PCM stereo sound file has two samples per frame, one for the left channel and one for the right channel.
  • A packet is a collection of one or more contiguous frames. A packet defines the smallest meaningful set of frames for a given audio data format, and is the smallest data unit for which time can be measured. In linear PCM audio, a packet holds a single frame. In compressed formats, it typically holds more; in some formats, the number of frames per packet varies.
  • The sample rate for a stream is the number of frames per second of uncompressed (or, for compressed formats, the equivalent in decompressed) audio.

The AudioStreamBasicDescription structure

struct AudioStreamBasicDescription
{
    Float64             mSampleRate;
    AudioFormatID       mFormatID;
    AudioFormatFlags    mFormatFlags;
    UInt32              mBytesPerPacket;
    UInt32              mFramesPerPacket;
    UInt32              mBytesPerFrame;
    UInt32              mChannelsPerFrame;
    UInt32              mBitsPerChannel;
    UInt32              mReserved;
};
typedef struct AudioStreamBasicDescription  AudioStreamBasicDescription;

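For PCM, the sampling frequency is called the sample rate.
Each sampling instant yields several sample values, one for each channel.
The sample values taken at one sampling instant, grouped together, are called a frame.
Several frames grouped together are called a packet.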
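The relationship between sample rate, frames, samples, and packets can be sketched with a little arithmetic. The helper names below are made up for illustration and are not part of Core Audio.

```c
#include <assert.h>
#include <stdint.h>

/* Frames captured in `seconds` of audio at `sample_rate` frames per second. */
static uint64_t frames_for_duration(double sample_rate, double seconds) {
    assert(sample_rate > 0.0 && seconds >= 0.0);
    return (uint64_t)(sample_rate * seconds + 0.5); /* round to the nearest frame */
}

/* Each frame holds one sample per channel. */
static uint64_t samples_for_frames(uint64_t frames, uint32_t channels) {
    return frames * channels;
}

/* Packets needed to hold `frames` frames; a partial last packet counts as one. */
static uint64_t packets_for_frames(uint64_t frames, uint32_t frames_per_packet) {
    assert(frames_per_packet > 0);
    return (frames + frames_per_packet - 1) / frames_per_packet;
}
```

For example, 2 seconds of stereo audio at 44100 Hz is 88200 frames and 176400 samples. As linear PCM (1 frame per packet) that is 88200 packets.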

Meaning of each AudioStreamBasicDescription field

mSampleRate

  • The sample rate: how many times per second the recording device samples the sound signal. Common sample rates include 16000, 32000, and 44100 Hz.

mFormatID

The type of the sample data, such as PCM or AAC. The field has type AudioFormatID.

kAudioFormatLinearPCM               = 'lpcm',
kAudioFormatMPEG4AAC                = 'aac ',
kAudioFormatMPEGLayer3              = '.mp3',
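The format IDs above are four-character codes stored in a 32-bit integer. Multi-character constants such as 'lpcm' are implementation-defined in C, so the sketch below builds the value explicitly, assuming the first character lands in the most significant byte (which is how GCC and Clang evaluate them). The function names fourcc and fourcc_to_string are hypothetical helpers, not Core Audio API.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four ASCII characters into a 32-bit code, first character in the top byte. */
static uint32_t fourcc(const char code[4]) {
    assert(code != 0);
    return ((uint32_t)(unsigned char)code[0] << 24) |
           ((uint32_t)(unsigned char)code[1] << 16) |
           ((uint32_t)(unsigned char)code[2] << 8)  |
            (uint32_t)(unsigned char)code[3];
}

/* Unpack a 32-bit code into a NUL-terminated string, e.g. for logging. */
static void fourcc_to_string(uint32_t id, char out[5]) {
    assert(out != 0);
    out[0] = (char)((id >> 24) & 0xFF);
    out[1] = (char)((id >> 16) & 0xFF);
    out[2] = (char)((id >> 8) & 0xFF);
    out[3] = (char)(id & 0xFF);
    out[4] = '\0';
}
```

Printing an unknown mFormatID this way is a handy debugging step, since the raw integer is hard to read.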

mFormatFlags

Flags that describe the format of the sample data carried in an AudioBufferList:

    kAudioFormatFlagIsFloat                     = (1U << 0),     // 0x1
    kAudioFormatFlagIsBigEndian                 = (1U << 1),     // 0x2
    kAudioFormatFlagIsSignedInteger             = (1U << 2),     // 0x4
    kAudioFormatFlagIsPacked                    = (1U << 3),     // 0x8
    kAudioFormatFlagIsAlignedHigh               = (1U << 4),     // 0x10
    kAudioFormatFlagIsNonInterleaved            = (1U << 5),     // 0x20
    kAudioFormatFlagIsNonMixable                = (1U << 6),     // 0x40

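kAudioFormatFlagIsFloat

Whether the samples are floating point. If not set, the samples are integers.

kAudioFormatFlagIsBigEndian

Whether the samples are big-endian. If not set, the samples are little-endian.

kAudioFormatFlagIsSignedInteger

Whether the samples are signed integers. If not set, the samples are unsigned integers.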
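As a rough sketch of how these three flags can be read, the code below tests individual bits with a bitwise AND. The FLAG_* constants are local copies of the values listed above so that the example builds without Core Audio headers, and SampleKind, has_flag, sample_kind, and is_big_endian are hypothetical names. Checking the float flag before the signed-integer flag is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the flag values listed above. */
#define FLAG_IS_FLOAT          (1u << 0)
#define FLAG_IS_BIG_ENDIAN     (1u << 1)
#define FLAG_IS_SIGNED_INTEGER (1u << 2)

typedef enum { SAMPLE_UNSIGNED_INT, SAMPLE_SIGNED_INT, SAMPLE_FLOAT } SampleKind;

/* True if the single-bit `flag` is set in `flags`. */
static bool has_flag(uint32_t flags, uint32_t flag) {
    assert(flag != 0 && (flag & (flag - 1)) == 0); /* exactly one bit */
    return (flags & flag) != 0;
}

/* Float wins over signed integer in this sketch; neither bit means unsigned. */
static SampleKind sample_kind(uint32_t flags) {
    if (has_flag(flags, FLAG_IS_FLOAT)) return SAMPLE_FLOAT;
    if (has_flag(flags, FLAG_IS_SIGNED_INTEGER)) return SAMPLE_SIGNED_INT;
    return SAMPLE_UNSIGNED_INT;
}

static bool is_big_endian(uint32_t flags) {
    return has_flag(flags, FLAG_IS_BIG_ENDIAN);
}
```

Flags are combined with a bitwise OR, so 16-bit signed little-endian samples would carry FLAG_IS_SIGNED_INTEGER with the big-endian bit left clear.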

kAudioFormatFlagIsPacked

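Whether the mBitsPerChannel sample bits occupy all of the bits available for the channel. If they do not, the sample is aligned to either the high or the low bits of the channel.
Even when this flag is not set, it is implied to be set if the fields satisfy ((mBitsPerChannel / 8) * mChannelsPerFrame) == mBytesPerFrame.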
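The implied-packed rule can be written as a small predicate. This is only a sketch: FLAG_IS_PACKED is a local copy of the kAudioFormatFlagIsPacked value listed above, and is_effectively_packed is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLAG_IS_PACKED (1u << 3) /* same value as kAudioFormatFlagIsPacked */

/* True if the flag is set, or if the field sizes leave no padding bits. */
static bool is_effectively_packed(uint32_t flags, uint32_t bits_per_channel,
                                  uint32_t channels_per_frame, uint32_t bytes_per_frame) {
    assert(channels_per_frame > 0);
    if (flags & FLAG_IS_PACKED) return true;
    return (bits_per_channel / 8) * channels_per_frame == bytes_per_frame;
}
```

For instance, 16-bit stereo with 4 bytes per frame counts as packed even without the flag, while 24-bit stereo stored in 8 bytes per frame does not, because each sample sits in a 32-bit container.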

kAudioFormatFlagIsNonInterleaved

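Selects between a planar (non-interleaved) layout and an interleaved layout.

Audio data can be laid out in an interleaved or a planar arrangement. Taking two-channel audio as an example, the data has two possible layouts:

  1. Interleaved layout: LRLRLR...
  2. Planar layout:
     • Plane 1: LLLLLL...
     • Plane 2: RRRRRR...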
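To make the two layouts concrete, here is a minimal sketch that converts interleaved stereo samples into two planes. It assumes 16-bit signed samples, and deinterleave_stereo is a name chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split interleaved stereo (L R L R ...) into two planes (L L ... and R R ...). */
static void deinterleave_stereo(const int16_t *interleaved, size_t frames,
                                int16_t *left, int16_t *right) {
    assert(interleaved && left && right);
    for (size_t i = 0; i < frames; ++i) {
        left[i]  = interleaved[2 * i];     /* even positions: left channel */
        right[i] = interleaved[2 * i + 1]; /* odd positions: right channel */
    }
}
```

In the interleaved case one buffer holds every channel. In the planar case each channel gets its own buffer, which is why the per-frame byte count differs between the two.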

mChannelsPerFrame

The number of channels in the audio data: 1 for mono, 2 for stereo. This value must not be 0.

mBitsPerChannel

The number of bits in each audio sample (1 byte = 8 bits). Typical values are 8, 16, and 32.

mBytesPerFrame

The number of bytes in each audio frame.
It is calculated as follows:

  • Interleaved layout: mBytesPerFrame = mBitsPerChannel / 8 * mChannelsPerFrame
  • Planar layout: mBytesPerFrame = mBitsPerChannel / 8
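The two formulas can be folded into one helper. This sketch assumes packed samples whose bit depth is a whole number of bytes; bytes_per_frame is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes per frame for packed linear PCM. In the planar (non-interleaved)
   layout each buffer holds one channel, so a frame is a single sample. */
static uint32_t bytes_per_frame(uint32_t bits_per_channel, uint32_t channels_per_frame,
                                bool non_interleaved) {
    assert(channels_per_frame > 0 && bits_per_channel % 8 == 0);
    uint32_t bytes_per_sample = bits_per_channel / 8;
    return non_interleaved ? bytes_per_sample
                           : bytes_per_sample * channels_per_frame;
}
```

With 16-bit stereo this gives 4 bytes per frame when interleaved and 2 bytes per frame when planar.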

mFramesPerPacket

The number of frames in one packet. For uncompressed audio the value is 1. For variable bit rate formats the value is a larger fixed number, such as 1024 for AAC. For formats with a variable number of frames per packet (such as Ogg), set it to 0.
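Since the sample rate counts frames per second, the playing time of one packet follows directly from this field. A small sketch, with packet_duration_seconds as a hypothetical helper that reports failure when the frame count is variable (0):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Writes the duration of one packet to *seconds and returns true, or returns
   false when frames_per_packet is 0 (variable frames per packet). */
static bool packet_duration_seconds(uint32_t frames_per_packet, double sample_rate,
                                    double *seconds) {
    assert(sample_rate > 0.0 && seconds != 0);
    if (frames_per_packet == 0) return false; /* not known from the description alone */
    *seconds = frames_per_packet / sample_rate;
    return true;
}
```

An AAC packet of 1024 frames at 44100 Hz lasts about 23.2 ms, while a linear PCM packet at the same rate lasts 1/44100 of a second.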

mBytesPerPacket

The number of bytes in one packet: mBytesPerPacket = mBytesPerFrame * mFramesPerPacket. For formats whose packet size varies, this field is set to 0.
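Putting the size fields together, the following sketch fills in a description for interleaved, packed, signed-integer, little-endian linear PCM. StreamDescription is a local mirror of the struct shown earlier so that the example builds without Core Audio headers. The constants copy the values listed above, with 'lpcm' written as its assumed numeric value, and make_interleaved_pcm is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of AudioStreamBasicDescription (same field order). */
typedef struct {
    double   mSampleRate;
    uint32_t mFormatID;
    uint32_t mFormatFlags;
    uint32_t mBytesPerPacket;
    uint32_t mFramesPerPacket;
    uint32_t mBytesPerFrame;
    uint32_t mChannelsPerFrame;
    uint32_t mBitsPerChannel;
    uint32_t mReserved;
} StreamDescription;

#define FORMAT_LINEAR_PCM      0x6C70636Du /* 'l' 'p' 'c' 'm' */
#define FLAG_IS_SIGNED_INTEGER (1u << 2)
#define FLAG_IS_PACKED         (1u << 3)

static StreamDescription make_interleaved_pcm(double sample_rate, uint32_t channels,
                                              uint32_t bits) {
    assert(sample_rate > 0.0 && channels > 0 && bits % 8 == 0);
    StreamDescription d;
    memset(&d, 0, sizeof d);              /* also leaves mReserved at 0 */
    d.mSampleRate       = sample_rate;
    d.mFormatID         = FORMAT_LINEAR_PCM;
    d.mFormatFlags      = FLAG_IS_SIGNED_INTEGER | FLAG_IS_PACKED;
    d.mChannelsPerFrame = channels;
    d.mBitsPerChannel   = bits;
    d.mBytesPerFrame    = (bits / 8) * channels;
    d.mFramesPerPacket  = 1;              /* uncompressed: one frame per packet */
    d.mBytesPerPacket   = d.mBytesPerFrame * d.mFramesPerPacket;
    return d;
}
```

For 44100 Hz, 16-bit stereo this yields 4 bytes per frame and 4 bytes per packet.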

mReserved

Pads the structure to force 8-byte alignment. Must be set to 0.
