How Audio/Video Synchronization Is Implemented (live555)

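In live555, video and audio are encoded and streamed separately, so how are the two kept in sync?
If the video timestamps and the audio timestamps are both mapped onto the same NTP time base, the two streams can be synchronized.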

Network Time Protocol (NTP) is a networking protocol for clock synchronization between computer systems over packet-switched, variable-latency data networks.

How does live555 implement this mechanism?
The overall idea is:

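  • The RTSP server uses the RTCP Sender Report to send an NTP timestamp to the RTSP client.
  • The RTSP client (the receiver of the data) maps the audio and video RTP timestamps onto the absolute time carried by RTCP (the NTP timestamp), which synchronizes audio and video.
    This absolute time is the time elapsed since Jan 1 1900 00:00:00.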
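To make the "time since 1900" representation concrete, here is a minimal sketch of how a sender could turn a Unix time into the two 32-bit words of an NTP timestamp. The type `NtpTimestamp` and the function `unixToNtp` are hypothetical names used only for illustration; they are not live555 APIs, and the sketch assumes `microseconds` is below 1,000,000.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration; not part of live555.
struct NtpTimestamp {
    std::uint32_t msw;  // whole seconds since Jan 1 1900
    std::uint32_t lsw;  // fraction of a second, in units of 1/2^32 s
};

// Seconds between the NTP epoch (1900) and the Unix epoch (1970).
constexpr std::uint32_t kNtpUnixOffset = 0x83AA7E80u;

// Assumes microseconds < 1000000, so the fraction fits in 32 bits.
NtpTimestamp unixToNtp(std::uint32_t unixSeconds, std::uint32_t microseconds) {
    NtpTimestamp ntp;
    ntp.msw = unixSeconds + kNtpUnixOffset;
    // Scale microseconds (10^-6 s) to units of 2^-32 s using 64-bit math.
    ntp.lsw = static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(microseconds) << 32) / 1000000u);
    return ntp;
}
```

For example, half a second past the Unix epoch plus one second maps to `msw == 0x83AA7E81` and `lsw == 0x80000000`.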

First, look at the timestamp code before any synchronization is applied:

void RTPReceptionStats::noteIncomingPacket(u_int16_t seqNum, 
                                           u_int32_t rtpTimestamp,
                                           unsigned timestampFrequency,
                                           Boolean useForJitterCalculation,
                                           struct timeval& resultPresentationTime,
                                           Boolean& resultHasBeenSyncedUsingRTCP,
                                           unsigned packetSize) 
{
    ...

    // Record the inter-packet delay
    struct timeval timeNow;
    gettimeofday(&timeNow, NULL);

    ...

    // Return the 'presentation time' that corresponds to "rtpTimestamp":
    if (fSyncTime.tv_sec == 0 && fSyncTime.tv_usec == 0) 
    {
        // This is the first timestamp that we've seen, so use the current
        // 'wall clock' time as the synchronization time.  (This will be
        // corrected later when we receive RTCP SRs.)
        fSyncTimestamp = rtpTimestamp; // the first RTP timestamp
        fSyncTime      = timeNow; // use the current system time as the initial reference time
    }

    int timestampDiff = rtpTimestamp - fSyncTimestamp;

    // Note: This works even if the timestamp wraps around
    // (as long as "int" is 32 bits)

    // Divide this by the timestamp frequency to get real time:
    double timeDiff = timestampDiff/(double)timestampFrequency;

    // Add this to the 'sync time' to get our result:
    unsigned const million = 1000000;
    unsigned seconds, uSeconds;

    if (timeDiff >= 0.0) 
    {
        // Compute the presentation time
        seconds  = fSyncTime.tv_sec  + (unsigned)(timeDiff);
        uSeconds = fSyncTime.tv_usec + (unsigned)((timeDiff - (unsigned)timeDiff)*million);

        if (uSeconds >= million) 
        {
            uSeconds -= million;
            ++seconds;
        }
    } 
    else 
    {
        timeDiff = -timeDiff;
        seconds  = fSyncTime.tv_sec  - (unsigned)(timeDiff);
        uSeconds = fSyncTime.tv_usec - (unsigned)((timeDiff - (unsigned)timeDiff)*million);
        if ((int)uSeconds < 0) 
        {
            uSeconds += million;
            --seconds;
        }
    }

    resultPresentationTime.tv_sec  = seconds;
    resultPresentationTime.tv_usec = uSeconds;
    resultHasBeenSyncedUsingRTCP   = fHasBeenSynchronized;

    // Save these as the new synchronization timestamp & time:
    fSyncTimestamp = rtpTimestamp;
    fSyncTime      = resultPresentationTime;

    fPreviousPacketRTPTimestamp = rtpTimestamp;
}

Two member variables matter here: fSyncTimestamp and fSyncTime.

class RTPReceptionStats {
...

private:
  // Used to convert from RTP timestamp to 'wall clock' time:
  Boolean fHasBeenSynchronized;
  u_int32_t fSyncTimestamp;
  struct timeval fSyncTime;
};
  • fSyncTimestamp
    The RTP timestamp. By default, the rtpTimestamp of packet N becomes the fSyncTimestamp for packet N+1.
  • fSyncTime
    The 'wall clock' time. By default, the 'wall clock' time computed for packet N becomes the fSyncTime for packet N+1.

What RTPReceptionStats::noteIncomingPacket does, in essence, is:
convert an RTP timestamp into a 'wall clock' time.

When the first RTP packet arrives, the system time is used as the first 'wall clock' time.
After that, whenever the RTP timestamp changes, the change is converted into real time:

int timestampDiff = rtpTimestamp - fSyncTimestamp;
 // Divide this by the timestamp frequency to get real time: 
double timeDiff = timestampDiff/(double)timestampFrequency;
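The subtraction above relies on 32-bit wrap-around arithmetic. The sketch below isolates that step so the behavior can be checked on its own. `rtpDeltaSeconds` is a hypothetical name, not a live555 function, and the sketch assumes the usual two's-complement conversion from unsigned to signed 32-bit values.

```cpp
#include <cstdint>

// Hypothetical helper for illustration; not part of live555.
// Returns the signed distance from syncTimestamp to rtpTimestamp in seconds.
double rtpDeltaSeconds(std::uint32_t rtpTimestamp,
                       std::uint32_t syncTimestamp,
                       unsigned timestampFrequency) {
    // Unsigned subtraction is modulo 2^32, so a wrapped counter still
    // yields the right distance once reinterpreted as a signed value.
    std::uint32_t raw = static_cast<std::uint32_t>(rtpTimestamp - syncTimestamp);
    std::int32_t diff = static_cast<std::int32_t>(raw);
    return diff / static_cast<double>(timestampFrequency);
}
```

With a 90 kHz clock, a sync timestamp of 0xFFFFFF00 and a new timestamp of 0x00000100 are 512 ticks apart even though the counter wrapped in between, and a timestamp that is 90000 ticks behind the sync point gives -1.0 seconds.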

That change is then applied to the 'wall clock' time, for example:

seconds = fSyncTime.tv_sec + (unsigned)(timeDiff); 
uSeconds = fSyncTime.tv_usec + (unsigned)((timeDiff - (unsigned)timeDiff)*million);
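The seconds/microseconds split with carry can also be written with signed arithmetic, which avoids separate branches for positive and negative offsets. This is only a sketch under stated assumptions: `WallClock` is a hypothetical stand-in for `struct timeval`, `offsetBy` is not a live555 function, and unlike the excerpt it rounds the fractional part to the nearest microsecond instead of truncating.

```cpp
#include <cmath>

// Hypothetical stand-in for struct timeval; not part of live555.
struct WallClock {
    long sec;
    long usec;  // expected to be in [0, 999999]
};

// Adds a (possibly negative) offset in seconds to a wall-clock time.
WallClock offsetBy(WallClock base, double deltaSeconds) {
    // floor() leaves a fractional part in [0, 1) even for negative offsets.
    double whole = std::floor(deltaSeconds);
    long sec = base.sec + static_cast<long>(whole);
    long usec = base.usec +
                static_cast<long>((deltaSeconds - whole) * 1000000.0 + 0.5);
    if (usec >= 1000000) {  // carry into the seconds field
        usec -= 1000000;
        ++sec;
    }
    return WallClock{sec, usec};
}
```

For instance, adding 0.25 s to 10.900000 s gives 11.150000 s, and adding -0.25 s to 10.100000 s gives 9.850000 s.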

The logic so far depends entirely on the accuracy of the receiver's system time; nothing corrects it.

Where does live555 correct the time?
The RTSP client (the receiver of the data) takes the Sender Report delivered over RTCP and uses the NTP timestamp and RTP timestamp it carries to correct fSyncTimestamp and fSyncTime.

[Figure: part of a Sender Report RTCP packet]

The correction code is:

void RTPReceptionStats::noteIncomingSR(u_int32_t ntpTimestampMSW,
                                       u_int32_t ntpTimestampLSW,
                                       u_int32_t rtpTimestamp) 
{
    fLastReceivedSR_NTPmsw = ntpTimestampMSW;
    fLastReceivedSR_NTPlsw = ntpTimestampLSW;

    gettimeofday(&fLastReceivedSR_time, NULL);

    // Use this SR to update time synchronization information:
    // ntpTimestampMSW : NTP timestamp, most significant word (upper 32 bits of the 64-bit NTP timestamp)
    fSyncTimestamp      = rtpTimestamp;
    fSyncTime.tv_sec    = ntpTimestampMSW - 0x83AA7E80; // 1/1/1900 -> 1/1/1970

    // ntpTimestampLSW  : NTP timestamp, least significant word (lower 32 bits of the 64-bit NTP timestamp)
    double microseconds = (ntpTimestampLSW * 15625.0) / 0x04000000; // 10^6/2^32
    fSyncTime.tv_usec   = (unsigned)(microseconds + 0.5);
}
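The two constants in this function can be checked independently: 0x83AA7E80 is the number of seconds in the 70 years (including 17 leap days) between 1900 and 1970, and 15625 / 0x04000000 is 10^6 / 2^32 with both sides divided by 64. The sketch below repeats the conversion as a standalone function. `UnixTime` and `ntpToUnix` are hypothetical names, not live555 APIs. One deliberate difference is the extra carry step: with rounding, a fraction very close to one second can round up to exactly 1,000,000 microseconds, so the sketch normalizes that case.

```cpp
#include <cstdint>

// 70 years with 17 leap days, in seconds, equals the epoch offset.
static_assert((70ull * 365 + 17) * 86400 == 0x83AA7E80ull,
              "NTP-to-Unix epoch offset");

// Hypothetical result type for illustration; not part of live555.
struct UnixTime {
    std::uint32_t sec;
    std::uint32_t usec;
};

UnixTime ntpToUnix(std::uint32_t ntpMSW, std::uint32_t ntpLSW) {
    UnixTime t;
    t.sec = ntpMSW - 0x83AA7E80u;  // 1/1/1900 -> 1/1/1970
    // Scale the 2^-32 s fraction to microseconds: 15625 / 2^26 == 10^6 / 2^32.
    double microseconds = (ntpLSW * 15625.0) / 0x04000000;
    t.usec = static_cast<std::uint32_t>(microseconds + 0.5);
    if (t.usec >= 1000000u) {  // rounding can reach exactly one second
        t.usec -= 1000000u;
        ++t.sec;
    }
    return t;
}
```

An NTP timestamp of (0x83AA7E80, 0x80000000) therefore maps to 0 s and 500000 µs of Unix time.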

Because Sender Reports keep correcting the video time and the audio time separately and promptly, audio and video stay in sync.
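Once both streams report presentation times on the same NTP-derived time base, a player can compare them directly. The following is a minimal sketch of that comparison, not how any particular player does it. `StreamTime` and `avSkewMicros` are hypothetical names, and the `syncedUsingRTCP` field mirrors the `resultHasBeenSyncedUsingRTCP` flag returned by `noteIncomingPacket`.

```cpp
#include <cstdint>

// Hypothetical per-stream presentation time; not part of live555.
struct StreamTime {
    long sec;
    long usec;
    bool syncedUsingRTCP;  // mirrors resultHasBeenSyncedUsingRTCP
};

// Stores video-minus-audio skew in microseconds and returns true, or
// returns false when either stream has not yet been corrected by an SR.
bool avSkewMicros(const StreamTime& video, const StreamTime& audio,
                  std::int64_t& skewMicros) {
    if (!video.syncedUsingRTCP || !audio.syncedUsingRTCP) {
        return false;  // times are still based on local arrival time
    }
    skewMicros = (static_cast<std::int64_t>(video.sec) - audio.sec) * 1000000
                 + (video.usec - audio.usec);
    return true;
}
```

A positive skew means the video sample is scheduled later than the audio sample. Before the first Sender Report arrives for both streams the comparison is not meaningful, which is why the sketch refuses to compute it.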

References:

https://en.wikipedia.org/wiki/Network_Time_Protocol
RTP: A Transport Protocol for Real-Time Applications (RFC 3550)
