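1: iFlytek (訊飛) accepts only two audio file formats, pcm and wav. In practice wav can be ruled out: when I converted other audio formats to wav with the ffmpeg library, the result was nothing but noise. The audio parameters are also constrained: the sample rate must be 8 kHz or 16 kHz, the sample depth 16 bits, and the audio mono (single channel).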
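To make the parameter requirements above concrete, here is a minimal sketch in plain Java that checks whether a WAV header describes 8 kHz or 16 kHz, 16-bit, mono PCM audio. It assumes the canonical 44-byte header layout (the "fmt " chunk directly after "WAVE"); real files may carry extra chunks, which this sketch does not handle. The class name WavFormatCheck and both method names are hypothetical and not part of any iFlytek SDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class WavFormatCheck {

    /** Returns true if a canonical 44-byte WAV header describes 8/16 kHz, 16-bit, mono PCM. */
    public static boolean isSupported(byte[] header) {
        if (header == null || header.length < 44) {
            return false;
        }
        // Assumption: "fmt " chunk sits at offset 12 (canonical layout only).
        if (!tag(header, 0).equals("RIFF") || !tag(header, 8).equals("WAVE")
                || !tag(header, 12).equals("fmt ")) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        int audioFormat = buf.getShort(20) & 0xFFFF;   // 1 means uncompressed PCM
        int channels = buf.getShort(22) & 0xFFFF;
        int sampleRate = buf.getInt(24);
        int bitsPerSample = buf.getShort(34) & 0xFFFF;
        return audioFormat == 1
                && channels == 1
                && (sampleRate == 8000 || sampleRate == 16000)
                && bitsPerSample == 16;
    }

    /** Builds a canonical 44-byte PCM WAV header, used here only to demonstrate the check. */
    public static byte[] buildHeader(int sampleRate, int channels, int bitsPerSample, int dataSize) {
        ByteBuffer buf = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + dataSize);                               // RIFF chunk size
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                                          // fmt chunk size for PCM
        buf.putShort((short) 1);                                 // audio format: PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);
        buf.putInt(sampleRate * channels * bitsPerSample / 8);   // byte rate
        buf.putShort((short) (channels * bitsPerSample / 8));    // block align
        buf.putShort((short) bitsPerSample);
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(dataSize);
        return buf.array();
    }

    private static String tag(byte[] bytes, int offset) {
        return new String(bytes, offset, 4, StandardCharsets.US_ASCII);
    }
}
```

A check like this could be run on a converted file before handing it to the recognizer, so that a wrong sample rate or a stereo track is caught early.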
2: First, extract the audio track from the MP4 with the native MediaExtractor and MediaMuxer. The code is as follows:
import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import android.util.Log;

import java.io.IOException;
import java.nio.ByteBuffer;

// mediaExtractor and mediaMuxer are fields of the enclosing class.
private void muxerAudio() {
    mediaExtractor = new MediaExtractor();
    int audioIndex = -1;
    try {
        mediaExtractor.setDataSource("/storage/emulated/0/1.mp4");
        // Look for the audio track among all tracks in the file.
        int trackCount = mediaExtractor.getTrackCount();
        for (int i = 0; i < trackCount; i++) {
            MediaFormat trackFormat = mediaExtractor.getTrackFormat(i);
            String mime = trackFormat.getString(MediaFormat.KEY_MIME);
            if (mime != null && mime.startsWith("audio/")) {
                audioIndex = i;
            }
        }
        if (audioIndex < 0) {
            // selectTrack(-1) would throw, so stop here if the file has no audio.
            Log.e("muxerAudio", "no audio track found");
            mediaExtractor.release();
            return;
        }
        mediaExtractor.selectTrack(audioIndex);
        MediaFormat trackFormat = mediaExtractor.getTrackFormat(audioIndex);
        // Note: the muxer writes an MPEG-4 container and keeps the original audio
        // encoding; the .wav extension alone does not make this a WAV file.
        mediaMuxer = new MediaMuxer("/storage/emulated/0/output_audio.wav", MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
        int writeAudioIndex = mediaMuxer.addTrack(trackFormat);
        mediaMuxer.start();
        ByteBuffer byteBuffer = ByteBuffer.allocate(500 * 1024);
        MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
        while (true) {
            int readSampleSize = mediaExtractor.readSampleData(byteBuffer, 0);
            if (readSampleSize < 0) {
                break; // no more samples
            }
            bufferInfo.offset = 0;
            bufferInfo.size = readSampleSize;
            // Read the flags and timestamp of the current sample before advance().
            // Using the real sample time replaces the original estimate of the
            // interval between frames, and only sync samples are flagged as key
            // frames instead of marking every sample as end-of-stream.
            boolean isSync = (mediaExtractor.getSampleFlags() & MediaExtractor.SAMPLE_FLAG_SYNC) != 0;
            bufferInfo.flags = isSync ? MediaCodec.BUFFER_FLAG_KEY_FRAME : 0;
            bufferInfo.presentationTimeUs = mediaExtractor.getSampleTime();
            mediaMuxer.writeSampleData(writeAudioIndex, byteBuffer, bufferInfo);
            mediaExtractor.advance();
        }
        mediaMuxer.stop();
        mediaMuxer.release();
        mediaExtractor.release();
        Log.d("muxerAudio", "finish");
    } catch (IOException e) {
        e.printStackTrace();
    }
}
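Next, use AndroidAudioConverter to convert the extracted audio (the original post calls it MP3) to wav, then use RxFFmpeg to convert the wav to pcm. The resulting pcm file can then be recognized by iFlytek.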
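As a rough illustration of the wav-to-pcm step, the sketch below builds an ffmpeg-style argument list that produces raw 16-bit little-endian mono PCM at 8 kHz or 16 kHz, matching the requirements in step 1. The flags are standard ffmpeg options; the class name PcmCommand, the method name, and the example paths are hypothetical, and how the arguments are handed to RxFFmpeg is not shown here, since the original post does not give that code.

```java
public class PcmCommand {

    /**
     * Builds ffmpeg arguments that convert an input audio file to raw PCM
     * (signed 16-bit little-endian, mono) at the given sample rate.
     */
    public static String[] build(String inputPath, String outputPath, int sampleRate) {
        if (sampleRate != 8000 && sampleRate != 16000) {
            throw new IllegalArgumentException("sample rate must be 8000 or 16000");
        }
        return new String[] {
            "ffmpeg",
            "-y",                                // overwrite the output file if it exists
            "-i", inputPath,                     // input file
            "-vn",                               // drop any video stream
            "-ac", "1",                          // mono
            "-ar", String.valueOf(sampleRate),   // output sample rate
            "-acodec", "pcm_s16le",              // 16-bit little-endian PCM samples
            "-f", "s16le",                       // raw output, no WAV header
            outputPath
        };
    }
}
```

For example, `PcmCommand.build("in.wav", "out.pcm", 16000)` yields the arguments of `ffmpeg -y -i in.wav -vn -ac 1 -ar 16000 -acodec pcm_s16le -f s16le out.pcm`. Setting the channel count and sample rate explicitly here means the output meets the stated parameters even if the intermediate wav does not.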