FFmpeg + Qt: Converting Audio Files to PCM Data

article 2025/8/19 15:35:38

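0. Introduction

PCM (Pulse Code Modulation) audio data is a raw stream of uncompressed audio samples. It is the standard digital audio data obtained from an analog signal through sampling, quantization, and encoding.

Six parameters describe PCM data (see PCM音频数据 - 简书):

Sample Rate : Sampling frequency, e.g. 8 kHz (telephone), 44.1 kHz (CD), 48 kHz (DVD).
Sample Size : Quantization bit depth. Usually 16-bit.
Number of Channels : Channel count. The common types are stereo and mono; stereo has a left and a right channel. Other, less common types such as surround sound also exist.
Sign : Whether sample values are signed. For a one-byte sample, the signed range is -128 to 127 and the unsigned range is 0 to 255.
Byte Ordering : Whether the byte order is little-endian or big-endian. It is usually little-endian.
Integer Or Floating Point : Integer or floating point. Most PCM formats store samples as integers; applications that need higher precision store them as floating point.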
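To make the relationship between these parameters concrete, here is a minimal sketch that derives frame size, byte rate, and duration from sample rate, sample size, and channel count. The PcmFormat type and the helper names are hypothetical and are not part of the article's code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type, not part of the article's code.
struct PcmFormat {
    int sampleRate;     // samples per second per channel, e.g. 44100
    int bitsPerSample;  // sample size, e.g. 16
    int channels;       // 1 = mono, 2 = stereo
};

// Bytes occupied by one sample frame (one sample for every channel).
inline int bytesPerFrame(const PcmFormat &f) {
    assert(f.bitsPerSample % 8 == 0 && f.channels > 0);
    return f.bitsPerSample / 8 * f.channels;
}

// Bytes of raw PCM produced per second of audio.
inline std::int64_t bytesPerSecond(const PcmFormat &f) {
    return static_cast<std::int64_t>(f.sampleRate) * bytesPerFrame(f);
}

// Duration in milliseconds of a raw PCM buffer of the given size.
inline std::int64_t durationMs(const PcmFormat &f, std::int64_t byteCount) {
    return byteCount * 1000 / bytesPerSecond(f);
}
```

For CD-quality audio (44.1 kHz, 16-bit, stereo) one frame is 4 bytes, so one second of raw PCM occupies 176400 bytes.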

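This article uses resampling to convert the data in an audio file into PCM data with a specified set of parameters. In FFmpeg, resampling is provided by libswresample, a highly optimized module for converting the sample rate, channel layout, or sample format of audio. Reading a file's PCM data directly, without conversion, is awkward because there are many formats to handle; after resampling to one known format, further processing such as drawing a waveform is much easier. The demo in this article draws the converted data as a waveform.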

Reference example: ffmpeg-4.2.4\doc\examples\resampling_audio.c

Reference blog: FFmpeg音频重采样API(libswresample) - 简书

Reference blog: https://segmentfault.com/a/1190000025145553

Code for this article (FFmpeg libraries not included): MyTestCode/Qt/GetAudioInfo at master · gongjianbo/MyTestCode · GitHub

Project download on CSDN: GetAudioInfo_VS2017x64.rar_qt使用ffmpeg获取pcm数据-编解码文档类资源-CSDN下载

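(2021-04-01) Previously, when exporting multi-channel audio, only a single channel's data was written; this has been corrected. (The original design was meant to split a stereo source into two mono outputs.)

(2021-12-27) Issue 1: the bit rate used to be read from AVFormatContext. For a video file that value is not the audio bit rate, so it is now read from AVCodecContext. Some files carry no bit rate in AVCodecContext, and in that case the AVFormatContext value is used as a fallback. Issue 2: the buffer resize did not include the channel count in its calculation. The preset buffer is fairly large, though, so the resize path is rarely reached.

(2022-08-25) Fixed sample precision being read incorrectly, e.g. for 24-bit audio. The AVSampleFormat enum that was read before only covers a few fixed precisions, so the actual precision is now obtained through the AVCodecParameters struct.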

1. Main APIs

swr_alloc_set_opts

This function is equivalent to swr_alloc followed by setting the common options: it allocates a SwrContext if needed and sets its parameters (swr_init must still be called afterwards). For the input parameters, take the values from the input decoder's AVCodecContext. The output parameters are yours to choose, which is how the format conversion is achieved.

/**
 * @param s               existing resampling context, or NULL
 * @param out_ch_layout   output channel layout (AV_CH_LAYOUT_*)
 * @param out_sample_fmt  output sample format, e.g. signed 16-bit (AV_SAMPLE_FMT_*)
 * @param out_sample_rate output sample rate (frequency in Hz)
 * @param in_ch_layout    input channel layout (AV_CH_LAYOUT_*)
 * @param in_sample_fmt   input sample format (AV_SAMPLE_FMT_*)
 * @param in_sample_rate  input sample rate (frequency in Hz)
 * @param log_offset      logging level offset
 * @param log_ctx         parent logging context, can be NULL
 *
 * @return the resampling context, or NULL on failure
 */
struct SwrContext *swr_alloc_set_opts(struct SwrContext *s,
    int64_t out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
    int64_t  in_ch_layout, enum AVSampleFormat  in_sample_fmt, int  in_sample_rate,
    int log_offset, void *log_ctx);
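The two channel-layout arguments are bitmasks in which each set bit marks one speaker position. The sketch below illustrates the idea without the FFmpeg headers: the constants are local stand-ins whose values are assumed to match libavutil's AV_CH_FRONT_LEFT, AV_CH_FRONT_RIGHT and AV_CH_FRONT_CENTER, and defaultLayout is a simplified, hypothetical counterpart of av_get_default_channel_layout that only knows mono and stereo.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for FFmpeg's AV_CH_* speaker bits (values assumed to match libavutil).
constexpr std::uint64_t kFrontLeft   = 0x1;
constexpr std::uint64_t kFrontRight  = 0x2;
constexpr std::uint64_t kFrontCenter = 0x4;

constexpr std::uint64_t kLayoutMono   = kFrontCenter;
constexpr std::uint64_t kLayoutStereo = kFrontLeft | kFrontRight;

// Number of channels in a layout = number of set bits in the mask.
inline int channelCount(std::uint64_t layout) {
    int n = 0;
    for (; layout != 0; layout &= layout - 1) ++n;  // clear the lowest set bit each pass
    return n;
}

// Simplified default layout for a channel count; 0 means "unknown" in this sketch.
inline std::uint64_t defaultLayout(int channels) {
    assert(channels >= 0);
    switch (channels) {
    case 1: return kLayoutMono;
    case 2: return kLayoutStereo;
    default: return 0;
    }
}
```

In the article's code the real av_get_default_channel_layout is called with the input and output channel counts to produce these masks.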

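swr_convert

This function performs the resampling conversion. Before converting, the source data must be decoded with av_read_frame + avcodec_send_packet + avcodec_receive_frame. Once a frame is decoded, pass the AVFrame's data as the input buffer and the AVFrame's nb_samples as the input sample count.

The output parameters need the most attention. The output buffer is an array of pointers. With a planar layout, the left and right channels are written to two separate arrays, e.g. [0]=LLLL [1]=RRRR. With a packed layout only array [0] is used, and for stereo the left and right samples are interleaved, [0]=LRLRLRLR. How large should the buffer be? Use av_rescale_rnd or swr_get_out_samples to estimate the number of samples after conversion (the estimate is generally a little larger than the actual count), then multiply by the channel count and the bytes per sample to get the required buffer size. If the input supplies more samples than the output buffer can take, SwrContext buffers the remainder. In that case, passing 0 as the second argument of swr_get_out_samples returns the amount of buffered data, and passing NULL and 0 as the last two arguments of swr_convert retrieves it.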
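The difference between the planar and packed layouts can be shown with a small stand-alone sketch that interleaves planar 16-bit channels into packed order. The function name planarToPacked is hypothetical; in the article this reordering is left to libswresample by choosing a packed output format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave planar 16-bit channels ([0]=LLLL, [1]=RRRR) into packed order (LRLRLRLR).
inline std::vector<std::int16_t>
planarToPacked(const std::vector<std::vector<std::int16_t>> &planes) {
    assert(!planes.empty());
    const std::size_t channels = planes.size();
    const std::size_t samples = planes[0].size();
    std::vector<std::int16_t> packed(channels * samples);
    for (std::size_t ch = 0; ch < channels; ++ch) {
        assert(planes[ch].size() == samples);  // every plane must have the same length
        for (std::size_t i = 0; i < samples; ++i)
            packed[i * channels + ch] = planes[ch][i];
    }
    return packed;
}
```

Sample i of channel ch ends up at index i * channels + ch, which is exactly the layout a WAV data chunk expects.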

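/**
 * @param s         valid (allocated and initialized) resampling context
 * @param out       output buffers; for packed audio only the first one needs to be set
 * @param out_count space available for output, in samples per channel
 * @param in        input buffers
 * @param in_count  number of input samples per channel
 *
 * @return number of samples output per channel, negative value on error
 */
int swr_convert(struct SwrContext *s, uint8_t **out, int out_count,
                const uint8_t **in, int in_count);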
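The buffer sizing described for swr_convert comes down to two small formulas. The sketch below reproduces the arithmetic behind the av_rescale_rnd approach, rounding up, without calling FFmpeg. Both function names are hypothetical, and delayedSamples stands for the value swr_get_delay would report in input-rate units.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on output samples per channel: ceil((delayed + in) * outRate / inRate).
inline std::int64_t maxOutSamples(std::int64_t delayedSamples, std::int64_t inSamples,
                                  int inRate, int outRate) {
    assert(inRate > 0 && outRate > 0);
    const std::int64_t total = delayedSamples + inSamples;
    return (total * outRate + inRate - 1) / inRate;  // integer ceiling division
}

// Bytes needed to hold that many samples for every channel.
inline std::int64_t outBufferBytes(std::int64_t samplesPerChannel, int channels,
                                   int bytesPerSample) {
    assert(channels > 0 && bytesPerSample > 0);
    return samplesPerChannel * channels * bytesPerSample;
}
```

For example, a 1024-sample frame at 44.1 kHz resampled to 16 kHz needs room for at most 372 samples per channel, which is 1488 bytes for 16-bit stereo.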

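2. Main code

Because I wrapped the logic in two classes (EasyAudioContext and EasyAudioDecoder), the code is fairly long. For the complete code see the link at the top of the article; it is pasted here mainly so that I can read it online later.

(The demo itself is simple; it mostly just calls the relevant FFmpeg APIs.)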

#pragma once
#include <QString>
#include <QSharedPointer>
// Including the FFmpeg headers from this header is just a shortcut
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavformat/avio.h>
#include <libswresample/swresample.h>
#include <libavutil/frame.h>
#include <libavutil/mem.h>
}

/**
 * @brief Stores audio format and related information
 */
struct EasyAudioInfo
{
    bool valid = false;
    // Information about the audio file
    QString filepath;
    QString filename;
    QString encode;
    int sampleRate = 0;
    int channels = 0;
    int sampleBit = 0;
    qint64 duration = 0;
    qint64 bitRate = 0;
    qint64 size = 0;
    QString type;
};

/**
 * @brief Stores the parameters used for input or output
 */
struct EasyAudioParameter
{
    // Channel count
    int channels = 1;
    // Sample storage format, enum AVSampleFormat; AV_SAMPLE_FMT_S16 = 1
    AVSampleFormat sampleFormat = AV_SAMPLE_FMT_S16;
    // Sample rate
    int sampleRate = 16000;
};

/**
 * @brief Manages the audio context; can also be used to query audio format information
 * @author 龚建波
 * @date 2020-11-20
 * @details
 * Copy and assignment are disabled; manage instances with a smart pointer
 * when they need to be passed as arguments
 * (NULL is used instead of nullptr to stay consistent with C)
 *
 * Memory management references:
 * https://www.jianshu.com/p/9f45d283d904
 * https://blog.csdn.net/leixiaohua1020/article/details/41181155
 *
 * Audio information references:
 * https://blog.csdn.net/zhoubotong2012/article/details/79340722
 * https://blog.csdn.net/luotuo44/article/details/54981809
 */
class EasyAudioContext
{
private:
    // Parse status; only Success means the context is usable
    enum EasyState {
        None,                   // invalid
        Success,                // parsed successfully
        NoFile,                 // file does not exist
        FormatOpenError,        // failed to open the file
        FindStreamError,        // failed to read stream info
        NoAudio,                // no audio stream found
        CodecFindDecoderError,  // no decoder found
        CodecAllocContextError, // failed to allocate the decoder context
        ParameterError,         // failed to fill the decoder context
        CodecOpenError          // failed to open the decoder
    };
public:
    // Create a context object for the given file
    explicit EasyAudioContext(const QString &filepath);
    // Copy and assignment are disabled; manage with a smart pointer
    EasyAudioContext(const EasyAudioContext &other) = delete;
    EasyAudioContext &operator =(const EasyAudioContext &other) = delete;
    EasyAudioContext(EasyAudioContext &&other) = delete;
    EasyAudioContext &operator =(EasyAudioContext &&other) = delete;
    // Resources are released in the destructor
    ~EasyAudioContext();

    // Whether this is a valid context
    bool isValid() const;
    // Audio format information for this context
    EasyAudioInfo getAudioInfo() const;
    // Parameter information for this context
    EasyAudioParameter getAudioParameter() const;

private:
    // Initialize the context from a file
    void init(const QString &filepath);
    // Release resources
    void free();

private:
    // Source file path
    QString srcpath;
    // Whether the context is valid; invalid by default
    EasyState status = None;
    // Format I/O context
    AVFormatContext *formatCtx = NULL;
    // Decoder
    AVCodec *codec = NULL;
    // Decoder context
    AVCodecContext *codecCtx = NULL;
    // Codec parameters
    AVCodecParameters *codecParam = NULL;
    // Index of the audio stream
    int streamIndex = -1;

    // Lets the friend class access the private members
    friend class EasyAudioDecoder;
};

#include "EasyAudioContext.h"

#include <QFileInfo>
#include <QDebug>

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavformat/avio.h>
#include "libswresample/swresample.h"
#include <libavutil/frame.h>
#include <libavutil/mem.h>
}

EasyAudioContext::EasyAudioContext(const QString &filepath)
{
    init(filepath);
}

EasyAudioContext::~EasyAudioContext()
{
    free();
}

bool EasyAudioContext::isValid() const
{
    return (status==EasyState::Success);
}

EasyAudioInfo EasyAudioContext::getAudioInfo() const
{
    EasyAudioInfo info;
    // Copy over the format information we need
    info.valid = isValid();
    info.filepath = srcpath;
    QFileInfo f_info(srcpath);
    info.filename = f_info.fileName();
    info.size = f_info.size();
    if(!isValid())
        return info;

    info.encode = codec->name;
    info.sampleRate = codecParam->sample_rate; // Hz
    info.channels = codecParam->channels;
    // 2022-08-25: the precision read before was not the file's actual precision,
    // so 24-bit and similar formats were not recognized correctly
    //info.sampleBit = (av_get_bytes_per_sample(codecCtx->sample_fmt)<<3);  // bit
    info.sampleBit = av_get_bits_per_sample(codecParam->codec_id);
    //if (codecCtx && codecCtx->bits_per_raw_sample > 0) {
    //    info.sampleBit = codecCtx->bits_per_raw_sample;
    //}
    info.duration = formatCtx->duration/(AV_TIME_BASE/1000.0);  // ms
    // 2020-12-31: an ape file reported a codec bit rate of 0;
    // when the value is invalid, use the container bit rate instead
    info.bitRate = codecCtx->bit_rate<1?formatCtx->bit_rate:codecCtx->bit_rate; // bps
    info.type = formatCtx->iformat->name;
    return info;
}

EasyAudioParameter EasyAudioContext::getAudioParameter() const
{
    EasyAudioParameter param;
    if(!isValid())
        return param;
    param.channels=codecCtx->channels;
    param.sampleFormat=codecCtx->sample_fmt;
    param.sampleRate=codecCtx->sample_rate;
    return param;
}

void EasyAudioContext::init(const QString &filepath)
{
    srcpath=filepath;
    if(!QFileInfo::exists(filepath)){
        status=EasyState::NoFile;
        return;
    }

    // FFmpeg uses UTF-8 by default, so convert the path
    QByteArray temp=filepath.toUtf8();
    const char *path=temp.constData();
    //const char *filepath="D:/Download/12.wav";
    // Open the input stream and read the header
    // The stream must be closed with avformat_close_input
    // Returns 0 on success
    const int result=avformat_open_input(&formatCtx, path, NULL, NULL);
    if (result!=0||formatCtx==NULL){
        status=EasyState::FormatOpenError;
        return;
    }

    // Read the file to get stream info and store it in the AVFormatContext
    // Returns >=0 on success
    if (avformat_find_stream_info(formatCtx, NULL) < 0) {
        status=EasyState::FindStreamError;
        return;
    }

    //qDebug()<<"filepath"<<filepath;
    // duration/AV_TIME_BASE is in seconds
    //qDebug()<<"duration"<<formatCtx->duration/(AV_TIME_BASE/1000.0)<<"ms";
    // File format, e.g. wav
    //qDebug()<<"format"<<formatCtx->iformat->name<<":"<<formatCtx->iformat->long_name;
    //qDebug()<<"bitrate"<<formatCtx->bit_rate<<"bps";
    //qDebug()<<"n stream"<<formatCtx->nb_streams;

    status=EasyState::NoAudio;
    for (unsigned int i = 0; i < formatCtx->nb_streams; i++)
    {
#if 1
        // AVStream stores the information of each video/audio stream
        AVStream *in_stream = formatCtx->streams[i];

        // The stream is audio
        if(in_stream->codecpar->codec_type == AVMEDIA_TYPE_AUDIO){
            codecParam = in_stream->codecpar;
            streamIndex = i;
            // Find a registered decoder with a matching codec ID
            // Returns NULL on failure
            codec = avcodec_find_decoder(in_stream->codecpar->codec_id);
            if(codec==NULL){
                status=EasyState::CodecFindDecoderError;
                return;
            }

            // Allocate an AVCodecContext and set its fields to default values
            // The result must be released with avcodec_free_context
            // Returns NULL on failure
            codecCtx = avcodec_alloc_context3(codec);
            if(codecCtx==NULL){
                status=EasyState::CodecAllocContextError;
                return;
            }

            // Fill the context from the stream's codec parameters
            // codecpar holds most of the decoder-related information; this copies it
            // directly from AVCodecParameters into AVCodecContext
            // Returns >=0 on success
            if(avcodec_parameters_to_context(codecCtx, in_stream->codecpar)<0){
                status=EasyState::ParameterError;
                return;
            }

            // Accessor for some AVCodecContext fields, deprecated
            //av_codec_set_pkt_timebase(codec_ctx, in_stream->time_base);
            // Without this line you get: Could not update timestamps for skipped samples
            codecCtx->pkt_timebase = formatCtx->streams[i]->time_base;

            // Open the decoder (associates the decoder with the decoder context)
            // Initializes the AVCodecContext with the given AVCodec
            // The context must have been allocated with avcodec_alloc_context3() first
            // Returns 0 on success
            if(avcodec_open2(codecCtx, codec, nullptr)!=0){
                status=EasyState::CodecOpenError;
                return;
            }

            // Sample rate
            //qDebug()<<"sample rate"<<codecCtx->sample_rate;
            // Codec, e.g. pcm
            //qDebug()<<"codec name"<<codec->name<<":"<<codec->long_name;
            status=EasyState::Success;
            return;
        }
#else
        // This way of accessing the codec is deprecated in newer versions
        AVStream *in_stream = fmt_ctx->streams[i];
        AVCodecContext *avctx=in_stream->codec;
        if (avctx->codec_type == AVMEDIA_TYPE_VIDEO){
            // Video info omitted
        }else if (avctx->codec_type == AVMEDIA_TYPE_AUDIO){
            // Audio info
            qDebug()<<"sample rate"<<in_stream->codec->sample_rate;
            AVCodec *codec=avcodec_find_decoder(avctx->codec_id);
            if(codec==NULL){
                return;
            }
            qDebug()<<"codec name"<<codec->name<<":"<<codec->long_name;
            return;
        }
#endif
    }
}

void EasyAudioContext::free()
{
    if(codecCtx){
        // Do not call avcodec_close directly; avcodec_free_context also
        // releases everything else attached to the codec context
        avcodec_free_context(&codecCtx);
    }
    if(formatCtx){
        // avformat_close_input already calls avformat_free_context internally
        avformat_close_input(&formatCtx);
        //avformat_free_context(formatCtx);
    }
    codec=NULL;
    codecCtx=NULL;
    codecParam=NULL;
    formatCtx=NULL;
}
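The two conversions in getAudioInfo above, container duration to milliseconds and the bit-rate fallback from the 2021-12-27 note, can be isolated as a stand-alone sketch. The constant kTimeBase is a local stand-in for FFmpeg's AV_TIME_BASE (1,000,000), and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// FFmpeg's AV_TIME_BASE is 1,000,000, i.e. container durations are in microseconds.
// Defined locally so the sketch builds without the FFmpeg headers.
constexpr std::int64_t kTimeBase = 1000000;

// Convert a container duration (AV_TIME_BASE units) to milliseconds.
inline std::int64_t durationToMs(std::int64_t duration) {
    assert(duration >= 0);  // an unknown duration is not handled in this sketch
    return duration / (kTimeBase / 1000);
}

// Prefer the codec-level bit rate; fall back to the container's when it is missing.
inline std::int64_t pickBitRate(std::int64_t codecBitRate, std::int64_t containerBitRate) {
    return codecBitRate < 1 ? containerBitRate : codecBitRate;
}
```

The sketch uses integer division, whereas the article divides by a double and lets the assignment to qint64 truncate the result.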

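#pragma once
#include <functional>
#include "EasyAudioContext.h"

/**
 * @brief WAV file header structure
 * @author 龚建波
 * @date 2020-11-12
 * @details
 * A WAV header is variable-length in general; a fairly simple fixed layout is used here
 * (values are stored little-endian; PCs are generally little-endian,
 * so this is not handled specially for now)
 * See: https://www.cnblogs.com/ranson7zop/p/7657874.html
 * See: https://www.cnblogs.com/Ph-one/p/6839892.html
 */
struct EasyWavHead
{
    char riffFlag[4]; // file identifier, uppercase "RIFF"
    // Total number of bytes from the start of the next field to the end of the file.
    // This value plus 8 is the actual length of the file.
    unsigned int riffSize; // data length
    char waveFlag[4]; // file format identifier, uppercase "WAVE"
    char fmtFlag[4]; // format chunk identifier, lowercase "fmt "
    unsigned int fmtSize; // format chunk length; can be 16, 18, 20, 40, etc.
    unsigned short compressionCode; // encoding format code, 1 = PCM
    unsigned short numChannels; // channel count
    unsigned int sampleRate; // sample rate
    // Equals: channels × sample rate × bits per sample / 8.
    // Players can use this value to estimate the buffer size.
    unsigned int bytesPerSecond; // byte rate (data transfer rate)
    // Sample frame size. Equals: channels × bits per sample / 8.
    // Players process data in multiples of this size and use it to align buffers.
    unsigned short blockAlign; // block alignment unit
    // Number of bits used to store each sample. Common values: 4, 8, 12, 16, 24, 32
    unsigned short bitsPerSample; // sample size (bit depth)
    char dataFlag[4]; // marks the start of the data, lowercase "data"
    unsigned int dataSize; // length of the data section
};

/**
 * @brief Handles audio decoding
 * @author 龚建波
 * @date 2020-11-24
 * @details
 * The main job is decoding audio data into PCM
 * (this class only handles decoding; it contains no multithreading)
 *
 * Test files:
 * https://samples.mplayerhq.hu/A-codecs/
 * Music downloads:
 * http://www.musictool.top/
 * Resampling references:
 * ffmpeg-4.2.4\doc\examples\resampling_audio.c
 * https://www.jianshu.com/p/bf5e54f553a4
 * https://segmentfault.com/a/1190000025145553
 * https://blog.csdn.net/bixinwei22/article/details/86545497
 * https://blog.csdn.net/zhuweigangzwg/article/details/53395009
 */
class EasyAudioDecoder
{
public:
    // Decode a single audio file and return its PCM data
    // filepath: file path
    // params: parameters of the target format
    // A returned size of 0 means the conversion failed
    static QByteArray toPcmData(const QString &filepath,
                                const EasyAudioParameter &params);

    // Transcode an audio file to a wav (pcm) file with the given parameters
    // srcpath: source file path
    // dstpath: destination file path
    // params: parameters of the target format
    // Returns false if the conversion failed
    static bool transcodeToWav(const QString &srcpath,
                               const QString &dstpath,
                               const EasyAudioParameter &params);

    // Transcode several files and concatenate them into one wav (pcm) file
    //bool stitchingToWav(const QList<QString> &srcpaths,
    //                    const QString &dstpath,
    //                    const EasyAudioParameter &params);

    // Get PCM data (resampling with libswresample)
    // contextPtr: context pointer
    // params: parameters of the target format; invalid fields fall back to the source's values
    // callBack: synchronous callback invoked during conversion
    //   It is called for each chunk of converted output; if it returns false,
    //   the whole toPcm call is treated as failed and returns false
    //   Argument 1 is the output buffer address, argument 2 the number of valid output bytes
    // return false: the conversion failed
    static bool toPcm(const QSharedPointer<EasyAudioContext> &contextPtr,
                      const EasyAudioParameter &params,
                      std::function<bool(const char* outData,int outSize)> callBack);
    // Convenience wrapper around the smart-pointer version
    static bool toPcm(const QString &filepath,
                      const EasyAudioParameter &params,
                      std::function<bool(const char* outData,int outSize)> callBack);

    // Build the wav (pcm) file header
    //
    // sampleRate: sample rate
    // channels: channel count, usually 1
    // sampleFormat: AVSampleFormat enum value
    // dataSize: PCM data length in bytes
    // return EasyWavHead: the wav header
    static EasyWavHead createWavHead(int sampleRate,
                                     int channels,
                                     AVSampleFormat sampleFormat,
                                     unsigned int dataSize);

    // Validate the export parameters; any invalid field is replaced by
    // the corresponding field of the input parameters
    static EasyAudioParameter getOutParameter(const EasyAudioParameter &inParams,
                                              const EasyAudioParameter &outParams);
};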
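The toPcm interface declared above pushes data to the caller through a std::function that can abort the run by returning false. The sketch below shows that control flow with plain strings standing in for decoded PCM chunks. The names streamChunks and collectAll are hypothetical; collectAll mirrors what toPcmData does with its QByteArray.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using ChunkCallback = std::function<bool(const char *data, int size)>;

// Hypothetical producer: hands each chunk to the callback and stops on the first false.
inline bool streamChunks(const std::vector<std::string> &chunks, const ChunkCallback &cb) {
    assert(static_cast<bool>(cb));  // calling an empty std::function would throw
    for (const std::string &chunk : chunks) {
        if (chunk.empty())
            continue;               // nothing to deliver for this chunk
        if (!cb(chunk.data(), static_cast<int>(chunk.size())))
            return false;           // the consumer asked to abort
    }
    return true;
}

// Consumer in the style of toPcmData: append everything, clear the result on failure.
inline std::string collectAll(const std::vector<std::string> &chunks) {
    std::string out;
    const bool ok = streamChunks(chunks, [&out](const char *data, int size) {
        out.append(data, static_cast<std::size_t>(size));
        return true;
    });
    if (!ok)
        out.clear();
    return out;
}
```

Keeping the sink behind a callback is what lets the same decoding loop feed either an in-memory buffer or a file writer.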

#include "EasyAudioDecoder.h"

#include <cstring>
#include <QFileInfo>
#include <QFile>
#include <QDir>
#include <QScopeGuard>
#include <QDebug>

QByteArray EasyAudioDecoder::toPcmData(const QString &filepath, const EasyAudioParameter &params)
{
    // Holds the decoded PCM data
    QByteArray pcm_data;

    // Callback passed to toPcm
    std::function<bool(const char*,int)> call_back=[&](const char* pcmData,int pcmSize)
    {
        pcm_data.append(pcmData, pcmSize);
        return true;
    };

    if(!toPcm(filepath,params,call_back))
        pcm_data.clear();
    return pcm_data;
}

bool EasyAudioDecoder::transcodeToWav(const QString &srcpath, const QString &dstpath, const EasyAudioParameter &params)
{
    qDebug()<<"transcodeToWav begin. src"<<srcpath<<"dst"<<dstpath;
    bool trans_result=false;
    QFileInfo src_info(srcpath);
    QFileInfo dst_info(dstpath);
    if(!src_info.exists())
        return trans_result;
    // Make sure the directory exists (QFile cannot create directories)
    if(dst_info.dir().exists()||dst_info.dir().mkpath(dst_info.absolutePath())){
        EasyWavHead head;
        QFile dst_file(dstpath);
        if(dst_file.open(QIODevice::WriteOnly)){
            // Write a placeholder header first, then the data, then seek to 0 and overwrite the header
            dst_file.write((const char *)&head,sizeof(head));
            // Cache PCM data and write it to the file once it reaches a certain size
            QByteArray pcm_temp;
            // Total data size
            unsigned int size_count=0;
            // Callback passed to toPcm
            std::function<bool(const char*,int)> call_back=[&](const char* pcmData,int pcmSize)
            {
                // Writing a little at a time is slow
                //dst_file.write(pcmData,pcmSize);
                pcm_temp.append(pcmData,pcmSize);
                size_count+=pcmSize;
                // Write in 10 MB batches
                if(pcm_temp.count()>1024*1024*10){
                    dst_file.write(pcm_temp);
                    pcm_temp.clear();
                }
                return true;
            };

            trans_result=toPcm(srcpath,params,call_back);
            if(trans_result){
                // Write whatever is left in the cache
                dst_file.write(pcm_temp);
                EasyAudioContext context{srcpath};
                if(context.isValid()){
                    EasyAudioParameter in_param=context.getAudioParameter();
                    EasyAudioParameter out_param=getOutParameter(in_param,params);
                    head=createWavHead(out_param.sampleRate,out_param.channels,
                                       out_param.sampleFormat,size_count);
                    // Overwrite the header
                    dst_file.seek(0);
                    dst_file.write((const char *)&head,sizeof(head));
                }else{
                    trans_result=false;
                }
            }
            dst_file.close();
        }
        // Remove the file if the conversion failed
        if(!trans_result){
            dst_file.remove();
        }
    }
    qDebug()<<"transcodeToWav end."<<trans_result;
    return trans_result;
}

bool EasyAudioDecoder::toPcm(const QSharedPointer<EasyAudioContext> &contextPtr,
                             const EasyAudioParameter &params,
                             std::function<bool (const char *, int)> callBack)
{
    // Invalid context
    if(contextPtr.isNull()||!contextPtr->isValid())
        return false;

    // Describes stored compressed data
    // For video it usually holds one compressed frame; for audio it may hold several
    AVPacket *packet = NULL;
    // Describes raw (decoded) data
    AVFrame *frame = NULL;
    // Resampling context
    SwrContext *swr_ctx = NULL;
    // Output buffer used during conversion
    int out_bufsize=1024*1024*2; // 1 MB per channel by default
    uint8_t *out_buffer=new uint8_t[out_bufsize];
    // Only two plane pointers are prepared, so planar output is limited to two channels
    uint8_t *out_buffer_arr[2] = {out_buffer,out_buffer+out_bufsize/2};

    // Without a variable to hold it, the guard would run immediately
    auto clean=qScopeGuard([&]{
        if(frame){
            av_frame_unref(frame);
            av_frame_free(&frame);
        }
        if(packet){
            // av_free_packet has been replaced by av_packet_unref
            av_packet_unref(packet);
            av_packet_free(&packet);
        }
        if(swr_ctx){
            swr_close(swr_ctx);
            swr_free(&swr_ctx);
        }
        if (out_buffer)
            delete [] out_buffer;
        qDebug()<<"toPcm clean";
    });
    Q_UNUSED(clean);

    packet=av_packet_alloc();
    av_init_packet(packet);
    frame=av_frame_alloc();

    // Source format parameters
    EasyAudioParameter in_param=contextPtr->getAudioParameter();
    const int in_channels = in_param.channels;
    const AVSampleFormat in_sample_fmt=in_param.sampleFormat;
    const int in_sample_rate=in_param.sampleRate;
    // Target format parameters (invalid fields keep the input format)
    EasyAudioParameter out_param=getOutParameter(in_param,params);
    const int out_channels = out_param.channels;
    const AVSampleFormat out_sample_fmt=out_param.sampleFormat;
    const int out_sample_rate=out_param.sampleRate;
    // Distinguish planar from packed
    const bool out_is_planar=(out_sample_fmt>AV_SAMPLE_FMT_DBL&&out_sample_fmt!=AV_SAMPLE_FMT_S64);
    // Bytes per sample, S16 = 2 bytes
    const int sample_bytes=av_get_bytes_per_sample(out_sample_fmt);

    // Allocate a SwrContext and set/reset the common parameters
    // Returns NULL on failure, otherwise the allocated context
    // (the current requirement defaults to mono 16-bit; only the sample rate is set)
    swr_ctx=swr_alloc_set_opts(NULL, // existing swr context, NULL if none
                               av_get_default_channel_layout(out_channels), // output channel layout (AV_CH_LAYOUT_*)
                               out_sample_fmt, // output sample format (AV_SAMPLE_FMT_*)
                               out_sample_rate, // output sample rate (frequency in Hz)
                               av_get_default_channel_layout(in_channels), // input channel layout (AV_CH_LAYOUT_*)
                               in_sample_fmt, // input sample format (AV_SAMPLE_FMT_*)
                               in_sample_rate, // input sample rate (frequency in Hz)
                               0, NULL); // logging, omitted
    if(swr_ctx==NULL)
        return false;

    // Initialize
    // To change conversion parameters, set them and call swr_init again
    if(swr_init(swr_ctx)<0)
        return false;

    int ret=0;
    // av_read_frame advances to the next frame, so seek to the start first to allow re-entry
    // Arg 1: the format context
    // Arg 2: stream index; with -1 a default stream is chosen and the timestamp is
    //        converted automatically from AV_TIME_BASE units to that stream's time base
    // Arg 3: target timestamp, in stream time_base units, or AV_TIME_BASE units if no stream is given
    // Arg 4: seek flags
    //   AVSEEK_FLAG_BACKWARD seeks to the nearest keyframe before the requested timestamp
    //   AVSEEK_FLAG_BYTE seeks by byte position
    //   AVSEEK_FLAG_ANY may seek to any frame, not necessarily a keyframe, which can corrupt video output
    //   AVSEEK_FLAG_FRAME seeks by frame number
    // Returns >=0 on success
    if(av_seek_frame(contextPtr->formatCtx,-1,0,AVSEEK_FLAG_ANY)<0)
        return false;

    // av_read_frame fetches the next frame of the stream; read in a loop
    // Returns 0 on success, <0 on error or end of file
    while (av_read_frame(contextPtr->formatCtx, packet)>=0)
    {
        // Audio packets only
        if (packet->stream_index == contextPtr->streamIndex)
        {
            // Feed raw packet data to the decoder (queue the packet for decoding)
            // Returns 0 on success
            ret=avcodec_send_packet(contextPtr->codecCtx, packet);
            if(ret!=0){
                // Release the packet before skipping it; otherwise its buffer may leak
                av_packet_unref(packet);
                continue;
            }

            // Pull decoded frames from the decoder in a loop
            // Returns 0 on success
            while (avcodec_receive_frame(contextPtr->codecCtx, frame)==0)
            {
                // Upper bound on the next number of output samples
                // swr_get_out_samples seems to give slightly more than av_rescale_rnd,
                // and both give a little more than the number of samples actually produced
                const int out_samples=swr_get_out_samples(swr_ctx,frame->nb_samples);
                //const int out_samples=av_rescale_rnd(swr_get_delay(swr_ctx, in_sample_rate)+
                //                                        frame->nb_samples,
                //                                        out_sample_rate,
                //                                        contextPtr->codecCtx->sample_rate,
                //                                        AV_ROUND_ZERO);
                //qDebug()<<out_samples<<out_bufsize<<sample_bytes*out_samples*out_channels;

                // If the buffer is too small, grow it to 1.5 times the computed size
                if(out_bufsize<sample_bytes*out_samples*out_channels){
                    delete[] out_buffer;
                    out_bufsize=sample_bytes*out_samples*out_channels*1.5;
                    out_buffer=new uint8_t[out_bufsize];
                    out_buffer_arr[0]=out_buffer;
                    out_buffer_arr[1]=out_buffer+out_bufsize/2;
                }

                // Resampling conversion
                // If more samples are passed in than can be passed out, SwrContext buffers the rest
                // To drain it, pass 0 as the second argument of swr_get_out_samples to get the size,
                // and pass NULL/0 as the last two arguments of swr_convert to get the data
                // swr_get_out_samples gives the upper bound on the samples the next swr_convert call
                // can output for a given number of input samples, so enough space can be provided
                // Planar types such as AV_SAMPLE_FMT_S16P: data[0] is the left channel, data[1] the right
                // Packed types: both channels are stored in one array, LRLRLRLR..., only data[0] has data
                // Returns the number of samples output per channel, negative on error
                ret = swr_convert(swr_ctx, out_buffer_arr, out_samples,
                                  (const uint8_t **)frame->data,
                                  frame->nb_samples);
                if (ret <= 0) {
                    av_frame_unref(frame);
                    continue;
                }

                // Buffer size needed for the given audio parameters
                // = channels * samples * bits per sample / 8
                const int out_bufuse = av_samples_get_buffer_size(NULL, out_channels, ret, out_sample_fmt, 1);
                //qDebug()<<"out"<<out_bufuse<<"sample"<<ret<<"channel"<<out_channels<<sample_bytes*out_samples;

                // 2021-04-01 fix: multi-channel data was not handled on export, so the data length was wrong
                // (the original design exported stereo as two mono files)
                // Note: for planar output the planes are not contiguous in out_buffer,
                // so this single callback is only correct for packed formats
                if(out_bufuse > 0){
                    // If the callback returns false, the whole conversion fails
                    if(!callBack((const char*)out_buffer, out_bufuse)){
                        return false;
                    }
                }
                //if(out_channels==2)
                //{
                //    // For stereo, extract the left channel
                //    // Stereo distinguishes planar and packed
                //    if(out_is_planar){
                //        // planar: left and right are stored separately
                //        if(!callBack((const char*)out_buffer_arr[0], out_bufuse/2)){
                //            return false;
                //        }
                //    }else{
                //        // packed: everything is in [0], alternating left and right
                //        for(int i = 0; i < out_bufuse; i += sample_bytes*2)
                //        {
                //            // If the callback returns false, the whole conversion fails
                //            if(!callBack((const char*)out_buffer_arr[0] + i, sample_bytes)){
                //                return false;
                //            }
                //        }
                //    }
                //}else if(out_channels==1){
                //    // Mono data
                //    // If the callback returns false, the whole conversion fails
                //    if(!callBack((const char*)out_buffer_arr[0], out_bufuse)){
                //        return false;
                //    }
                //}

                av_frame_unref(frame);
            }
        }
        av_packet_unref(packet);
    }
    qDebug()<<"toPcm end";
    return true;
}

bool EasyAudioDecoder::toPcm(const QString &filepath, const EasyAudioParameter &params, std::function<bool (const char *, int)> callBack)
{
    return toPcm(QSharedPointer<EasyAudioContext>(new EasyAudioContext(filepath)),params,callBack);
}

EasyWavHead EasyAudioDecoder::createWavHead(int sampleRate, int channels, AVSampleFormat sampleFormat, unsigned int dataSize)
{
    const int bits=av_get_bytes_per_sample(sampleFormat)*8;
    const int head_size = sizeof(EasyWavHead);
    EasyWavHead wav_head;
    memset(&wav_head, 0, head_size);

    memcpy(wav_head.riffFlag, "RIFF", 4);
    memcpy(wav_head.waveFlag, "WAVE", 4);
    memcpy(wav_head.fmtFlag, "fmt ", 4);
    memcpy(wav_head.dataFlag, "data", 4);

    // Length excluding the first 8 bytes; the header is 44 bytes, so +44-8 = 36
    wav_head.riffSize = dataSize + 36;
    // Size of the fmt chunk body; 16 bytes for plain PCM
    wav_head.fmtSize = 16;
    // 1 = PCM
    wav_head.compressionCode = 0x01;
    wav_head.numChannels = channels;
    wav_head.sampleRate = sampleRate;
    wav_head.bytesPerSecond = (bits / 8) * channels * sampleRate;
    wav_head.blockAlign = (bits / 8) * channels;
    wav_head.bitsPerSample = bits;
    // Data length excluding the header
    wav_head.dataSize = dataSize;

    return wav_head;
}

EasyAudioParameter EasyAudioDecoder::getOutParameter(const EasyAudioParameter &inParams, const EasyAudioParameter &outParams)
{
    // If an export parameter is invalid, use the input data's parameter
    EasyAudioParameter param;
    param.channels=(outParams.channels<1)
            ?inParams.channels:outParams.channels;
    param.sampleFormat=(outParams.sampleFormat<=AV_SAMPLE_FMT_NONE||outParams.sampleFormat>=AV_SAMPLE_FMT_NB)
            ?inParams.sampleFormat:outParams.sampleFormat;
    param.sampleRate=(outParams.sampleRate<1)
            ?inParams.sampleRate:outParams.sampleRate;
    return param;
}
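createWavHead above relies on the struct layout and the host byte order matching the file format. As an alternative sketch, and not the author's method, the same 44-byte header can be serialized field by field in explicit little-endian order, which avoids both assumptions. The names WavHeader, putLE and makeWavHeader are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

using WavHeader = std::array<std::uint8_t, 44>;

// Write an unsigned value in little-endian order, one byte at a time.
inline void putLE(std::uint8_t *dst, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i)
        dst[i] = static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF);
}

// Build the 44-byte PCM header: RIFF descriptor, 16-byte fmt chunk, data chunk header.
inline WavHeader makeWavHeader(std::uint32_t sampleRate, std::uint16_t channels,
                               std::uint16_t bitsPerSample, std::uint32_t dataSize) {
    assert(bitsPerSample % 8 == 0 && channels > 0);
    const std::uint32_t blockAlign = static_cast<std::uint32_t>(channels) * (bitsPerSample / 8u);
    WavHeader h{};
    std::memcpy(&h[0], "RIFF", 4);
    putLE(&h[4], dataSize + 36, 4);             // file size minus the first 8 bytes
    std::memcpy(&h[8], "WAVE", 4);
    std::memcpy(&h[12], "fmt ", 4);
    putLE(&h[16], 16, 4);                       // fmt chunk body size for plain PCM
    putLE(&h[20], 1, 2);                        // format code 1 = integer PCM
    putLE(&h[22], channels, 2);
    putLE(&h[24], sampleRate, 4);
    putLE(&h[28], sampleRate * blockAlign, 4);  // bytes per second
    putLE(&h[32], blockAlign, 2);
    putLE(&h[34], bitsPerSample, 2);
    std::memcpy(&h[36], "data", 4);
    putLE(&h[40], dataSize, 4);
    return h;
}
```

Format code 1 describes integer PCM; floating-point sample data is normally tagged with format code 3, which neither this sketch nor createWavHead handles.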

#include "mainwindow.h"
#include "ui_mainwindow.h"

#include <QFileDialog>
#include <QPainter>
#include <QFileInfo>
#include <QDebug>

#include "EasyAudioContext.h"
#include "EasyAudioDecoder.h"

MainWindow::MainWindow(QWidget *parent)
    : QMainWindow(parent)
    , ui(new Ui::MainWindow)
{
    ui->setupUi(this);
    ui->boxFormat->setCurrentIndex(1);
    ui->lineEdit->setText(qApp->applicationDirPath()+"/wav_1ch_11.025K_16bit.wav");

    // Choose an audio file
    connect(ui->btnFile,&QPushButton::clicked,this,[this]{
        const QString filepath=QFileDialog::getOpenFileName(this);
        if(!filepath.isEmpty())
            ui->lineEdit->setText(filepath);
    });

    // Show audio information
    connect(ui->btnInfo,&QPushButton::clicked,this,[this]{
        const QString filepath=ui->lineEdit->text();
        QFileInfo info(filepath);
        if(!info.exists())
            return;
        EasyAudioContext context{filepath};
        if(context.isValid()){
            EasyAudioInfo audio_info=context.getAudioInfo();
            QString info_str="\nfilepath: "+audio_info.filepath
                    +"\nfilename: "+audio_info.filename
                    +"\nencode: "+audio_info.encode
                    +"\nsampleRate: "+QString::number(audio_info.sampleRate)+" Hz"
                    +"\nchannels: "+QString::number(audio_info.channels)
                    +"\nsampleBit: "+QString::number(audio_info.sampleBit)+" bit"
                    +"\nduration: "+QString::number(audio_info.duration)+" ms"
                    +"\nbitRate: "+QString::number(audio_info.bitRate)+" bps"
                    +"\nsize: "+QString::number(audio_info.size)+" byte"
                    +"\ntype: "+audio_info.type;
            ui->textEdit->append(info_str);
        }
    });

    // Convert to PCM data
    connect(ui->btnPcm,&QPushButton::clicked,this,[this]{
        const QString filepath=ui->lineEdit->text();
        QFileInfo info(filepath);
        if(!info.exists())
            return;
        //EasyAudioContext *context=new EasyAudioContext(filepath);
        EasyAudioDecoder decoder;
        EasyAudioParameter param;
        param.channels=ui->spinChannel->value();
        param.sampleFormat=AVSampleFormat(ui->boxFormat->currentIndex());
        param.sampleRate=ui->spinRate->value();
        isS16=(param.sampleFormat==AV_SAMPLE_FMT_S16||param.sampleFormat==AV_SAMPLE_FMT_S16P);
        // Re-entry test
        pcmData=decoder.toPcmData(filepath,param);
        qDebug()<<"pcm data size"<<pcmData.count();
        //pcmData=decoder.toPcmData(filepath,param);
        //qDebug()<<"redecode pcm data size"<<pcmData.count();
        update();
    });

    // Convert to a wav (pcm) file
    connect(ui->btnWav,&QPushButton::clicked,this,[this]{
        const QString filepath=ui->lineEdit->text();
        QFileInfo info(filepath);
        if(!info.exists())
            return;
        //EasyAudioContext *context=new EasyAudioContext(filepath);
        EasyAudioDecoder decoder;
        EasyAudioParameter param;
        param.channels=ui->spinChannel->value();
        param.sampleFormat=AVSampleFormat(ui->boxFormat->currentIndex());
        param.sampleRate=ui->spinRate->value();
        //qDebug()<<info.filePath()<<info.fileName()
        //       <<info.absoluteDir()<<info.absoluteFilePath()<<info.absolutePath();
        decoder.transcodeToWav(filepath,filepath+".wav",param);
    });
}

MainWindow::~MainWindow()
{
    delete ui;
}

void MainWindow::paintEvent(QPaintEvent *event)
{
    Q_UNUSED(event);
    QPainter painter(this);
    painter.fillRect(rect(),QColor(50,50,50));

    // Only 16-bit (short) data is drawn
    if(pcmData.count()<4||!isS16)
        return;
    painter.setPen(QColor(100,100,100));
    painter.translate(0,height()/2);
    const int length=pcmData.count()/2;
    // Only 16-bit data is drawn
    const short *datas=(const short *)pcmData.constData();
    // Horizontal spacing between points
    double xspace=width()/(double)length;
    // Sample step for drawing. A fixed value is used for testing; computing it for
    // large files was not worth the effort, so just avoid large files when testing
    const int step=1;
    //qDebug()<<"step"<<step;
    for(int i=0;i<length-step;i+=step)
    {
        painter.drawLine(xspace*i,-datas[i]/150,xspace*(i+step),-datas[i+step]/150);
    }
}
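The paintEvent above draws every sample with a fixed step of 1, which is why large files are slow to render. One common alternative, shown here only as a sketch and not as the article's implementation, is to reduce the samples to one (min, max) pair per pixel column before drawing. The function name columnPeaks is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// For each pixel column, return the (min, max) of the samples that fall into it.
inline std::vector<std::pair<std::int16_t, std::int16_t>>
columnPeaks(const std::vector<std::int16_t> &samples, std::size_t columns) {
    assert(columns > 0);
    std::vector<std::pair<std::int16_t, std::int16_t>> peaks;
    if (samples.empty())
        return peaks;
    columns = std::min(columns, samples.size());  // never more columns than samples
    peaks.reserve(columns);
    for (std::size_t c = 0; c < columns; ++c) {
        // Sample range [begin, end) covered by column c; never empty because columns <= size.
        const auto begin = static_cast<std::ptrdiff_t>(c * samples.size() / columns);
        const auto end = static_cast<std::ptrdiff_t>((c + 1) * samples.size() / columns);
        const auto mm = std::minmax_element(samples.begin() + begin, samples.begin() + end);
        peaks.emplace_back(*mm.first, *mm.second);
    }
    return peaks;
}
```

A paint routine could then draw one vertical line per column from the minimum to the maximum, so the drawing cost depends on the widget width rather than on the file length.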
