FFMpeg学习进阶：音频处理基础理论与重采样技术详解

作者：被抛弃的微博名 | 来源：互联网 | 2024-11-09 13:46

在Android平台中，播放音频的采样率通常固定为44.1kHz，而录音的采样率则固定为8kHz。为了确保音频设备的正常工作，底层驱动必须预先设定这些固定的采样率。当上层应用提供的采样率与这些预设值不匹配时，需要通过重采样（resample）技术来调整采样率，以保证音频数据的正确处理和传输。本文将详细探讨FFMpeg在音频处理中的基础理论及重采样技术的应用。

Android放音的采样率固定为44.1KHz，录音的采样率固定为8KHz，因此底层的音频设备驱动需要设置好这两个固定的采样率。如果上层传过来的采样率不符的话，需要进行resample重采样处理。

几个名词：

1. 采样率

采样设备每秒抽取样本的次数

2. 音频格式及量化精度（位宽）

每种音频格式有不同的量化精度（位宽），位数越多，表示值就越精确，声音表现自然就越精准。FFMpeg中音频格式有以下几种，每种格式有其占用的字节数信息：

enum AVSampleFormat { AV_SAMPLE_FMT_NONE = -1, AV_SAMPLE_FMT_U8, /// AV_SAMPLE_FMT_S16, /// AV_SAMPLE_FMT_S32, /// AV_SAMPLE_FMT_FLT, /// AV_SAMPLE_FMT_DBL, ///
    AV_SAMPLE_FMT_U8P,         /// AV_SAMPLE_FMT_S16P, /// AV_SAMPLE_FMT_S32P, /// AV_SAMPLE_FMT_FLTP, /// AV_SAMPLE_FMT_DBLP, /// AV_SAMPLE_FMT_S64, /// AV_SAMPLE_FMT_S64P, ///
    AV_SAMPLE_FMT_NB           ///};

3. 分片（plane）和打包（packed）

以双声道为例，带P（plane）的数据格式在存储时，其左声道和右声道的数据是分开存储的，左声道的数据存储在data[0]，右声道的数据存储在data[1]，每个声道的所占用的字节数为linesize[0]和linesize[1]；

不带P（packed）的音频数据在存储时，是按照LRLRLR...的格式交替存储在data[0]中，linesize[0]表示总的数据量。

4. 声道分布（channel_layout)

声道分布在FFmpeg\libavutil\channel_layout.h中有定义，一般来说用的比较多的是AV_CH_LAYOUT_STEREO（双声道）和AV_CH_LAYOUT_SURROUND（三声道），这两者的定义如下：

#define AV_CH_LAYOUT_STEREO            (AV_CH_FRONT_LEFT|AV_CH_FRONT_RIGHT)
#define AV_CH_LAYOUT_SURROUND          (AV_CH_LAYOUT_STEREO|AV_CH_FRONT_CENTER)

5. 音频帧的数据量计算

一帧音频的数据量=channel数 * nb_samples样本数 * 每个样本占用的字节数

如果该音频帧是FLTP格式的PCM数据，包含1024个样本，双声道，那么该音频帧包含的音频数据量是2*1024*4=8192字节。

6. 音频重采样（resample）

FFMpeg自带的resample例子：FFmpeg\doc\examples\resampling_audio.c，这里把最核心的resample代码贴一下，在工程中使用时，注意设置的各种参数，给定的输入数据都不能错。

int main(int argc, char **argv)
{
    // 设置数据源src和dst声道布局
    int64_t src_ch_layout = AV_CH_LAYOUT_STEREO, dst_ch_layout = AV_CH_LAYOUT_SURROUND;
    // 设置src和dst采样率
    int src_rate = 48000, dst_rate = 44100;
    uint8_t **src_data = NULL, **dst_data = NULL;
    int src_nb_channels = 0, dst_nb_channels = 0;
    int src_linesize, dst_linesize;
    int src_nb_samples = 1024, dst_nb_samples, max_dst_nb_samples;
    // 设置src和dst音频格式
    enum AVSampleFormat src_sample_fmt = AV_SAMPLE_FMT_DBL, dst_sample_fmt = AV_SAMPLE_FMT_S16;
    const char *dst_filename = NULL;
    FILE *dst_file;
    int dst_bufsize;
    const char *fmt;
    // 重采样上下文，包含resample信息
    struct SwrContext *swr_ctx;
    double t;
    int ret;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s output_file\n"
                "API example program to show how to resample an audio stream with libswresample.\n"
                "This program generates a series of audio frames, resamples them to a specified "
                "output format and rate and saves them to an output file named output_file.\n",
            argv[0]);
        exit(1);
    }
    // resample后的数据保存到本地文件
    dst_filename = argv[1];

    dst_file = fopen(dst_filename, "wb");
    if (!dst_file) {
        fprintf(stderr, "Could not open destination file %s\n", dst_filename);
        exit(1);
    }

    /* create resampler context */
    swr_ctx = swr_alloc();
    if (!swr_ctx) {
        fprintf(stderr, "Could not allocate resampler context\n");
        ret = AVERROR(ENOMEM);
        goto end;
    }

    /* set options */
    // 将resample信息写入resample上下文
    av_opt_set_int(swr_ctx, "in_channel_layout",    src_ch_layout, 0);
    av_opt_set_int(swr_ctx, "in_sample_rate",       src_rate, 0);
    av_opt_set_sample_fmt(swr_ctx, "in_sample_fmt", src_sample_fmt, 0);

    av_opt_set_int(swr_ctx, "out_channel_layout",    dst_ch_layout, 0);
    av_opt_set_int(swr_ctx, "out_sample_rate",       dst_rate, 0);
    av_opt_set_sample_fmt(swr_ctx, "out_sample_fmt", dst_sample_fmt, 0);

    /* initialize the resampling context */
    if ((ret = swr_init(swr_ctx)) <0) {
        fprintf(stderr, "Failed to initialize the resampling context\n");
        goto end;
    }

    /* allocate source and destination samples buffers */

    src_nb_channels = av_get_channel_layout_nb_channels(src_ch_layout);
    ret = av_samples_alloc_array_and_samples(&src_data, &src_linesize, src_nb_channels,
                                             src_nb_samples, src_sample_fmt, 0);
    if (ret <0) {
        fprintf(stderr, "Could not allocate source samples\n");
        goto end;
    }

    /* compute the number of converted samples: buffering is avoided
     * ensuring that the output buffer will contain at least all the
     * converted input samples */
    max_dst_nb_samples = dst_nb_samples =
        av_rescale_rnd(src_nb_samples, dst_rate, src_rate, AV_ROUND_UP);

    /* buffer is going to be directly written to a rawaudio file, no alignment */
    dst_nb_channels = av_get_channel_layout_nb_channels(dst_ch_layout);
    ret = av_samples_alloc_array_and_samples(&dst_data, &dst_linesize, dst_nb_channels,
                                             dst_nb_samples, dst_sample_fmt, 0);
    if (ret <0) {
        fprintf(stderr, "Could not allocate destination samples\n");
        goto end;
    }

    t = 0;
    do {
        /* generate synthetic audio */
        // 这里是自行生成源数据帧，实际工程中应该将解码后的PCM数据填入src_data中
        fill_samples((double *)src_data[0], src_nb_samples, src_nb_channels, src_rate, &t);

        /* compute destination number of samples */
        dst_nb_samples = av_rescale_rnd(swr_get_delay(swr_ctx, src_rate) +
                                        src_nb_samples, dst_rate, src_rate, AV_ROUND_UP);
        if (dst_nb_samples > max_dst_nb_samples) {
            av_freep(&dst_data[0]);
            ret = av_samples_alloc(dst_data, &dst_linesize, dst_nb_channels,
                                   dst_nb_samples, dst_sample_fmt, 1);
            if (ret <0)
                break;
            max_dst_nb_samples = dst_nb_samples;
        }

        /* convert to destination format */
        // 重采样操作
        ret = swr_convert(swr_ctx, dst_data, dst_nb_samples, (const uint8_t **)src_data, src_nb_samples);
        if (ret <0) {
            fprintf(stderr, "Error while converting\n");
            goto end;
        }
        dst_bufsize = av_samples_get_buffer_size(&dst_linesize, dst_nb_channels,
                                                 ret, dst_sample_fmt, 1);
        if (dst_bufsize <0) {
            fprintf(stderr, "Could not get sample buffer size\n");
            goto end;
        }
        printf("t:%f in:%d out:%d\n", t, src_nb_samples, ret);
        fwrite(dst_data[0], 1, dst_bufsize, dst_file);
    } while (t <10);

    if ((ret = get_format_from_sample_fmt(&fmt, dst_sample_fmt)) <0)
        goto end;
    fprintf(stderr, "Resampling succeeded. Play the output file with the command:\n"
            "ffplay -f %s -channel_layout %"PRId64" -channels %d -ar %d %s\n",
            fmt, dst_ch_layout, dst_nb_channels, dst_rate, dst_filename);

end:
    fclose(dst_file);

    if (src_data)
        av_freep(&src_data[0]);
    av_freep(&src_data);

    if (dst_data)
        av_freep(&dst_data[0]);
    av_freep(&dst_data);

    swr_free(&swr_ctx);
    return ret <0;
}

推荐阅读

function
分页插件3指定到某一页

前言--页数多了以后需要指定到某一页（只做了功能，样式没有细调）html ... [详细]

蜡笔小新 2024-12-27 15:19:01
java
扫描线三巨头 hdu1928hdu 1255 hdu 1542 [POJ 1151]

学习链接：http:blog.csdn.netlwt36articledetails48908031学习扫描线主要学习的是一种扫描的思想，后期可以求解很 ... [详细]

蜡笔小新 2024-12-26 20:04:36
char
HTML Attribute Naming Conventions for Fast Components

This document outlines the recommended naming conventions for HTML attributes in Fast Components, focusing on readability and consistency with existing standards. ... [详细]

蜡笔小新 2024-12-26 19:13:45
get
Weight the Tree（树形dp）

题目Link题目学习link1题目学习link2题目学习link3%%%受益匪浅！－－－－－&# ... [详细]

蜡笔小新 2024-12-26 15:55:56
list
VxWorks中的双向链表与环形缓冲应用

本文详细探讨了VxWorks操作系统中双向链表和环形缓冲区的实现原理及使用方法，通过具体示例代码加深理解。 ... [详细]

蜡笔小新 2024-12-26 13:26:16
char
Codeforces Round #566 (Div. 2) A~F个人题解

Dashboard-CodeforcesRound#566(Div.2)-CodeforcesA.FillingShapes题意：给你一个的表格，你 ... [详细]

蜡笔小新 2024-12-25 18:41:21
list
深入理解Redis的数据结构与对象系统

本文详细探讨了Redis中的数据结构和对象系统的实现，包括字符串、列表、集合、哈希表和有序集合等五种核心对象类型，以及它们所使用的底层数据结构。通过分析源码和相关文献，帮助读者更好地理解Redis的设计原理。 ... [详细]

蜡笔小新 2024-12-25 04:11:22
java
最小路径覆盖与强连通分量的应用：国王的问题

本题探讨了在一个有向图中，如何根据特定规则将城市划分为若干个区域，使得每个区域内的城市之间能够相互到达，并且划分的区域数量最少。题目提供了时间限制和内存限制，要求在给定的城市和道路信息下，计算出最少需要划分的区域数量。 ... [详细]

蜡笔小新 2024-12-23 18:42:12
list
C++之父论C++：快速掌握高效编程之道

精通C++并非易事，为何它比其他语言更难掌握？这主要归因于C++的设计理念，即不强迫用户接受特定的编程风格或限制创新思维。本文探讨了如何有效学习C++，并介绍了几本权威的学习资源。 ... [详细]

蜡笔小新 2024-12-15 14:51:25
get
USACO 2014 Jan - Moolympics区间记录优化算法

题目描述：给定n个半开区间[a, b)，要求使用两个互不重叠的记录器，求最多可以记录多少个区间。解决方案采用贪心算法，通过排序和遍历实现最优解。 ... [详细]

蜡笔小新 2024-12-27 18:14:31
list
UNP 第9章：主机名与地址转换

本章探讨了用于在主机名和数值地址之间进行转换的函数，如gethostbyname和gethostbyaddr。此外，还介绍了getservbyname和getservbyport函数，用于在服务器名和端口号之间进行转换。 ... [详细]

蜡笔小新 2024-12-27 11:26:39
get
计算机图形学实训：OpenGL入门与直线光栅化算法

本教程涵盖OpenGL基础操作及直线光栅化技术，包括点的绘制、简单图形绘制、直线绘制以及DDA和中点画线算法。通过逐步实践，帮助读者掌握OpenGL的基本使用方法。 ... [详细]

蜡笔小新 2024-12-26 12:24:25
function
深入理解线程局部存储

在多线程编程环境中，线程之间共享全局变量可能导致数据竞争和不一致性。为了解决这一问题，Linux提供了线程局部存储（TLS），使每个线程可以拥有独立的变量副本，确保线程间的数据隔离与安全。 ... [详细]

蜡笔小新 2024-12-25 17:04:36
get
2016年10月25日数学考试：斐波那契数列与矩阵快速幂的应用

本次考试于2016年10月25日上午7:50至11:15举行，主要涉及数学专题，特别是斐波那契数列的性质及其在编程中的应用。本文将详细解析考试中的题目，并提供解题思路和代码实现。 ... [详细]

蜡笔小新 2024-12-25 13:08:21
get
2-SAT问题学习笔记

本文介绍了一种解决二元可满足性（2-SAT）问题的方法。通过具体实例，详细解释了如何构建模型、应用算法，并提供了编程实现的细节和优化建议。 ... [详细]

蜡笔小新 2024-12-24 21:48:43

被抛弃的微博名

这个家伙很懒，什么也没留下！

Tags | 热门标签

RankList | 热门文章