FFmpeg Source Code Analysis: Introduction to Audio Filters (Part 1)
Posted 8 months ago · Category: FFmpeg · Author: touch
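FFmpeg provides audio and video filters in the libavfilter module. Every filter, audio filters included, is registered in libavfilter/allfilters.c. You can also run ffmpeg -filters to list the filters supported by your build; in the output, A stands for an audio stream and V for a video stream, so audio filters show up as entries such as A->A.
This article covers the following audio filters: compressor, cross-fade and fade, impulsive noise removal, delay, echo, noise gate, and limiter.
For a detailed description of each audio filter, see the Audio Filters section of the official FFmpeg filters documentation.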
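1. acompressor
A compressor, mainly used to reduce the dynamic range of a signal. Modern music in particular is mostly compressed at a high ratio to raise overall loudness. Compression works by detecting the part of the signal that exceeds a chosen threshold and dividing it by the ratio. The options are:
level_in: input gain, default 1, range [0.015625, 64]
mode: compression mode, either upward or downward, default downward
threshold: a signal that rises above this level triggers gain reduction, default 0.125, range [0.00097563, 1]
ratio: the ratio by which the signal is reduced, default 2, range [1, 20]
attack: number of milliseconds the signal has to stay above the threshold before gain reduction starts, default 20, range [0.01, 2000]
release: number of milliseconds the signal has to stay below the threshold before gain reduction is eased off again, default 250, range [0.01, 9000]
makeup: how much the signal is amplified after processing, default 1, range [1, 64]
knee: rounds off the sharp knee around the threshold so that gain reduction sets in more softly, default 2.82843, range [1, 8]
link: whether the average level of all channels or the loudest channel (maximum) drives the reduction, default average
detection: use the exact peak signal (peak) or the root mean square (rms), default rms, which is usually smoother
mix: how much of the compressed signal is used in the output, default 1, range [0, 1]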
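To make "dividing the part above the threshold by the ratio" concrete, here is a minimal sketch of the static compression curve. It assumes a hard knee and levels expressed in dB, and it ignores attack, release, knee, and makeup entirely. `compress_level_db` is a hypothetical helper written for this article, not a function from af_acompressor.c.

```c
#include <assert.h>

/* Hypothetical helper, not part of FFmpeg: hard-knee static curve of a
 * downward compressor, working on levels expressed in dB. */
static double compress_level_db(double level_db, double threshold_db, double ratio)
{
    assert(ratio >= 1.0);            /* same lower bound as the ratio option */
    if (level_db <= threshold_db)
        return level_db;             /* below the threshold: left untouched */
    /* above the threshold: only the overshoot is divided by the ratio */
    return threshold_db + (level_db - threshold_db) / ratio;
}
```

For example, with a threshold of -18 dB (the default 0.125 is roughly that level) and a ratio of 2, an input at -6 dB is 12 dB over the threshold and comes out at -12 dB.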
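2. acrossfade
A cross-fade, applied to the transition from one input audio stream to another. The options are:
nb_samples, ns: number of samples over which the cross-fade lasts, default 44100
duration, d: duration of the cross-fade; when set, it is used instead of nb_samples
overlap, o: whether the end of the first stream overlaps the start of the second stream, enabled by default
curve1: transition curve for the first stream
curve2: transition curve for the second stream
An example command: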
ffmpeg -i first.flac -i second.flac -filter_complex acrossfade=d=10:c1=exp:c2=exp output.flac
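Conceptually, a cross-fade blends the tail of the first stream with the head of the second while one gain falls and the other rises. The sketch below assumes linear curves on both sides and mono float buffers; `crossfade_tri` is a hypothetical name, and the real filter supports many curves and sample formats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: cross-fade the last n samples of stream `a` into the
 * first n samples of stream `b` with a linear curve on both sides. */
static void crossfade_tri(const float *a, const float *b, float *out, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        double fade_in  = (double)i / (double)n; /* gain of the second stream rises */
        double fade_out = 1.0 - fade_in;         /* gain of the first stream falls  */
        out[i] = (float)(a[i] * fade_out + b[i] * fade_in);
    }
}
```

Because the two gains always add up to 1 here, two identical constant signals pass through unchanged; other curve pairs trade that property for a different perceived loudness during the transition.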
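3. afade
A fade-in or fade-out effect on a single stream, similar in spirit to acrossfade. The options are:
type, t: effect type, in or out, default in
start_sample, ss: number of the sample at which the fade starts, default 0
nb_samples, ns: number of samples over which the fade lasts, default 44100
start_time, st: time at which the fade starts, default 0
duration, d: duration of the fade; when set, it is used instead of nb_samples
curve: transition curve of the fade, one of:
tri: triangular, linear slope (default)
qsin: quarter of a sine wave
hsin: half of a sine wave
esin: exponential sine wave
log: logarithmic
ipar: inverted parabola
qua: quadratic
cub: cubic
squ: square root
cbr: cubic root
par: parabola
exp: exponential
iqsin: inverted quarter of a sine wave
ihsin: inverted half of a sine wave
dese: double-exponential seat
desi: double-exponential sigmoid
losi: logistic sigmoid
sinc: sine cardinal function
isinc: inverted sine cardinal function
nofade: no fade applied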
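Each curve maps the position inside the fade, normalized to [0, 1], to a gain. The sketch below covers only the polynomial curves from the list. The formulas are the ones these curve names conventionally denote and are an assumption, not a copy of af_afade.c; `SketchCurve` and `fade_gain_sketch` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical enum covering only the polynomial curves from the list. */
enum SketchCurve { SK_TRI, SK_QUA, SK_CUB, SK_IPAR, SK_NOFADE };

/* Map a position inside the fade (index out of range samples) to a gain in [0, 1]. */
static double fade_gain_sketch(enum SketchCurve curve, long index, long range)
{
    assert(range > 0);
    double g = (double)index / (double)range;
    if (g < 0.0) g = 0.0;                 /* before the fade starts */
    if (g > 1.0) g = 1.0;                 /* after the fade ends    */

    switch (curve) {
    case SK_QUA:    return g * g;                        /* quadratic */
    case SK_CUB:    return g * g * g;                    /* cubic */
    case SK_IPAR:   return 1.0 - (1.0 - g) * (1.0 - g);  /* inverted parabola */
    case SK_NOFADE: return 1.0;                          /* no fade applied */
    case SK_TRI:
    default:        return g;                            /* linear slope */
    }
}
```

Halfway through a fade-in, tri gives a gain of 0.5, qua gives 0.25, and ipar gives 0.75, which is why the choice of curve changes how fast the sound seems to arrive.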
The fade itself is implemented with macros that generate one function per sample format. The code is in libavfilter/af_afade.c and comes in two forms, FADE_PLANAR (planar storage, one buffer per channel) and FADE (interleaved storage, all channels in one buffer):

    #define FADE_PLANAR(name, type) \
    static void fade_samples_## name ##p(uint8_t **dst, uint8_t * const *src, \
                                         int nb_samples, int channels, int dir, \
                                         int64_t start, int64_t range, int curve) \
    { \
        int i, c; \
        \
        for (i = 0; i < nb_samples; i++) { \
            /* one gain per sample position, shared by all channels */ \
            double gain = fade_gain(curve, start + i * dir, range); \
            for (c = 0; c < channels; c++) { \
                type *d = (type *)dst[c]; \
                const type *s = (type *)src[c]; \
                \
                d[i] = s[i] * gain; \
            } \
        } \
    }

    #define FADE(name, type) \
    static void fade_samples_## name (uint8_t **dst, uint8_t * const *src, \
                                      int nb_samples, int channels, int dir, \
                                      int64_t start, int64_t range, int curve) \
    { \
        type *d = (type *)dst[0]; \
        const type *s = (type *)src[0]; \
        int i, c, k = 0; \
        \
        for (i = 0; i < nb_samples; i++) { \
            double gain = fade_gain(curve, start + i * dir, range); \
            /* k walks the interleaved buffer: channels * nb_samples values */ \
            for (c = 0; c < channels; c++, k++) \
                d[k] = s[k] * gain; \
        } \
    }
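4. adeclick
Removes impulsive noise (clicks) from the input signal. Samples detected as impulsive noise are replaced by samples interpolated with an autoregressive model. The options are:
window, w: window size in milliseconds, default 55, range [10, 100]
overlap, o: window overlap as a percentage of the window size, default 75, range [50, 95]
arorder, a: autoregression order as a percentage of the window size, default 2, range [0, 25]
threshold, t: detection threshold, default 2, range [1, 100]
burst, b: burst fusion as a percentage of the window size, default 2, range [0, 10]
method, m: overlap method, either add, a (overlap-add) or save, s (overlap-save)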
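The window and overlap options are easier to reason about once converted to sample counts. The following is a rough sketch of that arithmetic, assuming the window is truncated to whole samples and each new window advances by the non-overlapping part; `window_samples` and `hop_samples` are hypothetical helpers, and the exact rounding used by the filter may differ.

```c
#include <assert.h>

/* Hypothetical helpers: turn the window (ms) and overlap (%) options into
 * sample counts for a given sample rate. */
static int window_samples(int sample_rate, double window_ms)
{
    assert(window_ms >= 10.0 && window_ms <= 100.0);
    return (int)(sample_rate * window_ms / 1000.0);
}

static int hop_samples(int window_size, double overlap_percent)
{
    assert(overlap_percent >= 50.0 && overlap_percent <= 95.0);
    /* each new window advances by the part that does not overlap */
    return (int)(window_size * (1.0 - overlap_percent / 100.0));
}
```

Under these assumptions, the defaults at 44100 Hz give a 2425-sample window that advances by 606 samples at a time.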
5. adelay
A delay effect: one or more channels are delayed, and the delayed stretch at the start is filled with silence. The code is in libavfilter/af_adelay.c, where a macro generates one function per sample format. Silence is 0x80 for the unsigned u8 format and 0x00 for every other format. The core code is:

    #define DELAY(name, type, fill) \
    static void delay_channel_## name ##p(ChanDelay *d, int nb_samples, \
                                          const uint8_t *ssrc, uint8_t *ddst) \
    { \
        const type *src = (type *)ssrc; \
        type *dst = (type *)ddst; \
        type *samples = (type *)d->samples; \
        \
        while (nb_samples) { \
            if (d->delay_index < d->delay) { \
                /* initial gap: buffer the input and output silence */ \
                const int len = FFMIN(nb_samples, d->delay - d->delay_index); \
                \
                memcpy(&samples[d->delay_index], src, len * sizeof(type)); \
                memset(dst, fill, len * sizeof(type)); \
                d->delay_index += len; \
                src += len; \
                dst += len; \
                nb_samples -= len; \
            } else { \
                /* steady state: output the oldest sample, store the newest */ \
                *dst = samples[d->index]; \
                samples[d->index] = *src; \
                nb_samples--; \
                d->index++; \
                src++, dst++; \
                d->index = d->index >= d->delay ? 0 : d->index; \
            } \
        } \
    }

    DELAY(u8, uint8_t, 0x80)
    DELAY(s16, int16_t, 0)
    DELAY(s32, int32_t, 0)
    DELAY(flt, float, 0)
    DELAY(dbl, double, 0)
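To see the two phases of the macro in isolation, here is a stripped-down sketch of the same idea for a single float channel. `SketchDelay` and `sketch_delay_process` are hypothetical names, and the fixed-size buffer is a simplification made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size delay line for one float channel. */
#define SKETCH_MAX_DELAY 16

typedef struct SketchDelay {
    float  samples[SKETCH_MAX_DELAY]; /* ring buffer holding delayed input */
    size_t delay;                     /* delay length in samples */
    size_t filled;                    /* silent samples emitted so far */
    size_t index;                     /* read/write position in the ring */
} SketchDelay;

static void sketch_delay_process(SketchDelay *d, const float *src, float *dst, size_t n)
{
    assert(d->delay > 0 && d->delay <= SKETCH_MAX_DELAY);
    for (size_t i = 0; i < n; i++) {
        if (d->filled < d->delay) {
            /* initial gap: store the input, output silence */
            d->samples[d->filled++] = src[i];
            dst[i] = 0.0f;
        } else {
            /* steady state: emit the oldest sample, overwrite it with the new one */
            dst[i] = d->samples[d->index];
            d->samples[d->index] = src[i];
            d->index = (d->index + 1) % d->delay;
        }
    }
}
```

With a delay of 2 samples, the input 1, 2, 3, 4, 5 comes out as 0, 0, 1, 2, 3: two samples of silence followed by the original signal shifted in time.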
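6. aecho
An echo effect that adds reflections to the audio stream. An echo is reflected sound, which occurs naturally in the mountains or inside a room. A digital echo imitates this by adjusting the delay and the decay between the original sound and its reflections. The original sound is also called the dry signal and the reflected sound the wet signal. The options are:
in_gain: input gain of the reflected signal, default 0.6
out_gain: output gain of the reflected signal, default 0.3
delays: list of delays in milliseconds between the original signal and each reflection, separated by '|', default 1000, each value in the range (0, 90000.0]
decays: list of loudness factors of the reflections, separated by '|', default 0.5, each value in the range (0, 1.0]
For example, to imitate an echo in the mountains:
aecho=0.8:0.9:1000:0.3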
The code is in libavfilter/af_aecho.c, where a macro generates the echo function for each sample format:

    #define ECHO(name, type, min, max) \
    static void echo_samples_## name ##p(AudioEchoContext *ctx, \
                                         uint8_t **delayptrs, \
                                         uint8_t * const *src, uint8_t **dst, \
                                         int nb_samples, int channels) \
    { \
        const double out_gain = ctx->out_gain; \
        const double in_gain = ctx->in_gain; \
        const int nb_echoes = ctx->nb_echoes; \
        const int max_samples = ctx->max_samples; \
        int i, j, chan, av_uninit(index); \
        \
        av_assert1(channels > 0); /* would corrupt delay_index */ \
        \
        for (chan = 0; chan < channels; chan++) { \
            const type *s = (type *)src[chan]; \
            type *d = (type *)dst[chan]; \
            type *dbuf = (type *)delayptrs[chan]; \
            \
            index = ctx->delay_index; \
            for (i = 0; i < nb_samples; i++, s++, d++) { \
                double out, in; \
                \
                in = *s; \
                out = in * in_gain; \
                /* add every reflection, read from the per-channel delay buffer */ \
                for (j = 0; j < nb_echoes; j++) { \
                    int ix = index + max_samples - ctx->samples[j]; \
                    ix = MOD(ix, max_samples); \
                    out += dbuf[ix] * ctx->decay[j]; \
                } \
                out *= out_gain; \
                \
                *d = av_clipd(out, min, max); \
                dbuf[index] = in; \
                \
                index = MOD(index + 1, max_samples); \
            } \
        } \
        ctx->delay_index = index; \
    }

    ECHO(dbl, double, -1.0, 1.0 )
    ECHO(flt, float, -1.0, 1.0 )
    ECHO(s16, int16_t, INT16_MIN, INT16_MAX)
    ECHO(s32, int32_t, INT32_MIN, INT32_MAX)
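The inner loop of the macro boils down to "dry input times in_gain, plus delayed input times decay, all times out_gain". A minimal sketch with a single reflection on one channel follows. `echo_single_tap` is a hypothetical helper; it takes the delay in samples (a delay in milliseconds would first be multiplied by the sample rate and divided by 1000) and treats everything before the buffer as silence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-tap echo on one channel of double samples.
 * src and dst must be different buffers, because delayed input is read from src. */
static void echo_single_tap(const double *src, double *dst, size_t n,
                            size_t delay, double in_gain, double out_gain, double decay)
{
    assert(delay > 0);
    for (size_t i = 0; i < n; i++) {
        double out = src[i] * in_gain;        /* dry part */
        if (i >= delay)
            out += src[i - delay] * decay;    /* wet part: delayed input, attenuated */
        out *= out_gain;
        /* clip to the nominal floating-point sample range */
        if (out >  1.0) out =  1.0;
        if (out < -1.0) out = -1.0;
        dst[i] = out;
    }
}
```

Feeding a single impulse through it with a delay of 2 samples and all three factors at 0.5 produces 0.25 at position 0 (the dry sound) and 0.25 again at position 2 (the reflection).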
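7. agate
A noise gate, used to reduce the low-level (quiet) parts of a signal and so remove background noise from the useful signal. It detects signal below the threshold and reduces it according to the chosen ratio. The options are:
level_in: input level before filtering, default 1, range [0.015625, 64]
mode: operating mode, upward or downward, default downward
range: amount of gain reduction applied while the signal is below the threshold, default 0.06125, range [0, 1]
threshold: when the signal rises above this level, gain reduction is released, default 0.125, range [0, 1]
ratio: the ratio by which the signal is reduced, default 2, range [1, 9000]
attack: number of milliseconds the signal has to stay above the threshold before gain reduction stops, default 20, range [0.01, 9000]
release: number of milliseconds the signal has to stay below the threshold before gain reduction is applied again, default 250, range [0.01, 9000]
makeup: amplification applied after processing, default 1, range [1, 64]
detection: detection method, peak or rms, default rms
link: whether the average of all channels or the loudest channel (maximum) drives the reduction, default average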
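A gate is the mirror image of a compressor: it acts below the threshold instead of above it. The sketch below models the static curve in the dB domain by stretching the shortfall below the threshold by the ratio and never attenuating by more than the range. That model, the hard knee, and the name `gate_level_db` are assumptions made for illustration; attack, release, and makeup are ignored.

```c
#include <assert.h>

/* Hypothetical helper: static curve of a downward gate, levels in dB.
 * range_db is the maximum attenuation, given as a negative number. */
static double gate_level_db(double level_db, double threshold_db,
                            double ratio, double range_db)
{
    assert(ratio >= 1.0 && range_db <= 0.0);
    if (level_db >= threshold_db)
        return level_db;                      /* gate open: left untouched */
    /* below the threshold the shortfall is stretched by the ratio */
    double gated = threshold_db + (level_db - threshold_db) * ratio;
    double floor_db = level_db + range_db;    /* never attenuate more than range */
    return gated > floor_db ? gated : floor_db;
}
```

With a threshold of -30 dB, a ratio of 2, and a range of -24 dB (the default 0.06125 is roughly that), a -40 dB input drops to -50 dB, while a -60 dB input is held at -84 dB by the range limit.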
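8. alimiter
A limiter, which keeps the input signal from rising above a chosen threshold. It uses lookahead to avoid distorting the signal, which means the processed signal is slightly delayed. The options are:
level_in: input gain, default 1
level_out: output gain, default 1
limit: signals are not allowed to pass above this level, default 1
attack: time in which the limiter reaches its attenuation level, default 5 ms
release: time in which the limiter returns from limiting to an attenuation of 1.0, default 50 ms
asc: when gain reduction is needed all the time, ASC releases to an average reduction level instead of releasing all the way to no reduction within the release time
asc_level: how strongly ASC affects the release time; 0 means almost no change, 1 produces longer release times
level: automatically level the output signal, enabled by default
The limiter code is in libavfilter/af_alimiter.c. The core code is: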
    static int filter_frame(AVFilterLink *inlink, AVFrame *in)
    {
        ......
        // examine every sample frame
        for (n = 0; n < in->nb_samples; n++) {
            double peak = 0;

            // apply the input gain, store the frame in the lookahead buffer, track its peak
            for (c = 0; c < channels; c++) {
                double sample = src[c] * level_in;

                buffer[s->pos + c] = sample;
                peak = FFMAX(peak, fabs(sample));
            }

            if (s->auto_release && peak > limit) {
                s->asc += peak;
                s->asc_c++;
            }

            // the incoming frame exceeds the limit: plan the attenuation ramp
            if (peak > limit) {
                double patt = FFMIN(limit / peak, 1.);
                double rdelta = get_rdelta(s, release, inlink->sample_rate,
                                           peak, limit, patt, 0);
                double delta = (limit / peak - s->att) / buffer_size * channels;
                int found = 0;

                if (delta < s->delta) {
                    s->delta = delta;
                    nextpos[0] = s->pos;
                    nextpos[1] = -1;
                    nextdelta[0] = rdelta;
                    s->nextlen = 1;
                    s->nextiter= 0;
                } else {
                    for (i = s->nextiter; i < s->nextiter + s->nextlen; i++) {
                        int j = i % buffer_size;
                        double ppeak, pdelta;

                        ppeak = fabs(buffer[nextpos[j]]) > fabs(buffer[nextpos[j] + 1]) ?
                                fabs(buffer[nextpos[j]]) : fabs(buffer[nextpos[j] + 1]);
                        pdelta = (limit / peak - limit / ppeak) /
                                 (((buffer_size - nextpos[j] + s->pos) % buffer_size) / channels);
                        if (pdelta < nextdelta[j]) {
                            nextdelta[j] = pdelta;
                            found = 1;
                            break;
                        }
                    }
                    if (found) {
                        s->nextlen = i - s->nextiter + 1;
                        nextpos[(s->nextiter + s->nextlen) % buffer_size] = s->pos;
                        nextdelta[(s->nextiter + s->nextlen) % buffer_size] = rdelta;
                        nextpos[(s->nextiter + s->nextlen + 1) % buffer_size] = -1;
                        s->nextlen++;
                    }
                }
            }

            // the frame about to leave the lookahead buffer
            buf = &s->buffer[(s->pos + channels) % buffer_size];
            peak = 0;
            for (c = 0; c < channels; c++) {
                double sample = buf[c];

                peak = FFMAX(peak, fabs(sample));
            }

            if (s->pos == s->asc_pos && !s->asc_changed)
                s->asc_pos = -1;

            if (s->auto_release && s->asc_pos == -1 && peak > limit) {
                s->asc -= peak;
                s->asc_c--;
            }

            // advance the attenuation by one step and apply it to the outgoing frame
            s->att += s->delta;

            for (c = 0; c < channels; c++)
                dst[c] = buf[c] * s->att;

            if ((s->pos + channels) % buffer_size == nextpos[s->nextiter]) {
                if (s->auto_release) {
                    s->delta = get_rdelta(s, release, inlink->sample_rate,
                                          peak, limit, s->att, 1);
                    if (s->nextlen > 1) {
                        int pnextpos = nextpos[(s->nextiter + 1) % buffer_size];
                        double ppeak = fabs(buffer[pnextpos]) > fabs(buffer[pnextpos + 1]) ?
                                       fabs(buffer[pnextpos]) : fabs(buffer[pnextpos + 1]);
                        double pdelta = (limit / ppeak - s->att) /
                                        (((buffer_size + pnextpos - ((s->pos + channels) % buffer_size)) % buffer_size) / channels);
                        if (pdelta < s->delta)
                            s->delta = pdelta;
                    }
                } else {
                    s->delta = nextdelta[s->nextiter];
                    s->att = limit / peak;
                }

                s->nextlen -= 1;
                nextpos[s->nextiter] = -1;
                s->nextiter = (s->nextiter + 1) % buffer_size;
            }

            // keep the attenuation factor inside (0, 1]
            if (s->att > 1.) {
                s->att = 1.;
                s->delta = 0.;
                s->nextiter = 0;
                s->nextlen = 0;
                nextpos[0] = -1;
            }

            if (s->att <= 0.) {
                s->att = 0.0000000000001;
                s->delta = (1.0 - s->att) / (inlink->sample_rate * release);
            }

            if (s->att != 1. && (1. - s->att) < 0.0000000000001)
                s->att = 1.;

            if (s->delta != 0. && fabs(s->delta) < 0.00000000000001)
                s->delta = 0.;

            // final clip to the limit, then apply the output gains
            for (c = 0; c < channels; c++)
                dst[c] = av_clipd(dst[c], -limit, limit) * level * level_out;

            s->pos = (s->pos + channels) % buffer_size;
            src += channels;
            dst += channels;
        }

        if (in != out)
            av_frame_free(&in);

        return ff_filter_frame(outlink, out);
    }
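Two ideas from the listing can be pulled out into a small sketch: the per-frame attenuation step that lets the gain reach limit / peak by the time the peak leaves the lookahead buffer, and the final safety clip. Both functions below are hypothetical helpers that mirror the `delta` expression and the `av_clipd` line under simplifying assumptions (one channel, no ASC, no release handling); they are not the filter's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: per-frame change of the attenuation factor so that it
 * moves from `att` to limit / peak over `lookahead_frames` sample frames. */
static double attenuation_step(double att, double limit, double peak, int lookahead_frames)
{
    assert(peak > limit && lookahead_frames > 0);
    return (limit / peak - att) / lookahead_frames;
}

/* Hypothetical helper: input gain, hard clip to [-limit, limit], output gain. */
static void clip_stage(const double *src, double *dst, size_t n,
                       double level_in, double limit, double level_out)
{
    assert(limit > 0.0);
    for (size_t i = 0; i < n; i++) {
        double s = src[i] * level_in;
        if (s >  limit) s =  limit;
        if (s < -limit) s = -limit;
        dst[i] = s * level_out;
    }
}
```

For instance, with no attenuation in effect (att = 1), a limit of 0.5, an incoming peak of 1.0, and 4 frames of lookahead, the attenuation has to drop by 0.125 per frame to reach 0.5 in time. The clip stage alone would also keep the output within the limit, but it would flatten the waveform, which is exactly the distortion the lookahead ramp is there to avoid.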
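Copyright notice
This article represents only the author's views and not the position of 码农殇.
It is published on 码农殇 with the author's authorization and may not be reproduced without permission.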