2025-11-23
Here's release 1.1.0 and a big refresher of code and
documentation. BruteFIR was originally developed back in 2000, and
has since about 2006 been "complete" and in maintenance
mode. The stability has made it a reference no-nonsense embeddable
convolution engine still in use today.
Some of the old-school aspects of the original design have however become a hindrance in some contexts, particularly the use of multiple processes to employ multiple CPU cores (BruteFIR was devised at a time when Linux pthreads could only use one core, and all available CPU power was certainly needed for heavier work like auralization and wavefield synthesis). The big change in this release is therefore a move to a single process with threads.
Other old school features are kept though, like the C-library based module interface and the command line interface, and in essence it's compatible with older versions and configuration files. If you have custom modules you need to at least recompile them though.
On to the details of this 1.1.0 release:
Older news
Older news is available in the archived documentation.
You are free to download version 1.1.0.
For reference, the source code is also on GitHub, but as this is a legacy low maintenance project the GitHub procedures for releases and packages are not used (you may use it for bug reporting though). The official release is found on this webpage as the downloadable archive.
BruteFIR is a software convolution engine, a program for applying long FIR filters to multi-channel digital audio, either offline or in realtime. Its basic operation is specified through a configuration file, and filters, attenuation and delay can be changed in runtime through a simple command line interface. The FIR filter algorithm used is an optimized and partitioned frequency domain algorithm, thus throughput is extremely high, and I/O-delay can be kept fairly low.
Through its highly modular design, things like adaptive filtering, signal generators and sample I/O are easily added, extended and modified, without the need to alter the program itself.
BruteFIR is free and open-source. It is licensed under the ISC License (a simplified MIT/BSD), which makes it easy to integrate anywhere, including commercial projects.
While it's most popularly used on embedded platforms directly integrated with ALSA, it can also be used on desktop computers, integrating natively also with JACK and PipeWire.
A few examples of applications where BruteFIR can be a central component:
Many of these applications need only short filters, but for example auralization may need filters that are several seconds long. BruteFIR works well for both.
Note that to actually use BruteFIR in these applications you need to design the filters in other software. The package does include an equalizer plugin though, so out of the box you can use it as an equalizer, which you can control live through the provided CLI module.
BruteFIR can be used out of the box on a Linux desktop computer, and then you typically use JACK or PipeWire I/O and static configuration with your desired filters that you have designed in some other software.
This is a basic use case:
It's very well suited for and was originally designed to be used in embedded contexts though, i.e. headless computers dedicated to processing audio. In that case you usually use the ALSA I/O module and connect it directly to the hardware drivers without any sound server in between. If it needs to be chained with other software, JACK is the preferred interface, although ALSA loopback can also be used (it's a bit quirky to set up though).
If you are a software developer that wants to use BruteFIR as your filter engine for your project, the typical way to integrate is to not modify BruteFIR software at all, but possibly make custom I/O or logic modules and control it all using the CLI sending text commands. The controlling application could of course be written in any language.
The module interface is a bit old-school by today's standards, but if you don't want to use C you can wrap the API and use any language you prefer.
BruteFIR should be quite easy to port to MacOS and Windows, but for now the source code only supports Linux, and as embedded is its most popular use case, Linux is the way to go.
Here is a typical full integration use case:
When BruteFIR is run for the first time without parameters, it will
generate a default configuration file at
~/.config/BruteFIR/brutefir_defaults.conf (unless
the -nodefault option is used), and then complain
that it cannot find the main configuration file at its default
location ~/.config/BruteFIR/brutefir.conf.
The default configuration file contains default settings, which are
extended and/or overridden in the main configuration file. A setting
that is specified in the default configuration file does not need to
be in the main.
BruteFIR takes only four parameters, namely the
filename of the main configuration file, and optionally
-quiet to suppress title, warnings and informational messages
at startup, and -nodefault if BruteFIR should read all
settings from the main configuration file, and finally
the legacy setting -daemon if it should run as a daemon
(today setting up a systemd unit file is the normal way to run it as a
daemon).
If no parameters are given, the filename given in the default configuration file is used. If the filename is "stdin", BruteFIR will expect the configuration file to be available on the standard input, a useful option for integration projects.
The (default) default configuration file looks like this:
## DEFAULT GENERAL SETTINGS ##
float_bits: 32; # internal floating point precision
sampling_rate: 44100; # sampling rate in Hz of audio interfaces
filter_length: 65536; # length of filters
config_file: "$XDG_CONFIG_HOME/BruteFIR/brutefir.conf"; # standard location of main config file
overflow_warnings: true; # echo warnings to stderr if overflow occurs
show_progress: true; # echo filtering progress to stderr
max_dither_table_size: 0; # maximum size in bytes of precalculated dither
allow_poll_mode: false; # allow use of input poll mode
modules_path: "."; # extra path where to find BruteFIR modules
powersave: false; # pause filtering when input is zero
monitor_rate: false; # monitor sample rate
lock_memory: true; # try to lock memory if realtime prio is set
sdf_length: -1; # subsample filter half length in samples
convolver_config: "$XDG_CACHE_HOME/BruteFIR/brutefir_convolver_wisdom"; # FFTW wisdom
## COEFF DEFAULTS ##
coeff {
format: "text"; # file format
attenuation: 0.0; # attenuation in dB
blocks: -1; # how long in blocks
skip: 0; # how many bytes to skip
shared_mem: false; # allocate in shared memory
};
## INPUT DEFAULTS ##
input {
device: "file" {}; # module and parameters to get audio
sample: "S16_LE"; # sample format
channels: 2/0,1; # number of open channels / which to use
delay: 0,0; # delay in samples for each channel
maxdelay: -1; # max delay for variable delays
mute: false, false; # mute active on startup for each channel
};
## OUTPUT DEFAULTS ##
output {
device: "file" {}; # module and parameters to put audio
sample: "S16_LE"; # sample format
channels: 2/0,1; # number of open channels / which to use
delay: 0,0; # delay in samples for each channel
maxdelay: -1; # max delay for variable delays
mute: false, false; # mute active on startup for each channel
dither: false; # apply dither
merge: false; # merge discontinuities at coeff change
};
## FILTER DEFAULTS ##
filter {
process: -1; # process index to run in (-1 means auto)
delay: 0; # predelay, in blocks
crossfade: false; # crossfade when coefficient is changed
};
The syntax of the main configuration file is very similar, as we will see. The configuration has five sections:
The general syntax rules for the configuration files are easily grasped from the default configuration file. The semicolons are important: they mark the end of a setting, line breaks do not, so you may have several settings on one line if you like. All characters on a line after a # are ignored. There are three data types: strings, numbers and booleans. Strings are text between quotes, a number is written either with or without a decimal dot, and a boolean is either 'true' or 'false'.
Note that everything is case sensitive, so setting names must be written with lowercase letters. Although the configuration file examples shown here are nicely ordered in sections, you can mix them in any order.
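The syntax rules above can be illustrated with a minimal Python sketch that splits flat general settings on semicolons and drops # comments. It is not BruteFIR's own parser: the function name split_settings is hypothetical, it does not handle the brace-delimited structures, and treating # and ; inside quoted strings as ordinary characters is an assumption.

```python
def split_settings(text):
    """Split flat settings on ';' and drop '#' comments (illustrative only)."""
    settings = []
    current = []
    in_string = False   # assumption: '#' and ';' inside quotes are literal
    in_comment = False
    for ch in text:
        if in_comment:
            # A comment runs to the end of the line.
            if ch == "\n":
                in_comment = False
                current.append(" ")
            continue
        if ch == '"':
            in_string = not in_string
            current.append(ch)
        elif ch == "#" and not in_string:
            in_comment = True
        elif ch == ";" and not in_string:
            # The semicolon, not the line break, ends a setting.
            setting = "".join(current).strip()
            if setting:
                settings.append(setting)
            current = []
        else:
            current.append(ch)
    return settings


example = 'float_bits: 32; # precision\nsampling_rate: 44100; filter_length: 4096,16;'
for setting in split_settings(example):
    print(setting)
```

Note how the second line of the example carries two settings, which the sketch separates only by their semicolons.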
The general settings section in the main configuration file has the
same syntax as in the default configuration file. The difference is
that coeff, input, output and
filter structures can exist in multiples, and are given
names and more parameters.
Default values of all general settings (except logic) must be
given in the default configuration file. Any of these settings may be
overridden in the main configuration file (except
config_file). These settings are:
float_bits: <NUMBER>; internal
floating point resolution, either 32 or 64, per default 32, and
for all practical uses that is enough. However, if you are the
kind of person that runs dither on 24 bit output, 64 bit float
will gain you some extra confidence in the audio precision. With
modern CPUs and typical applications the extra memory and
processing requirement is not a problem.
sampling_rate: <NUMBER>; sample rate in Hz.
filter_length: <NUMBER>[,<NUMBER>];
specifies how long the filters should be, which can be done in two
ways. The first is to give the length as a single number, which must
be a power of two. If so, the convolution will be done on the whole
filter length. The second is to give the partition length followed by
the number of partitions: to partition a 65536 tap filter in 16 parts,
you write filter_length: 4096,16. Partitioned filters
can be used to improve performance and reduce I/O-delay. The
I/O-delay becomes twice the partition length, i.e. 8192 samples in the
4096,16 example.
There is no formula for calculating the optimal number of partitions to get maximum throughput. It varies between hardware platforms, so trial and error is the only working method. More than about 16 partitions are generally not recommended though.
If you are using partitioned filters to reduce the I/O-delay for realtime filtering, make sure that it does not get too low. If I/O-delay is too low, the sound card can get overflowed/underflowed causing the program to exit with a broken pipe signal. Extremely low latencies, such as 64 sample partitions, may not work for long periods of time due to latency variations in the kernel.
The most loaded CPU core cannot be loaded more than typically 85% for safe realtime operation. For very low latencies, this number could go down to 70%. The reason for this is that computing time will vary somewhat, that is how modern computers work, and to be able to cope with the maximum computing times, some spare processor time must be left.
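The arithmetic behind a partitioned filter_length setting can be sketched in a few lines of Python. The function name partition_summary is hypothetical, and the conversion to milliseconds is an added convenience, not something BruteFIR reports in this form.

```python
def partition_summary(partition_length, partitions, sampling_rate):
    """Return (total taps, I/O-delay in samples, I/O-delay in ms)."""
    total_taps = partition_length * partitions
    # The I/O-delay is twice the partition length, as stated above.
    io_delay_samples = 2 * partition_length
    io_delay_ms = 1000.0 * io_delay_samples / sampling_rate
    return total_taps, io_delay_samples, io_delay_ms


# filter_length: 4096,16 at the default sampling rate of 44100 Hz
taps, delay, delay_ms = partition_summary(4096, 16, 44100)
print(taps, delay, round(delay_ms, 1))
```

For the 4096,16 example this gives 65536 taps and an I/O-delay of 8192 samples, or roughly 186 ms at 44100 Hz.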
config_file: <STRING>; default location
of main configuration file.
overflow_warnings: <BOOLEAN>; if set to
true, information about overflows will be printed to stderr when they occur. Note that
overflowed samples are always set to the maximum output value of
the output device, so there is no actual overflow on the output
(unless the actual floating point value is overflowed). If
overflow occurs, it means that the filter is amplifying too much,
either through its coefficients or through input and output
attenuation. Overflow is not checked for if the output values are
floating point.
show_progress: <BOOLEAN>; if true,
echo progress / realtime index to stderr.
max_dither_table_size: <NUMBER>;
maximum size in bytes of pre-calculated dither. If dither is
applied to any output, a dither table will be calculated when the
program is started. It contains uncorrelated random values that are
used to generate the dither. The more channels that apply
dither, the larger the table needs to be in order to keep the dither
uncorrelated between channels. This table can get quite large
memory-wise (in embedded terms). If you want to limit its size,
set this value. It should preferably not be less than one megabyte
though. If it is set to zero or negative (default), the program
will choose a size itself.
allow_poll_mode: <BOOLEAN>; if true,
poll mode is allowed. If a sound card which is used for input
cannot be configured to have a period size (interrupt interval)
equal to or smaller than the configured filter (partition) length,
or if it cannot be a power of two (note: not common with modern
hardware), BruteFIR must be run in input poll mode. This means
that the sound card is polled for data, and sound card interrupts
are not used. BruteFIR will run just as reliably (as long as the
sound card allows for small transfers) but will consume more of
the spare processor time. Thus it will look like BruteFIR uses
more processor than it actually needs to. If more processor time
is used for filtering, less will be used for polling, thus input
poll mode does not mean that it is not possible to have as long
filters as running in normal mode. However, for some applications
(for example when the spare processor time is used by another
vital program), input poll mode is not suitable, and by setting
the allow_poll_mode to false (default), BruteFIR will
exit with an error if input poll mode is required.
modules_path: <STRING>; extra path
relative to the working directory where to find BruteFIR
modules. This setting is not available if BruteFIR was compiled
with SINGLE_MOD_PATH.
BruteFIR uses external modules to provide sample I/O, and optionally add extra logic. It will search a few default directories to find any modules that should be loaded, as specified in the configuration. This setting adds an extra directory, which is searched first. The value in the created default configuration file will be ".", that is the current working directory. This setting does not exist if BruteFIR is compiled with SINGLE_MOD_PATH, which points out a single modules directory. Loading binary modules from unclear places can be a security problem, so if you are a package maintainer for a Linux distro, it is recommended to build with SINGLE_MOD_PATH set to make sure modules can only be loaded from one specific fixed directory.
logic: <STRING: logic module name> { <logic
module parameters> }[, ...]; If any logic modules
should be loaded, these are listed in the logic field,
in pairs of module name / module parameters, separated with
commas. Which logic modules are available and what
functionality they provide can be found in
the Logic modules section.
powersave: <BOOLEAN or NUMBER>;
pause filtering when input is zero to reduce CPU load.
If activated, BruteFIR will monitor the inputs, and if an input channel provides zero samples, the associated filters will not do any processing, since with zero on the input, BruteFIR knows in advance that there will be zero on the output. BruteFIR will continue to run as normal, and filters with non-zero inputs will continue to process normally. As soon as there is non-zero input on a suspended filter, it starts processing again. This powersave feature is transparent: there will be no convolution errors if it is activated. The reason for having it optional is that one may want to make performance tests, without the need to feed a meaningful signal to BruteFIR.
If analog inputs are
used, the input will never be exactly zero, and thus the powersave
feature will not be triggered. However, if a value is specified
instead of the boolean (for
example powersave: -80;), that value is
interpreted as the lowest level in dB the input signal can be,
before BruteFIR will consider the input as zero, and trigger
powersave. Thus, a noise floor can be specified, and then
powersave can work together with analog inputs.
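To get a feeling for what a noise floor such as powersave: -80; means in sample values, the dB figure can be converted to a linear amplitude. This is a sketch under assumptions: the level is taken as amplitude dB relative to a full scale of 1.0 (matching the internal -1.0 to +1.0 range), and the helper names db_to_amplitude and is_silent are hypothetical, not BruteFIR functions.

```python
def db_to_amplitude(level_db):
    """Convert an amplitude level in dB (0 dB = full scale 1.0) to linear."""
    return 10.0 ** (level_db / 20.0)


def is_silent(samples, threshold_db):
    """True if every sample is at or below the given noise floor."""
    limit = db_to_amplitude(threshold_db)
    return all(abs(sample) <= limit for sample in samples)


print(db_to_amplitude(-80))                 # linear amplitude of the floor
print(is_silent([0.00005, -0.00002], -80))  # low-level noise only
print(is_silent([0.001, 0.0], -80))         # contains real signal
```

Under these assumptions -80 dB corresponds to an amplitude of about 0.0001, which for 16 bit input is a few least significant bits.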
monitor_rate: <BOOLEAN>; monitor
sample rate, and abort if it changes. Useful for systems where
you want to restart with appropriate filter coefficients for the
sample rate when it changes.
lock_memory: <BOOLEAN>; try to lock
memory if realtime prio is set, to make sure no memory is put in
the swapfile (if available) that can jeopardize realtime
performance.
sdf_length: <NUMBER>[, <NUMBER>];
sub-sample delay filter half length in samples, and
optionally a second number with kaiser window beta. Sub-sample
delays are only needed in very special applications so the
default value is -1, that is off. If sub-sample
delays should be possible to set, the sdf_length
setting must be larger than zero. It specifies the half length of
a sub-sample delay filter. A sub-sample delay filter is simply a sinc sampled
with a sub-sample offset. Thus, when a signal is convolved with
the filter it is delayed with the corresponding offset. Since a
sinc signal is infinitely long, it must be windowed. A kaiser window
is used, default beta is 9.0, but a custom value can be specified by
adding it after a comma (example: sdf_length: 31, 8.5;).
There is little reason to use other than the default though.
The
distortion caused by the windowing is a soft rolloff at higher
frequencies, the shape depends on the beta value. There is no phase
distortion. Since the sub-sample filters are linear phase, they will
add a pre-response (in practice I/O-delay), which is their half filter
length, that is the value given after the sdf_length
setting. If sub-sample delays are used only on inputs or outputs, the
added pre-response is the same as the sdf_length, if used on
both (usually not necessary), it will be twice the length. To activate
sub-sample delay, also a valid subdelay must be specified in
at least one of the input/output structures. The valid range is -99 to
99.
The advantage of a long sub-sample filter length is that the rolloff
in the high frequencies starts later and gets sharper, that is less
high frequency information is lost. The disadvantage of long
sub-sample filters is that the required CPU time increases, and the
added I/O-delay increases. Sub-sample filters are processed separately
in the frequency domain using FFT, and therefore it is recommended to
keep sdf_length at a power of two minus one (the actual
filter length is twice sdf_length plus one), which means that
as much as possible of the FFT block is used (an sdf_length
of 16 requires as much CPU time as an sdf_length of 31, since
the same block length is required). With an sdf_length of 31
and the default beta of 9.0, and a sample rate of 44100 Hz, the
response is flat up to 19 kHz, and then a soft rolloff begins which
reaches -0.20 dB at 20 kHz, which is good enough for most needs. The
next natural step, 63, keeps a flat response up to about 20500 Hz,
with -0.20 dB at 21 kHz.
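The recommendation to keep sdf_length at a power of two minus one follows from simple arithmetic, sketched below in Python. The assumption here is that the FFT block is the smallest power of two that holds the full sub-sample filter; the helper names are hypothetical and the real block size chosen by BruteFIR may differ.

```python
def sdf_filter_length(sdf_length):
    """Actual sub-sample filter length: twice the half length plus one."""
    return 2 * sdf_length + 1


def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n."""
    power = 1
    while power < n:
        power *= 2
    return power


for sdf_length in (15, 16, 31, 63):
    taps = sdf_filter_length(sdf_length)
    block = next_power_of_two(taps)
    print(sdf_length, taps, block)
```

Both 16 and 31 end up needing a 64 sample block in this sketch, but 31 fills it almost completely (63 taps) while 16 uses only 33 of the 64 samples.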
convolver_config: <STRING>; specifies
where FFTW wisdom should be stored, that is optimization
information for the FFT calculations.
benchmark: <BOOLEAN>; if true, start
in benchmark mode (can only be used in main config file) which will
then print performance statistics to the terminal.
safety_limit: <NUMBER>; if non-zero,
max dB in output before aborting. The purpose is to protect your
ears and expensive speakers. Every output sample is checked and if
it exceeds this value (in dB) BruteFIR will immediately exit with
an error message, before any sound is sent to the output.
<structure type name> <STRING: name (list for some) | NUMBER: index> {
<field name 1>: <setting 1>;
[...]
};
Names of structures (given after the type name) are not given in the default configuration file, but must be provided in the main configuration file. The name is either a custom string, or an index number, which must then be the same as the order of the structure in the file, that is the first structure must be indexed 0, the second 1 and so on. If a string name is given, the index number is assigned automatically (the opposite also applies), and when referring to the structure, either the string name or the index number can be used. Some structures, namely input and output, may have a comma-separated list of names, since the names apply to the channels defined in the structure.
After the name, or the structure type name if in the default configuration file, there is a left brace ({), and then structure fields and their settings, each field/setting pair ending with a semicolon (;). As for the general settings, field names always end with a colon (:). The order of the fields is not important. The structure is closed with a right brace (}) and ended with a semicolon.
coeff <STRING: name | NUMBER: index> {
filename: <STRING: filename>; | <NUMBER: shmid>/<NUMBER: offset>/<NUMBER: blocks>[,...];
format: <STRING: sample format string | "text" | "processed">;
attenuation: <NUMBER: attenuation in dB>;
blocks: <NUMBER: length in blocks>;
skip: <NUMBER: bytes to skip in beginning of file>;
shared_mem: <BOOLEAN: allocate in shared mem>;
};
In the default configuration file, the filename field is not
set, so it must be present in the main configuration file.
The coeff structure defines a set of filter coefficients, which becomes a FIR filter. There are several different file formats:
"text" coefficients are listed in a text file, one
coefficient per line. They are parsed with the standard C library
strtod() function.
"processed" coefficients are stored in the format
BruteFIR uses internally. Attenuation or adapted length cannot be
applied if this format is used.
The coefficients can be scaled, by setting the attenuation to non-zero.
The blocks field says how long in filter blocks the coefficient
set should be. If it is set to -1, the full length is assumed. Note
that custom lengths are only possible if partitioned convolution is
employed (quite naturally, since else there will only be one filter
block covering the full length).
The skip field if given specifies how many bytes in the
beginning of the file that should be skipped. This can be used to skip
headers in a file or similar. The field will be ignored if the
coefficients are not read from file.
In some cases, when one wants to test the performance of a certain
BruteFIR configuration, but does not feel like generating coefficients,
one can set the filename to "dirac pulse". Then BruteFIR will
generate a dirac pulse filter internally and use it as any other
filter, and thus will cost as much in processing as any other filter
of the same length. However, if you need a dirac pulse in the real
case, it makes no sense using this feature, since simply setting the
coeff field in the filter structure to -1 gives the same effect and
uses very little processor power (and memory).
Instead of a filename, comma-separated number groups can be given.
The first number will be a shared memory ID (man shmat) where the data
is found, the second number is the offset in bytes into the shared
memory area where the program starts to read, and the third is how
many blocks that should be read. A block is a filter segment, that is
if filter_length is 4096,16 one block is 4096
coefficients, and there can be no more than 16 blocks per coefficient
set. If not all blocks are covered in the first group, there must be
following number groups to provide the full length. When a shared
memory segment is given, it is required that the format is
"processed".
The shared memory and "processed" format features are
intended for application developers that need to provide
and/or modify filter coefficients from a different process while
BruteFIR is running. The processed format is not documented but an
API for it exists in the module API so it can quite easily be
extracted/copied if needed. For normal stand-alone use there is no
need to use these features.
The shared_mem field indicates if the coefficient should be
stored in shared memory. In legacy versions of BruteFIR which used a
multi-process design this was required by some logic modules like the
equalization module. Today when BruteFIR is threaded all modules have
access to all memory so it's not needed for that reason. However, if
you develop a separate application which needs to work with the filter
coefficients while BruteFIR is running the shared memory features are
still applicable.
input <STRING: name | NUMBER: index>[, ...] {
device: <STRING: I/O module name> { <I/O module settings> };
sample: <STRING: sample format>;
channels: <NUMBER: open channels>[/<NUMBER: channel index>[, ...]];
delay: <NUMBER: delay in samples>[, ...];
subdelay: <NUMBER: additional delay in 1/100th samples (valid range -99 - 99)>[, ...];
maxdelay: <NUMBER: maximum delay for dynamic changes>;
individual_maxdelay: <NUMBER: maximum delay for dynamic changes>[, ...];
mute: <BOOLEAN: mute channel>[, ...];
mapping: <NUMBER: channel index>[, ...];
};
output <STRING: name | NUMBER: index>[, ...] {
device: <same syntax as for the input structure>;
sample: <same syntax as for the input structure>;
channels: <same syntax as for the input structure>;
delay: <same syntax as for the input structure>;
subdelay: <NUMBER: additional delay in 1/100th samples (valid range -99 - 99)>[, ...];
maxdelay: <same syntax as for the input structure>;
individual_maxdelay: <same syntax as for the input structure>;
mute: <same syntax as for the input structure>;
mapping: <same syntax as for the input structure>;
dither: <BOOLEAN: apply dither>;
merge: <BOOLEAN: merge discontinuities at coeff change>;
};
All fields for the input and output structures except
mapping, delay and mute
must be set in the default configuration file.
The device field specifies the source/destination of the digital audio. This is always an I/O module. First the name of the module is stated, followed by its configuration within {}. If the audio is read/written from/to a module which does not continue forever (for example reading from a file), BruteFIR will finish when the first I/O module comes to an end (hopefully an input module, write failure of an output module is considered an error).
The sample format should be one of the following strings:
The common format 16 bit signed little endian found in for example 16 bit wav-files is thus "S16_LE". The floating point formats can be in any range, however all integer formats will be scaled to -1.0 to +1.0 internally, so to match an integer format, the floating point range should be -1.0 to +1.0. There is no overflow checking for floating point formats (that is, values larger than +1.0 or smaller than -1.0 are not truncated).
The channels field specifies the number of open and used channels of the device. If the number of open channels exceeds the number of used channels, a slash (/) followed by a comma-separated list of channel indexes of used channels must be appended. If we for example have an eight channel ADAT sound card, but we only want to use the first two, we write 8/0,1 as the channels setting. As you see, the lowest channel index is zero, not one.
The length of the list of names (given after the structure type name)
must match or exceed the number of used channels. If there are more
channels in the head (the logical, or virtual channels) than there are
available through the device, the specified channels must be mapped
onto the physical device channels. This is done with the
mapping field, which simply is a list of indexes, stating which index
in the head maps to which physical device channel. Here is a simplified
example:
output 14,15,16 {
...
channels: 8/5,4;
mapping: 0,1,0;
};
In this example, two channels from the eight channel device are used,
channels with index 5 and 4. The order of the channel indexes matters:
physical channel 5 will now be considered the first (index 0) of the
available physical channels, and 4 the second (index 1). The
mapping field tells how to map the channels called 14, 15
and 16 in the header to those two physical channels. The mapping is in
the same order as the channels in the header, that is 14 is mapped to
physical channel index 0 (which is channel 5 on the eight channel
device), 15 to index 1 (channel 4 on the device), and 16 to index 0,
that is the logical channels 14 and 16 will mix into the same output
on the device. In the standard case, where logical channels are the
same as the amount of channels made available through the
channels field, a mapping specification is not
needed. Then the first logical channel is mapped to the first listed
device channel and so on.
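The mapping example above can be traced with a small Python sketch. It only models the bookkeeping described in the text; the function name resolve_mapping and its arguments are hypothetical and not part of BruteFIR.

```python
def resolve_mapping(names, used_device_channels, mapping=None):
    """Map each logical channel name to a physical device channel.

    names: logical channels from the structure header, e.g. [14, 15, 16]
    used_device_channels: list after the slash in channels, e.g. [5, 4]
    mapping: indexes into used_device_channels, one per logical channel
    """
    if mapping is None:
        # Standard case: first logical channel goes to the first listed
        # device channel and so on, so the counts must match.
        if len(names) != len(used_device_channels):
            raise ValueError("a mapping is needed when the counts differ")
        mapping = list(range(len(names)))
    if len(mapping) != len(names):
        raise ValueError("mapping needs one entry per logical channel")
    return {name: used_device_channels[index]
            for name, index in zip(names, mapping)}


# output 14,15,16 { channels: 8/5,4; mapping: 0,1,0; };
print(resolve_mapping([14, 15, 16], [5, 4], [0, 1, 0]))
```

Logical channels 14 and 16 both resolve to device channel 5, which is where they get mixed, while 15 resolves to device channel 4.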
The list of delays specifies how many samples a channel should be
delayed. This could be used to compensate for speaker positions that
are either too close or too far away. It could also be used to
compensate for acausal filters. Delay can be changed in runtime, if
maxdelay is not set to a negative value. It defines the upper
bound of delay in samples. When the program is started, delay buffers
matching maxdelay are allocated for all channels. If it is negative,
only the precise amount specified by the delay array is allocated.
The setting individual_maxdelay was added later, and works
the same as maxdelay with the difference that it is specified
per channel. It is useful to save memory when there are many channels,
and only some of them need dynamic delay (or considerably larger
buffer than the others).
If the general setting sdf_length is larger than zero, the
subdelay setting will take effect. It specifies the
sub-sample delay per channel in 1/100th of samples (valid range is -99
to 99). This delay can be changed in runtime. To disable sub-sample
delay on a channel, set its sub-delay to a negative value outside the
valid range. Since sub-sample delay consumes CPU time, it is
recommended to only activate it where necessary. Sub-delay filters
add pre-response, and therefore all channels with sub-delay disabled
will be automatically compensated with an I/O delay to make them
aligned.
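As an illustration of how delay and subdelay values could be derived for speaker distance compensation, here is a short Python sketch. The speed of sound of 343 m/s and the helper name distance_to_delay are assumptions made for the example; BruteFIR itself only takes the resulting numbers.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees C


def distance_to_delay(extra_distance_m, sampling_rate):
    """Return (delay, subdelay) for a speaker that is too close.

    delay is in whole samples, subdelay in 1/100th samples (0 to 99).
    """
    if extra_distance_m < 0:
        raise ValueError("distance must be non-negative")
    samples = extra_distance_m / SPEED_OF_SOUND_M_S * sampling_rate
    hundredths = round(samples * 100)
    # Split into whole samples and the remaining 1/100th samples.
    return divmod(hundredths, 100)


# A speaker placed 0.5 m closer than the others, at 44100 Hz
delay, subdelay = distance_to_delay(0.5, 44100)
print(delay, subdelay)
```

With these assumptions the closer speaker needs about 64.29 samples, that is delay 64 and subdelay 29. The subdelay part only takes effect if sdf_length is larger than zero, and for many setups rounding to whole samples is enough.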
The mute list of booleans specifies, in order, which channels should be muted from the beginning. The muted channels can later be unmuted from the CLI.
If the dither flag is set to true, dither is applied on all used channels. Dither is a method to add carefully devised noise to improve the resolution. Although most modern recordings contain dither, they need to be re-dithered after they have been filtered for best resolution. Dither should be applied when the resolution is reduced, for example from 24 bits on the input to 16 bits on the output. However, one can claim that dither should always be applied, since the internal resolution is always higher than the output. When BruteFIR is compiled with single precision, it is not possible to apply dither to 24 bit output, since the internal resolution is not high enough. BruteFIR's dither algorithm is the highly efficient HP TPDF dither algorithm (High Pass Triangular Probability Distribution Function).
If the merge flag is set to true, discontinuities that may occur when coefficients are changed in runtime are smoothed out with a simple merge algorithm. This avoids "clicks" that may occur in the sound when coefficients are changed. Note that discontinuities occur also when volume is changed, but those are not merged, since they are generally not audible or are masked by the volume change itself.
filter <STRING: name | NUMBER: index> {
from_inputs: <STRING: name | NUMBER: index>[/<NUMBER:attenuation in dB>][/<NUMBER:multiplier>][, ...];
from_filters: <same syntax as from_inputs field>;
to_outputs: <same syntax as from_inputs field>;
to_filters: <STRING: name | NUMBER: index>[, ...];
coeff: <STRING: name | NUMBER: index>;
delay: <NUMBER: pre-delay in blocks>;
crossfade: <BOOLEAN: cross-fade when coefficient is changed>;
process: <NUMBER: process index>;
};
The filter structure defines where a filter is placed and what its parameters are. This is done in a BruteFIR filter:
If an output channel exists in several filter structures, the filter outputs will be mixed into that channel. Thus, a set of filter structures defines how inputs and outputs should be copied, mixed and filtered.
With the help of the from_filters and to_filters fields,
filters can be connected to each other. The only real constraint is
that there must be no loops. BruteFIR will detect and point out errors
if such exist in a given filter graph. Note that if possible
coefficients should be pre-convolved rather than put as filters in
series, since a 2N length filter computes much faster than two
cascaded N length filters.
The from_inputs, from_filters and to_outputs fields have the same
syntax. One channel/filter is given as the string name or index
number, and if attenuation should be applied, it is followed by a
slash (/) and attenuation in dB. Instead of, or combined with,
attenuation in dB, a multiplier can be given, a number by which all
samples will be multiplied. The notation "channel
1"/6/-1 means that channel 1 is attenuated 6 dB and the polarity
is changed (multiplication with -1). It is also possible to write
"channel 1"//-0.5 which is approximately equivalent to the first
example, since 6 dB of attenuation corresponds to a factor of about 0.5.
If more than one channel should be included, they are separated with
commas. The to_filters field has the same syntax with the
exception that attenuation is not allowed.
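To see why the two notations come out nearly the same, the attenuation and the multiplier can be combined into a single linear factor. The sketch below assumes the usual amplitude convention for dB (20·log10); the function name gain_factor is hypothetical and not part of BruteFIR.

```python
def gain_factor(attenuation_db=0.0, multiplier=1.0):
    """Linear factor for one 'name/attenuation/multiplier' entry.

    Assumes amplitude dB: an attenuation of A dB scales samples by 10**(-A/20).
    """
    return multiplier * 10.0 ** (-attenuation_db / 20.0)


combined = gain_factor(6.0, -1.0)    # "channel 1"/6/-1
shorthand = gain_factor(0.0, -0.5)   # "channel 1"//-0.5
print(f"{combined:.4f} vs {shorthand:.4f}")
```

Under that assumption the first form gives about -0.5012 and the second exactly -0.5, a difference of roughly 0.02 dB.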
The remaining filter settings are:
coeff: <STRING | NUMBER>;
specifies which coefficient set should be used for
the filter. It could be given as the string name of the set, or as its
index number. If the index number is set to minus one (-1), there will
be no filtering in the filter, it will just mix and copy inputs/outputs
as specified. Note that the length of the coefficient set specifies
how processor intensive the filter will be.
delay: <NUMBER>; specifies how many filter
blocks pre-delay there should be. Zero or negative means no
delay. The maximum allowed delay is one block less than full
length. Thus, with unpartitioned filtering there can be no delay
at all. For this type of blocked delay the cost is zero both in
terms of memory and processing.
crossfade: <BOOLEAN>; if set to true, there will be a
cross-fade when the coefficient is changed in runtime, making the
coefficient change seamless. When the coefficient is changed
(using the CLI for example), the filter will convolve one
block with the old coefficient, fade that out, and mix it with a
faded-in block convolved with the new coefficient. This means that at the
time of coefficient change, there will be roughly twice the amount of
processing for that filter. This processing spike can cause
buffer underflow if running with a sound card and the CPU load is
already heavy in the normal case. If there are for example 10 filters in a
configuration (all with crossfade active), and all coefficients are
changed at the same time, the normal CPU load should not exceed 50%,
since the spike will require roughly twice the load. However, if the
coefficients are changed only one filter at a time, only 10% extra
processing is required compared to the normal case in the example.
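The arithmetic in the 10-filter example can be sketched as follows. This is a rough model only: it assumes all filters cost the same and that a cross-fading filter costs exactly twice as much during the block in which its coefficient changes. The function name peak_load is hypothetical.

```python
def peak_load(normal_load, total_filters, filters_changed):
    """Estimated load during a coefficient change.

    normal_load is the steady-state load as a fraction (0.5 means 50%).
    Each changed filter is assumed to need one extra filter's worth of work.
    """
    if not 0 <= filters_changed <= total_filters:
        raise ValueError("filters_changed must be between 0 and total_filters")
    per_filter = normal_load / total_filters
    return normal_load + per_filter * filters_changed


print(f"all 10 changed at once: {peak_load(0.5, 10, 10):.2f}")
print(f"one changed at a time:  {peak_load(0.5, 10, 1):.2f}")
```

With a normal load of 50%, changing all ten coefficients at once uses the whole processor for that block, while changing one filter at a time raises the load to 55%, that is 10% more than normal.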
process: <NUMBER>; specifies in which thread
(0, 1, 2, etc) the filter should be run. In legacy versions these
were actual separate processes, hence the name. This is used for
manual load balancing. If set to -1 (i.e. left out, as -1 is
normally already set in the default config), an automatic but
naive load balancing will take place, which is good enough for most
applications. With this setting you can hand-tune if you want.
All filters with the same process index will run in the same thread. Process index 0 must exist, and if there are more processes they should be consecutive, 0, 1, 2, 3 and so on. The optimal situation is that there is one filter thread per CPU core, and that each filter thread requires the same processor time. Then you will get most out of your multi-core CPU. There is one limitation of how filters can be distributed between threads: mixing to an output channel or a filter input must be done within the same thread.
Here follows an example of a main configuration file, showing some of the aspects of BruteFIR's possibilities. It implements a cross talk cancellation filter for a stereo dipole. Note that the configuration uses the default settings extensively. For example, no general settings have been specified apart from the addition of the CLI logic module, and in the coeff structures, only the filename field is used.
logic: "cli" { port: 3000; };
coeff "direct path" {
filename: "direct_path.txt";
};
coeff "cross path" {
filename: "cross_path.txt";
};
input "left", "right" {
device: "file" { path: "/disk0/tmp/music.raw"; };
sample: "S16_LE";
channels: 2;
};
output "stereo dipole left", "stereo dipole right" {
device: "file" { path: "output01.raw"; };
sample: "S16_LE";
channels: 2;
};
filter "left speaker direct path" {
inputs: 0/6.0;
outputs: 0;
coeff: "direct path";
};
filter "left speaker cross path" {
inputs: "right"/6.0;
outputs: "stereo dipole left";
coeff: "cross path";
};
filter "right speaker direct path" {
inputs: "right"/6.0;
outputs: "stereo dipole right";
coeff: "direct path";
};
filter "right speaker cross path" {
inputs: "left"/6.0;
outputs: "stereo dipole right";
coeff: 1;
};
I/O modules are used to provide sample input and output for the BruteFIR convolution engine. It is entirely up to the I/O module how to produce input samples or store output samples. It could for example read input from a sound card, a file, or simply generate noise from a formula.
In the BruteFIR configuration file, an I/O module is specified in each input and output structure.
The purpose of having I/O modules instead of building all functionality directly into BruteFIR is that it should be easy to extend with new functionality, without compromising the core convolution engine.
All I/O modules have the extension ".bfio".
The ALSA I/O module (named "alsa") is used to read and write samples from/to sound cards using the low-level Linux sound system.
Using the "hw" devices is the closest integration with the hardware, and is therefore the recommended way to integrate BruteFIR on systems where there is no need to have a sound server like JACK or PipeWire running. On a desktop computer with a sound server running you generally cannot use the "hw" device, as it is already in use by the sound server.
In addition to the hardware devices, there are usually one or more virtual devices made available through ALSA but these do not provide direct access to the underlying sound hardware. On a desktop computer where the sound hardware is shared by many applications it may be useful to use these devices, but if you use a sound server like JACK or PipeWire it's better to use the provided dedicated I/O module instead.
The module has the following configuration options:
device: <STRING>; (required), the sound
device to use. Preferably use a "hw" (direct hardware access) device
whenever possible. Use for example the "arecord -L" ALSA tool to list
available input devices and "aplay -L" to list available output
devices. A typical name looks like this: "hw:CARD=Device,DEV=0",
but devices can also be indexed by numbers like this: "hw:2,0,0". Devices
can also be found directly under /proc/asound.
ignore_xrun: <BOOLEAN>; default
false. By default, when reading fails due to an overflow, or writing fails
due to an underflow, BruteFIR will abort. If your computer is heavily
loaded, and/or partitions are short, and/or other services are running
on the computer, over/underflow can occur occasionally. In those
cases, one might prefer occasional clicks in the sound over
a total stop. By setting this to true, the ALSA I/O module will hide
over/underflow from BruteFIR, and thus it will not abort when that
occurs.
link: <BOOLEAN>; default true, which
links all ALSA devices together (inputs and outputs) such that
they start in a synchronized fashion. Only disable this if there is
a problem at startup.
access: <STRING>; force an access mode
instead of using automatic. Automatic works in almost all cases
except for quirky drivers like the ALSA loopback driver. The
following modes exist:
In addition to this, the common sample format and channels settings
affect how the sound device is opened. For example, if you want to use
only channels 3 and 4 of an 8 channel sound card you set channels:
8/3,4. All BruteFIR sample formats that are also supported by
the selected device can be used.
If you use ALSA for both input and output, you need to make sure the devices used have the same clock source, or else they may drift apart over time causing underflow/overflow.
The ALSA Loopback device can be used for simple integration cases, for example if you want a streaming audio application to feed sound to BruteFIR, which then sends it to the output. In most cases, however, it is easier to use JACK I/O if BruteFIR needs to integrate with other audio applications.
The JACK I/O module (named "jack") provides BruteFIR with support for the low-latency JACK audio server. If BruteFIR needs to exist together in a system of other audio applications, using JACK is the preferred way to handle I/O.
To avoid putting I/O-delay into the JACK graph, the JACK buffer size should be set to the same as the BruteFIR partition size. It is however possible to set the JACK buffer size to a smaller value. The I/O-delay in number of JACK buffers as seen by following JACK clients will be:
2 * <BruteFIR partition size> / <JACK buffer size> - 2
Note that both the JACK buffer size and the BruteFIR period size are always powers of two.
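The delay formula above can be evaluated for a few buffer sizes with a small helper. This is only a sketch of the stated formula; the function name is hypothetical, and rejecting a JACK buffer larger than the partition is an assumption, since the text only covers equal or smaller buffer sizes.

```python
def jack_io_delay_buffers(partition_size, jack_buffer_size):
    """I/O-delay in JACK buffers: 2 * partition / buffer - 2."""
    for name, value in (("partition_size", partition_size),
                        ("jack_buffer_size", jack_buffer_size)):
        # A positive power of two has exactly one bit set.
        if value <= 0 or value & (value - 1):
            raise ValueError(f"{name} must be a power of two")
    if jack_buffer_size > partition_size:
        # Assumption: larger JACK buffers are not described in the text.
        raise ValueError("jack_buffer_size must not exceed partition_size")
    return 2 * partition_size // jack_buffer_size - 2


for jack_size in (1024, 512, 256):
    print(jack_size, jack_io_delay_buffers(1024, jack_size))
```

With a 1024 sample partition this gives 0 buffers of delay for a 1024 sample JACK buffer, 2 for 512, and 6 for 256.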
The sample format for the JACK device should be set to AUTO,
which will be the JACK sample format (floating point).
Here are the settings:
ports: <STRING: port name>[/<STRING: custom local
port name>][, ...]; used to connect to ports at startup.
If no ports should be connected, and the rest is left at defaults,
the JACK device clause is empty ("jack" { };).
Examples: "jack" { ports: "alsa_pcm:capture_1", "alsa_pcm:capture_2"; }
for input, and "jack" { ports: "alsa_pcm:playback_1",
"alsa_pcm:playback_2"; } for output. The port list must have as many
entries as the device has channels. However,
empty strings could be used if a specific channel index should not be
connected, for example: "jack" { ports: "", "alsa_pcm:capture_2";
} will only connect the second port.
If there is no custom naming, the names of the BruteFIR ports will be
"brutefir:input-N" for the inputs and "brutefir:output-N" for the outputs, where
N is the channel index. Custom names are given like this: "jack" { ports:
"alsa_pcm:capture_1"/"in-A"; }, that is adding a slash and
specifying a name after that, this will replace the default "input-N"
for inputs and "output-N" for outputs. If a port should not be
connected but still be named, the first string is empty, like this:
"jack" { ports: ""/"in-A"; }.
clientname: <STRING>; optionally override the
default JACK client name, which will be "brutefir". It is a global
setting, and if used it must be set in the first JACK device
clause (the first from the top in the configuration file). The
clientname will change the port name prefix as well (the prefix is
the client name). If multiple BruteFIR instances should be run,
they must have different client names, or else the port names will
collide.
priority: <NUMBER>; optionally specify what
realtime priority the JACK callback thread will be running
at. BruteFIR relates its own thread priorities to that to make
realtime scheduling work ideally and needs to know it in
advance. It makes a guess; if the guess is wrong, it will print a warning
stating which value you need to set.
In the example configuration below, we set a custom client name to override the default "brutefir", and we connect both input ports to alsa PCM. On the output we connect only the second port, but we give both ports custom names instead of the default "output-0" and "output-1".
input "left", "right" {
device: "jack" {
clientname: "my-custom-clientname";
ports: "alsa_pcm:capture_1", "alsa_pcm:capture_2";
};
sample: "AUTO";
channels: 2;
};
output "left", "right" {
device: "jack" {
ports: ""/"my-output-A",
"alsa_pcm:playback_2"/"my-output-B";
};
sample: "AUTO";
channels: 2;
};
On current general-purpose Linux desktop computers PipeWire is the most popular low-level server for both audio and video that lets multiple applications share the same sound hardware.
This I/O module is the likely candidate to use if you want to run BruteFIR on your regular desktop computer without any changes. If you make a dedicated computer for audio processing you should use either ALSA I/O, if BruteFIR doesn't need to integrate with other audio applications, or JACK I/O if it does.
On an interface level PipeWire works the same way as JACK, and the I/O module has the exact same configuration as the JACK I/O module, just replace "jack" with "pipewire" and connect to PipeWire ports either at startup, or connect later using PipeWire graph tools.
The raw PCM file I/O module (named "file") is used to read and write samples from/to files. It supports all BruteFIR sample formats and reads/writes them directly in raw form, interleaved format.
path: <STRING>; (required) path to the
filename to be opened, either absolute path or relative to the
work directory.
skip: <NUMBER>; (input only) skip the given
number of bytes at the beginning of the file. Set to 44 to
skip the header of classic .wav files.
loop: <BOOLEAN>; (input only) loop the file
over and over again.
append: <BOOLEAN>; (output only) if true, an existing file
is appended to rather than overwritten.
text: <BOOLEAN>; read/write text file instead
of raw format. The format is N floating point ASCII values per
line separated with whitespace, where N is the number of channels.
The module will convert to/from 64 bit floating point, and thus
requires that sample format (or use AUTO).
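The value 44 for skip applies to the classic minimal .wav layout. Files with extra chunks have a longer header, so it can be worth checking where the sample data starts before relying on it. The sketch below walks the RIFF chunks and returns the byte offset of the sample data; the function name pcm_data_offset is hypothetical and the code is independent of BruteFIR.

```python
import struct


def pcm_data_offset(path):
    """Return the byte offset of the first sample in a RIFF/WAVE file."""
    with open(path, "rb") as f:
        head = f.read(12)
        if len(head) < 12 or head[:4] != b"RIFF" or head[8:12] != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no data chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"data":
                return f.tell()  # samples begin right after the chunk header
            # Skip this chunk; chunks are padded to an even number of bytes.
            f.seek(chunk_size + (chunk_size & 1), 1)
```

If the returned offset is 44, the file has the classic header and skip: 44; is appropriate. Otherwise the returned number is the value to skip.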
Here's a trick: by using /dev/stdin on input
and /dev/stdout on output, BruteFIR will read data from
standard input and write to standard output, so you can then use it in
pipes on the command line, like this:
ogg123 -d raw -f - test.ogg | brutefir stdio_config | aplay -f S16_LE -c2 -r48000
The input/output configuration for the example above would be:
input "left", "right" {
device: "file" { path: "/dev/stdin"; };
sample: "S16_LE";
channels: 2;
};
output "left", "right" {
device: "file" { path: "/dev/stdout"; };
sample: "S16_LE";
channels: 2;
};
For this the source code is the documentation. There are two types of modules: regular modules, which read/write from/to file descriptors, like the ALSA and File I/O modules, and callback-based modules, like the JACK and PipeWire modules.
The CLI logic module (named "cli") provides a command line interface available through telnet, a local socket, a pipe, or a serial line. The CLI is used for changing settings in runtime, which is of course only suitable when BruteFIR is used in realtime. It can be used interactively by hand, for example by connecting to it through telnet. It is also suitable for scripting BruteFIR, or using it as a means of inter-process communication if BruteFIR is used as the convolution engine for another application.
port: ...; specifies how to connect to the CLI. There
are these variants to cover different needs:
port: <INTEGER: TCP port number>; the CLI will
listen on the given port number for incoming telnet clients.
port: <STRING: "/dev/" ...>; when the string starts
with "/dev/" the CLI assumes that a serial device (such as "/dev/ttyS0" on
Linux) is pointed out, and opens it as a serial port with the default
line speed of 9600 baud, unless the line_speed setting
specifies another speed.
port: <STRING: name of local socket>; any other
string not starting with "/dev/" is handled as the file name for a
local socket, and the CLI will create and listen for incoming
connections on the given path. If the path exists, it will be
replaced.
port: <INTEGER: read end file descriptor>, <INTEGER:
write end file descriptor>; the CLI will assume that the given
file descriptors are already opened and ready for use, and will attach
the read end to CLI input, and the write end to CLI output. This
interface is suitable as inter-process communication when BruteFIR is
integrated into another program, and is started through fork() and
exec().
script: <STRING>; CLI script to run, see
examples further down. If the script is set, the port cannot be
set.
echo: <BOOLEAN>; set to true to echo back
the commands. If telnet is used, this is not required, and it is off
by default.
line_speed: <NUMBER>; optionally set a custom
line speed for serial line interface. Only used if port specifies
a serial port.
The CLI does not have much terminal functionality to speak of, and is thus a bit cumbersome to use interactively. It reads a whole line at a time, and can interpret backspace, but that is about it.
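For scripting against the TCP variant of the port setting, a plain socket is usually enough. The following is a minimal sketch, not a tested client: it assumes the CLI accepts a newline-terminated line, and it simply collects whatever text arrives (prompt and any greeting included) until the connection goes quiet or closes. The function name cli_send is hypothetical.

```python
import socket


def cli_send(host, port, line, timeout=1.0):
    """Send one command line to a CLI on a TCP port and return the reply text."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii") + b"\n")
        try:
            while True:
                data = sock.recv(4096)
                if not data:          # peer closed the connection
                    break
                chunks.append(data)
        except socket.timeout:        # nothing more arrived within the timeout
            pass
    return b"".join(chunks).decode("ascii", errors="replace")


# Example, requires a running BruteFIR configured with "cli" { port: 3000; }:
#   print(cli_send("localhost", 3000, "lf"))
```

Several statements separated by semicolons can be passed in the same line, in which case they are sent together as one line.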
Instead of specifying a port, one can specify a string of commands,
which will be run in a loop as a script. Example: "cli" { script:
"cfc 0 0;; sleep 10;; cfc 0 1;; sleep 10"; }. The script may span
several lines. Each line is carried out atomically (this is also true
for command line mode), so if there are several commands on a single
line, separated with semicolon, they will be performed atomically (an
atomic set of statements). The exception is when an empty statement is
put in the line (just a semicolon), like in the script example, this
will work as a line break, and thus separate atomic sets of
statements.
A typical use for atomic set of statements is to change filter coefficients and volume at the same time.
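When scripts are generated by another program, the separator rules can be captured in two small helpers: a single semicolon joins statements into one atomic set, and an empty statement (two semicolons) separates atomic sets. This is a sketch with hypothetical function names; the commands used are taken from the CLI help text.

```python
def atomic_set(*statements):
    """Join statements so that they are carried out atomically."""
    for statement in statements:
        if not statement.strip() or ";" in statement:
            raise ValueError(f"invalid statement: {statement!r}")
    return "; ".join(statements)


def script(*atomic_sets):
    """Join atomic sets with an empty statement, which acts as a line break."""
    return ";; ".join(atomic_sets)


# Change coefficient and output attenuation in the same atomic set.
print(atomic_set("cfc 0 1", "cfoa 0 0 3.0"))
# Reproduce the script example from the text.
print(script("cfc 0 0", "sleep 10", "cfc 0 1", "sleep 10"))
```

The second call prints cfc 0 0;; sleep 10;; cfc 0 1;; sleep 10, which is the script string used in the example above.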
The sleep function in the CLI allows for sleeping in seconds,
milliseconds or blocks. One block is exactly the filter length in
samples, and if partitioned, it is the length of the partition. Block
sleep can only be used in script mode.
When in script mode, the first atomic statements will be executed just before the first block is processed, then the block is processed (and sent to the output), and then the next set of atomic statements is run. That is, each set of atomic statements is performed before the corresponding block is processed. The next atomic statement set is not performed until the next block is about to be processed.
The block sleep command (only works in script mode) works such that
the sleep is commenced at the next block. The statement
sleep b1; will thus cause the next block to be
skipped. Note that since one block passes for each atomic statement
set, a single line with only sleep b1; will skip two
blocks, not one, since one block is consumed when parsing the sleep
command, and the other is skipped by the sleep duration. That is to
skip only one block, either use sleep b0; alone, or use
sleep b1 as the last statement together with other
statements in an atomic statement set (recommended).
Sleep in seconds and milliseconds will start the timer when the command is issued (at the start of the block if in a script), and continue with the next command after at least the given time has passed. If run in a script, the timer is polled at the start of each block, and the next command is then executed at the start of the first block where the timer has expired.
If several sleep commands are executed in the same atomic statement set in a script, only the last will take effect, and will be executed only when all other commands in the set have been processed. To avoid confusion, it is thus recommended to employ sleep commands either alone, or as the last in the atomic statement set.
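Since timed sleeps in a script are only checked at block boundaries, it can help to know how long a block is and how many blocks a given sleep will span. The sketch below is a simplification of the polling described above: it only rounds the sleep time up to whole blocks and does not model the block consumed by the statement itself. The function names are hypothetical.

```python
import math


def block_duration(block_len, sample_rate):
    """Length of one block in seconds (block_len is the partition length)."""
    return block_len / sample_rate


def blocks_for_timed_sleep(sleep_seconds, block_len, sample_rate):
    """Whole blocks that pass before a timed sleep is seen as expired."""
    return math.ceil(sleep_seconds / block_duration(block_len, sample_rate))


print(f"{block_duration(1024, 44100) * 1000:.1f} ms per block")
print(blocks_for_timed_sleep(0.3, 1024, 44100), "blocks for 'sleep 0 300'")
```

At 44.1 kHz with 1024 sample blocks, one block lasts about 23.2 ms, so a 300 millisecond sleep spans 13 blocks under this model.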
When connected and you type "help" at the prompt, you will get the following output:
Commands:
lf -- list filters.
lc -- list coefficient sets.
li -- list inputs.
lo -- list outputs.
lm -- list modules.
cfoa -- change filter output attenuation.
cfoa <filter> <output> <attenuation|Mmultiplier>
cfia -- change filter input attenuation.
cfia <filter> <input> <attenuation|Mmultiplier>
cffa -- change filter filter-input attenuation.
cffa <filter> <filter-input> <attenuation|Mmultiplier>
cfc -- change filter coefficients.
cfc <filter> <coeff>
cfd -- change filter delay. (may truncate coeffs!)
cfd <filter> <delay blocks>
cod -- change output delay.
cod <output> <delay> [<subdelay>]
cid -- change input delay.
cid <input> <delay> [<subdelay>]
tmo -- toggle mute output.
tmo <output>
tmi -- toggle mute input.
tmi <input>
imc -- issue input module command.
imc <index> <command>
omc -- issue output module command.
omc <index> <command>
lmc -- issue logic module command.
lmc <module> <command>
sleep -- sleep for the given number of seconds [and ms], or blocks.
sleep 10 (sleep 10 seconds).
sleep b10 (sleep 10 blocks).
sleep 0 300 (sleep 300 milliseconds).
abort -- terminate immediately.
tp -- toggle prompt.
ppk -- print peak info, channels/samples/max dB.
rpk -- reset peak meters.
upk -- toggle print peak info on changes.
rti -- print current realtime index.
quit -- close connection.
help -- print this text.
Notes:
- When entering several commands on a single line,
separate them with semicolons (;).
- Inputs/outputs/filters can be given as index
numbers or as strings between quotes ("").
Most commands are simple and don't need to be further explained. Naturally, any change will lag behind by as much as the I/O delay. The exceptions are the mute and change delay commands, which will lag behind by as much as the period size of the sound card, which most often is smaller than the program's total I/O delay. However, when there is a virtual channel mapping, the mute and delay will be lagged by the full I/O delay as well.
The imc, omc and lmc commands are used to
give commands to I/O modules and logic modules in run-time. To find
out which modules are loaded and which indexes they have, use the
command lm. Not all modules support run-time commands though.
Changing attenuations with cffa, cfia and
cfoa can be done with dB numbers or simply by giving a
multiplier, which then is prefixed with m, like this cfoa
0 0 m-0.5. Changing the attenuation with dB will not change the sign
of the current multiplier.
The equalizer logic module takes control over one or more coefficient sets, and renders equalizer filters to them, as specified by the user. This can be done in the initial configuration, and also updated in runtime, through the CLI.
Here is an example configuration with two equalizers, "eq-0" (which is double-buffered) and "eq-1":
filter_length: 1024,16;
logic:
"eq" {
debug_dump_filter: "/tmp/rendered-%d";
{
coeff: "eq-0", "eq-0-buf"; # double buffered
bands: "ISO octave";
magnitude: 31.5/-3.2, 125/8.5;
phase: 31.5/3.2;
};
{
coeff: "eq-1";
#bands: "ISO octave";
#bands: "ISO 1/3 octave";
bands: 20, 100, 200, 500;
magnitude: 20/-3.2, 100/8.5;
phase: 20/0, 100/180;
};
},
"cli" {
port: 3000;
};
coeff "eq-0" { filename: "dirac pulse"; };
coeff "eq-1" { filename: "dirac pulse"; blocks: 4; };
coeff "eq-0-buf" { filename: "dirac pulse"; };
filter "a" {
from_inputs: "left";
to_outputs: "left";
coeff: "eq-0";
crossfade: true;
};
filter "b" {
from_inputs: "right";
to_outputs: "right";
coeff: "eq-1";
};
If you want to analyze the rendered filters, the
debug_dump_filter setting specifies a file name where the
rendered coefficients will be written. It must contain %d, which will
be replaced by the coefficient index. Then follow the equalizers. Each
specifies which coefficient index (or name) it should render the
equalizer filter to. These must be allocated. If blocks is not
specified, the coefficient set will be as long as the filter length, so
if you want it shorter you need to specify that, for example like this:
coeff 0 {
filename: "dirac pulse";
blocks: 4;
};
The dirac pulse will be replaced by the rendered filter. Each equalizer has a set of frequency bands (max 128), they can be manually specified, or use the ISO octave band presets. Optionally, magnitude (in dB) and phase (in degrees) settings can be specified. The frequency value must then match one of the given bands.
If you specify two filters, the rendering will be double-buffered, meaning that the eq module will keep one coefficient active in the filter(s), and render to the other, and switch when ready. This means that there is no risk of playing an incomplete equalizer, which can cause some noise (usually in the form of a beep), thus it is recommended to use double-buffered mode if the equalizer will be altered in runtime. In the filter configuration and when referring to the equalizer in the CLI, the first of the two coefficients should then be used.
In run-time, equalizers can be modified through the CLI. An example:
lmc eq 0 mag 20/-10, 4000/10 will set the magnitude to -10 dB
at 20 Hz and +10 dB at 4000 Hz for equalizer for coefficient 0. Instead
of mag, phase can be given. The command lmc eq
"eq-1" info will list the current settings for the equalizer
stored in the coefficient called "eq-1", example output:
> lmc eq "eq-0" info
coefficient 0,2:
band: 31.5 63.0 125 250 500 1000 2000 4000 8000 16000
mag: -3.2 0.0 8.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0
phase: 3.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
> lmc eq "eq-0" mag 500/-30
ok
> lmc eq "eq-0" info
coefficient 0,2:
band: 31.5 63.0 125 250 500 1000 2000 4000 8000 16000
mag: -3.2 0.0 8.5 0.0 -30.0 0.0 0.0 0.0 0.0 0.0
phase: 3.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
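When the equalizer is driven from another program, the command line can be assembled and checked against the band list before it is sent. This is a sketch under the assumption that the command syntax is exactly as in the examples above; the function name eq_command is hypothetical, and the band list is copied from the "info" output shown for the "ISO octave" preset.

```python
ISO_OCTAVE_BANDS = [31.5, 63.0, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]


def eq_command(coeff, kind, settings, bands):
    """Build an 'lmc eq' line; settings maps frequency to dB or degrees."""
    if kind not in ("mag", "phase"):
        raise ValueError("kind must be 'mag' or 'phase'")
    for freq in settings:
        if freq not in bands:
            # Frequencies must match one of the equalizer's bands.
            raise ValueError(f"{freq} Hz is not one of the bands")
    # Coefficient names are quoted, indexes are written as plain numbers.
    target = f'"{coeff}"' if isinstance(coeff, str) else str(coeff)
    pairs = ", ".join(f"{freq:g}/{value:g}" for freq, value in settings.items())
    return f"lmc eq {target} {kind} {pairs}"


print(eq_command("eq-0", "mag", {500: -30}, ISO_OCTAVE_BANDS))
print(eq_command(0, "mag", {20: -10, 4000: 10}, [20, 100, 200, 500, 4000]))
```

The first call produces the same line as the "mag 500/-30" command in the session above; a frequency that is not in the band list raises an error instead of being sent.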
The more heavily loaded the computer is by convolution, the longer time it will take to render the new equalizer (usually negligible on modern CPUs). If the coefficient set it renders to is very short, and the magnitude and phase response is very detailed (sharp edges etc) it will not be able to adapt to it fully.
This is documented through the source code. The CLI and EQ modules are good templates if you want to make your own.
BruteFIR calculates a "realtime index" which can be shown through the
CLI, or will be printed periodically to the terminal if the
show_progress flag is true. The realtime index is a floating point
value. When it is 1.0, 100% of the available processing power must be
used at all times to be able to achieve realtime performance. If it is
larger than 1.0, it means that with the current configuration,
BruteFIR will not manage realtime performance.
When BruteFIR was first released back in 2001 and used in complex auralization applications, processing power was often a limiting factor. Today, with modern CPUs and typically simpler applications, BruteFIR may need less than one percent of the available processing power. That is, in most cases you will not need to give much consideration to the realtime index. But if you run the software on an embedded platform and with long filters, it may still be relevant.
If your configuration is too demanding for realtime, you should shorten the filters (or remove channels) until the realtime index is very close below 1.0, perhaps 0.95. This way you make full use of your computer. However, if you have multiple CPU cores as is typical today, it is not as simple. The realtime index will show how much is needed from the most loaded processor. BruteFIR will load-balance automatically, but you can also do it manually using the "process" setting on the filter structures. So, devise your configuration carefully if you have multiple processors. The number of input and output channels and the filter length is what steals processor time. The number of filters, dither, delay, mixing and attenuation is very cheap in comparison.
When testing with realtime indexes above 1.0, inputs and outputs must of course be files. For performance testing, you could use "/dev/zero" for input and "/dev/null" for output. Also note that it takes some time for the index to stabilize.
The realtime index typically matches the processor load, if running with a sound card. However, if input poll mode is employed, real time index can be considerably lower than the processor load, since input polling is performed in the spare processor time.
If you use digital input and output, you may get problems if the sound card is not configured properly. It is very important that the input and output sample clocks use the same clock as reference. Otherwise, micro-differences between the input and output sample clocks will cause BruteFIR's I/O buffers to slide apart, and eventually make the program stop. Usually there is an option to set the digital sound card's sample clock to 'slave'.
If you have analog input or output or both, you cannot get this problem (unless you use several different sound cards, then it will fail due to differences in clocking).
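To get a feeling for how quickly unsynchronized clocks become a problem, the drift can be estimated from the relative clock difference. The numbers below (a 50 ppm difference and 4096 samples of slack) are hypothetical illustrations, not BruteFIR properties; how much slack the buffers actually have depends on the configuration. The function name is hypothetical as well.

```python
import math


def seconds_until_slip(sample_rate, clock_diff_ppm, margin_samples):
    """Time until two free-running clocks have drifted margin_samples apart."""
    drift_per_second = sample_rate * abs(clock_diff_ppm) / 1_000_000
    if drift_per_second == 0:
        return math.inf  # perfectly locked clocks never drift apart
    return margin_samples / drift_per_second


t = seconds_until_slip(48000, 50, 4096)
print(f"{t / 60:.1f} minutes until the buffers have slid {4096} samples apart")
```

Under these assumptions the buffers slide 4096 samples apart in under half an hour, which is why a shared reference clock is needed for anything but short runs.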
Digital sound cards that work in slave mode allow the sample clock to be changed in runtime. Usually, this is not what one wants for BruteFIR, since the filters are designed for only one sample rate. Therefore BruteFIR can be configured, with the "monitor_rate" setting, to exit if it detects a sample clock different from the one given in the configuration file.