Compressing voice recording the right way

6th Jan 2019 00:04 #1
abolibibelot

Member

Join Date
Apr 2015

Location
France
I have a video recording of a conversation with significant variations in volume, depending on how far each speaker was from the camera, including the sorry soul who was holding it. I would like to homogenize the whole thing but keep it lively, while avoiding any clipping and, if possible, avoiding an increase in the noise/hiss (there's not much of it as it is, but it could become an issue if gain is applied globally).

I first tried within the Magix NLE, using the included compressor, but even with the highest setting it doesn't reduce the loud passages, or the discrepancies between loud and quiet passages, as much as I'd like, and it produces some clipping. In fact it seems to increase the volume of everything, even though loud passages are increased proportionally less.
I then tried processing that section with the accompanying standalone audio editor, Magix Music Editor, which offers more control over the compression parameters, but the result is actually worse: even at medium settings it produces a lot of clipping. There's a limiter which can prevent the clipping, but it doesn't seem to be the cleanest way of processing audio.

I tried Audacity's internal compressor, which works well at reducing loud passages in “RMS” mode, but then applying gain will automatically increase the noise; I also tried the “compression based on peaks” method, which seems to increase the volume of everything, which is not what I want. And according to this article that compressor is too rudimentary; the author recommends using the SC4 plugin instead. I tried SC4, but it doesn't seem satisfactory either: it does reduce loud passages, but it leaves some very high peaks untouched (and it's quite tricky to set the parameters, as it doesn't have a curve representation like the internal compressor). The “leveler” effect seems excellent when looking at the curve (it does reduce the discrepancies, increasing the volume in quiet parts and reducing the volume in loud parts, and doesn't seem to touch the silent parts), but when listening to the result it's just ugly, with a lot of distortion.

detail of the waveform before applying SC4

detail of the waveform after applying SC4 with:
RMS/peak = 0, attack time = 5ms, release time = 120ms, threshold = -18dB, ratio = 4, knee radius = 3.5, makeup gain = 0

What reliable method could I use to achieve the intended result ?
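
For reference, the static behaviour implied by the settings listed above can be sketched in a few lines of Python. This is a textbook soft-knee downward compressor curve, not SC4's actual code; the function name is made up for illustration, and treating the knee radius as 3.5 dB on either side of the threshold is an assumption.

```python
def compressed_level_db(level_db, threshold_db=-18.0, ratio=4.0, knee_radius_db=3.5):
    """Output level (dB) of a textbook soft-knee compressor for a steady input level (dB).

    Illustrative only: the knee is assumed to span threshold +/- knee_radius_db.
    """
    over = level_db - threshold_db
    if over <= -knee_radius_db:
        return level_db  # below the knee: signal is left untouched
    if over >= knee_radius_db:
        return threshold_db + over / ratio  # above the knee: full ratio applies
    # inside the knee: quadratic blend between the two straight segments
    knee_width = 2.0 * knee_radius_db
    return level_db + (1.0 / ratio - 1.0) * (over + knee_radius_db) ** 2 / (2.0 * knee_width)


for level in (-30.0, -18.0, -6.0, 0.0):
    print(level, "->", round(compressed_level_db(level), 3))
```

With makeup gain at 0, nothing below roughly -21.5 dB is changed and a full-scale peak only comes down to -13.5 dB, so these settings never raise the quiet voices. The curve also says nothing about timing: very short peaks may pass before the 5 ms attack has taken hold.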

Last edited by abolibibelot; 6th Jan 2019 at 00:10.

6th Jan 2019 01:02 #2
Cornucopia

Member

Join Date
Oct 2001

Location
Deep in the Heart of Texas
Hard to know what settings/tools are needed without hearing a clip.

But generally, here are a few things you must know:
1. As far as compressor & limiter & other filters are concerned, Noise is just another part of the same signal. Depending on your vocal differences, it may be easy, moderate, difficult, or impossible to separate the noise from the vocals, and if it is still there to any noticeable extent, it will be very noticeable when it fluctuates along with the vocal changes. This is known commonly as "pumping".
2. You may want to use a combo of compressor and limiter, possibly even with an expander/gate added in to improve the noise situation. Having not heard it, I cannot verify your noise and/or ambience/background sound levels, but they are OFTEN at a much higher signal level than is assumed, which makes the job harder.
3. You may need to have multiple tools working in concert on differing sections of the dynamic range spectrum (with different ratios of compression), or you may need to break up the signal into different eq bands and have them be worked on differently. This last part can do wonders for perception of noise reduction vs. wanted signal.
4. Often, the pumping described earlier can be mitigated by using longer attack & release times (esp. release). Play with that.
5. These are all best done using realtime live feedback. You may not have access to the best tools for the job.
6. Regardless, if done well, the treatment should NEVER result in distortion/clipping. That's a strong indicator the settings are wrong.
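
To make point 4 concrete, here is a minimal Python sketch of how attack and release times are commonly modelled: a one-pole smoother applied to the gain reduction a compressor wants at each sample. The function name and the tiny 1000 Hz sample rate are illustrative assumptions, not taken from any of the tools mentioned in this thread.

```python
import math


def smooth_gain_db(target_gain_db, sample_rate, attack_ms, release_ms):
    """Smooth a per-sample gain-reduction curve (dB, values <= 0) with attack/release."""
    attack_coeff = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release_coeff = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    smoothed = []
    current = 0.0  # start with no gain reduction
    for target in target_gain_db:
        # more reduction requested -> follow at attack speed, otherwise at release speed
        coeff = attack_coeff if target < current else release_coeff
        current = coeff * current + (1.0 - coeff) * target
        smoothed.append(current)
    return smoothed


# A loud burst asking for 12 dB of reduction, followed by quiet speech.
wanted = [-12.0] * 100 + [0.0] * 200
fast = smooth_gain_db(wanted, sample_rate=1000, attack_ms=5, release_ms=120)
slow = smooth_gain_db(wanted, sample_rate=1000, attack_ms=5, release_ms=500)
print(round(fast[219], 2), round(slow[219], 2))
```

The longer release lets the gain recover more gradually after the loud burst, so the background noise rises more slowly; the trade-off is that a quiet voice right after a loud one stays turned down for longer.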

Scott

6th Jan 2019 04:23 #3
abolibibelot

Member

Join Date
Apr 2015

Location
France
Thanks for the quick reply. Here are two short samples :
20151224 partie 20131224 sans aucun traitement - sample.flac
20151224 partie 20131224 sans aucun traitement - sample 2.flac
(The second one is cut around a transition between two sections of footage, all voices are distinctly louder in the second one, starting at 00m10s.)

1) Audacity's compressors have a noise level setting, but I don't quite understand what it does, especially in the default “RMS” mode, which reduces loud sounds and doesn't touch the rest unless “Compensate gain to 0dB” is checked, but then the gain most likely affects the whole selection.
2) As I said, there's not much audible noise on the raw recording, and the Dehisser / Denoiser included with the NLE do a good job of reducing it, but it might become more audible, and tougher to reduce satisfactorily, after applying compression / gain.
3) I was hoping that a simple kind of processing would do the trick, with the right settings... although I'm willing to try something more sophisticated if it significantly improves the result. Magix Music Editor has a “Multi Max” feature which, as far as I understand, acts as a 3-band compressor.
4) Sometimes a loud voice (close to the mics) and a quiet voice (more distant) are heard almost simultaneously, so isn't it a bad idea to use a longer release time in this case?
5) What do you mean by realtime live feedback ?
6) With Magix Music Editor, activating the compressor with the medium preset, and the limiter disabled, I got this (second waveform) :

Last edited by abolibibelot; 6th Jan 2019 at 04:41.

6th Jan 2019 15:54 #4
davexnet

Member

Join Date
Mar 2008

Location
United States
I tried your second clip in Sound Forge: threshold -14.8 dB, ratio 1.8:1, no post gain. Is this closer to what you're looking for?

Attached Thumbnails

7th Jan 2019 00:59 #5
abolibibelot

Member

Join Date
Apr 2015

Location
France
It seems close to what I get with Audacity's compressor and similar settings of threshold = -15dB, ratio = 2:1, in RMS mode, attack and release both at 0.40ms, no post gain (settings are less accurate : granularity of 1 unit for threshold, 0.5 for ratio).

The peaks are more reduced on your screenshot, but the general shape of the waveform is still distinctly “bulkier” on the second segment. If I apply a gain of -5dB to the second segment, the relative peaks are preserved, while the amplitude of the whole waveform is reduced.
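
As a small illustration of why a plain gain change keeps the relative peaks, here is the arithmetic in Python. The helper names are hypothetical; the only fact relied on is the usual amplitude relation, factor = 10^(dB/20).

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Scale every sample by the same factor, so the waveform keeps its shape."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]


print(round(db_to_linear(-5.0), 4))  # amplitude factor for a -5 dB gain
```

Because every sample is multiplied by the same number (about 0.56 for -5 dB), the ratio between a laugh's peak and the surrounding speech is unchanged, unlike with a compressor, which turns down only what exceeds the threshold.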

I'm not sure how this translates sonically, but I would suspect that keeping the relative peaks makes the recording livelier: for instance, if someone bursts out laughing, a little peak would be expected, even if that person is further from the microphones, right? Yet if that peak is above the threshold, a compressor with that kind of setting will reduce it just like it reduces a regular talking segment recorded at closer range. If it were just for that part, which is globally at a higher level than the rest, I could just apply a negative gain to it, or a positive gain to all the rest, but as the 1st sample illustrates, the main issue is when people are talking simultaneously at wildly different loudness levels.

At first, as I wanted to avoid the fuss of doing extra audio processing outside of the NLE (which has to be done all over again if there's the slightest change in the editing; as a matter of fact I made a small change in this part just yesterday, at a very very late stage in this project), I tried to lower the volume of the most annoyingly loud voice (which was, well, mine, as I was holding the damn recording device, and mumbling something once in a while, when pressed for a reply, like: “NO THANKS, NO SUGAR, THAT'S FINE WITH NO SUGAR”, ’cuz why take somethin’ the body doesn't need right now?!), using the volume curve, but obviously this is neither elegant nor practical.

Last edited by abolibibelot; 7th Jan 2019 at 02:07.

7th Jan 2019 02:46 #6
amaipaipai

Member

Join Date
Feb 2018
As for "smoothing everything" and leveling it out, tutorials on this can be found all over YouTube and other places.

As for noise, you want to get rid of the pops and crackles that sit either underneath or over the person's voice. I've discussed this before: this kind of thing can't be filtered. You have to cancel it out using the principle of destructive interference; the process is simple, but it's not easy.

You isolate the noise first using a vocal removal or similar tool; this gives you a base, like alpha channels in Gimp/Photoshop, to mark the noise you want to remove. The next step is to make sure it is all leveled out: if you select the range you are working with and go to "Amplify", you'll see a value of +12 dB; you have to do the reverse on your base file by applying -12 dB to it.

The last step is to eliminate the noise: invert the phase of the original and the "common" unwanted noise should cancel itself out, BUT this doesn't work in Audacity just like that.

It requires special/dedicated software or hardware to sample it, compare it and run a noise-cancelling process on it multiple times, because the noise is spread over a wide band of frequencies. I've done a pretty basic isolation of the noise with voice cancellation, but it requires a much more sophisticated way of handling it. I did two versions, leveled and non-leveled; play with them.

You can watch and hear this principle working in real time. A lot of things are happening: you hear the backing vocals underneath the main singer, and it is possible to bring them forward and send the main vocalist underneath the backing vocalists. It's also possible to do the same with a violin playing in the background, eliminating the main singer's voice over it and bringing the instrument forward without interfering with the violin sound.

This is way more advanced than a voice-cancelling effect; solving your issue requires some level of sophistication like that.
https://youtu.be/wcDCdUlE4fM?t=141

Same with this one, you can erase the narrator without removing the background effects or music.
https://www.youtube.com/watch?v=cVJwZoUdV14

Attached Files

base_for_noise_removal.flac (399.0 KB)

base_for_noise_removal2.flac (389.8 KB)

7th Jan 2019 09:35 #7
abolibibelot

Member

Join Date
Apr 2015

Location
France
Originally Posted by amaipaipai

As for "smoothing everything" and leveling it out, tutorials on this can be found all over YouTube and other places.

Well, for issues related to just about everything, information and tutorials can be found all over YouTube and other places nowadays, yet many people keep doing things wrong because they didn't know where to look, or didn't care, or had solidly ingrained misconceptions, or, precisely, because they couldn't sort out the signal from the noise... And places like this are supposed to help in that regard, by allowing direct interaction with people who have knowledge and experience in a particular field.
In this particular case I've done quite a lot of testing and read many articles and forum threads on the subject, but I still couldn't figure out how to do it right with the tools I have. The article I linked is the last one I read before asking here, and the most interesting, the most informed, yet the most confusing, as it contradicts a good amount of what I had read before and what I thought I knew about audio compression.
“Loudness normalization is one of the most common misunderstandings in audio post production. Many people use peak normalization, which ensures that the maximum peak (= the maximum value of the audio data) reaches a specific level. However, the human perception of loudness does NOT depend on peak levels, therefore peak normalization is mostly useless. Recordings should be normalized according to its loudness, and not its peak level (see also Peak Normalization: Not the Solution).
The correct calculation of the perceived loudness is actually not that easy, because psychoacoustic properties of human perception must be considered (see e.g. equal-loudness contour, but that's a topic for another blog post). A very rough approximation is the RMS value (= short time quadratic mean of audio data), or even better: use your ears!”
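
The difference the quoted article draws between peak level and RMS can be seen with a short Python sketch. The two test signals are made up for illustration, and RMS is only the "very rough approximation" of loudness the article mentions, not a psychoacoustic measure.

```python
import math


def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(max(peak, 1e-12))  # floor avoids log10(0) on silence


def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))


click = [1.0] + [0.0] * 999      # one full-scale click in near silence
sustained = [1.0, -1.0] * 500    # full-scale signal for the whole duration
print(peak_db(click), round(rms_db(click), 1))
print(peak_db(sustained), round(rms_db(sustained), 1))
```

Both signals have the same peak level, so peak normalization treats them identically, yet their RMS levels differ by about 30 dB.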

Originally Posted by amaipaipai

As for noise, you want to get rid of the pops and crackles that sit either underneath or over the person's voice. I've discussed this before: this kind of thing can't be filtered. You have to cancel it out using the principle of destructive interference; the process is simple, but it's not easy.

Again, in this case noise is not a major issue: there's a slight hiss / hum which is not too distracting as it is, and the noise removal tools in the Magix NLE do a fine job of removing it almost completely with no audible distortion. So what you're mentioning here seems impressive (and probably expensive) but is kinda off-topic. The issue is to raise the level of the quiet segments and lower the too-loud segments, to even things out (but not in a too pronounced and unnatural way, like on the radio) and make for more pleasant listening, in an automated but “sound” way. So, can I achieve that with the tools I have, or with other free tools, or is it too much to ask?

Last edited by abolibibelot; 7th Jan 2019 at 09:41.

7th Jan 2019 11:52 #8
amaipaipai

Member

Join Date
Feb 2018
Originally Posted by abolibibelot

Well, for issues related to just about everything, information and tutorials can be found all over YouTube and other places nowadays, yet many people keep doing things wrong because they didn't know where to look, or didn't care, or had solidly ingrained misconceptions, or, precisely, because they couldn't sort out the signal from the noise... And places like this are supposed to help in that regard, by allowing direct interaction with people who have knowledge and experience in a particular field.

Totally agree with you on every point you have made.
People do things wrong and pass the wrong information along because it gets views; today people just want what is practical, very easy and short. This and other forums have participants with expertise in many different fields, but that doesn't mean we are obliged to help anyone; everything here is voluntary and free of charge.

Don't get me wrong, I have many unanswered questions of my own. People can help you with their time up to a point; if it requires some level of sophistication, people don't even answer, and I don't blame them.

Originally Posted by abolibibelot

In this particular case I've done quite a lot of testing and read many articles and forum threads on the subject, but I still couldn't figure out how to do it right with the tools I have. The article I linked is the last one I read before asking here, and the most interesting, the most informed, yet the most confusing, as it contradicts a good amount of what I had read before and what I thought I knew about audio compression.
“Loudness normalization is one of the most common misunderstandings in audio post production. Many people use peak normalization, which ensures that the maximum peak (= the maximum value of the audio data) reaches a specific level. However, the human perception of loudness does NOT depend on peak levels, therefore peak normalization is mostly useless. Recordings should be normalized according to its loudness, and not its peak level (see also Peak Normalization: Not the Solution).
The correct calculation of the perceived loudness is actually not that easy, because psychoacoustic properties of human perception must be considered (see e.g. equal-loudness contour, but that's a topic for another blog post). A very rough approximation is the RMS value (= short time quadratic mean of audio data), or even better: use your ears!”

I see nothing wrong here; the author is correct. To this day people don't have a clue how to use the loudness function, and there is a huge amount of misconception about it spread by so-called "specialists" who don't take the equal-loudness contour and other things into consideration, but that's a topic for another post.

Originally Posted by abolibibelot

Again, in this case noise is not a major issue: there's a slight hiss / hum which is not too distracting as it is, and the noise removal tools in the Magix NLE do a fine job of removing it almost completely with no audible distortion. So what you're mentioning here seems impressive (and probably expensive) but is kinda off-topic. The issue is to raise the level of the quiet segments and lower the too-loud segments, to even things out (but not in a too pronounced and unnatural way, like on the radio) and make for more pleasant listening, in an automated but “sound” way.

I've heard distortions and other issues in your sample; maybe it's because I have ears trained to pick those up and you didn't even notice them?
The information you received has everything to do with the topic because if you do what you want without taking care of that noise, you'll raise it along with the signal making it even worse.

Originally Posted by abolibibelot

So, can I achieve that with the tools I have, or with other free tools

Sorry, you can't achieve that with the tools available to you.

Originally Posted by abolibibelot

or is it too much to ask ?

Yes, it is too much to ask, because it requires very expensive, studio-grade tools not available to the public, and paid professionals to handle it.
You can try Audacity's "Normalize" effect; it might give you the desired effect, but I understand that it is a little more complicated than that.

By the way, I don't know who these people are or what they are talking about in the recording, but they sound so sweet and kind. It brings back very good memories of some of my aunts and my grandma gathering in the kitchen to talk and laugh about silly things; as a little boy I stood there and enjoyed the whole thing.
Too bad they all passed away many years ago. I wish I had a recording like yours to remember them.

7th Jan 2019 14:51 #9
amaipaipai

Member

Join Date
Feb 2018
This might help you out:
https://www.learndigitalaudio.com/normalize-audio
https://www.learndigitalaudio.com/ab-level-matching-tutorial-using-free-plug-ins

But again, the author assumes you have clean source material to work with, and you have noise issues.
Good luck.

7th Jan 2019 22:51 #10
JVRaines

Member

Join Date
Aug 2010

Location
San Francisco, California
Originally Posted by Cornucopia

You may need to have multiple tools working in concert on differing sections of the dynamic range spectrum (with different ratios of compression), or you may need to break up the signal into different eq bands and have them be worked on differently. This last part can do wonders for perception of noise reduction vs. wanted signal.

This is called a multiband compressor and it generally runs rings around unitary compressors.

5th Mar 2019 16:54 #11
abolibibelot

Member

Join Date
Apr 2015

Location
France
Could someone please explain what is happening in this video at 11min00s ?
https://www.youtube.com/watch?v=oVME_l4IwII
Audacity's “Equalization” effect is used, but it results in a significant boost of the volume in audible portions, which does not seem to affect silent or almost silent portions, which is pretty much what I was looking for, but couldn't quite achieve with Audacity's own “Compressor” effect, or other compressors I tried, as I wrote earlier. (I have let this thing lie on the side AGAIN since then — I've had some trouble with the computer, and it's so frustrating to run into so many technical obstacles, each of them seemingly requiring a top-notch expert using state-of-the-art tools to be solved properly, or a lifetime's worth of accumulated knowledge for me to deal with it satisfyingly... when all I want is to get it over with already...)

~~~~~~

[I was about to post this reply in January when the damn BSOD struck again... another issue I haven't been able to properly diagnose and fix, and for which asking for advice on various forums didn't help much. It seems to have improved lately, though: I haven't had a BSOD in about 10 days, lucky me!]

Originally Posted by amaipaipai

I've heard distortions and other issues in your sample; maybe it's because I have ears trained to pick those up and you didn't even notice them?

Could you elaborate? Aside from the hiss / hum (which is audible but not that severe; I've heard worse, and the signal-to-noise ratio is more than decent for such a little device, as this was recorded with a Panasonic ZS3 camera), there are some handling noises, but they're relatively rare over the whole recording. There's also a faint beeping sound from the camera autofocus in other parts of the recording, but I don't hear it in those particular samples.
I listen to audio from the computer with a pair of Focal JM-Lab Chorus 705 S speakers through a Cambridge 540R amplifier, so it's probably not as accurate as monitoring speakers or good headphones, but still far better than regular computer speakers.

Originally Posted by amaipaipai

The information you received has everything to do with the topic because if you do what you want without taking care of that noise, you'll raise it along with the signal making it even worse.

Here is sample 2 processed with the NLE's own denoising filters: on the first file with a combination of “Denoise” (which uses an actual noise sample from a silent part of the file) and “Dehiss” (which doesn't use a noise print, so may be less accurate, but usually produces a fairly clean result on its own), both with the cursor at 4; on the second file with “Denoiser” alone, cursor at 8. Would you say that it's clean enough, or is it totally ugly? Intuitively I thought that it would be better to denoise after applying compression / gain, so is that incorrect?

[Attachment 47789]

[Attachment 47790]
What I meant was : it might be relevant in a professional context where the goal is to get crystal clear audio with state-of-the-art equipment and software, but in this case it seems a tad overkill. The audio is already much better than the picture on that footage (it was mostly shot in front of a large window, I struggled a lot to get something barely watchable out of crushed shadows and blown highlights, with a combination of Avisynth pre-processing and segment-by-segment processing in the NLE), and I'm just trying to improve it ever-so-slightly, compared to what I did with the first draft of that movie (for which I only used the NLE's internal effects).
There must be something between perfection and disaster (like... the whole world now that I think of it).

Originally Posted by amaipaipai

You can try Audacity's "Normalize" effect; it might give you the desired effect, but I understand that it is a little more complicated than that.

This is obviously not what I asked, “Normalize” raises the level of everything, or does absolutely nothing if there's only one peak at or near 0dB (or causes ugly saturations if allowed to go beyond 0dB).
Audacity's Compressor, even if it's said to be rudimentary, does work, produces consistent results, and doesn't seem to audibly damage the signal. But in the default mode it seems to only reduce the peaks, instead of globally reducing the volume of loud parts, and in the “based on peaks” mode I don't quite understand what it does and how to control the outcome.
SC4, which is said to be better, either does nothing audible, or produces ugly distortion when using settings strong enough to actually have an effect.
I have also tried ThrillseekerLA (following a link in one of the articles you linked above): I get essentially the same results as with SC4; either the effect is negligible, or the voices become harshly distorted (while there's no clipping according to Audacity, so even that can't be used as a rough guide).
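
The point made above about “Normalize” can be checked with a few lines of Python. This is only a sketch of plain peak normalization as generally understood, not Audacity's code; the function name and the target parameter are assumptions for illustration.

```python
import math


def normalize_gain_db(samples, target_peak_db=0.0):
    """Single gain (dB) that brings the loudest sample to target_peak_db."""
    peak = max(abs(s) for s in samples)
    return target_peak_db - 20.0 * math.log10(peak)


quiet_take = [0.05, -0.04, 0.5, 0.03]        # loudest sample at half scale
take_with_one_peak = [0.05, -0.04, 1.0, 0.03]  # one sample already at full scale
print(round(normalize_gain_db(quiet_take), 2))
print(round(normalize_gain_db(take_with_one_peak), 2))
```

The first take gets about +6 dB applied to everything, hiss included; the second gets no gain at all because a single full-scale sample dictates the result, so the quiet speech around it stays exactly as quiet as before.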

Originally Posted by amaipaipai

By the way, I don't know who these people are or what they are talking about in the recording, but they sound so sweet and kind. It brings back very good memories of some of my aunts and my grandma gathering in the kitchen to talk and laugh about silly things; as a little boy I stood there and enjoyed the whole thing.
Too bad they all passed away many years ago. I wish I had a recording like yours to remember them.

Well, thanks, I guess! That was my mother, her younger sister (just before she suffered a stroke – actually she said repeatedly during that conversation that she wasn't feeling well, at first doctors thought it was just some bad cold or digestive disorder, common at that time of the year, but it turned out to be much more severe... luckily she recovered and doesn't have disabling sequelae as far as I know), my grandmother (who “went away” three years ago, or just two years after this recording was made) and myself, on Christmas day 2013. In the second sample my aunt says that she does not want my grandmother to go outside in the cold, and my mother is backing her; I had no opinion on the matter whatsoever.
As for the preservation of digital memories, I have an ambivalent attitude toward it. I'm a major offender since I store a lot of recorded stuff, but then I never watch most of it, or I obsess for months (years even in this case!) trying to salvage badly recorded footage, to the point where it becomes meaningless and devoid of the original emotional value which may have been attached to it. Memories get worn out each time they're evoked as Jean-Paul Sartre wrote (from La nausée) : “Sometimes, in my story, it happens that I pronounce these fine names you read in atlases, Aranjuez or Canterbury. New images are born in me, images such as people create from books who have never travelled. My words are dreams, that is all. For a hundred dead stories there still remain one or two living ones. I evoke these with caution, occasionally, not too often, for fear of wearing them out, I fish one out, again I see the scenery, the characters, the attitudes. I stop suddenly: there is a flaw, I have seen a word pierce through the web of sensations. I suppose that this word will soon take the place of several images I love. I must stop quickly and think of something else; I don't want to tire my memories. In vain; the next time I evoke them a good part will be congealed.” (English translation found here.)

Last edited by abolibibelot; 5th Mar 2019 at 17:06.

5th Mar 2019 17:18 #12
davexnet

Member

Join Date
Mar 2008

Location
United States
It seems to be an equalization curve; in the curve shown, the low frequencies are increased and the high frequencies reduced.
It's just another way of visualizing the equalization adjustment.

7th Mar 2019 09:37 #13
abolibibelot

Member

Join Date
Apr 2015

Location
France
Originally Posted by davexnet

It seems to be an equalization curve; in the curve shown, the low frequencies are increased and the high frequencies reduced.
It's just another way of visualizing the equalization adjustment.

So it has nothing to do with actual audio compression, despite being used as an illustration of audio compression?
More generally (even if it's kinda off topic), what do you think of the claims the dude made about the evolution of popular music, in technical terms?

So... seems like I'm back to square one on the main topic...
Can someone at least explain why I obtained such poor results with the supposedly better tools I tried? (Audacity + SC4, ThrillseekerLA)
Is there any known reliable multiband compressor I could try, preferably freeware or costing a few bucks? Or am I really J.W.F. and S.O.L.?

“Good times are coming, I hear it everywhere I go
Good times are coming, but they're sure coming slow” – Neil Young

7th Mar 2019 21:48 #14
davexnet

Member

Join Date
Mar 2008

Location
United States
This video shows the use of a "speech volume leveler" in Adobe Audition.
Perhaps you can find a standalone plugin that does something similar
https://library.creativecow.net/articles/devis_andrew/balancing-audio-levels-2/video-tutorial

11th Mar 2019 14:17 #15
abolibibelot

Member

Join Date
Apr 2015

Location
France
Thanks for this, it definitely looks like a step in the right direction. A few questions:
– Should this (kind of) effect be applied first, before any attempt at denoising? Or should I denoise first?
– Do you think (from the two samples in my 2019/03/05 post) that the Magix editor does a good enough job at denoising?
– How does this effect work technically? Apparently it can include compression and gating, but that's optional, so what does it do with the basic settings? Is it a well-established kind of audio processing, or really a brand new feature developed by Adobe?
– At the end of the video the guy talks about further processing he intends to apply to that recording, to “sweeten up the audio”: what does he mean by that, and what specific effect could he be alluding to?

11th Mar 2019 14:29 #16
hello_hello

Member

Join Date
Mar 2012
I've only skimmed over a few posts, but I did download the samples from post #3 (I think) and ran them through some volume normalising DSPs, courtesy of foobar2000. I make no claims about the results. They won't be perfect, as I just used my go-to settings (and the defaults for VLevel) and didn't listen to the output closely. In fact they may even suck, or you might find something you like.

There's also a utility called Levelator that's designed specifically for speech. It should be in the VideoHelp software section.

Here is the DynamicAudioNormalizer command line. The other samples used a DSP in the conversion chain, so there's no special command line as such.
/d /c c:\progra~1\foobar2000\encoders\ffmpeg.exe -i - -ignore_length true -af dynaudnorm=f=150 -c:a pcm_f32le -f wav - | c:\progra~1\foobar2000\encoders\QAAC\qaac.exe --ignorelength -s --no-optimize --no-delay -V 64 -o %d -

There's a thread here where "compressing" while converting with fb2k was discussed. There are links for the DSPs and some instructions for setting things up if you search through the thread, assuming one of the samples pleases you.

None of the DSPs I used for the samples "compress" as such. They increase the volume of the quiet parts instead of squishing the loud parts (the DAN does have a compressor, I think, but it wasn't enabled). In theory the end result should be the same as compression, if not better, and it's easier to configure and tends to be more set and forget.
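
The "raise the quiet parts" idea described above can be sketched roughly in Python as follows. This is a deliberately simplified illustration, not the actual algorithm of the Dynamic Audio Normalizer, VLevel or LoudMax; every name and default value here is an assumption.

```python
def frame_gains(samples, frame_len, target_peak=0.95, max_gain=10.0):
    """Gain factor per frame that would bring that frame's peak up to target_peak."""
    gains = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        peak = max(abs(s) for s in frame)
        gain = target_peak / peak if peak > 0.0 else max_gain
        gains.append(min(gain, max_gain))  # cap so near-silence (hiss) is not boosted endlessly
    return gains


def smooth_gains(gains, radius=1):
    """Neighbourhood minimum followed by a moving average, so gain changes gradually."""
    def window(values, i):
        return values[max(0, i - radius):i + radius + 1]

    lowest = [min(window(gains, i)) for i in range(len(gains))]
    return [sum(window(lowest, i)) / len(window(lowest, i)) for i in range(len(lowest))]


# Three quiet frames followed by one loud frame.
samples = [0.1] * 12 + [0.8] * 4
gains = frame_gains(samples, frame_len=4)
smoothed = smooth_gains(gains)
print([round(g, 3) for g in gains])
print([round(g, 3) for g in smoothed])
```

Taking the neighbourhood minimum before averaging keeps each smoothed gain at or below what its own frame can take without exceeding the target peak, which is why this style of processing tends not to clip. The gain cap is what limits how far background noise is lifted in very quiet passages.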

"Compression" only. I didn't attempt anything else. They're AAC to save some bandwidth. The foo_r128norm samples will be quieter than the others as the DSP aims for EBU R128 volume. From what I heard it didn't compress the samples well anyway, and it has no options.

Compressed Samples.zip (3.6MB)

Last edited by hello_hello; 11th Mar 2019 at 16:17.

11th Mar 2019 19:18 #17
hello_hello

Member

Join Date
Mar 2012
I've been meaning to enable the Dynamic Audio Normalizer's alternative boundary option for a fair while, as it can be slow to respond at the beginning of the audio. This thread finally prompted me to test it, and for the samples posted earlier it does improve the compression at the beginning.

For ffmpeg the alternative boundary mode is enabled with b=1. ie
-af dynaudnorm=f=150:b=1
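
For anyone who wants to run the same filter outside foobar2000, a small Python helper could assemble the ffmpeg call. The `f` and `b` options are the ones given above; the function name and the file names are hypothetical, and ffmpeg is assumed to be on the PATH.

```python
def dynaudnorm_command(src, dst, frame_len_ms=150, alt_boundary=True):
    """Build an ffmpeg argument list applying dynaudnorm with the settings from this thread."""
    filter_spec = f"dynaudnorm=f={frame_len_ms}"
    if alt_boundary:
        filter_spec += ":b=1"  # alternative boundary mode
    return ["ffmpeg", "-i", src, "-af", filter_spec, dst]


command = dynaudnorm_command("input.flac", "output.flac")
print(" ".join(command))
# To actually run it (requires ffmpeg to be installed):
# import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list rather than one string avoids shell quoting problems with file names that contain spaces, like the sample names earlier in the thread.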

11th Mar 2019 19:52 #18
davexnet

Member

Join Date
Mar 2008

Location
United States
Originally Posted by hello_hello

I've been meaning to enable the Dynamic Audio Normalizer's alternative boundary option for a fair while, as it can be slow to respond at the beginning of the audio. This thread finally prompted me to test it, and for the samples posted earlier it does improve the compression at the beginning.

For ffmpeg the alternative boundary mode is enabled with b=1. ie
-af dynaudnorm=f=150:b=1

I took a look at your samples, the LoudMax files look good to me, good balance in the levels;
now you can use Audacity noise reduction and perhaps lower the levels slightly -
thanks for posting

11th Mar 2019 22:09 #19
hello_hello

Member

Join Date
Mar 2012
LoudMax comes as a VST plugin, a WinAmp plugin, an AU plugin, and a LADSPA plugin, so you can use any program that can load one of those types.

For the record, I used the VST version, and for foobar2000 that requires the VST Adapter. There are only two knobs. Threshold was -18dB and Output was -3dB.

Now that I've looked, I realise the version I'm using is dated May 2017. I think I should update it.

https://loudmax.blogspot.com/

Last edited by hello_hello; 11th Mar 2019 at 22:14.
