Call recording wav files sound like someone talking through a fan

FreePBX 2.9.0rc1.3, DAHDI, and Asterisk 1.6, running on 5 servers.

Rhino RCB24 cards for phones (four servers), Rhino R4T1 card for external interface (1 PRI and 2 T1s, plus four SIP phones). The setup is exactly the same as our old PIAF setup based on Asterisk 1.2 and a very old version of FreePBX. In fact, with the exception of the new firmware version on the R4T1 card and new hard disks, the hardware is identical. This firmware is not compatible with the old ZAP drivers, so going back isn’t a possibility on that server.

The systems are all PIAF Bronze (Asterisk 1.6 and DAHDI) installations. Ever since the installation of the new system (installed a new hard drive and installed new software) we’ve been having three problems:

  1. All of our recordings sound like someone is talking through a fan. I was using “wav” format for our call recordings and have switched to “WAV” to see if that helps. I convert the files (using LAME) to mp3 files for long-term storage, and they sound like crap after the conversion. I thought it was a problem with LAME until I listened to one of the WAV versions of the recordings - they are identical. Note that this is happening on our SIP and DAHDI phones, so the problem is consistent across all five servers. We record on the server with the extensions, so all five servers are used for recording.

  2. We have to execute an “asterisk reload” on every server every hour or two or we start to have clicking, popping, and audio drop-out on our outgoing calls (we have no incoming service, just outbound). This drop-out is different - it sounds like an AT&T cell phone call (talking then -silence- then more talking). If I listen in on a barge extension, the “comfort noise” drops out. I can still hear the caller talking, but the people at the remote end cannot. The recordings sound “normal”, in that they still sound like someone talking through a fan, but the audio is all there. I played around with various rx and txgain settings, trying to see if that helped. It doesn’t, except that I can make the call unintelligible through too much or too little gain. This occurs on all calls (not just the DAHDI or SIP Phones).

  3. In small clusters, outgoing calls are terminated before the DAHDI line finishes connecting, which is then picked up by our “operator” extension as if it’s an incoming call. Since we don’t use the inbound for anything, the extension’s mailbox is now full at 100 messages of 42 second dialtone recordings. It seems to happen four or five calls in a row, randomly throughout the day. The callers get dead air (the line has hung up).

Skyking suggested call timing (in another forum), but I don’t really see how that’s possible, since each of the servers is connected to the others via IAX trunks.

So, problem 1 might be solved by switching to a different recording format, but I’d really like to get some opinions on that before I start jerking all of the servers around through a list of 7 or 8 file formats.

Problem 2 isn’t “solved” but the workaround is OK until I can figure out the problem. I’m thinking there’s a leak somewhere, but I can’t figure out where to look.

Problem 3 might be related to problem 2, but I can’t say for sure. The problems seem to coincide, but the people on the phones are little more than monkeys that talk. I can’t get any kind of support for fixing the problem from them except for screeching and throwing poop.

Any suggestions would be greatly appreciated.

I’ve included the chan_dahdi.conf file for the R4T1 server below.

It is a timing problem, and trunks don’t provide timing.

Have you looked at the interrupt load on the box? How about context switches?

Is the Ethernet interface taking errors?

The interrupt load on the server is fine.

cpu 5529579 199117 1151378 109457824 1768031 5816923 181069 0
cpu0 5529579 199117 1151378 109457824 1768031 5816923 181069 0
intr 2910163308 1241329777 3156 0 0 0 1 0 2 1 0 1 1 9003 0 11087901 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5112908
0 0 0 0 0 0 0 242297820 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 31135 0 0
0 0 0 0 0 1241000899 0 0 0 0 0 0 0 169290703 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 1174718008
btime 1302123511
processes 2435784
procs_running 1
procs_blocked 0

1174M context switches over a 14-day period doesn’t seem unreasonable to me. Same with 2.9B interrupts (at 1000 ints/sec on the Rhino card alone). We are not using swap, and the system’s app memory never climbs above 60%.
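
For anyone who wants to check that arithmetic, here is a rough Python sketch that turns the cumulative `intr` and `ctxt` totals from /proc/stat into per-second averages. The function name `rates_from_proc_stat` is made up for this post, and the 14-day uptime is my approximation rather than a measured value, so treat the results as ballpark figures.

```python
def rates_from_proc_stat(text, uptime_seconds):
    """Average per-second rates for the 'intr' and 'ctxt' totals in /proc/stat text."""
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first number after 'intr' is the grand total of all interrupts;
        # the single number after 'ctxt' is the total of context switches.
        if fields[0] in ("intr", "ctxt"):
            totals[fields[0]] = int(fields[1])
    return {name: value / uptime_seconds for name, value in totals.items()}


# Totals copied from the dump above; the 14 days is an assumed uptime.
sample = "intr 2910163308 1241329777 3156\nctxt 1174718008\n"
rates = rates_from_proc_stat(sample, 14 * 86400)
print(rates)
```

With those inputs it works out to roughly 971 context switches and 2,406 interrupts per second on average. An average like this hides bursts, so it cannot rule out short spikes.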

There are no ethernet errors. The phone servers are using their own dedicated Gigabit switch.
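
In case it helps someone repeat the check, this is roughly how the error counters can be pulled out of /proc/net/dev with Python. It is only a sketch: `interface_errors` is a name I made up, and it assumes the usual Linux layout of 8 receive columns followed by 8 transmit columns per interface.

```python
def interface_errors(proc_net_dev_text):
    """Map interface name -> rx/tx error and drop counters from /proc/net/dev text."""
    results = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # not the expected 8 receive + 8 transmit columns
        results[name.strip()] = {
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
            "tx_errs": int(fields[10]),
            "tx_drop": int(fields[11]),
        }
    return results


if __name__ == "__main__":
    with open("/proc/net/dev") as handle:
        for iface, counters in interface_errors(handle.read()).items():
            print(iface, counters)
```

Non-zero `rx_errs` or `tx_errs` on the interface facing the phone switch would be worth chasing; all zeros matches what I am seeing.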

The calls are recorded on the server that is connected to the callers’ phones, and the problems cut across all of the servers (The outbound server has no DAHDI phones, but is connected to the T1s and the others are using POTS phones for the callers). The number of phones doesn’t seem to figure in - one of the servers has a couple of phones in use, but it fails after a couple of hours, same as the server with 14 phones.

All of the calls, from the minute the server is turned on, have the same ‘through the fan’ audio characteristic.

One of the other things about the audio that I forgot to mention is that the audio is “half speed” - in addition to being choppy (on disk), if you remove the ‘blanks’, the sound file is normal. At one point, I was wondering if the file was getting recorded as if it were two channels, but only the left channel is getting used. (Stereo wave files get recorded left channel, then right channel, then left channel, etc.) It’s as if the left channel is getting all of the audio, and the right channel is still being put down onto the file, even though there’s no data there.
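
One way to test the "empty right channel" idea is to read the file header and count zero samples at even and odd positions. Below is a small Python sketch using the standard `wave` module. The name `inspect_wav` is hypothetical, and it assumes 16-bit PCM, which is what Asterisk's lowercase "wav" format writes (8 kHz, mono). It will not open the uppercase "WAV" files, because those hold GSM audio and the `wave` module only reads PCM.

```python
import struct
import wave


def inspect_wav(path):
    """Return header fields plus the share of zero samples at even and odd positions."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        width = wf.getsampwidth()
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")
    count = len(raw) // 2
    samples = struct.unpack("<%dh" % count, raw[: count * 2])

    def zero_share(values):
        # Fraction of samples that are exactly zero (0.0 for an empty list).
        return sum(1 for v in values if v == 0) / len(values) if values else 0.0

    return {
        "channels": channels,
        "sample_rate": rate,
        "sample_width_bytes": width,
        "zero_share_even": zero_share(samples[0::2]),
        "zero_share_odd": zero_share(samples[1::2]),
    }
```

If a recording reports one channel but nearly every odd-position sample is zero, that would fit the interleaving theory. If the zeros come in longer runs instead, the gaps are more likely whole frames missing, which points back toward timing.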

The hardware is unchanged from 1.2 - it has been working fine for four or five years before the upgrade and now is suddenly having problems with the new software?

The ‘representative’ system (the one with the T1 card) has been up for about 14 days since the last reboot, the last Asterisk restart was 12 hours ago, and the last reload was a couple hours ago.

I’ll keep diving down this rabbit hole for as long as it takes, but I obviously need more than “it’s a timing problem.” I’ve been looking at this problem for over two months and every time I think I’m close, I find that I really haven’t solved anything yet.

I’m at the point that commercial support would be just super.