Periodically Delete CDRs Older than N Months

Continuing the discussion from How to clear CDR Database:

I have a list of commands I’m considering turning into a cron job. I’ve already tested them: when I copy/paste these onto an old PBX system that is not in use, they work and the CDR database shrinks from 2+ GB to under 300 MB.

Does anybody have a better way to do this, or will this create a problem with the PBX itself?

mysql
use asteriskcdrdb;
delete from cdr where calldate < DATE_SUB(NOW(), INTERVAL 6 MONTH);
delete from cel where eventtime < DATE_SUB(NOW(), INTERVAL 6 MONTH);
optimize table cdr;
optimize table cel;
exit;

Would it be an issue to run this command nightly, so that it doesn’t have to delete 15+ million rows of data once every 6 months?

Nope - if you’re going to do this, you should do it every day to minimize the impact on the server.

I would like to implement this script created by @xptpa2020 on my FreePBX Distro install of version 13, fully up to date. Is this exactly what I would put into a script and schedule with a daily cron job?

mysql
use asteriskcdrdb;
delete from cdr where calldate < DATE_SUB(NOW(), INTERVAL 6 MONTH);
delete from cel where eventtime < DATE_SUB(NOW(), INTERVAL 6 MONTH);
optimize table cdr;
optimize table cel;
exit;

I have never had to perform SQL commands at the command line before. This is a production system and I cannot risk it going down, so if you see any issues with this script please share with me your concerns.


Create a script called (for example) “DeleteCDR8Mos.sql” in your “asterisk root directory” or in the “root” home directory. The example below assumes it’s in the “/root” directory.

use asteriskcdrdb;
repair table cdr;
repair table cel;
delete from cdr where calldate < DATE_SUB(NOW(), INTERVAL 6 MONTH);
delete from cel where eventtime < DATE_SUB(NOW(), INTERVAL 6 MONTH);
optimize table cdr;
optimize table cel;
exit;

In your crontab on the server (important part, since access to the database is limited to localhost), add something like:

X X X X X mysql -u root asteriskcdrdb < /root/DeleteCDR8Mos.sql

The five “X” entries are the standard cron time fields (minute, hour, day of month, month, day of week) - salt to taste.
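As a concrete example (the 3:30 AM time and log path are placeholders of my own, not from the thread), a nightly run could look like:

```
30 3 * * * mysql -u root asteriskcdrdb < /root/DeleteCDR8Mos.sql >> /var/log/cdr-purge.log 2>&1
```

Redirecting the output to a log file lets you confirm afterward that the job actually ran.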

That should get you started.


I have implemented this and it worked great. My only concern is the extra disk thrashing that it is going to introduce on a daily basis.

This server has 300 users and about 2 years worth of CDR records. When I first ran the above script it took 30 minutes to run from start to finish.

After the first time I ran it a second time and even with only 6 months worth of CDR records it still took 15 minutes to complete.

I think I am going to schedule this quarterly instead of daily, which should minimize the extra disk usage that daily runs would otherwise cause.

@cynjut thanks so much for your help with this!!

Hi!

I am no MariaDB/MySQL DBA, but the longer you wait between cleanups, the more disk thrashing you will have, because it will have to delete more records than if it were done daily…

Have a nice day!

Nick

@Marbled you are correct, however from my initial tests, even with 2 years of CDR records the first run only took 30 minutes. If I do this daily there will be an extra 15 minutes of high disk activity every day; for quarterly runs, let’s estimate 20 minutes each.

So it’s either 15 minutes of heavy disk activity every single day vs. only approx. 20 minutes 4 times per year.

I don’t see how daily is better by introducing all that extra heavy disk usage.

If you run the script daily, you will be deleting one day’s worth of records. Does it really take 15 min to delete a day’s worth of CDR/CEL records?

@lgaetz Hi Lorne, no, the deletion portion of the script takes a few seconds; it’s the other portion of the script that takes so long. If it weren’t for the other line items I would definitely agree with @Marbled that it would be better to run it daily.

Lorne, what are your thoughts on eliminating some of the other lines of the script above and keeping only the lines that select the database and delete the entries that are over 6 months old?

If that would be OK, maybe I could run just the delete lines of the script daily, and run the repair and optimize of cdr and cel monthly or quarterly.
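As a sketch of that split (the file name is my own invention, not from the thread), the daily script would contain only the deletes:

```sql
-- /root/DeleteCDRDaily.sql (hypothetical name): deletes only,
-- no REPAIR/OPTIMIZE, so it should finish in seconds.
USE asteriskcdrdb;
DELETE FROM cdr WHERE calldate < DATE_SUB(NOW(), INTERVAL 6 MONTH);
DELETE FROM cel WHERE eventtime < DATE_SUB(NOW(), INTERVAL 6 MONTH);
```

while the original script, with the REPAIR TABLE and OPTIMIZE TABLE lines, would stay on a separate monthly or quarterly cron entry.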

your thoughts would be greatly appreciated.

What are you running this script for? Reduce backup sizes? Reduce disk space? Other?

@lgaetz Lorne, I am trying to keep the CDR file size down. I feel the larger the file is, the greater the risk of corruption and problems in general. This is an extremely important server and we try to be proactive in its maintenance and well-being. We don’t use call details over 6 months old.

We have two other servers that are hot spares, so when we refresh their data by doing the hot spare “backup / restore” it takes a long time now because of how big the CDR database is.

The only question is: will you EVER have to look up historical data older than X months?

If so, then I would look into a way of moving the logs to a different server.

On MySQL, deletion is always a ‘high impact’ activity.

Things that might help include:

  1. Set up an index on the calldate field. This will help your machine find the specific records quickly and delete only the records that need to be deleted. Note that an index means every delete also removes an index entry (doubling the delete work), but it should still improve the overall time by avoiding the “full scan” of the table that every run of the query currently does.

  2. Avoid optimizing your data on every delete. If you delete the records from the “front” of the database and then optimize the data, you will move all of the records in the database to the “front” of the database.

  3. If you are moving data from machine to machine, you should also consider deleting the records in the destination database before copying the data from the “live” machine to the backup. This avoids a lot of collision errors that make the process take longer.
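For point 1, the one-time index creation would look something like this (the index names are my own; check with SHOW INDEX FROM cdr; first, since some installs may already index these columns):

```sql
USE asteriskcdrdb;
-- One-time: index the date columns the purge query filters on.
-- Note: building an index on a large table is itself a heavy operation,
-- so run this during a maintenance window.
CREATE INDEX idx_cdr_calldate ON cdr (calldate);
CREATE INDEX idx_cel_eventtime ON cel (eventtime);
```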

@lgaetz I was wondering if you had any more comments regarding this topic. You had asked the reason for running the script … file size.

We maintain 3 servers (primary live, and 2 online warm spares). When performing the backup on the main and the restore on the 2 warm spares it is taking a very long time now due to the CDR records being so large.

If you look at the script above you will see that it does a repair first, the record deletion next, optimize last.

Can I “safely” get away with only deleting the records daily and running the table repair and optimize quarterly? Again, this is a very important series of servers and it is imperative that I do not introduce any problems.
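One further way to soften the daily delete (my suggestion, not something proposed in the thread): the single-table DELETE statement in MySQL/MariaDB accepts a LIMIT clause, so the purge can run in small batches, keeping each statement and its locks short:

```sql
USE asteriskcdrdb;
-- Delete at most 10,000 rows per statement; rerun (e.g. in a loop from a
-- wrapper script) until each statement reports 0 rows affected.
DELETE FROM cdr WHERE calldate < DATE_SUB(NOW(), INTERVAL 6 MONTH) LIMIT 10000;
DELETE FROM cel WHERE eventtime < DATE_SUB(NOW(), INTERVAL 6 MONTH) LIMIT 10000;
```

The 10,000-row batch size is an arbitrary starting point; tune it to your hardware.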

We had the same issue with our warm spare; we decided that for the little downtime we have once or twice a year, we don’t need the primary’s CDRs while we are down. So we just don’t back up the CDRs to the warm spare.

We do have an additional full backup to a cloud server in case the primary gets physically destroyed, so we can retrieve CDRs or anything else from that backup.

P.S. What’s the idea of two warm spares?

Thanks

Yes, we also considered not having the CDRs on the warm spare. I think we will re-evaluate that issue.

I guess my next question would be: how big is too big when it comes to the CDR database? We have millions and millions of entries. Could I simply let it go and allow it to double or triple over the next 4-5 years?

The reason for two warm spares: simply that we cannot be without a warm spare. If something happened to one of the boxes, we would still be left with two.

Truth is, it would be nice to have a feature like the Call Recording Reports module, which zips recordings older than a specified age; it should do the same for the CDRs.
The CDR data isn’t huge, it’s just that restoring takes forever…

So your phones point to 3 servers? Or do they point to two DNS entries, and in case two servers are down you just change the resolving address?

We do everything by way of DNS. All locations point to the same cloud server. If we have an issue, we update DNS to point to the next server.

Hi!

You probably set a pretty low TTL on that record for this to work well (like 5 minutes)…

I hope your DNS servers are at multiple locations, because having a low TTL with all servers in the same place makes you disappear pretty fast…

Have a nice day!

Nick