Stupid command line tricks for managing disk use

Here are a few one-liners for quickly dealing with the usual cruft found on unmanaged development servers (and a few production boxes bogarted by the developers).

Find all Oracle-style HTTP log files named like “access.log.1258761600” (why can’t they use an actual timestamp for the suffix on those?) over a certain age (for example, 30 days):

find . -type f -name "*log.*" -mtime +30

Delete all those files you found:

find . -type f -name "*log.*" -mtime +30 -exec rm -f {} \;
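Since a delete is hard to take back, it can be worth rehearsing the list-then-delete sequence somewhere harmless first. This is only a sketch: it assumes GNU touch (for the relative “-d” dates), and the scratch directory and file names are made up for the illustration.

```shell
#!/bin/sh
# Rehearse the cleanup in a scratch directory (hypothetical file names).
sandbox=$(mktemp -d)
touch -d '40 days ago' "$sandbox/access.log.1258761600"   # old: should be removed
touch "$sandbox/access.log.1262304000"                    # fresh: should survive
touch -d '40 days ago' "$sandbox/notes.txt"               # old, but the name does not match

# Step 1: list what would be deleted.
would_delete=$(find "$sandbox" -type f -name "*log.*" -mtime +30)
echo "Would delete: $would_delete"

# Step 2: the same expression, now with the delete attached.
find "$sandbox" -type f -name "*log.*" -mtime +30 -exec rm -f {} \;

# Check what is left, then throw the scratch directory away.
remaining=$(cd "$sandbox" && ls)
echo "Remaining:"
echo "$remaining"
rm -rf "$sandbox"
```

Only the old file whose name matches the pattern goes away; the fresh log and the old non-log file are left alone.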

Gzip ‘em:

find . -type f -name "*log.*" -mtime +30 -exec gzip {} \;

Some further explanation:

It’s worth a gander at the man page for find to see everything you can do with it.

The dot right after the command means “search from here on down”. To search only below a particular directory, no matter where you happen to be, just specify the full directory path. For example:

find /apps/oracle/portal/j2ee/home -type f -name "*.xml" -mtime +30

You can specify multiple directories at the same time like this:

find /tmp /var/tmp -type f -name "*.xml" -mtime +30

The “-type f” is important because it tells find to only go for files, not directories.

Using “-name” to specify a filename pattern is always a good idea when practical. It’s extra insurance against a hard-to-reverse “oops”.

There are a number of different date stamps on files. The “-mtime” parameter tests how long ago the file’s contents were last modified, counted in 24-hour periods: “+30” means more than 30 days ago, and “-10” means less than 10 days ago.
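One wrinkle with “-mtime” is that find drops the fractional part of a file’s age before comparing, so “+30” in practice means at least 31 full days old. The sketch below tries to show that with three scratch files. It assumes GNU touch for the relative “-d” dates, and the file names are invented for the example.

```shell
#!/bin/sh
# Scratch files with known ages (hypothetical names; GNU touch assumed).
sandbox=$(mktemp -d)
touch -d '5 days ago'    "$sandbox/age-5d.log"
touch -d '732 hours ago' "$sandbox/age-30.5d.log"   # 30.5 days old
touch -d '756 hours ago' "$sandbox/age-31.5d.log"   # 31.5 days old

# +30: age in whole days (fraction dropped) is greater than 30.
older=$(find "$sandbox" -type f -mtime +30 -exec basename {} \;)
# -10: age in whole days is less than 10.
newer=$(find "$sandbox" -type f -mtime -10 -exec basename {} \;)

echo "matches -mtime +30: $older"
echo "matches -mtime -10: $newer"
rm -rf "$sandbox"
```

The 30.5-day-old file counts as 30 whole days, so it matches neither test.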

An “-exec” is required to invoke additional commands on the files found. With rm, the “-f” option tells it to go ahead and delete without prompting.

The empty braces (“{}”) are a placeholder for the file name that find plugs in each time it runs the command on one of the files it… finds.

The semicolon (“;”) marks the end of the “-exec” command in this one-liner. It’s escaped (“\;”) to make sure the shell passes it to find literally without interpreting it (make sure to put whitespace between the end of your “-exec” command and the “\;”).
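With “\;” the command runs once for every file found. find also accepts “+” as the terminator, which passes as many file names as will fit to a single run of the command, and that can be noticeably quicker on big directories. A small sketch of the difference, using echo as a stand-in command and made-up file names:

```shell
#!/bin/sh
# Three scratch files with hypothetical names.
sandbox=$(mktemp -d)
touch "$sandbox/a.log" "$sandbox/b.log" "$sandbox/c.log"

# "\;" runs echo once per file, so it prints three lines.
per_file=$(find "$sandbox" -type f -exec echo {} \; | wc -l | tr -d ' ')
# "+" hands all three names to one echo, so it prints a single line.
batched=$(find "$sandbox" -type f -exec echo {} + | wc -l | tr -d ' ')

echo "lines printed with \\; terminator: $per_file"
echo "lines printed with + terminator: $batched"
rm -rf "$sandbox"
```

The “+” form only suits commands that accept a list of files at the end of their arguments, which rm and gzip both do.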

Assessing disk usage in general usually begins with a df -h and then a series of du queries.

Here’s one that shows space used by everything under the current working directory (what pwd prints) and shows a sorted list of only those entries that are at least a gigabyte. Keep in mind that the grep also matches any name containing a capital G.

[root@myhost usr]% pwd
[root@myhost usr]% du -hs * | sort -n | grep G
1.1G    lib
1.2G    lib64
2.0G    share
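The sort and grep above lean on every surviving line using the same “G” unit. A variant that compares sizes as plain numbers sidesteps that. The sketch below is one way to do it, not the only one: it asks du for kilobytes and lets awk apply the threshold. The helper name big_entries and its two arguments are invented for this example.

```shell
#!/bin/sh
# big_entries DIR MIN_KB: list entries directly under DIR that use at least
# MIN_KB kilobytes, smallest first. (Hypothetical helper name.)
big_entries() {
    ( cd "$1" && du -sk -- * | sort -n | awk -F'\t' -v min="$2" '$1 >= min' )
}

# At a prompt you might run: big_entries /usr 1048576   (1048576 KB = 1 GB)

# Small-scale demonstration in a scratch directory.
sandbox=$(mktemp -d)
mkdir "$sandbox/small" "$sandbox/large"
head -c 400000 /dev/urandom > "$sandbox/large/data.bin"   # roughly 400 KB
found=$(big_entries "$sandbox" 200 | cut -f2)
echo "at least 200 KB: $found"
rm -rf "$sandbox"
```

The output is in kilobytes rather than the friendlier “-h” units, which is the price of a comparison that does not depend on the unit letter.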

Here’s a one-liner to archive a group of log files modified within the last 10 days, handy when you’re trying to collect data for support purposes:

find . -type f -name "*log*" ! -name "example-logs.tar" -mtime -10 -exec tar rvf example-logs.tar {} \;
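An archive written into the directory being searched can itself match “*log*”. One way around that is to write it somewhere else and let find pass the names to tar in batches. The following is a sketch under a few assumptions: GNU touch and GNU tar, scratch directories from mktemp, and made-up log names.

```shell
#!/bin/sh
# Scratch log directory with hypothetical names; assumes GNU touch for -d.
logs=$(mktemp -d)
out=$(mktemp -d)
touch "$logs/app.log"                     # recent: should be archived
touch -d '40 days ago' "$logs/old.log"    # too old: should be skipped

# Keep the archive outside the searched tree so "*log*" cannot match it.
# "r" appends, so the archive stays complete even if find needs to run
# tar more than once to get through a long list of names.
( cd "$logs" && find . -type f -name "*log*" -mtime -10 \
      -exec tar rf "$out/example-logs.tar" {} + )

archived=$(tar tf "$out/example-logs.tar")
echo "archived: $archived"
rm -rf "$logs" "$out"
```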