[Gllug] Slightly large httpd request log requires splitting up
Matt Blissett
matt at blissett.me.uk
Tue Jul 15 17:26:10 UTC 2008
william pink wrote:
> Hello,
>
> I have had a bit of a Google but nothing came up as relevant. I have an
> Apache HTTP request log file that is a whopping 17GB because it has not
> been rotated and compressed since its creation. What I need to do is
> split it up into smaller chunks by date and compress them. Of course some
> sort of shell script would be the ideal solution, but with only my basic
> knowledge of shell scripting this would take some considerable time to
> write. Does anyone know of any scripts or apps out there which could do
> this for me?
Lines in my log file look like this:
24.182.65.204 - - [01/Jun/2008:00:57:35 +0100] "GET / ....
So this would be sufficient:
for m in Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec; do
    grep "\[...$m.2008:" big_logfile | gzip -9 > ${m}_log.gz
done
I don't think it's a problem, but you might need to adjust the pattern to
make sure you don't match URLs.
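For instance, a stricter pattern that anchors on the whole "[dd/Mon/yyyy:" timestamp cannot be fooled by a month name turning up inside a URL. A small sketch against a one-line sample file (the filenames here are just placeholders, not part of the original script):

```shell
# A sample log line (hypothetical) containing "Jun2008" in the URL as well
# as in the timestamp, to show the anchored pattern only matches the latter.
printf '%s\n' '24.182.65.204 - - [01/Jun/2008:00:57:35 +0100] "GET /Jun2008/page HTTP/1.1" 200 512' > sample.log

# Anchor on "[dd/Mon/2008:" so only the timestamp can match.
m=Jun
grep "\[[0-3][0-9]/$m/2008:" sample.log | gzip -9 > ${m}_log.gz

# Prints the matched sample line.
gunzip -c ${m}_log.gz
```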
To split by day as well:
for m in Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec; do
    grep "\[...$m.2008:" big_logfile > ${m}-temp
    for d in `seq -w 31`; do
        grep "\[$d.$m.2008:" ${m}-temp | gzip -9 > ${m}-${d}_log.gz
    done
    rm -f ${m}-temp
done
(Which leaves you with some pointless files, like Feb-31_log.gz)
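If re-reading the 17GB file once per month is too slow, a single awk pass can route each line to a per-day file by parsing the timestamp instead. This is only a sketch, assuming the timestamp is always the fourth whitespace-separated field (as in the common log format); sample.log here is a tiny stand-in for the real big_logfile:

```shell
# Build a tiny two-day stand-in for the real 17GB log file.
cat > sample.log <<'EOF'
24.182.65.204 - - [01/Jun/2008:00:57:35 +0100] "GET / HTTP/1.1" 200 512
10.0.0.7 - - [02/Jun/2008:10:11:12 +0100] "GET /x HTTP/1.1" 200 99
EOF

# Single pass: split field 4 ("[01/Jun/2008:...") on "[", "/" and ":",
# then append the line to a file named Mon-dd_log (e.g. Jun-01_log).
awk '{ split($4, t, /[\[\/:]/); print > (t[3] "-" t[2] "_log") }' sample.log

# Compress the per-day files afterwards.
gzip -9 -f ./*_log
```

One caveat: a full year means up to 366 output files, and some awk implementations hit the per-process open-file limit; GNU awk juggles descriptors for you, otherwise call close() on each file after writing.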
--
Matt
--
Gllug mailing list - Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug