[Gllug] Apache log files

william pink will.pink at gmail.com
Tue Apr 7 14:00:23 UTC 2009


On Tue, Apr 7, 2009 at 2:19 PM, william pink <will.pink at gmail.com> wrote:

> On Tue, Apr 7, 2009 at 11:57 AM, - Tethys <tethys at gmail.com> wrote:
>
>> On Tue, Apr 7, 2009 at 11:11 AM, william pink <will.pink at gmail.com>
>> wrote:
>>
>> > I have the rather horrible task of splitting up lots (40GB worth) of
>> > Apache log files by date. The last time I did this, I found the line
>> > number, then tailed the file and output it into a new file, which was a
>> > long, arduous task. I imagine this can be done in a few minutes with some
>> > Regex/Sed/Awk/Bash trickery, but I wouldn't know where to start. Can
>> > anyone give me any pointers to get started?
>>
>>        #!/bin/bash
>>
>>        # Date as it appears in the Apache log, and as used in output names
>>        indatefmt="+%d/%b/%Y"
>>        outdatefmt="+%Y-%m-%d"
>>
>>        start_date="mar 25"
>>        end_date=$(date "$indatefmt")    # today
>>
>>        count=0
>>        while true
>>        do
>>                indate=$(date "$indatefmt" -d "$start_date + $count days")
>>                outdate=$(date "$outdatefmt" -d "$start_date + $count days")
>>
>>                # One full scan of the log per day, fixed-string match
>>                grep -F "$indate" big_logfile > "small_logfile.$outdate"
>>
>>                [ "$indate" = "$end_date" ] && break
>>                ((count++))
>>        done
>>
>> It's a bit inefficient, as it scans the log file multiple times,
>> but for comparatively small log files like you have, that shouldn't
>> be too arduous. It'll also pick up any entries that happen to have
>> the date format you're looking for in the URL, for example. To work
>> around either of those, using a scripting language like Python or
>> Perl to read and examine each line in turn is probably the right
>> solution. But the quick and dirty approach above will probably be
>> fine for you.
>>
>> Then fix your setup so it logs to per-date files to start with...
>>
>> Tet
>>
>> --
>> The greatest shortcoming of the human race is our inability to
>> understand the exponential function -- Albert Bartlett
>> --
>> Gllug mailing list  -  Gllug at gllug.org.uk
>> http://lists.gllug.org.uk/mailman/listinfo/gllug
>>
>
> Hi Tet,
>
> That's just what I needed. I promise to practice my bash scripting while
> this script runs through these logs.
>
>
> Many Thanks,
> Will
>

One question: how can I adjust it so I can use multiple log files? I have
tried, but I keep breaking it.

Thanks,
Will
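
For reference, here is a minimal sketch of the single-pass alternative Tet
describes above, written so that it also accepts several log files at once.
It is not from the thread: the function name split_logs_by_date and the
output directory argument are hypothetical, and it assumes the standard
Apache common/combined log layout, where the fourth whitespace-separated
field is the timestamp, e.g. [07/Apr/2009:14:00:23. Because it only looks
at that field, a date that happens to appear in a URL is not matched.

```shell
#!/bin/bash

# Hypothetical helper (not from the thread):
#   split_logs_by_date OUTDIR LOGFILE...
# Reads every LOGFILE once and appends each line to
# OUTDIR/small_logfile.YYYY-MM-DD based on the timestamp field.
# Assumes field 4 looks like "[07/Apr/2009:14:00:23".
# Output is appended, so start with an empty OUTDIR when re-running.
split_logs_by_date() {
    local outdir=$1
    shift
    mkdir -p "$outdir"

    awk -v outdir="$outdir" '
        BEGIN {
            # Map month abbreviations to two-digit numbers
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
            for (i = 1; i <= n; i++)
                month[names[i]] = sprintf("%02d", i)
        }
        # Skip lines whose fourth field is not an Apache timestamp
        $4 !~ /^\[[0-9][0-9]\/[A-Za-z]+\/[0-9][0-9][0-9][0-9]:/ { next }
        {
            stamp = substr($4, 2, 11)        # e.g. "07/Apr/2009"
            split(stamp, d, "/")
            if (!(d[2] in month))
                next
            out = outdir "/small_logfile." d[3] "-" month[d[2]] "-" d[1]

            # Keep only one output file open at a time
            if (out != prev) {
                if (prev != "")
                    close(prev)
                prev = out
            }
            print >> out
        }
    ' "$@"
}
```

It could be called as, for example:

    split_logs_by_date split access.log access.log.1

Each input file is read exactly once, and lines from all files for the
same day end up in the same output file, in the order the files were given.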