[Sussex] Parsing a Logfile with Perl....
Richie Jarvis
richie at helkit.com
Wed Mar 12 10:32:19 UTC 2008
Hi All,
I have written a little Perl script that reads a logfile and writes certain
values from matching lines into a CSV file. It worked great until I
tried it on one of our systems and discovered that a colleague had put a
'-' character into one of the usernames I am parsing. After lots of
cursing, I am stuck on this one, and wonder if anyone can see how to
adjust my regex so that it handles usernames both with and without
funny characters?
Here is an example of a well-formatted line:
2007-05-31 15:21:13 Sent SMS [SMSC:mbloxpsmsca] [SVC:fusion] [ACT:]
[BINF:] [from:62569] [to:16474075000] [flags:-1:1:-1:-1:-1]
[msg:100:01062F1F2DB69181923945413141363634383631323734414246333536363635343442423438464444353732303745433300030B6A00C54601C60001550187360603773700018707060354454D502D7B31363437343037353030307D0001873806034375]
[udh:12:0B05040B8423F00003210401]
Here is an example of a badly-formatted line:
2008-03-07 05:09:54 Sent SMS [SMSC:mbloxpsmsca] [SVC:hpit-ems] [ACT:]
[BINF:] [from:62569] [to:+16475882516] [flags:-1:0:-1:-1:-1]
[msg:143://SS Please download mProveDM
https://fusiondm-itg.houston.hp.com:443/fusiondl/EMA.cab?D=a5619ITGITG13A0E4B216685DBD31C50B0C9E6F91F2N8AB384A870]
[udh:0:]
My script spits out the following output for these:
Good: 2007-05-31,15:21:12,fusion,62569,216.154.251.59,16474075000
Bad: 2008-03-07,05:09:53,hpit,ems,62569> (15.243.169,to
Currently, I have the following rather ungainly regex:
$_ =~
/^(\d+-\d+-\d+)\D+(\d+\D+\d+\D+\d+)\D+\w+\W+\w+\W+\w+\W+\w+\W+\w+\W+(\w+)\W+(\w+)\W+(\w+\W+\w+\W+\w+\W+\w+)\W+\w+\W+(\w+)\W+/;
I am sure there is a better way to do this, e.g. search for the string
'[SVC:' and gobble everything up to the ']', but being a bit of a newbie to
regex, I am googling wildly and not getting much inspiration.
Does anyone have any pointers?
Thanks in advance,
Richie
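
A minimal sketch of the "[SVC: up to the ]" idea, for illustration only. The
regex above breaks because \w does not match '-', so "hpit-ems" is consumed as
two separate words and every later capture shifts along. Anchoring on the
field names avoids counting words altogether. The helper name parse_line is
hypothetical, and the sketch assumes each log entry reaches the script as a
single line containing the [SVC:], [from:] and [to:] fields. The IP address in
the output above does not appear in the sample lines, so it is left out here.

```perl
use strict;
use warnings;

# Hypothetical helper: pull selected fields out of one log line.
# Returns an empty list if the line does not start with a date and time.
sub parse_line {
    my ($line) = @_;

    # Date and time at the start of the line.
    my ($date, $time) =
        $line =~ /^(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2})/
        or return;

    # For each field: match the literal "[NAME:" and then capture
    # everything that is not a "]", so '-' and '+' are no problem.
    my ($svc)  = $line =~ /\[SVC:([^\]]*)\]/;
    my ($from) = $line =~ /\[from:([^\]]*)\]/;
    my ($to)   = $line =~ /\[to:([^\]]*)\]/;

    # Replace any missing field with an empty string.
    return map { defined $_ ? $_ : '' } ($date, $time, $svc, $from, $to);
}

my $line = '2008-03-07 05:09:54 Sent SMS [SMSC:mbloxpsmsca] [SVC:hpit-ems] '
         . '[ACT:] [BINF:] [from:62569] [to:+16475882516] [flags:-1:0:-1:-1:-1]';

my @fields = parse_line($line);
print join(',', @fields), "\n" if @fields;
# prints: 2008-03-07,05:09:54,hpit-ems,62569,+16475882516
```

The character class [^\]]* is the "gobble everything up to the ]" part: it
matches any run of characters other than a closing bracket, so the order and
content of the other fields no longer matter.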