
Lavezzi

macrumors member
Original poster
Jul 17, 2008
Hi guys,

I have a set of 6 log files which include test data on a specific website (each log file tests a different server). Each log file consists of:

7 Traceroutes + 1 Ping (which sends 120 packets) - taken every 4 hours for a 7 day period.

I created the script to do this myself using a bash shell script and crontab which was run on an external server.

What I need to do is write a parser which will export my log data into a CSV file, which I can then import into Excel in order to analyse the results.

I have an example parsing script originally written in C++; however, I have only just taken up coding properly this year and at the moment I only know very basic Java.

I'm not asking anyone to do this task for me, but after looking around the internet for several hours I'm still none the wiser about where to start or what to do.

If anyone could give me any pointers, or knows any useful links then I would be incredibly grateful. I've included one of my logs and the sample parser in the .rar file attached if anyone needs a reference point.
 

Attachments

  • Parser.zip
    46.8 KB · Views: 229

What are you asking for? Be specific.

If you have a parsing program (not script) already written in C++, why can't you use that? Does it have to be ported or improved? Are you asking how to compile the source on Mac OS X?

What does Java have to do with it? Are you trying to rewrite your C++ program in Java?
 
I have a log file which has a lot of traceroute data and a lot of ping data about a specific website.

I need help in trying to write a parser which will export that data into a comma separated value file which I can then import into Excel for statistical analysis.

The parsing program I mentioned doesn't work for the format in which my data has been collected, therefore I need to modify the program so that it will read my data.

(I only mentioned the Java to highlight my lack of coding experience)
 
I take it you didn't write the C++ version, someone else did. Right?

What do the data-gathering shell scripts look like? Are they simple sequential scripts, or do they contain sophisticated shell-programming constructs?
 
That's correct, someone else wrote it.

Code:
#!/bin/bash
#
#
# Display time here:
#
date >> 01-Duke.txt
#
#
# Begin traceroute code here:
echo "run traceroute" >> 01-Duke.txt
echo "" >> 01-Duke.txt
for ((c=0; c<7; c++)); do
traceroute -n 152.3.215.24 >> 01-Duke.txt
echo "" >> 01-Duke.txt
done
echo "end traceroute" >> 01-Duke.txt
#
#
echo "" >> 01-Duke.txt
# Begin ping code here:
#
#
date >> 01-Duke.txt
echo "run ping" >> 01-Duke.txt
ping -n -c 120 152.3.215.24 >> 01-Duke.txt
echo "end ping" >> 01-Duke.txt
#
#
echo "-------------------------------------" >> 01-Duke.txt
echo "" >> 01-Duke.txt

That is one of my shell scripts; as mentioned before, it was run at 6 hour intervals every day for a week.
 
I'm kind of slow, so would you mind also giving an example of what the resulting CSV file would look like, given the "temp.log" you included in your "Parser.zip" file?
 
Ideally I would be able to target only certain parts of the file, but failing that it just needs to make sure that every value is comma-separated, with no spaces.

Code:
1,139.222.0.1,1.414 ms,1.613 ms,1.932 ms,2,10.0.0.1,1.035 ms,1.046 ms,1.110 ms,3,172.16.0.34,1.462 ms,2.310 ms,2.349 ms,
etc..
 
Given all the data in the "temp.log" file, this "sample" doesn't make clear WHAT the parser is supposed to pull into the CSV file.

A more complete explanation and example report file would surely help.
 
Code:
"HOP_NUMBER","IP","TIME1","TIME2","TIME3"
1,139.222.0.1,1.414 ms,1.613 ms,1.932 ms
2,10.0.0.1,1.035 ms,1.046 ms,1.110 ms
3,172.16.0.34,1.462 ms,2.310 ms,2.349 ms
4,193.62.92.71,2.454 ms,2.426 ms,2.347 ms
5,193.60.0.21,2.538 ms,2.519 ms,8.139 ms
x7 (one such block for each of the 7 traceroutes)

Code:
"SEQ_NO","TTL","MS_DELAY"
icmp_seq=1,ttl=238,time=118 ms
icmp_seq=2,ttl=238,time=119 ms
icmp_seq=3,ttl=238,time=123 ms
icmp_seq=4,ttl=238,time=118 ms
icmp_seq=5,ttl=238,time=118 ms

Ideally each set would then be separated by the date/time of the readings. Does that help any more?
 
What happens at the transition between:

traceroute to 152.3.215.24 (152.3.215.24), 30 hops max, 40 byte packets

and

traceroute to 152.3.215.24 (152.3.215.24), 30 hops max, 40 byte packets
 
I don't suppose it would matter, as Excel would just place one set of traceroute data below the other. If possible, I'd like to add a number that increases by 1 up to 7 to indicate which test in the series each set came from.
 
I don't seem to be able to open the zip file you provided.

Something like this should be easy enough using Perl. Care to present some more sample data and an example of the output you would like?
 
I have re-zipped the package and attached it to this post. I hope this helps?
 

Attachments

  • Parser.zip
    53.8 KB · Views: 156
Well, if I understood you correctly this (ugly) Perl script should do what you want (I'm assuming all the data you have is sanitized, I don't do error checking):

Code:
#!/usr/bin/perl -w

use strict;

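# Open the log for reading (the file name is hard-coded).
open IN,"<data.log" or die "Unable to open file: $!\n";

while(<IN>) {
 # A timestamp line written by `date` starts a new section.
 if (/\w{3}\s+\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}/) {
  # The line after the timestamp is "run traceroute" or "run ping".
  my $command = <IN>;
  my $header = ($command=~/traceroute/)?"'HOP_NUMBER','IP','TIME1','TIME2','TIME3'":"'SEQ_NO','TTL','MS_DELAY'";
  print "\n$_$header\n";
 };
 # Traceroute hop: hop number, IP, three round-trip times.
 print "$1,$2,$3 ms,$4 ms,$5 ms\n"       if (/(\d+)\s+(\S+)\s+(\S+)\s+ms\s+(\S+)\s+ms\s+(\S+)/);
 # Ping reply: sequence number, TTL, round-trip time.
 print "icmp_seq=$1,ttl=$2,time=$3 ms\n" if (/icmp_seq=(\S+)\s+ttl=(\S+)\s+time=(\S+)/);
};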
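
Earlier in the thread Lavezzi asked for a counter from 1 to 7 showing which traceroute in the series each row belongs to. Here is a minimal sketch of one way to add it, assuming each run starts with the "traceroute to ..." banner line quoted earlier. The TEST_NO column name and the number_hops helper are made up for illustration, and the sample lines are abbreviated stand-ins for a real log:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes log lines and returns CSV rows with a leading
# TEST_NO column counting the traceroute runs inside one "run traceroute" block.
sub number_hops {
    my @lines = @_;
    my $test_no = 0;
    my @rows;
    for (@lines) {
        $test_no = 0 if /^run traceroute/;   # a new series of runs starts
        $test_no++   if /^traceroute to /;   # the banner marks the next run
        # Hop number, IP, three round-trip times (same pattern as the script above).
        push @rows, "$test_no,$1,$2,$3 ms,$4 ms,$5 ms"
            if /(\d+)\s+(\S+)\s+(\S+)\s+ms\s+(\S+)\s+ms\s+(\S+)/;
    }
    return @rows;
}

# Illustrative sample only: two shortened runs instead of seven.
my @sample = (
    "run traceroute",
    "traceroute to 152.3.215.24 (152.3.215.24), 30 hops max, 40 byte packets",
    " 1  139.222.0.1  1.414 ms  1.613 ms  1.932 ms",
    " 2  10.0.0.1  1.035 ms  1.046 ms  1.110 ms",
    "",
    "traceroute to 152.3.215.24 (152.3.215.24), 30 hops max, 40 byte packets",
    " 1  139.222.0.1  1.462 ms  2.310 ms  2.349 ms",
    "end traceroute",
);

print qq{"TEST_NO","HOP_NUMBER","IP","TIME1","TIME2","TIME3"\n};
print "$_\n" for number_hops(@sample);
```

The counter resets on every "run traceroute" marker, so it should start again at 1 for each timestamped block, provided the log follows the layout produced by the shell script posted earlier.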
 
Thanks very much! I have just run that script in my cmd window and it looks perfect; however, I still need to test that it imports into Excel properly. Do I now need to add an extra line to write the modified data out to a file?
 
Code:
#!/usr/bin/perl -w

use strict;

my $input="04-Reg.txt";
my $output="FinishedReg.txt";

open (INFILE,$input) || die "Unable to open file: $input";
open (OUTFILE,">$output") || die "Unable to open file $output";

while(<INFILE>) {
 if (/\w{3}\s+\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}/) {
  my $command = <INFILE>;
  my $header = ($command=~/traceroute/)?"'HOP_NUMBER','IP','TIME1','TIME2','TIME3'":"'SEQ_NO','TTL','MS_DELAY'";
  print OUTFILE "\n$_$header\n";
 };
 print OUTFILE "$1,$2,$3 ms,$4 ms,$5 ms\n"       if (/(\d+)\s+(\S+)\s+(\S+)\s+ms\s+(\S+)\s+ms\s+(\S+)/);
 print OUTFILE "icmp_seq=$1,ttl=$2,time=$3 ms\n" if (/icmp_seq=(\S+)\s+ttl=(\S+)\s+time=(\S+)/);
};

I think I got it to work by modifying your original code slightly. Does this look OK? It seems to output OK.

EDIT: Sorry ChOas I forgot one important thing from the import..

Could you possibly add a line into the above code to include the parsing of this information? It's basically the summary of the ping statistics after the 120 packets have been sent.

--- 194.154.164.129 ping statistics ---
120 packets transmitted,120 received, 0% packet loss, time 120493ms
rtt min/avg/max/mdev = 10.138/12.177/48.368/5.176 ms
end ping

It would have to skip the statistics header line and the transmitted/received counts and read the rest, so it would look like this:

"PACKET_LOSS","TOTAL_TIME","RTT_MIN","AVG","MAX","MDEV"
0% packet loss,time 120493ms,10.138,12.177,48.368,5.176

I'm guessing it would look something like this, but I'm not sure:

Code:
#!/usr/bin/perl -w

use strict;

my $input="04-Reg.txt";
my $output="FinishedReg.txt";

open (INFILE,$input) || die "Unable to open file: $input";
open (OUTFILE,">$output") || die "Unable to open file $output";

while(<INFILE>) {
 if (/\w{3}\s+\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}/) {
  my $command = <INFILE>;
my $header = ($command=~/traceroute/)?"'HOP_NUMBER','IP','TIME1','TIME2','TIME3'":"'SEQ_NO','TTL','MS_DELAY'":"'PACKET_LOSS','TOTAL_TIME','RTT_MIN','AVG','MAX','MDEV'";
  print OUTFILE "\n$_$header\n";
 };
 print OUTFILE "$1,$2,$3 ms,$4 ms,$5 ms\n"       if (/(\d+)\s+(\S+)\s+(\S+)\s+ms\s+(\S+)\s+ms\s+(\S+)/);
 print OUTFILE "icmp_seq=$1,ttl=$2,time=$3 ms\n" if (/icmp_seq=(\S+)\s+ttl=(\S+)\s+time=(\S+)/);
 print OUTFILE "packet loss=$1,time=$2,?????
};
 
Hi Lavezzi!

Substitute your unfinished line:

Code:
 print OUTFILE "packet loss=$1,time=$2,?????

with:

Code:
 if (/(\S+\s+packet loss),\s+(time\s+\S+)/) {
  my $stats = "$1,$2";
  $stats.=",$1,$2,$3,$4" if (<IN>=~/=\s+(\S+)\/(\S+)\/(\S+)\/(\S+)\s+ms/);
  print OUTFILE '"PACKET_LOSS","TOTAL_TIME","RTT_MIN","AVG","MAX","MDEV"' . "\n$stats\n";
 };

That should take care of it!

(mind you, this is a VERY ugly script, but hey, if it works, it works :) )
 
Heh, I guess you didn't see the addition I made to the code above the ?????s. For a while I couldn't work out why I was getting errors! All I had to do was remove my addition and change the <IN> above to <INFILE>, and it works perfectly!

I can't thank you enough for your help. It's a shame the MacRumors forum doesn't have either Karma or Reputation, because you would definitely have some coming your way. Thanks again!
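
For anyone following along, here is a self-contained sketch of the summary step with that correction applied, so the rtt line is read from the same handle as the rest of the file. The in-memory filehandle and the summary_row helper are assumptions made so the fragment runs on its own; the sample lines are the ping statistics quoted earlier in the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: scans an open handle for the ping summary and
# returns one CSV row, or nothing if no summary line is found.
sub summary_row {
    my ($fh) = @_;
    while (my $line = <$fh>) {
        next unless $line =~ /(\S+\s+packet loss),\s+(time\s+\S+)/;
        my $stats = "$1,$2";
        # The rtt line follows directly; read it from the SAME handle.
        my $rtt = <$fh>;
        $stats .= ",$1,$2,$3,$4"
            if defined $rtt && $rtt =~ m{=\s+(\S+)/(\S+)/(\S+)/(\S+)\s+ms};
        return $stats;
    }
    return;
}

my $log = <<'END';
--- 194.154.164.129 ping statistics ---
120 packets transmitted, 120 received, 0% packet loss, time 120493ms
rtt min/avg/max/mdev = 10.138/12.177/48.368/5.176 ms
end ping
END

# Open the string as a file so the sketch needs no log file on disk.
open my $fh, '<', \$log or die "Unable to open in-memory log: $!";
print qq{"PACKET_LOSS","TOTAL_TIME","RTT_MIN","AVG","MAX","MDEV"\n};
print summary_row($fh), "\n";
close $fh;
```

Using a lexical handle passed into the helper avoids the mismatch between IN and INFILE, because there is only one name to get wrong.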
 
Whoops! Sorry about that :D

You are very welcome, glad I could help!
 
How can I add to the perl script to read an additional variable from the file?

For example to be able to recognize and output this:

"HOP_NUMBER","URL","IP","TIME1","TIME2","TIME3"
1,url_example,139.222.0.1,1.414 ms,1.613 ms,1.932 ms
2,url_example,10.0.0.1,1.035 ms,1.046 ms,1.110 ms
 
Depends on your input :)

If the URL is in the same place as you would like it in your output and everything is still whitespace-separated, then

Code:
 # replacing this:
print OUTFILE "$1,$2,$3 ms,$4 ms,$5 ms\n"       if (/(\d+)\s+(\S+)\s+(\S+)\s+ms\s+(\S+)\s+ms\s+(\S+)/);

 # with this:
print OUTFILE "$1,$2,$3,$4 ms,$5 ms,$6 ms\n"       if (/(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+ms\s+(\S+)\s+ms\s+(\S+)/);

# should work

If it isn't then, well, it won't work :)
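
One case the substitution above may not cover: without -n, traceroute usually prints each hop as "hostname (IP)", so the captured IP would keep its parentheses. Here is a minimal sketch that allows for them, assuming that layout. The hostname gw.example.net and the hop_row helper are invented for illustration, and hops that time out ("* * *") are simply skipped:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turns one traceroute hop line of the assumed form
# "N  hostname (IP)  t1 ms  t2 ms  t3 ms" into a CSV row.
# Returns nothing for lines that do not fit that form.
sub hop_row {
    my ($line) = @_;
    return unless $line =~
        /^\s*(\d+)\s+(\S+)\s+\(?([\d.]+)\)?\s+(\S+)\s+ms\s+(\S+)\s+ms\s+(\S+)\s+ms/;
    return "$1,$2,$3,$4 ms,$5 ms,$6 ms";
}

print qq{"HOP_NUMBER","URL","IP","TIME1","TIME2","TIME3"\n};
print hop_row(" 1  gw.example.net (139.222.0.1)  1.414 ms  1.613 ms  1.932 ms"), "\n";
```

Real output can be messier than this (for example, more than one responding host on a single hop), so treat the pattern as a starting point rather than a complete parser.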
 
Do you definitely need the resolved hostname in there? If not, just run traceroute with the -n option.
 