Become a MacRumors Supporter for $50/year with no ads, ability to filter front page stories, and private forums.

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
OK you all have been such a great help, there is one final stage in the journey of these log files. As seen in the title, I need to send the now parsed data into a table on a database.

Now my database knowledge is limited, as displayed through my previous posts on this thread, so it may be a little tricky.

Right now, the php parses the data to an html page. When the souce is viewed, it looks like this:

Code:
Array
(
    [0] => 06/07/07 02:44:16.516 INFO Servlet.Engine.Transports : 0 PortalMonitorHandler is initialised with 'alertableLevel' set to 15 and 'maxLevel' set to 40
    [1] => 06/07/07
    [2] => 02:44:16.516
    [3] => INFO
    [4] => Servlet.Engine.Transports : 0
    [5] => PortalMonitorHandler is initialised with 'alertableLevel' set to 15 and 'maxLevel' set to 40
)
Array
(
    [0] => 06/07/07 02:44:21.798 INFO Servlet.Engine.Transports : 0 returning  cpsManager 
    [1] => 06/07/07
    [2] => 02:44:21.798
    [3] => INFO
    [4] => Servlet.Engine.Transports : 0
    [5] => returning  cpsManager 
)

Items 1 - 5 need to be put into a table, in that order. I think I need to make a database and a table before this can draw the information, is that correct?

Thanks!
 

iBert

macrumors regular
Jul 14, 2004
148
0
Not really, once you have the information in an array in php you can still use php to present the data in a table using html. You might have got a glimpse of this while looking at php, but you don't really need to make a DB to display data. They are used depending what's been developed, when you use a DB it's easier to display the information since all you do is a query and get all the information right there and create a function to make the viewing page. I'm quite certain that now a days this should be very easy to do with php and functions provided by the language.

But since you need to store this log file in a DB, I guess it would be a good exercise to learn what you need to do. Once you have managed to store the data with the testing you are doing then you should be done or close to be done with this project.
 

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
Well, the real point of this project is to eventually be able to search the database. There is no need to view the entire thing on the screen, it will be far too large, but there is a need to search the data in the DB by date, time, a range of dates, a range of times, or what is contained in the message part. This is why I think I should probably use a DB.
 

macfaninpdx

macrumors regular
Mar 6, 2007
198
0
If this will always be used to parse the same log file, I would create the table/fields first, and then have the code send INSERT statements. But if you will be using this on many different log files (i.e. different fields each time), then you can have the code send the CREATE TABLE query.

To set up the table initially, use the Apache admin mentioned previously, and click on the phpmyadmin link on the menu at the left. This is an excellent web-based GUI to mySQL administration. Create the table and fields, and add a user with the correct permissions to the table.

Then in your PHP code, you will connect to mySQL and select the db. Then your foreach loop will send an INSERT for each line of the log file. If this takes too long to process (if you have hundreds of thousands of lines each time the code runs), you will either need to increase the PHP script timeout property, or you will need to use mysqli_multi_query.

The latter can take multiple INSERT statements separated by a semi-colon. So your foreach loop would concatenate the INSERT statements with a semi-colon (using .=) and then after your loop you could issue the mysqli_multi_query. Although I do not know what the limit is on how large the query can be (or how many statements it can contain).
 

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
If this will always be used to parse the same log file, I would create the table/fields first, and then have the code send INSERT statements. But if you will be using this on many different log files (i.e. different fields each time), then you can have the code send the CREATE TABLE query.

It will be for many different log files. One a day. All of this will be done at midnight, hopefully ready by 7 AM the next day.

To set up the table initially, use the Apache admin mentioned previously, and click on the phpmyadmin link on the menu at the left. This is an excellent web-based GUI to mySQL administration. Create the table and fields, and add a user with the correct permissions to the table.

Do I need to do any database-creating first? Or will this alone do the trick.
 

krunk

macrumors regular
Jan 29, 2004
236
0
Oh OK, thats the tricky part.

Do I need to write a script to create a database? Or can I do it in XAMPP?

Follow the instructions above to install phpMyAdmin. It's a very intuitive and thorough mysql front end in which you can create databases as well as manage them.

Or enter the mysql console and type:

> create database myDBName;
 

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
Follow the instructions above to install phpMyAdmin. It's a very intuitive and thorough mysql front end in which you can create databases as well as manage them.

Or enter the mysql console and type:

> create database myDBName;

Thanks!
 

stndn

macrumors member
Oct 22, 2006
80
1
earth
As you mentioned more about the thing you wanted to accomplish, and that you don't know things that well, here's some learning steps you can follow to figure out how things should work and make your next task a little easier.

You mentioned that:
1. You have several log files to work with
2. The log file will be parsed and entered to database on daily basis.
3. The user will be able to search the database for specific log file.

You also mentioned that everyday the script will parse the whole log file (which I hope is a 1-day log file and not everything from the years ago).

Since one of the tasks would be searching by date, I suggest that you make (in your database) your date column with type "DATE" or even "DATETIME" if you want to store the date and time together (although I don't think DATETIME can store up to milisecond). That way searching will be much easier since you can use the SQL's DATE functions and not simply the date characters stored as-is.

This means you will have to make some changes to the PHP script to make the text '15/08/07' into something like '2007-08-15'. After that, you can store the datestamp in the table as DATE (or DATETIME) type.


After that, assuming you have PHP and MySQL available, here's what you should do:
1.
Create a database (required, of course)

2.
Create one table for each of the log files you want to work with (to avoid mixed data and too much clutter). This process only needs to be done once, and you can use phpMyAdmin or other MySQL client to do that.

3.
Find out the SQL query to insert the data manually. Don't try to build upon the PHP script provided to you just yet.

4.
Once that works, modify the PHP script you have to connect, select, and insert data to database. Since you'll want to start from a simple trial, forget about parsing the log file and just use the hard-coded texts for now. Alternatively, create a sample log file that's small (a hundred record or so) just to check that your script works.

5.
Verify that your data has been entered correctly. Run the script several times and keep checking your database to make sure it works.

6.
Clean the table in your database from your test data (use 'TRUNCATE' on the tables)

7.
Modify your script to parse the real log file, and set up a cron job to run the PHP script on daily basis.

8.
Work on the search algorithm (which in itself is not really an easy task)


Note that this is assuming that you will only ADD new log to the database on daily basis, and not recreate the whole log on daily basis (which will be overkill anyway). I know I've left out some details, but you're free to ask for more information if you're stuck somewhere. -)


-stndn.
 

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
As you mentioned more about the thing you wanted to accomplish, and that you don't know things that well, here's some learning steps you can follow to figure out how things should work and make your next task a little easier.

You mentioned that:
1. You have several log files to work with
2. The log file will be parsed and entered to database on daily basis.
3. The user will be able to search the database for specific log file.

You also mentioned that everyday the script will parse the whole log file (which I hope is a 1-day log file and not everything from the years ago).

It is indeed only for that day.

Since one of the tasks would be searching by date, I suggest that you make (in your database) your date column with type "DATE" or even "DATETIME" if you want to store the date and time together (although I don't think DATETIME can store up to milisecond). That way searching will be much easier since you can use the SQL's DATE functions and not simply the date characters stored as-is.

This means you will have to make some changes to the PHP script to make the text '15/08/07' into something like '2007-08-15'. After that, you can store the datestamp in the table as DATE (or DATETIME) type.

I have actually been working through this is my head for the past few days. DATETIME does not store the millisecond, which is neccesary. I thought about making the millisecond it's own column, but that would require some extra regex work, and I am lucky I got to where I am with the regex, never mind doing more.

Also, I wouldn't be sure how to change the format of the time and date, and both would require these changes if DATETIME was to apply. Do you think the date handling in MySQL is worth the extra effort here?

After that, assuming you have PHP and MySQL available, here's what you should do:
1.
Create a database (required, of course)

2.
Create one table for each of the log files you want to work with (to avoid mixed data and too much clutter). This process only needs to be done once, and you can use phpMyAdmin or other MySQL client to do that.

Would I need to manually create a new table each day, or could I automatically create a table before the info is parsed each night?

3.
Find out the SQL query to insert the data manually. Don't try to build upon the PHP script provided to you just yet.

What do you mean?

4.
Once that works, modify the PHP script you have to connect, select, and insert data to database. Since you'll want to start from a simple trial, forget about parsing the log file and just use the hard-coded texts for now. Alternatively, create a sample log file that's small (a hundred record or so) just to check that your script works.

5.
Verify that your data has been entered correctly. Run the script several times and keep checking your database to make sure it works.

6.
Clean the table in your database from your test data (use 'TRUNCATE' on the tables)

7.
Modify your script to parse the real log file, and set up a cron job to run the PHP script on daily basis.

8.
Work on the search algorithm (which in itself is not really an easy task)


Note that this is assuming that you will only ADD new log to the database on daily basis, and not recreate the whole log on daily basis (which will be overkill anyway). I know I've left out some details, but you're free to ask for more information if you're stuck somewhere. -)


-stndn.

Thank you very much for the detailed reply. I am just a little confused about a few of your steps. You're saying I should skip the parser for now, and just work on drawing data directly to the table in the DB from the log file? Then, apply what I come up with to the parsed data?

I also need to decide about the DATETIME stuff. What is your opinion on this. Is it worth the change, from a search perspective?

Thanks
 

stndn

macrumors member
Oct 22, 2006
80
1
earth
Using the DATETIME format will be very beneficial when you are searching by date, because if you are storing it as plain text, you will have to break apart the values in your column into the date and time format before you can perform your date-based search. That means more work when selecting the data (per-use basis) vs one-time work when parsing and inserting your data.

MySQL (or rather, SQL) has many date-related functions you could use. For example, here's one from the MySQL manual on Date and Time Functions to select data from the last 30 days:

SELECT something FROM tbl_name
-> WHERE DATE_SUB(CURDATE(),INTERVAL 30 DAY) <= date_col;

As for the milisecond part, I think storing it in a separate column (of numeric type) will be better than trying to mash things (date and time) in string-type columns. Searching for numeric values in string columns is a mess.

You may need to change the PHP script a little bit. For one, you will need to convert the date and time to SQL format.

No need to use another regex (although you could). Use PHP's split() to accomplish that. If you are using PHP 5, you may even use strptime () to help you out. After you split the year, month, date, hour, minute, second, milisecond, you simply concatenate them as one string:

PHP:
$datetime = "20{$y}-{$m}-{$d} {$H}:{$M}:{$S}"; # Remember to prefix the '20' for your year to avoid problems
$milisecond = $MS;

Actually, here's a modified regex to parse the date and time parts on their own:
PHP:
foreach ($myText as $val)
{
    preg_match ('/^(\d+)\/(\d+)\/(\d+)\s+(\d+):(\d+):(\d+)\.(\d+)\s+(.*?)\s+(Servlet.Engine.Transports : \d+)\s+(.*)$/i', $val, $myResult);

    print_r ($myResult);
}

Would I need to manually create a new table each day, or could I automatically create a table before the info is parsed each night?

Only once when you initially create the database. Once it is created, you simply add more data into it on nightly basis. Or, you could be fancy and perform a check and have your script create the table if it doesn't exist. But I don't think it's necessary and it's too much work.

Find out the SQL query to insert the data manually. Don't try to build upon the PHP script provided to you just yet.

By this I mean, manually try to figure out the correct SQL query that will insert your data into your database. This means, from your MySQL client (phpMyAdmin is a good one), type your SQL INSERT query manually to perform the data entry. Once you find out the correct syntax for entering your data, then continue with the next step. This will save you frustration when your script doesn't work as expected, as you may be wondering whether the problem lies with the PHP script or the SQL query.

For step 4, sorry if it was confusing. What I meant was skip the parsing of your log file, and not skip the parsing of the input texts. So, instead of running the full script in post #50, try running the partial script in post #46 first. That is actually a trial and debugging step, where you try your completed PHP script to see whether it will work as expected. That is, check whether your PHP script can parse the data and insert it correctly. There's no point in running 5 hours of processing just to try out your script, right?

Step 3 and 4 should be repeated when you are doing your select script. So, try to come up with the correct SQL SELECT statement first, then apply it to your PHP script.

Btw, I'm assuming this is only for internal use, right? Because if it's for public use, then there's other things to worry about, such as security and all that.


-stndn.
 

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
OK, the updated regex returns this:

Code:
Array
(
    [0] => 06/07/07 02:44:16.516 INFO Servlet.Engine.Transports : 0 PortalMonitorHandler is initialised with 'alertableLevel' set to 15 and 'maxLevel' set to 40
    [1] => 06
    [2] => 07
    [3] => 07
    [4] => 02
    [5] => 44
    [6] => 16
    [7] => 516
    [8] => INFO
    [9] => Servlet.Engine.Transports : 0
    [10] => PortalMonitorHandler is initialised with 'alertableLevel' set to 15 and 'maxLevel' set to 40
)

So now that each piece of the date and time is split into a seperate item in the array, I can use the DATETIME function for it? Do I need to write a new script for this, or can I include it in the parser?

Edit: One more question. I am making my table in my database. Should the date and time go into one column? I am going to do it this way for now, let me know if I am wrong.

Edit: Here is what I made for my table:
5 columns:
DATETIME - type DATETIME - this is for date and time
MS - type INT(3) - this is the milliseconds.
LVL - varchar(10) - this will be INFO or DEBUG or something short like that
ID - varchar(30) - this is the "Servlet.Engine.Transports : 0" part
MSG - varchar(500) - this is the mssage, it can get pretty long

How does this look?
 

macfaninpdx

macrumors regular
Mar 6, 2007
198
0
So now that each piece of the date and time is split into a seperate item in the array, I can use the DATETIME function for it? Do I need to write a new script for this, or can I include it in the parser?

Edit: One more question. I am making my table in my database. Should the date and time go into one column? I am going to do it this way for now, let me know if I am wrong.

Edit: Here is what I made for my table:
5 columns:
DATETIME - type DATETIME - this is for date and time
MS - type INT(3) - this is the milliseconds.
LVL - varchar(10) - this will be INFO or DEBUG or something short like that
ID - varchar(30) - this is the "Servlet.Engine.Transports : 0" part
MSG - varchar(500) - this is the mssage, it can get pretty long

How does this look?
The DATETIME field does hold both the date and time info (see here). So you will need to concatenate your array fields to the following format before inserting it into the table:
YYYY-MM-DD HH:MM:SS

You could either concatenate the fields manually:
PHP:
$datefield = "{$yourarray[3]}-{$yourarray[1]}-{$yourarray[2]} {$yourarray[4]}:{$yourarray[5]}:{$yourarray[6]}";
Or you can use a combination of the mktime and date functions, as follows:
PHP:
$datefield = date( "Y-m-d H:i:s", mktime( $yourarray[4], $yourarray[5], $yourarray[6], $yourarray[1], $yourarray[2], $yourarray[3] ) );
 

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
The DATETIME field does hold both the date and time info (see here). So you will need to concatenate your array fields to the following format before inserting it into the table:
YYYY-MM-DD HH:MM:SS

You could either concatenate the fields manually:
PHP:
$datefield = "{$yourarray[3]}-{$yourarray[1]}-{$yourarray[2]} {$yourarray[4]}:{$yourarray[5]}:{$yourarray[6]}";
Or you can use a combination of the mktime and date functions, as follows:
PHP:
$datefield = date( "Y-m-d H:i:s", mktime( $yourarray[4], $yourarray[5], $yourarray[6], $yourarray[1], $yourarray[2], $yourarray[3] ) );

So I would need to includ this in my parsing script after the data is made into arrays then? I assume this would be placed in the script somewhere between when I parse it and when I send it to the DB.
 

macfaninpdx

macrumors regular
Mar 6, 2007
198
0
So I would need to includ this in my parsing script after the data is made into arrays then? I assume this would be placed in the script somewhere between when I parse it and when I send it to the DB.

Yes, you would add the line after your regex line parses the data into an array (and of course replace $yourarray above with whatever variable name you used to store the array).

Then your insert statement will be something like,
PHP:
$sql = "INSERT INTO tablename (`datetime`, `ms`, `lvl`, `id`, `msg`) VALUES ('$datefield', '{$yourarray[7]}', '{$yourarray[8]}', '{$yourarray[9]}', '{$yourarray[10]}')";

Now that I think about it, you probably shouldn't use reserved names as fieldnames in your MySQL table - I have been bitten by this before. Use something instead of DATETIME and ID for your field names.
 

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
Now that I think about it, you probably shouldn't use reserved names as fieldnames in your MySQL table - I have been bitten by this before. Use something instead of DATETIME and ID for your field names.

Oh yeah, good catch. Rookie mistake :eek:

OK here is what I have going on:

PHP:
<?php 

$logfile = file("C:\Documents and Settings\a407667\Desktop\Parser\sampleLogs"); 

foreach ($logfile AS $linenum => $val) { 
    preg_match ('/^(\d+)\/(\d+)\/(\d+)\s+(\d+):(\d+):(\d+)\.(\d+)\s+(.*?)\s+(Servlet.Engine.Transports : \d+)\s+(.*)$/i', $val, $myResult); 

    print_r ($myResult);   
}

$datefield = "{$myResult[3]}-{$myResult[1]}-{$myResult[2]} {$myResult[4]}:{$myResult[5]}:{$myResult[6]}";

$sql = "INSERT INTO diamond (`DT`, `MS`, `LVL`, `DENT`, `MSG`) VALUES ('$datefield', '{$myResult[7]}', '{$myResult[8]}', '{$myResult[9]}', '{$myResult[10]}')";  

  
?>

I assume I need to connect to the database at the beginning.
 

macfaninpdx

macrumors regular
Mar 6, 2007
198
0
Another thing I noticed: your array variable $myResult is getting overwritten for each line in the logfile. So that means you will either need to perform the SQL INSERT inside the foreach loop, or append the $myResult to a new array, and then use another foreach loop to process all the results.

For example:
PHP:
foreach ($logfile AS $linenum => $val) {  
    preg_match ('/^(\d+)\/(\d+)\/(\d+)\s+(\d+):(\d+):(\d+)\.(\d+)\s+(.*?)\s+(Servlet.Engine.Transports : \d+)\s+(.*)$/i', $val, $myResult);  

//    print_r ($myResult);    

      $datefield = "{$myResult[3]}-{$myResult[1]}-{$myResult[2]} {$myResult[4]}:{$myResult[5]}:{$myResult[6]}"; 

      $sql = "INSERT INTO diamond (`DT`, `MS`, `LVL`, `DENT`, `MSG`) VALUES ('$datefield', '{$myResult[7]}', '{$myResult[8]}', '{$myResult[9]}', '{$myResult[10]}')";
      $result = mysqli_query( $link, $sql );
}
or
PHP:
foreach ($logfile AS $linenum => $val) {  
    preg_match ('/^(\d+)\/(\d+)\/(\d+)\s+(\d+):(\d+):(\d+)\.(\d+)\s+(.*?)\s+(Servlet.Engine.Transports : \d+)\s+(.*)$/i', $val, $myResult);  

//    print_r ($myResult);

      $resultsArray[] = $myResult;
}
foreach( $resultsArray AS $thisResult ) {
      $datefield = "{$thisResult [3]}-{$thisResult [1]}-{$thisResult [2]} {$thisResult [4]}:{$thisResult [5]}:{$thisResult [6]}"; 

      $sql = "INSERT INTO diamond (`DT`, `MS`, `LVL`, `DENT`, `MSG`) VALUES ('$datefield', '{$thisResult [7]}', '{$thisResult [8]}', '{$thisResult [9]}', '{$thisResult [10]}')";
      $result = mysqli_query( $link, $sql );
}

Although as mentioned previously, this may take a long time if you have thousands of INSERT statements being executed. But I think as stndn mentioned, it is best to get the code working first before optimizing it.
 

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
I am using this as my code to connect to the database, but it is erroring out.

PHP:
$mysqli = new mysqli("localhost","test","******","roger");
if (mysqli_connect_errno()) 
{
  printf("Connection failed: %s\n", mysqli_connect_error());
  exit();
}

Is there something wrong with it?
 

macfaninpdx

macrumors regular
Mar 6, 2007
198
0
What is the error? Are you using PHP 5.0? Is your database "roger" created, and does the user "test" have access permission to it?
 

barr08

macrumors 65816
Original poster
Aug 9, 2006
1,361
0
Boston, MA
What is the error? Are you using PHP 5.0? Is your database "roger" created, and does the user "test" have access permission to it?

Yeah, roger exists, and test has access to it.

It was silly of me to post this without the error :eek:. Here it is:

Warning: mysqli_query() expects parameter 1 to be mysqli, null given in C:\xampp\htdocs\PHP\Parser2.php on line 20

Repeated one time for each entry in the log file.

Line 20 is:

Code:
      $result = mysqli_query( $link, $sql );
 
Register on MacRumors! This sidebar will go away, and you'll see fewer ads.