
macfaninpdx

macrumors regular
Original poster
Mar 6, 2007
198
0
I am wondering if anyone can help me form a Unix RegEx command that will parse a text file. I am familiar with Unix commands, but the regular expression complexity always seems to evade my comprehension.

Here is what I want to do:
Read a text file (see file format below) and load into some javascript arrays. The text file looks like this:
Code:
A bunch of text that can be ignored is at the top of the file.
Blah blah blah.

----------------------------------------------------------------
Filepath: [COLOR="Red"]/full/path/to/a/filename.ext[/COLOR]
Filename: [COLOR="red"]filename[/COLOR]
Title: filename [COLOR="red"]"Some Title"[/COLOR]
File Contents:
A bunch of other stuff I can ignore.

----------------------------------------------------------------
Filepath: [COLOR="Blue"]/full/path/to/a/filename2.ext[/COLOR]
Filename: [COLOR="Blue"]filename2[/COLOR]
Title: filename [COLOR="Blue"]"Some Other Title"[/COLOR]
File Contents:
A bunch of other stuff I can ignore.

So the end result will be that I will have three arrays. The "Red" text above will be the first element of the arrays, "Blue" will be second, etc. It will look like this:
Code:
array1[0] = /full/path/to/a/filename.ext;
array1[1] = /full/path/to/a/filename2.ext;

array2[0] = filename;
array2[1] = filename2;

array3[0] = "Some Title"
array3[1] = "Some Other Title"

I realize that I am in need of a lot of help, so I appreciate any advice you can give. I also realize that a small script, or at least a couple of commands, will probably be necessary. I am poring over the man pages now, and scouring the web for examples as you read this.

But if it is simple enough for someone to reply, I would tremendously appreciate it. :)

Thanks in advance!
 

savar

macrumors 68000
Jun 6, 2003
1,950
0
District of Columbia
You want the script to generate Javascript code, or you want it to generate javascript arrays?

In either case, the regexes are pretty simple. These are Perl-style regexes:

/^Filepath: (.*)$/
/^Filename: (.*)$/
/^Title: (.*)$/

Explanation of the first one (the other two are very similar): Match a line that begins with the text "Filepath: " and capture the part between the ": " and the end of the line.

In perl, for example, after executing "/^Filepath: (.*)$/", the information on that line would be extracted into a variable called $1. I imagine there is something similar in Javascript too. I still don't quite understand what you're trying to do.
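For what it's worth, here is a rough sketch of the JavaScript equivalent of Perl's $1. In JavaScript the captured text is element 1 of the array returned by exec(). The helper name captureAll and the inline sample text are made up for illustration; they are not from the original file or from any widget API.

```javascript
// Hypothetical helper (not part of any API): return capture group 1
// from every line of `text` that matches `pattern`.
function captureAll(text, pattern) {
  const results = [];
  for (const line of text.split("\n")) {
    const match = pattern.exec(line); // null when the line does not match
    if (match) results.push(match[1]); // match[1] plays the role of Perl's $1
  }
  return results;
}

// Sample input modeled on the file format in the first post (assumed, abbreviated).
const sample = [
  "Filepath: /full/path/to/a/filename.ext",
  "Filename: filename",
  'Title: filename "Some Title"',
  "File Contents:",
  "Filepath: /full/path/to/a/filename2.ext",
  "Filename: filename2",
  'Title: filename "Some Other Title"',
].join("\n");

const array1 = captureAll(sample, /^Filepath: (.*)$/);
const array2 = captureAll(sample, /^Filename: (.*)$/);
const array3 = captureAll(sample, /^Title: (.*)$/);
```

Note that the Title pattern as written captures everything after "Title: ", so each array3 element still starts with the filename before the quoted title. Narrowing it to just the quoted part would need a tighter pattern, which depends on how the real file is laid out.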
 

macfaninpdx

macrumors regular
Original poster
Mar 6, 2007
198
0
I still don't quite understand what you're trying to do.

It might make more sense if I give the bigger picture. I am creating a Widget. In the JavaScript code of the widget, I will be using a Unix shell command to read and parse a text file. The stdout will be stored in a Javascript variable, or a Javascript array, whichever is easier depending on the output.

So, in a Widget JavaScript I can do something like this:
Code:
var myresult = widget.system("/bin/egrep '<regex expression here>' ~/inputfile.txt").outputString;

I will look at the Perl expressions you listed. If the variable myresult (above) is a text string, I can split it into an array, which will be perfect.

Thanks.
 

macfaninpdx

macrumors regular
Original poster
Mar 6, 2007
198
0
Got it working

OK, using the perl suggestion above, I came up with the following:
Code:
var myresult = widget.system("/usr/bin/egrep '^Filepath: (.*)$' inputfile.txt", null).outputString;
var myarray = myresult.split(/\n/);

Thanks for your help. The reason I was having so many problems in the first place was that the text file had Mac line endings instead of Unix ones, so I had to use tr to convert it. Then my results were much easier to understand. ;)
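In case it helps anyone else who hits the same line-ending problem, the conversion could probably also be done on the JavaScript side once the text is in a string. This is only a sketch: the function name toLines is made up, and dropping empty lines is my own choice rather than anything the widget requires.

```javascript
// Hypothetical helper: normalize classic Mac (\r) and Windows (\r\n)
// line endings to Unix (\n), then split into non-empty lines.
function toLines(text) {
  return text
    .replace(/\r\n?/g, "\n") // "\r\n" or a lone "\r" becomes "\n"
    .split("\n")
    .filter((line) => line.length > 0); // drops blank lines, including a trailing one
}
```

With this, a string that uses bare carriage returns splits the same way as one that uses newlines, so the egrep output and the raw file contents can be handled alike.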

One more question: The output above returns the entire line, including the "Filepath: " text. Is there a way I can return the line not including the "Filepath: "? I can always clean it up afterward using a replace, but I am always trying to learn more.
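One approach I am considering, sketched here without having tried it in the widget: since egrep prints whole lines, the label can be removed in JavaScript after the fact. The helper name stripLabel is made up for illustration. On the shell side, something like sed -n 's/^Filepath: //p' inputfile.txt should print only the part after the label, though I have not checked how that behaves through widget.system.

```javascript
// Hypothetical helper: given multi-line command output and a label such as
// "Filepath", return the text after "Label: " for each line that carries it.
function stripLabel(output, label) {
  const prefix = label + ": ";
  return output
    .split("\n")
    .filter((line) => line.startsWith(prefix)) // skips blank or unrelated lines
    .map((line) => line.slice(prefix.length)); // keep only what follows the label
}
```

Called as stripLabel(myresult, "Filepath"), this would give the array of paths directly, which replaces the separate split and cleanup steps.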

Thanks.
 