
Snipeye

macrumors newbie
Original poster
Sep 29, 2008
26
0
Alright. I have a large amount of text I need to process in a couple of ways.

It all starts with 1 big text file, formatted kinda like this:

Code:
directory/folder1/name1
directory/folder1/name2
directory/folder1/name3
directory/folder1/name4

directory/folder2/name5
directory/folder2/name6
directory/folder2/name7
directory/folder2/name8

directory/folder3/name9
directory/folder3/name10
directory/folder3/name11
directory/folder3/name12


Sometimes the folders have more than 4 items, sometimes fewer - but the key thing is that there is an extra return (a blank line) between them. There is also some junk that may need to be filtered out, but the first part, the directory/ prefix, will not be part of the junk.

I need a script/workflow to separate these into separate text files (based on the returns between the sections) and name each file something like 'file1' 'file2' 'file3' etc.

STEP 2: (Maybe part of step 1?)

I now need to take these files, and count how many lines are in each, and set the name accordingly: If there were 11 lines in file1, then it would name it: "file1(11)"

STEP 3:

Now, it takes one of these files (for example, file1(11)), copies line 1 (without the return), and pastes it in place of "COMMAND 1" in another text file with a similar name, like "file1(11).differentextension". It then takes line 2 and uses it to replace COMMAND 2, line 3 replaces COMMAND 3, and so on.

STEP 4: (Maybe part of step 3?)

It takes the name part of one of the lines:

directory/folder1/NAME1

So, the "NAME1" part, and replaces "ECHO1" with "NAME1". The "ECHO1" will be in the same file as COMMAND 1, COMMAND 2, and COMMAND 3.

Occasionally, there will be some stuff after name 1, but it will be separated by a comma - directory/folder1/NAME1,blah blah blah - but I still only want the NAME1 part.



How possible is this? I don't care if it takes a bunch of files/workflows, but there's so much info I have to go through that doing it manually is out of the question.
 
Do you know regular expressions at all? This is beyond my level of awk experience, but you could probably do this with an awk script or three--or maybe just bash using sed. Python has regular expressions built in as a library, so that's also a viable option.
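For what it's worth, the splitting step is short in Python. This is only a sketch under assumptions: the function name `split_groups` is made up, groups are taken to be separated by one or more blank lines, and any line that does not start with `directory/` is treated as junk and dropped.

```python
import re

def split_groups(text, prefix="directory/"):
    """Split text into groups on blank lines, keeping only lines that start with prefix."""
    groups = []
    # One or more blank (or whitespace-only) lines separate the groups.
    for chunk in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in chunk.splitlines()]
        # Assumption: real entries start with the prefix; everything else is junk.
        kept = [ln for ln in lines if ln.startswith(prefix)]
        if kept:
            groups.append(kept)
    return groups

sample = """directory/folder1/name1
directory/folder1/name2
some junk line

directory/folder2/name5
directory/folder2/name6
directory/folder2/name7
"""

groups = split_groups(sample)
for number, group in enumerate(groups, start=1):
    # Builds names in the "file1(2)" style asked for in step 2.
    print(f"file{number}({len(group)})", group)
```

The line count in each name falls out of `len(group)`, so steps 1 and 2 can probably be handled in one pass.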

I'm no Automator expert, but unless its capabilities are far beyond what I'm expecting, I wouldn't even begin to consider it as an option.
 
Erm, just for future reference - I have extremely little programming experience of any kind, except for web programming.

Does anybody know what would be the best/easiest way to go about this?
 
Programming is programming (unless it's the fake HTML stuff that people call programming).

This is complex enough that I doubt anyone is just going to write it for you. If you know what an "if/else" statement is, you know what a function is, and you understand the concept of wildcards, then you could try learning the little bit of Python you would need to know to write something.
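To give a feel for how little Python steps 3 and 4 might take, here is a rough sketch. The function name `fill_template` and the sample template are invented for illustration, and it assumes that `ECHOn` pairs with line n the same way `COMMAND n` does, which the original question does not spell out.

```python
import re

def fill_template(template, lines):
    """Replace COMMAND n / ECHOn placeholders using the n-th line of a group."""
    result = template
    for n, line in enumerate(lines, start=1):
        # NAME part: drop anything after the first comma, then keep the last path component.
        name = line.split(",", 1)[0].rsplit("/", 1)[-1]
        # (?!\d) stops "COMMAND 1" from also matching inside "COMMAND 11".
        # A function is used as the replacement so backslashes in the data are kept literally.
        result = re.sub(rf"COMMAND {n}(?!\d)", lambda m: line, result)
        result = re.sub(rf"ECHO{n}(?!\d)", lambda m: name, result)
    return result

# Hypothetical template text; the real file layout is not shown in the thread.
template = "run COMMAND 1\nrun COMMAND 2\nsay ECHO1 ECHO2\n"
lines = ["directory/folder1/name1,blah blah blah", "directory/folder1/name2"]

filled = fill_template(template, lines)
print(filled)
```

The lookahead matters once a group has ten or more lines; a plain string replace of "COMMAND 1" would also clobber the start of "COMMAND 10" through "COMMAND 19".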
 
How possible is this? I don't care if it takes a bunch of files/workflows, but there's so much info I have to go through that doing it manually is out of the question.

First, as Detrius pointed out, you can probably use awk, sed, bash, etc. There are plenty of people here who can walk you through it, but you'll be doing a lot of learning on your own.

Second, it should also be pretty easy to do this in AppleScript. You might pay a performance penalty, but you might gain on the back end because it'll be easier to maintain in the long run.

The first part will be to break up your data into chunks. Here's this much:

Code:
set blob to "directory/folder1/name1
directory/folder1/name2
directory/folder1/name3
directory/folder1/name4

directory/folder2/name5
directory/folder2/name6
directory/folder2/name7
directory/folder2/name8

directory/folder3/name9
directory/folder3/name10
directory/folder3/name11
directory/folder3/name12"

set myDelim to ((ASCII character 10) & (ASCII character 10))

set TID to AppleScript's text item delimiters
set AppleScript's text item delimiters to myDelim

set blobprime to text items of blob

-- blobprime is a list, so use "item" rather than "text item" here
set ablob to item 1 of blobprime

set AppleScript's text item delimiters to TID

blobprime

If you run this in AppleScript editor, the results window will show a list of items, with your data split into groups at each double return. (If there's an error, it could be with the line on ASCII character 10, the line feed. That's what popped up when copying and pasting the data from MacRumors to AS. Your real data could use ASCII 13, the carriage return, instead.)
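The line-ending question comes up in any language. As a hedged illustration in Python rather than AppleScript, one way to sidestep it is to convert everything to ASCII 10 before splitting. The helper name `normalize_newlines` is made up for this sketch.

```python
def normalize_newlines(text):
    """Convert CRLF and bare CR (ASCII 13) line endings to LF (ASCII 10)."""
    # Replace CRLF first so each pair becomes one LF, not two.
    return text.replace("\r\n", "\n").replace("\r", "\n")

# Sample data that uses carriage returns only.
cr_data = "directory/folder1/name1\rdirectory/folder1/name2\r\rdirectory/folder2/name5"

chunks = normalize_newlines(cr_data).split("\n\n")
print(len(chunks))
```

After normalizing, the double-return split works the same way no matter which line ending the source file used.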

This will write the data to file:

Code:
repeat with x from 1 to (number of items in blobprime)
	set y to number of paragraphs of item x of blobprime
	set fileName to "file" & x & "(" & y & ")"
	display dialog fileName
	
	set msgToUser to "File saved OK"
	
	set theFile to choose file name default name fileName
	
	try
		set fileRef to (open for access theFile with write permission)
	on error errMsg number errNum
		display dialog ("Open for Access, Error Number: " & errNum as string) & return & errMsg
	end try
	
	set filesEOF to get eof fileRef
	
	set dataOut to item x of blobprime as text
	
	set eof of fileRef to 0
	
	try
		write dataOut to fileRef
	on error errMsg number errNum
		display dialog ("Write, Error Number: " & errNum as string) & return & errMsg
	end try

	set eof of fileRef to (length of dataOut)
	
	try
		close access fileRef
	on error errMsg number errNum
		display dialog ("Close, Error Number: " & errNum as string) & return & errMsg
	end try
	
	display dialog msgToUser
end repeat

The choose file name step is unnecessary. I'm sure you know where the data should appear, but this makes a script I can test and send to you. I overdo try blocks, in part because it makes it easier to debug and in part because it's easier than making assumptions on your end.
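For comparison, a version with no save dialog might look something like the following Python sketch. The function name `write_groups`, the `.txt` extension, and the temporary output folder are all assumptions made for the example; point it at a real folder once the naming looks right.

```python
import tempfile
from pathlib import Path

def write_groups(groups, out_dir):
    """Write each group to out_dir as file<N>(<line count>).txt and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, lines in enumerate(groups, start=1):
        # Name carries the chunk number and its line count, e.g. "file1(2).txt".
        path = out_dir / f"file{number}({len(lines)}).txt"
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        paths.append(path)
    return paths

sample_groups = [
    ["directory/folder1/name1", "directory/folder1/name2"],
    ["directory/folder2/name5"],
]

# A throwaway folder keeps the example from touching real files.
with tempfile.TemporaryDirectory() as tmp:
    names = [p.name for p in write_groups(sample_groups, tmp)]
print(names)
```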

The rest of your needs -- I'm not sure I follow what you need and I gotta run. But basic text substitution is easy in AppleScript. It shouldn't be hard to develop what you need.

mt
 