zippyfly

macrumors regular
Original poster
Mar 22, 2008
141
0
Hi. I am trying to parse CSV data. Not a huge file. Probably a kilobyte or two.

Should I just use NSString's stringWithContentsOfFile and iterate over the string to pick out the comma-delimited data? (Is there a maximum NSString length?)

I was looking at NSData and I wasn't really clear on how to iterate over the data in the NSData object. I guess "bytes" gets me the head pointer, and I increment the pointer to iterate over the length?

(I take it the advantage of NSData is the huge file size it can load into memory as pure bytes, correct? And in this case I don't need to worry about it and just parse the loaded NSString?)

Thanks.

(By the way, I googled some CSV code around the Internet but want to roll my own; mostly because I don't really understand what others are doing so might as well just do it my own way and understand it).
 
I'd suggest dropping the data into an NSString and using the NSScanner class to actually parse it. The main problem with reading bytes directly from NSData is that you are assuming the input file is either UTF-8 or ASCII. That may hold for 90% of cases, but it could cause issues down the line.
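
To make the encoding point concrete, here is a minimal sketch of loading the file with the encoding stated explicitly instead of left to chance. The helper name LoadText is made up for illustration, and UTF-8 is only an assumption about the file; swap in whatever encoding the data actually uses.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): loads a text file as an NSString.
// Assumes the file is UTF-8; returns nil if it cannot be read or decoded.
static NSString *LoadText(NSString *path) {
    NSError *error = nil;
    NSString *text = [NSString stringWithContentsOfFile:path
                                               encoding:NSUTF8StringEncoding
                                                  error:&error];
    if (text == nil) {
        // A missing file and a wrong encoding both end up here.
        NSLog(@"Could not read %@: %@", path, [error localizedDescription]);
    }
    return text;
}
```

A nil result also covers the "file does not exist" case, so a separate NSFileManager check is optional.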
 
Hello... I am stuck on the NSScanner part.

I've successfully opened the CSV file and read it into a string.

Now I am parsing the NSString...

Code:
NSScanner *myScanner = [NSScanner scannerWithString:csvData];
		
NSString	nameValue;
float		numValue;
	
[myScanner scanCharactersFromSet:[NSCharacterSet alphanumericCharacterSet] intoString:&nameValue];

[myScanner scanFloat:&numValue];

The scanFloat seems straightforward but I found there was no scanString equivalent. Looking up the API I see a few promising methods, and tried scanCharactersFromSet: intoString:

The scanCharactersFromSet: wants a set from NSCharacterSet (I assume it checks against that set and converts the bytes to chars) and I see an alphanumericCharacterSet, so I am trying to return that set from NSCharacterSet.

Anyway, it's not at all working and I am totally confused.

Xcode also says NSString nameValue; is statically allocated (???)

Help...
 
You could also use -[NSString componentsSeparatedByString:] on an NSString containing the entire file. First to separate each row, and then to separate each comma-delimited field*.

* You will need to handle escaped commas manually.
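
A rough sketch of that two-pass split, in case it helps. ParseSimpleCSV is a made-up name, and the code assumes "\n" line endings and no quoted or escaped commas, so treat it as a starting point rather than a full CSV parser.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): splits CSV text into rows,
// where each row is an NSArray of field strings.
// Assumes "\n" line endings and no quoted or escaped commas.
static NSArray *ParseSimpleCSV(NSString *text) {
    NSMutableArray *rows = [NSMutableArray array];
    for (NSString *line in [text componentsSeparatedByString:@"\n"]) {
        if ([line length] == 0) {
            continue;   // skip blank lines, e.g. the one after a trailing newline
        }
        [rows addObject:[line componentsSeparatedByString:@","]];
    }
    return rows;
}
```

Every field comes back as an NSString, so numeric columns still need floatValue or intValue afterwards.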
 
Thanks guys for pointing those out.

I'll consider doing the string analysis manually but...

...back on NSScanner (which I want to learn how to use properly), what should I be doing to parse strings from the input?

Thanks.
 
Actually... I just ran the program after adding back the pointer operator to the string head and it ran OK. I will now do some tests to see if the values were actually taken by the NSScanner.
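
For anyone hitting the same "statically allocated" message: Objective-C objects are always handled through pointers, so the variable has to be declared as NSString *, and the scanner fills it in through its address. A minimal sketch, with ScanName as a made-up helper name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): returns the leading run of
// letters and digits in text, or nil if the text starts with something else.
static NSString *ScanName(NSString *text) {
    NSScanner *scanner = [NSScanner scannerWithString:text];
    NSString *nameValue = nil;   // a pointer; "NSString nameValue;" does not compile
    if ([scanner scanCharactersFromSet:[NSCharacterSet alphanumericCharacterSet]
                            intoString:&nameValue]) {
        return nameValue;
    }
    return nil;
}
```

Note that a space is not alphanumeric, so a name like "Los Angeles" would be cut off after "Los" with this character set. Scanning up to the delimiter avoids that.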
 
I keep getting

program received signal: “EXC_BAD_ACCESS”.

and have tried many, many things... double checking the data with

Code:
	NSLog (@"Scanner string\n");
	NSLog (@"%@",[myScanner string]);

shows that the NSScanner was properly initialized with the string.

What might “EXC_BAD_ACCESS” mean here?

I get that error immediately when it starts parsing the first string (which is ASCII text).

The text file data is basically:

ASCII_location_name;float;float;int;float;float;int;\n

To narrow down the problem, I have manually changed all commas in the CSV text file to semicolons, and am using:

Code:
NSCharacterSet *semicolonSet;	
semicolonSet = [NSCharacterSet characterSetWithCharactersInString:@";"];
and

Code:
while ([myScanner isAtEnd] == NO) {

NSLog (@"%@",[myScanner scanUpToCharactersFromSet:semicolonSet intoString:&location]);

...
 
Post your code.

In its entirety.

Not fragments.

Or pieces.

Broken across multiple lines.

And don't ... because it's not.

Be sure to include about 5 or so lines of your input file, and enclose it in a QUOTE or CODE block so it doesn't get mangled.

You should also consider posting the crash-log from the EXC_BAD_ACCESS, or the backtrace from the debugging window, if you're running a debugger.
 
Code:
#import <Foundation/Foundation.h>
#import <Foundation/NSObject.h>
#import <Foundation/NSString.h>
#import <Foundation/NSFileManager.h>
#import <Foundation/NSAutoreleasePool.h>
#import <Foundation/NSDictionary.h>

int main (int argc, const char * argv[]) {

    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    
    // Set up hard path

	NSString *fName = [NSHomeDirectory() stringByAppendingFormat:@"/prototype/marketdata.txt"];
	
	// Set up for reading CSV data
	
	NSString *csvData;
	
	NSFileManager *fm;

	fm = [NSFileManager defaultManager];
	
	if ( [fm fileExistsAtPath: fName] == NO ) {
		
		NSLog(@"File Not Exist");
		return 1;
	}
		 
	// load NSString with CSV data from file
	
	csvData = [NSString stringWithContentsOfFile: fName];

	// Parse CSV with NSScanner
	
    NSScanner *myScanner = [NSScanner scannerWithString:csvData];
	
	NSString	*location;
    float		theRevenue;
	float		thePercent;
	int			theRank;
	
	// Set up data delimiter using semicolon

	NSCharacterSet *semicolonSet;
	
	semicolonSet = [NSCharacterSet characterSetWithCharactersInString:@";"];
	
	// Double check scanner string

	NSLog (@"Scanner string\n");
	NSLog (@"%@",[myScanner string]);
	
	// scanner loop start

	while ([myScanner isAtEnd] == NO) {
		
		// Read Location text up to ; delimiter
		
		NSLog (@"Reading Location\n");
				
		NSLog (@"%@",[myScanner scanUpToCharactersFromSet:semicolonSet intoString:&location]);
		
		// Skipping the ; delimiter
		
		[myScanner scanString:@";" intoString:NULL]; 
		
		// Read Revenue data up to ; delimiter and skipping

		NSLog (@"Reading Revenue\n");
		NSLog (@"%i",[myScanner scanFloat:&theRevenue]);
		[myScanner scanString:@";" intoString:NULL]; 
		
		// Read Percentage data up to ; delimiter and skipping

		NSLog (@"Reading Percentage\n");
		NSLog (@"%i",[myScanner scanFloat:&thePercent]);
		[myScanner scanString:@";" intoString:NULL]; 
		
		// Read Ranking data up to ; delimiter and skipping

		NSLog (@"Reading Ranking\n");
		NSLog (@"%i",[myScanner scanInt:&theRank]);
		[myScanner scanString:@";" intoString:NULL]; 
		
	}	
	
	[pool drain];
		
    return 0;
}
 
Data sample as in text file:

Code:
Los Angeles;8.25;0.580561574;1;Tokyo;1.9;0.643872234;1;Honolulu;0;0;0;Toronto;7.9;5.3322;3;

Data sample for easier reading:

Code:
Los Angeles;8.25;0.580561574;1;
Tokyo;1.9;0.643872234;1;
Honolulu;0;0;0;
Toronto;7.9;5.3322;3;
 
NSLog (@"%@",[myScanner scanUpToCharactersFromSet:semicolonSet intoString:&location]);

What type does scanUpToCharactersFromSet:intoString: return?

Does that type match the "%@" format specifier?

If you had made a separate call to scanUpToCharactersFromSet:intoString: you might have seen the error sooner. For example:

Code:
location = nil;

mysteryTypenameGoesHere success =
[myScanner scanUpToCharactersFromSet:semicolonSet intoString:&location];

NSLog (@"%@",location);

I recommend putting less code inside NSLog() function calls, and more in direct and simple statements.
 
Hi -- thanks very much.

And thanks for the guideline about separating out the code.

After double-checking the API documentation, I revised the code to be:

Code:
NSLog (@"Reading Location\n");
		
BOOL tempVal =[myScanner scanUpToCharactersFromSet:semicolonSet intoString:&location];
				
NSLog (@"%@",location);
And it works!

I will now proceed to the rest of the program.
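
For the record, the reason the original line crashed: scanUpToCharactersFromSet:intoString: returns a BOOL, while %@ tells NSLog to treat the argument as an object and message it. A YES is just the integer 1, which is not a valid object address, and that is typically what produces EXC_BAD_ACCESS. A small sketch of matching each value to its format specifier (ScanField is a made-up helper name):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): scans one field up to the next ';'
// and returns it, or nil if no characters were scanned.
static NSString *ScanField(NSScanner *scanner) {
    NSCharacterSet *semicolonSet =
        [NSCharacterSet characterSetWithCharactersInString:@";"];
    NSString *field = nil;
    BOOL found = [scanner scanUpToCharactersFromSet:semicolonSet
                                         intoString:&field];
    NSLog(@"found = %d", found);   // BOOL is an integer type: log it with %d
    NSLog(@"field = %@", field);   // NSString * is an object: log it with %@
    return found ? field : nil;
}
```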
 
Further revised to:

Code:
if ( [myScanner scanUpToCharactersFromSet:semicolonSet intoString:&location] ) {
			
NSLog (@"%@",location);

}
to avoid using temp BOOL; will later consolidate the BOOL method returns into a single if.
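
One possible shape for that consolidated version, as a sketch only. It assumes the four-field "name;float;float;int;" layout from the data sample above, and ParseRecords is a made-up name. Because && short-circuits, scanning stops at the first field that fails.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): parses "name;float;float;int;"
// records and returns how many complete records were read.
static NSUInteger ParseRecords(NSString *text) {
    NSScanner *scanner = [NSScanner scannerWithString:text];
    NSCharacterSet *semicolonSet =
        [NSCharacterSet characterSetWithCharactersInString:@";"];
    NSUInteger count = 0;

    while (![scanner isAtEnd]) {
        NSString *location = nil;
        float revenue = 0, percent = 0;
        int rank = 0;

        // Each scan returns a BOOL; one failed scan fails the whole record.
        if ([scanner scanUpToCharactersFromSet:semicolonSet intoString:&location] &&
            [scanner scanString:@";" intoString:NULL] &&
            [scanner scanFloat:&revenue] &&
            [scanner scanString:@";" intoString:NULL] &&
            [scanner scanFloat:&percent] &&
            [scanner scanString:@";" intoString:NULL] &&
            [scanner scanInt:&rank] &&
            [scanner scanString:@";" intoString:NULL]) {
            NSLog(@"%@: revenue %f, percent %f, rank %d",
                  location, revenue, percent, rank);
            count++;
        } else {
            break;   // malformed record: stop instead of looping forever
        }
    }
    return count;
}
```

The break matters: when a scan fails the scanner does not advance, so without it the while loop would never reach the end of the string.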
 