
nino65

macrumors newbie
Original poster
Dec 12, 2007
8
0
Hi all,
I was a Linux user. I bought a MacBook Pro a month ago and installed Leopard: everything is fine... but:
I wrote a C program that reads several data files and does a lot of computation with them. On a Linux box with an Intel Core Duo it used to take 10 to 20 seconds to finish its job, while on my new Intel MacBook Pro it takes several minutes.

I checked with top and BigTop and the CPU usage is below 10%, and I can't manage to raise it, not even with renice. On Linux the same process used up to 100% of one CPU.

I suspect there is a cap on CPU usage for processes launched by users, isn't there?

Any idea? can anyone help me with some hints?

thanks in advance
nino65
 

hhlee

macrumors 6502
May 19, 2005
255
1
the bottleneck is in the hard disk access bandwidth, particularly if you're loading a bunch of smaller files. you could try to cat all the files together - that way the system only has to cache one large file.
 

nino65

macrumors newbie
Original poster
Dec 12, 2007
8
0
thanks for the fast answer, hhlee,
I had the same suspicion, and with that in mind I bought the 7200 rpm HD. Nevertheless, I can't believe the Linux HD was so much faster than this one as to justify the difference in performance.

I also checked with Shark, and the most time-consuming part was a loop in which I do some calculations, not the file reading, but maybe Shark does not report on the system calls...
 

ChrisA

macrumors G5
Jan 5, 2006
12,914
2,164
Redondo Beach, California
First off, I'm not surprised that Linux is faster. There has been an open competition for years among kernel hackers to squeeze performance out of the code. Mac OS X was written with different goals. Also, Mac OS X runs on the Mach microkernel, and every system call goes through that layer. But 3x slower? My guess would have been 1.5x or thereabouts.

Leopard has DTrace. They got it from Sun. It is the perfect tool if you'd like to figure out what's going on.
http://www.sun.com/bigadmin/content/dtrace/

If you want anyone here to help, you will have to post small bits of C code.
 

iSee

macrumors 68040
Oct 25, 2004
3,540
272
There isn't a default cap on CPU usage of user processes, so something else is bottlenecking your code.

Are you doing any I/O at all in your computation loops? Any reading or writing to file or console, etc?

If not, maybe it is memory allocation. Are you doing lots of allocation and freeing of memory in your computation loops? Are there large allocations (take up a large fraction of available RAM)?
 

Gelfin

macrumors 68020
Sep 18, 2001
2,165
5
Denver, CO
I suspect there is a cap on CPU usage for processes launched by users, isn't there?

Definitely not, and this is easy to check:
Code:
int main(void)
{
   int i = 0;
   /* note: incrementing past INT_MAX is signed overflow, which is
      technically undefined behavior; in practice the loop may exit
      once i wraps negative */
   while(i >= 0) i++;
   return 0;
}

This produces 100%ish CPU.

From what you've described my first suspicion was a filesystem bottleneck too, but there are a number of things that might be causing your issue, and it's hard to know without seeing your code. Are you reading in a very large amount of data? Even if your primary FS reads are fine, there's always the chance you could be swapping yourself into the ground. How does the RAM on the MacBook Pro compare to the Linux box?
 

Catfish_Man

macrumors 68030
Sep 13, 2001
2,579
2
Portland, OR
Shark is your best friend in these things, and it most definitely does report system calls. Learn to use its data mining tools and it should help you pinpoint the issue.
 

nino65

macrumors newbie
Original poster
Dec 12, 2007
8
0
thanks a lot to everyone,
I'll try to summarize here the answers to each question.

first of all, the code is several hundred lines, so I can't post it here. The Linux box and my MacBook Pro both have an Intel Core Duo with 2 GB of memory. I don't remember the model of the hard drive in the Linux machine, but it was certainly nothing special.

- I used Shark and the time profile is (>3%)
11.3% 11.3% AutoCode buildForecasts
9.8% 9.8% libSystem.B.dylib __svfscanf_l
7.0% 7.0% mach_kernel ml_set_interrupts_enabled
4.2% 4.2% mach_kernel memcmp
4.2% 4.2% mach_kernel name_cache_unlock
3.5% 3.5% mach_kernel cache_lookup_path
3.5% 3.5% libSystem.B.dylib strtod_l$UNIX2003

- I verified, with the simple program suggested by Gelfin, that there is no CPU cap. By the way, I had to modify it a bit because it quickly exited with i=2147483648 :D

- As first suggested by hhlee, and then by iSee, Gelfin... the bottleneck SURELY IS the file reading (it reads about 40,000 files of 100 rows x 2 cols each). I tried to improve the code a bit by "caching" the files in a matrix holding 1024 files at a time, but it did not help very much. I think I have to do what hhlee suggested: joining all the files together (or into a few big ones).

- STILL, the surprise (and the mystery) is that the Linux box was not 3 but 30 TIMES FASTER!

thanks again to all of you
 

garethlewis2

macrumors 6502
Dec 6, 2006
277
1
If you are coming from a Linux background and moving to OS X, you are going to bump into a hell of a lot more bottlenecks than you would ever expect.

OS X needs to make sure that when you read a file, it is actually sending that data correctly, so the file-reading code waits until all IO is finished before returning. In Linux, since it is not an industrial strength OS, the OS call returns immediately, spawning a separate thread to read the file; it couldn't give a rat's arse whether the data was read correctly. This is where Linux is trouncing OS X.
 

fimac

macrumors member
Jan 18, 2006
95
1
Finland
OS X needs to make sure that when you read a file, it is actually sending that data correctly, so the file-reading code waits until all IO is finished before returning. In Linux, since it is not an industrial strength OS, the OS call returns immediately, spawning a separate thread to read the file; it couldn't give a rat's arse whether the data was read correctly. This is where Linux is trouncing OS X.

Huh? Of course Linux is industrial strength. Were you being sarcastic? Do you have a link? Are you confusing read with write?

This may be of interest to the OP: http://sekhon.berkeley.edu/macosx/.
 

stupidregister

macrumors member
Sep 29, 2007
52
0
In Linux, since it is not an industrial strength OS, the OS call returns immediately, spawning a separate thread to read the file; it couldn't give a rat's arse whether the data was read correctly.

Doesn't this imply the opposite? That is unless slowness is considered a strength.
 

iSee

macrumors 68040
Oct 25, 2004
3,540
272
...(it reads about 40,000 files of 100 rows x 2 cols each)....

Opening 40,000 files is a lot!

Seek time on hard drives is measured in milliseconds. With an average seek time of 8 ms, and assuming one seek per file, it is going to take 40,000 × 0.008 s = 320 seconds just to position the read head over the first sector of data of every file.

If the Linux box had a desktop hard drive in it, it is going to be a lot faster than the laptop hard drive (even at 7200 RPM) in the Macbook. Other components of the I/O stack are probably faster too.

But, as the OP points out, something else is going on.

8-9 ms is a pretty representative average seek time for a desktop hard drive; laptop drives are usually at least a couple of ms slower (though I haven't looked into this for a while).

So obviously the Linux box is doing a great job beating the average seek time, while the MacBook is not.

It's not necessarily OS X's fault, though. You could try OS X with a different file system, for example. And it might be the hard drive's built-in cache that is doing such a good job on the Linux box.

If I had to guess, though, I'd "blame" OS X. Optimizing something like this (reading 40,000 small files) just isn't part of the OS X development philosophy; they concentrate on optimizing the overall user-level responsiveness of the system.
 

Catfish_Man

macrumors 68030
Sep 13, 2001
2,579
2
Portland, OR
Rather than get into pointless platform contention here, let's brainstorm solutions.

Have you considered mmap()ing the files? What are your read patterns for them?
 

eastcoastsurfer

macrumors 6502a
Feb 15, 2007
600
27
In Linux, since it is not an industrial strength OS, the OS call returns immediately, spawning a separate thread to read the file; it couldn't give a rat's arse whether the data was read correctly. This is where Linux is trouncing OS X.

That's funny.

To the OP: are you doing any multi-threading? The BSD code that OS X is based on used to be notoriously slow at threading (IIRC, context switching was extremely slow), to the point where it was recommended you not use OS X for things like an Apache web server. Of course, they may have fixed these issues by now.
 

MarkCollette

macrumors 68000
Mar 6, 2003
1,559
36
Toronto, Canada
Can you use Shark to differentiate between the time to:
- Open each file
- Read the first byte in each file
- Subsequent reads on each file

That would let us know if the problem is seeking between files.

If it is some lack of proper operating system hard drive read queuing / read coalescing, then maybe we can reorganise the code a little. You could have one thread for opening and reading files, and another thread for processing the read-in data. As soon as one file has been read, the reader thread moves on to the next. It will mostly be blocked waiting for I/O, but that's alright, because your processing thread will be busy crunching the previously read data.

If that appears to help, then you can make the reader thread run constantly, putting its data on a queue that the processing thread grabs work off of at its own rate. You'll have to limit the queue size so that the reading thread doesn't fill RAM with unprocessed data, leaving no room to actually do the processing.
 

ChrisA

macrumors G5
Jan 5, 2006
12,914
2,164
Redondo Beach, California
If you are coming from a Linux background moving into OS X, you are going to bump into a hell of alot of bottlenecks than you will ever expect.

OS X needs to make sure sure that when you read a file, it is actually sending that data correctly, so the file reading code, waits until all IO is finished before returning. In Linux, since it is not an industrial strength OS, the OS call returns immediately spawning a seperate thread to read the file, it couldn't give a rats arse if the data was read correctly. This is where Linux is trouncing OS X.

A thread to read the file??? Are you sure? Show me the code. Where is the thread created?

You may be thinking about writing. Writing is different: if you don't fsync() after a write, you can't be sure the data is really on the disk, and yes, in Linux there is a process (not a thread) that flushes data back to the disk.
 

ChrisA

macrumors G5
Jan 5, 2006
12,914
2,164
Redondo Beach, California
Can you use Shark to differentiate between the time to:
- Open each file
- Read the first byte in each file
- Subsequent reads on each file.

The way I do this is to comment out sections of my code. When I comment out a section and find that my execution time is reduced by 90%, I know where the problem is. Then I try to see if commenting out even fewer lines has the same effect. Of course there are more sophisticated tools, but a few #if 0 ... #endif pairs are very quick to toss in.
 

MarkCollette

macrumors 68000
Mar 6, 2003
1,559
36
Toronto, Canada
The way I do this is to comment out sections of my code. When I comment out a section and find that my execution time is reduced by 90%, I know where the problem is. Then I try to see if commenting out even fewer lines has the same effect. Of course there are more sophisticated tools, but a few #if 0 ... #endif pairs are very quick to toss in.

Yes, that's a pretty straightforward approach when the steps are not dependent on each other, or when you can hard-code the outputs of a function.

The problem is that my example (opening, reading the first byte, and reading the rest) is meant to isolate initial seeks from straight read times, and there isn't really a way to measure the read without first doing the initial seek.

But the reason I've found for using profilers is that they can show you performance bottlenecks that you never anticipated. I remember one time, programming in Java, it took a really long time to sort the rows of a table. It turned out that one specific object type was really slow at being compared, and I just had to wrap it in something that could cache an intermediate calculation. I would never have thought that could have been it. Another time, parsing a file seemed fast enough, but I still profiled it. It turned out that, in a very specific set of circumstances, some code that reads byte by byte was reading directly from a file. All I had to do was change one line of code to use buffered I/O, and parsing the whole file became an order of magnitude faster. It was buried so deeply in an area that I thought was already as fast as possible. Really, once your programs hit 100+ KB of source code, profilers are mandatory.
 

nino65

macrumors newbie
Original poster
Dec 12, 2007
8
0
thank you all for this interesting discussion.

I must say that I'm not a professional programmer; I'm a researcher (a physicist) who is used to writing his own programs to simulate physical/biological/chemical processes. So, even though I try to write efficient and smart code as much as I can, I am not at a level to know every single detail of the C language or of system calls and so on. I learned how to use Shark just a couple of days ago and don't know how to collect the information MarkCollette suggested.

As I am a physicist, I made an experiment that I think could give some hints about the central point.
I ran a first instance of the program and, when it was about halfway through, I compiled and ran another instance (with other parameter values). Well, the second instance reached in a few seconds the point at which the first one was, and then they proceeded together (slowly) to the end.

I think this means that it all depends on the cache of the HD, or how the OS is managing it, or something like that; I don't have enough knowledge in that area.

We do not have the definitive answer, but I think we are very close.

thank you again
 

toddburch

macrumors 6502a
Dec 4, 2006
748
0
Katy, Texas
I'm writing some C++ now to process large files. I've found that when the input files are on a USB flash drive, performance is REALLY bad. When the file is on the HD, it is MUCH faster. The second time I run against the HD, it is VERY, VERY fast. This tells me OS X caching is to be leveraged whenever possible.

Todd
 

nino65

macrumors newbie
Original poster
Dec 12, 2007
8
0
yes, I forgot to say that the Linux machine was a desktop (I don't remember if it was a Dell or an HP)
 

iSee

macrumors 68040
Oct 25, 2004
3,540
272
OS X does generally cache files that have been read. If you look at the Memory tab of Activity Monitor, the blue part of the pie chart shows the memory currently being used for this.

If the OP has enough memory and will use the same file sets multiple times, the current program might be good enough--the first time it is run it will be slow, but subsequent passes involving the same files will be much faster.

nino65, you shouldn't need to overlap execution as you did in your test (assuming you have enough RAM that all the files get cached); OS X will leave files in its cache until it runs out of space (it will yield its space to apps that allocate memory and, of course, to new file accesses).

If that doesn't work, then you might want to try hhlee's suggestion of combining the source files into one large one if it doesn't complicate the code too much.

Also, I'm not sure if you are writing to files, but you may want to try keeping the results in memory until processing is finished and then writing them out all at once (hopefully to one file and not many).

It would be interesting to hear what approach is successful...
 

nino65

macrumors newbie
Original poster
Dec 12, 2007
8
0
If you look at the Memory tab of the system profiler, the blue part of the pie chart shows the memory currently being used for this.
iSee, maybe I am too new to Mac OS, but I don't see any pie chart in the System Profiler. Mine looks like this:
 

Attachments

  • Picture 1.png (40.2 KB)