Become a MacRumors Supporter for $50/year with no ads, ability to filter front page stories, and private forums.

orangezorki

macrumors 6502a
Original poster
Aug 30, 2006
633
30
Hello all,

I'm trying to do something which may be impossible, but I wonder if any of you can help out?

My program will create many instances of a class that I am creating, but a key part of the class will be a binary series/array which can vary widely in size. It seems silly to allocate the worst case scenario memory (10k plus) to every one, when sometimes only a few bytes will be enough. I will know the size required at the creation of each instance.

Any ideas?
 
Hello all,

I'm trying to do something which may be impossible, but I wonder if any of you can help out?

My program will create many instances of a class that I am creating, but a key part of the class will be a binary series/array which can vary widely in size. It seems silly to allocate the worst case scenario memory (10k plus) to every one, when sometimes only a few bytes will be enough. I will know the size required at the creation of each instance.

Any ideas?

You can use NSMutableData.
 
Do you mean an NSArray (which is flexibly sized anyway), or do you mean an actual C-style array?

For a C-style array, call malloc() or calloc() to get an arbitrary block of RAM, which you then treat as a C array.

There are other possibilities, such as sparse arrays, tree structures, etc. It all depends on exactly what you're trying to accomplish. Post details.
 
Thanks for replying so soon.

It's a math project, where I will be creating multiple instances of binary strings, do some calculations on them, and from that create a new group of longer strings, etc. I need to be able to look at individual bits, which is why I thought that using a bool [] might be easier - I can't see a way of using NSMutableArray or something similar that lets me look at the individual bits.

BTW - I do have some previous C experience, but am still struggling to get my head around the objective part!

David
 
You didn't say what you're storing, if this is a 2d "jagged array" or will you just need different arrays of different sizes throughout your code? Do these need to be C arrays or NSArrays? Will they ever change size, or are they fixed at initialization-time?

-Lee

My reply was late:
If you need a huge bitset it's probably easiest to malloc space, probably at 8, 16, or 32 bit increments so you can use literals of that size for masking.
 
You didn't say what you're storing, if this is a 2d "jagged array" or will you just need different arrays of different sizes throughout your code? Do these need to be C arrays or NSArrays? Will they ever change size, or are they fixed at initialization-time?

-Lee

Sorry, shouldn't really have used the word array - it's just a one-dimensional string of 1s and 0s, each one won't change once created.
 
I will know the size required at the creation of each instance.

Any ideas?

If the size will remain the same, just add a pointer as an instance variable then allocate memory dynamically. Don't forget to add a call to free as part of the dealoc method though, or you will leak memory.
 
Thanks for replying so soon.

It's a math project, where I will be creating multiple instances of binary strings, do some calculations on them, and from that create a new group of longer strings, etc. I need to be able to look at individual bits, which is why I thought that using a bool [] might be easier - I can't see a way of using NSMutableArray or something similar that lets me look at the individual bits.

BTW - I do have some previous C experience, but am still struggling to get my head around the objective part!

David
NSMutableArray is for holding objects. More specifically, pointers to objects. So if you used an NSMutableArray to hold individual bits, you'd be storing a pointer in the NSMutableArray for each represented bit. Think about the efficiency of that.

As previously mentioned, you might use NSMutableData, which is essentially an object wrapper around a mutable sequence of bytes. You have to go through the wrapper to access the bytes; i.e. you have to use the NSMutableData methods to read or modify the bytes.

If you're just banging bits, it might be more sensible to use a malloc'ed or calloc'ed block of raw memory, and get direct access. NSMutableData's access restrictions (going through its methods) may be more hindrance than help. If you use malloc() or calloc(), however, you also have to take memory management into account.
 
Thank you all for the suggestions - very helpful.

Using malloc within the wrapper of a class, if I can do that, seems perfect to me as it allows the low level access to the data that I need, while keeping the possibly thousands of instances I need at the same time each in a neat wrapper along with a few labels (result, size, etc).

Thanks again,

David
 
I think the easiest approach would be to allocate space with malloc or calloc in an initializer, with enough space for N longs where N is:
Code:
int bitsPerLong = sizeof(unsigned long)*8;
N=(x+bitsPerLong-1)/(bitsPerLong);
unsigned long *mBackingArray = (unsigned long *)calloc(N,sizeof(unsigned long));

Then you need to have a testBitAtIndex:(int) that will calculate how to index the array and which bit within the long to use. You could do this all with char instead of long if 56 extra wasted bits makes a big difference.
Code:
BOOL testBitAtIndex:(unsigned int) idx {
  unsigned int arrayIndex = idx / bitsPerLong; // Could be a right shift by 6
  unsigned long maskValue = 1UL << (idx % 64);
  return (mBackingArray[arrayIndex] & maskValue) ? YES : NO;
}

This omits error checking and free'ing the calloc'd array when your object is deallocated. If you want this to be mutable rather than feeding in all values during initialize, etc. you could build a set/clear bit method too.

I'm sure techniques like this (that are written more elegantly and efficiently) are lying around core foundation and cocoa somewhere. Most recently I saw this in Java's JumboEnumSet, which uses longs to represent bit patterns based on enum ordinals.

-Lee
 
Last edited:
Do you mean an NSArray (which is flexibly sized anyway), or do you mean an actual C-style array?

For a C-style array, call malloc() or calloc() to get an arbitrary block of RAM, which you then treat as a C array.

There are other possibilities, such as sparse arrays, tree structures, etc. It all depends on exactly what you're trying to accomplish. Post details.

Is there any drawback to using a C-style array in Obective-C code? They should be fast given that it's just a pointer and a block of memory, but I've never seen them used in Objective C code.
 
Is there any drawback to using a C-style array in Obective-C code? They should be fast given that it's just a pointer and a block of memory, but I've never seen them used in Objective C code.

C gives you speed (and can be used to create more compact data structures) but you will lose various conveniences and code is likely to become longer/more complex:

  • Object C classes that take arrays as input generally expect something like NSArray, so to pass a C-array will require you to create a copy e.g. a NSArray that contains the same data - this results in extra code and complexity, increasing the risk for errors.
  • It's easier to mess up memory management as you will need to do malloc()/free() manually. The risk of screwing this up can be reduced if you warp the C-array in a class and call malloc/free in -init/-release (-dealloc for ARC) as needed.
  • C arrays are fixed size and don't know their own length (you will need to keep track of it). They are also fairly expensive to resize as realloc() may need to copy all the bytes.
  • Classes like NSArray and NSMutableArray have lots of nice methods to work with arrays. They also implement protocols like NSSecureCoding that can be used to serialize data to disk.
  • While it is possible to store retained pointers in a C-array this does come with certain limitations when using ARC.
  • You will also have to use the more error prone "for(i = 0; i < length; i++) {...}" construct rather than the simpler "for (x in array) {...}" one.
 
  • Object C classes that take arrays as input generally expect something like NSArray, so to pass a C-array will require you to create a copy e.g. a NSArray that contains the same data - this results in extra code and complexity, increasing the risk for errors.
  • It's easier to mess up memory management as you will need to do malloc()/free() manually. The risk of screwing this up can be reduced if you warp the C-array in a class and call malloc/free in -init/-release (-dealloc for ARC) as needed.

That sounds like it would defeat the point if that code portion isn't somewhat isolated. I mean if you have to duplicate it as an NSArray anyway, it's likely to kill any advantage granted by the low overhead of C code. I know what you mean about having to deal with malloc and free. You have to be much more careful with pointers, and it is annoying.


C gives you speed (and can be used to create more compact data structures) but you will lose various conveniences and code is likely to become longer/more complex:

  • C arrays are fixed size and don't know their own length (you will need to keep track of it). They are also fairly expensive to resize as realloc() may need to copy all the bytes.
  • Classes like NSArray and NSMutableArray have lots of nice methods to work with arrays. They also implement protocols like NSSecureCoding that can be used to serialize data to disk.
  • While it is possible to store retained pointers in a C-array this does come with certain limitations when using ARC.
  • You will also have to use the more error prone "for(i = 0; i < length; i++) {...}" construct rather than the simpler "for (x in array) {...}" one.

Yeah my impression was that they're merely faster due to the lack of features for performing things like bounds checking. I haven't dealt too much with Objective-C up to this point.
 
That sounds like it would defeat the point if that code portion isn't somewhat isolated. I mean if you have to duplicate it as an NSArray anyway, it's likely to kill any advantage granted by the low overhead of C code. I know what you mean about having to deal with malloc and free. You have to be much more careful with pointers, and it is annoying.
It could be viable if it is only needed occasionally for certain actions like doing a save.

Yeah my impression was that they're merely faster due to the lack of features for performing things like bounds checking. I haven't dealt too much with Objective-C up to this point.

There are a number of possible overheads in doing a lookup on a NSArray/NSMutableArray:
  • The cost of calling a method (finding the IMP of the method, passing parameters on the stack, ...) this is roughly comparable to the cost of doing a function call
  • The cost of finding the array element if the array is implemented as e.g. a tree or hash table (to allow for growth) - as several steps may need to be done through the data structure to find the desired element. The actually NSArray/NSMutableArray implementation uses a mix of data structures depending on the number of elements stored in it.
  • Bound checking.

This can be compared to a C-array access that amounts to only a few assembly instructions:
  • get array memory base address
  • add index * element size to this
  • retrieve data from resulting address
 
I my humble opinion ...

there are several factors coming into effect here. Sometimes using object orientation and standard libraries can be seen as a tradeoff between performance and development effort: simple development giving less performance. But things are seldom that simple.

One experience is that by lifting you head from the nitty gritty details of the implementation allows you to focus on getting the algorithms right. Often a "smart" implementation can give a factor of 3 in speed, but a "smart" algorithm could give you a 100-fold improvement. And if you add in that computers get faster and faster with a lot of memory, things sort of change over time.

One other experience is that low-level code is more difficult to debug. There are reasons for the design decisions in a a high-level system: things like always doing bounds checks, allocating things of any size dynamically, and so on. These makes the chance that the program is right from the beginning higher.

One extra reason for programming safe instead of fast from the beginning is that it gives you a reference implementation to compare your fast algorithms against. Ideally they should give the same answer, and they can be checked against each other.

So, to make it to the bottom line, my experience is that it is a good idea to start with writing clear, easily understood code using the OO libraries in your selected program languages. Once you know for certain that it gives the right answers you can start optimizing speed or memory footprint, but only if necessary. Oddly enough, the experience says, the slow parts are not always where we expect them to be.

// Gunnar
 
It could be viable if it is only needed occasionally for certain actions like doing a save.



There are a number of possible overheads in doing a lookup on a NSArray/NSMutableArray:
  • The cost of calling a method (finding the IMP of the method, passing parameters on the stack, ...) this is roughly comparable to the cost of doing a function call
  • The cost of finding the array element if the array is implemented as e.g. a tree or hash table (to allow for growth) - as several steps may need to be done through the data structure to find the desired element. The actually NSArray/NSMutableArray implementation uses a mix of data structures depending on the number of elements stored in it.
  • Bound checking.

This can be compared to a C-array access that amounts to only a few assembly instructions:
  • get array memory base address
  • add index * element size to this
  • retrieve data from resulting address

For what it's worth, the typical implementation of NSMutableArray for *nearly* all arrays on modern systems is an array-deque. This provides:

O(1) random access
O(1) sequential access
amortized O(1) prepend
amortized O(1) append
O(1) remove at head
O(1) remove at tail
O(n) insert
O(n) delete
O(n) find unless sorted
O(n log n) sort
Good locality of reference
Zero per-element memory overhead
Low total memory overhead (isa + count + capacity + spare capacity due to not growing one element at a time)
Constant runtime overhead of one modulus operation per lookup vs a plain C array backing store

It's a beautiful data structure.
 
Thank you again - everyone - for all the suggestions on here. I have been spending quite a bit of my free time thinking about my problem, and I am starting to realise that it is using such small bits of information (literally bits!) that it's not really what current OOP is designed to do. I'm starting to think about ditching the objective C for most of the project.

I have to ask - if any of you do basic maths problems in your programming, do you find it worth it using Objective C, or do you just use C for all the non-UI bits?

Now, this is in part because I have a problem I just can't get my head around. I tried to send a message to an object that I had created in the main function from within another function. It didn't know what it was, so I thought that I would just declare a pointer in my global variables. But then, I get an error saying something about it not being supported in ARC. Any suggestions?

Thanks,

David
 
Thank you again - everyone - for all the suggestions on here. I have been spending quite a bit of my free time thinking about my problem, and I am starting to realise that it is using such small bits of information (literally bits!) that it's not really what current OOP is designed to do. I'm starting to think about ditching the objective C for most of the project.

I have to ask - if any of you do basic maths problems in your programming, do you find it worth it using Objective C, or do you just use C for all the non-UI bits?

It depends on the math.

Now, this is in part because I have a problem I just can't get my head around. I tried to send a message to an object that I had created in the main function from within another function. It didn't know what it was, so I thought that I would just declare a pointer in my global variables. But then, I get an error saying something about it not being supported in ARC. Any suggestions?
Post your code. Even if it doesn't work.

You've posted a description, and not even a detailed one. Descriptions alone are nearly impossible to debug.

When you post code, also do two things:
1. Describe what you expected to happen.
2. Describe what actually happened.

If what actually happened is it didn't compile, then post the exact text of the compiler error message. Don't paraphrase it; post it exactly.
 
Thank you again - everyone - for all the suggestions on here. I have been spending quite a bit of my free time thinking about my problem, and I am starting to realise that it is using such small bits of information (literally bits!) that it's not really what current OOP is designed to do. I'm starting to think about ditching the objective C for most of the project.

I have to ask - if any of you do basic maths problems in your programming, do you find it worth it using Objective C, or do you just use C for all the non-UI bits?

It isn't really related to OOP. Have you considered to look for a bit set class on the internet, or is the purpose of this to create it yourself as an exercise? Here's one I just found: https://github.com/MergeConflictCo/Bitset you could perhaps look at it for inspiration at least.

(I have not tested this)

C isn't really any more suited to do math, you have to take care with data types, over flow, casts and the like.
 
Register on MacRumors! This sidebar will go away, and you'll see fewer ads.