Answer to your question:
iMac 2.4 GHz, Aluminium, 21", EMC No: 2133, Mac OS X (10.6.4), Xcode 5.0.2
So latest software versions and old mac.
1) Your software versions are not the latest ones: 10.6.8 is the latest of the 10.6 (Snow Leopard) releases.
2) You didn't identify what CPU architecture your code was compiled and run for. In my test case below, I compiled and ran both 32-bit (i386) and 64-bit (x86_64) versions.
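If it's unclear which architecture a given run used, you can make each executable identify itself. Here's a minimal sketch (my suggestion, not part of the test case below) that prints the architecture it was compiled for, using the standard gcc/clang predefined macros:
Code:
#include <stdio.h>

/* Print which CPU architecture this binary was compiled for,
   so collected output files are self-identifying. */
int main( void )
{
#if defined(__x86_64__)
    const char *arch = "x86_64";
#elif defined(__i386__)
    const char *arch = "i386";
#elif defined(__ppc__)
    const char *arch = "ppc";
#else
    const char *arch = "unknown";
#endif
    printf( "arch: %s (pointer size: %d bits)\n", arch, (int)(sizeof(void *) * 8) );
    return 0;
}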
I suggest that you update to 10.6.8, for reasons which should become apparent below.
First, I suggest making a well-isolated and more informative (verbose) test case. It can be C or C++, but it should be well-isolated so others can compile and run it without needing to add code, and it should be more verbose so its output can be collected into a file and diff'ed.
Here's my test case, in C:
File: vecTest.c:
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <assert.h>

#include <vecLib/vecLib.h>

typedef vU512 MYVEC_t;
#define MYVEC_name          "vU512"
#define MYVEC_HalfMultiply  vU512HalfMultiply
#define MYVEC_Divide        vU512Divide

int main( int argc, const char * argv[] )
{
    MYVEC_t A, B, C, Q, R;

    printf( " vec type: %s\n", MYVEC_name );

    memset( &A, 0, sizeof(A) );
    memset( &B, 0, sizeof(B) );
    memset( &Q, 0, sizeof(Q) );
    memset( &R, 0, sizeof(R) );

    A.s.LSW = 1;
    B.s.LSW = 3;

    for ( int64_t k = 1; k < 129; k++ )
    {
        fprintf( stderr, "k:%02lld ", (long long)k );

        MYVEC_HalfMultiply( &A, &B, &C );  // C = A * B (low half of product)
        fprintf( stderr, "A:%u B:%u C:%u ", A.s.LSW, B.s.LSW, C.s.LSW );
        A = C;

        Q.s.LSW = k;  R.s.LSW = k;  // sentinels: distinguish "zeroed" from "unchanged"
        MYVEC_Divide( &C, &B, &Q, &R );  // Q = C / B, R = C mod B
        fprintf( stderr, " Q:%u R:%u \n", Q.s.LSW, R.s.LSW );
    //  assert(R.s.LSW == 0);
    }
    return 0;
}
My reasons for using the MYVEC_ family of typedefs and defines will become apparent in the results presented below.
I compiled and ran both 32-bit and 64-bit versions on a 10.4 Tiger machine, which is the one with Xcode and the gcc tools on it; my other working machines don't have them. I also have a currently dead 10.6.8 machine with a Core 2 Duo CPU (64-bit capable), but I haven't needed it for my main work since it died, so I've delayed repairing it.
When I run both versions on 10.4.11, I get identical output. So both architectures are consistent.
However, inspecting the output, it's apparent that both architectures are also wrong. Here's a section of the 10.4 output:
Code:
k:01 A:1 B:3 C:3 Q:1 R:1
k:02 A:3 B:3 C:9 Q:2 R:2
k:03 A:9 B:3 C:27 Q:3 R:3
k:04 A:27 B:3 C:81 Q:4 R:4
k:05 A:81 B:3 C:243 Q:5 R:5
k:06 A:243 B:3 C:729 Q:6 R:6
k:07 A:729 B:3 C:2187 Q:7 R:7
k:08 A:2187 B:3 C:6561 Q:8 R:8
k:09 A:6561 B:3 C:19683 Q:9 R:9
k:10 A:19683 B:3 C:59049 Q:10 R:10
k:11 A:59049 B:3 C:177147 Q:11 R:11
That's right, the quotient (Q) and remainder (R) values are not being returned in the output variables.
Furthermore, when I change the types to vU256, vU1024, or any other referential type (i.e. not vU128), the results are the same: the division results are not stored into the output variables.
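This is also why I used the MYVEC_ family of defines: switching the tested width touches only four lines. For example, to retest with vU256, using the names as I recall them from vecLib's vBigNum.h:
Code:
typedef vU256 MYVEC_t;
#define MYVEC_name          "vU256"
#define MYVEC_HalfMultiply  vU256HalfMultiply
#define MYVEC_Divide        vU256Divide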
The assert() must be disabled here, otherwise the test won't run to completion. The Q & R variables must also be initialized, otherwise they contain garbage values. Finally, I chose to store an iteration-specific value (k) into the parts of Q & R that appear in the output, to distinguish between two possible failure modes: "zeroes the output" vs. "doesn't change the output".
Next, I copied the executables to my working 10.6.8 (Snow Leopard) machine and collected the outputs (./v32 &>out32.txt). Unfortunately, that machine only has a Core Duo CPU, which is not 64-bit capable, so only the 32-bit version could run. Here's a sample of its output:
Code:
k:01 A:1 B:3 C:3 Q:1 R:0
k:02 A:3 B:3 C:9 Q:3 R:0
k:03 A:9 B:3 C:27 Q:9 R:0
k:04 A:27 B:3 C:81 Q:27 R:0
k:05 A:81 B:3 C:243 Q:81 R:0
k:06 A:243 B:3 C:729 Q:243 R:0
k:07 A:729 B:3 C:2187 Q:729 R:0
k:08 A:2187 B:3 C:6561 Q:2187 R:0
k:09 A:6561 B:3 C:19683 Q:6561 R:0
k:10 A:19683 B:3 C:59049 Q:19683 R:0
k:11 A:59049 B:3 C:177147 Q:59049 R:0
k:12 A:177147 B:3 C:531441 Q:177147 R:0
k:13 A:531441 B:3 C:1594323 Q:531441 R:0
k:14 A:1594323 B:3 C:4782969 Q:1594323 R:0
k:15 A:4782969 B:3 C:14348907 Q:4782969 R:0
k:16 A:14348907 B:3 C:43046721 Q:14348907 R:0
k:17 A:43046721 B:3 C:129140163 Q:43046721 R:0
k:18 A:129140163 B:3 C:387420489 Q:129140163 R:0
k:19 A:387420489 B:3 C:1162261467 Q:387420489 R:0
k:20 A:1162261467 B:3 C:3486784401 Q:1162261467 R:0
k:21 A:3486784401 B:3 C:1870418611 Q:3486784401 R:0
k:22 A:1870418611 B:3 C:1316288537 Q:1870418611 R:0
k:23 A:1316288537 B:3 C:3948865611 Q:1316288537 R:0
k:24 A:3948865611 B:3 C:3256662241 Q:3948865611 R:0
k:25 A:3256662241 B:3 C:1180052131 Q:3256662241 R:0
k:26 A:1180052131 B:3 C:3540156393 Q:1180052131 R:0
k:27 A:3540156393 B:3 C:2030534587 Q:3540156393 R:0
k:28 A:2030534587 B:3 C:1796636465 Q:2030534587 R:0
k:29 A:1796636465 B:3 C:1094942099 Q:1796636465 R:0
k:30 A:1094942099 B:3 C:3284826297 Q:1094942099 R:0
k:31 A:3284826297 B:3 C:1264544299 Q:3284826297 R:0
k:32 A:1264544299 B:3 C:3793632897 Q:1264544299 R:0
k:33 A:3793632897 B:3 C:2790964099 Q:3793632897 R:0
k:34 A:2790964099 B:3 C:4077925001 Q:2790964099 R:0
k:35 A:4077925001 B:3 C:3643840411 Q:4077925001 R:0
...
k:120 A:989468363 B:3 C:2968405089 Q:989468363 R:0
k:121 A:2968405089 B:3 C:315280675 Q:2968405089 R:0
k:122 A:315280675 B:3 C:945842025 Q:315280675 R:0
k:123 A:945842025 B:3 C:2837526075 Q:945842025 R:0
k:124 A:2837526075 B:3 C:4217610929 Q:2837526075 R:0
k:125 A:4217610929 B:3 C:4062898195 Q:4217610929 R:0
k:126 A:4062898195 B:3 C:3598759993 Q:4062898195 R:0
k:127 A:3598759993 B:3 C:2206345387 Q:3598759993 R:0
k:128 A:2206345387 B:3 C:2324068865 Q:2206345387 R:0
Here, the results of the division are definitely being calculated and stored into the output variables: Q is C/3, i.e. the previous A, and R is 0, exactly as expected.
The displayed values are the unsigned low 32-bits of the larger number, which can be cross-checked in vecLib by adding A to itself a total of 3 times, i.e. A+A+A = A*3. (Left as an exercise for the reader.)
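For anyone who'd rather not do the exercise, here's a minimal sketch of that cross-check. It assumes vU512Add from vecLib's vBigNum.h, which takes the same pointer-style arguments as the other functions:
Code:
#include <stdio.h>
#include <string.h>

#include <vecLib/vecLib.h>

/* Cross-check: A*3 via vU512HalfMultiply should equal A+A+A via vU512Add. */
int main( void )
{
    vU512 A, three, AA, AAA, prod;

    memset( &A, 0, sizeof(A) );
    memset( &three, 0, sizeof(three) );
    A.s.LSW = 59049;    /* any test value; 3^10 here */
    three.s.LSW = 3;

    vU512Add( &A, &A, &AA );                 /* AA  = 2*A */
    vU512Add( &AA, &A, &AAA );               /* AAA = 3*A */
    vU512HalfMultiply( &A, &three, &prod );  /* prod = A*3 (low half) */

    printf( "add:%u mul:%u %s\n", AAA.s.LSW, prod.s.LSW,
            memcmp( &AAA, &prod, sizeof(AAA) ) == 0 ? "MATCH" : "MISMATCH" );
    return 0;
}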
Results up to around 2^56 can also be checked using Calculator.app, because its floating-point operands carry 56 bits of precision; for this sequence, that covers every value up through 3^35 (about 5.0e16). These checks require the Programmer mode, under Calculator's View menu.
I also get the same 32-bit and 64-bit outputs when I run the executables on a 10.8.4 Mountain Lion machine. Whatever bug existed in 10.4.11 and had disappeared by 10.6.8 seems to remain at bay in 10.8.4.
The fact that I got R:0 in all cases under 10.6.8 suggests that 10.6.8 does not have whatever problem you're seeing.
Before updating to 10.6.8, I strongly recommend compiling and running both 32-bit and 64-bit versions of my test case above (with Apple's gcc, something like gcc -std=c99 -arch i386 vecTest.c -framework vecLib -o v32, and the same with -arch x86_64), and collecting the output. You can post the output as evidence for or against correct behavior. I won't hazard a guess at what output you'll see, mainly because of the surprising results I got when running under 10.4.11.
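If you do post output, it also helps to tag each file with the OS it ran on. Here's a minimal sketch (again, my suggestion) that prints the Darwin kernel version via sysctlbyname; for reference, Darwin 10.8.0 corresponds to Mac OS X 10.6.8:
Code:
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysctl.h>

/* Print the Darwin kernel version, so each collected output file
   records which OS it came from. */
int main( void )
{
    char release[64];
    size_t len = sizeof(release);

    if ( sysctlbyname( "kern.osrelease", release, &len, NULL, 0 ) == 0 )
        printf( "Darwin kernel: %s\n", release );
    return 0;
}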
The short answer to your question is: "Yes, vecLib has bugs", but that answer must be qualified by a "depending on OS version, arithmetic operation, and possibly other factors".
Finally, debugging is primarily a process of confirming expectations by gathering evidence. Your expectations were seemingly disconfirmed by the use of assert(). Your next step should always be to make a well-isolated test that gathers informative evidence.
You can then present both the test (code) and the evidence to support your position. It would have saved me some time if you'd posted a stand-alone compilable program that produced useful output. It might also have enlisted the help of more people, with more data points (OS versions, architectures).
EDIT
I just realized you wrote "Xcode 5.0.2", which tends to invalidate the "10.6.4" version of OS X: that version of Xcode is incapable of running on that OS version (Xcode 5 requires 10.8.4 or later).
Accuracy is important in programming.