macOS Anyone got any recommendations?

lee1210 · Jul 8, 2008

Cromulent said:
Bah, the program works perfectly when compiling on the command line. Yet another reason to hate IDEs, which just get in your way...

I guess I'm going to have to have a closer look at Xcode.

Did you try sticking the python script in a well-defined place, and putting:

Code:

sys.path.append('/users/Cromulent/scripts');

in the code? I'm sure Xcode uses some subdirectory under the project directory as the working directory when your program runs, so that would be more difficult to find and place the python script in.

-Lee

Cromulent · Jul 8, 2008

Okay, that seems to have fixed it. If you don't change the path, the Python source needs to be in the same directory as the executable, which would be rather inconvenient, but there you go.
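For reference, a minimal sketch of the path fix suggested above. The module name demo_logic is made up for illustration, and a temporary directory stands in for a fixed location such as a scripts folder, so the snippet runs anywhere:

```python
import os
import sys
import tempfile

# Stand-in for a fixed, well-known script directory (hypothetical location).
script_dir = tempfile.mkdtemp()

# Write a tiny module there; "demo_logic" is an illustrative name only.
with open(os.path.join(script_dir, "demo_logic.py"), "w") as f:
    f.write("def greet():\n    return 'Hello from Python!'\n")

# Make the directory importable regardless of the current working directory.
if script_dir not in sys.path:
    sys.path.append(script_dir)

import demo_logic

message = demo_logic.greet()
print(message)
```

From an embedding C program, the same append can be run through PyRun_SimpleString before the module is imported, so the script no longer has to sit next to the executable.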

Cromulent · Aug 14, 2008

Just gone back to this after a few weeks putting it off. The Python part works fine and I can return the data to C and convert it to the relevant C types. The problem, really, is down to design. How do you guys decide the best way to pass arguments around functions? If I carry on down the route I'm going 90% of my functions will require 4+ arguments which is pretty hard to remember when you have a fair amount.

How would you handle a situation where you have 6 arrays each with 8000+ elements to them? Obviously put them in a struct but would you make each of the items an array or make an array of structs? Is there much of a difference in terms of performance or memory use? All elements are the same size so that is not an issue (in terms of array length).
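One way to get a feel for the memory side of that question is to model both layouts with Python's ctypes module, which follows the platform's C layout rules. This is only a sketch: the field types (an 11-byte date string, four doubles, one int) and the names Record and Columns are assumptions, not the real finData definition.

```python
import ctypes

N = 8000       # element count taken from the question; only an example
DATE_LEN = 11  # assumed: "YYYY-MM-DD" plus a terminating NUL


class Record(ctypes.Structure):
    """One row per struct (the array-of-structs layout)."""
    _fields_ = [
        ("date", ctypes.c_char * DATE_LEN),
        ("opening", ctypes.c_double),
        ("high", ctypes.c_double),
        ("low", ctypes.c_double),
        ("closing", ctypes.c_double),
        ("volume", ctypes.c_int),
    ]


class Columns(ctypes.Structure):
    """One struct holding six parallel arrays (the struct-of-arrays layout)."""
    _fields_ = [
        ("date", (ctypes.c_char * DATE_LEN) * N),
        ("opening", ctypes.c_double * N),
        ("high", ctypes.c_double * N),
        ("low", ctypes.c_double * N),
        ("closing", ctypes.c_double * N),
        ("volume", ctypes.c_int * N),
    ]


# Raw payload for all N records, ignoring any alignment padding.
payload = N * (DATE_LEN
               + 4 * ctypes.sizeof(ctypes.c_double)
               + ctypes.sizeof(ctypes.c_int))

aos_bytes = ctypes.sizeof(Record * N)  # array of N structs
soa_bytes = ctypes.sizeof(Columns)     # one struct of six arrays

print("payload:          %d bytes" % payload)
print("array of structs: %d bytes" % aos_bytes)
print("struct of arrays: %d bytes" % soa_bytes)
```

The per-record struct usually picks up some alignment padding, so the array of structs tends to come out a little larger, but the difference is modest at this size. Speed depends mostly on the access pattern: code that walks one column at a time generally suits the struct of arrays, while code that touches every field of a record suits the array of structs.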

lee1210 · Aug 14, 2008

Cromulent said:
Just gone back to this after a few weeks putting it off. The Python part works fine and I can return the data to C and convert it to the relevant C types. The problem, really, is down to design. How do you guys decide the best way to pass arguments around functions? If I carry on down the route I'm going 90% of my functions will require 4+ arguments which is pretty hard to remember when you have a fair amount.

How would you handle a situation where you have 6 arrays each with 8000+ elements to them? Obviously put them in a struct but would you make each of the items an array or make an array of structs? Is there much of a difference in terms of performance or memory use? All elements are the same size so that is not an issue (in terms of array length).

It might seem "nicer" to use the bridge in this manner, but is this really superior to passing in a filename which points to a file with the data to process?

If you did want to send it all over the bridge, how slow this is will depend on the speed of the embedded python code. I'm guessing it's pretty quick, but have no idea. I would think that you would want a struct in C that holds all of your arrays, that you pass to a function that breaks them out into 6 separate PyListObjects of the appropriate type (say, PyIntObjects). You can then compose that into one big list and stick all 6 of the PyListObjects you generated in that, and have that be your one parameter to your python function. This will take a lot of C to do what's a few lines worth of Python, but such is the life of a programmer doing embedding.

-Lee

Cromulent · Aug 14, 2008

lee1210 said:
It might seem "nicer" to use the bridge in this manner, but is this really superior to passing in a filename which points to a file with the data to process?

If you did want to send it all over the bridge, how slow this is will depend on the speed of the embedded python code. I'm guessing it's pretty quick, but have no idea. I would think that you would want a struct in C that holds all of your arrays, that you pass to a function that breaks them out into 6 separate PyListObjects of the appropriate type (say, PyIntObjects). You can then compose that into one big list and stick all 6 of the PyListObjects you generated in that, and have that be your one parameter to your python function. This will take a lot of C to do what's a few lines worth of Python, but such is the life of a programmer doing embedding.

-Lee

Thanks for that. Basically my Python code reads the file, strips out the commas and splits it into 6 lists (one for each heading). I then return the 6 lists in a tuple, extract each list from the tuple and convert it into an array of either char (for the date), double or int depending on the type in question. Doing it this way basically means that the brunt of the code is C, and Python just does the stuff which is a royal pain in the arse in C.

I'm just trying to work out a nice solution to these arrays that does not require me to a) make them global or b) pass structs around by value. Plus the program does not know how big the list will be until Python returns the tuple so additionally they need to be C99 style dynamic arrays.

Design is not my strong point.

lee1210 · Aug 14, 2008

Cromulent said:
Thanks for that. Basically my Python code reads the file, strips out the commas and splits it into 6 lists (one for each heading). I then return the 6 lists in a tuple, extract each list from the tuple and convert it into an array of either char (for the date), double or int depending on the type in question. Doing it this way basically means that the brunt of the code is C, and Python just does the stuff which is a royal pain in the arse in C.

I'm just trying to work out a nice solution to these arrays that does not require me to a) make them global or b) pass structs around by value. Plus the program does not know how big the list will be until Python returns the tuple so additionally they need to be C99 style dynamic arrays.

Design is not my strong point.

I'm really not familiar with python, but I was playing with this using lists. You should be able to, on return from the python function, use PyList_Size to get the length, then allocate that many elements times sizeof(type). I'm not familiar with C99 dynamic arrays, either, so I'm not sure how that changes things, but probably not much.

I'm playing with this now to see how long it takes to set up all of the python objects from 51,000 ints.

-Lee

Cromulent · Aug 14, 2008

Here's my Python code:

Code:

def fileInput():

    # Lists for data read from file
    
    data = []
    date = []
    opening = []
    high = []
    low = []
    closing = []
    volume = []
    
    s = raw_input("Please enter the filename to process (enter full path if not in current directory): ")

    fd = open(s, "r")
    
    fd.readline()    # Throw away
    
    for record in fd:
        record = record.strip()
        items = record.split(',')
        
        for i in range(1, 5):
            items[i] = float(items[i])
        
        items[5] = int(items[5])

        # Append each field to its column exactly once per record
        date.append(items[0])
        opening.append(items[1])
        high.append(items[2])
        low.append(items[3])
        closing.append(items[4])
        volume.append(items[5])
    
    fd.close()
    
    return (date, opening, high, low, closing, volume)

and here's the part of the C program which deals with the return:

Code:

    retTuple = setupPyScriptForUse();
    if(retTuple == NULL)
    {
        printf("Failed to open and examine file.\n");
        Py_Finalize();
        return EXIT_FAILURE;
    }
    
    listItem = PyTuple_GetItem(retTuple, 1);
    if(listItem == NULL)
    {
        printf("Failed to extract item from Tuple.\n");
        return EXIT_FAILURE;
    }
    
    listSize = PyList_Size(listItem);
    finDataPtr = parsePyInputOpen(listItem, listSize);

and the parsePyInputOpen function:

Code:

finData * parsePyInputOpen(PyObject *returnValue, Py_ssize_t nitems)
{
    PyObject *processed = NULL;
    Py_ssize_t i = 0;
    finData finDataStruct[nitems];
    finData *finDataPtr = &finDataStruct[0];
    
    printf("Size of the list: %i\n", (int)nitems);
    
    for(i = 0; i < nitems; i++)
    {
        processed = PyList_GetItem(returnValue, i);
        finDataStruct[i].finOpen = PyFloat_AsDouble(processed);
    }
    
    return finDataPtr;
}

I'm just not sure what the best way to handle it is, design-wise. I could probably keep it in the pseudo Python / C object style but I'd rather have the data as raw doubles etc. for later on.

lee1210 · Aug 14, 2008

I played with this and this was what I came up with:

logic.py (just used the same one as I did originally when looking at this):

Code:

def HW(list):
  print 'Hello from Python!'
  print len(list);
  print len(list[0]);
  list[0][4:10] = list[3][1:7];
  list[0].extend(list[1]);
  list[0].extend(list[2][1:8024]);
  return list;

testpy.c:

Code:

#include <stdio.h>
#include <stdlib.h>
#include "Python.h"

int main(int argc, char **argv) {
  int cListA[8500];
  int cListB[8500];
  int cListC[8500];
  int cListD[8500];
  int cListE[8500];
  int cListF[8500];
  int *cResult = NULL;
  int randInt = -1;
  int loopControl = 0;
  int sz;
  PyObject *module, *dict, *func, *value, *pArgs;
  PyObject *listA, *listB, *listC, *listD, *listE, *listF, *componentList; //PyList_New returns PyObject *
  Py_Initialize(); //Set up the interpreter
  PyRun_SimpleString("import sys\n"); //This line and the next
  PyRun_SimpleString("sys.path.append('.')\n"); //are only b/c of my system

  module = PyImport_ImportModule("logic"); //Import the module in logic.py
  dict = PyModule_GetDict(module); //Borrowed reference to the module's namespace
  func = PyDict_GetItemString(dict, "HW"); //Get a reference to the function
  for(loopControl = 0; loopControl < 8500; loopControl++) { //C data setup
    cListA[loopControl] = rand();
    cListB[loopControl] = rand();
    cListC[loopControl] = rand();
    cListD[loopControl] = rand();
    cListE[loopControl] = rand();
    cListF[loopControl] = rand();
  }
  listA = PyList_New(8500);
  listB = PyList_New(8500);
  listC = PyList_New(8500);
  listD = PyList_New(8500);
  listE = PyList_New(8500);
  listF = PyList_New(8500);
  for(loopControl = 0; loopControl < 8500; loopControl++) {
    PyList_SetItem(listA,loopControl,PyInt_FromLong((long)cListA[loopControl]));
    PyList_SetItem(listB,loopControl,PyInt_FromLong((long)cListB[loopControl]));
    PyList_SetItem(listC,loopControl,PyInt_FromLong((long)cListC[loopControl]));
    PyList_SetItem(listD,loopControl,PyInt_FromLong((long)cListD[loopControl]));
    PyList_SetItem(listE,loopControl,PyInt_FromLong((long)cListE[loopControl]));
    PyList_SetItem(listF,loopControl,PyInt_FromLong((long)cListF[loopControl]));
  }
  componentList = PyList_New(6);
  PyList_SetItem(componentList,0,listA);
  PyList_SetItem(componentList,1,listB);
  PyList_SetItem(componentList,2,listC);
  PyList_SetItem(componentList,3,listD);
  PyList_SetItem(componentList,4,listE);
  PyList_SetItem(componentList,5,listF);

  pArgs = PyTuple_New(1);
  PyTuple_SetItem(pArgs,0,componentList);
  value = PyObject_CallObject(func,pArgs); //Call HW

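  Py_DECREF(module);
  //dict and func are borrowed references, so they must not be DECREF'd.
  //PyList_SetItem and PyTuple_SetItem steal references: the six lists now
  //belong to componentList, and componentList belongs to pArgs, so
  //dropping pArgs gives up our only references to all of them.
  Py_DECREF(pArgs);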

  sz = PyList_Size(PyList_GetItem(value,0));
  cResult = malloc(sz*sizeof(int));
  for(loopControl = 0; loopControl < sz; loopControl++) {
    cResult[loopControl] = (int) PyInt_AsLong(PyList_GetItem(PyList_GetItem(value,0),loopControl));
    if(loopControl % 1000 == 0) printf("Value: %d\n",cResult[loopControl]);
  }
  free((void *)cResult);
  Py_DECREF(value);

  Py_Finalize();
  return 0;
}

This doesn't do much more for you than what you're already doing... Personally, I would be more comfortable dynamically allocating memory on the heap in C than returning a pointer to a variable-length array on a function's stack, since that memory is no longer valid once the function returns. You do have to remember to free it, but you can call the function multiple times without messing up your data if you do it that way.

The way I would handle it would just be to pass the data, in whatever encapsulated form you use it elsewhere in your C program, into a function that will then tear it apart and stick it into python objects. Call the python function, get the result, and unpack it from the python data structures into whatever you need back in C. I was concerned about the time it would take to generate the python objects, etc. but it seemed pretty negligible for the large number of ints I was working with. I doubt doubles will make this much worse.

If it makes you more comfortable your python interface function can call some other helper functions for performing specific tasks, like unboxing things from python.

-Lee

lee1210 · Aug 14, 2008

One more thought, for efficiency considerations:
Does each "record" get processed by the python routine individually? That's to say, are you acting 6000 times on 6000 items? Or do the results depend on interaction between the records?

If you are just processing each record independently, it would be easy to loop over each record, encapsulate it in python types, call a python function that processes the single record, return the result, and break the result back into C types. This way you can make a single loop. You'll have to remember to decref the python stuff before you assign a new object to them so you don't leak memory.
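The Python side of that per-record approach could look roughly like this sketch. The function name parse_record and the sample line are made up for illustration; the field order follows the file layout described earlier in the thread (date, open, high, low, close, volume).

```python
def parse_record(line):
    """Convert one comma-separated record into typed Python values."""
    items = line.strip().split(',')
    date = items[0]
    # The four price columns become floats, the volume column an int.
    opening, high, low, closing = [float(x) for x in items[1:5]]
    volume = int(items[5])
    return (date, opening, high, low, closing, volume)


# Illustrative input only; real data would come from the file, one line at a time.
record = parse_record("2008-08-14,10.5,11.0,10.25,10.75,120000\n")
print(record)
```

The C caller could then invoke this once per line and pull the six values out of the returned tuple with PyTuple_GetItem, so no 8000-element lists have to cross the bridge in one go.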

It doesn't seem like this will work for you, exactly, but I thought I'd mention it.

-Lee

Cromulent · Aug 15, 2008

lee1210 said:
One more thought, for efficiency considerations:
Does each "record" get processed by the python routine individually? That's to say, are you acting 6000 times on 6000 items? Or do the results depend on interaction between the records?

Individually.

lee1210 said:
If you are just processing each record independently, it would be easy to loop over each record, encapsulate it in python types, call a python function that processes the single record, return the result, and break the result back into C types. This way you can make a single loop. You'll have to remember to decref the python stuff before you assign a new object to them so you don't leak memory.

Yeah, I completely forgot to decref the Python objects. Thanks for the reminder, I guess I'll have to look into how I'm going about this. At the moment the main function basically just calls the Python script from the start and then deals with the results, rather than setting up an adequate solution beforehand.

lee1210 said:
It doesn't seem like this will work for you, exactly, but I thought I'd mention it.

-Lee

Cheers for that. You've got me thinking about a few new possibilities.
