Getting data in to C++ from excel, any help would be greatly appreciated - Page 4

Thread: Getting data in to C++ from excel, any help would be greatly appreciated

  1. #46
    Join Date
    Jan 2013
    Posts
    27

    Re: Getting data in to C++ from excel, any help would be greatly appreciated

    Getting all the info now (I stupidly had the CSV open when running the program), and it is still running at just under 6 mins. Although now that I have unchecked the 1st for loop, it will take around 180 mins to run completely.

  2. #47
    Join Date
    Dec 2012
    Location
    England
    Posts
    1,993

    Re: Getting data in to C++ from excel, any help would be greatly appreciated

    Pleased to be of help. Hope it all goes OK for you now. Welcome to the world of C++!

  3. #48
    Join Date
    May 2009
    Posts
    2,413

    Re: Getting data in to C++ from excel, any help would be greatly appreciated

    Quote Originally Posted by Surreall View Post
    I have been writing in Excel VBA for large number crunching, and the code is now taking quite a while to run. A friend of mine suggested I start writing in C++,
    Switching to C++ may not have the great impact on efficiency you expect. Proper choice of algorithms and data structures is often more important than language. I suggest you spend some time reviewing your current program from that perspective. It may very well be that after the considerable effort of learning C++ and making the port you'll have to do it anyway. Although C++ is a very efficient language, it's not a silver bullet for all efficiency problems.
    Last edited by nuzzle; February 2nd, 2013 at 01:24 AM.

  4. #49
    Join Date
    Jan 2013
    Posts
    27

    Re: Getting data in to C++ from excel, any help would be greatly appreciated

    Quote Originally Posted by nuzzle View Post
    Switching to C++ may not have the great impact on efficiency you expect. Proper choice of algorithms and data structures is often more important than language. I suggest you spend some time reviewing your current program from that perspective. It may very well be that after the considerable effort of learning C++ and making the port you'll have to do it anyway. Although C++ is a very efficient language, it's not a silver bullet for all efficiency problems.
    Fair point, although I am very glad I made the change. The VBA code takes over 2 weeks to run, and the C++ takes 3 hours. I would say that is a massive improvement.

  5. #50
    Join Date
    Jul 2005
    Location
    Netherlands
    Posts
    1,998

    Re: Getting data in to C++ from excel, any help would be greatly appreciated

    Quote Originally Posted by Surreall View Post
    Getting all the info now (I stupidly had the CSV open when running the program), and it is still running at just under 6 mins. Although now that I have unchecked the 1st for loop, it will take around 180 mins to run completely.
    If you can provide some sample data, others may be able to test and suggest performance enhancements. Are you sure that you have all optimizations enabled when building?

    With these kinds of data-intensive computations, cache locality is very important for performance. You need to lay out your data and your nested loops to minimize cache misses. E.g. if you look at
    Code:
                for (jRows = 0;jRows < rows - indicator2position - trade2position; jRows++)
                    {
                    if (dataarray[jRows][q]==2 || directionarray[jRows]==15 || directionarray[jRows]==45){}
                    else if (numberofcells == 4)
                        {
                            if (dataarray[jRows + indicator1position][c]==indicator1value && \
                            dataarray[jRows + indicator2position][c]==indicator2value && \
                            dataarray[jRows + trade1position][q]==trade1value && \
                            dataarray[jRows + trade2position][q]==trade2value)
    // ...
    You are accessing four memory locations that are relatively distant from each other. Because the inner loop walks down the rows of a fixed column, making 'dataarray' column-major may drastically improve cache performance. You could also consider storing the results of these comparisons in a std::vector<bool>, which is typically space-optimized to use only one bit per element. Finally, you can try rearranging your nested for-loops to get better performance.
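To make the layout suggestion concrete, here is a minimal sketch of a column-major table plus a precomputed comparison stored in a std::vector<bool>. The names `ColumnMajorTable` and `column_equals` are hypothetical and not from the original program, and the sketch assumes the cells are plain ints.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-major table: all rows of one column are adjacent in
// memory, so walking down a column touches consecutive addresses.
class ColumnMajorTable {
public:
    ColumnMajorTable(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0) {}

    int& at(std::size_t row, std::size_t col) {
        assert(row < rows_ && col < cols_);
        return cells_[col * rows_ + row];  // column index is the slow-moving one
    }
    int at(std::size_t row, std::size_t col) const {
        assert(row < rows_ && col < cols_);
        return cells_[col * rows_ + row];
    }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> cells_;
};

// Precompute "cell == value" for one column, one flag per row, so the hot
// loop can test a flag instead of repeating the comparison.
std::vector<bool> column_equals(const ColumnMajorTable& table,
                                std::size_t col, int value) {
    std::vector<bool> flags(table.rows());
    for (std::size_t row = 0; row < table.rows(); ++row) {
        flags[row] = (table.at(row, col) == value);
    }
    return flags;
}
```

Whether this actually helps depends on the access pattern of the real loops, so it is worth timing both layouts on real data before committing to one.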

    Of course, multi-threading is also an option to make use of multiple processors (cores).
    Cheers, D Drmmr

    Please put [code][/code] tags around your code to preserve indentation and make it more readable.

    As long as man ascribes to himself what is merely a possibility, he will not work for the attainment of it. - P. D. Ouspensky

  6. #51
    Join Date
    Jan 2013
    Posts
    27

    Re: Getting data in to C++ from excel, any help would be greatly appreciated

    Quote Originally Posted by D_Drmmr View Post
    If you can provide some sample data, others may be able to test and suggest performance enhancements. Are you sure that you have all optimizations enabled when building?

    With these kinds of data-intensive computations, cache locality is very important for performance. You need to lay out your data and your nested loops to minimize cache misses. E.g. if you look at
    Code:
                for (jRows = 0;jRows < rows - indicator2position - trade2position; jRows++)
                    {
                    if (dataarray[jRows][q]==2 || directionarray[jRows]==15 || directionarray[jRows]==45){}
                    else if (numberofcells == 4)
                        {
                            if (dataarray[jRows + indicator1position][c]==indicator1value && \
                            dataarray[jRows + indicator2position][c]==indicator2value && \
                            dataarray[jRows + trade1position][q]==trade1value && \
                            dataarray[jRows + trade2position][q]==trade2value)
    // ...
    You are accessing four memory locations that are relatively distant from each other. Because the inner loop walks down the rows of a fixed column, making 'dataarray' column-major may drastically improve cache performance. You could also consider storing the results of these comparisons in a std::vector<bool>, which is typically space-optimized to use only one bit per element. Finally, you can try rearranging your nested for-loops to get better performance.

    Of course, multi-threading is also an option to make use of multiple processors (cores).
    Wow, good info, thanks. I will look into that. About multithreading: how would I go about making use of multiple processors?

  7. #52
    Join Date
    Jul 2005
    Location
    Netherlands
    Posts
    1,998

    Re: Getting data in to C++ from excel, any help would be greatly appreciated

    Quote Originally Posted by Surreall View Post
    About multithreading: how would I go about making use of multiple processors?
    By creating multiple threads that can execute simultaneously. However, there is a lot more stuff involved. You need to make sure your threads don't read and write the same value (only reading is ok; google for "data race") and you need to prevent "false sharing". I would first optimize your inner loop(s). Then, it's probably easy to run each iteration of the outermost loop(s) in a separate thread or on a thread pool.
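As a rough illustration of running the iterations of an outer loop on separate threads, here is a minimal sketch using std::thread. The function `count_matches_for_column` is a hypothetical stand-in for whatever one outer iteration really does, and the data layout (a vector of rows) is an assumption. Each thread only reads the shared data and writes its result to its own slot, which avoids a data race; accumulating in a local variable and storing once keeps the hot writes off cache lines shared with other threads.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical work for one outer-loop iteration: count the 1s in a column.
long count_matches_for_column(const std::vector<std::vector<int>>& data,
                              std::size_t col) {
    long hits = 0;
    for (const auto& row : data) {
        if (row[col] == 1) ++hits;
    }
    return hits;
}

// Splits the columns across `thread_count` threads (thread t takes columns
// t, t + thread_count, ...) and sums the per-thread results afterwards.
long count_matches_parallel(const std::vector<std::vector<int>>& data,
                            std::size_t cols, unsigned thread_count) {
    assert(thread_count > 0);
    std::vector<long> partial(thread_count, 0);  // one result slot per thread
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        workers.emplace_back([&data, &partial, cols, thread_count, t] {
            long local = 0;  // thread-private accumulator
            for (std::size_t col = t; col < cols; col += thread_count) {
                local += count_matches_for_column(data, col);
            }
            partial[t] = local;  // single write to this thread's own slot
        });
    }
    for (auto& worker : workers) worker.join();

    long total = 0;
    for (long p : partial) total += p;
    return total;
}
```

A reasonable starting value for `thread_count` is `std::thread::hardware_concurrency()`, falling back to 1 if it returns 0. On some toolchains the program must be built with `-pthread`.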
    Cheers, D Drmmr


  8. #53
    Join Date
    Dec 2012
    Location
    England
    Posts
    1,993

    Re: Getting data in to C++ from excel, any help would be greatly appreciated

    A simple possible performance improvement - depending upon the compiler - is to look at the condition part of the for loops. In the for loop

    Code:
    for (trade2position = 1 + trade1position; trade2position < 9 + trade1position; trade2position++) {...
    For every iteration of the loop the computer evaluates the expression 9 + trade1position. If this is evaluated once, outside the for loop (as it doesn't change during the body of the loop), it will provide a very small speed improvement for each iteration. However, some compilers set to produce speed-optimised code already perform this type of optimisation. But there would be no harm in trying it to see if you obtain a speed increase.

    Code:
    int trade2iter = 9 + trade1position;
    for (trade2position = 1 + trade1position; trade2position < trade2iter; trade2position++)
    If all the for loops are changed in a similar way, you may get a speed increase, depending upon how well your compiler has already optimised this type of code.

    I see that you have

    Code:
    int columns = 37;
    ...
    for (q = 0; q < columns - 1; q++) {
         for (c = 0; c < columns - 1; c++) {
    As columns isn't changed during the program, this could be replaced with

    Code:
    int columns = 37 - 1;    //Leave 37 so this tallies with your array bounds and you know where it came from
    ...
    for (q = 0; q < columns; q++) {
        for (c = 0; c < columns; c++) {

  9. #54
    Join Date
    Apr 1999
    Posts
    27,418

    Re: Getting data in to C++ from excel, any help would be greatly appreciated

    Quote Originally Posted by Surreall View Post
    What I am achieving with the arrays is comparing corresponding 1's and 0's in each column of dataarray.
    I know this was an earlier post of yours, but I'll respond:

    I don't really understand your explanation with the indices you're using in the example. Why [1][1], [2][1], and [0][0]? Unless this is a typo, don't you mean (row, col) [0][1], [1][1], [2][1]? If not, maybe you should first rearrange the data so that you're not jumping around all over the place like that, since there really is no coherence to the small example you gave. Again, if your original explanation is a typo, it would be good if you corrected it.
    I know i am not explaining too well its hard to get ya head round it
    If you attempted to explain in general terms what it is you're really doing, then you would get many more responses with much more efficient and maintainable ways of doing what you're doing. Some of those methods may not even have been considered by yourself, such as using bits, bitsets, etc. to represent 1, 0, and "don't care" columns.
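To show what the bitset idea could look like, here is a minimal sketch in which each row is a std::bitset and a pattern is a pair of masks: one marking the columns that matter, one holding the required bit for those columns. The names `RowBits`, `Pattern`, and `matches` are hypothetical, the width of 37 is borrowed from the `int columns = 37;` quoted earlier in the thread, and the sketch assumes every cell is strictly 0 or 1.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kColumns = 37;  // assumed width, from `int columns = 37;`
using RowBits = std::bitset<kColumns>;

// A pattern over one row. Columns not set in `care` are "don't care".
struct Pattern {
    RowBits care;      // 1 = this column must be checked
    RowBits expected;  // required bit for each checked column

    void require(std::size_t col, bool bit) {
        assert(col < kColumns);
        care.set(col);
        expected.set(col, bit);
    }
};

// True when every column the pattern cares about holds the expected bit.
// XOR marks the differing columns; AND discards the "don't care" ones.
bool matches(const RowBits& row, const Pattern& pattern) {
    return ((row ^ pattern.expected) & pattern.care).none();
}
```

With this representation a whole row is compared against a pattern in a few word-sized operations rather than one comparison per column.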

    Maybe that's the problem -- you're looking at this in a "hard-coded" fashion, and not seeing the general pattern. Once you see the general pattern, then you have all the power of the C++ algorithms and containers to do much of this grunt work. This is what nuzzle was referring to earlier, and that is the algorithm. You may be using a worst-case loop that runs n^2 times when you might only need to loop log(n) times; a change like that would cut the running time far more than tweaking a loop variable here or there.

    But only if we have a clear understanding of the process in general terms could we really look at and suggest a better approach.

    In your code, I see a lot of repeated or close to repeated code. If you're writing 10 lines of code that is the same except for an index or a variable, or the variable type is different, etc. then that code should be refactored into a general approach to get rid of the redundancies. A well-written C++ program shouldn't have 2 or more sections that look exactly the same except for a variable or two.
    Code:
    int indicator1position = 4;
    int indicator2position = 5;
    int indicator3position = 6;
    int indicator4position = 7;
    int indicator1value = 1;
    int indicator2value = 1;
    int indicator3value = 1;
    int indicator4value = 1;
    
    int trade1position = 4;
    int trade2position = 5;
    int trade3position = 6;
    int trade4position = 7;
    int trade1value = 1;
    int trade2value = 1;
    int trade3value = 1;
    int trade4value = 1;
    Why aren't these arrays themselves?
    Code:
    int indicatorposition[] = {4,5,6,7};
    int indicatorvalue[] = {1,1,1,1};
    int tradeposition[] = {4,5,6,7};
    int tradevalue[] = {1,1,1,1};
    Just this one change can make the difference between seeing the problem in a general approach, as opposed to the hard-coded, ad-hoc approach you seem to be taking. Now you're forced to think in terms of arrays and indices and the relationship between indicatorposition[x], indicatorvalue[x], etc. where x is some number.
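Once the positions and values are held together, the chain of hand-written comparisons can collapse into a single loop. The sketch below is one possible shape for that, not the original program: `Condition` and `all_match` are hypothetical names, and the table is assumed to be a vector of rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One condition: the cell `offset` rows below the current row must equal `value`.
struct Condition {
    std::size_t offset;
    int value;
};

// Checks every condition against one column of `data`, starting at `row`.
// Adding a fifth or sixth condition needs no new code, only a longer list.
bool all_match(const std::vector<std::vector<int>>& data,
               std::size_t row, std::size_t col,
               const std::vector<Condition>& conditions) {
    for (const Condition& c : conditions) {
        assert(row + c.offset < data.size());
        if (data[row + c.offset][col] != c.value) return false;
    }
    return true;
}
```

A call such as `all_match(data, jRows, c, {{4, 1}, {5, 1}})` would then stand in for two of the `dataarray[jRows + indicatorNposition][c] == indicatorNvalue` tests, with the number of conditions decided by the data rather than by the code.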

    Regards,

    Paul McKenzie
