Hello, sorry for having multiple threads open at the same time, but this is different enough that I thought it should be on its own.

I have some incremental results. In practice these end up as a vector of float. As the code progresses, I am keeping the top N largest values overall.

For example, where N=3, after the first iteration my top 3 vector contains,
Code:
5.5 4.4 2.2

The next set being evaluated, to see if it has any element that qualifies for the top 3, is,
Code:
1.1 4.7 0.1

Since 4.7 is larger than the smallest element in my current top 3, I would replace 2.2 with 4.7 to end up with a new top 3 list of,
Code:
5.5 4.7 4.4

My top_N_set is already sorted large to small, and I have the largest value in the new set in scope, so what I am doing is first asking if the largest new value is larger than the smallest value in the current top_N_set. If it isn't, there is nothing to do. If it is, I combine the vectors, resort, and resize.
Code:
unsigned int N = 3;
float largest_new_value;
vector<float> top_N_set, new_set;

// the top_N_set is already sorted large to small
sort( top_N_set.begin(), top_N_set.end(), greater<float>() );

// check to see if the largest new value is > the smallest value in top_N_set
if( largest_new_value > top_N_set.back() )
{
   // combine top_N_set with new_set
   // preallocate memory for combined vector
   top_N_set.reserve( top_N_set.size() + new_set.size() );

   // insert new_set at end of top_N_set
   top_N_set.insert( top_N_set.end(), new_set.begin(), new_set.end() );

   // resort top_N_set large to small
   sort( top_N_set.begin(), top_N_set.end(), greater<float>() );

   // trim top_N_set back to N
   top_N_set.resize(N);
}

There are a lot of ways to combine vectors. I also looked at,
Code:
// a more old-fashioned way, assigning by []
unsigned int N = 3;
unsigned int i, start_position, resize_to;
float largest_new_value;
vector<float> top_N_set, new_set;

// create new size for top_N_set vector and resize
start_position = top_N_set.size();
resize_to = start_position + new_set.size();
top_N_set.resize(resize_to);

// add all elements of new_set to end of top_N_set
for( i = 0; i < new_set.size(); i++ )
{
   top_N_set[start_position + i] = new_set[i];
}

//-------------------------------------------------------------------------------------//
// just using push_back

// push all elements of new_set onto the end of top_N_set
for( i = 0; i < new_set.size(); i++ )
{
   top_N_set.push_back( new_set[i] );
}

With some testing, it seems as if the reserve() and insert() method gives the best performance.

The top_N_set vector will never be very large (probably < 30), but the new_set vector will range from 1 element to many thousands. I am not sure that there is an approach that will work equally well for all set sizes, so I am trying to settle on something that will be reasonable.

There are other approaches, like sorting new_set and then searching for the first element that is not > top_N_set.back(). An iterator set to this position would define the range of new_set that is relevant, and that might help in situations where new_set is large and the number of values that might go into top_N_set is small. Every additional step adds processing that is probably not helpful in situations where new_set is small or where most of new_set is > top_N_set.back().

Are there any suggestions on this, either on how I am combining the vectors or on the overall approach? I am using gcc 3.3/g++ 3.3/g77 in both 32 and 64-bit.

LMHmedchem