# counting occurences of combinations

• June 20th, 2008, 03:40 PM
gary_88
counting occurences of combinations
for the following input file
11 22 33 44
11 22

im supposed to get
11,22 2
11,33 1
11,44 1
11,22,33 1
11,22,44 1
11,33,44 1
22,33,44 1
11,22,33,44 1
there is no repetition of elements on each line ..
i used following code to count "pairs" ie {11,22} and all....How do u do it for triplets and quadruplets and so on..??
--------------------------------
#define REP(i,n) for(int i=0;i<n;i++)
#define FOR(i,a,b) for(int i=a;i<b;i++)
ifstream fin("input.txt");
ofstream fout("output.txt");
string a;
map <pair<int,int>,int> b;
vector <int> x;
set <int> ss;
int m;
int main()
{
ifstream fin("input.txt");
while(getline(fin,a))
{
stringstream s;
s<<a;
x.clear();
while(s>>m) { x.push_back(m); ss.insert(m);}
REP(i,x.size())
FOR(j,i+1,x.size())
if(x[i]<x[j])
b[make_pair(x[i],x[j])]++;
else
b[make_pair(x[j],x[i])]++;
}
for(set <int>::iterator i=ss.begin();i!=ss.end();i++)
for(set <int>::iterator j=i;j!=ss.end();j++)
if(b[make_pair(*i,*j)])
fout<<"{"<<*i<<","<<*j<<"}"<<"\t"<<b[make_pair(*i,*j)]<<endl;
}
• June 20th, 2008, 08:02 PM
Duoas
Re: counting occurences of combinations
Don't use maps or pairs. Use a set.

Hope this helps.
• June 21st, 2008, 01:25 AM
gary_88
Re: counting occurences of combinations
But after inserting elements into the set how do u do the counting part??
Can u demonstrate the code?
• June 21st, 2008, 06:57 AM
Cheezewizz
Re: counting occurences of combinations
Would a multiset not be better? A set would just ignore any duplicates so his 11,22 2 would be become 11,22 1 ....
• June 21st, 2008, 11:28 AM
mysterd429
Re: counting occurences of combinations
So, mathematically, you have several sets, A[1] = {11, 22, 33, 44}, A[2] = {11, 22}, etc. You then take the union of all of these to get B = {11, 22, 33, 44}. You then want to count the number of times that each subset of B is a subset of an A. This is a naive implementation, using a set of sets to represent A. I'm sure it's a slow as possible, but I think it does what you want.

I omitted some utility code.

Code:

```/**  * Build one-tuple sets, the double sets, ..., n-tuple sets.  Terminates when  * tupleSize is greater than the number of elements in bee.  * @param bee The set from which elements are taken.  * @param aye The set of sets being generated.  * @param tupleSize The number of elements in each set for this iteration.  * @pre bee has all of the elements.  * @pre aye is empty.  * @post aye is populated with all subsets of size tupleSize.  */ void BuildTupleSets(set<int> bee, set<set<int> > &aye, int tupleSize = 1) {   // cerr << "called with tupleSize = " << tupleSize << endl;   // first iteration.   if(tupleSize == 1)     {       for(set<int>::iterator i = bee.begin(); i != bee.end(); i++)         {           set<int> ayeSubN;           ayeSubN.insert(*i);           // cerr << "Adding " << ayeSubN << endl;           aye.insert(ayeSubN);         }       BuildTupleSets(bee, aye, tupleSize + 1);     }   // iterations where we are building sets bigger than 1   else if(tupleSize <= bee.size())     {       for(set<set<int> >::iterator i = aye.begin(); i != aye.end(); i++)         {           // cerr << "aye[i] is " << *i << endl;           // If the current size is one less than the desired size, add to it.           if(i->size() + 1 == tupleSize)             {               for(set<int>::iterator j = bee.begin(); j != bee.end(); j++)                 {                   set<int> ayeSubN(*i);                   ayeSubN.insert(*j);                   // cerr << "Adding " << ayeSubN << endl;                   aye.insert(ayeSubN);                 }             }         }       BuildTupleSets(bee, aye, tupleSize + 1);     } } /**  * Checks to see if ess is a superset ot tee.  * @param ess The potential superset.  * @param tee The potential subset.  * @return True if ess is a superset, false otherwise.  */ bool IsSubset(const set<int> &ess, const set<int> &tee) {   for(set<int>::iterator j = tee.begin(); j != tee.end(); j++)     {       set<int>::iterator found = ess.find(*j);       if(found == ess.end())         {           return false;         }     }   return true; }```
I use the function like this:
Code:

```int main() {   map<set<int>, int> results;   // list<set<int> > results;   set<int> bee;   set<set<int> > aye;   vector<set<int> > input;   ifstream fin("data.txt");   int i;   while(fin)     {       char lineBuffer[65535];       set<int> inputSubN;       fin.getline(lineBuffer, 65535);       stringstream lin(lineBuffer, stringstream::in);       while(lin)         {           lin >> i;           // cerr << "Adding " << i << " to bee." << endl;           bee.insert(i);           // cerr << "Adding " << i << " to inputSubN." << endl;           inputSubN.insert(i);         }       input.push_back(inputSubN);     }   // cerr << "bee is " << bee << endl;   BuildTupleSets(bee, aye);   // cerr << "aye is " << aye << endl;   for(vector<set<int> >::iterator super = input.begin(); super != input.end(); super++)     {       for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)         {           if(IsSubset(*super, *sub))             {               // cerr << *sub << " is a subset of " << *super << endl;               results[*sub] += 1;             }         }     }   for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)     {       cout << *sub << " : " << results[*sub] << endl;     }   return 0; }```
The output is then
Code:

```{11} : 2 {11, 22} : 2 {11, 22, 33} : 1 {11, 22, 33, 44} : 1 {11, 22, 44} : 1 {11, 33} : 1 {11, 33, 44} : 1 {11, 44} : 1 {22} : 3 {22, 33} : 1 {22, 33, 44} : 1 {22, 44} : 1 {33} : 1 {33, 44} : 1 {44} : 1```
Let me know if that does what you want.

Cheers!
• June 21st, 2008, 11:37 AM
TheCPUWizard
Re: counting occurences of combinations
A nice solution :thumb: :wave:

Performance has not been given as a requirement [ie number be ables to determine all of the subset counts for an original set of size n in under x seconds]

The only things I might do would be do create some trivial derived classes to make if a bit more readable....

Code:

```class SingleSet : public std::set<int> {} class : public std::set<SingleSet> {} void BuildTupleSets(SingleSet bee, MultiSet  &aye, int tupleSize = 1) { // etc.... }```
• June 21st, 2008, 02:37 PM
gary_88
Re: counting occurences of combinations
Thanks a lot mysterd429.... Although there are some small problems..

1. In the main function..this code

for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)
{
cout << *sub << " : " << results[*sub] << endl;
}

Here cout<<*sub gives error "no match for operator<<"....the expression results[*sub] gives the right answer like u said..

2. I noticed you used char lineBuffer[65535];.. Since this doesnt work for bigger files( 1or 2mb txt files)..i tried using string linebuffer although that didnt work..Wat is the change to be made here??

3. Just wanted to know if the answer could be printed in the format like

{11} 2
{22} 2
{11,22} 1
{11,22,33} 1
{11,22,33,44} 1

something like this where lower sets could be printed first followed by the remaining higher ones.

Thanks again!! Cheers!
• June 21st, 2008, 04:16 PM
mysterd429
Re: counting occurences of combinations
Quote:

Originally Posted by gary_88
Thanks a lot mysterd429.... Although there are some small problems..

1. In the main function..this code

for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)
{
cout << *sub << " : " << results[*sub] << endl;
}

Here cout<<*sub gives error "no match for operator<<"....the expression results[*sub] gives the right answer like u said..

I omited some of the code I used for output purposes. I've pasted the whole source file below.

Quote:

Originally Posted by gary_88
2. I noticed you used char lineBuffer[65535];.. Since this doesnt work for bigger files( 1or 2mb txt files)..i tried using string linebuffer although that didnt work..Wat is the change to be made here??

As long as no one line exceeds that 65535 characters, you should be okay. Do you have more than 65535 characters per line? If lines are longer than 65535 characters, you could either increase the buffer size, or work out some other scheme.

Quote:

Originally Posted by gary_88
]3. Just wanted to know if the answer could be printed in the format like

{11} 2
{22} 2
{11,22} 1
{11,22,33} 1
{11,22,33,44} 1

something like this where lower sets could be printed first followed by the remaining higher ones.

Thanks again!! Cheers!

You could specify a different function for the third template parameter. I don't have a moment to create a good example right now, but this page might have your answer.

Cheers!

Don

Code:

```#include <set> #include <map> #include <fstream> #include <iostream> #include <string> #include <vector> #include <sstream> #include <list> using namespace std; ostream & operator<<(ostream & lhs, set<int> rhs) {   set<int>::iterator i = rhs.begin();   lhs << "{";   if(i != rhs.end())     {       lhs << *i;       i++;       for(; i != rhs.end(); i++)         {           lhs << ", " << *i;         }     }   lhs << "}";   return lhs; } ostream & operator<<(ostream & lhs, set<set<int> > rhs) {   set<set<int> >::iterator i = rhs.begin();   lhs << "{";   if(i != rhs.end())     {       lhs << *i;       i++;       for(; i != rhs.end(); i++)         {           lhs << ", " << *i;         }     }   lhs << "}";   return lhs; } /**  * Build one-tuple sets, the double sets, ..., n-tuple sets.  Terminates when  * tupleSize is greater than the number of elements in bee.  * @param bee The set from which elements are taken.  * @param aye The set of sets being generated.  * @param tupleSize The number of elements in each set for this iteration.  * @pre bee has all of the elements.  * @pre aye is empty.  * @post aye is populated with all subsets of size tupleSize.  */ void BuildTupleSets(set<int> bee, set<set<int> > &aye, int tupleSize = 1) {   // cerr << "called with tupleSize = " << tupleSize << endl;   // first iteration.   if(tupleSize == 1)     {       for(set<int>::iterator i = bee.begin(); i != bee.end(); i++)         {           set<int> ayeSubN;           ayeSubN.insert(*i);           // cerr << "Adding " << ayeSubN << endl;           aye.insert(ayeSubN);         }       BuildTupleSets(bee, aye, tupleSize + 1);     }   // iterations where we are building sets bigger than 1   else if(tupleSize <= bee.size())     {       for(set<set<int> >::iterator i = aye.begin(); i != aye.end(); i++)         {           // cerr << "aye[i] is " << *i << endl;           // If the current size is one less than the desired size, add to it.           if(i->size() + 1 == tupleSize)             {               for(set<int>::iterator j = bee.begin(); j != bee.end(); j++)                 {                   set<int> ayeSubN(*i);                   ayeSubN.insert(*j);                   // cerr << "Adding " << ayeSubN << endl;                   aye.insert(ayeSubN);                 }             }         }       BuildTupleSets(bee, aye, tupleSize + 1);     } } /**  * Checks to see if ess is a superset ot tee.  * @param ess The potential superset.  * @param tee The potential subset.  * @return True if ess is a superset, false otherwise.  */ bool IsSubset(const set<int> &ess, const set<int> &tee) {   for(set<int>::iterator j = tee.begin(); j != tee.end(); j++)     {       set<int>::iterator found = ess.find(*j);       if(found == ess.end())         {           return false;         }     }   return true; } int main() {   map<set<int>, int> results;   // list<set<int> > results;   set<int> bee;   set<set<int> > aye;   vector<set<int> > input;   ifstream fin("data.txt");   int i;   while(fin)     {       char lineBuffer[65535];       set<int> inputSubN;       fin.getline(lineBuffer, 65535);       stringstream lin(lineBuffer, stringstream::in);       while(lin)         {           lin >> i;           // cerr << "Adding " << i << " to bee." << endl;           bee.insert(i);           // cerr << "Adding " << i << " to inputSubN." << endl;           inputSubN.insert(i);         }       input.push_back(inputSubN);     }   // cerr << "bee is " << bee << endl;   BuildTupleSets(bee, aye);   // cerr << "aye is " << aye << endl;   for(vector<set<int> >::iterator super = input.begin(); super != input.end(); super++)     {       for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)         {           if(IsSubset(*super, *sub))             {               // cerr << *sub << " is a subset of " << *super << endl;               results[*sub] += 1;             }         }     }   for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)     {       cout << *sub << " : " << results[*sub] << endl;     }   return 0; }```
• June 21st, 2008, 04:58 PM
Duoas
Re: counting occurences of combinations
I'm confused. Are you counting permutations or combinations? They are different things...
• June 21st, 2008, 05:42 PM
gary_88
Re: counting occurences of combinations
Thanks..code was able to compile but still it isnt working for big files even though a single line is not more than 65535 characters

30 31 32
33 34 35
36 37 38 39 40 41 42 43 44 45 46
38 39 47 48
38 39 48 49 50 51 52 53 54 55 56 57 58
32 41 59 60 61 62
3 39 48
63 64 65 66 67 68
32 69
48 70 71 72
119 120 121 122 123 124 125 126 127 128 129 130 131 132 133

this a small example of how my input file is...the o/p is not coming for this..
• June 22nd, 2008, 10:44 AM
mysterd429
Re: counting occurences of combinations
Hi Gary,

There was a small bug in my input code. I was not correctly reading in from lin before I checked to see if it had failed. That section of code should look like the one included below.

The new code may or may not fix the problem that you were seeing with large files. Try making the changing and seeing if it still fails. When the program fails for large files, what happens? What error do you get? Can you use an interactive debugger to find the exact line of code where it fails?

Cheers!

P.S. - Thanks for the feedback, CPUWizard.

Code:

```  ifstream fin("data.txt");   while(fin)     {       char lineBuffer[65535];       set<int> inputSubN;       fin.getline(lineBuffer, 65535);       int i;       // cerr << "Read \"" << lineBuffer << "\"." << endl;       stringstream lin(lineBuffer, stringstream::in);       lin >> i;       while(lin)         {           // cerr << "Adding " << i << " to bee." << endl;           bee.insert(i);           // cerr << "Adding " << i << " to inputSubN." << endl;           inputSubN.insert(i);           lin >> i;         }       input.push_back(inputSubN);     }```
edit: If you have 14 possible values, (using 11, 22, 33, 44 you have 4), then the total number of subsets is

2 * (14! / (1! * 13!) + 14! / (2! * 12!) + 14! / (3! * 11!) + 14! / (4! * 10!) + 14! / (5! * 9!) + 14! / (6! * 8!)) + 14! / (7! * 7!)

That's a relatively large number, 16383 combinations. If you have a 2 MiB file, that's something like 200,000 lines. That gets you 1.683e4 * 2e5 iterations of comparing set to set, which is about 3.2e9, or 3,200,000,000 set-to-set comparisons. That may take a very long time, depending on your computer. My dual processor G4 (at 1.0 GHz) has been chugging away for an hour or so. Like I said, this ain't an efficient algorithm ;-)
• June 22nd, 2008, 12:01 PM
gary_88
Re: counting occurences of combinations
hi don

Firstly i just checked up on the following input
11 22 33 44 55
66 77 88
it gave me 8 element set long answer.. I'm really sorry but that isnt wat i wanted as o/p.. Perhaps tthis wud be a bette example..
i/p
I1 I2 I5
I2 I4
I2 I3
I1 I2 I4
I1 I3
I2 I3
I1 I3
I1 I2 I3 I5
I1 I2 I3
-----------------------------------------------
o/p
6 I1
7 I2
2 I5
2 I4
6 I3
I1,I2 4
I1,I5 2
I1,I4 1
I1,I3 4
I2,I5 2
I2,I4 2
I2,I3 4
I5,I4 0
I5,I3 1
I4,I3 0
{I1,I2,I5} 2
{I1,I2,I4} 1
{I1,I2,I3} 2
{I1,I5,I4} 0
{I1,I5,I3} 1
{I1,I4,I3} 0
{I2,I5,I4} 0
{I2,I5,I3} 1
{I2,I4,I3} 0
{I5,I4,I3} 0
{I1,I2,I5,I4} 0
{I1,I2,I5,I3} 1
{I1,I2,I4,I3} 0
{I2,I5,I4,I3} 0

im really sorry for troubling you..
thanks
Cheers!
• June 24th, 2008, 02:00 AM
gary_88
Re: counting occurences of combinations