CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com

# Thread: counting occurences of combinations

1. Junior Member Join Date
Jun 2008
Posts
9

## counting occurences of combinations

for the following input file
11 22 33 44
11 22

im supposed to get
11,22 2
11,33 1
11,44 1
11,22,33 1
11,22,44 1
11,33,44 1
22,33,44 1
11,22,33,44 1
there is no repetition of elements on each line ..
i used following code to count "pairs" ie {11,22} and all....How do u do it for triplets and quadruplets and so on..??
--------------------------------
#define REP(i,n) for(int i=0;i<n;i++)
#define FOR(i,a,b) for(int i=a;i<b;i++)
ifstream fin("input.txt");
ofstream fout("output.txt");
string a;
map <pair<int,int>,int> b;
vector <int> x;
set <int> ss;
int m;
int main()
{
ifstream fin("input.txt");
while(getline(fin,a))
{
stringstream s;
s<<a;
x.clear();
while(s>>m) { x.push_back(m); ss.insert(m);}
REP(i,x.size())
FOR(j,i+1,x.size())
if(x[i]<x[j])
b[make_pair(x[i],x[j])]++;
else
b[make_pair(x[j],x[i])]++;
}
for(set <int>::iterator i=ss.begin();i!=ss.end();i++)
for(set <int>::iterator j=i;j!=ss.end();j++)
if(b[make_pair(*i,*j)])
fout<<"{"<<*i<<","<<*j<<"}"<<"\t"<<b[make_pair(*i,*j)]<<endl;
}  Reply With Quote

2. Member Join Date
May 2008
Posts
96

## Re: counting occurences of combinations

Don't use maps or pairs. Use a set.

Hope this helps.  Reply With Quote

3. Junior Member Join Date
Jun 2008
Posts
9

## Re: counting occurences of combinations

But after inserting elements into the set how do u do the counting part??
Can u demonstrate the code?  Reply With Quote

4. Member Join Date
May 2008
Posts
38

## Re: counting occurences of combinations

Would a multiset not be better? A set would just ignore any duplicates so his 11,22 2 would be become 11,22 1 ....  Reply With Quote

5. Junior Member Join Date
Oct 2007
Posts
15

## Re: counting occurences of combinations

So, mathematically, you have several sets, A = {11, 22, 33, 44}, A = {11, 22}, etc. You then take the union of all of these to get B = {11, 22, 33, 44}. You then want to count the number of times that each subset of B is a subset of an A. This is a naive implementation, using a set of sets to represent A. I'm sure it's a slow as possible, but I think it does what you want.

I omitted some utility code.

Code:
```/**
* Build one-tuple sets, the double sets, ..., n-tuple sets.  Terminates when
* tupleSize is greater than the number of elements in bee.
* @param bee The set from which elements are taken.
* @param aye The set of sets being generated.
* @param tupleSize The number of elements in each set for this iteration.
* @pre bee has all of the elements.
* @pre aye is empty.
* @post aye is populated with all subsets of size tupleSize.
*/
void BuildTupleSets(set<int> bee, set<set<int> > &aye, int tupleSize = 1)
{
// cerr << "called with tupleSize = " << tupleSize << endl;
// first iteration.
if(tupleSize == 1)
{
for(set<int>::iterator i = bee.begin(); i != bee.end(); i++)
{
set<int> ayeSubN;
ayeSubN.insert(*i);
// cerr << "Adding " << ayeSubN << endl;
aye.insert(ayeSubN);
}
BuildTupleSets(bee, aye, tupleSize + 1);
}
// iterations where we are building sets bigger than 1
else if(tupleSize <= bee.size())
{
for(set<set<int> >::iterator i = aye.begin(); i != aye.end(); i++)
{
// cerr << "aye[i] is " << *i << endl;
// If the current size is one less than the desired size, add to it.
if(i->size() + 1 == tupleSize)
{
for(set<int>::iterator j = bee.begin(); j != bee.end(); j++)
{
set<int> ayeSubN(*i);
ayeSubN.insert(*j);
// cerr << "Adding " << ayeSubN << endl;
aye.insert(ayeSubN);
}
}
}
BuildTupleSets(bee, aye, tupleSize + 1);
}
}

/**
* Checks to see if ess is a superset ot tee.
* @param ess The potential superset.
* @param tee The potential subset.
* @return True if ess is a superset, false otherwise.
*/
bool IsSubset(const set<int> &ess, const set<int> &tee)
{
for(set<int>::iterator j = tee.begin(); j != tee.end(); j++)
{
set<int>::iterator found = ess.find(*j);
if(found == ess.end())
{
return false;
}
}
return true;
}```
I use the function like this:
Code:
```int main()
{
map<set<int>, int> results;
// list<set<int> > results;
set<int> bee;
set<set<int> > aye;
vector<set<int> > input;

ifstream fin("data.txt");

int i;

while(fin)
{
char lineBuffer;
set<int> inputSubN;
fin.getline(lineBuffer, 65535);
stringstream lin(lineBuffer, stringstream::in);
while(lin)
{
lin >> i;
// cerr << "Adding " << i << " to bee." << endl;
bee.insert(i);
// cerr << "Adding " << i << " to inputSubN." << endl;
inputSubN.insert(i);
}
input.push_back(inputSubN);
}

// cerr << "bee is " << bee << endl;

BuildTupleSets(bee, aye);

// cerr << "aye is " << aye << endl;

for(vector<set<int> >::iterator super = input.begin(); super != input.end(); super++)
{
for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)
{
if(IsSubset(*super, *sub))
{
// cerr << *sub << " is a subset of " << *super << endl;
results[*sub] += 1;
}
}
}

for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)
{
cout << *sub << " : " << results[*sub] << endl;
}

return 0;
}```
The output is then
Code:
```{11} : 2
{11, 22} : 2
{11, 22, 33} : 1
{11, 22, 33, 44} : 1
{11, 22, 44} : 1
{11, 33} : 1
{11, 33, 44} : 1
{11, 44} : 1
{22} : 3
{22, 33} : 1
{22, 33, 44} : 1
{22, 44} : 1
{33} : 1
{33, 44} : 1
{44} : 1```
Let me know if that does what you want.

Cheers!
Last edited by mysterd429; June 21st, 2008 at 11:31 AM.  Reply With Quote

6. ## Re: counting occurences of combinations

A nice solution  Performance has not been given as a requirement [ie number be ables to determine all of the subset counts for an original set of size n in under x seconds]

The only things I might do would be do create some trivial derived classes to make if a bit more readable....

Code:
```class SingleSet : public std::set<int> {}
class : public std::set<SingleSet> {}

void BuildTupleSets(SingleSet bee, MultiSet  &aye, int tupleSize = 1)
{
// etc....
}```  Reply With Quote

7. Junior Member Join Date
Jun 2008
Posts
9

## Re: counting occurences of combinations

Thanks a lot mysterd429.... Although there are some small problems..

1. In the main function..this code

for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)
{
cout << *sub << " : " << results[*sub] << endl;
}

Here cout<<*sub gives error "no match for operator<<"....the expression results[*sub] gives the right answer like u said..

2. I noticed you used char lineBuffer;.. Since this doesnt work for bigger files( 1or 2mb txt files)..i tried using string linebuffer although that didnt work..Wat is the change to be made here??

3. Just wanted to know if the answer could be printed in the format like

{11} 2
{22} 2
{11,22} 1
{11,22,33} 1
{11,22,33,44} 1

something like this where lower sets could be printed first followed by the remaining higher ones.

Thanks again!! Cheers!  Reply With Quote

8. Junior Member Join Date
Oct 2007
Posts
15

## Re: counting occurences of combinations Originally Posted by gary_88
Thanks a lot mysterd429.... Although there are some small problems..

1. In the main function..this code

for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)
{
cout << *sub << " : " << results[*sub] << endl;
}

Here cout<<*sub gives error "no match for operator<<"....the expression results[*sub] gives the right answer like u said..
I omited some of the code I used for output purposes. I've pasted the whole source file below. Originally Posted by gary_88
2. I noticed you used char lineBuffer;.. Since this doesnt work for bigger files( 1or 2mb txt files)..i tried using string linebuffer although that didnt work..Wat is the change to be made here??
As long as no one line exceeds that 65535 characters, you should be okay. Do you have more than 65535 characters per line? If lines are longer than 65535 characters, you could either increase the buffer size, or work out some other scheme. Originally Posted by gary_88
]3. Just wanted to know if the answer could be printed in the format like

{11} 2
{22} 2
{11,22} 1
{11,22,33} 1
{11,22,33,44} 1

something like this where lower sets could be printed first followed by the remaining higher ones.

Thanks again!! Cheers!
You could specify a different function for the third template parameter. I don't have a moment to create a good example right now, but this page might have your answer.

Cheers!

Don

Code:
```#include <set>
#include <map>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
#include <list>

using namespace std;

ostream & operator<<(ostream & lhs, set<int> rhs)
{
set<int>::iterator i = rhs.begin();

lhs << "{";
if(i != rhs.end())
{
lhs << *i;
i++;
for(; i != rhs.end(); i++)
{
lhs << ", " << *i;
}
}
lhs << "}";

return lhs;
}

ostream & operator<<(ostream & lhs, set<set<int> > rhs)
{
set<set<int> >::iterator i = rhs.begin();

lhs << "{";
if(i != rhs.end())
{
lhs << *i;
i++;
for(; i != rhs.end(); i++)
{
lhs << ", " << *i;
}
}
lhs << "}";

return lhs;
}

/**
* Build one-tuple sets, the double sets, ..., n-tuple sets.  Terminates when
* tupleSize is greater than the number of elements in bee.
* @param bee The set from which elements are taken.
* @param aye The set of sets being generated.
* @param tupleSize The number of elements in each set for this iteration.
* @pre bee has all of the elements.
* @pre aye is empty.
* @post aye is populated with all subsets of size tupleSize.
*/
void BuildTupleSets(set<int> bee, set<set<int> > &aye, int tupleSize = 1)
{
// cerr << "called with tupleSize = " << tupleSize << endl;
// first iteration.
if(tupleSize == 1)
{
for(set<int>::iterator i = bee.begin(); i != bee.end(); i++)
{
set<int> ayeSubN;
ayeSubN.insert(*i);
// cerr << "Adding " << ayeSubN << endl;
aye.insert(ayeSubN);
}
BuildTupleSets(bee, aye, tupleSize + 1);
}
// iterations where we are building sets bigger than 1
else if(tupleSize <= bee.size())
{
for(set<set<int> >::iterator i = aye.begin(); i != aye.end(); i++)
{
// cerr << "aye[i] is " << *i << endl;
// If the current size is one less than the desired size, add to it.
if(i->size() + 1 == tupleSize)
{
for(set<int>::iterator j = bee.begin(); j != bee.end(); j++)
{
set<int> ayeSubN(*i);
ayeSubN.insert(*j);
// cerr << "Adding " << ayeSubN << endl;
aye.insert(ayeSubN);
}
}
}
BuildTupleSets(bee, aye, tupleSize + 1);
}
}

/**
* Checks to see if ess is a superset ot tee.
* @param ess The potential superset.
* @param tee The potential subset.
* @return True if ess is a superset, false otherwise.
*/
bool IsSubset(const set<int> &ess, const set<int> &tee)
{
for(set<int>::iterator j = tee.begin(); j != tee.end(); j++)
{
set<int>::iterator found = ess.find(*j);
if(found == ess.end())
{
return false;
}
}
return true;
}

int main()
{
map<set<int>, int> results;
// list<set<int> > results;
set<int> bee;
set<set<int> > aye;
vector<set<int> > input;

ifstream fin("data.txt");

int i;

while(fin)
{
char lineBuffer;
set<int> inputSubN;
fin.getline(lineBuffer, 65535);
stringstream lin(lineBuffer, stringstream::in);
while(lin)
{
lin >> i;
// cerr << "Adding " << i << " to bee." << endl;
bee.insert(i);
// cerr << "Adding " << i << " to inputSubN." << endl;
inputSubN.insert(i);
}
input.push_back(inputSubN);
}

// cerr << "bee is " << bee << endl;

BuildTupleSets(bee, aye);

// cerr << "aye is " << aye << endl;

for(vector<set<int> >::iterator super = input.begin(); super != input.end(); super++)
{
for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)
{
if(IsSubset(*super, *sub))
{
// cerr << *sub << " is a subset of " << *super << endl;
results[*sub] += 1;
}
}
}

for(set<set<int> >::iterator sub = aye.begin(); sub != aye.end(); sub++)
{
cout << *sub << " : " << results[*sub] << endl;
}

return 0;
}```  Reply With Quote

9. Member Join Date
May 2008
Posts
96

## Re: counting occurences of combinations

I'm confused. Are you counting permutations or combinations? They are different things...  Reply With Quote

10. Junior Member Join Date
Jun 2008
Posts
9

## Re: counting occurences of combinations

Thanks..code was able to compile but still it isnt working for big files even though a single line is not more than 65535 characters

30 31 32
33 34 35
36 37 38 39 40 41 42 43 44 45 46
38 39 47 48
38 39 48 49 50 51 52 53 54 55 56 57 58
32 41 59 60 61 62
3 39 48
63 64 65 66 67 68
32 69
48 70 71 72
119 120 121 122 123 124 125 126 127 128 129 130 131 132 133

this a small example of how my input file is...the o/p is not coming for this..  Reply With Quote

11. Junior Member Join Date
Oct 2007
Posts
15

## Re: counting occurences of combinations

Hi Gary,

There was a small bug in my input code. I was not correctly reading in from lin before I checked to see if it had failed. That section of code should look like the one included below.

The new code may or may not fix the problem that you were seeing with large files. Try making the changing and seeing if it still fails. When the program fails for large files, what happens? What error do you get? Can you use an interactive debugger to find the exact line of code where it fails?

Cheers!

P.S. - Thanks for the feedback, CPUWizard.

Code:
```  ifstream fin("data.txt");

while(fin)
{
char lineBuffer;
set<int> inputSubN;
fin.getline(lineBuffer, 65535);
int i;

// cerr << "Read \"" << lineBuffer << "\"." << endl;
stringstream lin(lineBuffer, stringstream::in);
lin >> i;
while(lin)
{
// cerr << "Adding " << i << " to bee." << endl;
bee.insert(i);
// cerr << "Adding " << i << " to inputSubN." << endl;
inputSubN.insert(i);
lin >> i;
}
input.push_back(inputSubN);
}```
edit: If you have 14 possible values, (using 11, 22, 33, 44 you have 4), then the total number of subsets is

2 * (14! / (1! * 13!) + 14! / (2! * 12!) + 14! / (3! * 11!) + 14! / (4! * 10!) + 14! / (5! * 9!) + 14! / (6! * 8!)) + 14! / (7! * 7!)

That's a relatively large number, 16383 combinations. If you have a 2 MiB file, that's something like 200,000 lines. That gets you 1.683e4 * 2e5 iterations of comparing set to set, which is about 3.2e9, or 3,200,000,000 set-to-set comparisons. That may take a very long time, depending on your computer. My dual processor G4 (at 1.0 GHz) has been chugging away for an hour or so. Like I said, this ain't an efficient algorithm ;-)
Last edited by mysterd429; June 22nd, 2008 at 11:44 AM.  Reply With Quote

12. Junior Member Join Date
Jun 2008
Posts
9

## Re: counting occurences of combinations

hi don

Firstly i just checked up on the following input
11 22 33 44 55
66 77 88
it gave me 8 element set long answer.. I'm really sorry but that isnt wat i wanted as o/p.. Perhaps tthis wud be a bette example..
i/p
I1 I2 I5
I2 I4
I2 I3
I1 I2 I4
I1 I3
I2 I3
I1 I3
I1 I2 I3 I5
I1 I2 I3
-----------------------------------------------
o/p
6 I1
7 I2
2 I5
2 I4
6 I3
I1,I2 4
I1,I5 2
I1,I4 1
I1,I3 4
I2,I5 2
I2,I4 2
I2,I3 4
I5,I4 0
I5,I3 1
I4,I3 0
{I1,I2,I5} 2
{I1,I2,I4} 1
{I1,I2,I3} 2
{I1,I5,I4} 0
{I1,I5,I3} 1
{I1,I4,I3} 0
{I2,I5,I4} 0
{I2,I5,I3} 1
{I2,I4,I3} 0
{I5,I4,I3} 0
{I1,I2,I5,I4} 0
{I1,I2,I5,I3} 1
{I1,I2,I4,I3} 0
{I2,I5,I4,I3} 0

im really sorry for troubling you..
thanks
Cheers!  Reply With Quote

13. Junior Member Join Date
Jun 2008
Posts
9

## Re: counting occurences of combinations

thanks!  Reply With Quote

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
• 