CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com

# Thread: Find cartesians for matrix elements

1. Member +
Join Date
Feb 2017
Posts
639

## Re: Find cartesians for matrix elements

Originally Posted by JackK
But the problem is that the algorithm is unknown to me yet.
I thought that it's maybe known to some of you.
You can always apply exhaustive combinatorics:

1. Systematically generate all Cartesian matrixes that may occur in any matrix the size of the input matrix. Select those that are present in the input matrix.

2. Systematically generate all possible combinations of the Cartesian matrixes selected in 1. Select the valid combinations, that is where the sum of the Cartesian matrixes without overlapping matches the input matrix exactly.

3. Out of all valid combinations found in 2, select the combination(s) with the fewest Cartesian matrixes. That's it!

There are 2^(n+m) possible Cartesian matrices in an n*m input matrix, so step 1 is only feasible for small matrixes. The feasibility of step 2 depends on how many Cartesian matrixes there actually are in the input matrix. If there are k, there will be 2^k possible combinations, so also the number of combinations becomes intractable very fast. But still, this algorithm will work, and that's better than having nothing.

Both steps 1 and 2 involves generating the powerset of a set,

https://en.wikipedia.org/wiki/Power_set

Good luck!
Last edited by wolle; July 20th, 2021 at 01:20 AM.

2. Member +
Join Date
Feb 2017
Posts
639

## Re: Find cartesians for matrix elements

I've had another look at this problem, and it seems to be a variation of the (exact) set cover problem:

https://en.wikipedia.org/wiki/Set_cover_problem

Here, the problem has the additional requirement that sets must form Cartesian matrixes. The problem is NP-hard, and so exact solutions for large input matrixes are intractable.

For fun, I've implemented the algorithm I suggested in #16. I've kept step 1 as is. It requires 2^(n+m) iterations for an n times m input matrix. If m=n=10, there will be about 1 million iterations. It is still tractable but kind of the upper limit.

Step 2 becomes intractable even for a 10 by 10 input matrix. So I follow the advice in the Wikipedia link above and replace step 2 with an approximate greedy algorithm. The Cartesian matrixes found in step 1 are sorted from larger to smaller. The algorithm then considers them one by one, and if a Cartesian matrix does not overlap with any other already in the solution, it is added to the solution. So the larger the Cartesian matrix, the sooner it will be considered for inclusion in the solution. This notion of "greediness" seems to work well and generate solutions very close to optimum (when the input matrix is split into the fewest Cartesian matrixes possible).

Code:
```	#include <iostream>
#include <algorithm>
#include <ranges>
#include <utility>
#include <functional>
#include <string>
#include <array>
#include <vector>

void test() {
std::cout << "--- Test ---" << std::endl;

// Types

using Set = unsigned;	// a bitset
using CM = std::pair<Set,Set>;	// a Cartesian matrix (CM) is a <row-bitset, column-bitset> pair

// Helper functions

auto set_reverse = [](Set s, int n) { // reverse a Set s of n bits
Set r=0;
for (; --n >= 0; s >>= 1) r = (r <<= 1) | s&1;
return r;
};

// iterate the Set s and call the function f with the position of every 1-bit found
auto set_scan = [](Set s, const std::function<void(int)>& f) {
for (int i=0; s!=0; s>>=1,++i) if ((s&1) != 0) f(i);
};

auto set_count = [](Set s) { // count the 1-bits of the Set s
int i=0;
for (; s!=0; s>>=1) if ((s&1) != 0) ++i;
return i;
};

auto cm_string = [&] (CM cm) { // generate a string representation of a CM-pair
std::string s="", t="";
set_scan(cm.first, [&](int i) {s += char('A'+ i);});
set_scan(cm.second, [&](int i) {t += char('a'+ i);});
return s + ":" + t;
};

// compare two CM-pairs and decide whether the first is greater than the second
auto cm_greater = [&] (CM cm1, CM cm2) {
return set_count(cm1.first)*set_count(cm1.second) > set_count(cm2.first)*set_count(cm2.second);
};

auto cm_overlap = [] (CM cm1, CM cm2) { // decide whether two CM-pairs overlap
for (Set r = cm1.first & cm2.first; r!=0; r >>= 1) {
if ((r&1) != 0 && (cm1.second & cm2.second) != 0) return true;
}
return false;
};

// The internal representation of the input matrix
// is an array of ROWS rows, each holding a bitset of COLS bits
// (A matrix has the 0 column to the left, whereas a bitset has the 0 bit to the right.
// Therefore the columns of the input matrix are reversed when represented with bitsets).
const int ROWS = 6;
const int COLS = 5;
std::array<Set, ROWS> input_matrix = { // example matrix 2 in post #1
set_reverse(0b01101, COLS),
set_reverse(0b01001, COLS),
set_reverse(0b11000, COLS),
set_reverse(0b11011, COLS),
set_reverse(0b10011, COLS),
set_reverse(0b00111, COLS)
};

/*		const int ROWS = 4;
const int COLS = 4;
std::array<Set, ROWS> input_matrix = { // example matrix in post #10
set_reverse(0b0111, COLS),
set_reverse(0b0111, COLS),
set_reverse(0b0110, COLS),
set_reverse(0b1100, COLS),
};
*/

// Algorithm

// find all valid CMs of the input matrix
std::vector<CM> valid_CM;
for (Set r=1; r<(1<<ROWS); ++r) { // the full row powerset
for (Set c=1; c<(1<<COLS); ++c) { // the full column powerset
Set s=(1<<COLS)-1;
set_scan(r, [&](int i) {s &= input_matrix[i];});
if ((s&c) == c) {valid_CM.emplace_back(CM{r,c});}
}
}

std::ranges::sort(valid_CM, cm_greater); // sort valid CMs from larger to smaller CMs

std::cout << "--- All possible CMs in sorted order from larger to smaller ---" << std::endl;
std::cout << "Number of CMs = " << valid_CM.size() << std::endl;
for (CM cm : valid_CM) std::cout << cm_string(cm) << std::endl; // print sorted CMs

// find approximate solution by selecting non-overlapping CMs greedily,
// that is from larger to smaller
std::vector<CM> solution_CM;
for (CM new_cm : valid_CM) {	// all possible CMs sorted from larger to smaller
bool no_overlap = true;
for (CM sol_cm : solution_CM) {
if (cm_overlap(sol_cm, new_cm)) {	// compare whether the new CM overlap
no_overlap = false;				// with any CM already in the solution
break;
}
}
// store the new CM if it is not overlapping with any CM already in the solution
if (no_overlap) solution_CM.push_back(new_cm);
}

// Print the input matrix
std::cout << std::endl;
std::cout << "--- Input matrix ---" << std::endl;
std::cout << "  ";
for (int c=0; c<COLS; ++c) std::cout << char('a'+ c) << " ";
std::cout << std::endl;
for (int r=0; r<ROWS; ++r) {
std::cout << char('A'+ r) << " ";
Set s = input_matrix[r];
for (int c=0; c<COLS; ++c) {
std::cout << (((s&1) != 0) ? "1" : "0") << " ";
s >>= 1;
}
std::cout << std::endl;
}

// Print the solution CMs
std::cout << std::endl;
std::cout << "--- Solution CMs exactly covering the input matrix ---" << std::endl;
for (CM cm : solution_CM) std::cout << cm_string(cm) << std::endl;

std::cout << "--- End ---" << std::endl;
std::cout << std::endl;
} // test```
Last edited by wolle; July 29th, 2021 at 03:34 AM.

3. Member +
Join Date
Feb 2017
Posts
639

## Re: Find cartesians for matrix elements

I find it intriguing that the greedy algorithm I used in #17 works so well. I decided to check out whether it was just luck with the choice of input matrixes. So this time, I generate the input matrix randomly. I also randomly shuffle same-sized Cartesian matrixes while keeping the overall order from large to small. Amazingly, the greedy algorithm consistently produces solutions that are very close to optimum or even at optimum.

I've polished the code a little. I use two bit-fiddling operations that are new in C++ 20, popcount and countr_zero. They are hopefully one-instruction operations on most modern CPUs. I also try out C++ 20 ranges with a class VectorView. It works as a view onto a vector. And finally, I've simplified cm_overlap that worked but was over-complicated by mistake.

Code:
```	#include <iostream>
#include <algorithm>
#include <ranges>
#include <utility>
#include <functional>
#include <string>
#include <array>
#include <vector>
#include <random>

// Range view of a vector
// See std::ranges::view_interface
template <class T>
class VectorView : public std::ranges::view_interface<VectorView<T>> {
using VI = std::vector<T>::iterator;
public:
VectorView(VI begin, VI end) : b(std::move(begin)), e(std::move(end)) {}
VectorView(const std::vector<T>& v) : VectorView(v.begin(), v.end()) {}
VI begin() const {return b;}
VI end() const {return e;}
private:
VI b, e;
};

void test2() {
std::cout << "--- Test 2 ---" << std::endl;

// Types

using Set = uint32_t; //unsigned;	// a bitset
using CM = std::pair<Set,Set>;	// a Cartesian matrix (CM) is a <row-bitset, column-bitset> pair

// Helper functions

auto set_reverse = [](Set s, int n) { // reverse a Set s of n bits
Set r=0;
for (; --n >= 0; s >>= 1) r = (r <<= 1) | s&1;
return r;
};

// iterate the Set s and call the function f with the position of every 1-bit found
auto set_scan = [](Set s, const std::function<void(int)>& f) {
while (s!=0) {
f(std::countr_zero(s)); // count rightmost 0-bits (equals the position of the rightmost 1-bit)
s &= (s-1); // remove rightmost 1-bit of s ((s-1) never underflows)
}
};

auto set_count = [](Set s) { // count the 1-bits of the Set s
return std::popcount(s);
};

auto cm_size = [&] (CM cm) { // determine the size of CM-pair
return set_count(cm.first)*set_count(cm.second);
};

auto cm_greater = [&] (CM cm1, CM cm2) { // is the first CM-pair greater than the second?
return cm_size(cm1) > cm_size(cm2);
};

auto cm_overlap = [] (CM cm1, CM cm2) { // do the CM-pairs overlap? (share at least one row and one column)
return ((cm1.first & cm2.first) != 0 && (cm1.second & cm2.second) != 0);
};

auto cm_string = [&] (CM cm) { // generate a string representation of a CM-pair
std::string s="", t="";
set_scan(cm.first, [&](int i) {s += char('A'+ i);});
set_scan(cm.second, [&](int i) {t += char('a'+ i);});
return s + ":" + t;
};

// The internal representation of the input matrix
// is an array of ROWS rows, each holding a bitset of COLS bits
// (A matrix has the 0 column to the left, whereas a bitset has the 0 bit to the right.
// Therefore the columns of the input matrix are reversed when represented with bitsets).
/*		const int ROWS = 6;
const int COLS = 5;
std::array<Set, ROWS> input_matrix = { // example matrix 2 in post #1
set_reverse(0b01101, COLS),
set_reverse(0b01001, COLS),
set_reverse(0b11000, COLS),
set_reverse(0b11011, COLS),
set_reverse(0b10011, COLS),
set_reverse(0b00111, COLS)
};
*/
/*		const int ROWS = 4;
const int COLS = 4;
std::array<Set, ROWS> input_matrix = { // example matrix in post #10
set_reverse(0b0111, COLS),
set_reverse(0b0111, COLS),
set_reverse(0b0110, COLS),
set_reverse(0b1100, COLS),
};
*/
const int ROWS = 12;
const int COLS = 12;
std::array<Set, ROWS> input_matrix = {}; // random matrix
{	// generate input matrix[ROWS,COLS] with random 0/1 content
const unsigned seed = 42u; // change seed to get another input matrix
std::default_random_engine gen(seed);
std::uniform_int_distribution<Set> distr(0, (1<<COLS)-1); // Note: the bits become Bernoulli distributed.
for (int c=0; c<COLS; ++c) input_matrix[c] = distr(gen);
}

// Algorithm

// find all valid CMs of the input matrix
std::vector<CM> valid_CM;
for (Set r=1; r<(1<<ROWS); ++r) { // the full row powerset
for (Set c=1; c<(1<<COLS); ++c) { // the full column powerset
Set s=(1<<COLS)-1;
set_scan(r, [&](int i) {s &= input_matrix[i];});
if ((s&c) == c) {valid_CM.emplace_back(CM{r,c});}
}
}

std::ranges::sort(valid_CM, cm_greater); // sort valid CMs from larger to smaller CMs

std::vector<VectorView<CM>> view_CM;	// the valid_CM vector is divided into views where each view
{										// represents all CM-pairs of the same size.
auto it_prev = valid_CM.begin();
auto it = it_prev;
while (++it != valid_CM.end()) {
if (cm_size(*it) != cm_size(*it_prev)) {
view_CM.emplace_back(VectorView<CM>{it_prev, it});
it_prev = it;
}
}
if (it_prev != valid_CM.end()) view_CM.emplace_back(VectorView<CM>{it_prev, it});
}

std::cout << "--- All possible CMs in sorted order from larger to smaller ---" << std::endl;
std::cout << "Number of sorted CMs = " << valid_CM.size() << std::endl;
// for (CM cm : valid_CM) std::cout << cm_string(cm) << std::endl; // print sorted CMs

// Print the input matrix
std::cout << std::endl;
std::cout << "--- Input matrix ---" << std::endl;
std::cout << "  ";
for (int c=0; c<COLS; ++c) std::cout << char('a'+ c) << " ";
std::cout << std::endl;
for (int r=0; r<ROWS; ++r) {
std::cout << char('A'+ r) << " ";
Set s = input_matrix[r];
for (int c=0; c<COLS; ++c) {
std::cout << (((s&1) != 0) ? "1" : "0") << " ";
s >>= 1;
}
std::cout << std::endl;
}

std::default_random_engine gen; // random number generator (with default seeding)

const int R = 10;			// number of runs
for (int i=0; i<R; ++i) {

for (VectorView<CM> view : view_CM) std::ranges::shuffle(view, gen); // shuffle the CM-pairs inside each view

// find approximate solution by selecting non-overlapping CMs greedily,
// that is from larger to smaller
std::vector<CM> solution_CM;
for (VectorView<CM> view : view_CM) {	// all views of same-size CM-pairs
for (CM new_cm : view) {			// all CM-pairs in a view
bool no_overlap = true;
for (CM sol_cm : solution_CM) {
if (cm_overlap(sol_cm, new_cm)) {	// compare whether the new CM overlap
no_overlap = false;				// with any CM already in the solution
break;
}
}
// store the new CM if it is not overlapping with any CM already in the solution
if (no_overlap) solution_CM.push_back(new_cm);
}
}

// Print the solution CMs
std::cout << std::endl;
std::cout << "--- Solution CMs exactly covering the input matrix ---" << std::endl;
std::cout << "Number of solution CMs = " << solution_CM.size() << std::endl;
for (CM cm : solution_CM) std::cout << cm_string(cm) << std::endl;

}

std::cout << std::endl;
std::cout << "--- End ---" << std::endl;
std::cout << std::endl;
} // test 2```
Last edited by wolle; August 3rd, 2021 at 10:52 AM.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•

Click Here to Expand Forum to Full Width

Featured