Trying to garner a feel for the issue with arrays and contiguous block of memory. So now consider

Code:
// definition of class Matrix
template <class T> class Matrix
{
   protected:
      // Data Fields
      int rowNum, colNum;
      Vector<T> *mat;

  // more stuff
};

template <class T> Matrix<T>::Matrix(int r, int c)
   : rowNum(r), colNum(c), mat(new Vector<T>[r])
{
   assert(mat != NULL);
   for (int i=0; i<r; i++) mat[i].resize(c);
}

//  Here's resize that's being called in Matrix constructor.

template <class T> void Vector<T>::resize(int length)
{
   int i;
   T zero(0);
   T *newData = new T[length]; assert(newData != NULL);

   if (length <= size)
      for (i=0; i<length; i++) newData[i] = data[i];
   else
   {
      for (i=0; i<size; i++)   newData[i] = data[i];

      for (i=size; i<length; i++) newData[i] = zero;
   }
   delete[] data;
   size = length;
   data = newData;
}
Further consider a 2 x 2 matrix type T. The implementation allocates 2 rows then resizes said rows by 2 T.

This is where things start to go wrong.
"mat(new Vector<T>[r])" (with r=2) allocates enough space to store 2 objects of type Vector<T>. If Vector<T> is implemented as a resizable array, then enough space is allocated for 2 * (length indicator + pointer). Assuming a
typical 32-bit platform, this will be 2*(4+4) = 16 octets.

The calls to resize() in the Matrix<T> constructor make that the pointer members of the corresponding Vector<T> objects each point at a fresh piece of memory that is large enough to hold 2 objects of type T.

The memory is in now way contiguous. In essence the rows and coloumns could look like

Code:
              (2* T)                   (2* T)
 +---------+---------+        +---------+---------+
 |@0x300|@0x304|        |@0x220|@0x224|
 |             |            |        |             |             |
 |             |            |        |             |             |
 +---------+---------+        +---------+---------+
Assumign a 32 bit machine where the sizeof (int) is 4 bytes.

I could envision the performance loss here ( as opposed to allocating the length in one shot) but really is it even worth cosidering?