To use the aligned SSE load and store instructions, the data must be 16-byte aligned.
I have aligned the array using __declspec(align(16)) int array[16][32];

I am loading data using "_mm_load_si128((__m128i*)&array[0][0]);". In this case the address is 16-byte aligned and the load works,
but when I load starting from the next column, like "_mm_load_si128((__m128i*)&array[0][1])", the program crashes because the access is unaligned: with a 4-byte int, &array[0][1] is 4 bytes past the 16-byte boundary.
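To make the situation concrete, here is a minimal sketch of the address arithmetic. It assumes a 4-byte int, and the ALIGN16 macro and the helpers is_aligned16 and load4 are hypothetical names introduced only for this illustration. Since each row is 32 * 4 = 128 bytes, every row starts on a 16-byte boundary, so only columns that are a multiple of 4 are 16-byte aligned. The sketch falls back to _mm_loadu_si128, the unaligned SSE2 load, for the other columns; that is one possible workaround, not necessarily the best one.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_load_si128, _mm_loadu_si128 */
#include <stdint.h>

/* ALIGN16 is a hypothetical convenience macro for this sketch. */
#if defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16 __attribute__((aligned(16)))  /* GCC/Clang spelling */
#endif

static ALIGN16 int array[16][32];

/* Hypothetical helper: returns 1 if p lies on a 16-byte boundary. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p % 16) == 0;
}

/* Hypothetical helper: loads the 4 ints starting at array[row][col].
   Uses the aligned load when the address allows it and the
   unaligned load otherwise. */
static __m128i load4(int row, int col) {
    const int *p;
    assert(row >= 0 && row < 16);
    assert(col >= 0 && col + 4 <= 32);  /* stay inside the row */
    p = &array[row][col];
    if (is_aligned16(p))
        return _mm_load_si128((const __m128i *)p);
    return _mm_loadu_si128((const __m128i *)p);
}
```

With this, load4(0, 0) and load4(0, 4) take the aligned path, while load4(0, 1) takes the unaligned path instead of crashing. The unaligned load may be slower on some processors, so I am not sure it is acceptable in an inner loop.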

So my question is: I have aligned the array to 16 bytes, but does every address I load from or store to still have to be 16-byte aligned in order to use the aligned versions of the load and store instructions?

Can anyone please tell me whether there is another way to solve this problem?
Please reply if you need more details.

Thanking you,
Mandar