-
September 22nd, 2020, 04:37 AM
#1
Looking for SSE instruction
Hi,
I'm looking for an SSE instruction that I suspect exists somewhere in the instruction set.
If I have 4 arrays, A, B, C and D:
[A3] [B3] [C3] [D3]
[A2] [B2] [C2] [D2]
[A1] [B1] [C1] [D1]
[A0] [B0] [C0] [D0]
I'm looking for an instruction that merges these arrays into a single array:
[A0][B0][C0][D0] [A1][B1][C1][D1] [A2][B2][C2][D2] [A3][B3][C3][D3]
Does such an instruction exist?
Or is the best approach to simply do something along the lines of:
Code:
int target[16];
for (int i = 0; i < 4; i++)
{
    target[i * 4] = A[i];
    target[i * 4 + 1] = B[i];
    target[i * 4 + 2] = C[i];
    target[i * 4 + 3] = D[i];
}
-
September 22nd, 2020, 06:24 AM
#2
Re: Looking for SSE instruction
IMHO, "simply do something along the lines" is what you need.
Victor Nijegorodov
-
September 22nd, 2020, 06:35 AM
#3
Re: Looking for SSE instruction
Is performance an issue here? You'd need to measure, but you might get better performance from having 4 loops - one for each of A, B, C and D. But for 16 items, you wouldn't notice or easily measure any difference. Try it using godbolt.
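As a rough sketch of the four-loop idea, assuming A, B, C and D each hold 4 ints and target holds 16 (the function name interleave_four_loops is made up for illustration, and whether this beats the single loop is something only measurement can tell):

```cpp
#include <cstddef>

// Hypothetical helper: one strided loop per source array instead of a single
// loop that touches all four arrays on every iteration.
// Assumes A, B, C, D each point to 4 ints and target points to 16 ints.
void interleave_four_loops(const int* A, const int* B, const int* C,
                           const int* D, int* target)
{
    for (std::size_t i = 0; i < 4; ++i) target[i * 4]     = A[i];
    for (std::size_t i = 0; i < 4; ++i) target[i * 4 + 1] = B[i];
    for (std::size_t i = 0; i < 4; ++i) target[i * 4 + 2] = C[i];
    for (std::size_t i = 0; i < 4; ++i) target[i * 4 + 3] = D[i];
}
```

Each loop reads one array sequentially and writes with a stride of 4, so the output layout is the same as in the original single loop.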
All advice is offered in good faith only. All my code is tested (unless stated explicitly otherwise) with the latest version of Microsoft Visual Studio (using the supported features of the latest standard) and is offered as examples only - not as production quality. I cannot offer advice regarding any other c/c++ compiler/IDE or incompatibilities with VS. You are ultimately responsible for the effects of your programs and the integrity of the machines they run on. Anything I post, code snippets, advice, etc is licensed as Public Domain https://creativecommons.org/publicdomain/zero/1.0/ and can be used without reference or acknowledgement. Also note that I only provide advice and guidance via the forums - and not via private messages!
C++23 Compiler: Microsoft VS2022 (17.6.5)
-
September 22nd, 2020, 08:52 AM
#4
Re: Looking for SSE instruction
Originally Posted by 2kaud
Is performance an issue here? You'd need to measure, but you might get better performance from having 4 loops - one for each of A, B, C and D. But for 16 items, you wouldn't notice or easily measure any difference. Try it using godbolt.
Performance is an issue; however, I will indeed just measure the different approaches.
I was just wondering whether there was a dedicated instruction for this.
-
September 22nd, 2020, 10:54 AM
#5
Re: Looking for SSE instruction
If you're using VS, have a look at https://docs.microsoft.com/en-us/cpp...s?view=vs-2019 and the links to the manufacturer-specific ones.
There might be something here of interest.
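As far as I know there is no single SSE instruction that does the whole merge, but the layout asked for is a 4x4 transpose, which can be built from the SSE2 unpack intrinsics in <emmintrin.h>. The following is only a sketch under some assumptions: an x86 target with SSE2, 32-bit int, and A, B, C, D each holding 4 ints. The function name interleave_sse2 is made up for illustration. For float data, the _MM_TRANSPOSE4_PS macro from <xmmintrin.h> does a similar job.

```cpp
#include <emmintrin.h> // SSE2 intrinsics

// Hypothetical helper: A, B, C, D each point to 4 ints, target to 16 ints.
// Unaligned loads and stores are used, so no alignment is required.
void interleave_sse2(const int* A, const int* B, const int* C,
                     const int* D, int* target)
{
    const __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(A)); // A0 A1 A2 A3
    const __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(B)); // B0 B1 B2 B3
    const __m128i c = _mm_loadu_si128(reinterpret_cast<const __m128i*>(C)); // C0 C1 C2 C3
    const __m128i d = _mm_loadu_si128(reinterpret_cast<const __m128i*>(D)); // D0 D1 D2 D3

    // First pass: interleave 32-bit lanes of A with B, and of C with D.
    const __m128i ab_lo = _mm_unpacklo_epi32(a, b); // A0 B0 A1 B1
    const __m128i ab_hi = _mm_unpackhi_epi32(a, b); // A2 B2 A3 B3
    const __m128i cd_lo = _mm_unpacklo_epi32(c, d); // C0 D0 C1 D1
    const __m128i cd_hi = _mm_unpackhi_epi32(c, d); // C2 D2 C3 D3

    // Second pass: interleave 64-bit halves to complete each output row.
    const __m128i r0 = _mm_unpacklo_epi64(ab_lo, cd_lo); // A0 B0 C0 D0
    const __m128i r1 = _mm_unpackhi_epi64(ab_lo, cd_lo); // A1 B1 C1 D1
    const __m128i r2 = _mm_unpacklo_epi64(ab_hi, cd_hi); // A2 B2 C2 D2
    const __m128i r3 = _mm_unpackhi_epi64(ab_hi, cd_hi); // A3 B3 C3 D3

    __m128i* out = reinterpret_cast<__m128i*>(target);
    _mm_storeu_si128(out + 0, r0);
    _mm_storeu_si128(out + 1, r1);
    _mm_storeu_si128(out + 2, r2);
    _mm_storeu_si128(out + 3, r3);
}
```

That is 4 loads, 8 unpacks and 4 stores for the 16 values. Whether it is actually faster than the plain loop for your data would still need to be measured.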
-
September 22nd, 2020, 11:43 AM
#6
Re: Looking for SSE instruction
Originally Posted by rmirani
Performance is an issue
You could replace the loop with 16 straight assignments, like this:
Code:
target[0] = A[0];
target[1] = B[0];
target[2] = C[0];
target[3] = D[0];
target[4] = A[1];
target[5] = B[1];
target[6] = C[1];
target[7] = D[1];
// ... 8 more ....
An optimizing compiler will probably "unroll" the loop like this anyway, but if you do it yourself, you know for sure that it gets done.
Last edited by wolle; September 22nd, 2020 at 11:51 AM.