-
August 12th, 2015, 01:03 AM
#1
xgetbv
Using MSVC8, I'm compiling some code that was originally written for Linux. It contains a conventional 'C' function called _xgetbv which looks like this:-
Code:
static uint64_t
_xgetbv (uint32_t xcr)
{
uint32_t eax, edx;
__asm__ volatile ("xgetbv" : "=a" (eax), "=d" (edx) : "c" (xcr));
return (static_cast<uint64_t>(edx) << 32) | eax;
}
I'm guessing that the intention is either to set (or return) the value from some CPU register. However, the above syntax isn't recognised by MSVC. Does anyone know how I could convert this to something which MSVC8 understands?
"A problem well stated is a problem half solved." - Charles F. Kettering
-
August 12th, 2015, 07:06 AM
#2
Re: xgetbv
For VS (needs 2010 or later):
All of the AVX extensions are already available as intrinsics in the compiler, so you do not need to define or implement them yourself.
This is particularly handy because if you use AVX you probably also want to go x64, and the x64 compiler doesn't support __asm blocks.
Used right, the intrinsics will let you do all of your MMX/SSE/SIMD/AVX work without needing native assembly (or __asm blocks), and you'll get the same kind of performance.
That said, if you have highly parallelizable calculations, a pure asm implementation could beat the pants off a C/C++ solution, but it'll take some serious effort to get there.
Anyway, as I said, for VS 2010 or later:
#include <intrin.h>
You can then just use _xgetbv(), declared in the header as
Code:
extern unsigned __int64 __cdecl _xgetbv(unsigned int ext_ctrl_reg);
The only currently supported/defined value for the control register index is _XCR_XFEATURE_ENABLED_MASK (or 0).
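As a rough sketch of what you would typically do with the value that comes back for index 0: bit 1 of that register says the OS saves the SSE (XMM) state and bit 2 says it saves the AVX (upper YMM) state, and both need to be set before AVX code is safe to run. The constant names and the helper os_saves_avx_state below are made up for illustration, they are not part of intrin.h.

```cpp
#include <cstdint>

// Bit positions in XCR0 (the register selected by index 0).
// These names are hypothetical, not taken from any header.
constexpr std::uint64_t kXcr0SseState = 1ull << 1;  // OS saves XMM registers
constexpr std::uint64_t kXcr0AvxState = 1ull << 2;  // OS saves upper halves of YMM registers

// Hypothetical helper: true only when both the SSE and AVX state bits are set.
constexpr bool os_saves_avx_state(std::uint64_t xcr0)
{
    return (xcr0 & (kXcr0SseState | kXcr0AvxState))
           == (kXcr0SseState | kXcr0AvxState);
}
```

With the intrinsic available, the call would look something like os_saves_avx_state(_xgetbv(0)). Only execute xgetbv after CPUID has reported OSXSAVE (leaf 1, ECX bit 27), otherwise the instruction faults.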
If you need this in a VS older than VS2010, you're out of luck as far as built-in support goes: __asm doesn't support the instruction and there's no intrinsic. You can emit the opcode bytes directly, but this will be a regular function rather than the more optimal intrinsic.
Code:
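unsigned __int64 __cdecl XGETBV(unsigned int ext_ctrl_reg)
{
    // xgetbv takes the control register index in ecx.
    __asm {
        mov ecx, [ext_ctrl_reg]
        __asm _emit 0x0f __asm _emit 0x01 __asm _emit 0xd0 /*xgetbv*/
    }
    // No return statement on purpose: the result is left in edx:eax.
}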
Last edited by OReubens; August 12th, 2015 at 07:18 AM.
-
August 13th, 2015, 03:32 AM
#3
Re: xgetbv
Thanks O'Reubens. For MSVC8 I used your last example and it compiled okay. Just one question though... will it return a value? The original code did this:-
Code:
return (static_cast<uint64_t>(edx) << 32) | eax;
Would I need something similar with your code?
-
August 13th, 2015, 07:23 AM
#4
Re: xgetbv
xgetbv returns its result in edx:eax,
which is exactly where a 32-bit C function returning a uint64 is expected to leave its return value, so no extra action is needed.
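To make the edx:eax convention concrete, here is a small sketch of the combination the original Linux code did by hand. The helper name join_edx_eax is made up for illustration; with the _emit version you don't need to call anything like it, since the registers already hold the two halves in the right places.

```cpp
#include <cstdint>

// Hypothetical helper: edx holds the high 32 bits, eax the low 32 bits.
constexpr std::uint64_t join_edx_eax(std::uint32_t edx, std::uint32_t eax)
{
    return (static_cast<std::uint64_t>(edx) << 32) | eax;
}
```

So a result with edx = 0 and eax = 7 is simply the 64-bit value 7.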
Side note: because the instruction is generated with _emit, the compiler can't know that eax and edx were used, so it will assume it can still use those two registers for its own purposes. If you use the _emit solution, you HAVE TO put the source code in a separate .cpp and compile that .cpp without optimization (it can have other stuff in there, but the entire .cpp needs to be compiled with optimizations off).
This means the compiler is forced to generate a call to the function rather than trying to inline it (and making wrong assumptions about the usage of eax and edx).
I sort of assumed you were aware of this pitfall when dealing with _emit, but it just occurred to me that most people probably don't know this at all.
It's a nasty one, since debug builds typically have optimizations off, so it would work there and cause weird behaviour in release builds only.
-
August 13th, 2015, 09:51 AM
#5
Re: xgetbv
Originally Posted by OReubens
If you use the solution with _emit, you HAVE TO put the source code in a separate .cpp and compile that .cpp without optimization (it can have other stuff in there, but the entire .cpp needs to be compiled with optimizations off).
This means the compiler is forced to generate a call to the function rather than trying to inline it
Thanks again. Alternatively, could I declare the function using __declspec(noinline)?
-
August 14th, 2015, 07:14 AM
#6
Re: xgetbv
No, preventing inlining isn't strong enough. The compiler can still assume that eax/edx are not changed by the call to the function, so it could do something like:
Code:
mov eax, 14
... more code using eax having value 14
call xgetbv // compiler assumes no registers changed other than ecx, plus any registers it altered and didn't preserve in the prolog and epilog it added to the function.
...
... use eax, still assuming it's 14.
So it needs to be compiled with optimizations off (that includes disabling link-time code generation for at least that file),
and it needs to be in a separate compilation unit, so the compiler can't make assumptions about what happens inside that function.