In our MFC application, one of our DLLs calls "tolower" frequently. Here is some sample code:

void foo(char c)
{
    for (int i = 0; i < 200000; ++i)
    {
        aaa += tolower(c);
    }
}
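
To compare the two cases with numbers, the loop above can be wrapped in a small timer. This is a minimal sketch, not the code from the DLL: the declaration of aaa, the cast to unsigned char, and the helper name time_foo_ms are assumptions added for illustration.

```cpp
#include <cctype>
#include <chrono>

// Assumed declaration: the original snippet does not show how aaa is defined.
// volatile keeps the optimizer from removing the loop in a release build.
static volatile int aaa = 0;

void foo(char c)
{
    for (int i = 0; i < 200000; ++i)
    {
        // Cast to unsigned char: passing a negative char to tolower is undefined.
        aaa = aaa + std::tolower(static_cast<unsigned char>(c));
    }
}

// Hypothetical helper: returns the elapsed time of one call to foo, in milliseconds.
double time_foo_ms(char c)
{
    auto start = std::chrono::steady_clock::now();
    foo(c);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_foo_ms('A') once from the thread in the main process and once from the thread in the subprocess gives two figures that can be compared directly.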

I found some very strange behavior:
1) The tolower function only slows performance significantly when foo is called on a thread of the main process.
If we spawn a subprocess and then create a thread there to call foo, it runs fine.
2) The release build is even much slower than the debug build.

I have tested toupper, tolower, isalpha, and islower. All of them show the same behavior.
But if I switch to the sin function, it runs as fast as it does in the subprocess.
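
One difference between these character functions and sin is that tolower, toupper, isalpha and islower depend on the current C locale, while sin does not. The sketch below is one way to check whether the locale differs between the main process and the subprocess. The helper name current_c_locale is hypothetical, and whether the locale has anything to do with the slowdown is an assumption, not a confirmed cause.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: returns the name of the C locale currently in effect.
std::string current_c_locale()
{
    // Passing a null pointer queries the locale without changing it.
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "(unknown)";
}
```

Logging the result of current_c_locale() just before calling foo in each process would show whether the two runs use the same locale.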

I am using VS2012, MFC.