Handle Problem - Not Quite Understanding What's Going On


wdolson
June 4th, 2010, 12:11 AM
I've been stuck on this problem for several days. The behavior was consistent with an out-of-bounds array error, but I'm even more confused now. Hopefully I can explain this in a way someone else can understand.

This system has a number of different EXEs that run at the same time and share information. (Note that this is an ancient program; I found comments in the code dated 1986.)

There is a block of shared memory allocated when you open a certain type of file. Later on, the program tries to get that memory for an operation and fails. It appears to succeed a random number of times and then fail.

I ran Application Verifier. It doesn't always trigger an error, but when it does, the log says this:


<avrf:logEntry Time="2010-06-03 : 18:09:08" LayerName="Memory" StopCode="0x61A" Severity="Error">
<avrf:message>Unmapping memory block with invalid start address.</avrf:message>
<avrf:parameter1>4090000 - Address of memory block being unmapped.</avrf:parameter1>
<avrf:parameter2>3d30000 - Expected correct memory block address.</avrf:parameter2>

3d30000 is the address returned from the initialization. 4090000 points off into the ether.

Here is the initialization (I removed some code that wasn't pertinent):

hFamiliesLoaded = ttw_AllocSharedMemory(FamilyTableMemMapName,0x40 * sizeof(DMLoadedFamily), 1);
memFamiliesLoaded = ttw_MapSharedMemory(hFamiliesLoaded, 0x40*sizeof(DMLoadedFamily));

typedef struct LoadedFamilyStruct
{
    char lf_name[132]; // Name of this Family,
    DMRecordDir lf_dirbuf[0x1000]; /* Directory buffer */
    BYTE lf_dirbufOK; /* TRUE if lf_dirbuf is initialized */
    BYTE lf_locked; // Directory lock to prevent multiple use
    WORD lf_dirmark; /* Marker into the directory */
    BYTE lf_famtype; /* File type number - index into FamDescList*/
    BYTE lf_fidnum; /* fid number of this entry */
    WORD lf_DirEntries[0x1000]; /* List of entries in dirbuf */
    char lf_logname[132]; /* Name of this Trace Family.*/
} DMLoadedFamily;

typedef struct
{
    DWORD size;
    DWORD flags;
    char* viewAddress;
} SharedMemoryHeader;

HANDLE ttw_AllocSharedMemory(char* NameTag, DWORD msize, BOOL zeroMem)
{
    HANDLE hsm;
    char* addr;

    if (!(hsm = CreateFileMapping((HANDLE)0xffffffff, NULL, PAGE_READWRITE,0, msize+sizeof(SharedMemoryHeader), NameTag)))
        return NULL;

    if (!(addr = MapViewOfFile(hsm, FILE_MAP_ALL_ACCESS, 0, 0, 0)))
    {
        CloseHandle(hsm);
        return NULL;
    }

    memset(addr, 0, zeroMem ? msize + sizeof(SharedMemoryHeader) : sizeof(SharedMemoryHeader));

    ((SharedMemoryHeader*)addr)->size = msize;

    if (zeroMem)
        ((SharedMemoryHeader*)addr)->flags |= SHARED_MEMORY_ZEROMEM;

    UnmapViewOfFile(addr);

    return hsm;
}

char* ttw_MapSharedMemory(HANDLE hsm, DWORD viewSize)
{
    char* addr;

    if (!hsm)
        return NULL;

    // if memory has been locked, wait til it is free
    while(1)
    {
        if (!(addr = MapViewOfFile(hsm, FILE_MAP_ALL_ACCESS, 0, 0, viewSize)))
            return NULL;

        if ( ((SharedMemoryHeader*)addr)->flags & SHARED_MEMORY_LOCKED )
        {
            UnmapViewOfFile(addr);
            Sleep(20);
        }
        else
            break;
    }

    ((SharedMemoryHeader*)addr)->flags |= SHARED_MEMORY_LOCKED;
    ((SharedMemoryHeader*)addr)->viewAddress = addr + sizeof(SharedMemoryHeader);

    return addr + sizeof(SharedMemoryHeader);
}


When the code is trying to get an address, it goes south in the following functions:


BOOL UnLockDirMemory(HGLOBAL hMem)
{
    char* view;
    BOOL memLocked;

    hMem = ttw_accessLM_SharedHandle(hMem, SM_HANDLE_FROM_LIST_MANAGER,!SM_DELETE_SOURCE_HANDLE);

    // get the locked memory view
    view = ttw_GetLockedLocalView(hMem, &memLocked);
    CloseHandle(hMem);

    // can't unlock if memory wasn't locked
    if (!memLocked)
        return FALSE;

    return ttw_UnMapSharedMemory(view);
}

HANDLE ttw_accessLM_SharedHandle(HANDLE smh, BOOLEAN fromListManager, BOOLEAN CloseSourceHandle)
{
    DWORD pId, err=FALSE, flags;
    HANDLE targetHandle=NULL, process1=NULL, process2=NULL;

    if (!smh)
        return NULL;

    pId = GetCurrentProcessId();
    process1 = OpenProcess(PROCESS_DUP_HANDLE, FALSE, pId);

    if (process1)
    {
        GetWindowThreadProcessId((HWND)ttw_GetLMInfo(1), &pId);
        process2 = OpenProcess(PROCESS_DUP_HANDLE, FALSE, pId);
    }

    flags = DUPLICATE_SAME_ACCESS;
    if (CloseSourceHandle)
        flags |= DUPLICATE_CLOSE_SOURCE;

    if (process2)
        if (fromListManager)
        {
            // converting a shared handle created and sent from LM
            if (DuplicateHandle(process2,smh,process1,&targetHandle,0,FALSE,flags))
                return targetHandle;
        }
        else
        {
            // converting a shared handle created here for LM
            if (DuplicateHandle(process1,smh,process2,&targetHandle,0,FALSE,flags))
                return targetHandle;
        }

    return NULL;
}

char* ttw_GetLockedLocalView(HANDLE hMem, BOOL* MemLocked)
{
    char *addr, *view;

    if (MemLocked)
        *MemLocked = FALSE;

    if (!hMem)
        return NULL;

    if (!(addr = MapViewOfFile(hMem, FILE_MAP_ALL_ACCESS, 0, 0, sizeof(SharedMemoryHeader))))
        return NULL;

    if (MemLocked)
        if (((SharedMemoryHeader*)addr)->flags & SHARED_MEMORY_LOCKED)
            *MemLocked = TRUE;

    view = ((SharedMemoryHeader*)addr)->viewAddress;

    UnmapViewOfFile(addr);

    return view;
}


memFamiliesLoaded is a global holding the address of the buffer. I'm not quite sure why view isn't just set to memFamiliesLoaded. With all the complexity in getting view when trying to read the memory, I may be missing something.

Edit: It looks like I was wrong about the code that allocates the block in question. That block is allocated elsewhere, and I haven't found the allocation code yet. I think the general principle is the same though. When I run the code through the debugger before it goes south, the address recovered is always the same; then it eventually returns a bad address. At initialization, I expect there is another structure saved with a global pointer like memFamiliesLoaded.
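
One detail in the posted code that may be worth checking: ttw_MapSharedMemory stores viewAddress, a pointer that is only meaningful in the process that called MapViewOfFile, inside the header that every process shares. If two processes end up mapping the block at overlapping times, ttw_GetLockedLocalView could return an address that belongs to another process's mapping. That would be consistent with the "invalid start address" stop in the log, although nothing in the thread confirms it is the cause. A minimal sketch of a common alternative, storing an offset and rebuilding the pointer from the local view base, follows. It is plain C with no Win32 calls, and OffsetHeader, init_header, payload_from_view and view_from_payload are hypothetical names, not part of the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant of SharedMemoryHeader: the process-local
   viewAddress pointer is replaced by an offset from the view base. */
typedef struct {
    uint32_t size;       /* payload size in bytes */
    uint32_t flags;      /* lock / zero-fill bits */
    uint32_t dataOffset; /* payload offset from the start of the view */
} OffsetHeader;

/* Initialise the header at the start of a freshly mapped view. */
static void init_header(void *viewBase, uint32_t payloadSize)
{
    OffsetHeader *h = (OffsetHeader *)viewBase;
    assert(viewBase != NULL);
    h->size = payloadSize;
    h->flags = 0;
    h->dataOffset = (uint32_t)sizeof(OffsetHeader);
}

/* Payload address for the caller's own mapping. The result depends
   only on the local base, so it stays valid in whichever process
   computes it. */
static char *payload_from_view(void *viewBase)
{
    OffsetHeader *h = (OffsetHeader *)viewBase;
    assert(viewBase != NULL);
    return (char *)viewBase + h->dataOffset;
}

/* Inverse: recover the view base that an unmap call would need. */
static void *view_from_payload(char *payload)
{
    assert(payload != NULL);
    return payload - sizeof(OffsetHeader);
}
```

With this layout each process keeps the base address returned by its own mapping call and derives the payload pointer locally, so nothing process-specific is ever written into the shared block.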

wdolson
June 4th, 2010, 05:29 AM
Issue appears to be resolved. It was a dual-processor problem. This is a suite of programs that communicate with one another. It ran fine in the 16-bit version because 16-bit programs all run under one process (if I remember correctly). When we converted to 32-bit, each program got its own process, and when those processes were spread across processors the ancient handshaking mechanism between programs appears to have been hanging.

I set processor affinity to processor 0 for all programs and ran several trials with no crashes.

MrViggy
June 4th, 2010, 04:17 PM
That's not a great solution. Essentially, this just hides the problem. You really should try to beef up the synchronization.

Viggy
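
To illustrate what tightening the synchronization could look like: in the code posted above, ttw_MapSharedMemory tests SHARED_MEMORY_LOCKED and then sets it in a separate statement, so two processes running on different processors can both see the flag clear and both claim the block. The sketch below makes the test and the set a single atomic step using C11 atomics. It is only a sketch under stated assumptions, not the program's code: LockWord, SHM_LOCKED, try_lock and unlock are hypothetical names, and in the Win32 code base the analogous primitives would be InterlockedExchange or InterlockedCompareExchange, or a named mutex created with CreateMutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SHARED_MEMORY_LOCKED. */
#define SHM_LOCKED 0x1u

/* Hypothetical lock word that would live in the shared header. */
typedef struct {
    atomic_uint flags;
} LockWord;

/* Try to take the lock. atomic_fetch_or sets the bit and returns the
   previous value as one indivisible operation, so exactly one caller
   can observe the bit as previously clear. */
static bool try_lock(LockWord *w)
{
    assert(w != NULL);
    unsigned prev = atomic_fetch_or(&w->flags, SHM_LOCKED);
    return (prev & SHM_LOCKED) == 0;
}

/* Release the lock by clearing the bit atomically. */
static void unlock(LockWord *w)
{
    assert(w != NULL);
    atomic_fetch_and(&w->flags, ~SHM_LOCKED);
}
```

A caller would loop on try_lock, sleeping between attempts as the original does with Sleep(20), and call unlock when done. Unlike the check-then-set sequence, a failed attempt leaves the flag exactly as the current owner set it.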

wdolson
June 4th, 2010, 06:05 PM
Most of the code that has the problem will be going away in six months. It isn't cost-effective to rebuild the synchronization system and then have to rewrite it again in six months.

This package has its own file system that was created to get around the 8.3 naming limitations in DOS. We'll be getting rid of that and replacing it with the native Win32 file system. That's not immediately up on the schedule though.

There are a few places where it does handshaking that isn't related to the file system, but I will deal with those when I redo the file system.

Bill