October 9th, 2009, 10:22 PM
How to create and manipulate Terabyte size Arrays with Win32API
If you are like me, working on serious scientific projects, then chances are you have to deal with huge data sets, data that does not fit into your tiny 2 or 4 GB of RAM. If you are looking for a way of creating and accessing very large arrays, arrays that can hold content on the order of terabytes, then you might find useful a technique I developed some time back, described at http://gpalem.web.officelive.com/TerabyteArrays.html.
It's a simple Win32 file mapping concept, but very powerful when used in the right way with C++ templates. File mapping is the association of a file's contents with a portion of the virtual address space of a process. It allows the process to work efficiently with large data files without having to map the whole file into memory. Processes read from and write to the file view using pointers, just as they would with dynamically allocated memory. This is efficient because, although the file resides on disk, the file view resides in memory, with paging in and out happening seamlessly behind the scenes.
In a typical 32-bit Windows (Win32) environment, each application has 4 gigabytes (GB) of virtual address space. By default, this space is divided so that 2 GB is available to the application and the other 2 GB is available only to the system. The size of a file mapping object that is backed by a named file is limited by disk space. The size of a file view is limited to the largest available contiguous block of unreserved virtual memory, which is at most 2 GB minus the virtual memory already reserved by the process.
To create a huge array, create multiple temporary files, each of size _MaxFileSize (any convenient maximum file size, such as 2 GB), and access their content by mapping and unmapping portions of them into memory as required. Creating the temporary files and mapping/unmapping them can be handled with the CreateFile, CreateFileMapping, MapViewOfFile, and UnmapViewOfFile APIs. A simple implementation of this and sample code can be found at: http://gpalem.web.officelive.com/TerabyteArrays.html
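To make the multi-file layout concrete, here is a minimal sketch of the index arithmetic only. It is not the code from the linked page. The names kMaxFileSize, kViewSize, Location, locate and filesNeeded, and the 64 MiB view size, are my own assumptions for illustration. The sketch translates a global element index into the file that holds it, the offset at which a view would be mapped, and the position inside that view. MapViewOfFile requires the view offset to be a multiple of the system allocation granularity (typically 64 KiB, reported by GetSystemInfo), which a 64 MiB view size satisfies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout parameters; these values are illustrative assumptions.
constexpr std::uint64_t kMaxFileSize = 1ull << 31;  // 2 GiB per backing file
constexpr std::uint64_t kViewSize    = 1ull << 26;  // 64 MiB mapped at a time

struct Location {
    std::uint64_t fileIndex;   // which temporary file holds the element
    std::uint64_t viewOffset;  // byte offset of the view inside that file
    std::uint64_t byteInView;  // byte offset of the element inside the view
};

// Translate a global element index into file/view/byte coordinates.
// Assumes elemSize divides kViewSize, so no element straddles a view or file.
inline Location locate(std::uint64_t index, std::uint64_t elemSize) {
    assert(elemSize != 0 && kViewSize % elemSize == 0);
    const std::uint64_t bytePos = index * elemSize;        // position in the whole array
    const std::uint64_t inFile  = bytePos % kMaxFileSize;  // position in its file
    return { bytePos / kMaxFileSize,
             inFile - inFile % kViewSize,                  // round down to a view boundary
             inFile % kViewSize };
}

// Number of backing files needed to hold `count` elements of `elemSize` bytes.
inline std::uint64_t filesNeeded(std::uint64_t count, std::uint64_t elemSize) {
    const std::uint64_t bytes = count * elemSize;
    return (bytes + kMaxFileSize - 1) / kMaxFileSize;      // round up
}
```

With these assumed sizes, an array accessor would call locate(i, sizeof(T)), make sure the view at viewOffset in file fileIndex is currently mapped (unmapping the previous view if it is a different one), and then read or write at byteInView. A 1 TiB array would be spread over filesNeeded(1ull << 37, 8) backing files of 8-byte elements.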
Creator of CFugue, the world's first and only high level Music Note Programming Library for C/C++
October 12th, 2009, 02:36 AM
Re: How to create and manipulate Terabyte size Arrays with Win32API
File mapping hands all the work of reading and writing the file over to the operating system, and that usually gives you poor efficiency.