-
July 6th, 2012, 10:08 AM
#1
Efficient way to write huge boost dynamic_bitset vector to a file and read it back
I have a huge vector of boost dynamic_bitset. I want to write the dynamic_bitset vector to a file and later read the file back into a dynamic_bitset vector. Is the memory for dynamic_bitset allocated as a contiguous block (so that I can write the entire vector at once without traversing it)?
The size of the bitset vector is on the order of millions, so I am looking for an efficient way to write it to a file instead of iterating through the elements.
I converted the dynamic_bitset to a string and then wrote the string to a file. Later I read the string from the file and converted it back to a dynamic_bitset.
Below is the code I wrote in C++ using Visual Studio:
Code:
#include "stdafx.h"
#include <iostream>
#include <fstream>
#include <string>
#include <boost/dynamic_bitset.hpp>
using namespace std;
int main(int argc, char* argv[])
{
// Initializing a bitset vector to 0s
boost::dynamic_bitset<> bit_vector(10000000);
bit_vector[0] = 1; bit_vector[1] = 1; bit_vector[4] = 1; bit_vector[7] = 1; bit_vector[9] = 1;
cout<<"Input :"<<bit_vector<<endl; //Prints in reverse order
//Converting dynamic_bitset to a string
string buffer;
to_string(bit_vector, buffer);
//Writing the string to a file
ofstream out("file", ios::out | ios::binary);
char *c_buffer = (char*)buffer.c_str();
out.write(c_buffer, strlen(c_buffer));
out.close();
//Find length of the string and reading from the file
int len = strlen(c_buffer);
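char* c_bit_vector = new char(len+1); // Bug: allocates ONE char initialized to len+1, not an array; new char[len+1] is what the read below needs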
ifstream in;
in.open("file", ios::binary);
in.read(c_bit_vector, len);
c_bit_vector[len] = 0;
in.close();
//Converting string back to dynamic_bitset
string str2 = c_bit_vector;
boost::dynamic_bitset<> output_bit_vector( str2 );
cout<<"Output:"<<output_bit_vector<<endl;
system("PAUSE");
return 0;
}
But even this method, storing it as a string, takes a long time to write to the file. And when I try to read back from the file into the string, I get an "unhandled access violation exception". Is there a more efficient way to implement the same?
-
July 6th, 2012, 10:25 AM
#2
Re: Efficient way to write huge boost dynamic_bitset vector to a file and read it back
Originally Posted by SyncMr
Is the memory for dynamic_bitset allocated as a contiguous block of memory (so that I can write the entire vector at once without traversing) ?
This is an implementation detail that you should not rely on. Besides, even if it were a contiguous block of memory, you would still need to traverse it to write it.
Originally Posted by SyncMr
I converted the dynamic_bitset to a string and then wrote the string to a file. Later read the string from the file and converted it back to dynamic_bitset.
Instead of to_string, have you tried to_block_range? Then you use from_block_range to read it. Granted, this will only reduce your storage by a constant factor, but unless you can exploit special characteristics of your data (i.e., compress it), I don't see how you can do any better.
-
July 9th, 2012, 06:16 AM
#3
Re: Efficient way to write huge boost dynamic_bitset vector to a file and read it back
My suggestion is to go with boost serialization.
Code:
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <string>
#include <vector>
#include <boost/config.hpp>
#include <boost/dynamic_bitset.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/serialization/split_free.hpp>
#include <boost/serialization/vector.hpp>
//////////////////////////////////////////////////////////////////////////
namespace boost
{
namespace serialization
{
// --------------------------------------------------------------------
template < class Archive , typename Block , typename Allocator >
inline void save( Archive & ar , boost::dynamic_bitset< Block , Allocator > const & t , const unsigned int /* version */ )
{
// Serialize bitset size
std::size_t size = t.size();
ar << size;
// Convert bitset into a vector
std::vector< Block > v( t.num_blocks() );
to_block_range( t, v.begin() );
// Serialize vector
ar & v;
}
// --------------------------------------------------------------------
template < class Archive , typename Block , typename Allocator >
inline void load( Archive & ar, boost::dynamic_bitset< Block , Allocator > & t, const unsigned int /* version */ )
{
std::size_t size;
ar & size;
t.resize( size );
// Load vector
std::vector< Block > v;
ar & v;
// Convert vector into a bitset
from_block_range( v.begin() , v.end() , t );
}
// --------------------------------------------------------------------
template <class Archive, typename Block, typename Allocator>
inline void serialize( Archive & ar, boost::dynamic_bitset<Block, Allocator> & t, const unsigned int version )
{
boost::serialization::split_free( ar, t, version );
}
// --------------------------------------------------------------------
} // namespace serialization
} // namespace boost
//////////////////////////////////////////////////////////////////////////
using namespace std;
int main(int argc, char* argv[])
{
boost::dynamic_bitset<> bs1(10000000);
bs1[0] = 1; bs1[1] = 1; bs1[4] = 1; bs1[7] = 1; bs1[9] = 1;
std::string file("filename");
{ // Saving...
std::ofstream ofs( file.c_str(), std::ios::binary ); // binary archives need a binary-mode stream
boost::archive::binary_oarchive oa( ofs );
oa << bs1;
}
boost::dynamic_bitset<> bs2;
{ // Loading...
std::ifstream ifs( file.c_str(), std::ios::binary ); // must match the mode used when saving
boost::archive::binary_iarchive ia( ifs );
ia >> bs2;
}
system("PAUSE");
return 0;
}