Read binary file with line delimeter - Page 2

Thread: Read binary file with line delimeter

  1. #16
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    2- I wanted to replace the delimiter string with a variable, but for some reason the error says that 2 parameters were expected and 3 were provided (this happens if I use the line in blue and replace "FF77" with Sep in all places).
    Replace Sep in the find() with Sep.c_str()
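    The likely reason for the error: the three-argument form of std::string::find takes a const char* plus a character count, while the overload that accepts a std::string takes only a start position. A small sketch of the two equivalent calls (findWithString and findWithCStr are names made up for this illustration, they are not part of the code in this thread):

```cpp
#include <cstddef>
#include <string>

// Overload taking a std::string: the needle and a start position only.
std::size_t findWithString(const std::string& text, const std::string& sep, std::size_t pos)
{
    return text.find(sep, pos);
}

// Overload taking a C string: pointer, start position and number of characters.
std::size_t findWithCStr(const std::string& text, const std::string& sep, std::size_t pos)
{
    return text.find(sep.c_str(), pos, sep.size());
}
```

    Both return the same position, so either form can be used as long as the arguments match the overload.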
    All advice is offered in good faith only. You are ultimately responsible for effects of your programs and the integrity of the machines they run on.

  2. #17
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    You can simplify the function slightly and also make it more general, so that it can be used to separate fields given any delimiter. A possible way would be:

    Code:
    vector<string> getFields(const string& FSepStr, const string& Sep = "FF77");
    
    vector<string> getFields(const string& FSepStr, const string& Sep)
    {
    size_t pos = 0;
    size_t LastFS;
    vector<string> V;
     
    	while ((LastFS = FSepStr.find(Sep.c_str(), pos + 1, Sep.size())) != string::npos)
    		if ((LastFS = FSepStr.find(Sep.c_str(), (pos = FSepStr.find(Sep.c_str(), LastFS - 1, Sep.size())) + 1, Sep.size())) != string::npos)
    			V.push_back(FSepStr.substr(pos + Sep.size(), LastFS - pos - Sep.size()));	// skip the whole separator, whatever its length
    
    	return V;
    }
    so that you can specify the separators in the call to getFields if it is not "FF77".

    PS You can do this function with just one .find in total rather than the 5 in your original code - as they say in all the best books, I'll leave that as an exercise!
    Last edited by 2kaud; October 10th, 2013 at 06:00 AM.

  3. #18
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Hello Paul,

    Thanks for the correction in the for loop. I've changed it.

    Hello 2kaud,

    Thanks for the suggestion to fix Sep in find() and for the simplification of the function. It took me some time to understand the if statement, jeje.

    My issue now is how to return the position of the last field separator to the main function. If I use the code below I receive an error, but if I delete the text in red, it works fine.

    I need to know the last position so that I know the offset to use when reading the next 1000 bytes.

    By the way, which function can I use to read the binary file in chunks of 1000 bytes, with an offset and a chunk size that I can set? Something like read(file, offset, 1000), so that in the first "read" the offset would be 0, and after that the offset would be the value of LastPos, up to the end of the file.

    Code:
    #include <string>
    #include <vector>
    #include <iostream>
    
    using namespace std;
    
    vector<string> getFields(const string& FSepStr, const string& Sep = "FF77") 
    {      
    int i = 0;  
    size_t pos = 1;  
    size_t LastFS; 
    size_t LastPos;
    vector<string> V;
     
     while ((LastFS=FSepStr.find(Sep.c_str(),pos+1,Sep.size()))!=string::npos)       
          if ((LastFS=FSepStr.find(Sep.c_str(),(pos=FSepStr.find(Sep.c_str(),LastFS-1,Sep.size()))+1,Sep.size()))!=string::npos){
            V.push_back(FSepStr.substr(pos+Sep.size(), LastFS - pos - Sep.size()));
            LastPos=LastFS; //Storing position of last Field Separator
          } 
     return V, LastPos; 
    }
    int main()
    {
        const string InputStr = "Test1FF77Test2FF77Test3FF77Some textFF77other textFF772";
        vector<string> sVector;
        sVector = getFields(InputStr);
        size_t LastPos;
        
        for (int i=0;i<sVector.size();i++){
          cout<<"V["<<i<<"]="<<sVector[i]<<endl;
        }
        //cout <<"Last FSep: "<<LastPos<<endl; 
    }
    Thanks for the patience and great help!

    Regards

  4. #19
    Join Date
    Apr 1999
    Posts
    27,418

    Re: Read binary file with line delimeter

    Quote Originally Posted by Philidor View Post
    I need to know what is the last position to know the offset that I need to put to read the next 1000 bytes.
    In C++, you can only return 1 entity. That statement with two values doesn't do what you think it does. It invokes the comma operator (do a google on this operator) -- this results in one value being returned.
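    To illustrate what the comma operator does (a minimal sketch, not taken from the code above; commaResult is a made-up name): the left operand is evaluated and discarded, and the value of the whole expression is the right operand.

```cpp
#include <cstddef>

// Returns the value of the expression (a, b): a is evaluated and discarded,
// so only b comes back to the caller.
std::size_t commaResult(std::size_t a, std::size_t b)
{
    return (static_cast<void>(a), b);
}
```

    In "return V, LastPos;" the same thing happens: V is discarded and the function tries to return LastPos alone. LastPos is a size_t, which does not convert implicitly to vector<string>, hence the compile error.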

    If you want to return multiple values, read up on structs. Or in your case, since it is only two values, read up on std::pair.
    Code:
    #include <string>
    #include <utility>   // std::pair
    #include <vector>
    
    typedef std::vector<std::string> StringVector;
    typedef std::pair<StringVector, size_t> ParseInfo;  
    
    ParseInfo getFields(const std::string& FSepStr, const std::string& Sep = "FF77")
    {
        ParseInfo pInfo;
        StringVector& V = pInfo.first;
        //...
        pInfo.second = LastPos;
        return pInfo;
    }
    The pair holds two items, first and second. So one entity is still being returned, but it contains two items.

    Regards,

    Paul McKenzie

  5. #20
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    If you just return the last position, you'll have a problem dealing with multiple blocks. Your function doesn't return the data from the start of the block to the first delimiter, or the data from the end of the last delimiter to the end of the block. So you don't have the data needed to concatenate the end of one block with the beginning of the next. Your function needs to return the data before the first delimiter and the data after the last delimiter.

  6. #21
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    This version of getFields also returns the data before the first delimiter and after the last delimiter (with just 1 find!). You don't need to return the position of the last delimiter.

    Code:
    vector<string> getFields(const string& commaStr, const string& sep = "FF77");
    
    vector<string> getFields(const string& commaStr, const string& sep)
    {
    vector<string> V;
    
    size_t	pos1;
    
    	for (size_t pos2 = 0; pos1 = pos2, (pos2 = commaStr.find(sep.c_str(), pos1, sep.size())) != string::npos; pos2 += sep.size())
    			V.push_back(commaStr.substr(pos1, pos2 - pos1));
    
    	V.push_back(commaStr.substr(pos1, commaStr.size() - pos1));
    
    	return V;
    }
    So all that needs to be done is to loop, reading blocks until there are no more. Keep the results of the current and previous blocks, and append the first element of the current block to the last element of the previous block.

    There is one problem, however. What happens if the delimiter spans two blocks, i.e. FF at the end of one block and 77 at the start of the next? In this case this whole method won't work properly.
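    A small sketch of that failure case, assuming a splitter that behaves like getFields (splitFields and splitSeparately are names chosen for this illustration):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits text on sep, keeping the pieces before the first and after the last
// separator. sep is assumed to be non-empty.
std::vector<std::string> splitFields(const std::string& text, const std::string& sep = "FF77")
{
    std::vector<std::string> fields;
    std::size_t start = 0;
    std::size_t hit;

    while ((hit = text.find(sep, start)) != std::string::npos) {
        fields.push_back(text.substr(start, hit - start));
        start = hit + sep.size();
    }
    fields.push_back(text.substr(start));
    return fields;
}

// Splits two blocks independently, the way a naive block-by-block reader would.
std::vector<std::string> splitSeparately(const std::string& block1, const std::string& block2)
{
    std::vector<std::string> all = splitFields(block1);
    const std::vector<std::string> second = splitFields(block2);
    all.insert(all.end(), second.begin(), second.end());
    return all;
}
```

    splitSeparately("Test1FF", "77Test2") yields "Test1FF" and "77Test2", because neither block contains a complete FF77, whereas splitFields("Test1FF77Test2") yields "Test1" and "Test2".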

  7. #22
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Hello Paul and 2kaud,

    Thanks for the help one more time.

    @2kaud
    The version with only 1 find is just fine. I really thought about how to do it, but I didn't get it. It is good that the function now returns the first and last elements; that fixes many issues. Thank you.

    But imho, to handle a delimiter that spans 2 blocks, I need to store the position of the last delimiter, so that I can use that position as the origin (offset) of the next block of 1000 bytes to read.

    Paul and 2kaud,

    I'm trying to combine Paul's suggestion with 2kaud's last function to get the position of the last delimiter, but it seems I'm not doing it correctly. I get errors in all the lines in red.
    Code:
    #include <map>
    #include <string>
    #include <vector>
    #include <iostream>
    
    using namespace std;
    
    typedef vector<string> sVector;
    typedef pair<sVector, size_t> ParseInfo;
    
    ParseInfo getFields(const string& FSepStr, const string& sep = "FF77")
    {
    ParseInfo pInfo;
    typedef vector<string> V;
    sVector& V = pInfo.first;
    size_t	pos1;
    	for (size_t pos2 = 0; pos1 = pos2, (pos2 = FSepStr.find(sep.c_str(), pos1, sep.size())) != string::npos; pos2 += sep.size()){
                    V.push_back(FSepStr.substr(pos1, pos2 - pos1));
                    pInfo.second = pos2;
            }
                    V.push_back(FSepStr.substr(pos1, FSepStr.size() - pos1));  //Return element after last FSep
    return pInfo;
    }
    int main()
    {
        const string InputStr = "Test1FF77Test2FF77Test3FF77Some textFF77other textFF7";
        
        vector<string> sVector;  
        sVector = getFields(InputStr);
        
        for (int i=0;i<sVector.size();i++){
          cout<<"V["<<i<<"]="<<sVector[i]<<endl;
        }   
        //cout <<"Last FSep: "<<LastPos<<endl; 
    }
    Thanks again for help so far

  8. #23
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    This is how you might do it

    Code:
    #include <string>
    #include <utility>   // std::pair
    #include <vector>
    #include <iostream>
    
    using namespace std;
    
    typedef vector<string> sVector;
    typedef pair<sVector, size_t> ParseInfo;
    
    ParseInfo getFields1(const string& FSepStr, const string& sep = "FF77")
    {
    size_t	pos1;
    sVector	V;
    ParseInfo pInfo;
    
         for (size_t pos2 = 0; pos1 = pos2, (pos2 = FSepStr.find(sep.c_str(), pos1, sep.size())) != string::npos; pos2 += sep.size())
    		V.push_back(FSepStr.substr(pos1, pos2 - pos1));
    
         V.push_back(FSepStr.substr(pos1, FSepStr.size() - pos1));  //Return element after last FSep
         pInfo.first = V;
         pInfo.second = pos1;
         return pInfo;
    }
    
    int main()
    {
    const string InputStr = "Test1FF77Test2FF77Test3FF77Some textFF77other textFF7";
        
    ParseInfo	pi;
    sVector sv;
    
    	pi = getFields1(InputStr);
        
    	for (int i = 0; i < pi.first.size(); i++){
    		cout << "V[" << i << "]=" << pi.first[i] << endl;
    	}
    
    	cout << "Last FSep: " << pi.second << endl;
    	return 0;
    }
    It returns the position of the first char past the end of the last full delimiter. Using your test string, this outputs

    Code:
    V[0]=Test1
    V[1]=Test2
    V[2]=Test3
    V[3]=Some text
    V[4]=other textFF7
    Last FSep: 40
    The problem is V[4]. This contains FF7 at the end. It might end with F, FF or FF7. The only way you're going to know whether this is part of a delimiter or some other valid text is to parse V[4] together with V[0] of the next block, which might start with F77, 77 or 7.
    Last edited by 2kaud; October 11th, 2013 at 04:22 PM.

  9. #24
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    You don't need to return the position of the last delimiter. A possible way of parsing the blocks is

    Code:
    #include <string>
    #include <vector>
    #include <iostream>
    
    using namespace std;
    
    typedef vector<string> sVector;
    
    sVector getFields(const string& FSepStr, const string& sep = "FF77")
    {
    size_t	pos1;
    sVector	V;
    
    	for (size_t pos2 = 0; pos1 = pos2, (pos2 = FSepStr.find(sep.c_str(), pos1, sep.size())) != string::npos; pos2 += sep.size())
    		V.push_back(FSepStr.substr(pos1, pos2 - pos1));
    
    	V.push_back(FSepStr.substr(pos1, FSepStr.size() - pos1));  //Return element after last FSep
    	return V;
    }
    
    void output(const sVector& sv)
    {
    static int cnt = 0;
    
    	for (size_t i = 0; i < sv.size(); i++)
    		cout << "V[" << cnt++ << "]=" << sv[i] << endl;
    }
    
    int main()
    {
    const string InputStr1 = "Test1FF77Test2FF77Test3FF77Some textFF77other textFF";
    const string InputStr2 = "77Test4FF77Test5FF77Test6FF77Some text7FF77other text8";
        
    sVector sv1,
    	sv2;
    
    string elem;
    
    	sv1 = getFields(InputStr1);
    	sv2 = ((elem = sv1[sv1.size() - 1]) != "") ? getFields(elem + InputStr2) : getFields(InputStr2);
    	sv1.pop_back();
    
    	output(sv1);
    	output(sv2);
       
    	return 0;
    }
    This gives the output

    Code:
    V[0]=Test1
    V[1]=Test2
    V[2]=Test3
    V[3]=Some text
    V[4]=other text
    V[5]=Test4
    V[6]=Test5
    V[7]=Test6
    V[8]=Some text7
    V[9]=other text8
    which is as required.
    Last edited by 2kaud; October 12th, 2013 at 03:28 PM.

  10. #25
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    This is a possible way to process multiple blocks

    Code:
    int main()
    {
    sVector	blocks;
    
    	blocks.push_back("Test1FF77Test2FF77Test3FF77Test4FF77Test5FF");
    	blocks.push_back("77Test6FF77Test7FF77Test8FF77Test9FF77Test10F");
    	blocks.push_back("F77Test11FF77Test12FF77Test13FF77Test14FF77Test15FF7");
    	blocks.push_back("7Test16FF77Test17FF77Test18FF77Test19FF77Test20");
    	blocks.push_back("FF77Test21FF77Test22FF77Test23FF77Test24FF77Test25FF77");
    	blocks.push_back("Test26FF77Test27FF77Test28FF77Test29FF77Test30");
    
    string elem;
    
    sVector sv1 = getFields(blocks[0]);
    
    	for (size_t b = 1; b < blocks.size(); b++) {
    		sVector sv2 = ((elem = sv1[sv1.size() - 1]) != "") ? getFields(elem + blocks[b]) : getFields(blocks[b]);
    		sv1.pop_back();
    		output (sv1);
    		sv1 = sv2;
    	}
    
    	output(sv1);
       
    	return 0;
    }
    Producing the output

    Code:
    V[0]=Test1
    V[1]=Test2
    V[2]=Test3
    V[3]=Test4
    V[4]=Test5
    V[5]=Test6
    V[6]=Test7
    V[7]=Test8
    V[8]=Test9
    V[9]=Test10
    V[10]=Test11
    V[11]=Test12
    V[12]=Test13
    V[13]=Test14
    V[14]=Test15
    V[15]=Test16
    V[16]=Test17
    V[17]=Test18
    V[18]=Test19
    V[19]=Test20
    V[20]=Test21
    V[21]=Test22
    V[22]=Test23
    V[23]=Test24
    V[24]=Test25
    V[25]=Test26
    V[26]=Test27
    V[27]=Test28
    V[28]=Test29
    V[29]=Test30

  11. #26
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    As you are going to parse blocks, the code below may be of interest. It assumes you have a function that gets a block.

    Code:
    //get a block to parse
    //returns true if block got, false if not got
    bool getBlock(string& block)
    {
    sVector	blocks;
    
    	blocks.push_back("Test1FF77Test2FF77Test3FF77Test4FF77Test5FF");
    	blocks.push_back("77Test6FF77Test7FF77Test8FF77Test9FF77Test10F");
    	blocks.push_back("F77Test11FF77Test12FF77Test13FF77Test14FF77Test15FF7");
    	blocks.push_back("7Test16FF77Test17FF77Test18FF77Test19FF77Test20");
    	blocks.push_back("FF77Test21FF77Test22FF77Test23FF77Test24FF77Test25FF77");
    	blocks.push_back("Test26FF77Test27FF77Test28FF77Test29FF77Test");
    	blocks.push_back("30FF77Test31");
    
    static int	blkno = 0;
    
    	if (blkno < blocks.size()) {
    		block = blocks[blkno++];
    		return true;
    	}
    
    	block = "";
    	return false;
    }
    
    int main()
    {
    string	block;
    
    sVector sv1,
    	sv2;
    
    bool	got;
    
    	for (got = getBlock(block), sv1 = getFields(block); got; sv1 = sv2) {
    		got = getBlock(block);
    		sv2 = getFields(sv1[sv1.size() - 1] + block);
    		sv1.pop_back();
    		output(sv1);
    	}
    
    	output(sv1);
    
    	return 0;
    }
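    The same carry-over idea can also be packaged as a small stateful helper. This is only a sketch under assumptions: BlockSplitter, feed and finish are hypothetical names that do not appear in the thread, and the separator is assumed to be non-empty.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): takes blocks one at a time and
// keeps the unfinished tail, so a separator split across two blocks is still found.
class BlockSplitter
{
public:
    explicit BlockSplitter(const std::string& sep = "FF77") : sep_(sep) {}

    // Appends a block and returns every field that is now known to be complete.
    std::vector<std::string> feed(const std::string& block)
    {
        std::vector<std::string> done;
        tail_ += block;

        std::size_t start = 0;
        std::size_t hit;
        while ((hit = tail_.find(sep_, start)) != std::string::npos) {
            done.push_back(tail_.substr(start, hit - start));
            start = hit + sep_.size();
        }
        tail_.erase(0, start);   // keep only the text after the last full separator
        return done;
    }

    // Call once after the last block: returns whatever is left as the final field.
    std::string finish()
    {
        std::string last;
        last.swap(tail_);
        return last;
    }

private:
    std::string sep_;    // assumed to be non-empty
    std::string tail_;   // data seen so far that is not yet terminated by a separator
};
```

    Feeding "Test1FF77Test2FF", then "77Test3F", then "F77Test4" returns Test1, Test2 and Test3 from the three feed calls, and finish() returns Test4.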

  12. #27
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Hello 2kaud,

    Many thanks for the time and help.

    I've been trying to test your code, but I get 2 errors for lines in red.
    Code:
    #include <string>
    #include <vector>
    #include <iostream>
    
    using namespace std;
    
    typedef vector<string> sVector;
    
    sVector getFields(const string& FSepStr, const string& sep = "FF77")
    {
    size_t	pos1;
    sVector	V;
    
    	for (size_t pos2 = 0; pos1 = pos2, (pos2 = FSepStr.find(sep.c_str(), pos1, sep.size())) != string::npos; pos2 += sep.size())
    		V.push_back(FSepStr.substr(pos1, pos2 - pos1));
    
    	V.push_back(FSepStr.substr(pos1, FSepStr.size() - pos1));  //Return element after last FSep
    	return V;
    }
    
    void output(const sVector& sv)
    {
    static cnt = 0; //Error: 'cnt' does not name a type
    
    	for (size_t i = 0; i < sv.size(); i++)
    		cout << "V[" << cnt++ << "]=" << sv[i] << endl; //Error: 'cnt' was not declared in this scope
    }
    
    //get a block to parse
    //returns true if block got, false if not got
    bool getBlock(string& block)
    {
    sVector	blocks;
    
    	blocks.push_back("Test1FF77Test2FF77Test3FF77Test4FF77Test5FF");
    	blocks.push_back("77Test6FF77Test7FF77Test8FF77Test9FF77Test10F");
    	blocks.push_back("F77Test11FF77Test12FF77Test13FF77Test14FF77Test15FF7");
    	blocks.push_back("7Test16FF77Test17FF77Test18FF77Test19FF77Test20");
    	blocks.push_back("FF77Test21FF77Test22FF77Test23FF77Test24FF77Test25FF77");
    	blocks.push_back("Test26FF77Test27FF77Test28FF77Test29FF77Test");
    	blocks.push_back("30FF77Test31");
    
    static int	blkno = 0;
    
    	if (blkno < blocks.size()) {
    		block = blocks[blkno++];
    		return true;
    	}
    
    	block = "";
    	return false;
    }
    
    int main()
    {
    string	block;
    
    sVector sv1,
    	sv2;
    
    bool	got;
    
    	for (got = getBlock(block), sv1 = getFields(block); got; sv1 = sv2) {
    		got = getBlock(block);
    		sv2 = getFields(sv1[sv1.size() - 1] + block);
    		sv1.pop_back();
    		output(sv1);
    	}
    
    	output(sv1);
    
    	return 0;
    }
    Thanks again

  13. #28
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    The declaration of cnt is missing its type. It should be:

    static int cnt = 0;

  14. #29
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Thanks so much 2kaud, I was able to test it now and it works just fine.

    I'd like to test it with real blocks now. What do you suggest for reading the binary file in chunks of 1000 bytes with the ability to set an offset?

    Would it be something like this?
    Code:
    ifstream file ("binfile", ios::in|ios::binary|ios::ate);
        file.read (block, size);
        file.seekg (offset, ios::beg);
        file.close();
    But I'm not sure whether only 1000 bytes would be in memory at any moment. I'd like to avoid loading the complete binary file into memory because of its 2GB size.

    Thanks again for all the help.
    Last edited by Philidor; October 12th, 2013 at 04:00 PM.

  15. #30
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,026

    Re: Read binary file with line delimeter

    Try this for getBlock.

    Code:
    bool getBlock(string& block)
    {
    static ifstream ifs("binfile", ios::binary);
    
    char	buf[1001];
    
    	if (!ifs.is_open()) {
    		block = "";
    		return false;
    	}
    
    	ifs.read(buf, 1000);
    	buf[ifs.gcount()] = 0;
    	if (ifs.gcount() > 0) {
    		block.assign(buf, static_cast<size_t>(ifs.gcount()));	// explicit length: binary data may contain NUL bytes
    		return true;
    	}
    
    	ifs.close();
    	block = "";
    	return false;
    }
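    If an explicit offset is wanted, as in the read(file, offset, 1000) idea asked about earlier, a possible sketch is below. readChunk is a hypothetical helper, not a standard function. It seeks first and then reads, and only the requested number of bytes is held in the buffer at any one time. It assumes the platform's std::streamoff is wide enough for a 2GB file, which is typical but worth checking.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads up to size bytes starting at offset into chunk.
// Returns false if nothing could be read (for example, offset at or past end of file).
bool readChunk(std::ifstream& file, std::streamoff offset, std::size_t size, std::string& chunk)
{
    chunk.clear();
    if (size == 0)
        return false;

    file.clear();                       // drop an earlier eof/fail state so seekg can work
    file.seekg(offset, std::ios::beg);  // seek first, then read
    if (!file)
        return false;

    std::vector<char> buf(size);
    file.read(&buf[0], static_cast<std::streamsize>(size));
    const std::streamsize got = file.gcount();
    if (got <= 0)
        return false;

    // Assign with an explicit length so embedded NUL bytes are kept.
    chunk.assign(&buf[0], static_cast<std::size_t>(got));
    return true;
}
```

    The stream would be opened once with ios::binary (ios::ate is not needed) and readChunk called in a loop, advancing the offset by however many bytes were consumed.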
