Read binary file with line delimeter - Page 9

Thread: Read binary file with line delimeter

  1. #121
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,460

    Re: Read binary file with line delimeter

    So if there is no 940E sequence at the end, you ignore the sub blocks? Is that right?
    All advice is offered in good faith only. You are ultimately responsible for effects of your programs and the integrity of the machines they run on.

  2. #122
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Quote Originally Posted by 2kaud View Post
    So if there is no 940E sequence at the end, you ignore the sub blocks? Is that right?
    Yes, 2kaud. If the complete set of conditions is not met, the string doesn't qualify as a sub-block.

    It should contain 059X + ... + 940E + 14 bytes.

  3. #123
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,460

    Re: Read binary file with line delimeter

    Try this

    Code:
    int main()
    {
    FileFields	ff;
    
    	//if (!ff.open("d:\\philidor\\bin2g")) {
    	if (!ff.open("d:\\philidor\\binsmall")) {
    		cout << "Cannot open file!" << endl;
    		return 1;
    	}
    
    string	header;
    	ff.getField(header);
    
    string	block;
    	block.reserve(7000);
    
    string preliminar;
    	preliminar.reserve(7000);
    
    string cx;
    	cx.reserve(7000);
    
    string sub;
    	sub.reserve(7000);
    
    DWORD	number;
    
    time_t timest = time(NULL);
    
    	for (DWORD blk = 1; ff.getBlock(block, number, preliminar); blk++) {
    		size_t ff79;
    		if ((ff79 = block.find(SBLOCK)) != string::npos) {
    			size_t five;
    			if ((five = block.find("05", ff79)) != string::npos) {
    				cx = block.substr(five + 2);
    				sub = "";
    				bool got4 = false;
    				for (size_t c = 0; c + 3 < cx.size() && !got4; c+= 2)	// a sub-block header needs 4 hex characters
    					if (cx[c] == '9' && (cx[c + 1] >= '0' && cx[c + 1] <= '7' /*&& cx[c + 1] != '5'*/)) {
    						int slen = (convh[(cx[c + 2] - '0')] * 16 + convh[cx[c + 3] - '0']) * 2 + 4;
    						if (cx[c + 1] != '5')
    							sub += cx.substr(c, slen) + '|';
    
    						got4 = (cx[c + 1] == '4');
    						c += slen - 2;
    					}
    
    				if (got4)
    					preliminar += sub;
    			}
    		}
    		cout << number << preliminar << endl;
    	}
    
    	cout << "Time taken: " << time(NULL) - timest << endl;
    	return 0;
    }

  4. #124
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Hello 2kaud,

    I've tested your last code and it works; it extracts all the expected substrings.

    I found that the substrings beginning with 9X can take two more values:

    9X, where X = 0, 1, 2, 3, 6, 7, A, B

    So they can begin with 9A and 9B too. I think this could be a problem, since A and B are not decimal digits.

    Now comes what I think is the more difficult part. I hope I explain it well.

    Each substring is composed like this:

    When the second byte is 0F (15 bytes), the layout is:
    1 byte + 1 byte + 1 byte + 1 byte + 4 bytes + 8 bytes + 1 byte
    90-0F-01-02-00000030-8147526905FFFFFF-00

    When the second byte is 10 (16 bytes), the layout is:
    1 byte + 1 byte + 1 byte + 1 byte + 4 bytes + 8 bytes + 1 byte + 1 byte
    93-10-01-0C-0000000D-8147526905FFFFFF-01-01

    I'd like to print from byte 4 to the end, converting each group of bytes to decimal, except the group of 8 bytes, which should be printed without conversion but with the trailing F's removed. For the two sample substrings the output would be as follows.

    For first sample substring:
    Code:
    900F0102000000308147526905FFFFFF00 --> original substring
    90-0F-01-02-00000030-8147526905FFFFFF-00 --> separating in groups
    02-00000030-8147526905FFFFFF-00 -->  These are the groups I want to print
    2,48,8147526905,0 --> comma-separated, in decimal except the 8-byte group, which only needs its F's removed.
    For 2nd sample substring:
    Code:
    9310010C0000000D8147526905FFFFFF0101 --> original substring
    93-10-01-0C-0000000D-8147526905FFFFFF-01-01 --> separating in groups
    0C-0000000D-8147526905FFFFFF-01-01-->  These are the groups I want to print
    12,13,8147526905,1,1 --> comma-separated, in decimal except the 8-byte group, which only needs its F's removed.
    And when the substring is the last one, the one that begins with 940E + 14 bytes, I want to print each of those 14 bytes individually, in decimal:
    Code:
    940E0001000001000100FFFF00000101 --> original
    940E-00-01-00-00-01-00-01-00-FF-FF-00-00-01-01 --> Composed by 14 bytes
    00-01-00-00-01-00-01-00-FF-FF-00-00-01-01 --> These 14 bytes I want to print
    0,1,0,0,1,0,1,0,255,255,0,0,1,1 --> comma-separated, in decimal
    Currently, the output of your last code for the binSmall file is:
    Code:
    65398|532064019659172|81440415264|900F0102000000308147526905FFFFFF00|910F01020000013A81475269559FFFFF00|9310010C0000009F8147526905FFFFFF0101|960F010E000000EB81475269596FFFFF00|970F01010006F69981475269563FFFFF00|940E0001000001000100FFFF00000101|
    65399|532064024496121|81440415265|
    65400|532064019659174|81440415266|
    65401|532064019659175|81440415267|910F01020000000D8147526905FFFFFF00|9310010C0000000D8147526905FFFFFF0101|960F010C0000000D81475269565FFFFF00|940E01020102010001FFFFFF02010201|
    65402|532064019659176|81440415268|
    and the expected output is:
    Code:
    65398|532064019659172|81440415264|2,48,8147526905,0|2,314,81475269559,0|12,159,8147526905,1,1|14,235,81475269596,00|1,456345,81475269563,0|0,1,0,0,1,0,1,0,255,255,0,0,1,1
    65399|532064024496121|81440415265
    65400|532064019659174|81440415266
    65401|532064019659175|81440415267|2,13,8147526905,0|12,14,8147526905,1,1|12,14,81475269565,0|1,2,1,2,1,0,1,255,255,255,2,1,2,1
    65402|532064019659176|81440415268
    Thanks again for all the help.
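
The field layout described in this post can be sketched as a small stand-alone function. This is only an illustration under stated assumptions, not the code used later in the thread: the names `hexToValue` and `decodeSubBlock` are hypothetical, the input is assumed to be one sub-block already converted to hex text, and the 8-byte group is assumed to be padded with trailing F characters as in the samples.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: converts `count` hex characters starting at `pos`
// to an unsigned value. Assumes the range holds valid hex digits.
unsigned long hexToValue(const std::string& s, std::size_t pos, std::size_t count)
{
    return std::stoul(s.substr(pos, count), nullptr, 16);
}

// Hypothetical helper: formats one sub-block such as "900F0102..." as
// comma-separated fields. Returns an empty string when the text is shorter
// than its length byte claims.
std::string decodeSubBlock(const std::string& sb)
{
    if (sb.size() < 4)
        return "";

    // The second byte is the number of bytes that follow it (2 hex chars each).
    const std::size_t payload = hexToValue(sb, 2, 2) * 2;
    if (sb.size() < 4 + payload)
        return "";

    std::string out;

    if (sb.compare(0, 2, "94") == 0) {
        // Terminating sub-block: every payload byte is printed in decimal.
        for (std::size_t i = 4; i < 4 + payload; i += 2) {
            if (i > 4)
                out += ',';
            out += std::to_string(hexToValue(sb, i, 2));
        }
        return out;
    }

    // Other sub-blocks: code, length, 1 byte, 1 byte, 4 bytes, 8 bytes, 1 or 2 bytes.
    if (payload < 28)
        return "";

    out += std::to_string(hexToValue(sb, 6, 2));    // byte 4, in decimal
    out += ',';
    out += std::to_string(hexToValue(sb, 8, 8));    // 4-byte group, in decimal
    out += ',';
    for (std::size_t i = 16; i < 32 && sb[i] != 'F'; ++i)
        out += sb[i];                               // 8-byte group, digits copied up to the F padding

    // One trailing byte (length 0F) or two (length 10), each in decimal.
    for (std::size_t i = 32; i < 4 + payload; i += 2) {
        out += ',';
        out += std::to_string(hexToValue(sb, i, 2));
    }
    return out;
}
```

Calling `decodeSubBlock("900F0102000000308147526905FFFFFF00")` should give `2,48,8147526905,0`, matching the first sample above. Checking the length before indexing keeps a truncated sub-block from being read past its end.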

  5. #125
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,460

    Re: Read binary file with line delimeter

    Accommodating 9A and 9B is trivial:

    Code:
             for (DWORD blk = 1; ff.getBlock(block, number, preliminar); blk++) {
    		size_t ff79;
    		if ((ff79 = block.find(SBLOCK)) != string::npos) {
    			size_t five;
    			if ((five = block.find("05", ff79)) != string::npos) {
    				cx = block.substr(five + 2);
    				sub = "";
    				bool got4 = false;
    				for (size_t c = 0; c + 3 < cx.size() && !got4; c+= 2)	// a sub-block header needs 4 hex characters
    					if (cx[c] == '9' && ((cx[c + 1] >= '0' && cx[c + 1] <= '7') || cx[c + 1] == 'A' || cx[c + 1] == 'B')) {
    						int slen = (convh[(cx[c + 2] - '0')] * 16 + convh[cx[c + 3] - '0']) * 2 + 4;
    						if (cx[c + 1] != '5')
    							sub += cx.substr(c, slen) + '|';
    
    						got4 = (cx[c + 1] == '4');
    						c += slen - 2;
    					}
    
    				if (got4)
    					preliminar += sub;
    			}
    		}
    		cout << number << preliminar << endl;
    	}
    I'll have a look at the decomposition over the next couple of days when I have time.

    For what's currently output, what's the speed like for a large file?

  6. #126
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,460

    Re: Read binary file with line delimeter

    Code:
    65398|532064019659172|81440415264|900F0102000000308147526905FFFFFF00|910F01020000013A81475269559FFFFF00|9310010C0000009F8147526905FFFFFF0101|960F010E000000EB81475269596FFFFF00|970F01010006F69981475269563FFFFF00|940E0001000001000100FFFF00000101|
    65399|532064024496121|81440415265|
    65400|532064019659174|81440415266|
    65401|532064019659175|81440415267|910F01020000000D8147526905FFFFFF00|9310010C0000000D8147526905FFFFFF0101|960F010C0000000D81475269565FFFFF00|940E01020102010001FFFFFF02010201|
    65402|532064019659176|81440415268|
    and the output expected is:
    Code:
    65398|532064019659172|81440415264|2,48,8147526905,0|2,314,81475269559,0|12,159,8147526905,1,1|14,235,81475269596,00|1,456345,81475269563,0|0,1,0,0,1,0,1,0,255,255,0,0,1,1
    65399|532064024496121|81440415265
    65400|532064019659174|81440415266
    65401|532064019659175|81440415267|2,13,8147526905,0|12,14,8147526905,1,1|12,14,81475269565,0|1,2,1,2,1,0,1,255,255,255,2,1,2,1
    65402|532064019659176|81440415268
    Shouldn't the expected output for 65401 be 13 rather than the 14 highlighted - as hex D is 13 decimal?

  7. #127
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,460

    Re: Read binary file with line delimeter

    Apart from the issue raised above in post #126, the program below produces the expected output as per your post #124. Have fun!

    Code:
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <ctime>
    #include <cstdlib>
    using namespace std;
    
    typedef unsigned char BYTE;
    typedef unsigned short int WORD;
    typedef unsigned long int DWORD;
    
    #ifndef LOBYTE
    	#define LOBYTE(w)	((BYTE)((WORD)(w) & 0xff))
    #endif
    
    #ifndef HIBYTE
    	#define HIBYTE(w)	((BYTE)((WORD)(w) >> 8))
    #endif
    
    #define CONVDEC(num)	(convh[cx[c + (num)] - '0'] * 16 + convh[cx[c + (num) + 1] - '0'])
    
    const char  hconv[16] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};
    const int   convh[23] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0, 0, 0, 0, 0, 10, 11, 12, 13, 14, 15};
    const WORD  SEPAR = 0xFF77;
    const char  SBLOCK[] = "FF79";
    
    class FileFields
    {
    private:
    	ifstream	ifs;
    	bool		opened;
    
    public:
    	FileFields() : opened(false) {}
    
    	~FileFields() {
    		if (opened)
    			ifs.close();
    	}
    
    	bool open(const char* name);
    
    	bool getBlock(string& field, DWORD& number, string& firstpart, WORD delim = SEPAR);
    	bool getField(string& field, WORD delim = SEPAR);
    
    };
    
    bool FileFields::open(const char* name) {
    	ifs.open(name, ios::binary);
    	return (opened = ifs.is_open());
    }
    
    bool FileFields::getBlock(string& field, DWORD& number, string& firstpart, WORD delim)
    {
    BYTE	num[3],
    	first[16],
    	by,
    	ub,
    	lb;
    
    	number = 0;
    	firstpart = "|";
    
    	if (!opened || !ifs.good())
    		return false;
    
    	ifs.read((char*)num, 3);
    	number = (num[0] << 16) + (num[1] << 8) + num[2];
    
    	if (!ifs.good())
    		return false;
    
    	ifs.read((char*)first, 16);
    
    	for (int p = 1; p <= 2; p++) {
    		const int last = p * 8;
    		for (int i = (p - 1) * 8; i < last; i++)
    			if ((ub = ((by = first[i]) >> 4)) < 0xf) {
    				firstpart += hconv[ub];
    				if ((lb = (by & 0x0f)) < 0xf)
    					firstpart += hconv[lb];
    				else
    					break;
    			} else
    				break;
    
    		firstpart += '|';
    	}
    
    	return getField(field);
    }
    
    bool FileFields::getField(string& field, WORD delim)
    {
    char	by;
    
    bool	cont = true;
    
    	field = "";
    
    	if (!opened || !ifs.good())
    		return false;
    
    	for (ifs.get(by); cont && ifs.gcount(); ifs.get(by)) {
    		if ((BYTE)by == HIBYTE(delim))
    			if ((BYTE)ifs.peek() == LOBYTE(delim))
    				cont = false;
    
    		if (cont) {
    			field += hconv[(BYTE)by >> 4];
    			field += hconv[(BYTE)by & 0xf];
    		}
    	}
    
    	return true;
    }
    
    int main()
    {
    FileFields	ff;
    
    	//if (!ff.open("d:\\philidor\\bin2g")) {
    	if (!ff.open("d:\\philidor\\binsmall")) {
    		cout << "Cannot open file!" << endl;
    		return 1;
    	}
    
    string	header;
    	ff.getField(header);
    
    string	block;
    	block.reserve(7000);
    
    string preliminar;
    	preliminar.reserve(7000);
    
    string cx;
    	cx.reserve(7000);
    
    string sub;
    	sub.reserve(7000);
    
    DWORD	number;
    
    char num[12];	// large enough for any 32-bit int, including sign and terminator
    
    time_t timest = time(NULL);
    
    	for (DWORD blk = 1; ff.getBlock(block, number, preliminar); blk++) {
    		size_t ff79;
    		if ((ff79 = block.find(SBLOCK)) != string::npos) {
    			size_t five;
    			if ((five = block.find("05", ff79)) != string::npos) {
    				cx = block.substr(five + 2);
    				sub = "";
    				bool got4 = false;
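    				for (size_t c = 0; c + 3 < cx.size() && !got4; c+= 2)	// a sub-block header needs 4 hex characters
    					if (cx[c] == '9' && ((cx[c + 1] >= '0' && cx[c + 1] <= '7') || cx[c + 1] == 'A' || cx[c + 1] == 'B')) {
    						const int slen = CONVDEC(2) * 2;
    						if (c + slen + 4 > cx.size())
    							break;	// truncated sub-block: stop instead of reading past the end of cx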
    						if (got4 = (cx[c + 1] == '4'))
    							for (int i = 4; i < slen + 4; i += 2) {
    								sub += _itoa(CONVDEC(i), num, 10);
    								if (i != slen + 2) 
    									sub += ',';
    							}
    						else 
    							if (cx[c + 1] != '5') {
    								sub += _itoa(CONVDEC(6), num, 10);
    								sub += ',';
    								int dec = 0;
    								for (int s = 8; s < 16; s += 2)
    									dec = (dec << 8) + CONVDEC(s);
    
    								sub += _itoa(dec, num, 10);
    								sub += ',';
    								for (size_t s = c + 16; s < c + 32; s++)
    									if (cx[s] != 'F')
    										sub += cx[s];
    									else
    										break;
    
    								sub += ',';
    								sub += _itoa(CONVDEC(32), num, 10);
    								if (slen == 32) {
    									sub += ',';
    									sub += _itoa(CONVDEC(34), num, 10);
    								}
    								sub += '|';
    							}
    
    						c += slen + 2;
    					}
    
    				if (got4)
    					preliminar += sub;
    			}
    		}
    		cout << number << preliminar << endl;
    	}
    
    	cout << "Time taken: " << time(NULL) - timest << endl;
    	return 0;
    }

  8. #128
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Hello 2kaud,

    Thanks! I've tried it and it seems to work just fine, but I'll keep testing, because with one small file I got a segmentation fault and only the first line was printed. I need to check that file.

    With the previous code, a 2G file was processed in 471 seconds (7.85 min).

    The last output I'd like to get is a mapping for the substrings. I mean: when a substring begins with 90, print its values in column 4; if it begins with 91, print its values in column 5; and so on. If a substring doesn't exist within the sub-block, print an empty field instead.

    The mapping I'd like is as follows.

    if it begins with 90, print its values in the 4th column
    if it begins with 91, print its values in the 5th column
    if it begins with 9A, print its values in the 6th column
    if it begins with 92, print its values in the 7th column
    if it begins with 93, print its values in the 8th column
    if it begins with 9B, print its values in the 9th column
    if it begins with 96, print its values in the 10th column
    if it begins with 97, print its values in the 11th column
    if it begins with 94, print its values in the 12th column

    So, the current output with your last code is:
    Code:
    65398|532064019659172|81440415264|2,48,8147526905,0|2,314,81475269559,0|12,159,8147526905,1,1|14,235,81475269596,0|1,456345,81475269563,0|0,1,0,0,1,0,1,0,255,255,0,0,1,1
    65399|532064024496121|81440415265|
    65400|532064019659174|81440415266|
    65401|532064019659175|81440415267|2,13,8147526905,0|12,13,8147526905,1,1|12,13,81475269565,0|1,2,1,2,1,0,1,255,255,255,2,1,2,1
    65402|532064019659176|81440415268|
    And the desired output is:
    Code:
    65398|532064019659172|81440415264|2,48,8147526905,0|2,314,81475269559,0|||12,159,8147526905,1,1||14,235,81475269596,0|1,456345,81475269563,0|0,1,0,0,1,0,1,0,255,255,0,0,1,1
    65399|532064024496121|81440415265|||||||||
    65400|532064019659174|81440415266|||||||||
    65401|532064019659175|81440415267||2,13,8147526905,0|||12,13,8147526905,1,1||12,13,81475269565,0||1,2,1,2,1,0,1,255,255,255,2,1,2,1
    65402|532064019659176|81440415268|||||||||
    Thanks for all the help.

  9. #129
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,460

    Re: Read binary file with line delimeter

    If present, do the substrings beginning with 9X always occur in the order 90, 91, 9A, 92, 93, 9B, 96, 97 and 94 - or can they occur in any order with 94 always being the last?

  10. #130
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    When they appear, the substrings (90, 91, 9A, 92, 93, 9B, 96, 97) can occur in any order, but the substring 94X... is always at the end.

  11. #131
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,460

    Re: Read binary file with line delimeter

    Yes, I thought you were going to say that! That complicates matters. I'll have to think about this. I'll probably have to create a vector for the substrings, indexed by the 9X code, as I can't just concatenate the output together as I do now. Hmm.

    Can you confirm that for any ff79 block, the sub-blocks starting with 9X can only appear once, but in any order, with 94 at the end? That is, say, a 91 sub-block can only occur once and not multiple times?

    Incidentally, once you have the output mapped as per post #128, what are you going to do with it?
    Last edited by 2kaud; October 29th, 2013 at 04:12 PM.

  12. #132
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Quote Originally Posted by 2kaud View Post
    Incidentally, once you have the output mapped as per post #128, what are you going to do with it?
    I understand that printing in that order could be more complicated. I don't know how to change your code to do that, but my idea is to have an array A with A[90]=4, A[91]=5, etc., and an array B with 9 empty values. Then, when the first byte is 90, do B[A[x]-4] = B[A[90]-4] = B[0] = "12,13,814264845,0". This would fill element 0 of array B.

    It's only an idea.

    Regarding your question: since 90, 91, 9A, etc. each belong to a different category, I'd like to print the corresponding values in the same column, so that the output is easy to open in Excel, for example.

    Thanks again for the help.
    Last edited by Philidor; October 29th, 2013 at 04:34 PM.
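
The array idea above could be sketched roughly as follows. It is a minimal illustration that assumes each 9X code occurs at most once per ff79 block; `slotFor` and `SubBlockColumns` are hypothetical names that do not appear in the thread's program. Each code is mapped to a fixed slot in the order requested in post #128 (90, 91, 9A, 92, 93, 9B, 96, 97, 94), and missing sub-blocks are left as empty fields.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Returns the slot (0..8) for the second hex digit of a 9X code,
// following the column order 90, 91, 9A, 92, 93, 9B, 96, 97, 94.
// Returns -1 for a digit that has no column.
int slotFor(char x)
{
    const std::string order = "01A23B674";
    const std::size_t pos = order.find(x);
    return pos == std::string::npos ? -1 : static_cast<int>(pos);
}

// Hypothetical holder for the nine sub-block columns of one output line.
class SubBlockColumns
{
    std::array<std::string, 9> cols;    // all slots start out empty

public:
    // Stores the decoded text for a code such as "9A".
    // Returns false if the code has no column.
    bool set(const std::string& code, const std::string& text)
    {
        if (code.size() != 2 || code[0] != '9')
            return false;
        const int slot = slotFor(code[1]);
        if (slot < 0)
            return false;
        cols[slot] = text;
        return true;
    }

    // Joins the columns, each one preceded by '|', so empty slots
    // still produce a field.
    std::string join() const
    {
        std::string out;
        for (const std::string& c : cols) {
            out += '|';
            out += c;
        }
        return out;
    }
};
```

Because the slot depends only on the code, the order in which the sub-blocks arrive doesn't matter. Note that `join()` emits its own leading `|`, so a prefix that already ends in `|` would need that final character dropped before the two are combined.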

  13. #133
    Join Date
    Dec 2012
    Location
    England
    Posts
    2,460

    Re: Read binary file with line delimeter

    Fine, but you haven't answered my question:

    Can you confirm that for any individual ff79 block, the sub-blocks starting with 9X can only appear once within that block, but in any order, with 94 at the end? That is, say, a 91 sub-block can only occur once and not multiple times in any one block?

    Unless you say differently, I'm going to assume that each 9X can only occur once in the same ff79 block.

  14. #134
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Sorry 2kaud.

    Yes, each substring that begins with 9X appears only once, in any order. And the substring 940EX... always appears at the end if there is at least one substring.

  15. #135
    Join Date
    Oct 2013
    Posts
    63

    Re: Read binary file with line delimeter

    Hello again,

    I've tried your last code in Code::Blocks with GNU GCC and it compiles, but in Visual Studio 2013 I get a compilation error for _itoa() saying "itoa() is not safe, you can use instead itoa_s()".

    I changed every call from
    Code:
    _itoa(CONVDEC(6), num, 10)
    to
    Code:
    _itoa_s(CONVDEC(6), num, sizeof(num), 10)
    But that prints only some of the strings.
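
One possible way around the `_itoa` / `_itoa_s` difference is to avoid both. As far as I know, `_itoa_s` returns an error code rather than a pointer to the buffer, so an expression like `sub += _itoa_s(...)` would not append the digits. The sketch below uses `std::to_string` from C++11, which should be available in both GCC and Visual Studio 2013; `appendDecimal` is a hypothetical helper name, not something from the program in this thread.

```cpp
#include <string>

// Hypothetical helper: appends `value` in decimal to `out`.
// Intended as a drop-in for the `sub += _itoa(value, num, 10)` pattern,
// with no fixed-size buffer to overflow.
inline void appendDecimal(std::string& out, unsigned long value)
{
    out += std::to_string(value);
}
```

With this, a line such as `sub += _itoa(CONVDEC(6), num, 10);` could become `appendDecimal(sub, CONVDEC(6));`, and the `num` buffer would no longer be needed.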
