Generic command parser wanted
dude_1967
July 25th, 2002, 09:14 AM
Hi Gurus,
I would like to find a generic command parser for ASCII lines containing a variable number of string and number elements. Preferably it should be written in C++ and be very lean, with as little code as possible, ideally relying on the STL.
Does anyone have a finished parser or a link to one?
Alternatively, feel free to suggest a technique for writing my own simple and lean command parser using the STL. I suppose just a few lines of the right code using stringstreams would be enough for the simple commands to be parsed.
Included is a sample of the commands to be parsed.
They are really quite simple.
Thanks for any help.
Sincerely,
Chris
FA 0, 20000 C3
L 0,3FF0 ..\sample_files\Al01.hex
L 4000,4000 ..\sample_files\Al01.hex 4000
L 18000,8000 ..\sample_files\Al01.hex 18000
CC 0,3FF0 4000,4000 18000,7FFE 1FFFE
HC 0,3FF0 ..\sample_files\Al01_flash.hex
HA 4000,4000 ..\sample_files\Al01_flash.hex 4000
HA 18000,8000 ..\sample_files\Al01_flash.hex 18000
HE 0,0 ..\sample_files\Al01_flash.hex
Q
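For reference, the "few lines with stringstreams" idea could look roughly like the sketch below. It is only an illustration, not a finished parser: the names Command and split_command are made up here, and it assumes that commas only ever separate arguments and that file paths contain neither spaces nor commas.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type: one parsed command line.
struct Command {
    std::string name;               // e.g. "FA", "L", "CC"
    std::vector<std::string> args;  // remaining tokens, still as text
};

// Split one line into a command name and its arguments.
// Assumption: commas are pure separators, so they are turned into spaces.
// Returns false for a blank line.
bool split_command(std::string line, Command& out)
{
    std::replace(line.begin(), line.end(), ',', ' ');
    std::istringstream iss(line);
    out.name.clear();
    out.args.clear();
    if (!(iss >> out.name))
        return false;               // nothing but whitespace
    std::string tok;
    while (iss >> tok)
        out.args.push_back(tok);
    return true;
}
```

Reading a script is then a std::getline() loop that calls split_command() on each line. Converting the textual arguments to numbers is left to the caller.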
zdf
July 25th, 2002, 12:43 PM
Fine, Chris: I'll try to help you. Just give me some time.
Regards,
PaulWendt
July 25th, 2002, 01:03 PM
Here's a function that GNU put up on their gcc website a while back; I'm pretty sure you could use it to get what you want [if you want to]:
/*
* stringtok.h -- Breaks a string into tokens. This is an example for lib3.
*
* Template function looks like this:
*
* template <typename Container>
* void stringtok (Container &l,
* string const &s,
* char const * const ws = " \t\n");
*
* A nondestructive version of strtok() that handles its own memory and can
* be broken up by any character(s). Does all the work at once rather than
* in an invocation loop like strtok() requires.
*
* Container is any type that supports push_back(a_string), although using
* list<string> and deque<string> are indicated due to their O(1) push_back.
* (I prefer deque<> because op[]/at() is available as well.) The first
* parameter references an existing Container.
*
* s is the string to be tokenized. From the parameter declaration, it can
* be seen that s is not affected. Since references-to-const may refer to
* temporaries, you could use stringtok(some_container, readline("")) when
* using the GNU readline library.
*
* The final parameter is an array of characters that serve as whitespace.
* Whitespace characters default to one or more of tab, space, and newline,
* in any combination.
*
* 'l' need not be empty on entry. On return, 'l' will have the token
* strings appended.
*
*
* [Example:
* list<string> ls;
* stringtok (ls, " this \t is\t\n a test ");
* for (list<string>::const_iterator i = ls.begin();
* i != ls.end(); ++i)
* {
* cerr << ':' << (*i) << ":\n";
* }
*
* would print
* :this:
* :is:
* :a:
* :test:
* -end example]
*
* pedwards@jaj.com May 1999
*/
#include <string>
#include <cstring> // for strchr
/*****************************************************************
* This is the only part of the implementation that I don't like.
* It can probably be improved upon by the reader...
*/
namespace {
    inline bool
    isws (char c, char const * const wstr)
    {
        // strchr lives in namespace std when included via <cstring>
        return (std::strchr(wstr,c) != NULL);
    }
}
/*****************************************************************
* Simplistic and quite Standard, but a bit slow. This should be
* templatized on basic_string instead, or on a more generic StringT
* that just happens to support ::size_type, .substr(), and so on.
* I had hoped that "whitespace" would be a trait, but it isn't, so
* the user must supply it. Enh, this lets them break up strings on
* different things easier than traits would anyhow.
*/
template <typename Container>
void
stringtok (Container &l, std::string const &s, char const * const ws = " \t\n")
{
    const std::string::size_type S = s.size();
    std::string::size_type i = 0;
    while (i < S) {
        // eat leading whitespace
        while ((i < S) && (isws(s[i],ws))) ++i;
        if (i == S) return; // nothing left but WS
        // find end of word
        std::string::size_type j = i+1;
        while ((j < S) && (!isws(s[j],ws))) ++j;
        // add word
        l.push_back(s.substr(i,j-i));
        // set up for next loop
        i = j+1;
    }
}
It might not be able to help you, but it seems fairly generic; you could probably use it as a building block for what you need if you so desire.
--Paul
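As a rough illustration of using it as a building block: passing " ,\t\n" as the separator set splits the sample commands on blanks and commas in one go. The sketch below carries a condensed, std-qualified copy of stringtok() so that it compiles on its own; tokenize_command is a made-up wrapper name, and treating the comma as a plain separator is an assumption about the command syntax.

```cpp
#include <cstring>
#include <deque>
#include <string>

// Condensed copy of the stringtok() shown above, with std:: qualification.
template <typename Container>
void stringtok(Container& l, std::string const& s,
               char const* const ws = " \t\n")
{
    const std::string::size_type S = s.size();
    std::string::size_type i = 0;
    while (i < S) {
        // skip leading separators
        while (i < S && std::strchr(ws, s[i]) != NULL) ++i;
        if (i == S) return;
        // find end of token
        std::string::size_type j = i + 1;
        while (j < S && std::strchr(ws, s[j]) == NULL) ++j;
        l.push_back(s.substr(i, j - i));
        i = j + 1;
    }
}

// Hypothetical wrapper: split a command line on blanks and commas.
std::deque<std::string> tokenize_command(std::string const& line)
{
    std::deque<std::string> tokens;
    stringtok(tokens, line, " ,\t\n");
    return tokens;
}
```

The first token is then the command name and the rest are its arguments. Paths that contain spaces or commas would be split apart by this approach.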
zdf
July 25th, 2002, 03:02 PM
The syntax of your commands is not very clear to me. The code below was written quickly and is only a starting point (certainly not the best solution). I hope it helps.
#include <cctype>   // isspace, isalpha, isalnum
#include <string>
#include <fstream>
#include <iostream>
#include <vector>
using namespace std;
class Parser
{
public:
    Parser();
    virtual ~Parser() {}
    bool run( istream& a_is = cin );
protected:
    // protected, so that a derived class can override process()
    struct Argument
    {
        Argument() : m_number( 0 ) {}
        string m_string; // if string empty then it is number
        int m_number;
    };
    virtual void process( const string& a_cmd_name, const vector<Argument>& a_cmd_args );
private:
    Parser( const Parser& ); // not copyable: declared, never defined
    enum { tkEOF, tkEnter, tkCommand, tkNumber, tkString, tkArgSeparator };
    int get_token();
    void command();
    istream* m_is; // pointer, so that run() can select the stream
    int m_last_token;
    int m_last_number;
    string m_last_string;
    int m_line_count;
};
Parser::Parser() : m_is( &cin ), m_last_token( tkEOF ), m_last_number( 0 ), m_line_count( 0 )
{
}
bool Parser::run( istream& a_is )
{
    // init
    m_is = &a_is;
    m_line_count = 0;
    //
    try
    {
        while ( get_token() != tkEOF )
            if ( m_last_token != tkEnter ) // skip blank lines
                command();
        return true;
    }
    catch( const char* s ) // string literals are thrown as const char*
    {
        cerr << "e r r o r : " << s << endl;
        return false;
    }
}
void Parser::command()
{
    // get the command name
    if ( m_last_token != tkString )
        throw "command name expected"; // todo: your exception
    string cmd_name = m_last_string;
    // get the command arguments
    vector<Argument> cmd_args;
    while ( get_token() != tkEOF )
    {
        switch ( m_last_token )
        {
        case tkArgSeparator: // same as ws
            // nop
            break;
        case tkString:
            {
                Argument arg;
                arg.m_string = m_last_string;
                cmd_args.push_back( arg );
            }
            break; // without this break the string was also pushed as a number
        case tkNumber:
            {
                Argument arg;
                arg.m_number = m_last_number;
                cmd_args.push_back( arg );
            }
            break;
        case tkEnter:
            process( cmd_name, cmd_args );
            return;
        default:
            break;
        }
    }
    // end of input reached: the last line had no trailing newline
    process( cmd_name, cmd_args );
}
void Parser::process( const string& a_cmd_name, const vector<Argument>& a_cmd_args )
{
    cout << "p r o c e s s i n g . . . " << a_cmd_name << endl;
}
int Parser::get_token()
{
    // alias
    istream& is = *m_is;
    // last char
    char c;
    // eat white (but not the newline, which ends a command)
    do
    {
        if ( ! is.get(c) )
            return m_last_token = tkEOF;
    } while ( c != '\n' && isspace( static_cast<unsigned char>(c) ) );
    // analyze
    switch ( c )
    {
    case '\n':
        ++m_line_count;
        return m_last_token = tkEnter;
    case ',':
        return m_last_token = tkArgSeparator;
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
        // note: a hex number starting with A-F (e.g. C3) ends up as a string
        is.putback(c);
        if ( ! ( is >> hex >> m_last_number ) )
            throw "bad number"; // todo: your exception
        return m_last_token = tkNumber;
    default: // todo: get path as well
        if ( ! isalpha( static_cast<unsigned char>(c) ) )
            throw "bad token"; // todo: your exception
        m_last_string = c;
        while ( is.get(c) && isalnum( static_cast<unsigned char>(c) ) )
            m_last_string += c;
        if ( is ) // put the terminator back only if one was actually read
            is.putback(c);
        return m_last_token = tkString;
    }
}
int main()
{
    Parser().run();
    return 0;
}
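One point to watch in get_token(): number versus string is decided from the first character alone, so a token such as C3 is read as a string and a path such as ..\sample_files\Al01.hex is rejected. A possible alternative, sketched here with a made-up helper name parse_hex, is to split the line into whole tokens first and then test each one. The sketch assumes that every argument consisting only of hex digits is meant as a number, which may not hold for your commands.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: succeeds only if the whole token consists of hex digits.
// On success the converted value is stored in 'value'.
// Note: overflow of unsigned long is not detected in this sketch.
bool parse_hex(std::string const& tok, unsigned long& value)
{
    if (tok.empty())
        return false;
    for (std::string::size_type i = 0; i < tok.size(); ++i)
        if (!std::isxdigit(static_cast<unsigned char>(tok[i])))
            return false;           // a path or name, not a number
    value = std::strtoul(tok.c_str(), 0, 16);
    return true;
}
```

Anything that parse_hex() rejects can be stored as a string argument, which also covers the file paths. The command name itself should not be run through it, since a name like FA consists only of hex digits.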
Regards,
dude_1967
July 26th, 2002, 02:04 AM
Thanks for the suggestions.
These are both good solutions.
Chris.
:)