  1. #1
    Join Date: Mar 2008
    Posts: 161

    [RESOLVED] Regular Expression in C#

    I am writing a program in C# that reads an HTML file that another one of my programs has output.

    Code:
    <input type="hidden" name="value5" value="1247555244" />
    What I need to find is what value equals, e.g. I need something that returns "1247555244" and only that.

    Please keep in mind that there is a lot of other HTML in the file besides this. I am new to regex, and I need someone to show me how to write a regular expression for use in C# that finds a value made up entirely of digits and returns only those digits.

    Thanks.

  2. #2
    Join Date: Jul 2006
    Posts: 297

    Re: Regular Expression in C#

    Well, what I suggest is downloading a program called RegexBuddy. Anyone just starting out with regular expressions should use it. It lets you build an expression step by step and shows you what each of the different characters does. It also has a bunch of presets for things like email addresses and phone numbers.

    As for your particular problem, this might work. I would need a larger sample of your HTML to make sure that the regex finds all the cases and is strict enough to find only what you're looking for. The expression I used should find the value attribute of any HTML tag whose value consists only of digits.

    Code:
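            // Requires: using System; using System.Text.RegularExpressions;
            // The named group "value" captures only the digits inside value="...".
            Match m = Regex.Match(inputString, @"<([\w]+) (.+)?value=""(?<value>[\d]+)""([^/>]+)?/?>");
            while (m.Success)
            {
                // Place your code here
                Int32 value = Convert.ToInt32(m.Groups["value"].Value);

                m = m.NextMatch();
            }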
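
    To make the pieces of that expression easier to follow, here is a minimal sketch that writes the same pattern in verbose mode (RegexOptions.IgnorePatternWhitespace) with one comment per piece. The class and method names (AnnotatedPattern, ExtractValue) are made up for illustration, and the literal space is written as [ ] because verbose mode ignores unescaped whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; the names are not from the thread.
public static class AnnotatedPattern
{
    // Same expression as in the post, split across lines so each piece has a comment.
    public const string Pattern = @"
        <([\w]+)[ ]                  # opening '<', the tag name, then one space
        (.+)?                        # optionally, any earlier attributes (greedy)
        value=""(?<value>[\d]+)""    # a value attribute made only of digits, captured as 'value'
        ([^/>]+)?                    # optionally, anything that is not '/' or '>'
        /?>                          # optional '/', then the closing '>'
    ";

    // Returns only the digits captured by the named group, or null when nothing matches.
    public static string ExtractValue(string html)
    {
        Match m = Regex.Match(html, Pattern, RegexOptions.IgnorePatternWhitespace);
        return m.Success ? m.Groups["value"].Value : null;
    }

    public static void Main()
    {
        string html = "<input type=\"hidden\" name=\"value5\" value=\"1247555244\" />";
        Console.WriteLine(ExtractValue(html));
    }
}
```

    Note that (.+)? is greedy, so on a line holding several tags the whole match can span more than one tag. The named group still holds only the digits.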

  3. #3
    Join Date: Mar 2008
    Posts: 161

    Re: Regular Expression in C#

    Quote: Originally Posted by monalin
    Well, what I suggest is downloading a program called RegexBuddy. Anyone just starting out with regular expressions should use it. It lets you build an expression step by step and shows you what each of the different characters does. It also has a bunch of presets for things like email addresses and phone numbers.

    As for your particular problem, this might work. I would need a larger sample of your HTML to make sure that the regex finds all the cases and is strict enough to find only what you're looking for. The expression I used should find the value attribute of any HTML tag whose value consists only of digits.

    Code:
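            // Requires: using System; using System.Text.RegularExpressions;
            // The named group "value" captures only the digits inside value="...".
            Match m = Regex.Match(inputString, @"<([\w]+) (.+)?value=""(?<value>[\d]+)""([^/>]+)?/?>");
            while (m.Success)
            {
                // Place your code here
                Int32 value = Convert.ToInt32(m.Groups["value"].Value);

                m = m.NextMatch();
            }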


    That's almost what I need, but it returns

    Code:
    "<input type=\"hidden\" name=\"value25\" value=\"1247605870\" />"
    and I need it to return only the number "1247605870".

    Thanks for the tip about the newbie regular expression program.

  4. #4
    Join Date: Jun 2008
    Posts: 2,477

    Re: Regular Expression in C#

    If you need to parse different tags and whatnot, you should build your own parser. That lets you do things like read a tag and then get its attributes (e.g., "value") as properties of your class. You will need more regular expressions than just the one that parses out "value", so I would approach it from that angle. The parsing would be done by your HTMLDocument class (I made up the name, obviously). You cannot simply parse HTML as XML, because HTML does not follow the XML spec.

  5. #5
    Join Date: Mar 2008
    Posts: 161

    Re: Regular Expression in C#

    Quote: Originally Posted by BigEd781
    If you need to parse different tags and whatnot, you should build your own parser. That lets you do things like read a tag and then get its attributes (e.g., "value") as properties of your class. You will need more regular expressions than just the one that parses out "value", so I would approach it from that angle. The parsing would be done by your HTMLDocument class (I made up the name, obviously). You cannot simply parse HTML as XML, because HTML does not follow the XML spec.

    I don't need to find just "value"; I need to find a value that has only numbers in it.

    And I know for a fact that there will be only one value= with nothing but digits after it.
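
    Under that assumption (exactly one value= followed by nothing but digits), a much smaller expression may be enough. The following is only a sketch: the names DigitValueFinder and TryFindDigitValue are made up, and it parses into a long with long.TryParse so that a digit string too large for Int32 gives false rather than an exception.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; the names are not from the thread.
public static class DigitValueFinder
{
    // Matches value="<digits>" only. The \b stops it from matching the tail of a
    // longer word such as myvalue, and [0-9] restricts the match to ASCII digits.
    private static readonly Regex DigitValue =
        new Regex(@"\bvalue=""(?<digits>[0-9]+)""");

    // Returns true and sets 'number' when an all-digit value attribute is found
    // and fits in a long; returns false otherwise.
    public static bool TryFindDigitValue(string html, out long number)
    {
        number = 0;
        Match m = DigitValue.Match(html);
        return m.Success && long.TryParse(m.Groups["digits"].Value, out number);
    }

    public static void Main()
    {
        string html =
            "<input type=\"text\" name=\"user\" value=\"abc123\" />\n" +
            "<input type=\"hidden\" name=\"value5\" value=\"1247555244\" />";

        long number;
        if (TryFindDigitValue(html, out number))
            Console.WriteLine(number);
        else
            Console.WriteLine("No all-digit value attribute found.");
    }
}
```

    Neither name="value5" nor value="abc123" matches here, because the pattern requires value= to be followed directly by a quoted run of digits.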

  6. #6
    Join Date: Jul 2006
    Posts: 297

    Re: Regular Expression in C#

    Quote: Originally Posted by Pale
    That's almost what I need, but it returns

    Code:
    "<input type=\"hidden\" name=\"value25\" value=\"1247605870\" />"
    and I need it to return only the number "1247605870".

    Thanks for the tip about the newbie regular expression program.

    I just tried it again. When I run it, the Int32 value gets assigned 1247605870. The only group that returns the whole string is group 0:

    Code:
    m.Groups[0].Value
    Also, it's not a newbie regular expression program. I still use it every day. I don't use the regex generator; I type the expression in myself, but it lets me easily test each regex I write.
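
    To make the difference concrete, here is a small sketch that runs the same expression against the tag from the earlier post and reads both the whole match and the named group. The class and method names (GroupDemo, WholeMatch, DigitsOnly) are made up for illustration. Match.ToString() also returns the whole match, so inspecting or printing m itself, rather than m.Groups["value"], would show the full tag; that may be what was happening, though the thread does not confirm it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo for illustration only; the names are not from the thread.
public static class GroupDemo
{
    private const string Pattern =
        @"<([\w]+) (.+)?value=""(?<value>[\d]+)""([^/>]+)?/?>";

    // Group 0 (the same text as Match.Value) is everything the expression matched.
    public static string WholeMatch(string html)
    {
        return Regex.Match(html, Pattern).Groups[0].Value;
    }

    // The named group holds only what (?<value>...) captured: the digits.
    public static string DigitsOnly(string html)
    {
        return Regex.Match(html, Pattern).Groups["value"].Value;
    }

    public static void Main()
    {
        string html = "<input type=\"hidden\" name=\"value25\" value=\"1247605870\" />";

        Console.WriteLine(WholeMatch(html));  // the entire <input ... /> tag
        Console.WriteLine(DigitsOnly(html));  // 1247605870
    }
}
```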

  7. #7
    Join Date: Jun 2008
    Posts: 2,477

    Re: Regular Expression in C#

    Quote: Originally Posted by Pale
    I don't need to find just "value"; I need to find a value that has only numbers in it.

    And I know for a fact that there will be only one value= with nothing but digits after it.

    I fail to see how that is relevant. When I say "value", I mean an attribute named "value" and its corresponding... value. You said yourself that you need to handle other types of attributes and tags, so why program for each individual case when you can simply create a routine that returns any attribute you ask for:

    Code:
    // HtmlDocument here is the made-up parser class described earlier, not a framework type.
    HtmlDocument myDoc = new HtmlDocument(myHtmlSource);
    int value = Int32.Parse(myDoc.Nodes("input").GetAttribute("value"));
    Nicer, eh? Now the parsing is confined to the HtmlDocument class, is reusable anywhere in your code, and can do more than one thing.
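
    For a rough idea of what such a class could look like, here is a minimal sketch. Everything in it is an assumption: the name SimpleHtmlDocument, the Nodes method returning a list of attribute dictionaries, and the helper FindDigitValue are all made up, and the API differs slightly from the two lines above. It is regex-based and not a real HTML parser, so it ignores comments and scripts and handles only double-quoted attribute values that contain no '>'.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the made-up HtmlDocument class; not a framework type.
public sealed class SimpleHtmlDocument
{
    // A tag name followed by everything up to the next '>'.
    private static readonly Regex TagPattern =
        new Regex(@"<(?<name>\w+)(?<attrs>[^>]*)>");

    // One key="value" pair with a double-quoted value.
    private static readonly Regex AttributePattern =
        new Regex(@"(?<key>[\w-]+)\s*=\s*""(?<val>[^""]*)""");

    private readonly string source;

    public SimpleHtmlDocument(string source)
    {
        this.source = source;
    }

    // Returns the attributes of every tag with the given name, in document order.
    public List<Dictionary<string, string>> Nodes(string tagName)
    {
        var nodes = new List<Dictionary<string, string>>();
        foreach (Match tag in TagPattern.Matches(source))
        {
            if (!string.Equals(tag.Groups["name"].Value, tagName,
                               StringComparison.OrdinalIgnoreCase))
                continue;

            var attributes = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            foreach (Match attr in AttributePattern.Matches(tag.Groups["attrs"].Value))
                attributes[attr.Groups["key"].Value] = attr.Groups["val"].Value;
            nodes.Add(attributes);
        }
        return nodes;
    }
}

public static class HtmlDocumentDemo
{
    // Returns the first all-digit value attribute of an <input> tag, or null if there is none.
    public static string FindDigitValue(string html)
    {
        foreach (var node in new SimpleHtmlDocument(html).Nodes("input"))
        {
            string value;
            if (node.TryGetValue("value", out value) && Regex.IsMatch(value, @"\A[0-9]+\z"))
                return value;
        }
        return null;
    }

    public static void Main()
    {
        string html =
            "<input type=\"text\" name=\"user\" value=\"abc123\" />\n" +
            "<input type=\"hidden\" name=\"value5\" value=\"1247555244\" />";
        Console.WriteLine(FindDigitValue(html));
    }
}
```

    The point of the design is the one made above: the regular expressions live in one class, and callers only ask for tags and attributes by name.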

  8. #8
    Join Date: Mar 2008
    Posts: 161

    Re: Regular Expression in C#

    Thanks a lot, guys.
