  1. #1
    Join Date: Aug 1999 | Location: Germany | Posts: 2,338

    String too long for Regex?

    Hello!

    I have a web page loaded into a string that is 1,262,594 characters long. I want to run a regex search on it to find all the links to a particular site, like this:

    Code:
    // html holds the downloaded page as a single string.
    // The pattern looks for <a ...> tags whose href is an http(s) URL containing example.com.
    string Pattern = "<a(.[^<>]*)href([ \\s]*)=([ \\s'\"]*?)(http://|https://)([^<>'\"\\?]*?)(example.com)([^'\"> ]*)(['\" ]*)(.*?)>(.*?)</a>";
    Regex myRegex = new Regex(Pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    Console.WriteLine("Done Regex");
    MatchCollection mc = myRegex.Matches(html);
    Console.WriteLine("Done Matches");
    if (mc.Count == 0) {
        Console.WriteLine("Done mc.Count");
    }
    Console.WriteLine("Done all");

    This works fine for shorter strings, but with the long string the program hangs: the main window just freezes, and no exception is thrown. I waited about 30 minutes and then killed the process.

    The output is:

    Done Regex
    Done Matches

    ... so it seems that the if (mc.Count == 0) check is where the program hangs.

    When I set a breakpoint at the if (mc.Count == 0) line and look at mc.Count in the Autos window, I get:
    Name: Count | Value: Function evaluation disabled because a previous function evaluation timed out. You must continue execution to reenable function evaluation. | Type: int
    Stepping one line further hangs the application as well.
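
    One possible reading of these symptoms, offered as a sketch rather than a confirmed diagnosis: Regex.Matches populates its MatchCollection lazily, so the call itself returns right away and the actual scan only runs when a member such as Count is read. That would explain why "Done Matches" is printed and the freeze shows up at mc.Count. The example below uses a hypothetical short input and a simplified pattern (both are assumptions, not the code above), and it passes a match timeout, which needs the Regex constructor overload available in .NET Framework 4.5 and later.

```csharp
using System;
using System.Text.RegularExpressions;

class LazyMatchesDemo
{
    static void Main()
    {
        // Hypothetical stand-in for the downloaded page.
        string html = "<a href=\"http://www.example.com/page\">link</a>";

        // Simplified pattern (an assumption, not the original one): it avoids
        // the nested lazy (.*?) groups and escapes the dot in example.com.
        var regex = new Regex(
            "<a\\s[^<>]*href\\s*=\\s*['\"]?(https?://[^'\"<> ]*example\\.com[^'\"<> ]*)",
            RegexOptions.IgnoreCase,
            TimeSpan.FromSeconds(5)); // match timeout (.NET Framework 4.5 or later)

        // Returns immediately: the collection is populated lazily.
        MatchCollection mc = regex.Matches(html);
        Console.WriteLine("Done Matches");

        try
        {
            // Reading Count forces the whole input to be scanned.
            Console.WriteLine("Matches found: " + mc.Count);
        }
        catch (RegexMatchTimeoutException)
        {
            // Thrown if a single match attempt exceeds the timeout.
            Console.WriteLine("Regex timed out");
        }
    }
}
```

    With a timeout in place, a pattern that backtracks heavily on a large input should surface as a RegexMatchTimeoutException instead of a frozen window. Whether the original pattern really backtracks that badly on the 1,262,594-character page would still need to be checked.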

    Any ideas about that?
    Last edited by martho; December 21st, 2009 at 08:58 AM.
