-
December 20th, 2009, 02:39 PM
#1
I need a little help with Regex
Hello
I want to do something that looks very easy, but because it needs Regex, I can't manage it.
I have some HTML in a string that contains links in this format:
Code:
<a href="/dir1/file">
Note: number of directories is not always just 1, it can be more.
and I want my program to change all occurrences of links in the format above to a format like this (it just adds ".html"):
Code:
<a href="/dir1/file.html">
It would be very easy to do if only String.Replace() allowed me to use wildcards. If it did, it would be probably as easy as that:
Code:
str = str.Replace("a href=\"*\">", "a href=\"*.html\">");
I tried to use Regex, doing it this way:
Code:
str = Regex.Replace(str, "<a href=\"(?<link>[a-zA-Z0-9_/-])\"", "<a href=${link}" + ".html\"");
but it doesn't work.
Any help would be greatly appreciated.
-
December 21st, 2009, 03:47 AM
#2
Re: I need a little help with Regex
Try this :
Code:
Regex regex = new Regex(@"<a\s+href=""(?<link>(/\w+)+)""\s*>");
This matches the link as one or more occurrences of "forward slash" followed by "one or more word characters".
You should also replace any literal spaces in the pattern with a whitespace match (i.e. \s+).
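As a rough sketch of how a pattern like this could be plugged into a replacement: the class name, method name and sample input below are made up for illustration, and the pattern assumes every link is made of "/word" segments and allows optional whitespace before the closing ">".

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkPatternDemo
{
    // Assumption: links look like <a href="/dir1/file">, i.e. one or more
    // "/word" segments. \s+ tolerates any whitespace between "a" and "href".
    public static readonly Regex LinkPattern =
        new Regex(@"<a\s+href=""(?<link>(/\w+)+)""\s*>");

    // Rewrites every matching link, appending ".html" to the captured path.
    public static string AddHtmlExtension(string html)
    {
        // ${link} inserts the text captured by the named group "link".
        return LinkPattern.Replace(html, "<a href=\"${link}.html\">");
    }

    public static void Main()
    {
        // Prints: <a href="/dir1/dir2/file.html">
        Console.WriteLine(AddHtmlExtension("<a href=\"/dir1/dir2/file\">"));
    }
}
```

Note that \w does not match a hyphen, so paths such as "/my-dir/file" would need a wider character class.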
Darwen.
-
December 21st, 2009, 05:58 AM
#3
Re: I need a little help with Regex
This works:
Code:
string input = "<a href=\"/dir1/file\">";
Regex pattern = new Regex("<a href=\"([/\\w+]*)\">");
Match m = pattern.Match(input);
if ( m.Success )
{
Console.WriteLine("New link is {0}", "<a href=\"" + pattern.Replace(input,m.Groups[1].Value + ".html\">"));
}
-
December 21st, 2009, 11:36 AM
#4
Re: I need a little help with Regex
Thank you very much for your help. I managed to adapt your regex (it's basically the same as mine except for the * and + difference) to work with Regex.Replace(). It now looks like this:
Code:
str2 = Regex.Replace(str2, "<a href=\"(?<link>[/\\w-+]+)\">", "<a href=\"${link}.html\">");
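For anyone who wants to try this on a longer string, here is a small sketch under a few assumptions: the class name, method name and sample HTML are invented for the example, and the hyphen is moved to the end of the character class so it cannot be read as a range. Regex.Replace() rewrites every match in the input, and a link that already ends in ".html" is skipped because "." is not in the character class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlLinkFixer
{
    // Appends ".html" to every href made only of '/', word characters, '+' and '-'.
    public static string Fix(string html)
    {
        return Regex.Replace(
            html,
            @"<a href=""(?<link>[/\w+-]+)"">",
            @"<a href=""${link}.html"">");
    }

    public static void Main()
    {
        // Hypothetical sample input with three links.
        string html =
            @"<a href=""/dir1/file"">one</a> " +
            @"<a href=""/dir1/dir2/other-page"">two</a> " +
            @"<a href=""/done.html"">three</a>";

        // The first two links gain ".html"; the third is left unchanged.
        Console.WriteLine(Fix(html));
    }
}
```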
If I ever have any problems with Regex, I'll be sure to write here. Again, thank you very much.
@Darwen
No matter which space I replace with \s+, the IDE gives me a warning about an unrecognized escape sequence.
-
December 21st, 2009, 05:14 PM
#5
Re: I need a little help with Regex
No matter which space I replace with \s+, the IDE gives me a warning about an unrecognized escape sequence.
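Not if you put an '@' at the front of the string, which turns it into a verbatim string literal, i.e. backslashes are not treated as escape characters. Inside a verbatim string a double quote is written as "" instead of \".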
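To illustrate the difference, here is a minimal sketch (the class and field names are only placeholders) that writes the same pattern both ways. In a regular string every regex backslash has to be doubled, which is why a bare \s is rejected by the compiler; in a verbatim string the backslashes are written once.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerbatimStringDemo
{
    // Regular string literal: backslashes are doubled, quotes are written as \".
    public const string Regular = "<a\\s+href=\"(?<link>[/\\w+-]+)\"\\s*>";

    // Verbatim string literal: backslashes are written once, quotes are doubled.
    public const string Verbatim = @"<a\s+href=""(?<link>[/\w+-]+)""\s*>";

    public static void Main()
    {
        // Both literals produce exactly the same pattern text.
        Console.WriteLine(Regular == Verbatim); // True

        // Either one can be passed to the Regex class.
        Console.WriteLine(Regex.IsMatch("<a  href=\"/dir1/file\">", Verbatim)); // True
    }
}
```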
Darwen.