  1. #1
    Join Date
    Oct 2011
    Posts
    2

    Html.SelectNodes Selecting Twice?

    Hi All,

    I'm probably doing something silly, but I just don't see it and have lost half my hair trying to figure it out. I'm using HtmlAgilityPack to select some nodes from a screen scrape I've done. Here is the first select.

    HtmlNodeCollection _Nodes = html.DocumentNode.SelectNodes("//tr[@class = 'RateRow ']");

    If I check _Nodes.Count on this collection, I get 6. (ALL GOOD!)

    Now I want to manipulate this node list and get some info back from each node by passing it to my method.

    foreach (var _Node in _Nodes)
    {
        var result = ManipulateNode(_Node); // set the return object here
    }

    In my ManipulateNode method, I take the node and get some "sub" nodes with this statement:

    HtmlNodeCollection _nodes = _Node.SelectNodes("//td[starts-with(@class,'week')]|//td[starts-with(@class,'sold')]");

    This is where I run into a problem. If I check _nodes.Count now (inside my method), it should give me 14 nodes (days). Instead I get 84, which is oddly 6 * 14, as if it is selecting from the entire list of nodes rather than only the one node I'm passing in.

    If I pass a string to my method (instead of using SelectNodes), all looks okay. The method is called six times, each time with only the node I want, and I can see that the inner HTML is correct.

    The problem seems to be when I then convert one of the six nodes passed in to a new HtmlNodeCollection. Then I end up with 84 nodes! ARGGGG.

    Any help or suggestions would be most appreciated.

    Cheers from Sydney.
    John

  2. #2
    Join Date
    Jan 2009
    Posts
    596

    Re: Html.SelectNodes Selecting Twice?

    Have you tried replacing the // (both of them) in

    Code:
    HtmlNodeCollection _nodes = _Node.SelectNodes("//td[starts-with(@class,'week')]|//td[starts-with(@class,'sold')]");
    with 'descendant::' (or with './/')? The // at the beginning of the query means to search from the root of the document, not from the node you call SelectNodes on.
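
    To illustrate the difference, here is a minimal sketch, assuming HtmlAgilityPack is referenced. The markup, the class name RelativeXPathDemo and the variable names are made up for the example and are not from the original scrape. It runs the original absolute query and a relative './/' version against each row and prints both counts.

    ```csharp
    using System;
    using HtmlAgilityPack;

    class RelativeXPathDemo
    {
        static void Main()
        {
            // Hypothetical markup: two rows with two matching cells each.
            var html = new HtmlDocument();
            html.LoadHtml(
                "<table>" +
                "<tr class='RateRow '><td class='week1'>a</td><td class='sold1'>b</td></tr>" +
                "<tr class='RateRow '><td class='week2'>c</td><td class='sold2'>d</td></tr>" +
                "</table>");

            HtmlNodeCollection rows = html.DocumentNode.SelectNodes("//tr[@class = 'RateRow ']");
            if (rows == null) return; // SelectNodes returns null when nothing matches

            foreach (HtmlNode row in rows)
            {
                // Absolute path: a leading // searches the whole document,
                // even though SelectNodes is called on a single row.
                HtmlNodeCollection absolute = row.SelectNodes(
                    "//td[starts-with(@class,'week')]|//td[starts-with(@class,'sold')]");

                // Relative path: the leading dot anchors the search at the current row.
                HtmlNodeCollection relative = row.SelectNodes(
                    ".//td[starts-with(@class,'week')]|.//td[starts-with(@class,'sold')]");

                Console.WriteLine(
                    $"absolute: {absolute?.Count ?? 0}, relative: {relative?.Count ?? 0}");
            }
        }
    }
    ```

    The absolute form should count the matching cells of every row on each pass, while the relative form should count only the cells inside the current row. Dropping the // entirely, as in "td[starts-with(@class,'week')]", is also relative, but it matches only direct children of the row.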

  3. #3
    Join Date
    Oct 2011
    Posts
    2

    Re: Html.SelectNodes Selecting Twice?

    Thanks Pete,

    I removed the // and that stopped it from selecting from the root node...

    I'm an 'xpath selector noob'.

    duh...

    I appreciate the reply.

    Cheerios from Sydney
