Fetching ul/li items under a div class using Html Agility Pack

.net c# html html-agility-pack

Question

<div class="outer">
    <div class="divOne"></div>
    <div class="divContent">
       <h3>SomeTitle</h3>
       <h4>SomeSubtitle</h4>
       <ul>
          <li><a href="/someUrlx.htm">SomeUrl</a>
               <span> Nr of records under this url </span>
          </li>
       </ul>
       <h4>Some Other Subtitle</h4>
       <ul>
          <li><a href="/someUrlx.htm">SomeUrl</a>
              <span> Nr of records under this url </span>
          </li>
       </ul>
     </div>
</div>

I want to collect every list item (li) from the unordered lists in the HTML above.

I can get the content of the div with class divContent using:

var regs = htmlDoc.DocumentNode.SelectSingleNode(@"//div[@class='outer']");

var descendant = regs.Descendants()
                    // GetAttributeValue avoids a NullReferenceException on a div that has no class attribute
                    .Where(x => x.Name == "div" && x.GetAttributeValue("class", "") == "divContent")
                    .Select(x => x.OuterHtml);

I now need an expression to get the li elements inside each ul.

8/23/2014 6:51:15 PM

Accepted Answer

This should work:

IEnumerable<string> listItemHtml = htmlDoc.DocumentNode.SelectNodes(
    @"//div[@class='outer']/div[@class='divContent']/ul/li")
    .Select(li => li.OuterHtml);

Example: https://dotnetfiddle.net/fnDPLB
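
One thing to watch with this approach: by default, SelectNodes returns null rather than an empty collection when nothing matches, so chaining .Select directly onto it can throw. Below is a minimal sketch of a null-safe variant that also pulls the href and link text out of each li. The class name ListItemReader and the method ReadLinks are hypothetical names made up for this illustration; only the XPath comes from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ListItemReader
{
    // Hypothetical helper (not part of Html Agility Pack): returns the href and
    // link text of the <a> inside each li under div.outer > div.divContent > ul.
    public static List<(string Href, string Text)> ReadLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // By default SelectNodes returns null, not an empty collection, on no match.
        HtmlNodeCollection items = doc.DocumentNode.SelectNodes(
            "//div[@class='outer']/div[@class='divContent']/ul/li");
        if (items == null)
            return new List<(string, string)>();

        return items
            .Select(li => li.SelectSingleNode("./a"))   // direct <a> child of the li
            .Where(a => a != null)                      // skip li elements without a link
            .Select(a => (a.GetAttributeValue("href", ""), a.InnerText.Trim()))
            .ToList();
    }
}
```

Calling ListItemReader.ReadLinks(html) on the markup from the question should give one entry per li, and an empty list if the page layout changes and the XPath stops matching.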


Update based on the comments below:

If you only want the <li> elements that belong to the <ul> directly following an <h4> element with the value "SomeSubtitle", the following XPath query should work:

//div[@class='outer']      // Get div.outer
/div[@class='divContent']  // under that div, find div.divContent
/h4[text()='SomeSubtitle'] // under div.divContent, find an h4 with the value 'SomeSubtitle'
/following::ul[1]/li       // Get the first ul following the h4 and then get its li elements.

Example: https://dotnetfiddle.net/AfinpV
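
XPath itself has no comment syntax, so the annotated steps above have to be joined into a single string before being passed to SelectNodes. A rough sketch of how that might look follows. SubtitleListReader and ItemsUnderSubtitle are hypothetical names for this illustration, and the code assumes the subtitle contains no single quote, since it is concatenated into the expression unescaped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class SubtitleListReader
{
    // Hypothetical helper: returns the OuterHtml of each li in the first ul
    // that follows the h4 whose text equals the given subtitle.
    public static List<string> ItemsUnderSubtitle(string html, string subtitle)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // The four annotated steps joined into one expression.
        // Assumption: subtitle contains no single quote.
        string xpath =
            "//div[@class='outer']" +
            "/div[@class='divContent']" +
            "/h4[text()='" + subtitle + "']" +
            "/following::ul[1]/li";

        HtmlNodeCollection items = doc.DocumentNode.SelectNodes(xpath);
        return items == null
            ? new List<string>()                       // no match: empty list instead of null
            : items.Select(li => li.OuterHtml).ToList();
    }
}
```

Note that following::ul[1] picks the first ul anywhere after the h4 in document order. If the ul is always a sibling of the h4, as in the question's markup, following-sibling::ul[1] is a slightly stricter alternative.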

8/23/2014 7:15:57 PM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow