I've just started using HTML Agility Pack, and I'm having trouble finding documentation for what I need.
I have the following HTML:
<div class="person"> <a href="blah1.html">Person 1</a> </div>
<div class="person"> <a href="blah2.html">Person 2</a> </div>
<div class="person"> <a href="blah3.html">Person 3</a> </div>
<div class="person"> <a href="blah4.html">Person 4</a> </div>
How can I use the parser to only retrieve links inside of divs with the class person?
HTML Agility Pack (available on NuGet) allows for:
using System.Linq;
using HtmlAgilityPack;

HtmlDocument html = new HtmlDocument();
html.Load(path_to_html); // or html.LoadHtml(html_string)
var links = html.DocumentNode
    .SelectNodes("//div[@class='person']/a") // <a> elements directly inside <div class="person">
    .Select(n => n.GetAttributeValue("href", null));
Result: "blah1.html" "blah2.html" "blah3.html" "blah4.html"
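One thing to watch for with the snippet above: SelectNodes returns null rather than an empty collection when nothing matches, so chaining .Select directly onto it can throw. The following is a minimal sketch of a null-safe variant, not a definitive implementation. The class name PersonLinks and the method name Extract are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class PersonLinks
{
    // Hypothetical helper: returns the href of every <a> that is a direct
    // child of a <div> whose class attribute is exactly "person".
    public static List<string> Extract(string htmlText)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlText);

        // SelectNodes yields null (not an empty collection) when there is no match.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//div[@class='person']/a");
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .Where(href => href.Length > 0) // skip anchors without an href
            .ToList();
    }

    public static void Main()
    {
        const string sample =
            "<div class=\"person\"><a href=\"blah1.html\">Person 1</a></div>" +
            "<div class=\"person\"><a href=\"blah2.html\">Person 2</a></div>";

        foreach (string href in Extract(sample))
        {
            Console.WriteLine(href);
        }
    }
}
```

Note that @class='person' compares the whole attribute value, so a div with class="person active" would not match. If that matters, an XPath 1.0 test such as contains(concat(' ', normalize-space(@class), ' '), ' person ') is a common workaround.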
Based on your description, the appropriate XPath returns the href attributes of the first a elements that are direct children of any div element whose class attribute equals "person".
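The XPath expression itself is not shown in this answer, so the one used below, //div[@class='person']/a[1], is reconstructed from the description and should be treated as an assumption. This sketch selects the first a child of each matching div with HTML Agility Pack and then reads href from the node. The class name FirstPersonLink and the method name Extract are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class FirstPersonLink
{
    // Assumed XPath, rebuilt from the prose description: the [1] predicate
    // keeps only the first <a> child within each matching <div>.
    private const string FirstLinkXPath = "//div[@class='person']/a[1]";

    public static List<string> Extract(string htmlText)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlText);

        // SelectNodes yields null when nothing matches, so guard before using LINQ.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(FirstLinkXPath);
        if (nodes == null)
        {
            return new List<string>();
        }

        // Read the attribute from each selected element.
        return nodes
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .ToList();
    }

    public static void Main()
    {
        // The first div holds two links; only its first one should be selected.
        const string sample =
            "<div class=\"person\"><a href=\"first1.html\">A</a><a href=\"second1.html\">B</a></div>" +
            "<div class=\"person\"><a href=\"first2.html\">C</a></div>";

        foreach (string href in Extract(sample))
        {
            Console.WriteLine(href);
        }
    }
}
```

For the HTML in the question, where each div contains a single link, this selects the same nodes as //div[@class='person']/a. The [1] predicate only makes a difference when a div holds more than one link.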
If you are more comfortable with jQuery-style selectors, you may want to try CsQuery instead of HTML Agility Pack.
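As a rough sketch of what that could look like, assuming the CsQuery NuGet package and its CQ.Create factory, selector indexer, and GetAttribute method, the child selector div.person > a plays the role of the XPath used with HTML Agility Pack. The class name CsQueryPersonLinks and the method name Extract are hypothetical, and the calls are worth checking against the CsQuery version you install.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using CsQuery; // NuGet package: CsQuery (assumed to be installed)

public static class CsQueryPersonLinks
{
    public static List<string> Extract(string htmlText)
    {
        // Parse the markup into a CQ object, which behaves much like a jQuery set.
        CQ dom = CQ.Create(htmlText);

        // "div.person > a" selects <a> elements that are direct children
        // of a <div> carrying the class "person".
        return dom["div.person > a"]
            .Select(a => a.GetAttribute("href"))
            .Where(href => !string.IsNullOrEmpty(href))
            .ToList();
    }

    public static void Main()
    {
        const string sample =
            "<div class=\"person\"><a href=\"blah1.html\">Person 1</a></div>" +
            "<div class=\"person\"><a href=\"blah2.html\">Person 2</a></div>";

        foreach (string href in Extract(sample))
        {
            Console.WriteLine(href);
        }
    }
}
```

Unlike @class='person' in XPath, the .person class selector also matches elements that carry additional classes, such as class="person active".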