I am very new to HTML Agility Pack and am having trouble finding documentation.
I have the following HTML:
    <div class="person">
        <a href="blah1.html">Person 1</a>
    </div>
    <div class="person">
        <a href="blah2.html">Person 2</a>
    </div>
    <div class="person">
        <a href="blah3.html">Person 3</a>
    </div>
    <div class="person">
        <a href="blah4.html">Person 4</a>
    </div>
Using the parser, how can I grab only the links inside a div that has the class person?
With Html Agility Pack (available on NuGet):
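    using System.Linq;
    using HtmlAgilityPack;

    HtmlDocument html = new HtmlDocument();
    html.Load(path_to_html); // or html.LoadHtml(html_string)

    // Select every <a> that is a direct child of <div class="person">,
    // then project each node to its href value.
    var links = html.DocumentNode
        .SelectNodes("//div[@class='person']/a")
        .Select(n => n.GetAttributeValue("href", null));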
Output:
    "blah1.html"
    "blah2.html"
    "blah3.html"
    "blah4.html"
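One thing to watch for with this approach: SelectNodes returns null rather than an empty collection when nothing matches, so chaining .Select directly onto it throws on pages that have no matching div. A minimal sketch of a guarded version follows. The inline HTML string is a made-up sample that deliberately contains no "person" div, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

var html = new HtmlDocument();
// Hypothetical input with no <div class="person">, to exercise the null path.
html.LoadHtml("<div class=\"other\"><a href=\"x.html\">X</a></div>");

// SelectNodes yields null (not an empty collection) when there is no match.
var nodes = html.DocumentNode.SelectNodes("//div[@class='person']/a");

// Fall back to an empty list so callers can always enumerate the result.
List<string> links = nodes == null
    ? new List<string>()
    : nodes.Select(n => n.GetAttributeValue("href", "")).ToList();

Console.WriteLine(links.Count);
```

With the sample above the list is empty; with the HTML from the question it would hold the four href values.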
The following XPath corresponds to your description:
    //div[@class='person']/a[1]/@href
It will return the href attributes of the first a elements that reside directly under any div element whose class attribute is equal to person.
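Note that "equal to" here is an exact string comparison on the whole class attribute, so a div with class="person active" would not match. If that matters, the usual XPath 1.0 idiom is a token test built from concat, normalize-space and contains. The sketch below compares the two forms. The sample markup (with an extra class and an extra link) is invented for illustration, and the code selects the a elements and reads href through GetAttributeValue rather than selecting the attribute in the XPath itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical sample: the second div has an extra class and a second link.
const string sample = @"
<div class=""person""><a href=""blah1.html"">Person 1</a></div>
<div class=""person active""><a href=""blah2.html"">Person 2</a><a href=""extra.html"">More</a></div>";

var doc = new HtmlDocument();
doc.LoadHtml(sample);

// Exact match: the class attribute must be the literal string "person".
IEnumerable<HtmlNode> exact =
    doc.DocumentNode.SelectNodes("//div[@class='person']/a[1]")
    ?? Enumerable.Empty<HtmlNode>();

// Token match: "person" may be one of several space-separated classes.
IEnumerable<HtmlNode> token =
    doc.DocumentNode.SelectNodes(
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' person ')]/a[1]")
    ?? Enumerable.Empty<HtmlNode>();

// a[1] keeps only the first <a> child of each matching div.
var exactHrefs = exact.Select(a => a.GetAttributeValue("href", "")).ToList();
var tokenHrefs = token.Select(a => a.GetAttributeValue("href", "")).ToList();

Console.WriteLine(string.Join(", ", exactHrefs));
Console.WriteLine(string.Join(", ", tokenHrefs));
```

The exact form should pick up only the first div's link, while the token form should also include the first link of the "person active" div and skip its second link.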
If you are more comfortable with jQuery style selectors, take a look at using CsQuery instead of the HTML Agility Pack.
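For comparison, a rough sketch of the same query with CsQuery, assuming the CsQuery NuGet package and its CQ.Create factory and selector indexer. The shortened two-div input is just a sample, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Linq;
using CsQuery;

// Shortened sample of the markup from the question.
CQ dom = CQ.Create(
    @"<div class=""person""> <a href=""blah1.html"">Person 1</a> </div>
      <div class=""person""> <a href=""blah2.html"">Person 2</a> </div>");

// "div.person > a" is the CSS equivalent of //div[...]/a:
// anchors that are direct children of a div carrying the class person.
var links = dom["div.person > a"]
    .Select(a => a.GetAttribute("href"))
    .ToList();

foreach (var link in links)
{
    Console.WriteLine(link);
}
```

Unlike the exact-match XPath, the CSS class selector matches a div that carries additional classes alongside person.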