HtmlAgilityPack: get all elements by class

.net asp.net c# html-agility-pack regex

Question

I have an HTML document, and I need to get some nodes by class. I can't work out how to do it because:

  1. I don't know XPath
  2. The items I need have no ID, only a class
  3. HtmlAgilityPack doesn't seem to let me get all elements (as XDocument does); doc.Elements() only works if I have an ID, which I don't. And since I don't know XPath, I can't use the SelectNodes method
  4. I can't use regular expressions

My code was:

public static class HapHelper
{
    private static HtmlNode GetByAttribute(this IEnumerable<HtmlNode> htmlNodes, string attribute, string value)
    {
        return htmlNodes.First(d => d.HasAttribute(attribute) && d.Attributes[attribute].ToString() == value);
    }

    public static HtmlNode GetElemenyByAttribute(this HtmlNode parentNode, string attribute, string value)
    {
        return GetByAttribute(parentNode.Descendants(), attribute, value);
    }

    public static bool HasAttribute(this HtmlNode d, string attribute)
    {
        return d.Attributes.Contains(attribute);
    }

    public static HtmlNode GetElementByClass(this HtmlNode parentNode, string value)
    {
        return parentNode.GetElemenyByAttribute("class", value);
    }
}

But it doesn't work; Descendants() seems to return only the nearest nodes.

What can I do?

Accepted Answer

Learn XPath! :-) It's really simple, and will serve you well. In this case, what you want is:

SelectNodes("//*[@class='" + classValue + "']") ?? Enumerable.Empty<HtmlNode>();
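
To put that one-liner in context, here is a minimal sketch, not a definitive implementation. The class name ClassLookupSketch, its three method names, and the sample HTML are made up for illustration. It assumes the class name contains no quote characters (it is concatenated straight into the XPath string) and that you call SelectNodes on a node such as doc.DocumentNode. The first method is the exact-match query from above, which only finds elements whose class attribute is exactly the given string. The second uses the common XPath 1.0 idiom for matching one class among several. The third does the same with LINQ over Descendants(), which in HtmlAgilityPack walks the whole subtree, not just direct children.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class; names are illustrative, not part of HtmlAgilityPack.
public static class ClassLookupSketch
{
    // Exact match: the class attribute must equal classValue as a whole string.
    public static IEnumerable<HtmlNode> ByExactClass(HtmlNode root, string classValue)
    {
        // ".//" keeps the search relative to root; SelectNodes returns null on no match.
        return root.SelectNodes(".//*[@class='" + classValue + "']")
               ?? Enumerable.Empty<HtmlNode>();
    }

    // Token match: also finds class="note big" when asked for "note".
    public static IEnumerable<HtmlNode> ByClassToken(HtmlNode root, string className)
    {
        string xpath = ".//*[contains(concat(' ', normalize-space(@class), ' '), ' "
                       + className + " ')]";
        return root.SelectNodes(xpath) ?? Enumerable.Empty<HtmlNode>();
    }

    // Same token match without XPath, using LINQ over every descendant node.
    public static IEnumerable<HtmlNode> ByClassLinq(HtmlNode root, string className)
    {
        char[] separators = { ' ', '\t', '\r', '\n' };
        return root.Descendants()
                   .Where(n => n.GetAttributeValue("class", "")
                                .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                                .Contains(className));
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<div><p class='note'>a</p><span><b class='note big'>b</b></span></div>");

        Console.WriteLine(ByExactClass(doc.DocumentNode, "note").Count());
        Console.WriteLine(ByClassToken(doc.DocumentNode, "note").Count());
        Console.WriteLine(ByClassLinq(doc.DocumentNode, "note").Count());
    }
}
```

If the helper in the question still finds nothing, one thing worth checking is whether it compares the attribute's Value property rather than the result of ToString().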


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow