HtmlAgilityPack, using XPath contains method and predicates

contains html-agility-pack xpath

Question

HtmlAgilityPack, using XPath contains method

I'm using HtmlAgilityPack and I need to check whether a class attribute contains a specific word. I have this page:

<div class="yom-mod yom-art-content "><div class="bd">
<p class="first"> ....................
  </p>
</div>
</div>

I'm doing this:

HtmlDocument doc2 = ...;
List<string> paragraphs = doc2.DocumentNode.SelectNodes("//div[@class = 'yom-mod yom-art-content ']//p").Select(paragraphNode => paragraphNode.InnerHtml).ToList();

But that is too specific. What I need is something like this:

List<string> paragraphs = doc2.DocumentNode.SelectNodes("//div[contains(@class, 'yom-art-content']//p").Select(paragraphNode => paragraphNode.InnerHtml).ToList();

But it doesn't work. Please help me.

1
5
2/4/2013 7:12:00 PM

Accepted Answer

Perhaps the issue is simply that you're missing the closing parenthesis on the contains() function:

//div[contains(@class, 'yom-art-content']//p
                                        v
//div[contains(@class, 'yom-art-content')]//p


List<string> paragraphs = 
        doc2.DocumentNode.SelectNodes("//div[contains(@class, 'yom-art-content')]//p")
            .Select(paragraphNode => paragraphNode.InnerHtml).ToList();

As a general suggestion, please explain what you mean when you say things like "it didn't work". I suspect you're getting an error message that might help track down the issue?
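
Two further things may be worth guarding against once the parenthesis is fixed. In HtmlAgilityPack, SelectNodes returns null rather than an empty collection when nothing matches, so chaining .Select directly onto it can throw a NullReferenceException. Also, contains(@class, 'yom-art-content') is a substring test, so it would also match a class such as 'yom-art-content-old'. Below is a minimal sketch that handles both, assuming the HtmlAgilityPack NuGet package is referenced. ParagraphExtractor and GetParagraphs are hypothetical names used only for illustration, and the class name is assumed to contain no quote characters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ParagraphExtractor
{
    // Hypothetical helper, not part of HtmlAgilityPack.
    // Returns the inner HTML of every <p> under a <div> carrying the given class.
    public static List<string> GetParagraphs(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Pad the normalized class list with spaces so that only a whole
        // class token matches, not a substring of a longer class name.
        // Assumption: className contains no quote characters.
        string xpath =
            "//div[contains(concat(' ', normalize-space(@class), ' '), ' "
            + className + " ')]//p";

        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);

        // SelectNodes returns null when there is no match.
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(p => p.InnerHtml).ToList();
    }
}
```

Calling ParagraphExtractor.GetParagraphs(html, "yom-art-content") on the markup from the question should return the paragraph contents, and an empty list instead of an exception when the div is absent. If the substring behaviour is acceptable, the simpler contains(@class, 'yom-art-content') expression shown above is enough.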

17
12/24/2013 2:59:03 PM

Popular Answer

Instead of using HAP for this, look into CsQuery, which provides jQuery-style selectors.

It looks particularly well suited to what you are trying to do.

CsQuery is a jQuery port for .NET 4. It implements all CSS2 & CSS3 selectors, all the DOM manipulation methods of jQuery, and some of the utility methods. The majority of the jQuery test suite (as of 1.6.2) has been ported to C#.
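
For comparison, here is a rough sketch of what the same query might look like with a CSS selector, assuming the CsQuery NuGet package is referenced. The calls used (CQ.Create, the selector indexer, and IDomObject.InnerHTML) are as I recall them from the CsQuery documentation, so check them against the version you install. CsQueryParagraphs and GetParagraphs are hypothetical names.

```csharp
using System.Collections.Generic;
using CsQuery;

public static class CsQueryParagraphs
{
    // Hypothetical helper, not part of CsQuery.
    // Collects the inner HTML of each <p> inside a div with class yom-art-content.
    public static List<string> GetParagraphs(string html)
    {
        CQ dom = CQ.Create(html);
        var result = new List<string>();

        // A class selector matches whole class tokens, so neither the
        // trailing space nor the other classes on the div matter here.
        foreach (IDomObject p in dom["div.yom-art-content p"])
        {
            result.Add(p.InnerHTML);
        }

        return result;
    }
}
```

A selection with no matches is simply empty, so the loop body is skipped and no null check is needed.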




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow