How to get all HTML tags that contain a specific string in their attribute values using Html Agility Pack?

c# html html-agility-pack

Question

I'm trying to get all HTML elements that have a certain string in their attribute values from the HTML below.

<meta name="DCSext.oo_market" content="en-us">
<a href="http://office.microsoft.com/en-us/support/" title="Find help for Word">
<a href="http://windows.microsoft.com/en-us/windows-live/microsoft-account-help#microsoft-account=tab1" title="Microsoft Account">

I want every HTML tag that has "en-us" in any of its attribute values, so my output should include all of the elements above. Could someone tell me how to do this with Html Agility Pack?

1
1
10/30/2013 2:13:28 PM

Accepted Answer

You can use the XPath //*[@*[contains(., 'en-us')]]. It selects every element that has at least one attribute whose value contains the string en-us:

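// requires: using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.Load(path_to_html_file);
// Note: SelectNodes returns null (not an empty collection) when nothing matches.
var nodes = doc.DocumentNode.SelectNodes("//*[@*[contains(., 'en-us')]]");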

or using LINQ:

var nodes = doc.DocumentNode.Descendants()
               .Where(n => n.Attributes.Any(a => a.Value.Contains("en-us")));
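
For anyone who wants to try the LINQ variant end to end, here is a minimal sketch, assuming the Html Agility Pack NuGet package is referenced. The class AttributeSearch and the method FindByAttributeValue are hypothetical names made up for illustration, not part of the library, and the third sample element (with an example.com URL) is invented so that there is something that should not match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class AttributeSearch
{
    // Hypothetical helper (not part of Html Agility Pack): returns every element
    // that has at least one attribute whose value contains `needle`.
    public static List<HtmlNode> FindByAttributeValue(string html, string needle)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of loading a file

        return doc.DocumentNode.Descendants()
                  // skip text and comment nodes; only elements carry attributes
                  .Where(n => n.NodeType == HtmlNodeType.Element)
                  // ordinal, case-sensitive substring match, as in the answer above
                  .Where(n => n.Attributes.Any(a => a.Value.Contains(needle)))
                  .ToList();
    }

    public static void Main()
    {
        // The first two elements come from the question; the third is an
        // invented non-matching element added for contrast.
        const string html = @"<meta name=""DCSext.oo_market"" content=""en-us"">
<a href=""http://office.microsoft.com/en-us/support/"" title=""Find help for Word"">Word</a>
<a href=""http://example.com/fr-fr/"" title=""Other"">Other</a>";

        foreach (var node in FindByAttributeValue(html, "en-us"))
            Console.WriteLine(node.Name);
    }
}
```

Unlike SelectNodes, this version returns an empty list rather than null when nothing matches. If values such as "en-US" should match as well, replacing Contains(needle) with IndexOf(needle, StringComparison.OrdinalIgnoreCase) >= 0 would make the comparison case-insensitive.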
3
10/30/2013 2:25:11 PM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow