How to get all HTML tags that contain a specific string in their attribute values using Html Agility Pack?

c# html html-agility-pack

Question

I am trying to get all HTML tags that contain a specific string in their attribute values. Given the markup below:

<meta name="DCSext.oo_market" content="en-us">
<a href="http://office.microsoft.com/en-us/support/" title="Find help for Word">
<a href="http://windows.microsoft.com/en-us/windows-live/microsoft-account-help#microsoft-account=tab1" title="Microsoft Account">

I want every tag that contains "en-us" in any of its attribute values, so the output should include all of the HTML tags above. How can I do this with Html Agility Pack?

10/30/2013 2:13:28 PM

Accepted Answer

You can use the XPath expression //*[@*[contains(., 'en-us')]], which selects every element that has at least one attribute whose value contains the string en-us:

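HtmlDocument doc = new HtmlDocument();
doc.Load(path_to_html_file);
// Select every element with at least one attribute value containing "en-us".
// Note: SelectNodes returns null (not an empty collection) when nothing matches.
var nodes = doc.DocumentNode.SelectNodes("//*[@*[contains(., 'en-us')]]");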
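
Because SelectNodes returns null when no element matches, it can help to wrap the query in a small helper that always returns a list. The following is only a sketch: the class name XPathDemo, the method FindByXPath, and the SampleHtml string (adapted from the question, with a non-matching link added) are hypothetical names introduced for illustration, and the helper assumes the search string contains no single quote.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class XPathDemo
{
    // Sample markup adapted from the question; the last link is an
    // illustrative non-matching element and is not part of the original.
    public const string SampleHtml =
        "<meta name=\"DCSext.oo_market\" content=\"en-us\">" +
        "<a href=\"http://office.microsoft.com/en-us/support/\" title=\"Find help for Word\">Word</a>" +
        "<a href=\"http://example.invalid/de-de/\" title=\"Other\">Other</a>";

    // Returns all elements having at least one attribute whose value
    // contains 'needle'. Assumes 'needle' contains no single quote,
    // since it is placed inside a single-quoted XPath string literal.
    public static List<HtmlNode> FindByXPath(string html, string needle)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var nodes = doc.DocumentNode.SelectNodes(
            "//*[@*[contains(., '" + needle + "')]]");

        // SelectNodes yields null when nothing matches; normalize to an empty list.
        return nodes == null ? new List<HtmlNode>() : nodes.ToList();
    }
}
```

Calling XPathDemo.FindByXPath(XPathDemo.SampleHtml, "en-us") and iterating over the result with node.OuterHtml prints each matching tag, and a search string that matches nothing yields an empty list rather than a NullReferenceException.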

Or the LINQ way (requires using System.Linq;):

var nodes = doc.DocumentNode.Descendants()
               .Where(n => n.Attributes.Any(a => a.Value.Contains("en-us")));
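
Both the XPath contains() function and String.Contains compare case-sensitively, so "EN-US" would not match "en-us". If case-insensitive matching is wanted, the LINQ form is easy to adapt. This is a sketch under assumptions: the class AttributeSearch and method FindIgnoreCase are hypothetical names, and the null-coalescing on a.Value is a defensive guess rather than documented behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class AttributeSearch
{
    // Returns all nodes having at least one attribute whose value contains
    // 'needle', ignoring case (ordinal comparison).
    public static List<HtmlNode> FindIgnoreCase(string html, string needle)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode.Descendants()
            .Where(n => n.Attributes.Any(a =>
                // Defensive: treat a missing value as an empty string.
                (a.Value ?? string.Empty)
                    .IndexOf(needle, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToList();
    }
}
```

For example, AttributeSearch.FindIgnoreCase(html, "EN-US") would also return elements whose attribute values contain "en-us" in lowercase.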
10/30/2013 2:25:11 PM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow