I am trying to get all HTML tags that contain a specific string in their attribute values, given the HTML below:
<meta name="DCSext.oo_market" content="en-us">
<a href="http://office.microsoft.com/en-us/support/" title="Find help for Word">
<a href="http://windows.microsoft.com/en-us/windows-live/microsoft-account-help#microsoft-account=tab1" title="Microsoft Account">
I want every tag that contains "en-us" in any of its attribute values, so the output should include all of the HTML tags above. Could anyone please show me how to do this with HTML Agility Pack?
You can use the following XPath: //*[@*[contains(., 'en-us')]]
It selects every element that has at least one attribute whose value contains the string en-us:
HtmlDocument doc = new HtmlDocument();
doc.Load(path_to_html_file);
var nodes = doc.DocumentNode.SelectNodes("//*[@*[contains(., 'en-us')]]");
Or the LINQ way:
var nodes = doc.DocumentNode.Descendants()
.Where(n => n.Attributes.Any(a => a.Value.Contains("en-us")));
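For completeness, here is a minimal, self-contained sketch that runs both approaches against the markup from the question. It assumes the HtmlAgilityPack NuGet package is referenced; the Program class, the inline html string and the case-insensitive comparison in the LINQ variant are additions for illustration and not part of the original snippets. Note that SelectNodes returns null rather than an empty collection when nothing matches, so the result is checked before iterating.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Sample markup copied from the question (inlined here for illustration).
        const string html =
            "<meta name=\"DCSext.oo_market\" content=\"en-us\">" +
            "<a href=\"http://office.microsoft.com/en-us/support/\" title=\"Find help for Word\">" +
            "<a href=\"http://windows.microsoft.com/en-us/windows-live/microsoft-account-help#microsoft-account=tab1\" title=\"Microsoft Account\">";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath: any element with at least one attribute whose value contains "en-us".
        // SelectNodes returns null when there are no matches, so guard before looping.
        var xpathNodes = doc.DocumentNode.SelectNodes("//*[@*[contains(., 'en-us')]]");
        if (xpathNodes != null)
        {
            foreach (var node in xpathNodes)
                Console.WriteLine(node.Name);
        }

        // LINQ: same idea, but matching case-insensitively (an assumption; the
        // XPath contains() above is case-sensitive).
        var linqNodes = doc.DocumentNode.Descendants()
            .Where(n => n.Attributes.Any(a =>
                a.Value.IndexOf("en-us", StringComparison.OrdinalIgnoreCase) >= 0));

        foreach (var node in linqNodes)
            Console.WriteLine(node.Name);
    }
}
```

The LINQ version never returns null, which makes it a little easier to chain further filters; the XPath version is shorter if you only need the case-sensitive match.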