Html Agility Pack link and img src extraction

c# html html-agility-pack

Question

I have pages that use images as links, and I am trying to get each link's href as well as the image's src. My current code collects the hrefs fine, but it only gets the first img src and repeats it for every link.

HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load(url);
HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
foreach (HtmlNode linkNode in linkNodes)
{
    HtmlAttribute link = linkNode.Attributes["href"];
    HtmlNode imageNode = linkNode.SelectSingleNode("//img");
    HtmlAttribute src = imageNode.Attributes["src"];

    string imageLink = link.Value;
    string imageUrl = src.Value;
}

Can someone tell me what's wrong, or suggest another way of doing it? Thanks.

Popular Answer

Try changing

HtmlNode imageNode = linkNode.SelectSingleNode("//img");

to

HtmlNode imageNode = linkNode.SelectSingleNode(".//img");

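An XPath expression that starts with // is evaluated from the document root, even when it is called on linkNode, so every iteration returns the first img in the whole document. The leading dot in .//img anchors the search at the current node, so only images inside that link are matched.

Hope this helps.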
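For completeness, here is a minimal sketch of the whole loop with the relative XPath applied. It is not the asker's exact code. The class name ImageLinkExtractor, the Extract helper, and the sample markup are made up for illustration, and the page is loaded from a string with LoadHtml rather than from a URL so the example does not need network access. It also guards against two cases the original loop would crash on: SelectNodes returning null when no links match, and links that contain no image.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ImageLinkExtractor
{
    // Hypothetical helper: returns (href, src) pairs for every <a href>
    // that contains an <img src> somewhere inside it.
    public static List<(string Href, string Src)> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<(string Href, string Src)>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null)
            return results;

        foreach (HtmlNode linkNode in linkNodes)
        {
            // ".//" keeps the search inside the current <a> element.
            HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@src]");
            if (imageNode == null)
                continue; // text-only link, nothing to report

            results.Add((linkNode.GetAttributeValue("href", ""),
                         imageNode.GetAttributeValue("src", "")));
        }

        return results;
    }

    public static void Main()
    {
        // Sample markup, invented for this example.
        string html =
            "<a href='/one'><img src='1.png'></a>" +
            "<a href='/two'><img src='2.png'></a>" +
            "<a href='/text'>no image</a>";

        foreach (var (href, src) in Extract(html))
            Console.WriteLine(href + " -> " + src);
    }
}
```

With the original "//img" expression, every pair would carry the src of the first image in the document. With ".//img[@src]", each link is paired with its own image, and the text-only link is skipped.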




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow