Extracting img src along with the link href using HTML Agility Pack

c# html html-agility-pack

Question

I have pages that use images as links, and I am trying to get each link's href as well as the src of the image inside it. My current code collects the hrefs fine, but it only gets the first img src and repeats it for every link.

HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load(url);
HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
foreach (HtmlNode linkNode in linkNodes)
{
    HtmlAttribute link = linkNode.Attributes["href"];
    HtmlNode imageNode = linkNode.SelectSingleNode("//img");
    HtmlAttribute src = imageNode.Attributes["src"];

    string imageLink = link.Value;
    string imageUrl = src.Value;
}

Can someone tell me what's wrong, or suggest another way of doing it? Thanks.

9/8/2011 12:14:23 AM

Popular Answer

Try changing

HtmlNode imageNode = linkNode.SelectSingleNode("//img");

to

HtmlNode imageNode = linkNode.SelectSingleNode(".//img");

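An XPath expression that starts with // is evaluated from the document root, even when it is called on linkNode, so the original call returns the first img in the whole document on every iteration. The leading dot makes the search relative to the current link node, so each link yields its own image.

Hope this helps.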
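For reference, here is a minimal sketch of the whole loop with the relative XPath applied. It is not the asker's exact program: the class name LinkImageExtractor, the Extract method, and the sample HTML string are made up for illustration, and it parses a string with LoadHtml instead of fetching a URL so that it runs without network access. It also assumes that links without an image should simply be skipped, and it guards against SelectNodes returning null when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class LinkImageExtractor // hypothetical helper, not part of the library
{
    // Returns one (href, src) pair for every <a href> that contains an <img src>.
    public static List<(string Href, string Src)> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<(string Href, string Src)>();

        // "//a[@href]" is evaluated from the document root, which is intended here.
        HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null)
        {
            return results; // SelectNodes returns null when there are no matches
        }

        foreach (HtmlNode linkNode in linkNodes)
        {
            // ".//img[@src]" is relative to linkNode, so only its own descendants match.
            HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@src]");
            if (imageNode == null)
            {
                continue; // assumption: links without an image are skipped
            }

            string imageLink = linkNode.GetAttributeValue("href", "");
            string imageUrl = imageNode.GetAttributeValue("src", "");
            results.Add((imageLink, imageUrl));
        }

        return results;
    }

    public static void Main()
    {
        // Sample markup invented for this sketch: two image links and one text link.
        const string html =
            "<a href='/one'><img src='a.png'></a>" +
            "<a href='/two'><img src='b.png'></a>" +
            "<a href='/three'>text only</a>";

        foreach (var (href, src) in Extract(html))
        {
            Console.WriteLine($"{href} -> {src}");
        }
    }
}
```

With the sample markup above, each link is paired with its own image and the text-only link is left out. To run it against a live page as in the question, the document could be loaded with hw.Load(url) instead of LoadHtml.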

10/14/2011 11:26:10 PM



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow