Get Image Absolute URL From Some Node in HtmlAgilityPack.HtmlDocument

c# html html-agility-pack redirect relative-url

Question

I want to fetch a web page from the internet and get the absolute URLs of some images on the page, using HtmlAgilityPack in C#.

The problem is...

The website first redirects the request to a different URL, and the src attribute of the <img> tag is a relative URL.


Currently, I have code like this:

using HtmlAgilityPack;

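// Download and parse the page (the server redirects this request elsewhere).
HtmlDocument webpageDocument = new HtmlWeb().Load("http://xyz.example.com/");
// Select every <img> element; SelectNodes returns null when nothing matches.
HtmlNodeCollection nodes = webpageDocument.DocumentNode.SelectNodes("//img");
// Read the src attribute of the first image (Value is already a string).
string url = nodes[0].Attributes["src"].Value;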

The code above fetches a web page from the given example URL, selects the <img> elements from the DOM tree, and reads the src attribute of the first one.

It works if the <img> has an absolute URL. Unfortunately, the website I want to handle gives me a relative URI (e.g. /img/01.png). I need the absolute URL so that I can do further operations on the image.

So I need to know the base URL for the given src, but I have not been able to find it. In other words, I don't know how to get the location of the web page after the redirect.


The server is not mine (I have no control over it).

Popular Answer

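Consider ResponseUri (for example HttpWebResponse.ResponseUri), which holds the address that actually answered the request after any redirects. To avoid a second request, download the page yourself and give the HTML Agility Pack parser a string with the content of the page instead of a URL.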
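
A minimal sketch of that idea, assuming the page is fetched with HttpWebRequest (marked obsolete in newer .NET versions, but it is the class that exposes ResponseUri) and that the body can be read with StreamReader's default encoding. The class and method names (ImageUrlResolver, ToAbsolute, GetFirstImageUrl) are made up for illustration, and a <base href> element in the page is not taken into account.

```csharp
using System;
using System.IO;
using System.Net;
using HtmlAgilityPack;

public static class ImageUrlResolver
{
    // Resolves a src value against the address of the page it came from.
    // If src is already absolute, it is returned unchanged.
    public static Uri ToAbsolute(Uri pageUri, string src)
    {
        return new Uri(pageUri, src);
    }

    // Downloads the page once, then parses the downloaded string,
    // so HtmlAgilityPack does not issue a second request.
    public static Uri GetFirstImageUrl(string address)
    {
        var request = (HttpWebRequest)WebRequest.Create(address);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            // Address that actually responded, i.e. after redirects.
            Uri finalUri = response.ResponseUri;

            var document = new HtmlDocument();
            document.LoadHtml(reader.ReadToEnd());

            // SelectSingleNode returns null when no <img src="..."> exists.
            HtmlNode img = document.DocumentNode.SelectSingleNode("//img[@src]");
            if (img == null)
            {
                return null;
            }
            return ToAbsolute(finalUri, img.GetAttributeValue("src", ""));
        }
    }
}
```

With this, ImageUrlResolver.GetFirstImageUrl("http://xyz.example.com/") would resolve a src such as /img/01.png against whatever address the server redirected to, rather than against the address originally requested.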




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow