Get Image Absolute URL From Some Node in HtmlAgilityPack.HtmlDocument

c# html html-agility-pack redirect relative-url

Question

I want to fetch a webpage from the internet and get the absolute URLs of some images on that page, using HtmlAgilityPack in C#.

The problem is...

The website first redirects the request to a different URL, and the src attribute of the <img> tag on the resulting page is a relative URL.


Currently, I have code like this:

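using HtmlAgilityPack;

// Load the page (redirects are followed) and select every <img> element.
HtmlDocument webpageDocument = new HtmlWeb().Load("http://xyz.example.com/");
HtmlNodeCollection nodes = webpageDocument.DocumentNode.SelectNodes("//img");
// Value is already a string, so no ToString() call is needed.
string url = nodes[0].Attributes["src"].Value;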

The code above fetches a webpage from the given example URL, selects the <img> elements from the DOM tree, and reads the src attribute of the first one.

This works if the <img> has an absolute URL. Unfortunately, the website I want to handle gives me a relative URI (e.g. /img/01.png). I need the absolute URL so that I can do further processing on the image.

So I need to know which URL is the base URL for the given src, but I have not been able to find it. In other words, I don't know how to get the location of the webpage after the redirect.


The server is not mine (I have no control over it).

4/29/2017 11:58:09 AM

Popular Answer

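Consider the ResponseUri property of the web response (for example, HttpWebResponse.ResponseUri), which holds the final URI after any redirects. To avoid a second request, download the page yourself and give the HTML Agility Pack parser the string with the content of the page instead of the URL.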
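
The suggestion above could be sketched roughly as follows. This is a minimal illustration, not a tested drop-in solution: the class and method names (ImageUrlResolver, ToAbsolute, GetFirstImageUrl) are hypothetical, the page is assumed to be UTF-8, and a <base href> element in the page, which would change the base URL, is ignored.

```csharp
using System;
using System.IO;
using System.Net;
using HtmlAgilityPack;

// Hypothetical helper class, named here only for illustration.
static class ImageUrlResolver
{
    // Resolves a possibly relative src value against the page's final URI.
    // If src is already absolute, the Uri constructor returns it unchanged.
    public static Uri ToAbsolute(Uri pageUri, string src)
    {
        return new Uri(pageUri, src);
    }

    // Downloads the page once, then parses the downloaded string,
    // so no second request is made.
    public static Uri GetFirstImageUrl(string address)
    {
        var request = (HttpWebRequest)WebRequest.Create(address);
        // AllowAutoRedirect is true by default, so redirects are followed.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            // The URI that actually answered, after any redirects.
            Uri finalUri = response.ResponseUri;

            var document = new HtmlDocument();
            document.LoadHtml(reader.ReadToEnd());

            // Returns null when the page has no <img> with a src attribute.
            HtmlNode img = document.DocumentNode.SelectSingleNode("//img[@src]");
            if (img == null)
            {
                return null;
            }
            return ToAbsolute(finalUri, img.GetAttributeValue("src", ""));
        }
    }
}
```

For example, if http://xyz.example.com/ redirected to a page at http://www.example.com/home/index.html (a made-up address), a src of /img/01.png would resolve to http://www.example.com/img/01.png.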

4/29/2017 2:05:03 PM



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow