HTML Agility Pack - converting relative img src and href paths to absolute URLs

c# html-agility-pack

Question

I have an HTML document in which the img src and link href attributes are relative paths. I need to convert them from relative paths to complete URLs like "http://localhost:port.." using Html Agility Pack.

Examples: src="/Expo/imagename.s3lb" in S3, and href="../Etch/Exposition/...aspx?sflang=en"

Can someone tell me how to do this? Thanks.

Accepted Answer

I can't test or run this right now, but you can try something like this:

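using System;
using System.Linq;
using HtmlAgilityPack;

var htmlStr = "yourhtml";
var doc = new HtmlDocument();
doc.LoadHtml(htmlStr);
var baseUri = new Uri("baseUriOfYourSite"); // placeholder: must be an absolute URL
// Html Agility Pack's XPath returns element nodes, not attributes, so select the
// elements and rewrite the attribute. SelectNodes returns null when nothing matches.
var targets = new[] { ("//img[@src]", "src"), ("//a[@href]", "href") };
foreach (var (xpath, attr) in targets)
{
    foreach (var node in doc.DocumentNode.SelectNodes(xpath) ?? Enumerable.Empty<HtmlNode>())
    {
        var resolved = new Uri(baseUri, node.GetAttributeValue(attr, "")).AbsoluteUri;
        node.SetAttributeValue(attr, resolved);
    }
}
var result = doc.DocumentNode.OuterHtml; // HTML with absolute URLs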
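
The URL resolution itself depends only on System.Uri, so it can be checked separately from the HTML parsing. Below is a minimal sketch of that step. The class name UrlResolver, the method ToAbsolute, and the base URL and paths in the usage note are hypothetical, chosen only for illustration, since the paths in the question are elided.

```csharp
using System;

public static class UrlResolver
{
    // Resolves a possibly relative reference against an absolute base URL.
    // A reference that is already absolute is returned as-is (normalized by Uri).
    public static string ToAbsolute(string baseUrl, string reference)
    {
        // Throws UriFormatException if baseUrl is not a valid absolute URL.
        var baseUri = new Uri(baseUrl, UriKind.Absolute);
        return new Uri(baseUri, reference).AbsoluteUri;
    }
}
```

With a hypothetical base of "http://localhost:8080/Site/Pages/page.aspx", a root-relative reference such as "/Expo/image.png" resolves to "http://localhost:8080/Expo/image.png", and "../Etch/page.aspx?sflang=en" resolves to "http://localhost:8080/Site/Etch/page.aspx?sflang=en". The last path segment of the base is treated as a file name unless the base ends with a slash, so the base should normally be the address of the page that contains the links, not just the site root.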



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why