Wrong encoding with HTML Agility Pack

character-encoding encoding html-agility-pack unicode

Question

I'm trying to parse http://www.wein-wg.de/wwg/rheinhessen/worms-pfeddersheim/weingut-goldschmidt/ but can't get the correct charset. The website uses iso-8859-1, yet all non-ASCII characters (umlauts and the like) are displayed as ? in Visual Studio.

Is there a way to decode the page with the right charset, in Visual Studio or anywhere else?

1
1
2/10/2013 1:24:55 AM

Accepted Answer

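using System.Text;
using HtmlAgilityPack;

HtmlDocument doc;
HtmlWeb web = new HtmlWeb();

private void getPage(string url)
{
    // Force the charset the site actually uses instead of relying on detection.
    web.OverrideEncoding = Encoding.GetEncoding("iso-8859-1");
    doc = web.Load(url);
    // webBrowser1 is a WebBrowser control on the form hosting this code.
    webBrowser1.DocumentText = doc.DocumentNode.OuterHtml;
}

getPage("http://www.wein-wg.de/wwg/rheinhessen/worms-pfeddersheim/weingut-goldschmidt/");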
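
One likely cause of the ? characters is that the iso-8859-1 bytes are being decoded as a different charset. The following is a minimal sketch of that effect without any network access or HTML Agility Pack; the class EncodingDemo, the helper DecodeAs and the sample word are hypothetical names chosen for illustration only.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Hypothetical helper: decode raw bytes using the named charset.
    public static string DecodeAs(byte[] raw, string charsetName)
    {
        return Encoding.GetEncoding(charsetName).GetString(raw);
    }

    static void Main()
    {
        // "Weingüter" as an iso-8859-1 server would send it: ü is the single byte 0xFC.
        byte[] raw = Encoding.GetEncoding("iso-8859-1").GetBytes("Weing\u00FCter");

        // Wrong charset: 0xFC is not valid UTF-8, so the decoder substitutes
        // U+FFFD, which many views render as '?'.
        string wrong = DecodeAs(raw, "utf-8");

        // Charset the page declares: the original text comes back intact.
        string right = DecodeAs(raw, "iso-8859-1");

        Console.WriteLine(wrong.IndexOf('\uFFFD') >= 0); // True
        Console.WriteLine(right == "Weing\u00FCter");     // True
    }
}
```

Setting OverrideEncoding as in the answer above has the same effect as the second call here: the bytes are decoded with the charset the page really uses.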
2
12/11/2018 1:31:43 PM

Popular Answer

Solved with:

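using System;
using System.Text;
using HtmlAgilityPack;

HtmlWeb Webget = new HtmlWeb();
HtmlDocument doc = new HtmlDocument();
// Skip charset auto-detection and decode every response as UTF-8.
Webget.AutoDetectEncoding = false;
Webget.OverrideEncoding = Encoding.UTF8;

// doc_tmp was used but never declared in the original snippet.
HtmlDocument doc_tmp = new HtmlDocument();
doc_tmp.OptionOutputAsXml = true;
doc_tmp.OptionReadEncoding = true;
doc_tmp.OptionFixNestedTags = true;
doc_tmp.OptionDefaultStreamEncoding = Encoding.UTF8;

// tmp is an HtmlNode obtained earlier; how it is selected is not shown here.
doc_tmp.LoadHtml(tmp.InnerHtml);
doc_tmp.Save(Console.Out);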


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow