HtmlAgilityPack produces missing closing tags in OuterHtml

c# html html-agility-pack

Question

I am using HtmlAgilityPack to parse and manipulate HTML text. However, DocumentNode.OuterHtml appears to drop closing tags.

To isolate the issue, I now do nothing except parse the document and read OuterHtml (no manipulation):

var document = new HtmlDocument();
document.LoadHtml(myHtml);
var result = document.DocumentNode.OuterHtml;

Original: (myHtml)

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="X-UA-Compatible" content="IE=Edge" /><title>
     MyTitle
</title>

OutputHtml: (result) Notice that the meta element is no longer self-closed

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="X-UA-Compatible" content="IE=Edge"><title>
    MyTitle
</title>

Similarly, all input and img elements are left open. (Please do not answer that it should not be a problem. It should not be, but it is.) Chrome cannot render the page correctly. Keep reading.

What is even weirder:

Original: (myHtml)

    <option value="10">Afrikaans</option>
    <option value="11">Albanian</option>
    <option value="12">Arabic</option>
    <option value="13">Armenian</option>
    <option value="14">Azerbaijani</option>
    <option value="15">Basque</option>

OutputHtml: (result) Notice that the explicit closing tags are missing entirely

    <option value="10">Afrikaans
    <option value="11">Albanian
    <option value="12">Arabic
    <option value="13">Armenian

Using the latest HtmlAgilityPack NuGet package: id="HtmlAgilityPack" version="1.4.9"

Accepted Answer

HtmlDocument exposes several options that you can set before loading the document.

OptionAutoCloseOnEnd

Defines whether unclosed nodes are closed at the end of the document or directly where they occur. Setting this to true can actually change how browsers render the page.

var document = new HtmlDocument();
document.OptionAutoCloseOnEnd = true;
document.LoadHtml(content);
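Two other settings are commonly suggested for the exact symptoms in the question. This is a minimal sketch, not part of the answer above, and it assumes the 1.4.x behaviour where option is registered as a special element in HtmlNode.ElementsFlags. OptionWriteEmptyNodes asks the writer to emit empty elements such as meta, img and input in the self-closing `<tag />` form, and removing the "option" entry makes the parser treat option like an ordinary element with its own closing tag. The input string here is a shortened, hypothetical stand-in for myHtml.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical, shortened stand-in for myHtml from the question.
        const string myHtml =
            "<head><meta http-equiv=\"X-UA-Compatible\" content=\"IE=Edge\" /></head>" +
            "<select><option value=\"10\">Afrikaans</option></select>";

        // ElementsFlags is a static table shared by every HtmlDocument in the
        // process, so change it once, before any parsing happens.
        // Remove() simply returns false if the key is not present.
        HtmlNode.ElementsFlags.Remove("option");

        var document = new HtmlDocument();

        // Write empty elements (meta, img, input, ...) in self-closing form.
        document.OptionWriteEmptyNodes = true;

        document.LoadHtml(myHtml);

        // Inspect the serialized markup to confirm the closing tags survive.
        Console.WriteLine(document.DocumentNode.OuterHtml);
    }
}
```

Because ElementsFlags is global, removing the entry affects every document parsed afterwards in the same process. It is worth checking the output against the HtmlAgilityPack version actually in use, since the default flags may differ between releases.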

Related sources worth reading:

HtmlAgilityPack Drops Option End Tags

Image tag not closing with HTMLAgilityPack




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow