How to fix HTML tags (missing opening and closing tags) with HTMLAgilityPack

c# dom html-agility-pack

Question

I have HTML like this: <div><h1> hello Hi</div> <div>hi </p></div>

Required result: <div><h1> hello </h1></div> <div><p>hi </p></div>

Is it possible to fix missing closing and opening tags with HTML Agility Pack?

8/23/2013 7:10:48 AM

Accepted Answer

The library is not smart enough to create the opening <p> where you expect it, but it is smart enough to add the missing </h1>. In general, it always produces valid HTML, but not always the HTML you would expect.

So this code:

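        // Load reads from a file path (or a stream); for an HTML string, use doc.LoadHtml(yourhtml).
        HtmlDocument doc = new HtmlDocument();
        doc.Load(yourhtml);
        doc.Save(Console.Out); // write the repaired markup to the console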

will output this:

<div><h1> hello Hi</h1></div> <div>hi <p></div>

which is not what you want, but is still valid HTML. You can also add a small trick like this:

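        // ElementsFlags is static, so this changes how <p> is parsed for every document loaded afterwards.
        HtmlNode.ElementsFlags["p"] = HtmlElementFlag.Closed;
        HtmlDocument doc = new HtmlDocument();
        doc.Load(yourhtml);
        doc.Save(Console.Out);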

which will output this:

<div><h1> hello Hi</h1></div> <div>hi <p></p></div>
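
Because HtmlNode.ElementsFlags is a static collection, the trick above stays in effect for every document parsed later in the same process. Here is a minimal sketch of limiting the change to a single parse. It assumes a recent HtmlAgilityPack version in which ElementsFlags is a Dictionary<string, HtmlElementFlag>. The names ScopedFlagDemo and FixWithClosedP are hypothetical, and the helper is not thread-safe:

```csharp
using System.IO;
using HtmlAgilityPack;

public static class ScopedFlagDemo
{
    // Hypothetical helper: parse `html` with <p> temporarily flagged as Closed,
    // then put the previous flag (or its absence) back.
    public static string FixWithClosedP(string html)
    {
        // Assumption: ElementsFlags is a Dictionary<string, HtmlElementFlag>.
        bool hadFlag = HtmlNode.ElementsFlags.TryGetValue("p", out HtmlElementFlag previous);
        HtmlNode.ElementsFlags["p"] = HtmlElementFlag.Closed;
        try
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(html); // parse from an in-memory string rather than a file
            using (var writer = new StringWriter())
            {
                doc.Save(writer); // same call as Save(Console.Out), but captured in a string
                return writer.ToString();
            }
        }
        finally
        {
            // Restore the global state so later parses are unaffected.
            if (hadFlag)
                HtmlNode.ElementsFlags["p"] = previous;
            else
                HtmlNode.ElementsFlags.Remove("p");
        }
    }
}
```

Calling ScopedFlagDemo.FixWithClosedP("<div><h1> hello Hi</div> <div>hi </p></div>") should return the repaired markup as a string. How the stray </p> is handled may vary between library versions.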
8/23/2013 8:06:36 AM

Popular Answer

If you call LoadHtml(yourhtml) on an HtmlAgilityPack.HtmlDocument instance, HTML Agility Pack fixes the tags for you automatically, and you can then read the repaired markup from the document's DocumentNode.OuterHtml property.
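
The approach in this answer can be sketched as a small helper. The class name HtmlRepair and the method name Fix are hypothetical. LoadHtml is an instance method and OuterHtml is a property, so a document object has to be created first:

```csharp
using HtmlAgilityPack;

public static class HtmlRepair
{
    // Hypothetical helper: parse an HTML string and return the markup
    // as the parser reconstructed it.
    public static string Fix(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);                // note the casing: LoadHtml, not LoadHTML
        return doc.DocumentNode.OuterHtml; // serialized tree, including closing tags the parser added
    }
}
```

As the accepted answer notes, the result is well-formed but may not match the markup you had in mind, so it is worth inspecting the returned string for your own input.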



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow