Which HTML Tidy package is the best? Is there an option in HTML Agility Pack to tidy an HTML page?

c# html-agility-pack html-parsing htmltidy winforms

Question

I'm using HTML Agility Pack to parse HTML pages for table-based data. Some of these pages have missing closing tags, which prevents HTML Agility Pack from parsing their data correctly. I therefore need to add closing tags wherever they are missing so that the data can be parsed. How should I add the missing closing tags? Should I use an HTML Tidy package for that, or write my own code?

Which HTML Tidy package is the best? If possible, please give an example of how to use it. And if I wrote my own code, what might it look like?

Is there an option in HTML Agility Pack to tidy the HTML page first, so that it can be parsed afterwards?

1
4
5/26/2010 3:27:52 PM

Accepted Answer

I could not find any option in HTML Agility Pack that tidies an HTML page. There is an option that adds missing closing tags, but it only works for certain HTML pages. This is the HTML Agility Pack option:

    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.OptionFixNestedTags = true;
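
To see what this option does on a given page, it can help to load the markup, print the parser's error list, and look at the re-serialized output. The following is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced; the class name `HapRepair`, the method `Repair`, and the sample markup are hypothetical and only serve as illustration. As noted above, the result depends on the page, so no particular output is promised here.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

public static class HapRepair
{
    // Hypothetical helper: parse with OptionFixNestedTags switched on and
    // return the markup as HTML Agility Pack serializes it again.
    public static string Repair(string html)
    {
        var doc = new HtmlDocument();
        doc.OptionFixNestedTags = true; // the option quoted in this answer
        doc.LoadHtml(html);

        // ParseErrors lists the problems the parser ran into while loading.
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            Console.WriteLine("line " + error.Line + ": " + error.Code + ": " + error.Reason);
        }

        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        // Made-up sample: the first <td> is never closed.
        string broken = "<table><tr><td>one<td>two</tr></table>";
        Console.WriteLine(Repair(broken));
    }
}
```

Comparing the printed markup with the input shows quickly whether the option is enough for a particular page or whether a separate tidy step is needed.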

I also tried regular expressions, but they too only work for certain HTML pages.

So the best HTML Tidy package I found is:

http://www.devx.com/dotnet/Article/20505/1763/page/2

The article explains how to import the DLL and use the Tidy package, and it includes example code. It works really well: it tidies the HTML page and adds the missing closing tags.
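
The article's DLL wrapper is not reproduced here. As a different route to the same result, the sketch below pipes the markup through the HTML Tidy command-line program instead. It assumes a `tidy` executable is installed and on the PATH; the class name `TidyCli` and the method `Clean` are hypothetical, and text encoding is left at the process defaults.

```csharp
using System;
using System.Diagnostics;

public static class TidyCli
{
    // Hypothetical helper: sends HTML to the `tidy` executable on stdin and
    // returns whatever tidy writes to stdout.
    // Assumption: `tidy` is installed and can be found on the PATH.
    public static string Clean(string html)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "tidy",
            // -quiet: suppress the summary text; -asxhtml: emit well-formed XHTML;
            // --show-warnings no: suppress warning messages;
            // --force-output yes: write output even if tidy reports errors.
            Arguments = "-quiet -asxhtml --show-warnings no --force-output yes",
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            process.StandardInput.Write(html);
            process.StandardInput.Close(); // signal end of input to tidy

            string cleaned = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return cleaned;
        }
    }
}
```

The returned string can then be passed to `doc.LoadHtml(...)`, so that HTML Agility Pack only ever sees markup whose tags are already closed.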

Many thanks for all your assistance.

7
3/29/2010 9:16:32 AM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow