How to use HTML Agility Pack for HTML validation

c# html-agility-pack


I am using HTML Agility Pack to validate my HTML. Here is what I am using:

public class MarkupErrors
{
    public string ErrorCode { get; set; }
    public string ErrorReason { get; set; }
}

public static List<MarkupErrors> IsMarkupValid(string html)
{
    var document = new HtmlAgilityPack.HtmlDocument();
    document.OptionFixNestedTags = true;
    document.LoadHtml(html); // parse the input so that ParseErrors is populated

    var parserErrors = new List<MarkupErrors>();
    foreach (var error in document.ParseErrors)
    {
        parserErrors.Add(new MarkupErrors
        {
            ErrorCode = error.Code.ToString(),
            ErrorReason = error.Reason
        });
    }

    return parserErrors;
}

Say my input is something like this:

Hello World</h2> 
<h3>Missing close h3 tag

My current function returns a list with the following errors:

- Start tag <h2> was not found
- End tag </h3> was not found

which is fine...

My problem is that I want the entire HTML document to be valid, that is, with proper <head> and <body> tags, because this HTML will later be available for preview and for download as an .html file.

Can I check for this using HTML Agility Pack?

Any ideas or other options will be appreciated. Thanks

5/20/2013 8:15:20 AM

Accepted Answer

You can check that there is a HEAD element and a BODY element directly under an HTML element, for example like this:

bool hasHead = doc.DocumentNode.SelectSingleNode("html/head") != null;
bool hasBody = doc.DocumentNode.SelectSingleNode("html/body") != null;

Both checks evaluate to false if there is no HTML element. Otherwise, each one is false if its element (HEAD or BODY) is not a direct child of the HTML element.
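
The structural checks can also be reported together with the parser errors from the question. The following is a minimal sketch, not a definitive implementation: the class name `DocumentValidator`, the method name `ValidateFullDocument`, and the message strings are hypothetical. Only `LoadHtml`, `ParseErrors`, `DocumentNode`, and `SelectSingleNode` come from HTML Agility Pack.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of HTML Agility Pack.
public static class DocumentValidator
{
    // Returns one message per problem found; an empty list means nothing was detected.
    public static List<string> ValidateFullDocument(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var problems = new List<string>();

        // Syntax-level problems reported by the parser (unmatched tags and so on).
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            problems.Add(error.Code + ": " + error.Reason);
        }

        // Structural checks: <html> at the top level, <head> and <body> directly under it.
        if (doc.DocumentNode.SelectSingleNode("html") == null)
            problems.Add("Missing <html> element");
        if (doc.DocumentNode.SelectSingleNode("html/head") == null)
            problems.Add("Missing <head> directly under <html>");
        if (doc.DocumentNode.SelectSingleNode("html/body") == null)
            problems.Add("Missing <body> directly under <html>");

        return problems;
    }
}
```

For the fragment in the question, which has no HTML element at all, this sketch would be expected to add all three structural messages to the parser errors already reported.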

Note that I deliberately avoid an XPath expression such as "//head", because it would return a result even if the HEAD element were not directly under the HTML element.
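
To illustrate the difference, the sketch below runs both expressions against a deliberately malformed document in which `<head>` is nested inside a `<div>`. The sample markup, class name, and method names are made up for illustration. It assumes that HTML Agility Pack keeps the misplaced element where the source put it rather than relocating it the way a browser would, in which case the first check should succeed and the second should not.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical demo class, not part of HTML Agility Pack.
public static class XPathComparison
{
    // Made-up sample: <head> sits inside a <div>, not directly under <html>.
    public const string Misplaced =
        "<html><body><div><head><title>t</title></head></div></body></html>";

    public static bool HasHeadAnywhere(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        // "//head" matches a head element at any depth in the tree.
        return doc.DocumentNode.SelectSingleNode("//head") != null;
    }

    public static bool HasHeadUnderHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        // "html/head" matches only a head that is a direct child of a top-level html element.
        return doc.DocumentNode.SelectSingleNode("html/head") != null;
    }

    public static void Main()
    {
        Console.WriteLine(HasHeadAnywhere(Misplaced));
        Console.WriteLine(HasHeadUnderHtml(Misplaced));
    }
}
```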

5/20/2013 8:45:12 AM

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow