HTML Agility Pack Parsing With Upper & Lower Case Tags?

c# html html-agility-pack html-parsing

Question

I am using the HTML Agility Pack to great effect and am really impressed with it. However, I am selecting content like so:

doc.DocumentNode.SelectSingleNode("//body").InnerHtml

How do I deal with the following situation, where different documents use different casing for the tag?

<body>
<Body>
<BODY>

Will my code above only get the lower case versions?

Accepted Answer

The Html Agility Pack handles HTML in a case-insensitive way: it parses BODY, Body and body the same way. This is by design, since HTML is not case sensitive (XHTML is).

That said, when you use its XPath feature, you must write tag names in lower case. The "//body" expression will match BODY, Body and body, whereas "//BODY" will match nothing.
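
To check this behaviour against your own documents, something like the following console sketch may help. It assumes the HtmlAgilityPack NuGet package is referenced; the three sample strings are made up for illustration and are not from the question. Because SelectSingleNode returns null when nothing matches, the result is checked before InnerHtml is read.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample documents, differing only in the casing of the body tag.
        string[] samples =
        {
            "<html><body><p>lower</p></body></html>",
            "<html><Body><p>mixed</p></Body></html>",
            "<html><BODY><p>upper</p></BODY></html>",
        };

        foreach (string html in samples)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            // Lower-case XPath: per the answer, this should match every casing.
            HtmlNode lower = doc.DocumentNode.SelectSingleNode("//body");

            // Upper-case XPath: per the answer, this should match nothing.
            HtmlNode upper = doc.DocumentNode.SelectSingleNode("//BODY");

            // SelectSingleNode returns null on no match, so guard before using InnerHtml.
            Console.WriteLine("//body: " + (lower != null ? lower.InnerHtml : "(no match)"));
            Console.WriteLine("//BODY: " + (upper != null ? upper.InnerHtml : "(no match)"));
        }
    }
}
```

If the answer holds for your version of the library, the "//body" line should print the paragraph markup for all three samples, and the "//BODY" line should report no match each time.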




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow