I would like to use HtmlAgilityPack to replace a node within the document with a text node. The purpose of this is to remove tags surrounding the node itself. Currently, I do something like this:
//This code fixes redundant HTML formatting tags
//This is a snippet of code
foreach (var hChildNode in hd.DocumentNode.SelectNodes("//b//b | //i//i | //u//u") ?? Enumerable.Empty<HtmlNode>())
    hChildNode.Name = "remove";
StringBuilder sb = new StringBuilder(hd.DocumentNode.WriteTo());
sb.Replace("<remove>", string.Empty);
sb.Replace("</remove>", string.Empty);
Is there a better way to do this? If I try to create a new text node, and then do something like the code snippet below, I receive an invalid cast error:
foreach (var hChildNode in hd.DocumentNode.SelectNodes("//b//b | //i//i | //u//u") ?? Enumerable.Empty<HtmlNode>())
{
    HtmlNode hNewNode = hd.CreateTextNode(hChildNode.InnerHtml);
    hChildNode.ParentNode.ReplaceChild(hNewNode, hChildNode);
}
(updated after a typo was pointed out, however the problem still remains)
Am I using the method wrong? Is there another method I am supposed to use to perform functions like this? Thanks.
The purpose of this is to remove tags surrounding the node itself
Your second code snippet does exactly that (it removes the tags), apart from one typo (I guess):
HtmlNode hNewNode = hd.CreateTextNode(hNewNode.InnerHtml);
You should replace hNewNode.InnerHtml with hChildNode.InnerHtml; otherwise your code won't even compile (use of an unassigned local variable).
I also want to mention that the newly created text node will not have the child nodes of the node it replaces. Instead, its InnerHtml property will have the same value as the InnerHtml of the replaced node.
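If the goal is only to drop the redundant tag while keeping everything inside it as real nodes, one possible alternative is the RemoveChild overload that takes a keepGrandChildren flag. This is a minimal sketch, assuming your HtmlAgilityPack version has that overload; the sample HTML and the redundantNodes variable are made up for illustration:

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        var hd = new HtmlDocument();
        // Hypothetical input: a <b> nested inside another <b>
        hd.LoadHtml("<p><b>bold <b>redundant</b> text</b></p>");

        // Copy the matches into a list first, because the loop below
        // modifies the tree that the query results came from.
        var redundantNodes = (hd.DocumentNode.SelectNodes("//b//b | //i//i | //u//u")
                              ?? Enumerable.Empty<HtmlNode>()).ToList();

        foreach (var hChildNode in redundantNodes)
        {
            // Remove the tag itself; 'true' asks the parent to keep
            // the removed node's children in its place.
            hChildNode.ParentNode.RemoveChild(hChildNode, true);
        }

        Console.WriteLine(hd.DocumentNode.OuterHtml);
    }
}
```

Unlike the text node approach, the children stay in the tree as nodes, so later XPath queries can still find them. I have not checked how this behaves on every version of the library, so test it against deeply nested input before relying on it.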