If a node has no parent node, how do you remove it?

.net asp.net c# html-agility-pack

Question

I'm using the HTML Agility Pack to clean up input from a WYSIWYG editor. This might not be the best way to do it, but I'm working with developers who explode on contact with regex, so it will have to suffice.

My WYSIWYG content looks something like this (for example):

<p></p>
<p></p>
<p><span><input id="textbox" type="text" /></span></p>

I need to strip the empty paragraph tags. Here's how I'm doing it at the moment:

HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//p");
if (nodes == null)
    return;

foreach (HtmlNode node in nodes)
{
    node.InnerHtml = node.InnerHtml.Trim();
    if (node.InnerHtml == string.Empty)
        node.ParentNode.RemoveChild(node);
}

However, because the HTML is not a complete document, the paragraph tags do not have a parent node, so RemoveChild fails because ParentNode is null.

I can't find another way to remove the tag, though. Can anyone point me to an alternative method?

4/17/2012 12:37:36 PM

Accepted Answer

Technically, first-level elements are children of the document root, so the following code should work:

if (node.InnerHtml == String.Empty) {
    HtmlNode parent = node.ParentNode;
    if (parent == null) {
        parent = doc.DocumentNode;
    }
    parent.RemoveChild(node);
}
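
For context, here is a minimal sketch of how that fallback might fit into the loop from the question. The class name, the `RemoveEmptyParagraphs` helper, and the sample input are hypothetical and only for illustration. It assumes the HtmlAgilityPack package is referenced, and it assumes (as the answer does) that first-level elements can be removed through `doc.DocumentNode`.

```csharp
using System;
using HtmlAgilityPack;

static class EmptyParagraphCleaner
{
    // Hypothetical helper: removes <p> elements whose content is empty or whitespace.
    public static void RemoveEmptyParagraphs(HtmlDocument doc)
    {
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//p");
        if (nodes == null)
            return; // SelectNodes returns null when nothing matches

        foreach (HtmlNode node in nodes)
        {
            // Skip paragraphs that still contain something after trimming.
            if (node.InnerHtml.Trim().Length != 0)
                continue;

            // Fall back to the document root when the node reports no parent.
            HtmlNode parent = node.ParentNode ?? doc.DocumentNode;
            parent.RemoveChild(node);
        }
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<p></p>\n<p> </p>\n<p><span><input id=\"textbox\" type=\"text\" /></span></p>");

        RemoveEmptyParagraphs(doc);

        // Print the remaining markup to inspect the result.
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

Unlike the code in the question, this sketch only tests the trimmed content and does not write the trimmed value back to `InnerHtml`, so paragraphs that are kept are left untouched.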
4/17/2012 12:55:02 PM

Popular Answer

You want to remove from the collection, right?

HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//p");
if (nodes == null)
    return;

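// Walk the collection backwards so that removing an item does not shift
// the indexes of items that have not been visited yet.
for (int i = nodes.Count - 1; i >= 0; i--)
{
    nodes[i].InnerHtml = nodes[i].InnerHtml.Trim();
    if (nodes[i].InnerHtml == string.Empty)
        nodes.Remove(i);
}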



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow