HtmlAgilityPack - removing all nodes in a collection

c# html html-agility-pack windows-runtime windows-store-apps

Question

I'm trying to fix this weird nested HTML I get from using contentEditable:

<span lang="">
   <p>line one</p>
   <p>line two</p>
</span>

I want to replace each of these span nodes with its children:

<p>line one</p>
<p>line two</p>

Here's what I tried.

var spans = doc.DocumentNode.Descendants().Where(x => x.Name == "span" && x.Attributes["lang"] != null).ToList();
foreach (var span in spans)
{
    foreach (var child in span.ChildNodes)
    {
        var ch = doc.CreateElement(child.Name);
        ch.InnerHtml = child.InnerHtml;
        doc.DocumentNode.InsertBefore(ch, span);
    }            
    span.Remove();
}

This throws a System.ArgumentOutOfRangeException with the following message.

Node "<span lang=""></span>" was not found in the collection

I understand why this is happening. Editing the document voids my collection of span elements. So how do I go about doing this?

Also, how do I cope with text that is not contained in a child element? Suppose I found this element:

<span lang="">
   <p>line one</p>
   <p>line two</p>
   line three
</span>

How do I de-nest that?

PLEASE NOTE: This is HtmlAgilityPack for WinRT, so SelectSingleNode and the other XPath methods are not available to me.

9/8/2014 1:54:13 PM

Accepted Answer

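The exception comes from calling InsertBefore on the document root. InsertBefore expects the reference node to be a direct child of the node it is called on, and span is nested deeper in the tree, so it is not found in DocumentNode's child collection. The fix is to invoke InsertBefore from the span's parent node instead.

Moreover, I think you can "move" the nodes directly without creating new ones: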

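foreach (var span in spans)
{
    // ChildNodes also contains text nodes, so loose text such as "line three" is moved too.
    foreach (var child in span.ChildNodes)
    {
        // Insert relative to the span's own parent, not the document root.
        span.ParentNode.InsertBefore(child, span);
    }
    // The span's content now sits in front of it, so the empty wrapper can go.
    span.Remove();
}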
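
For completeness, here is a minimal end-to-end sketch that combines the selection from the question with the loop above. The class name SpanUnwrapper and the method Unwrap are hypothetical names chosen for illustration, and the sketch assumes that HtmlDocument.LoadHtml and OuterHtml are available in the WinRT build. The ToList() call on ChildNodes is a defensive addition of mine, not part of the code above.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class SpanUnwrapper
{
    // Replaces every <span lang="..."> in the given HTML with its children.
    public static string Unwrap(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Materialize the matches first so the tree can be edited afterwards.
        var spans = doc.DocumentNode.Descendants()
            .Where(x => x.Name == "span" && x.Attributes["lang"] != null)
            .ToList();

        foreach (var span in spans)
        {
            var parent = span.ParentNode;
            if (parent == null) continue; // nothing to re-attach the children to

            // Snapshot the children (elements and text nodes alike) so the loop
            // does not depend on whether the move mutates span.ChildNodes.
            foreach (var child in span.ChildNodes.ToList())
            {
                parent.InsertBefore(child, span);
            }

            span.Remove();
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling SpanUnwrapper.Unwrap on the second example from the question should leave the two paragraphs and the loose "line three" text where the span used to be. Whitespace-only text nodes between the elements are moved as well, so the original indentation and line breaks stay in the output.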
9/8/2014 9:38:53 PM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow