HtmlAgilityPack replace node

c# html-agility-pack

Question

I want to replace a node with a new node. How can I get the exact position of the node and do a complete replace?

I've tried the following, but I can't figured out how to get the index of the node or which parent node to call ReplaceChild() on.

string html = "<b>bold_one</b><strong>strong</strong><b>bold_two</b>";
HtmlDocument document = new HtmlDocument();
document.LoadHtml(html);

var bolds = document.DocumentNode.Descendants().Where(item => item.Name == "b");

foreach (var item in bolds)
{

    string newNodeHtml = GenerateNewNodeHtml();
    HtmlNode newNode = new HtmlNode(HtmlNodeType.Text, document, ?);
    item.ParentNode.ReplaceChild( )
}

Accepted Answer

To create a new node, use the HtmlNode.CreateNode() factory method, do not use the constructor directly.

This code should work out for you:

var htmlStr = "<b>bold_one</b><strong>strong</strong><b>bold_two</b>";
var doc = new HtmlDocument();
doc.LoadHtml(htmlStr);

var query = doc.DocumentNode.Descendants("b");
foreach (var item in query.ToList())
{
    var newNodeStr = "<foo>bar</foo>";
    var newNode = HtmlNode.CreateNode(newNodeStr);
    item.ParentNode.ReplaceChild(newNode, item);
}

Note that we need to call ToList() on the query, we will be modifying the document so it would fail if we don't.


If you wish to replace with this string:

"some text <b>node</b> <strong>another node</strong>"

The problem is that it is no longer a single node but a series of nodes. You can parse it fine using HtmlNode.CreateNode() but in the end, you're only referencing the first node of the sequence. You would need to replace using the parent node.

var htmlStr = "<b>bold_one</b><strong>strong</strong><b>bold_two</b>";
var doc = new HtmlDocument();
doc.LoadHtml(htmlStr);

var query = doc.DocumentNode.Descendants("b");
foreach (var item in query.ToList())
{
    var newNodesStr = "some text <b>node</b> <strong>another node</strong>";
    var newHeadNode = HtmlNode.CreateNode(newNodesStr);
    item.ParentNode.ReplaceChild(newHeadNode.ParentNode, item);
}

Popular Answer

I am using HtmlDocument.DocumentNode for the newly generated node.

string html = "<b>bold_one</b><strong>strong</strong><b>bold_two</b>";
HtmlDocument document = new HtmlDocument();
document.LoadHtml(html);
var bolds = document.DocumentNode.Descendants().Where(item => item.Name == "b");
foreach (var item in bolds)
{
    string newNodeHtml = GenerateNewNodeHtml();
    var nodeDocument = new HtmlDocument();
    nodeDocument.LoadHtml(newNodeHtml);
    item.ParentNode.ReplaceChild(nodeDocument.DocumentNode);
}



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why