Add newline in HTML source code using HTML Agility Pack

html html-agility-pack vb.net

Question

Using the HTML Agility Pack, I'm making changes to an HTML file.

Here is an example that appends nodes to the table rows of an HTML document:

Dim document As New HtmlDocument
Dim tables As Array

document.Load(path_html)

Dim div1 As HtmlNode = HtmlNode.CreateNode("<div></div>")
Dim div2 As HtmlNode = HtmlNode.CreateNode("<div></div>")

tables = document.DocumentNode.Descendants("table").ToArray()

For Each tr As HtmlNode In tables.Descendants("tr").ToArray
   tr.AppendChild(div1)
   tr.AppendChild(div2)
Next

document.Save(path_html)

The resulting HTML file contains:

<div></div><div></div>

What I'm looking for is:

<div></div>
<div></div>

I assumed this would be the default behavior, because the current output makes my HTML file hard to read.

I came across this question, which describes exactly my problem, but the solution does not work for me (perhaps because I am using VB.NET and the solution is written in C#).

Can anyone assist?

5/23/2017 12:26:42 PM

Accepted Answer

I haven't written any VB.NET in a long time, so I tried this out in C# first:

var document = new HtmlDocument();
var div = HtmlNode.CreateNode("<div></div>");
var newline = HtmlNode.CreateNode("\r\n");
div.AppendChild(newline);
for (int i = 0; i < 2; ++i)
{
    div.AppendChild(HtmlNode.CreateNode("<div></div>"));
    div.AppendChild(newline);
}
document.DocumentNode.AppendChild(div);
Console.WriteLine(document.DocumentNode.WriteTo());

It works fine and produces:

<div>
<div></div>
<div></div>
</div>

Then I thought, "No way, that can't be the problem," and tried the same thing in VB.NET. Note the commented lines:

Dim document = New HtmlDocument()
Dim div = HtmlNode.CreateNode("<div></div>")
' this writes the literal string...
Dim newline = HtmlNode.CreateNode("\r\n")
' this works!
' Dim newline = HtmlNode.CreateNode(Environment.NewLine)
div.AppendChild(newline)
For i = 1 To 2
    div.AppendChild(HtmlNode.CreateNode("<div></div>"))
    div.AppendChild(newline)
Next
document.DocumentNode.AppendChild(div)
Console.WriteLine(document.DocumentNode.WriteTo())

Sure enough, that is the problem, which is presumably why the question you linked was never marked as answered. The output is:

<div>\r\n<div></div>\r\n<div></div>\r\n</div>

Finally, instead of using \r\n as the newline string, I tried Environment.NewLine, which produces the following:

<div>
<div></div>
<div></div>
</div>

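In C#, it works both ways, because C# interprets \r\n inside a regular string literal as escape sequences, whereas VB.NET treats the backslashes as ordinary characters.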
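
The difference can be checked without HTML Agility Pack at all. The following is a minimal sketch (the module name NewlineDemo is only illustrative) that compares the VB.NET literal "\r\n" with Environment.NewLine and the built-in vbCrLf constant:

```vbnet
Imports System

Module NewlineDemo
    Sub Main()
        ' VB.NET string literals do not process backslash escapes,
        ' so this is four characters: \ r \ n
        Dim literal As String = "\r\n"
        Console.WriteLine(literal.Length)              ' 4

        ' Real line breaks: platform-dependent and fixed CR+LF.
        Console.WriteLine(Environment.NewLine.Length)  ' 2 on Windows, 1 on Linux/macOS
        Console.WriteLine(vbCrLf.Length)               ' 2

        ' The literal is not a line break.
        Console.WriteLine(literal = vbCrLf)            ' False
    End Sub
End Module
```

If a fixed CR+LF is wanted regardless of platform, vbCrLf is probably the closest VB.NET equivalent of the C# "\r\n".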

5/23/2017 10:30:04 AM

Popular Answer

Based on this answer, you would need to create a node that represents a Carriage Return (\r) and a Line Feed (\n):

Dim newLineNode As HtmlNode = HtmlNode.CreateNode("\r\n")

However, considering what you said:

I tried this but it adds '\r\n' in my HTML, it's not going back to line.

You have already tried this, and it inserts the literal string \r\n. I was able to reproduce this problem as well.

Instead, consider using a <br> tag to produce a line break:

Dim newLineNode As HtmlNode = HtmlNode.CreateNode("<br>")

Applied to your sample code, it would look like this:

Dim newLineNode As HtmlNode = HtmlNode.CreateNode("<br>")

For Each tr As HtmlNode In tables.Descendants("tr").ToArray
   tr.AppendChild(div1)
   tr.AppendChild(newLineNode)
   tr.AppendChild(div2)
Next

Note that tables.Descendants("tr").ToArray gave me a build error. Since that is outside the scope of this question and you haven't raised it as a problem, I'll assume that it does in fact work for you.
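
For readers who do hit that build error, here is a rough sketch of a loop that should compile. It uses Environment.NewLine, as in the accepted answer, so that the line break lands in the HTML source. It assumes the HtmlAgilityPack NuGet package is referenced. The inline markup is a hypothetical stand-in for the file loaded from path_html, and document.CreateTextNode is used here in place of HtmlNode.CreateNode(Environment.NewLine):

```vbnet
Imports System
Imports System.Linq
Imports HtmlAgilityPack

Module AppendWithNewlines
    Sub Main()
        Dim document As New HtmlDocument()
        ' Hypothetical markup; the question loads a file with document.Load(path_html).
        document.LoadHtml("<table><tr><td>a</td></tr><tr><td>b</td></tr></table>")

        ' Query rows from the document node (not from an Array) and materialize
        ' the list first, so appending children does not disturb the enumeration.
        Dim rows = document.DocumentNode.Descendants("tr").ToList()

        For Each tr As HtmlNode In rows
            ' Create fresh nodes for every row, each followed by a real line break.
            tr.AppendChild(HtmlNode.CreateNode("<div></div>"))
            tr.AppendChild(document.CreateTextNode(Environment.NewLine))
            tr.AppendChild(HtmlNode.CreateNode("<div></div>"))
            tr.AppendChild(document.CreateTextNode(Environment.NewLine))
        Next

        Console.WriteLine(document.DocumentNode.OuterHtml)
    End Sub
End Module
```

Creating new div nodes inside the loop, rather than reusing div1 and div2 for every row, is a design choice of this sketch and not something the question's code does.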




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow