Html Agility Pack extracting inner content from HTML BODY node

c# html html-agility-pack vb.net

Question

Need a bit of help with HTML Agility Pack!

Basically I want to grab plain-text withing the body node of the HTML. So far I have tried this in vb.net and it fails to return the innertext meaning no change is seen, well atleast from what I can see.

Dim htmldoc As HtmlDocument = New HtmlDocument
htmldoc.LoadHtml(html)

Dim paragraph As HtmlNodeCollection = htmldoc.DocumentNode.SelectNodes("//body")

If Not htmldoc Is Nothing Then
   For Each node In paragraph
       node.ParentNode.RemoveChild(node, True)
   Next
End If

Return htmldoc.DocumentNode.WriteContentTo

I have tried this:

Return htmldoc.DocumentNode.InnerText

But still no luck!

Any advice???

1
10
7/27/2011 10:49:23 PM

Popular Answer

How about:

Return htmldoc.DocumentNode.SelectSingleNode("//body").InnerText
18
7/27/2011 11:28:47 PM


Related Questions





Related

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow