Pass string of HTML page and scrape with HtmlAgilityPack

c# html-agility-pack vb.net

Question

Why do I get this error?

"Illegal characters in path" at htmlDoc.Load(pageSource)

pageSource is a string containing the HTML of the page. I need to pass the page source as a string, not as a file path or a URL. How do I do this?

Dim ids As New List(Of String)()
Dim pageSource = getHtml(url)

Dim htmlDoc As HtmlDocument = New HtmlDocument()

htmlDoc.OptionFixNestedTags = True


htmlDoc.Load(pageSource)


Dim s As HtmlNodeCollection = htmlDoc.DocumentNode.SelectNodes("//div/@id")

For Each div As HtmlNode In s
    ids.Add(div.Id)
Next

Accepted Answer

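Use LoadHtml instead of Load. Load(String) treats its argument as a file path, which is why passing HTML text raises "Illegal characters in path". LoadHtml parses the string itself: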

htmlDoc.LoadHtml(pageSource)

See also the source.
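
For completeness, here is a minimal sketch of the question's snippet with the fix applied. It makes a few assumptions beyond the original: the inline HTML literal is a hypothetical stand-in for whatever getHtml(url) returns, the XPath is written as //div[@id] to select div elements that have an id attribute, and a Nothing check is added because SelectNodes returns Nothing when no node matches.

```vbnet
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module Example
    Sub Main()
        ' Hypothetical inline HTML standing in for the string returned by getHtml(url).
        Dim pageSource As String =
            "<html><body><div id=""a""></div><div id=""b""></div><div></div></body></html>"

        Dim htmlDoc As New HtmlDocument()
        htmlDoc.OptionFixNestedTags = True

        ' LoadHtml parses the string itself; Load(String) would treat it as a file path.
        htmlDoc.LoadHtml(pageSource)

        ' Select only the div elements that carry an id attribute.
        Dim divs As HtmlNodeCollection = htmlDoc.DocumentNode.SelectNodes("//div[@id]")

        Dim ids As New List(Of String)()

        ' SelectNodes returns Nothing when nothing matches, so guard before iterating.
        If divs IsNot Nothing Then
            For Each div As HtmlNode In divs
                ids.Add(div.Id)
            Next
        End If

        Console.WriteLine(String.Join(", ", ids))
    End Sub
End Module
```

Without the Nothing check, a page with no matching div would throw a NullReferenceException at the For Each loop.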




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why