WPF C#: parse WebBrowser content with HtmlAgilityPack

c# html-agility-pack webbrowser-control wpf

Question

I need to scrape some data from a website. I created a WebBrowser control so the user can log in and use the site's search tool. Once the user has searched and the results list is displayed, I want to grab that data for further analysis and offline access.

As I said, the easiest approach for me is a WebBrowser control: it works out of the box, login works, browsing works, and when I reach the appropriate page I have webBrowser.Document, which is an mshtml.HTMLDocumentClass (if I'm correct). But HtmlAgilityPack requires an HtmlDocument.

What's the easiest way to convert from one to the other? Please note that this is the WPF WebBrowser.

Popular Answer

No temporary files are needed; just cast to the right class.

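// HTMLDocument is mshtml.HTMLDocument (reference Microsoft.mshtml); HtmlDocument is HtmlAgilityPack.HtmlDocument.
// webBrowser.Document is null until a page has finished loading.
string html = (webBrowser.Document as mshtml.HTMLDocument).documentElement.innerHTML;
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);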

From here on, happy scraping :)
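
As a rough sketch of how the conversion above might be wrapped up and followed by an actual query, the helper below turns the loaded page into an HtmlAgilityPack document and pulls text out of it. The class name BrowserScraper, both method names, and the default XPath "//td" are hypothetical choices for illustration; the real XPath depends on the markup of the results page. It assumes a project reference to Microsoft.mshtml and the HtmlAgilityPack package.

```csharp
using System.Collections.Generic;
using System.Windows.Controls;
using HtmlAgilityPack;

// Hypothetical helper, not part of either library.
public static class BrowserScraper
{
    // Converts the page currently shown in a WPF WebBrowser into an
    // HtmlAgilityPack document. Returns null if no page has loaded yet.
    public static HtmlDocument ToAgilityDocument(WebBrowser browser)
    {
        // WebBrowser.Document is exposed as object; cast it to the MSHTML COM type.
        var dom = browser.Document as mshtml.HTMLDocument;
        if (dom == null || dom.documentElement == null)
            return null;

        var doc = new HtmlDocument();
        // outerHTML also includes the <html> element itself.
        doc.LoadHtml(dom.documentElement.outerHTML);
        return doc;
    }

    // Returns the trimmed, entity-decoded text of every node matching the XPath.
    // The default "//td" is only a placeholder for the real results markup.
    public static List<string> ExtractTexts(HtmlDocument doc, string xpath = "//td")
    {
        var result = new List<string>();
        var nodes = doc.DocumentNode.SelectNodes(xpath); // null when nothing matches
        if (nodes == null)
            return result;

        foreach (var node in nodes)
            result.Add(HtmlEntity.DeEntitize(node.InnerText).Trim());
        return result;
    }
}
```

One possible way to use it is to call BrowserScraper.ToAgilityDocument(webBrowser) from a handler attached to the control's LoadCompleted event, so that Document is no longer null. Content that scripts inject after the load event may still be missing at that point, so a button the user clicks once the results are visible can be a more reliable trigger.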



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why