I need to scrape some data from a website. I created a WebBrowser control so the user can log in and use the site's search tool; once the search has returned its list of results, I want to grab that data for further analysis and offline access.
As I said, the easiest approach for me is the WebBrowser control: it works out of the box, login works, browsing works, and when I reach the appropriate page I have webBrowser.Document, which is an mshtml.HTMLDocumentClass (if I'm correct).
But HtmlAgilityPack requires a
What's the easiest way to convert from one to the other? Please note that the WebBrowser is the WPF WebBrowser.
No temporary files should be needed, just a conversion from the right class.
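// mshtml.HTMLDocument is the browser's COM document; HtmlAgilityPack.HtmlDocument is the parser's.
string html = (webBrowser.Document as mshtml.HTMLDocument).documentElement.innerHTML;
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);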
From here on, happy scraping :)
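As a rough illustration of what "from here on" might look like, here is a minimal sketch that pulls every link target out of the HTML string. The class name `ScrapeHelper`, the method name `ExtractLinks`, and the XPath `//a[@href]` are assumptions made for the example; the real results page will need its own XPath.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ScrapeHelper
{
    // Hypothetical helper (not part of HtmlAgilityPack): returns the href
    // of every <a> element found in the given HTML string.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return links;

        foreach (HtmlNode anchor in anchors)
            links.Add(anchor.GetAttributeValue("href", string.Empty));

        return links;
    }
}
```

It could be called as `ScrapeHelper.ExtractLinks(((mshtml.HTMLDocument)webBrowser.Document).documentElement.outerHTML)`, probably from the WebBrowser's LoadCompleted handler so the page has finished loading. Using outerHTML instead of innerHTML keeps the root html element in the string; either should parse fine.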