How to load a dynamically generated webpage?

c# data-scrubbing html html-agility-pack

Question

I'm trying to load the page http://www.artstation.com/artist/nicotine so I can scrape it, but the tags I need aren't in the HTML I get back, because the page appears to be generated by JavaScript.

Loading it as follows doesn't work, since this only fetches the original JavaScript and not the markup it produces:

HtmlWeb htmlWeb = new HtmlWeb();
// The URL must be passed as a string literal.
HtmlDocument imagepage = htmlWeb.Load("http://www.artstation.com/artist/nicotine");

How can I load the page as it is rendered in the browser, so that I can search it for tags?

8/17/2014 5:14:02 AM

Popular Answer

HtmlAgilityPack alone cannot do this. When HAP requests the page, the server returns the raw page file. No browser has executed the JavaScript in that file yet, so the content it generates does not exist at that point.

There is a workaround. Selenium and PhantomJS can be used to get the content of dynamically generated tags. These tools include a browser stack and will run the JavaScript for you. Similar tools and plenty of examples are readily available.
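As a rough illustration of that workaround, the sketch below drives a real browser with Selenium WebDriver, reads the rendered markup, and passes it to HtmlAgilityPack. It makes several assumptions that are not in the original answer: headless Chrome is used in place of PhantomJS, the Selenium.WebDriver and Selenium.Support NuGet packages and a matching chromedriver are installed, and the page is treated as ready once at least one `img` element exists. The `//img[@src]` query is only a placeholder for whatever tags you actually need.

```csharp
using System;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

class RenderedPageLoader
{
    static void Main()
    {
        // Assumption: Chrome and a matching chromedriver are available on this machine.
        var options = new ChromeOptions();
        options.AddArgument("--headless");

        // Disposing the driver closes the browser.
        using (IWebDriver driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl("http://www.artstation.com/artist/nicotine");

            // Assumption: the content of interest includes <img> elements.
            // Wait up to 10 seconds for the page's scripts to add at least one.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            wait.Until(d => d.FindElements(By.TagName("img")).Count > 0);

            // PageSource returns the DOM as the browser currently sees it,
            // i.e. after the JavaScript has run.
            string renderedHtml = driver.PageSource;

            // Hand the rendered markup to HtmlAgilityPack for the usual XPath queries.
            var imagepage = new HtmlDocument();
            imagepage.LoadHtml(renderedHtml);

            // SelectNodes returns null when nothing matches.
            var images = imagepage.DocumentNode.SelectNodes("//img[@src]");
            if (images != null)
            {
                foreach (HtmlNode img in images)
                {
                    Console.WriteLine(img.GetAttributeValue("src", ""));
                }
            }
        }
    }
}
```

The key design point is that HtmlAgilityPack still does the parsing. Selenium only replaces `HtmlWeb.Load` as the way of obtaining the HTML, so existing XPath code can stay largely unchanged.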

8/17/2014 6:37:31 PM



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow