Parsing JavaScript embedded in HTML using HTML Agility Pack

c# html html-agility-pack parsing

Question

I'm trying to parse the following HTML snippet using the HTML Agility Pack:

<body id="station_page" class="">
...
<div>....</div>
<script type="text/javascript"> 
if (Blablabla == undefined) { var Blablabla = {}; }
Blablabla .Data1= "I want this data";
Blablabla .BlablablaData = 
{  "Data2":"I want this data",
"Blablabla":"",
"Blablabla":0   }
{   "Blablabla":123,
"Data3":"I want this data",
"Blablabla":123}
    Blablabla .Data4= I want this data;
</script>...

I'm trying to get these four values (Data1, Data2, Data3, Data4). First, I tried to find the script element:

doc.DocumentNode.SelectSingleNode("//script[@type='text/javascript']").InnerHtml

How can I check that this is really the right script? And once I have found it, how can I extract the four values (Data1, Data2, Data3, Data4)?

3/8/2013 2:45:22 PM

Popular Answer

HTML Agility Pack can't parse JavaScript; it only parses HTML. You can, however, select the script element you need with an XPath expression like this:

doc.DocumentNode.SelectSingleNode("//script[contains(text(), 'Blablabla')]").InnerHtml

You'll then need to extract the values from the JavaScript text by some other means, such as a regular expression or a parser built on a JavaScript grammar.
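As a rough illustration of the regex route, here is a minimal sketch that uses only System.Text.RegularExpressions. The ScriptDataExtractor class and its Extract method are hypothetical names made up for this example; they are not part of HTML Agility Pack. The sketch assumes each value appears either as a property assignment ending in a semicolon (.Data1 = ...;) or as a quoted key inside an object literal ("Data2": ...), as in the snippet from the question. It will break on values that contain a semicolon themselves, so treat it as a starting point rather than a real JavaScript parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of HTML Agility Pack.
public static class ScriptDataExtractor
{
    // Looks up each requested name in the script text, either as a property
    // assignment (obj.Name = value;) or as a quoted key ("Name": value).
    // Names that are not found are simply left out of the result.
    public static Dictionary<string, string> Extract(string scriptText, params string[] names)
    {
        var result = new Dictionary<string, string>();
        foreach (var name in names)
        {
            var escaped = Regex.Escape(name);

            // Form 1: `.Name = value;` where the value runs up to the next semicolon.
            // The (?!=) lookahead skips comparisons such as `.Name == x`.
            var match = Regex.Match(
                scriptText,
                @"\.\s*" + escaped + @"\s*=(?!=)\s*(?<value>[^;]*);");

            if (!match.Success)
            {
                // Form 2: `"Name": "value"` or `"Name": 123` inside an object literal.
                match = Regex.Match(
                    scriptText,
                    "\"" + escaped + "\"\\s*:\\s*(?<value>\"[^\"]*\"|[^,}\\s]+)");
            }

            if (match.Success)
            {
                // Drop surrounding whitespace and double quotes, if any.
                result[name] = match.Groups["value"].Value.Trim().Trim('"');
            }
        }
        return result;
    }

    public static void Main()
    {
        // Sample text modelled on the script in the question.
        var script =
            "Blablabla .Data1= \"I want this data\";\n" +
            "Blablabla .BlablablaData = \n" +
            "{  \"Data2\":\"I want this data\",\n" +
            "\"Blablabla\":0   }\n" +
            "{   \"Blablabla\":123,\n" +
            "\"Data3\":\"I want this data\"}\n" +
            "    Blablabla .Data4= I want this data;\n";

        var data = Extract(script, "Data1", "Data2", "Data3", "Data4");
        foreach (var pair in data)
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

In real use, the scriptText argument would be the InnerHtml of the node returned by the XPath query above. SelectSingleNode returns null when nothing matches, so check for null before reading InnerHtml.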

3/8/2013 2:59:40 PM


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow