Parsing JavaScript embedded in HTML using HTML Agility Pack

c# html html-agility-pack parsing

Question

I'm trying to parse the following HTML snippet with the HTML Agility Pack:

<body id="station_page" class="">
...
<div>....</div>
<script type="text/javascript"> 
if (Blablabla == undefined) { var Blablabla = {}; }
Blablabla .Data1= "I want this data";
Blablabla .BlablablaData = 
{  "Data2":"I want this data",
"Blablabla":"",
"Blablabla":0   }
{   "Blablabla":123,
"Data3":"I want this data",
"Blablabla":123}
    Blablabla .Data4= I want this data;
</script>...

I'm trying to get those four data values (Data1, Data2, Data3, Data4). First, I tried to find the script element:

doc.DocumentNode.SelectSingleNode("//script[@type='text/javascript']").InnerHtml

How can I check that it's really the right script? And once I've found the relevant script, how can I get those four data values (Data1, Data2, Data3, Data4)?

Popular Answer

You can't parse JavaScript with HTML Agility Pack; it only supports HTML parsing. You can get to the script you need with an XPath expression like this:

doc.DocumentNode.SelectSingleNode("//script[contains(text(), 'Blablabla')]").InnerHtml

But you'll need to parse the JavaScript itself with another method (a regex, a JavaScript grammar, etc.).
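
As a rough illustration of the regex route, here is a minimal sketch that pulls a named value out of the script text. It assumes the values only ever appear in the two shapes shown in the question (an assignment such as `.Data1 = ...;` or a quoted property such as `"Data2": ...`). The class and method names (`ScriptDataExtractor`, `TryExtractValue`) are made up for this example and are not part of HTML Agility Pack. It is not a JavaScript parser, and it will break on values that contain `;` or escaped quotes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of HTML Agility Pack.
public static class ScriptDataExtractor
{
    // Tries to find `name` in the script text; returns false when neither pattern matches.
    public static bool TryExtractValue(string script, string name, out string value)
    {
        string key = Regex.Escape(name);

        // Shape 1: something.Name = value;   (value may or may not be quoted)
        Match assignment = Regex.Match(script, @"\.\s*" + key + @"\s*=\s*(?<value>[^;]*);");
        if (assignment.Success)
        {
            value = assignment.Groups["value"].Value.Trim().Trim('"');
            return true;
        }

        // Shape 2: "Name": "value"  or  "Name": 123   inside an object literal
        // ("" inside a verbatim string is a single double-quote character)
        Match property = Regex.Match(
            script,
            @"""" + key + @"""\s*:\s*(?<value>""[^""]*""|[^,}\s]+)");
        if (property.Success)
        {
            value = property.Groups["value"].Value.Trim('"');
            return true;
        }

        value = string.Empty;
        return false;
    }
}
```

You would pass in the `InnerHtml` string from the XPath query above. Note that `SelectSingleNode` returns null when nothing matches, so check the node before reading `InnerHtml`. For example, `ScriptDataExtractor.TryExtractValue(scriptText, "Data3", out string data3)` should report success and put the Data3 value in `data3`, where `scriptText` is an assumed variable holding the script text.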




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why