I have the following snippet of code, retrieved from a web page:
<li class="player" data-id="168568" data-teamid="156" data-x="142.33" data-y="297.16040000000004" data-name="Corentin Tolisso" data-position="3">Corentin Tolisso<span class="shirt">24</span></li>
My goal is to extract "Corentin Tolisso", the shirt number "24" as well as the values of data-x and data-y.
So far I am able to get it to work with values that are within >...<, using HTML Agility Pack.
However I can't find a way to extract the numbers of data-x and data-y.
I have copied the HTML string into a new jsfiddle, which outputs exactly what my C# code is getting: the things between >...<.
How do I extract the values of data-x and data-y?
Note: Using String.IndexOf works fine, but it takes away flexibility. This is my fallback strategy.
Note 2: I looked here and here, both of which give me some idea, but I still have a hard time applying it to C#.
One way would be to use the regex (["'])(?:(?=(\\?))\2.)*?\1
It matches a quoted string and also handles backslash-escaped quotes inside it.
Try it at this link: https://regex101.com/r/cB0kB8/1
jQuery makes this very simple: .attr() reads any attribute of an element, including data-x and data-y.
Also check an example found here: Getting value of HTML text input
<form name="input" action="handle_email.php" method="post">
Email: <input type="text" name="email" />
<input type="submit" value="Newsletter" />
</form>
<a id="regLink" href="http://mywebsite.com/register?user_email=">Register</a>
$('input[name="email"]').change(function(){
alert($('#regLink').attr('href')+$('input[name="email"]').val());
});
Hope it helps you!