I am trying to parse HTML using Html Agility Pack. Is there a tutorial available, or can someone tell me how I can get the text from a <td> that has no id and no class?
<table id="results-table">
<tr class="row1">
<td>Diode Zener Single 12V 5% 1W 2-Pin DO-41 Bulk</td>
...
Each row contains 10 different <td> elements. Thanks!
You can try this XPath to query all the td elements within the table that has id="results-table":
//table[@id='results-table']/tr/td
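FirePath for Firefox can help you formulate the XPath, and you can adjust it from there. Keep in mind that browsers add a <tbody> element to tables that lack one, so a path copied from a browser tool may contain /tbody even though the file parsed by Html Agility Pack does not.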
Sample code below
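// Requires: using System.Diagnostics; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
var fileName = @"..\..\..\docs\10960189.htm";
doc.Load(fileName);
// SelectNodes returns null (not an empty collection) when nothing matches.
var nodes = doc.DocumentNode.SelectNodes("//table[@id='results-table']/tr/td");
if (nodes != null)
{
    foreach (var node in nodes)
    {
        // InnerText is the cell's text content without the markup.
        Debug.WriteLine(node.InnerText);
    }
}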
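Since each row holds several td elements with no id or class, it may be easier to go row by row and pick cells by position. The following is only a minimal sketch under a few assumptions: the class name TableCells, the helper ReadRows, and the second cell in the sample markup are made up for illustration, and the HTML is loaded from an inline string instead of a file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class TableCells
{
    // Hypothetical helper: returns the text of every <td>, grouped by row,
    // for the table with the given id.
    public static List<List<string>> ReadRows(string html, string tableId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<List<string>>();

        // SelectNodes returns null when nothing matches.
        var rows = doc.DocumentNode.SelectNodes($"//table[@id='{tableId}']/tr");
        if (rows == null) return result;

        foreach (var row in rows)
        {
            // "td" is relative to the current row, so only its own cells match.
            var cells = row.SelectNodes("td");
            if (cells == null) continue;

            // Decode entities such as &amp; and trim surrounding whitespace.
            result.Add(cells
                .Select(c => HtmlEntity.DeEntitize(c.InnerText).Trim())
                .ToList());
        }
        return result;
    }

    public static void Main()
    {
        // Sample markup modeled on the question; the second cell is invented.
        const string html = @"<table id=""results-table"">
<tr class=""row1"">
<td>Diode Zener Single 12V 5% 1W 2-Pin DO-41 Bulk</td>
<td>Example second cell</td>
</tr>
</table>";

        foreach (var cells in ReadRows(html, "results-table"))
        {
            // cells[0] is the first <td> of the row, cells[1] the second, and so on.
            Console.WriteLine(cells[0]);
        }
    }
}
```

If you only need one column, an XPath position predicate such as //table[@id='results-table']/tr/td[1] should select the first cell of every row directly.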
HTH