I use Html Agility Pack for parsing HTML and it works great, but I have run into a problem. This is my helper code:
public static HtmlDocument GetXHtmlFromUri2(string uri)
{
    HttpClient client = HttpClientFactory.Create(new CustomeHeaderHandler());
    var htmlDoc = new HtmlDocument()
    {
        OptionCheckSyntax = true,
        OptionFixNestedTags = true,
        OptionAutoCloseOnEnd = true,
        OptionReadEncoding = true,
        OptionDefaultStreamEncoding = Encoding.UTF8,
    };
    htmlDoc.LoadHtml(client.GetStringAsync(uri).Result);
    return htmlDoc;
}
I use Html Agility Pack in a Web API (MVC 4) project, and this is the logic of the GET method:
// GET api/values
public string GetHtmlFlights()
{
    var result = ClientFlightTabale.GetXHtmlFromUri2("http://ikiafids.ir/departureFA.html");
    HtmlNode node = result.DocumentNode.SelectSingleNode("//table[1]/tbody/tr[1]");
    string temp = node.FirstChild.InnerHtml.Trim();
    return temp;
}
But when I call this method (from a browser and from Fiddler), it throws an exception with this message:
"Object reference not set to an instance of an object." The exception is raised on this line:
string temp = node.FirstChild.InnerHtml.Trim();
Can anyone help me, please?
I think you are looking for something like this:
var result = ClientFlightTabale.GetXHtmlFromUri2("http://ikiafids.ir/departureFA.html");
var tableNode = result.DocumentNode.SelectSingleNode("//table[1]");
var titles = tableNode.Descendants("th")
                      .Select(th => th.InnerText)
                      .ToList();
var table = tableNode.Descendants("tr").Skip(1)
                     .Select(tr => tr.Descendants("td")
                                     .Select(td => td.InnerText)
                                     .ToList())
                     .ToList();
I think your selector is wrong. Try this instead:
result.DocumentNode.SelectSingleNode("//table/tr[1]")
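A likely reason the original selector fails: browsers add a tbody element to the DOM they display, but Html Agility Pack parses the raw markup, so the tbody step matches nothing unless the page source actually contains one (I am assuming it does not here; I have not checked the page). SelectSingleNode then returns null, and node.FirstChild throws. Even with a matching row, FirstChild can be a whitespace text node rather than a cell. Below is a minimal sketch that guards against both; FlightTable and FirstCellHtml are hypothetical names, not part of the original code.

```csharp
using HtmlAgilityPack;

public static class FlightTable
{
    // Hypothetical helper (not from the original post): returns the trimmed
    // inner HTML of the first td in the first row, or null if none is found.
    public static string FirstCellHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // No tbody step: the parser only sees a tbody if the markup contains one.
        HtmlNode row = doc.DocumentNode.SelectSingleNode("//table[1]/tr[1]");
        if (row == null)
        {
            return null; // SelectSingleNode returns null when nothing matches
        }

        // FirstChild may be a whitespace text node, so ask for the first td element.
        HtmlNode cell = row.Element("td");
        return cell == null ? null : cell.InnerHtml.Trim();
    }
}
```

In the controller you could pass the downloaded string to FlightTable.FirstCellHtml and return a 404 or an empty result when it yields null. Note that if the first row is a header row made of th cells, this sketch returns null as well.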