Getting data from an HTML table into a DataTable

c# html html-agility-pack linq xpath

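OK, I need to query a live website to get data from a table, put that HTML table into a DataTable, and then use the data. So far I have managed to use Html Agility Pack and XPath to get each row of the table I need, but I know there must be a way to parse it into a DataTable (C#). The code I am currently using is: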

string htmlCode = "";
using (WebClient client = new WebClient())
{
    htmlCode = client.DownloadString("http://www.website.com");
}
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();

doc.LoadHtml(htmlCode);

//My attempt at LINQ to solve the issue (not sure where to go from here)
//GetAttributeValue avoids a NullReferenceException on tables without a summary attribute
var myTable = doc.DocumentNode
    .Descendants("table")
    .Where(t => t.GetAttributeValue("summary", "") == "Table One")
    .FirstOrDefault();

//Finds all the odd rows (which are the ones I actually need, but I would prefer a
//DataTable containing all the rows!)
foreach (HtmlNode cell in doc.DocumentNode.SelectNodes("//tr[@class='odd']/td"))
{
    string test = cell.InnerText;
    //Have not gone further than this yet!
}

The HTML table on the website I am querying looks like this:

[HTML table markup not preserved in this copy]

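I am not sure whether LINQ + HAP or XPath + HAP is the better or easier way to get the desired result. I have tried both, as you can probably see. This is the first time I have written a program that queries a website, or interacts with a website in any way, so I am very unsure at the moment! Thanks in advance for any help :)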

Accepted answer

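HTML Agility Pack has no out-of-the-box method for this, but creating one should not be too hard. There are samples out there that convert XML to a DataTable using LINQ to XML. These can be reworked into what you need.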

I can help create the whole method if needed, but not today :).
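A minimal sketch of what such a method could look like, assuming the HtmlAgilityPack package is referenced. The class name `HtmlTableConverter`, the method `ToDataTable`, and the generated column names (`Column1`, `Column2`, ...) are hypothetical and not part of HTML Agility Pack; the sketch reads only `<td>` cells and treats every value as a string.

```csharp
using System;
using System.Data;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HTML Agility Pack.
public static class HtmlTableConverter
{
    // Loads the HTML, finds the first table matching tableXPath and copies
    // the text of its <td> cells into a DataTable of string columns.
    public static DataTable ToDataTable(string html, string tableXPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new DataTable();

        // SelectSingleNode returns null when nothing matches.
        HtmlNode table = doc.DocumentNode.SelectSingleNode(tableXPath);
        if (table == null)
        {
            return result;
        }

        // ".//tr" also finds rows wrapped in <thead> or <tbody>
        // (and rows of nested tables, if there are any).
        HtmlNodeCollection rows = table.SelectNodes(".//tr");
        if (rows == null)
        {
            return result;
        }

        foreach (HtmlNode row in rows)
        {
            // Decode entities such as &amp; and trim surrounding whitespace.
            string[] cells = row.Elements("td")
                .Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim())
                .ToArray();
            if (cells.Length == 0)
            {
                continue; // header row (<th> only) or empty row
            }

            // Grow the column set to fit the widest row seen so far.
            while (result.Columns.Count < cells.Length)
            {
                result.Columns.Add("Column" + (result.Columns.Count + 1), typeof(string));
            }

            DataRow dataRow = result.NewRow();
            for (int i = 0; i < cells.Length; i++)
            {
                dataRow[i] = cells[i];
            }
            result.Rows.Add(dataRow);
        }

        return result;
    }
}
```

With the question's page this might be called as `DataTable data = HtmlTableConverter.ToDataTable(htmlCode, "//table[@summary='Table One']");`. In a Windows Forms project, `HtmlDocument` may need to be written as `HtmlAgilityPack.HtmlDocument` to avoid a clash with `System.Windows.Forms.HtmlDocument`.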

Popular answer

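Using some of Jack Eker's code above and some of Mark Gravell's code (see the linked post), I managed to come up with a solution. This snippet fetches the South African public holidays for 2012, as of the time of writing.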

using System;
using System.Data;
using System.Net;
using System.Windows.Forms;
using HtmlAgilityPack;

namespace WindowsFormsApplication
{
    public partial class Form1 : Form
    {
        private DataTable dt;
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            string htmlCode;
            using (WebClient client = new WebClient())
            {
                // Some servers reject requests that carry no User-Agent header.
                client.Headers.Add(HttpRequestHeader.UserAgent, "AvoidError");
                htmlCode = client.DownloadString("http://www.info.gov.za/aboutsa/holidays.htm");
            }

            // Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument.
            HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
            doc.LoadHtml(htmlCode);

            dt = new DataTable();
            dt.Columns.Add("Name", typeof(string));
            dt.Columns.Add("Value", typeof(string));

            // Only the table with id "table2" holds the holiday data.
            // SelectNodes returns null (not an empty list) when nothing matches.
            HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//table[@id='table2']/tr");
            if (rows != null)
            {
                foreach (HtmlNode row in rows)
                {
                    HtmlNodeCollection cells = row.SelectNodes("td");
                    if (cells == null || cells.Count < 2)
                    {
                        continue; // skip rows without a name/value pair
                    }

                    // First cell is the name, second cell is the value.
                    DataRow dr = dt.NewRow();
                    dr["Name"] = cells[0].InnerText.Replace("&nbsp;", " ");
                    dr["Value"] = cells[1].InnerText.Replace("&nbsp;", " ");
                    dt.Rows.Add(dr);
                }
            }

            // Bind once, after all rows have been added.
            dataGridView1.DataSource = dt;
        }

    }
}
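Since the question asks whether LINQ + HAP or XPath + HAP is easier, here is a rough LINQ-only counterpart to the XPath approach, offered as a sketch rather than a recommendation. The class `LinqTableReader` and the method `ReadRows` are hypothetical names, and the sketch assumes the target table can be identified by its `id` attribute. One practical difference is that `Descendants` yields an empty sequence when nothing matches, whereas `SelectNodes` returns null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HTML Agility Pack.
public static class LinqTableReader
{
    // Returns the decoded <td> texts of every row in the first table
    // whose id attribute equals tableId.
    public static List<string[]> ReadRows(string html, string tableId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("table")
            // GetAttributeValue returns the default ("") when the attribute is missing.
            .Where(t => t.GetAttributeValue("id", "") == tableId)
            .Take(1)
            .SelectMany(t => t.Descendants("tr"))
            .Select(tr => tr.Elements("td")
                .Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim())
                .ToArray())
            // Drop rows that contain no <td> cells, such as header rows.
            .Where(cells => cells.Length > 0)
            .ToList();
    }
}
```

Each returned array could then be copied into a `DataRow`, for example with `dt.Rows.Add(cells[0], cells[1])` for a two-column table.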



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow