System.Collections.Generic.List`1[System.String] while web scraping

c# html-agility-pack linq

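I currently have a problem: I can't get C# to output my list as something readable, which means I can't actually tell whether the web scraping is working or extracting the wrong information.

Does anyone know how to turn System.Collections.Generic.List`1[System.String] into something readable?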

using HtmlAgilityPack;
using NScrape.Forms;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using System.Threading.Tasks;

namespace FulcrumBotManager
{
    class Program
    {
        static void Main(string[] args)
        {

            WebClient webClient = new WebClient();
            string download = webClient.DownloadString("http://localhost:1013");

            HtmlAgilityPack.HtmlDocument html = new HtmlAgilityPack.HtmlDocument();
            html.LoadHtml(download);
            List<List<string>> table = html.DocumentNode.SelectSingleNode("//table")
                        .Descendants("tr")
                        .Skip(1)
                        .Where(tr => tr.Elements("td").Count() > 1)
                        .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
                        .ToList();


           table.ForEach(Console.WriteLine);
        }
    }
}

The HTML being scraped

<!DOCTYPE html>
<!-- saved from url=(0022)http://localhost:1013/ -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<meta http-equiv="refresh" content="5">
<style>#accounts {font-family: "Trebuchet MS", Arial, Helvetica, sans-serif;border-collapse: collapse;margin: 0px auto;}#accounts td, #customers th {font-size: 1em;border: 1px solid #98bf21;padding: 3px 7px 2px 7px;}#accounts th {font-size: 1.1em;text-align: left;padding-top: 5px;padding-bottom: 4px;background-color: #A7C942;color: #ffffff;}#accounts tr.alt td {color: #000000;background-color: #EAF2D3;}</style>
</head>
<body>
<table id="accounts">
<tbody>
<tr>
<th>Run</th>
<th>Region</th>
<th>Username</th>
<th>Max IP</th>
<th>Game</th>
<th>Spell 1</th>
<th>Spell 2</th>
<th>Summoner</th>
<th>Lvl</th>
<th>Total IP</th>
<th>Total RP</th>
<th>Status</th>
</tr>
<tr>
<td>False</td>
<td>EUW</td>
<td>sage</td>
<td>0</td>
<td>ARAM</td>
<td>Barrier</td>
<td>Heal</td>
<td>Summoner</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>10:26:45 Disconnected</td>
</tr><tr class="alt">
<td>False</td>
<td>EUW</td>
<td>wily</td>
<td>0</td>
<td>ARAM</td>
<td>Barrier</td>
<td>Heal</td>
<td>Summoner</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>10:26:45 Disconnected</td>
</tr><tr>
<td>False</td>
<td>EUW</td>
<td>miles</td>
<td>0</td>
<td>ARAM</td>
<td>Barrier</td>
<td>Heal</td>
<td>Summoner</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>10:26:46 Disconnected</td>
</tr><tr class="alt">
<td>False</td>
<td>EUW</td>
<td>cookie</td>
<td>0</td>
<td>ARAM</td>
<td>Barrier</td>
<td>Heal</td>
<td>Summoner</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>10:26:47 Disconnected</td>
</tr><tr>
<td>False</td>
<td>EUW</td>
<td>lazors</td>
<td>0</td>
<td>ARAM</td>
<td>Barrier</td>
<td>Heal</td>
<td>Summoner</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>10:26:48 Disconnected</td>
</tr></tbody>
</table>
<center>Updated core files | 00:50:56:C0:00:01 00:50:56:C0:00:08 00:21:CC:73:5B:BF 08:11:96:F7:A7:0C 00:FF:8B:11:85:F4 <br>Refreshes every 5 seconds</center>
</body>
</html>

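Accepted answer

Basically, what you have is a list of lists of string :-). That means it is a two-level hierarchy.

In its current state, your code only enumerates the outer list and writes each inner list itself. Because Console.WriteLine has no special handling for lists, it simply calls ToString() on the instance, which outputs the type name.

What you really want is to enumerate the inner lists as well: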
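As a small illustration of that behaviour, here is a minimal sketch. The class name ToStringDemo and the sample row are made up for the example and are not part of the original program; string.Join is shown only as one way to get readable text out of a single list.

```csharp
using System;
using System.Collections.Generic;

public class ToStringDemo
{
    public static void Main()
    {
        // Hypothetical sample row, standing in for the cells of one scraped <tr>.
        var row = new List<string> { "False", "EUW", "sage" };

        // Binds to Console.WriteLine(object), which prints row.ToString().
        // List<T> does not override ToString(), so the type name is printed:
        // System.Collections.Generic.List`1[System.String]
        Console.WriteLine(row);

        // string.Join concatenates the elements into one readable string:
        // False, EUW, sage
        Console.WriteLine(string.Join(", ", row));
    }
}
```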

//enumerate all lists in the outer list
foreach ( var list in table )
{
   //enumerate the inner list
   foreach ( var item in list )
   {
        //output the actual item
        Console.WriteLine( item );
   }
}
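The nested loop above prints every cell on its own line. If one line per table row is easier to check against the HTML, a variation along these lines might help. This is only a sketch: RowPrinter and FormatRow are hypothetical names, the " | " separator is an arbitrary choice, and the sample data merely imitates two rows of the scraped table.

```csharp
using System;
using System.Collections.Generic;

public class RowPrinter
{
    // Hypothetical helper: joins the cells of one row into a single line.
    public static string FormatRow(IEnumerable<string> cells)
    {
        return string.Join(" | ", cells);
    }

    public static void Main()
    {
        // Sample data shaped like the List<List<string>> built by the scraper.
        var table = new List<List<string>>
        {
            new List<string> { "False", "EUW", "sage", "ARAM" },
            new List<string> { "False", "EUW", "wily", "ARAM" }
        };

        // Enumerate the outer list and print each inner list as one line.
        foreach (var row in table)
        {
            Console.WriteLine(FormatRow(row));
        }
    }
}
```

With the sample data this prints two lines, one per row, with the cells separated by " | ".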

Popular answer

table.ForEach(x => x.ForEach(Console.WriteLine));

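Martin's explanation is correct. Just as an addition: you can also do this in one line with List<T>.ForEach, as shown above (strictly speaking a List<T> method rather than LINQ).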



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow