Scrape JSON from webpage with C#

c# html-agility-pack

Question

I'm new to C# and asynchronous task execution.

I'm trying to scrape some music album info from a website. The webpage's search returns a plaintext JSON object, but I'm unable to access any DOM information. I tried the following (using HtmlAgilityPack):

using HtmlAgilityPack;
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace WindowsFormsApp1 {
public partial class Form1 : Form {
    public Form1() {
        InitializeComponent();
    }

    public async Task<String> AlbumScraper(string albumname) {

        HtmlWeb web = new HtmlWeb();

        string albumurl = Uri.EscapeUriString("https://www.metal-archives.com/search/ajax-album-search/?field=title&query=" + albumname);
        Console.Write(albumurl);
        var albumdoc = await Task.Factory.StartNew(() => web.Load(albumurl));
        string albumjson = "";

        if (albumdoc.DocumentNode != null) {
            albumjson = albumdoc.DocumentNode.InnerText;
        }

        return albumjson;
    }

    private async void Form1_Load(object sender, EventArgs e) {
        string rawtext = await AlbumScraper("rust+in+peace");
        Console.Write(rawtext);
    }
}
}

How can I get the resulting JSON text? When I load the "albumurl" URL, I can see it perfectly.

7/4/2017 5:15:27 AM

Accepted Answer

To start, you don't need HtmlAgilityPack: the search URL returns JSON, not HTML.

Download the response as a plain string instead:

using Newtonsoft.Json;      // JsonConvert
using Newtonsoft.Json.Linq; // JObject, JArray, JToken

string albumurl = Uri.EscapeUriString("https://www.metal-archives.com/search/ajax-album-search/?field=title&query=rust+in+peace");
string doc = "";
using (System.Net.WebClient client = new System.Net.WebClient()) // WebClient implements IDisposable
{
    // Download the raw JSON text returned by the search URL
    doc = client.DownloadString(albumurl);
}
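
Since the question uses async/await, the same download can probably be done without blocking the UI thread. The following is only a minimal sketch and not part of the original answer: it swaps WebClient for HttpClient, and the names AlbumSearchClient, BuildSearchUrl, and DownloadJsonAsync are hypothetical. It also assumes the site accepts spaces encoded as %20 in the query, which is what Uri.EscapeDataString produces.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class AlbumSearchClient
{
    // One shared HttpClient for the lifetime of the application.
    private static readonly HttpClient Client = new HttpClient();

    // Builds the search URL; only the user-supplied album title is escaped.
    public static string BuildSearchUrl(string albumName)
    {
        return "https://www.metal-archives.com/search/ajax-album-search/?field=title&query="
            + Uri.EscapeDataString(albumName);
    }

    // Downloads the raw JSON text without blocking the calling thread.
    public static Task<string> DownloadJsonAsync(string albumName)
    {
        return Client.GetStringAsync(BuildSearchUrl(albumName));
    }
}
```

From Form1_Load this could be called as: string doc = await AlbumSearchClient.DownloadJsonAsync("rust in peace");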

You can then deserialize it into a typed class (the AlbumSearchResponse class is from @itikhomi's answer):

AlbumSearchResponse data = JsonConvert.DeserializeObject<AlbumSearchResponse>(doc);

Alternatively, you can parse it manually:

JObject json = JObject.Parse(doc);
string error = Convert.ToString(json["error"]);
. . .
// "aaData" is an array of rows; each row is itself an array of strings
JArray arr = (JArray)json["aaData"];
foreach (JToken token in arr)
{
    string[] strarr = token.ToObject<string[]>();
}
7/4/2017 6:28:07 AM

Popular Answer

You can generate the class with the online tool http://json2csharp.com/ by pasting the JSON response into it. Then add the generated class to your code:

using System.Collections.Generic; // List<T>

public class AlbumSearchResponse
{
    public string error { get; set; }
    public int iTotalRecords { get; set; }
    public int iTotalDisplayRecords { get; set; }
    public int sEcho { get; set; }
    public List<List<string>> aaData { get; set; } // one inner list per result row
}

Then deserialize the response into that class:

// response is the JSON string downloaded from the search URL
var data = JsonConvert.DeserializeObject<AlbumSearchResponse>(response);
foreach (var item in data.aaData)
{
    // do whatever you want with the data
}

JsonConvert is part of the Newtonsoft.Json package, so you also need to install that package from NuGet.
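
To try the deserialization without hitting the website, something like the following should work. It is a sketch under assumptions: the AlbumSearchDemo class, its Parse helper, and the SampleJson payload are hypothetical, and the sample row values are made up, so the real rows returned by the site will look different. It requires the Newtonsoft.Json NuGet package.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class AlbumSearchResponse
{
    public string error { get; set; }
    public int iTotalRecords { get; set; }
    public int iTotalDisplayRecords { get; set; }
    public int sEcho { get; set; }
    public List<List<string>> aaData { get; set; }
}

public static class AlbumSearchDemo
{
    // Hypothetical sample payload with the same field names as the class above.
    public const string SampleJson =
        "{\"error\":\"\",\"iTotalRecords\":1,\"iTotalDisplayRecords\":1,\"sEcho\":0," +
        "\"aaData\":[[\"Band\",\"Album\",\"Full-length\",\"1990\"]]}";

    // Maps the JSON text onto AlbumSearchResponse by property name.
    public static AlbumSearchResponse Parse(string json)
    {
        return JsonConvert.DeserializeObject<AlbumSearchResponse>(json);
    }

    public static void Main()
    {
        AlbumSearchResponse data = Parse(SampleJson);
        foreach (List<string> row in data.aaData)
        {
            // Each row is a list of strings; join them for display.
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```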



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow