Html Agility Pack (HAP)
// From File
var doc = new HtmlDocument();
doc.Load(filePath);

// From String
var doc = new HtmlDocument();
doc.LoadHtml(html);

// From Web
var url = "http://html-agility-pack.net/";
var web = new HtmlWeb();
var doc = web.Load(url);
What's Html Agility Pack?
HAP is an HTML parser written in C# that reads and writes the DOM and supports plain XPath or XSLT.
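As a rough illustration of reading and writing the DOM with XPath, the sketch below parses a small HTML string, selects every link, and adds an attribute to each one. The sample markup, the class name `XPathSketch`, and the `rel` value are made up for this example. It assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using HtmlAgilityPack;

class XPathSketch
{
    static void Main()
    {
        // Hypothetical sample markup, invented for illustration only.
        var html = "<ul><li><a href='/a'>First</a></li><li><a href='/b'>Second</a></li></ul>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Read: select every <a> element that has an href attribute.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links == null)
        {
            Console.WriteLine("No links found.");
            return;
        }

        foreach (var link in links)
        {
            var href = link.GetAttributeValue("href", "");
            Console.WriteLine(link.InnerText + " -> " + href);

            // Write: modify the DOM in place (the attribute value is an arbitrary example).
            link.SetAttributeValue("rel", "nofollow");
        }

        // Serialize the modified document back to HTML.
        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

The null check matters in practice: iterating the result of `SelectNodes` without it throws when the XPath expression matches nothing.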
What's web scraping in C#?
Web scraping is a technique for extracting data from a website programmatically. It can be done in any language, including C#.
Is web scraping legal?
That's a gray zone! There is no official answer, and almost every company runs some kind of web scraping program. In short, crawl politely, don't flood a website with requests, and everything will be fine.
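One possible reading of "polite crawling" is to fetch pages one at a time with a pause between requests. The sketch below shows that idea only; the two-second delay, the class name `PoliteCrawlSketch`, and the single-entry URL list are assumptions for this example, not recommendations from the HAP project.

```csharp
using System;
using System.Threading;
using HtmlAgilityPack;

class PoliteCrawlSketch
{
    static void Main()
    {
        // Hypothetical crawl list; replace with the pages you are allowed to fetch.
        var urls = new[] { "http://html-agility-pack.net/" };

        // Assumed pause between requests; tune it to what the target site tolerates.
        var delay = TimeSpan.FromSeconds(2);

        var web = new HtmlWeb();
        foreach (var url in urls)
        {
            var doc = web.Load(url);

            // SelectSingleNode returns null when the page has no <title> element.
            var title = doc.DocumentNode.SelectSingleNode("//title");
            Console.WriteLine(url + ": " + (title?.InnerText ?? "(no title)"));

            // Wait before the next request so the site is not hit in a burst.
            Thread.Sleep(delay);
        }
    }
}
```

Sequential requests with a fixed delay are the simplest form of rate limiting. Checking the site's terms of use and robots.txt is a separate step that this sketch does not cover.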
When is v2.x coming?
There is no official date, but the work is in progress. Many improvements are already planned to make web scraping even easier!
Which 3rd party libraries?
You can enhance HAP with some third-party libraries:
We need your help to support Html Agility Pack!
Html Agility Pack is FREE and always will be.
However, last year alone, we spent over 3000 hours maintaining our free projects! We need resources to keep developing our open-source projects.
We highly appreciate any contribution!
> 3,000+ Requests answered per year
> $100,000 USD investment per year
> 500 Commits per year
> 100 Releases per year
