Html Agility Pack (HAP)

This library is sponsored by Entity Framework Extensions


What's Html Agility Pack?

HAP is an HTML parser written in C# that reads and writes the DOM and supports plain XPath and XSLT.

What's web scraping in C#?

Web scraping is a technique for extracting data from a website programmatically. It can be done in any language, including C#.

Is web scraping legal?

That's a gray area! There is no single official answer, and almost every company runs some kind of web scraping program. In short, crawl politely, don't flood a website with requests, and everything will be fine.

When is v2.x coming?

There is no official date, but work is in progress. Many improvements are already planned to make web scraping even easier!

Which 3rd party libraries?

You can enhance HAP with some third-party libraries:

Where can I find Html Agility Pack examples?

Online examples are now available!

We need your help to support Html Agility Pack!

Html Agility Pack is FREE and always will be.

However, last year alone, we spent over 3000 hours maintaining our free projects! We need resources to keep developing our open-source projects.

We highly appreciate any contribution!


> 3,000+ Requests answered per year
> $100,000 USD investment per year
> 500 Commits per year
> 100 Releases per year


HTML Parser

Load and parse HTML

HAP - Parser Example
// Three independent ways to load a document; use whichever fits.

// From File
var doc = new HtmlDocument();
doc.Load(filePath);

// From String
var doc = new HtmlDocument();
doc.LoadHtml(html);

// From Web
var url = "http://html-agility-pack.net/";
var web = new HtmlWeb();
var doc = web.Load(url);
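
As a self-contained illustration of the "From String" case, the following sketch loads a small HTML string and reads the text of its first heading. It assumes the HtmlAgilityPack NuGet package is referenced; the class `ParserSketch`, the helper `FirstHeading`, and the sample markup are hypothetical names made up for this example.

```csharp
using System;
using HtmlAgilityPack;

public static class ParserSketch
{
    // Hypothetical helper: text of the first <h1>, or null when there is none.
    public static string FirstHeading(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when no node matches the XPath.
        var heading = doc.DocumentNode.SelectSingleNode("//h1");
        return heading?.InnerText.Trim();
    }

    public static void Main()
    {
        // Sample markup, assumed for illustration only.
        const string html = "<html><body><h1> Hello </h1><div>text</div></body></html>";
        Console.WriteLine(FirstHeading(html));
    }
}
```

Returning null for a missing heading keeps the "no match" case explicit instead of throwing.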

HTML Selectors

Select HtmlNode, Element, and Attributes:

HAP - Selectors Examples
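// With XPath
// SelectSingleNode returns null when nothing matches, so guard the access.
var value = doc.DocumentNode
 .SelectSingleNode("//td/input")
 ?.GetAttributeValue("value", string.Empty);
 
// With LINQ (requires using System.Linq)
// Assumed intent: every input element whose class attribute is exactly "box".
var nodes = doc.DocumentNode.Descendants("input")
 .Where(x => x.GetAttributeValue("class", string.Empty) == "box")
 .ToList();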
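
The selectors above can be combined into a small runnable sketch. It assumes the HtmlAgilityPack NuGet package is referenced; `SelectorSketch`, `InputValues`, and the sample table are hypothetical. Two details are worth showing: `SelectNodes` returns null rather than an empty collection when nothing matches, and a `class` attribute may hold several space-separated tokens.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class SelectorSketch
{
    // Hypothetical helper: values of the td/input elements carrying the given class token.
    public static List<string> InputValues(string html, string cssClass)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath narrows the search to inputs that sit directly inside a table cell.
        var inputs = doc.DocumentNode.SelectNodes("//td/input");
        if (inputs == null)
            return new List<string>();

        // LINQ filters on the class tokens; a missing attribute falls back to "".
        return inputs
            .Where(n => n.GetAttributeValue("class", "").Split(' ').Contains(cssClass))
            .Select(n => n.GetAttributeValue("value", ""))
            .ToList();
    }

    public static void Main()
    {
        // Sample markup, assumed for illustration only.
        const string html =
            "<table><tr>" +
            "<td><input class='box' value='a'></td>" +
            "<td><input class='box wide' value='b'></td>" +
            "<td><input value='c'></td>" +
            "</tr></table>";

        Console.WriteLine(string.Join(",", InputValues(html, "box")));
    }
}
```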

HTML Manipulation

Manipulate HtmlNode, Element, and Attributes:

HAP - Manipulation Example
var doc = new HtmlDocument();
doc.LoadHtml(html);

// InnerHtml 
var innerHtml = doc.DocumentNode.InnerHtml;

// InnerText 
var innerText = doc.DocumentNode.InnerText;
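
The example above only reads `InnerHtml` and `InnerText`. As a rough sketch of changing a document, the following sets an attribute on every link and removes script elements before serializing the result. It assumes the HtmlAgilityPack NuGet package is referenced; `ManipulationSketch`, `Clean`, and the sample markup are hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class ManipulationSketch
{
    // Hypothetical helper: adds rel="nofollow" to links and drops <script> elements.
    public static string Clean(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, hence the null checks.
        var links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (var link in links)
                link.SetAttributeValue("rel", "nofollow");
        }

        var scripts = doc.DocumentNode.SelectNodes("//script");
        if (scripts != null)
        {
            foreach (var script in scripts)
                script.Remove(); // detaches the node from its parent
        }

        // OuterHtml on the document node serializes the modified tree.
        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        // Sample markup, assumed for illustration only.
        const string html = "<div><a href=\"/x\">x</a><script>alert(1)</script></div>";
        Console.WriteLine(Clean(html));
    }
}
```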

HTML Traversing

Traverse HtmlNode, Element, and Attributes:

HAP - Traversing Example
var doc = new HtmlDocument();
doc.LoadHtml(html);

// Descendants 
var nodes = doc.DocumentNode.Descendants("input");
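
`Descendants` walks down the tree; `AncestorsAndSelf` walks up. The sketch below uses both to build a simple element path for a node. It assumes the HtmlAgilityPack NuGet package is referenced; `TraversalSketch`, `PathTo`, and the sample markup are hypothetical.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class TraversalSketch
{
    // Hypothetical helper: element names from the outermost element down to the node.
    public static string PathTo(HtmlNode node)
    {
        var names = node.AncestorsAndSelf()
            // Skip the document node at the top of the tree.
            .Where(n => n.NodeType == HtmlNodeType.Element)
            .Select(n => n.Name)
            .Reverse();
        return string.Join("/", names);
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        // Sample markup, assumed for illustration only.
        doc.LoadHtml("<div id='main'><span>one</span><span>two</span></div>");

        // Descendants: every <span> anywhere below the document node.
        foreach (var span in doc.DocumentNode.Descendants("span"))
            Console.WriteLine(PathTo(span) + " -> " + span.InnerText);
    }
}
```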