I'm developing a crawling engine. My program crawls websites using XPath with HtmlAgilityPack. I need to get the src attribute of some image tags directly. You can see my simple code below, which is not working correctly. Thanks in advance!
PS: Please ignore the " character problem; the XPath patterns are provided by a database.
And this is the line I need to crawl (the *...* part marks the value to extract):
<img id="product_photo" src="*/images/thumb/4400/10280/st.jpg*">
Some pages provide the image in meta tags, so .Attributes["src"] won't work.
UPDATE: You can see my query and result here
You can't get the value of "src" or any other attribute by using:
Just by using:
That's because SelectSingleNode() in HtmlAgilityPack returns an HtmlNode, so an XPath expression cannot return the value of an attribute through it. So you must use SelectSingleNode(yourXpath).value, or use a regex after parsing to get just the "src" without the outer text.
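One possible workaround, as a rough sketch rather than a definitive fix: since the XPath patterns come from a database and may end in "/@src" or some other attribute, the trailing attribute step can be split off, the element selected with the remaining path, and the attribute read with GetAttributeValue. The class name AttributeXPath, the method GetValue, and the meta tag in the sample HTML are hypothetical and only for illustration; the img line is the one from the question.

```csharp
using System;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

static class AttributeXPath
{
    // Matches an XPath whose last step is an attribute, e.g. //img[@id='x']/@src.
    // Group 1 is the element path, group 2 is the attribute name.
    private static readonly Regex TrailingAttribute =
        new Regex(@"^(.+)/@([\w:-]+)$", RegexOptions.Compiled);

    // Hypothetical helper: returns the attribute value when the XPath ends in
    // "/@name", the node's inner text otherwise, or null when nothing matches.
    public static string GetValue(HtmlDocument doc, string xpath)
    {
        Match m = TrailingAttribute.Match(xpath);
        if (!m.Success)
        {
            // No trailing attribute step: fall back to the element's text.
            HtmlNode plain = doc.DocumentNode.SelectSingleNode(xpath);
            return plain == null ? null : plain.InnerText;
        }

        // Select the owning element, then read the attribute from it.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(m.Groups[1].Value);
        return node == null
            ? null
            : node.GetAttributeValue(m.Groups[2].Value, string.Empty);
    }
}

static class Program
{
    static void Main()
    {
        // Sample markup: the img is from the question; the meta tag is an
        // assumed example of a page that exposes the image differently.
        const string html =
            "<html><head><meta property=\"og:image\" content=\"/images/og.jpg\"></head>" +
            "<body><img id=\"product_photo\" src=\"/images/thumb/4400/10280/st.jpg\"></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        Console.WriteLine(AttributeXPath.GetValue(doc, "//img[@id='product_photo']/@src"));
        Console.WriteLine(AttributeXPath.GetValue(doc, "//meta[@property='og:image']/@content"));
    }
}
```

This keeps the database patterns unchanged: the same stored XPath works whether the image sits in an img src or in a meta content attribute. The regex only recognizes a simple attribute name as the final step, so patterns that end in a function call or a predicate fall through to the plain-node branch.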