I am running a web crawler that parses page content with HtmlAgilityPack, and I randomly get a StackOverflowException inside that C# library. When I try to follow the Call Stack list back up to my own code, I get:
"The maximum number of stack frames supported by Visual Studio has been exceeded."
Side note: I'm already using sjdirect's HAP fix.
Here is a snapshot (it repeats like this all the way).
Is there a way to enlarge the number of stack frames Visual Studio can track, to at least as much as the application can allocate before filling its stack? Or can the reverse be done, namely reduce the stack size of the debugged application?
The problem with a StackOverflowException is that the recursion runs so deep that the stack effectively gets trashed. This page has a recursive example that triggers this condition and ends up with 80,000 levels on the stack.
Considering that Visual Studio, last I read, is still a 32-bit application even when it debugs 64-bit processes, you may be blowing well past the memory it has available to track that many stack levels for you.
There appears to be no feature either to restrict the stack size of the CLR application or to increase the number of stack frames Visual Studio tracks.
As a solution (things like this aren't really solutions), I'll just give up on HtmlAgilityPack for extracting the text and either write an old-fashioned HTML-to-text parser myself or try one of the answers to similar questions posted on StackOverflow (very similar to Matt Crouch's question, though none of the answers there is fit for extracting renderable text from thousands of pages).
Edit: although regex is not usually recommended for handling HTML, this actually solved my problem (without having to deal with the StackOverflowException): Convert HTML to Plain Text
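For anyone who wants a starting point, here is a minimal sketch of the regex approach. It is not the code from the linked answer: the `HtmlText` class, the `ToPlainText` method, and the list of tags treated as line breaks are my own assumptions for illustration. It drops script/style blocks and comments, turns a few block-level tags into line breaks, strips the remaining tags, and decodes entities with `WebUtility.HtmlDecode`.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not taken from the linked answer.
public static class HtmlText
{
    // <script> and <style> elements are removed together with their contents.
    private static readonly Regex ScriptOrStyle = new Regex(
        @"<(script|style)\b[^>]*>.*?</\1\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // HTML comments.
    private static readonly Regex Comment =
        new Regex(@"<!--.*?-->", RegexOptions.Singleline);

    // Any run of whitespace in the markup (HTML renders it as one space).
    private static readonly Regex SourceWhitespace = new Regex(@"\s+");

    // Tags assumed to end a rendered line; extend the list as needed.
    private static readonly Regex BlockBreak = new Regex(
        @"<(br|/p|/div|/li|/tr|/h[1-6])\b[^>]*>", RegexOptions.IgnoreCase);

    // Every remaining tag.
    private static readonly Regex AnyTag = new Regex(@"<[^>]+>");

    // Whitespace other than '\n', and whitespace around line breaks.
    private static readonly Regex Spaces = new Regex(@"[^\S\n]+");
    private static readonly Regex LineBreaks = new Regex(@"\s*\n\s*");

    public static string ToPlainText(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        string text = ScriptOrStyle.Replace(html, " ");
        text = Comment.Replace(text, " ");
        text = SourceWhitespace.Replace(text, " ");
        text = BlockBreak.Replace(text, "\n");
        // A space (not an empty string) keeps adjacent cells/items apart,
        // at the cost of splitting words that contain inline tags.
        text = AnyTag.Replace(text, " ");
        text = WebUtility.HtmlDecode(text);
        text = Spaces.Replace(text, " ");
        text = LineBreaks.Replace(text, "\n");
        return text.Trim();
    }
}
```

Usage would be something like `string text = HtmlText.ToPlainText(pageSource);`. Being regex-based, it will mangle some inputs (for example a literal `<` in text, or a `>` inside an attribute value), so treat it as a rough text extractor rather than a parser.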
Thank you for your efforts and I hope this helps somebody else.