Need to replace an img src attrib with new value

c#-4.0 html-agility-pack winforms

Question

I'm retrieving the HTML of many webpages (saved earlier) from SQL Server. My goal is to modify an img's src attribute. There is only one img tag in the HTML, and its source looks like this:

... <td colspan="3" align="center"> <img src="/crossword/13cnum1.gif" height="360" width="360" border="1"><br></td> ...

I need to change the /crossword/13cnum1.gif to http://www.nostrotech.com/crossword/13cnum1.gif

Code:

    private void ReplaceTest() {
        String currentCode = string.Empty;

        Cursor saveCursor = Cursor.Current;

        try {
            Cursor.Current = Cursors.WaitCursor;
            foreach (WebData oneWebData in DataContext.DbContext.WebDatas.OrderBy(order => order.PuzzleDate)) {
                if (oneWebData.Status == "Done" ) {

                    currentCode = oneWebData.Code;

                    #region Setup Agility
                    HtmlAgilityPack.HtmlDocument AgilityHtmlDocument = new HtmlAgilityPack.HtmlDocument {
                        OptionFixNestedTags = true
                    };

                    AgilityHtmlDocument.LoadHtml(oneWebData.PageData);
                    #endregion

                    #region Image and URL
                    var imageOnPage = from imgTags in AgilityHtmlDocument.DocumentNode.Descendants()
                                                        where imgTags.Name == "img" &&
                                                                 imgTags.Attributes["height"] != null &&
                                                                 imgTags.Attributes["width"] != null
                                                        select new {
                                                            Url = imgTags.Attributes["src"].Value,
                                                            tag = imgTags.Attributes["src"],
                                                            Text = imgTags.InnerText
                                                        };

                    if (imageOnPage == null) {
                        continue;
                    }

                    imageOnPage.FirstOrDefault().tag.Value = "http://www.nostrotech.com" + imageOnPage.FirstOrDefault().Url;                                                            
                    #endregion                  
                }
            }
        }
        catch (Exception ex) {
            XtraMessageBox.Show(String.Format("Exception: " + currentCode + "!{0}Message: {1}{0}{0}Details:{0}{2}", Environment.NewLine, ex.Message, ex.StackTrace), Text, MessageBoxButtons.OK, MessageBoxIcon.Error);
        }
        finally {
            Cursor.Current = saveCursor;
        }           
    }

I need help as the markup is NOT updated this way and I need to store the modified markup back to the DB. Thanks.

Accepted Answer

XPath is much more concise than all this XLinq jargon, IMHO. Here is how to do it:

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(myHtml); // myHtml is the HTML string; Load() expects a file path or stream

    foreach (HtmlNode img in doc.DocumentNode.SelectNodes("//img[@src and @height and @width]"))
    {
        img.SetAttributeValue("src", "http://www.nostrotech.com" + img.GetAttributeValue("src", null));
    }

This code searches for img tags that have src, height and width attributes, then prepends the host to each src attribute value. The modified markup can then be read back from doc.DocumentNode.OuterHtml.
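To tie this back to the question (getting the modified markup out so it can be stored again), here is a minimal sketch of the same XPath approach wrapped in a helper. The class and method names (`ImageSrcRewriter`, `PrefixImageSources`) are made up for illustration and are not part of Html Agility Pack. It assumes the HTML arrives as a string, and it guards against `SelectNodes` returning null when nothing matches, which is how the library behaves as far as I know.

```csharp
using System;
using HtmlAgilityPack;

static class ImageSrcRewriter
{
    // Hypothetical helper (not from the original answer): returns the HTML with
    // every root-relative img src prefixed by baseUrl.
    public static string PrefixImageSources(string html, string baseUrl)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // LoadHtml parses a string; Load expects a path or stream

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection images =
            doc.DocumentNode.SelectNodes("//img[@src and @height and @width]");
        if (images == null)
        {
            return html; // nothing to rewrite, hand back the input unchanged
        }

        foreach (HtmlNode img in images)
        {
            string src = img.GetAttributeValue("src", string.Empty);

            // Only rewrite root-relative paths such as "/crossword/13cnum1.gif";
            // skip values that are already absolute or protocol-relative.
            bool isRootRelative = src.StartsWith("/", StringComparison.Ordinal)
                                  && !src.StartsWith("//", StringComparison.Ordinal);
            if (isRootRelative)
            {
                img.SetAttributeValue("src", baseUrl + src);
            }
        }

        // The serialized document, including the updated attribute values.
        return doc.DocumentNode.OuterHtml;
    }
}
```

Inside the question's loop this could be used as `oneWebData.PageData = ImageSrcRewriter.PrefixImageSources(oneWebData.PageData, "http://www.nostrotech.com");`, followed by whatever save call the data context provides (the question does not show it). Because already-absolute URLs are skipped, running the loop twice over the same rows should not double the prefix.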




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow