Is it possible to use a wildcard or a "contains" test in a switch case? (Warning: long)

c# html-agility-pack html-parsing screen-scraping

Question

I'm new to programming, so I read a lot of sample code and try to piece things together to see what works. I'm attempting to scrape a news website using the HTML Agility Pack.

Problem: One of the nodes I am testing contains the current time of viewing rather than a static label. How can I handle this in a switch-case statement? If my strategy is completely off the mark, I am also open to any recommendations.

Alternatively, if there is a way to skip this node entirely, that works for me too; I don't need to record it.

I decided to base my code on a switch example.

            var rows = doc.DocumentNode.SelectNodes(".//*[@id='weekdays']/tr");
            foreach (var row in rows)
            {
                var cells = row.SelectNodes("./td");
                string title = cells[0].InnerText;
                var valueRow = cells[2];
                switch (title)
                {
                    case "Date":
                        HtmlNode date = valueRow.SelectSingleNode("//*[starts-with(@id, 'detail_row_seek')]/td");
                        Console.WriteLine("UPC=A:\t" + date.InnerText);
                        break;

                    case "":
                        string Time = valueRow.InnerText;
                        Console.WriteLine("Time:\t" + Time);
                        break;


                    case "News":
                        string News = valueRow.InnerText;
                        Console.WriteLine("News:\t" + News);
                        break;
                }
            }

Fragment of the HTML:

<table id="weekdays" cellpadding="6" cellspacing="0" border="0" width="100%">
                    <tr>
                        <td class="thead" style="border-bottom: 1px solid #d1d1e1;font-weight:normal; text-align: center; width:8%; padding-left: 6px;">Date</td>
                        <td class="thead" style="border-bottom: 1px solid #d1d1e1;font-weight:normal; width:8%; text-align: center; white-space:nowrap"><a href="guestcp.php?do=customoptions" title="Time &amp; Date Options"><img style="position:relative; vertical-align: bottom;" src="images/misc/clock_small.gif" title="Time &amp; Date Options" alt="Time &amp; Date Options" border="0" /></a><a href="guestcp.php?do=customoptions" title="Time &amp; Date Options"><span id="ff_nowtime_clock">3:20pm</span></a></td>
                        <td class="thead" style="border-bottom: 1px solid #d1d1e1;font-weight:normal; text-align: center; width:8%;">News</td>

.........

                    <tr id="detail_row_seek_37876">

        <td id="toprow_9" class="alt1 espace" rowspan="3" style="vertical-align: top; text-align: center;" nowrap="nowrap">
            <span class="smallfont">
                <div>Sat</div>
                Apr 9
            </span>
        </td>

    <td class="alt1 espace" style="text-align: center;" nowrap="nowrap">

            <span class="smallfont">Day 3</span>


    </td>
    <td class="alt1 espace" style="text-align: center;"><span class="smallfont">EUR</span></td>
    <td class="alt1 espace" style="padding-top: 2px" align="center">


<a name="chart=37876" style="position:absolute; margin-top: -10px;"></a><a name="details=37876" style="position:absolute; margin-top: -10px;"></a>


<div class="cal_imp_medium" title="Medium Impact Expected"></div></td>
    <td class="alt1 espace">

        <div class="smallfont" id="title_37876" style="padding-left: 11px;">ECOFIN Meetings</div>

    </td>

The issue is that the header of the time column is not static text; it shows the current time instead. Is there a way around this, such as using a wildcard in the case or doing a "contains" test?

1
2
4/9/2011 11:04:34 PM

Accepted Answer

Each case label in a switch statement must be a constant value.

The only way I can think of to do what you're trying to do is to use the default: case. Inside that default case you can test the value you're looking for with an if, using Contains, a Parse method, or a Regex test.

Sorry, I couldn't fully follow your HTML example, but the updated C# would look something like this:

            switch (title)
            {
                case "Date":
                    HtmlNode date = valueRow.SelectSingleNode("//*[starts-with(@id, 'detail_row_seek')]/td");
                    Console.WriteLine("UPC=A:\t" + date.InnerText);
                    break;


                case "News":
                    string News = valueRow.InnerText;
                    Console.WriteLine("News:\t" + News);
                    break;

                default:
                    // regexTime is a Regex you define elsewhere for the time format
                    if (regexTime.IsMatch(title))
                    {
                        string Time = valueRow.InnerText;
                        Console.WriteLine("Time:\t" + Time);
                    }
                    break;
            }
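
The snippet above uses a regexTime object without defining it. As a minimal sketch, assuming the time header always looks like "3:20pm" as in the HTML fragment from the question, it could be set up as follows. The pattern, the class name TimeHeader, and the method names are hypothetical and only for illustration; adjust the pattern if the site uses another time format.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeHeader
{
    // Assumed format: 1-2 digit hour, colon, 2 digit minute, then "am" or "pm".
    private static readonly Regex regexTime =
        new Regex(@"^\s*\d{1,2}:\d{2}\s*(am|pm)\s*$", RegexOptions.IgnoreCase);

    // Regex.IsMatch returns a bool, so it can be used directly in an if.
    public static bool LooksLikeTime(string title)
    {
        return title != null && regexTime.IsMatch(title);
    }

    // Constant labels are handled by their own cases; anything else
    // falls through to default, where the non-constant test happens.
    public static string Classify(string title)
    {
        switch (title)
        {
            case "Date":
                return "Date";
            case "News":
                return "News";
            default:
                return LooksLikeTime(title) ? "Time" : "Unknown";
        }
    }
}
```

With this sketch, TimeHeader.Classify("3:20pm") returns "Time", TimeHeader.Classify("News") returns "News", and any other text returns "Unknown".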
5
4/9/2011 11:17:44 PM

Popular Answer

Use the "default:" case and put a condition inside it to do the check.

                switch (title) {
                    case "Date":
                        HtmlNode date = valueRow.SelectSingleNode("//*[starts-with(@id, 'detail_row_seek')]/td");
                        Console.WriteLine("UPC=A:\t" + date.InnerText);
                        break;

                    case "News":
                        string News = valueRow.InnerText;
                        Console.WriteLine("News:\t" + News);
                        break;

                    default:
                        // Put whatever check you need here; a "contains" test is one example.
                        if (title.Contains(":")) {
                            Console.WriteLine("Time:\t" + valueRow.InnerText);
                        }
                        break;
                }
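
Both answers date from 2011. Since C# 7.0, a case label can also carry a pattern with a when guard, so a "contains"-style test can sit in the switch itself. A minimal sketch, assuming a compiler that supports C# 7.0 or later and assuming the time header ends in "am" or "pm" as in the question's HTML (the class name HeaderSwitch and the method Classify are hypothetical):

```csharp
using System;

public static class HeaderSwitch
{
    public static string Classify(string title)
    {
        switch (title)
        {
            case "Date":
                return "Date";
            case "News":
                return "News";
            // Pattern with a guard: matches any non-null string ending in "am" or "pm".
            case string t when t.EndsWith("am", StringComparison.OrdinalIgnoreCase)
                            || t.EndsWith("pm", StringComparison.OrdinalIgnoreCase):
                return "Time";
            default:
                return "Unknown";
        }
    }
}
```

Here HeaderSwitch.Classify("3:20pm") returns "Time". Cases are evaluated in order, so the constant labels should come before the guarded one. On older compilers the default-plus-if approach from the answers is still the way to go.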


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow