Building A Feed Reader #3: Downloading Feeds
Welcome back. After taking a look at understanding the problem, today I will discuss some of the issues to do with connectivity. But first:
Progress As Of Today
If I do say so myself, I have been VERY BUSY working on this. It is beginning to take shape and then some.
New Icons
Preliminary UI For blogs & smart feeds
Smart feeds will be searches or watches for particular keywords in all subscribed feeds
Marking items as read and unread
Less Laziness
Don’t you hate it when you see the following: 1 Message(s) unread? That’s laziness right there. LAZINESS!
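Avoiding the lazy label takes only a few lines. Here is a minimal sketch; the UnreadLabel class and the exact wording of the labels are my own illustration, not code from the reader itself:

```csharp
using System;

public static class UnreadLabel
{
    // Builds a grammatically correct label for the unread count
    // instead of falling back on "Message(s)".
    public static string Format(int count)
    {
        if (count < 0)
            throw new ArgumentOutOfRangeException("count");
        if (count == 0)
            return "No unread messages";
        return count == 1
            ? "1 message unread"
            : count + " messages unread";
    }
}
```

Calling UnreadLabel.Format(1) gives "1 message unread", while UnreadLabel.Format(5) gives "5 messages unread".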
Tags
You can create your own tags and then apply them to articles. Since there can be dozens of tags, a way to force you to organize yourself is to use what I’ll call quick tags. What are quick tags? Simply tags that you decide you will use more often than most. They will appear in the main UI as follows:
Search
The search on the main UI is now functional
Hitting enter pops up this screen
As you can see, the article text, title, categories, authors and tags are used in the search
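A search across those fields can be sketched roughly as follows. The Article class and its field names are hypothetical stand-ins for illustration (the reader's real objects may look different), and the match is a plain case-insensitive substring test:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical article shape, for illustration only.
public class Article
{
    public string Title = "";
    public string Text = "";
    public List<string> Categories = new List<string>();
    public List<string> Authors = new List<string>();
    public List<string> Tags = new List<string>();
}

public static class ArticleSearch
{
    // Case-insensitive substring test that tolerates null fields.
    private static bool Contains(string haystack, string term)
    {
        return haystack != null &&
               haystack.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // True if the term appears in the title, text, or any category, author or tag.
    public static bool Matches(Article article, string term)
    {
        return Contains(article.Title, term)
            || Contains(article.Text, term)
            || article.Categories.Any(c => Contains(c, term))
            || article.Authors.Any(a => Contains(a, term))
            || article.Tags.Any(t => Contains(t, term));
    }

    // Returns the articles that match, in their original order.
    public static List<Article> Search(IEnumerable<Article> articles, string term)
    {
        return articles.Where(a => Matches(a, term)).ToList();
    }
}
```

Note that an empty search term matches every article in this sketch, so a real UI would probably want to ignore blank input.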
Click Through Searches
A particularly handy feature is the ability to click on an article’s categories and tags and immediately get a list of other articles with the same category or tag
Clicking on a category link will pop up the list below
Clicking a tag link will pop up the list below
Task Area Menu
I’ve added support to display the reader only in the task area (configurable from the settings). See the last issue for details
Lesson For Today: Connectivity
The very first problem is getting the raw XML feeds, identifying them and then as appropriate interpreting them for storage and display.
Identifying & Downloading Feeds
Identifying the feed type (ATOM, RSS, RSD, etc.) should ideally be something that you do once, so I identify the feed type when you subscribe for the first time. How?
As it so happens, the Argotic Framework has a SyndicationDiscovery class that, surprise surprise, discovers a whole lot of information about a feed. It is especially useful because it has a LocateEndpoints method that, given a URL, determines all the RSS feeds that can be subscribed to at that URL.
The relevant code is as follows:
    // Determine the feed format from a given URL
    Uri uri = new Uri("http://thinkersroom.com/bytes/");
    SyndicationFormat format = SyndicationDiscovery.DetermineSyndicationFormat(uri);
The first time I ran this code it failed spectacularly. Why? Because the Argotic Framework assumes a direct connection to the Internet! Now that right there is a problem for those of us that are behind proxy servers!
Digging around the source led to the discovery that it is peppered with code like this:
    // Create web client to download feed data
    using (WebClient webClient = new WebClient())
    {
        // Apply security credentials if available
        if (this.Settings != null && this.Settings.Credentials != null)
        {
            webClient.Credentials = this.Settings.Credentials;
        }

        // Download feed data
        byte[] data = webClient.DownloadData(sourceAsUri);

        // Load syndication feed using created stream
        this.Load(new MemoryStream(data));
    }
This will fail every time if you are behind a proxy server. Fixing it is trivial enough. Since the Argotic source is peppered with such code, a utility function to correctly build a WebClient is needed:
    public static WebClient GetClient(WebProxy proxy)
    {
        // Create the web client
        WebClient webClient = new WebClient();

        // Add a header to identify ourselves
        webClient.Headers.Add("user-agent", "FeedCruncher");

        // Assign the proxy where applicable
        if (proxy != null)
            webClient.Proxy = proxy;

        return webClient;
    }
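With that helper in place, each of the direct new WebClient() calls can be swapped for a call to it. Here is a rough sketch of what a patched download could look like. FeedDownloader and its Download method are names I made up for illustration; they are not part of the Argotic Framework, and GetClient is repeated only so that the sketch compiles on its own:

```csharp
using System;
using System.IO;
using System.Net;

public static class FeedDownloader
{
    // Same shape as the GetClient helper above.
    public static WebClient GetClient(WebProxy proxy)
    {
        WebClient webClient = new WebClient();
        webClient.Headers.Add("user-agent", "FeedCruncher");
        if (proxy != null)
            webClient.Proxy = proxy;
        return webClient;
    }

    // Hypothetical wrapper: downloads the raw feed through the proxy (if any)
    // and returns it as a stream that a feed parser can load.
    public static Stream Download(Uri source, WebProxy proxy, ICredentials credentials)
    {
        using (WebClient webClient = GetClient(proxy))
        {
            // Credentials for the feed itself, separate from the proxy credentials
            if (credentials != null)
                webClient.Credentials = credentials;

            byte[] data = webClient.DownloadData(source);
            return new MemoryStream(data);
        }
    }
}
```

Passing null for the proxy leaves the WebClient on its default settings, so the same code path serves users with a direct connection.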
How about the proxy that’s being passed then?
This one is constructed by the client based on the configured user settings. Here is the code (I’m writing the UI itself in VB.NET, and the logic in C#. Why? Because I wondered if I could!):
    If My.Settings.UseProxy Then
        ' Create a web proxy
        Dim proxy As WebProxy = Nothing
        If My.Settings.ProxyAuthentication Then
            ' Create network credential for authentication
            Dim nc As New NetworkCredential(My.Settings.Username, My.Settings.Password, My.Settings.Domain)

            ' Create web proxy with overload for network credentials
            proxy = New WebProxy(String.Format("{0}:{1}", My.Settings.Host, My.Settings.Port), True, _
                Regex.Split(My.Settings.ByPassAddresses, ";|,"), nc)
        Else
            ' Create web proxy without overload for network credentials
            proxy = New WebProxy(String.Format("{0}:{1}", My.Settings.Host, My.Settings.Port), True, _
                Regex.Split(My.Settings.ByPassAddresses, ";|,"))
        End If

        ' Return our proxy
        Return proxy
    Else
        ' We don't need a proxy. Return nothing
        Return Nothing
    End If
And so, our connectivity issue is sorted.
The DetermineSyndicationFormat method returns an enumeration. Once you know the format, you can download the appropriate feed type using code like this:
    SyndicationFormat sf = SyndicationDiscovery.DetermineSyndicationFormat(new Uri(URL), proxy);
    switch (sf)
    {
        case SyndicationFormat.Rss:
            // Download the feed and copy the channel details into our own object
            RssFeed rfeed = RssFeed.Create(new Uri(URL), proxy);
            myFeed.Type = FeedType.RSS;
            myFeed.SiteURL = rfeed.Channel.Link.OriginalString;
            myFeed.FeedURL = URL;
            myFeed.Name = rfeed.Channel.Title;
            myFeed.Header = rfeed.Header;
            myFeed.Description = rfeed.Channel.Description;
            myFeed.LastUpdated = rfeed.Channel.LastBuildDate;
            myFeed.Generator = rfeed.Channel.Generator;

            // Copy each item, along with its categories and GUID
            foreach (RssItem item in rfeed.Channel.Items)
            {
                FeedItem myFeedItem = new FeedItem(item.Link.OriginalString, item.Title, item.Description, item.PublicationDate, item.Guid.Value, false);
                myFeedItem.FeedID = myFeed.FeedID;
                foreach (RssCategory cat in item.Categories)
                    myFeedItem.Categories.Add(new Category(cat.Value));
                myFeed.Items.Add(myFeedItem);
                myFeed.Guids.Add(item.Guid.Value);
            }
            break;
        case SyndicationFormat.Atom:
            //
            // Same logic but for the ATOM format
            //
            break;
    }
    return myFeed;
To store the results, I use plain old C# objects to mirror a blog (Feed) and its items (FeedItems)
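For reference, here is a rough sketch of what such objects could look like. The member names are taken from the download code above, but the types are guesses made for illustration: for instance, Guids is shown as a HashSet so that duplicates are ignored, and the Header member is left out because its type is not shown.

```csharp
using System;
using System.Collections.Generic;

public enum FeedType { RSS, Atom }

public class Category
{
    public string Name { get; private set; }
    public Category(string name) { Name = name; }
}

public class FeedItem
{
    public int FeedID { get; set; }
    public string Link { get; private set; }
    public string Title { get; private set; }
    public string Description { get; private set; }
    public DateTime Published { get; private set; }
    public string Guid { get; private set; }
    public bool IsRead { get; set; }
    public List<Category> Categories { get; private set; }

    // Mirrors the six-argument constructor used in the download code.
    public FeedItem(string link, string title, string description,
                    DateTime published, string guid, bool isRead)
    {
        Link = link;
        Title = title;
        Description = description;
        Published = published;
        Guid = guid;
        IsRead = isRead;
        Categories = new List<Category>();
    }
}

public class Feed
{
    public int FeedID { get; set; }
    public FeedType Type { get; set; }
    public string SiteURL { get; set; }
    public string FeedURL { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public string Generator { get; set; }
    public DateTime LastUpdated { get; set; }
    public List<FeedItem> Items { get; private set; }
    // GUIDs of the items in this feed (assumed to be a set)
    public HashSet<string> Guids { get; private set; }

    public Feed()
    {
        Items = new List<FeedItem>();
        Guids = new HashSet<string>();
    }
}
```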
So, as you can see the bulk of the work is done by the Argotic Framework.
Conditional Get
When it comes to feed readers, an essential consideration is saving bandwidth. The most common way is to fetch a feed only if it has been updated since the last check. This is done with an HTTP conditional GET: the client sends back the Last-Modified date or ETag it received last time (in the If-Modified-Since and If-None-Match request headers), and the server answers 304 Not Modified with no body if nothing has changed. In this manner we can save quite a bit of bandwidth.
However, the Argotic build that I believe to be the latest, 2.0, always returns true for the HasChanges property. I’ve been looking through the code for a while, but it’s not clear to me how the headers are set in the conditional GET request. Hmm.
Now what do we do once we have this data in our custom objects? Why, save it of course.
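Until that is figured out, a conditional GET can be done by hand with HttpWebRequest. The following is a sketch only, not how Argotic does it: ConditionalFetcher and FetchResult are names invented for illustration, and the caller is assumed to store the ETag and Last-Modified values between polls.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical result type, for illustration only.
public sealed class FetchResult
{
    public bool Changed;          // false when the server answered 304
    public byte[] Data;           // feed bytes, null when unchanged
    public string ETag;           // value to send back on the next poll
    public DateTime LastModified; // value to send back on the next poll
}

public static class ConditionalFetcher
{
    public static FetchResult Fetch(Uri uri, IWebProxy proxy,
                                    string etag, DateTime? lastModified)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.UserAgent = "FeedCruncher";
        if (proxy != null)
            request.Proxy = proxy;

        // Send the validators from the previous download, if we have any
        if (!string.IsNullOrEmpty(etag))
            request.Headers[HttpRequestHeader.IfNoneMatch] = etag;
        if (lastModified.HasValue)
            request.IfModifiedSince = lastModified.Value;

        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (Stream body = response.GetResponseStream())
            using (MemoryStream buffer = new MemoryStream())
            {
                body.CopyTo(buffer);
                return new FetchResult
                {
                    Changed = true,
                    Data = buffer.ToArray(),
                    ETag = response.Headers[HttpResponseHeader.ETag],
                    // Falls back to the current time if the header is absent
                    LastModified = response.LastModified
                };
            }
        }
        catch (WebException ex)
        {
            // HttpWebRequest reports 304 Not Modified as a protocol error
            HttpWebResponse response = ex.Response as HttpWebResponse;
            if (response != null && response.StatusCode == HttpStatusCode.NotModified)
            {
                response.Close();
                return new FetchResult
                {
                    Changed = false,
                    ETag = etag,
                    LastModified = lastModified ?? DateTime.MinValue
                };
            }
            throw;
        }
    }
}
```

On the first poll, pass null for both the ETag and the date. After that, pass back whatever the previous FetchResult contained and parse the feed only when Changed is true.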
Stay tuned for Episode 4: Persisting to Database

NinjaCross says:
Added on December 7th, 2007 at 4:47 pm
Nice post, and promising project!
Some months ago I was so frustrated by the lack of completeness in currently available feed readers that I started thinking about building my own.
Unfortunately my work priorities took all my spare time, so everything stopped there.
I hope you’ll fill the gap, and publish a beta as soon as possible.