Building A Feed Reader #3: Downloading Feeds

Welcome back. After looking at the problem itself last time, today I will discuss some of the issues to do with connectivity. But first:

Progress As Of Today

If I do say so myself, I have been VERY BUSY working on this. It is beginning to take shape and then some.

New Icons

[Screenshot: new icons]

Preliminary UI For blogs & smart feeds

[Screenshot: folders and feeds]

Smart feeds will be searches or watches for particular keywords across all subscribed feeds.

Marking items as read and unread

[Screenshot: read and unread items]

Less Laziness

[Screenshot: unread count]

Don’t you hate it when you see the following: 1 Message(s) unread? That’s laziness right there. LAZINESS!
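Avoiding the "Message(s)" cop-out takes only a few lines. Here is a minimal sketch of how the label could be built; the class name UnreadLabel and the exact wording of the strings are my own placeholders, not necessarily what the reader ends up using:

```csharp
using System;

public static class UnreadLabel
{
    // Builds a grammatically correct label for the unread count.
    // Hypothetical helper: names and wording are illustrative only.
    public static string Format(int count)
    {
        if (count < 0)
            throw new ArgumentOutOfRangeException("count");
        if (count == 0)
            return "No messages unread";
        // Singular for exactly one item, plural otherwise
        return count == 1
            ? "1 message unread"
            : count + " messages unread";
    }
}
```

Calling UnreadLabel.Format(1) gives "1 message unread", and UnreadLabel.Format(3) gives "3 messages unread".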

Tags

[Screenshot: tag creation]

You can create your own tags and then apply them to articles. Since there can be dozens of tags, a way to force you to organize yourself is to use what I’ll call quick tags. What are quick tags? Simply tags that you decide you will use more often than most. They will appear in the main UI as follows:

[Screenshot: quick tags]

Search

The search on the main UI is now functional.

[Screenshot: search box]

Hitting Enter pops up this screen:

[Screenshot: search results]

As you can see, the article text, title, categories, authors and tags are all used in the search.

Click Through Searches

A particularly handy feature is the ability to click on an article’s categories and tags and immediately get a list of other articles with the same category or tag.

[Screenshot: click-through links]

Clicking a category link pops up the screen below:

[Screenshot: category click results]

Clicking a tag link pops up the screen below:

[Screenshot: tag click results]

Task Area Menu

I’ve added support to display the reader only in the task area (configurable from the settings). See the last issue for details.

[Screenshot: task area menu]

Lesson For Today: Connectivity

The very first problem is getting the raw XML feeds, identifying them and then as appropriate interpreting them for storage and display.

Identifying & Downloading Feeds

Identifying the feed type (ATOM, RSS, RSD, etc.) ideally should be something that you do once. So I identify the feed type when you subscribe for the first time. How?

As it so happens, the Argotic Framework has a SyndicationDiscovery class that, surprise surprise, discovers a whole lot of information about a feed. It is especially useful because it has a LocateEndpoints method that, given a URL, determines all the RSS feeds that can be subscribed to at that URL.

The relevant code is as follows:

 

    // ------------------------------------------------------------
    //    Determine the feed format from a given URL
    // ------------------------------------------------------------
    Uri uri                     = new Uri("http://thinkersroom.com/bytes/");
    SyndicationFormat format    = SyndicationDiscovery.DetermineSyndicationFormat(uri);

 

The first time I ran this code it failed spectacularly. Why? Because the Argotic Framework assumes a direct connection to the Internet! Now that right there is a problem for those of us that are behind proxy servers!

Digging around the source led to the discovery that it is peppered with code like this:

 

    // ------------------------------------------------------------
    //    Create web client to download feed data
    // ------------------------------------------------------------
    using(WebClient webClient = new WebClient())
    {
        // ------------------------------------------------------------
        //    Apply security credentials if available
        // ------------------------------------------------------------
        if(this.Settings != null && this.Settings.Credentials != null)
        {
            webClient.Credentials   = this.Settings.Credentials;
        }

        // ------------------------------------------------------------
        //    Download feed data
        // ------------------------------------------------------------
        byte[] data = webClient.DownloadData(sourceAsUri);

        // ------------------------------------------------------------
        //    Load syndication feed using created stream
        // ------------------------------------------------------------
        this.Load(new MemoryStream(data));
    }

 

This will fail every time if you are behind a proxy server, because the WebClient is never told about the proxy. Fixing it is trivial enough. Since the Argotic source is peppered with such code, a utility function to correctly build a WebClient is needed:

    public static WebClient GetClient(WebProxy proxy)
    {
        //
        // Create the web client
        //
        WebClient webClient = new WebClient();
        //
        // Add a header to identify ourselves
        //
        webClient.Headers.Add("user-agent", "FeedCruncher");
        //
        // Assign the proxy where applicable
        //
        if (proxy != null)
            webClient.Proxy = proxy;
        return webClient;
    }

 

How about the proxy that’s being passed then?

This one is constructed by the client based on the configured user settings. Here is the code. (I’m writing the UI itself in VB.NET, and the logic in C#. Why? Because I wondered if I could!)

 

    If My.Settings.UseProxy Then
        ' Create a web proxy
        Dim proxy As WebProxy = Nothing
        If My.Settings.ProxyAuthentication Then
            ' Create network credential for authentication
            Dim nc As New NetworkCredential(My.Settings.Username, My.Settings.Password, My.Settings.Domain)
            ' Create web proxy using the overload that takes network credentials
            proxy = New WebProxy(String.Format("{0}:{1}", My.Settings.Host, My.Settings.Port), True, _
                Regex.Split(My.Settings.ByPassAddresses, ";|,"), nc)
        Else
            ' Create web proxy using the overload without network credentials
            proxy = New WebProxy(String.Format("{0}:{1}", My.Settings.Host, My.Settings.Port), True, _
                Regex.Split(My.Settings.ByPassAddresses, ";|,"))
        End If
        ' Return our proxy
        Return proxy
    Else
        ' We don't need a proxy. Return nothing
        Return Nothing
    End If

 

And so, our connectivity issue is sorted.

The DetermineSyndicationFormat method returns an enumeration. Once you know the format, you can download the feed using the matching feed class, with code like this:

    // Work out the format first, going through the proxy where one is configured
    SyndicationFormat sf = SyndicationDiscovery.DetermineSyndicationFormat(new Uri(URL), proxy);
    switch (sf)
    {
        case SyndicationFormat.Rss:
            // Download the RSS feed and copy the channel details into our own object
            RssFeed rfeed = RssFeed.Create(new Uri(URL), proxy);
            myFeed.Type = FeedType.RSS;
            myFeed.SiteURL = rfeed.Channel.Link.OriginalString;
            myFeed.FeedURL = URL;
            myFeed.Name = rfeed.Channel.Title;
            myFeed.Header = rfeed.Header;
            myFeed.Description = rfeed.Channel.Description;
            myFeed.LastUpdated = rfeed.Channel.LastBuildDate;
            myFeed.Generator = rfeed.Channel.Generator;
            // Copy each item, its categories and its GUID
            foreach (RssItem item in rfeed.Channel.Items)
            {
                FeedItem myFeedItem = new FeedItem(item.Link.OriginalString, item.Title, item.Description, item.PublicationDate, item.Guid.Value, false);
                myFeedItem.FeedID = myFeed.FeedID;
                foreach(RssCategory cat in item.Categories)
                    myFeedItem.Categories.Add(new Category(cat.Value));
                myFeed.Items.Add(myFeedItem);
                myFeed.Guids.Add(item.Guid.Value);
            }
            break;
        case SyndicationFormat.Atom:
            //
            // Same logic but for the ATOM format
            //
            break;
    }
    return myFeed;

 

To store the results, I use plain old C# objects to mirror a blog (Feed) and its items (FeedItems).
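To give an idea of the shape of those objects, here is a rough sketch inferred only from how they are used in the snippet above. The property types, the HashSet used for Guids, and the internals of Category are my assumptions for illustration; the Header property is left out because its type comes from Argotic and is not shown here.

```csharp
using System;
using System.Collections.Generic;

public enum FeedType { RSS, Atom }

// A category attached to an item (shape assumed)
public class Category
{
    public string Name { get; private set; }
    public Category(string name) { Name = name; }
}

// One article in a feed
public class FeedItem
{
    public int FeedID { get; set; }
    public string Link { get; private set; }
    public string Title { get; private set; }
    public string Description { get; private set; }
    public DateTime Published { get; private set; }
    public string Guid { get; private set; }
    public bool IsRead { get; set; }
    public List<Category> Categories { get; private set; }

    public FeedItem(string link, string title, string description,
                    DateTime published, string guid, bool isRead)
    {
        Link = link;
        Title = title;
        Description = description;
        Published = published;
        Guid = guid;
        IsRead = isRead;
        Categories = new List<Category>();
    }
}

// One subscribed blog
public class Feed
{
    public int FeedID { get; set; }
    public FeedType Type { get; set; }
    public string SiteURL { get; set; }
    public string FeedURL { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public string Generator { get; set; }
    public DateTime LastUpdated { get; set; }
    public List<FeedItem> Items { get; private set; }
    // Assumption: a set makes "have we already seen this item?" a cheap lookup
    public HashSet<string> Guids { get; private set; }

    public Feed()
    {
        Items = new List<FeedItem>();
        Guids = new HashSet<string>();
    }
}
```

Nothing in these classes knows about Argotic, so the rest of the application only ever deals with Feed and FeedItem.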

So, as you can see the bulk of the work is done by the Argotic Framework.

Conditional Get

When it comes to feed readers, an essential consideration is saving bandwidth. The most common way is to fetch a feed only if it has been updated since the last check. This relies on HTTP conditional GET: the client sends back the validators it received last time (the Last-Modified date and/or the ETag), and the server answers 304 Not Modified with no body if nothing has changed. In this manner we can save quite a bit of bandwidth.

However, the Argotic build that I believe to be the latest, 2.0, always returns true for the HasChanges property. I’ve been looking for the relevant code for a while, but it’s not clear to me how the headers are set in the conditional GET request. Hmm.

Now what do we do once we have this data in our custom objects? Why, save it of course.
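For reference, a conditional GET can also be done by hand with HttpWebRequest, independently of Argotic. The following is only a sketch under my own assumptions: ConditionalFetcher, FetchResult, ApplyValidators and Fetch are hypothetical names, and the caller is assumed to have stored the ETag and Last-Modified values from the previous download.

```csharp
using System;
using System.IO;
using System.Net;

public sealed class FetchResult
{
    public bool Modified;           // false when the server answered 304
    public byte[] Data;             // null when nothing changed
    public string ETag;             // validator to send next time
    public DateTime? LastModified;  // validator to send next time
}

public static class ConditionalFetcher
{
    // Adds the conditional headers for whichever validators we have stored
    public static void ApplyValidators(HttpWebRequest request, string etag, DateTime? lastModified)
    {
        if (!string.IsNullOrEmpty(etag))
            request.Headers[HttpRequestHeader.IfNoneMatch] = etag;
        if (lastModified.HasValue)
            request.IfModifiedSince = lastModified.Value;
    }

    public static FetchResult Fetch(Uri feedUri, string etag, DateTime? lastModified, IWebProxy proxy)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(feedUri);
        request.UserAgent = "FeedCruncher";
        if (proxy != null)
            request.Proxy = proxy;
        ApplyValidators(request, etag, lastModified);

        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                if (response.StatusCode == HttpStatusCode.NotModified)
                    return Unchanged(etag, lastModified);

                using (Stream body = response.GetResponseStream())
                using (MemoryStream buffer = new MemoryStream())
                {
                    body.CopyTo(buffer);
                    // Only trust LastModified when the header was actually sent
                    bool hasDate = response.Headers["Last-Modified"] != null;
                    return new FetchResult
                    {
                        Modified = true,
                        Data = buffer.ToArray(),
                        ETag = response.Headers["ETag"],
                        LastModified = hasDate ? (DateTime?)response.LastModified : null
                    };
                }
            }
        }
        catch (WebException ex)
        {
            // A 304 usually surfaces as a WebException rather than a normal response
            HttpWebResponse response = ex.Response as HttpWebResponse;
            if (response != null && response.StatusCode == HttpStatusCode.NotModified)
            {
                response.Close();
                return Unchanged(etag, lastModified);
            }
            throw;
        }
    }

    private static FetchResult Unchanged(string etag, DateTime? lastModified)
    {
        return new FetchResult { Modified = false, ETag = etag, LastModified = lastModified };
    }
}
```

When Modified comes back false, the stored copy of the feed is still current and there is nothing to parse; otherwise the new ETag and LastModified values should be saved alongside the feed for the next check.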

Stay tuned for Episode 4: Persisting to Database

 


One response


  1. Nice post, and promising project.
    Some months ago I was so frustrated by the lack of completeness in currently available feed readers that I started thinking about building my own.
    Unfortunately my work priorities stole all my spare time, so everything stopped there.
    I hope you’ll fill the gap, and publish a beta as soon as possible :) !
