Although ASP.NET MVC's inherent routing capabilities provide search engine optimization (SEO) benefits right out of the box, there are still a large number of problems that IIS and ASP.NET sites can run afoul of if left unchecked. In this article, we'll look at one such problem, and I'll provide you with an insanely simple solution for addressing it. For more SEO development articles, see "ASP.NET MVC, SEO, and NotFoundResults: A Better Way to Handle Missing Content" and "ASP.NET MVC and Search Engine Optimization."


Duplicate Content and Canonicalization

A key goal for many search engines is to reduce the amount of duplicate content that's shown in the results when someone executes a search. Ironically, IIS and ASP.NET sites easily create duplicate content by default, simply because IIS and ASP.NET go to such great lengths to help ensure that visitors are able to find the content that's being requested. For example, a simple categories page on a given website can typically be reached (under default conditions) through several URLs that differ only in capitalization, in the presence or absence of a trailing slash, or in the use of "www," and developers would consider all of those URLs to be the same content.
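As a rough illustration, the following sketch lists a handful of variants of one categories page and counts how many of them are distinct when compared as case-sensitive strings. The example.com URLs are hypothetical stand-ins, and treating URLs as case-sensitive strings is an assumption about how a crawler might count them.

```csharp
using System;
using System.Linq;

public static class DuplicateUrlDemo
{
    // Hypothetical variants of a single categories page (illustrative only).
    public static readonly string[] Variants =
    {
        "http://example.com/categories",
        "http://example.com/Categories",
        "http://example.com/categories/",
        "http://www.example.com/categories",
        "http://www.example.com/Categories/",
    };

    // Counts the variants that differ when compared character by character.
    public static int CountDistinct()
    {
        return Variants.Distinct(StringComparer.Ordinal).Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); // prints 5
    }
}
```

Every one of the five strings is a different address, even though a developer would think of them all as a single page.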

The problem is that all of these URLs are distinct, meaning that a site that provides access to the same content via all of those different URLs is actually guilty of creating duplicate content. And because search engines seek to remove duplicated content, this problem can dilute and decrease page ranking, to the point that it has received lots of coverage and attention in SEO circles. The fix is simply to standardize the links that are used to promote content on your site.

The catch, of course, is that on a large site it's harder to keep link formatting consistent enough to avoid duplication. A bigger problem is that you have little control over how external sites link to your content, and search engines place heavy emphasis on external links. Consequently, things such as the presence or absence of trailing slashes, capitalization, and the use or non-use of "www" end up being problems that you'll need to address when handling requests. Because IIS and ASP.NET accept that URLs can vary greatly from one user to the next, they quietly smooth over these differences as a convenience to developers. That makes it easy for MVC and other IIS and ASP.NET sites to fall prey to duplicate content, especially because IIS treats URL paths as case-insensitive.

An elementary way to address all of these concerns with MVC sites is to use a simple Action Filter that enforces your canonicalization scheme by issuing HTTP 301 (permanent) redirects whenever a non-standardized URL is requested. These redirects are effectively unnoticed by end users, but they tell bots exactly which URL is preferred.


Creating a Canonicalization Action Filter

Creating a canonicalization Action Filter for MVC applications is actually trivial. All that's needed is a standardized set of rules that you want to use for accessing content on your site, plus some simple ASP.NET logic to enforce those rules. Then apply that filter to all of your controllers (or to a base controller from which all of your other controllers inherit) to ensure that URLs are consistently accessed sitewide.
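A minimal sketch of the base-controller approach might look like the following. It assumes that the CanonicalizedAttribute class from Figure 1 is part of the project and that the project references ASP.NET MVC; SiteControllerBase and CategoriesController are hypothetical names used only for illustration.

```csharp
using System.Web.Mvc;

// Hypothetical base controller: the attribute is declared once here.
[Canonicalized]
public abstract class SiteControllerBase : Controller
{
}

// Hypothetical controller: it picks up the filter through inheritance,
// so its actions are canonicalized without being decorated themselves.
public class CategoriesController : SiteControllerBase
{
    public ActionResult Index()
    {
        return View();
    }
}
```

Because MVC filter attributes are inherited by derived classes, every controller that descends from the base class is covered without any further decoration.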

For example, Figure 1 shows the entire body of a filter that I’ve created for my site where I’ve determined that I always want URLs to reference "www" as part of the host name and where I’ve also specified that URLs should always be lowercase and terminate with a trailing slash.

If you look at the code for this filter, you'll see that the logic involved is trivial. All I've done is override the filter's OnActionExecuting() method to do some very simple checks to ensure that requests match the rules that I've chosen. For example, in my first two checks I ensure that everything is lowercase and that all requests terminate in a trailing slash. However, you could easily invert my trailing-slash logic so that your canonicalization rules ensure that requests never end in a trailing slash.
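If you'd rather go the other way, the inverted trailing-slash rule could be sketched roughly as follows. The TrailingSlashRule class and its RemoveTrailingSlash helper are hypothetical names, and the root path is deliberately left alone because it can't lose its slash.

```csharp
using System;

public static class TrailingSlashRule
{
    // Returns the path without trailing slashes; the root path stays "/".
    public static string RemoveTrailingSlash(string path)
    {
        if (string.IsNullOrEmpty(path))
        {
            return "/";
        }

        string trimmed = path.TrimEnd('/');
        return trimmed.Length == 0 ? "/" : trimmed;
    }

    public static void Main()
    {
        Console.WriteLine(RemoveTrailingSlash("/categories/")); // prints /categories
        Console.WriteLine(RemoveTrailingSlash("/"));            // prints /
    }
}
```

Inside a filter, you would compare the requested path with the helper's result and redirect only when the two differ.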


Some Additional Considerations

However, there are a few things to consider in Figure 1. One consideration is that I'm processing canonicalization rules for only GET requests, for two reasons. First, search engines (or their bots) shouldn't be executing POST requests against your site, so there's no need to standardize URL schemes used for POST operations. Second, when it comes to POST operations, the last thing you want is for a POST's collection of form data to mysteriously vanish because there was a transparent redirect going on underneath the covers. In other words, make sure that you're canonicalizing only verbs that make sense (e.g., GETs).
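The verb check can be pulled out into a small helper, sketched below. The VerbRule class and ShouldCanonicalize method are hypothetical names, and including HEAD alongside GET is my own assumption (Figure 1 checks only for GET).

```csharp
using System;

public static class VerbRule
{
    // True only for verbs that carry no request body worth protecting.
    // HEAD is included as an assumption; Figure 1 handles GET only.
    public static bool ShouldCanonicalize(string httpMethod)
    {
        return string.Equals(httpMethod, "GET", StringComparison.OrdinalIgnoreCase)
            || string.Equals(httpMethod, "HEAD", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(ShouldCanonicalize("GET"));  // prints True
        Console.WriteLine(ShouldCanonicalize("POST")); // prints False
    }
}
```

Anything that returns false here should pass through the filter untouched, so that posted form data is never dropped by a redirect.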

Note that before I start making checks against host names to determine whether host names are prefixed with "www," I make a quick check to my configuration class to make sure I’m not on a development server. Otherwise, localhost requests would end up being shunted out to the live site during development and testing operations. Of course, I could handle that in several different ways, but the key thing to note is that if you set up your own canonicalization filter, you’ll want to account for this as well. Similarly, you might also want to invert the logic or preference I’ve defined and ensure that requests are never canonicalized against a "www" prefix and instead go directly against your host name.
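Both host-name preferences, along with the development-server guard, can be sketched in one small helper. HostRule, GetPreferredHost, and the parameter names are hypothetical; the isProductionServer flag stands in for whatever your own configuration class reports.

```csharp
using System;

public static class HostRule
{
    // Returns the host name a request should be served from.
    // Outside production the host is returned unchanged, so localhost
    // requests are never sent to the live site.
    public static string GetPreferredHost(string host, bool preferWww, bool isProductionServer)
    {
        if (!isProductionServer)
        {
            return host;
        }

        string lower = host.ToLowerInvariant();
        bool hasWww = lower.StartsWith("www.", StringComparison.Ordinal);

        if (preferWww)
        {
            return hasWww ? lower : "www." + lower;
        }

        // Inverted preference: strip the "www." prefix (four characters).
        return hasWww ? lower.Substring(4) : lower;
    }

    public static void Main()
    {
        Console.WriteLine(GetPreferredHost("example.com", true, true));      // prints www.example.com
        Console.WriteLine(GetPreferredHost("www.example.com", false, true)); // prints example.com
        Console.WriteLine(GetPreferredHost("localhost", true, false));       // prints localhost
    }
}
```

A filter would redirect only when the returned host differs from the one on the incoming request.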

Likewise, although it's a bit redundant (in terms of duplicating some checks and processing), my Redirect() helper method is designed to correct as many canonicalization problems as possible in a single redirect. The goal is to avoid incurring multiple 301s for a single request, because too many 301s can also hurt ranking. Accordingly, the code in my Redirect() method comes close to violating DRY (Don't Repeat Yourself), but I personally prefer that over losing potential ranking.
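The idea of fixing everything in one pass can be sketched without any MVC dependencies. In the following example, UrlCanonicalizer, GetCanonicalUrl, and the canonicalHost parameter are hypothetical names, and the rules applied (lowercase path, trailing slash, preferred host) simply mirror the preferences described above.

```csharp
using System;

public static class UrlCanonicalizer
{
    // Applies every rule at once and returns the single URL to redirect to,
    // or null when the requested URL is already canonical.
    public static string GetCanonicalUrl(Uri requested, string canonicalHost)
    {
        string path = requested.AbsolutePath.ToLowerInvariant();
        if (!path.EndsWith("/", StringComparison.Ordinal))
        {
            path += "/";
        }

        bool alreadyCanonical =
            string.Equals(requested.Host, canonicalHost, StringComparison.OrdinalIgnoreCase) &&
            string.Equals(requested.AbsolutePath, path, StringComparison.Ordinal);

        if (alreadyCanonical)
        {
            return null;
        }

        // The query string is passed through unchanged (it includes the "?").
        return requested.Scheme + "://" + canonicalHost + path + requested.Query;
    }

    public static void Main()
    {
        Uri messy = new Uri("http://example.com/Categories?page=2");
        Console.WriteLine(GetCanonicalUrl(messy, "www.example.com"));
        // prints http://www.example.com/categories/?page=2
    }
}
```

A request that breaks all three rules still gets exactly one 301, because the casing, the trailing slash, and the host are all corrected before the redirect is issued.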



In the end, ASP.NET MVC sites are easily susceptible to some ugly problems associated with duplicate content, but Action Filters make addressing these problems all but trivial. And although I've outlined a simple example showing how one of my sites meets my preferences, those preferences (which are subject to both change and controversy) aren't the key. Instead, the key point to take away is that Action Filters can be used to define a set of rules that can be enforced consistently as a means of improving overall SEO.

Figure 1: Canonicalized Filter

using System;
using System.Globalization;
using System.Web;
using System.Web.Mvc;

public class CanonicalizedAttribute : ActionFilterAttribute
{
    // Placeholder values (assumptions): substitute your own site's domain.
    private const string SiteDomain = "example.com";
    private const string CanonicalHost = "www." + SiteDomain;

    // IAppConfiguration is the site's own configuration abstraction;
    // it is expected to expose an IsProductionServer property.
    private readonly IAppConfiguration _config;

    public CanonicalizedAttribute()
    {
        this._config = DependencyResolver.Current.GetService<IAppConfiguration>();
    }

    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        HttpContextBase context = filterContext.HttpContext;

        string path = context.Request.Url.AbsolutePath ?? "/";
        string query = context.Request.Url.Query;

        // don't 'rewrite' POST requests or you'll lose values:
        if (context.Request.RequestType == "GET")
        {
            // check for any upper-case letters:
            if (path != path.ToLower(CultureInfo.InvariantCulture))
            {
                this.Redirect(context, path, query);
                return;
            }

            // make sure request ends with a "/"
            if (!path.EndsWith("/"))
            {
                this.Redirect(context, path + "/", query);
                return;
            }

            // perform hostname checks (unless working in dev):
            if (this._config.IsProductionServer)
            {
                string hostName =
                    context.Request.Url.Host.ToLower(CultureInfo.InvariantCulture);

                // requests arriving under a different domain go to the canonical host:
                if (!hostName.Contains(SiteDomain))
                {
                    this.Redirect(context, path, query);
                    return;
                }

                // don't allow host-name only connections (i.e., force 'www'):
                if (!hostName.StartsWith("www."))
                {
                    this.Redirect(context, path, query);
                    return;
                }
            }
        }

        base.OnActionExecuting(filterContext);
    }

    // correct as many 'rules' as possible per redirect to avoid
    // issuing too many redirects per request.
    private void Redirect(HttpContextBase context, string path, string query)
    {
        // In production, point at the canonical host; in development, stay
        // on the current host so that localhost isn't sent to the live site.
        string newLocation = this._config.IsProductionServer
            ? context.Request.Url.Scheme + "://" + CanonicalHost + path
            : path;

        if (!newLocation.EndsWith("/"))
            newLocation += "/";

        newLocation = newLocation.ToLower(CultureInfo.InvariantCulture);

        context.Response.RedirectPermanent(newLocation + query, true);
    }
}