Sitecore: 404 without 302.

This post shows how to handle a request for a page that is not found, and how to avoid the request being redirected with a 302 status code before the 404 is returned.

How Sitecore handles ItemNotFound

Sitecore lets you specify a page to use for ‘ItemNotFound’ errors in web.config, and it’s always good practice to have a pretty ‘page not found’ page:

<setting name="ItemNotFoundUrl" value="/404" />

When Sitecore can’t find the requested item, it will redirect the user to the page that is configured in the setting ItemNotFoundUrl.

However, the redirect itself is sent with an HTTP status code 302 (a temporary redirect). This is unfortunate, because search engines then don’t understand that the page actually hasn’t been found.

For search engine optimization, you should return the HTTP 404 status code for any invalid URL by setting the Status or StatusCode property of the current System.Web.HttpResponse.

What is wrong with a 302 status code?

Basically it all comes down to SEO. If a search bot follows a link and receives a 302, it assumes that the URL is still valid and has only been moved temporarily to another address.

This means that requested URLs which respond with a 302 are kept in the search indexes. That is the expected behaviour for a 302, even though the real intent was to get the URL removed from the index because it no longer exists.

To get the URL removed from the search indexes, it has to respond with a 404 right away when requested, and not with a 302 followed by a 404.

Unfortunately, a 302 followed by a 404 is seen on quite a lot of Sitecore installations due to improper configuration or bad coding. It is actually the out-of-the-box behaviour in Sitecore if you do not change it to use server transfers instead of redirects.
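
For reference, the switch to server transfers mentioned above is, as far as I know, controlled by the RequestErrors.UseServerSideRedirect setting in web.config. This is only a minimal sketch, assuming your Sitecore version ships this setting, so check your own web.config before relying on it:

```xml
<!-- Assumption: this setting exists in your Sitecore version's web.config (the default value is "false") -->
<setting name="RequestErrors.UseServerSideRedirect" value="true" />
```

Note that a server transfer only removes the 302. The not-found page may then be served with a 200 unless it sets the 404 status itself, so verify the final status code after changing the setting.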

How to solve it?

To solve this problem you can implement the custom pipeline processor below.
It overrides the method that is called when Sitecore redirects the client to the ItemNotFound page.
Instead of redirecting, it makes an internal HTTP request to retrieve the content of the ItemNotFound page and then writes that content to the client together with a 404 Not Found status code.

Keep in mind that:

  • The hostname of your website must be resolvable from the server itself (if the server cannot resolve it through DNS, you can add it to the hosts file)
  • Cookies are not sent with the internal request for the ItemNotFound page, so if you have personalised information on that page, it will not work

To implement my solution, add this to your include config file, or add the processor to your web.config as a replacement of the default ExecuteRequest processor:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <httpRequestBegin>
        <processor type="AndreyVinda.Library.Pipelines.ExecuteRequest, AndreyVinda.Library" patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.ExecuteRequest, Sitecore.Kernel']"/>
        <processor type="Sitecore.Pipelines.HttpRequest.ExecuteRequest, Sitecore.Kernel">
          <patch:delete />
        </processor>
      </httpRequestBegin>
    </pipelines>
  </sitecore>
</configuration>

Add this class to your solution:

using System; // Needed for Exception, UriComponents and UriFormat

namespace AndreyVinda.Library.Pipelines
{
    public class ExecuteRequest : Sitecore.Pipelines.HttpRequest.ExecuteRequest
    {
        protected override void RedirectOnItemNotFound(string url)
        {
            var context = System.Web.HttpContext.Current;

            try
            {
                // Request the NotFound page
                string domain = context.Request.Url.GetComponents(UriComponents.Scheme | UriComponents.Host, UriFormat.Unescaped);
                string content = Sitecore.Web.WebUtil.ExecuteWebPage(string.Concat(domain, url));

                // Send the NotFound page content to the client with a 404 status code
                context.Response.TrySkipIisCustomErrors = true;
                context.Response.StatusCode = 404;
                context.Response.Write(content);
            }
            catch (Exception)
            {
                // If our plan fails for any reason, fall back to the base method
                base.RedirectOnItemNotFound(url);
            }

            // Must be outside the try/catch, because Response.End() throws a ThreadAbortException
            context.Response.End();
        }
    }
}

The result in Firebug after requesting a non-existing item is a single response with a 404 status code.

Enjoy!
