Automatic Sitemap for Heroku with Ruby on Rails 3.2

A while back I wrote a blog post on generating a sitemap for a Ruby on Rails app. That generator was run via a rake task. Now that loads of our apps are deployed to Heroku, and some have content that changes all of the time, it doesn’t make sense to rely on a rake task that has to be run periodically; we need a sitemap generated on the fly. So here is how we did it:

Step 1:
In your config/routes.rb file add:

match 'sitemap', :to => "sitemap#index", :as => :sitemap

Step 2:
Create a controller called “app/controllers/sitemap_controller.rb”
Paste the following into it:

class SitemapController < ApplicationController
  def index
    # Static pages have no timestamp, so the view omits <lastmod> for them.
    static_urls = [ {:url => '/home/about',      :updated_at => ""},
                    {:url => '/home/help',       :updated_at => ""},
                    {:url => '/home/contact_us', :updated_at => ""},
                    {:url => '/home/terms',      :updated_at => ""} ]
    @pages_to_visit  = static_urls
    # Dynamic pages: one entry per record, stamped with its last update time.
    @pages_to_visit += Article.all.collect{  |a| {:url => article_path(a) ,  :updated_at => I18n.l(a.updated_at, :format => :w3c)} }
    @pages_to_visit += Category.all.collect{ |c| {:url => category_path(c) , :updated_at => I18n.l(c.updated_at, :format => :w3c)} }
    # Only XML is served here.
    respond_to do |format|
      format.xml
    end
  end
end

Step 3:
Create a view for the XML. I have used Haml for my view; you could use Builder as an alternative. Call this “app/views/sitemap/index.xml.haml”:

- base_url = "http://#{request.host_with_port}"
!!! XML
%urlset{:xmlns => "http://www.sitemaps.org/schemas/sitemap/0.9"}
  - @pages_to_visit.each do |page|
    %url
      %loc #{base_url}#{page[:url]}
      - if page[:updated_at].present?
        %lastmod= page[:updated_at]
      %changefreq= page[:changefreq].present? ? page[:changefreq] : "monthly"
      %priority= page[:priority].present? ? page[:priority] : "0.5"
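
To make the shape of the output easier to see, here is a rough plain-Ruby sketch of what the view produces. It is not part of the app: render_sitemap is a hypothetical helper name, example.com is a placeholder host, and the XML declaration is only approximately what Haml emits. It assumes the same page hashes that the controller builds.

```ruby
require 'cgi'

# Hypothetical helper (not part of Rails or the original app): builds roughly
# the same <urlset> document as the Haml view from an array of page hashes.
def render_sitemap(pages, base_url)
  lines = ['<?xml version="1.0" encoding="utf-8"?>',
           '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
  pages.each do |page|
    lines << '  <url>'
    # Escape the URL so characters such as & stay valid inside XML.
    lines << "    <loc>#{CGI.escapeHTML(base_url + page[:url])}</loc>"
    # lastmod is optional, so skip it when no timestamp was supplied.
    updated_at = page[:updated_at].to_s
    lines << "    <lastmod>#{updated_at}</lastmod>" unless updated_at.empty?
    # Fallbacks used when the key is missing: "monthly" and "0.5".
    lines << "    <changefreq>#{page[:changefreq] || 'monthly'}</changefreq>"
    lines << "    <priority>#{page[:priority] || '0.5'}</priority>"
    lines << '  </url>'
  end
  lines << '</urlset>'
  lines.join("\n")
end

# Made-up sample data for illustration only.
pages = [
  { :url => '/home/about', :updated_at => '' },
  { :url => '/articles/1', :updated_at => '2012-08-11T09:30:00+00:00' }
]
puts render_sitemap(pages, 'http://example.com')
```

The first entry comes out without a lastmod element, the second with one, and both fall back to the default changefreq and priority.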

Step 4:
Sitemaps require dates to be in the W3C Datetime format. This involves creating a specific time format. In “config/locales/en.yml” add the following format:

en:
  time:
    formats:
      w3c: "%Y-%m-%dT%H:%M:%S+00:00"

Note: The timezone has been hard-coded here. In Ruby 1.9.3 you can specify “%:z” to get the timezone offset at the end of this, but Heroku is running 1.9.1, so this cannot be done. Heroku runs on UTC anyway, so this does not matter as long as your app’s own time zone is also left at UTC!

If you don’t have anything custom inside this file, that should be the entire contents of the file.
This means you can now use “I18n.l” with your custom timestamp format, for example:

I18n.l(a.updated_at, :format => :w3c)

which we used above.
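
As a rough illustration of why the hard-coded offset is only safe for UTC times, here is a small plain-Ruby sketch. It compares the format string above with Time#xmlschema from Ruby's standard library, which works out the offset from the time itself. The timestamp is a made-up value, and swapping the I18n.l call for a.updated_at.xmlschema in the controller is a suggestion rather than what we actually deployed.

```ruby
require 'time'  # provides Time#xmlschema (alias: iso8601) on older Rubies

# Example timestamp with a non-UTC offset (a made-up value for illustration).
local = Time.new(2012, 8, 11, 11, 30, 0, '+02:00')

# The hard-coded "+00:00" suffix is only correct for UTC times,
# so convert to UTC before formatting.
hard_coded = local.getutc.strftime('%Y-%m-%dT%H:%M:%S+00:00')

# xmlschema derives the offset from the Time object itself.
with_offset = local.xmlschema
as_utc      = local.getutc.xmlschema

puts hard_coded   # 2012-08-11T09:30:00+00:00
puts with_offset  # 2012-08-11T11:30:00+02:00
puts as_utc       # 2012-08-11T09:30:00Z
```

All three strings describe the same instant; "Z" and "+00:00" are both ways of writing a UTC offset in W3C Datetime.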

And that’s your sitemap on the fly! You could also generate an HTML sitemap alongside this, or any other format you wish.
For the HTML one, just add a format.html to the respond_to block of the controller, then create “app/views/sitemap/index.html.haml” and put your Haml view for the HTML sitemap there. If you are not using Haml, use “.erb” in place of “.haml”.

2 Comments

  1. Benjamin — August 11, 2012

    you can clear up your controller and use the w3c helper rails already provides.

    http://apidock.com/rails/DateTime/xmlschema or iso8601

    i18n is more supposed to be different in languages but there wont be any difference in languages.

    hope you like the already included version.

  2. admin — August 28, 2012

    nice thanks!
