This is a not that common of a problem, though there are other posts on the topic. But my solution’s a little different from what I found, so I figured it was blog-worthy. I used this in a rails 3.0 app. I’ve also only tried it with sqlite, so it may totally blow up when I push to heroku. YMMV.

Take, for example, some initial tables:

users – id:primary_key
replies – id:primary_key
liked_replies – user_id:integer, reply_id:integer

and some initial models:

class User < ActiveRecord::Base
  has_and_belongs_to_many :liked_replies, :class_name => 'Reply', :join_table => 'liked_replies'

class Reply < ActiveRecord::Base
  has_and_belongs_to_many :likers, :class_name => 'User', :join_table => 'liked_replies'

And, of course, there’s existing data that you don’t want to destroy.

So, I generated a migration:

bundle exec rails g migration add_id_to_liked_replies id:primary_key

If you’re curious, the migration looks like this:

class AddIdToLikedReplies < ActiveRecord::Migration
  def self.up
    add_column :liked_replies, :id, :primary_key
  def self.down
    remove_column :liked_replies, :id

Migrate as usual, and then the app stops running, because routes can’t be built, because User is a devise model, and devise_for tries to load User which has an association that generates one of these:

Primary key is not allowed in a has_and_belongs_to_many join table (liked_replies). (ActiveRecord::HasAndBelongsToManyAssociationWithPrimaryKeyError)

Hm. This will probably happen when we deploy, too. And this makes a db:rollback fail, too. !!!! So, how can we make the code work with or without a primary key? I’ve seen lots of people talk about making your code work with the before & after version of your db, and that seems like the right goal here, too.

Here are my new model classes that work with either the new or old liked_replies table:

class User < ActiveRecord::Base
    has_and_belongs_to_many :liked_replies, :class_name => 'Reply', :join_table => 'liked_replies'
    has_many :liked_reply_records, :class_name => 'LikedReply', :dependent => :destroy
    has_many :liked_replies, :through => :liked_reply_records, :source => :reply

class Reply < ActiveRecord::Base
    has_and_belongs_to_many :likers, :class_name => 'User', :join_table => 'liked_replies'
    has_many :liked_replies
    has_many :likers, :through => :liked_replies, :source => :user

class LikedReply < ActiveRecord::Base
  belongs_to :user
  belongs_to :reply

So, say you’re in a fantasy world, and the rif-raf is beating down your doors…

Or, maybe, you’re trying to write a sane json rendering of your model, and the model has a polymorphic belongs_to.

As a bit of review, let’s start with the easy case. You have some models with a simple association.

class Child < ActiveRecord::Base
  belongs_to :parent

class Parent < ActiveRecord::Base
  has_many :children

The rabl for a child is pretty easy. In app/views/children/show.rabl you might have:

object @child
attributes :id, :name
child(:parent) { partial "children/parent" }

As it turns out, you’re actually writing an app that’s used to track orphans left behind after UFO visits, so you need to support children of aliens, too. So your models look more like this:

class Child < ActiveRecord::Base
  belongs_to :parent, :polymorphic => true

class HumanParent < ActiveRecord::Base
  has_many :children, :as => :parent

class AlienParent < ActiveRecord::Base
  has_many :children, :as => :parent

And of course you need to output different information for alien parents (app/views/children/alien_parent.rabl) than for human parents (app/views/children/human_parent.rabl). So your rabl in app/views/children/show.rabl will look more like this:

object @child
attributes :id, :name
code(:parent) { |child|
  partial "children/#{}",
   :object => child.parent

For a while, I’ve been keeping bundled gems local to each project. (It’s pretty easy to do: bundle --path .bundle/gems the first time you bundle a project. I like putting it in .bundle because it’s already gitignored.) This results in a lot of duplicate gems across my system, but it has the advantage of keeping each project very self-contained, and I avoid undesirable side-effects (like rake getting upgraded) just because I bundle some code from github.

This leads to a pretty sparse set of system gems. But it also makes it a little challenging to bootstrap a new rails project, since there isn’t a current rails executable! How to avoid installing the latest rails to start a new project?

It’s pretty easy, when you think about it, but it is a little less straightforward.

mkdir -p ~/my/new/awesome_project
cd ~/my/new/awesome_project
echo source :rubygems            > Gemfile
echo gem 'rails', '~>3.1.0.rc4' >> Gemfile
bundle --path .bundle
bundle exec rails .

It’ll ask if you want to overwrite Gemfile, which is, of course, fine, because it’s just going to add more gems.

In Chad Fowler’s list of 20 Rails Development No-No’s, he included “seed data in migrations” as a bad thing.

I can definitely say that I do this. Almost every project has some. On my last big rails project, there was seed data in the migrations.

We put seed data into migrations for several reasons. It was convenient. Seed data often went with other DB changes, so we just put it all in one place. Also, we didn’t have a lot of time to figure out where to put the seed data, and the “best practice” that I was aware of (“just use schema.rb”) was quite inadequate: it didn’t push in the seed data that the application needed in order to function; and it would have destroyed all of the application’s data on upgrades.

A commenter added:

we pull this out into a separate rake task (load seed data). This allows the seed data to adapt as the project adapts. If you have a lookup table with several sentinel values, then when you want to update that list, you need to be aware that you added them in a previous migration.

Another commenter says:

About using seed data in migrations, I think it’s Ok as long as you useSQL to load data into the database and don’t rely on models cause’ they will eventually change or be removed from the project.

On my last big rails project, we used the models to load seed data sometimes, and we used fixtures other times. The problem with fixtures is that they’re not designed for seed data, so it’s a bit of a pain to configure them correctly, and if you reload them in a later migration, you have to tell the fixtures that it’s OK to load data into the same table again.  When we used the model classes, we usually had to reset the column information before we used them, or create temp classes that were just used in the migrations. One of the approaches pointed out by these commenters would have been helpful.

On a recent project, we built an OLAP piece to satisfy some of the reporting requirements. The app was a Rails app, but since we used MS SQL Server, we just built the OLAP piece using standard MS tooling (SSIS, SSAS, etc.).

During implementation of the OLAP thing, I recognized that 32 bits was way overkill for some of the primary keys in the app database. For instance, there’s one table that had 6 rows when we deployed, and there are probably no more than 50, let alone 5000000, possible entries in that table, ever. So I took the liberty of “optimizing” the post-ETL database to use smaller, 16-bit primary keys.

Things ran just fine for almost six months.

Right before Christmas, I got an email from the DBA, saying that the processing job just started reporting an error:

… Description: Data conversion failed while converting column “id” (199) to column “small_id” (142).  The conversion returned status value 6 and status text “Conversion failed because the data value overflowed the specified type.”. …

It was obvious where that was coming from. At this point, I started asking myself, “Why in the world did I decide to truncate those integers?” I was a bit embarassed — such a n00b error.

But I was also curious why this problem happened so quickly. Like I said, the production database that we started with had max(id) == 6, and this table hardly ever got new rows. How did it get over 32767 so quickly? Did we have a weird migration? Or did the customer churn through that many tries at setting up a new value?

I found the answer while I was trying to set up a database to reproduce the error. I thought it was going to be a pain to get a value high enough into the PK, since PKs are usually a little tricky to insert arbitrarily. I decided the easy way to do it would be to bump the sequence up to 100,000 or so, and then use the app to generate some bad data. So the first step was to check what the next sequence value was.

dbcc checkident('things', noreseed)

Checking identity information: current identity value '1460103721', ...


select max(id) from things


So how did a “6” get translated to “1460103721”?

Well, remember that this was a Rails app. To initially populate this table, we used a fixture. We loaded the fixture in a migration. There were several iterations of the initial data, so several migrations loaded several fixtures on top of each other. Early in the project, we used Rails’s magic autonumbering fixtures for the initial data. Rails basically creates a hash of the fixture name to generate the id. Near the 1.0 release, I got paranoid about Rails generating an ID close enough to MAX_INT that the customer wouldn’t be able to add a reasonable number of new entries to the table, so we switched to hand-numbered fixtures. Evidently, the last fixture load (the one with max(id) == 6) should have also reset the sequence so the next value would be 7, not 1460103722.

Looking through the tables now, it looks like most of them are on the order of 1,000,000,000 away from max. The closest one is the admin users table, which is at 2,126,185,674. So the app will only support 21,297,973 more admin users. I think I’m comfortable with that.

Anyway, back to my main point: Don’t truncate data! Even if you think you’re being all slick and saving all kinds of space (I was saving 500 KB!) don’t do it!

On the server that I set up a couple days ago, there’s a rails app running with passenger on ubuntu 8.10. It’s using SQL Server as its database.

Today, I got a report from a user that it was serving up 500 errors. Not good.

Looking at log/production.log I found that the DB connection had gone stale. (The actual error was “08S01 (0) [unixODBC][FreeTDS][SQL Server]Communication link failure”.) Evidently, patch Tuesday happened this week (as evidenced by the OpenNMS notifications in my inbox and the windows update dialog on my PC about needing to RESTART NOW), so the SQL cluster had been rebooted.

I restarted the app, which resolved the problem, but wasn’t looking forward to future restarts due to SQL cluster activity. I wanted to make it so the app would not fail ever due to a stale DB connection. Could I tell passenger to restart if it got a 500 error? Could I tell rails to retry if the connection is stale?

Thinking back to my recent JRuby on Rails work, I remembered a suggestion from Nick Sieger for releasing ActiveRecord connections. It’s more necessary when using a connection pool (i.e. when running on a J2EE server, and using a JDBC connection pool), but it should also keep my connections from going stale because, well, the connection won’t be around to go stale.

For a recent project, we built a rails application that accepted logins for two different classes of user. It was logically two applications, but since they both worked on the same domain model, we made them the same physical application, for simplicity.

One class of user was the customer of our client. These users were associated with their organization, so we used an “are you logged in?” type of authorization scheme, where any user in the organization could view and modify that organization’s information, but could view and modify no other organization’s information.

The second class of user was broadly termed “admin” users, and was basically people within our client’s company. These users had varying permissions, based on their roles in the company. Because they had varying authorities, we needed a better authorization system than “are you logged in?”

The system we came up with was fairly novel, though not as good as the easy role-based authorization that I read about yesterday. I thought I’d write it up anyways. The system was basically the same as any role-based permissioning system, but the things we decided to apply permissions to were actions.

So there were users, roles, and permissions. Users stored the usual stuff. User has_and_belongs_to_many roles. Roles have a name. Role has_and_belongs_to_many permissions. Permission has a controller and an action.

Thanks to the conventions of rails, we were able to define permissions such as “controller = *, action = show” and add that to the “read-only” role. We deviated from rails’s conventions in a couple of places, but it turned out to be few enough that we were able to set up a few simple permissions, and group them into the roles that the customer needed.

The big advantage of this system was that we needed to write the authorization code once: verify that the current user is allowed to execute the current action.

A disadvantage is that pages didn’t necessarily know if they were linking to a permission denied error. We probably could have resolved this, but we felt that it would work out OK if all the functionality was advertised to everyone that had read access, and if someone needed more functionality than they had, they’d click a link, get a friendly “permission denied, but ask your local admin if you need more access than you have” page.

I skipped all the code in this post, but hopefully you get the gist.