On a recent project, we built an OLAP piece to satisfy some of the reporting requirements. The app was a Rails app, but since we used MS SQL Server, we just built the OLAP piece using standard MS tooling (SSIS, SSAS, etc.).
During implementation of the OLAP thing, I recognized that 32 bits was way overkill for some of the primary keys in the app database. For instance, there’s one table that had 6 rows when we deployed, and there are probably no more than 50, let alone 5000000, possible entries in that table, ever. So I took the liberty of “optimizing” the post-ETL database to use smaller, 16-bit primary keys.
Things ran just fine for almost six months.
Right before Christmas, I got an email from the DBA, saying that the processing job just started reporting an error:
… Description: Data conversion failed while converting column “id” (199) to column “small_id” (142). The conversion returned status value 6 and status text “Conversion failed because the data value overflowed the specified type.”. …
It was obvious where that was coming from. At this point, I started asking myself, “Why in the world did I decide to truncate those integers?” I was a bit embarassed — such a n00b error.
But I was also curious why this problem happened so quickly. Like I said, the production database that we started with had max(id) == 6, and this table hardly ever got new rows. How did it get over 32767 so quickly? Did we have a weird migration? Or did the customer churn through that many tries at setting up a new value?
I found the answer while I was trying to set up a database to reproduce the error. I thought it was going to be a pain to get a value high enough into the PK, since PKs are usually a little tricky to insert arbitrarily. I decided the easy way to do it would be to bump the sequence up to 100,000 or so, and then use the app to generate some bad data. So the first step was to check what the next sequence value was.
dbcc checkident('things', noreseed)
Checking identity information: current identity value '1460103721', ...
select max(id) from things
So how did a “6” get translated to “1460103721”?
Well, remember that this was a Rails app. To initially populate this table, we used a fixture. We loaded the fixture in a migration. There were several iterations of the initial data, so several migrations loaded several fixtures on top of each other. Early in the project, we used Rails’s magic autonumbering fixtures for the initial data. Rails basically creates a hash of the fixture name to generate the id. Near the 1.0 release, I got paranoid about Rails generating an ID close enough to MAX_INT that the customer wouldn’t be able to add a reasonable number of new entries to the table, so we switched to hand-numbered fixtures. Evidently, the last fixture load (the one with max(id) == 6) should have also reset the sequence so the next value would be 7, not 1460103722.
Looking through the tables now, it looks like most of them are on the order of 1,000,000,000 away from max. The closest one is the admin users table, which is at 2,126,185,674. So the app will only support 21,297,973 more admin users. I think I’m comfortable with that.
Anyway, back to my main point: Don’t truncate data! Even if you think you’re being all slick and saving all kinds of space (I was saving 500 KB!) don’t do it!