October 2009


A few years ago, I read about the natural distribution of leading digits in sets of naturally occurring numbers (Benford’s law). The rule is commonly used to tell data sets with fabricated numbers apart from those with real ones.

Today, I ended up with two sets of sixteen numbers, and was curious how the leading digits were spread out.

The first data set had values between 561 and 8224. The second had values between 39 and 576. The second set was a function of the first. The leading digit frequencies were as follows:

First Digit   Set 1   Set 2   Total   Share of Total   Benford’s Law
1             3       8       11      34.4%            30.1%
2             5       4       9       28.1%            17.6%
3             4       1       5       15.6%            12.5%
4             0       0       0       0%               9.7%
5             1       2       3       9.4%             7.9%
6             0       1       1       3.1%             6.7%
7             1       0       1       3.1%             5.8%
8             1       0       1       3.1%             5.1%
9             1       0       1       3.1%             4.6%

I was impressed with how front-loaded that table is, and how closely it tracks Benford’s law. There doesn’t seem to be any reason for “1” or “2” to be more common than, say, “4” as a leading digit in either set, yet “1” and “2” (two of the nine possible leading digits, or 22%) accounted for 62.5% of the leading digits overall (34.4% and 28.1% respectively): exactly half of Set 1 and three quarters of Set 2.
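For reference, Benford’s law puts the expected share of leading digit d at log10(1 + 1/d). Here is a minimal Python sketch that recomputes the two percentage columns of the table. The counts are copied from the table, and the names `observed`, `benford`, and `leading_digit` are only illustrative.

```python
import math

# Leading-digit counts copied from the table (Set 1 + Set 2, 32 values in all).
observed = {1: 11, 2: 9, 3: 5, 4: 0, 5: 3, 6: 1, 7: 1, 8: 1, 9: 1}


def benford(d):
    """Expected share of values whose leading digit is d, per Benford's law."""
    return math.log10(1 + 1 / d)


def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(n)[0])


total = sum(observed.values())
for d in range(1, 10):
    # Columns: digit, observed share of the 32 values, Benford's expected share.
    print(f"{d}  {observed[d] / total:6.1%}  {benford(d):6.1%}")
```

With only 32 values, this is a curiosity rather than a statistical test; `leading_digit` is there in case you want to tally your own numbers instead of typing in counts.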

I spoke last night at IndyALT.NET about source control systems. I had a pretty good time, and I’d like to thank the group for inviting me. For any who are interested, I put my slides on slideshare. The git repo graphs were generated by git-graph-objects.

I’m trying to get started with Ragel, and found a hello world example for Ruby that got me past Go. Since the C# version turned out a bit different, I thought I’d share what I came up with.


%%{
  machine hello;
  expr = 'h';
  # Print a greeting on the transition that completes a match of 'h'.
  main := expr @ { Console.Out.WriteLine("greetings!"); } ;
}%%

using System;

namespace Mab.Test
{
  public class Hello
  {
    // Ragel emits the state machine's data tables here.
%% write data;
    public static void Main(string[] args)
    {
      foreach (var arg in args)
      {
        Console.WriteLine("***** " + arg + " ******");
        Run(arg);
      }
    }

    private static void Run(string data)
    {
      // The generated code expects these names:
      // cs (current state), p (read position), pe (end of input).
      int cs;
      int p = 0;
      int pe = data.Length;
      // init:
      %% write init;
      // exec:
      %% write exec;
    }
  }
}

To see how it works, save the code above as Hello.rl and run these commands:

ragel -A Hello.rl
csc /t:exe /out:test.exe Hello.cs
test.exe a h z