
Using the exact reference distribution for small sample Benford tests

I recently came across another potential application related to my work on testing small samples for randomness in day-of-week patterns. Testing digit frequencies for Benford's law works on basically the same principles as my day-of-week bins, so here I will show an example in R.

First, download my functions here and save them to your local machine. The only library dependency for this code is the partitions library, so install that. Then source my functions and load the partitions library. The second part of the code generates the null distribution for Benford's first digits; the idea is that lower digits have a higher probability of occurring.

#Example using small sample tests for Benford's law
library(partitions)
#switch to wherever you downloaded the functions to your local machine
source("C:\\Users\\axw161530\\Dropbox\\PublicCode_Git\\ExactDist\\Exact_Dist.R")

f <- 1:9
p_fd <- log10(1 + (1/f)) #first digit probabilities

And so if you do cbind(f,p_fd) at the console it prints out:

      f       p_fd
 [1,] 1 0.30103000
 [2,] 2 0.17609126
 [3,] 3 0.12493874
 [4,] 4 0.09691001
 [5,] 5 0.07918125
 [6,] 6 0.06694679
 [7,] 7 0.05799195
 [8,] 8 0.05115252
 [9,] 9 0.04575749
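
Since log10(1 + 1/d) equals log10(d+1) - log10(d), the nine probabilities telescope to log10(10) - log10(1), so they should sum to exactly 1. A one line check:

#first digit probabilities sum to 1
sum(p_fd)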

So we expect over 30% of the first digits to be 1's, just under 18% to be 2's, etc. To show how we can use this for small samples, I take an example of fraudulent checks from Mark Nigrini's book, Digital Analysis Using Benford's Law (I can't find a Google Books link for this older edition; the numbers come from the 2000 version published by Global Audit Press).

#check data from Nigrini page 84
checks <- c(1927.48,
           27902.31,
           86241.90,
           72117.46,
           81321.75,
           97473.96,
           93249.11,
           89658.17,
           87776.89,
           92105.83,
           79949.16,
           87602.93,
           96879.27,
           91806.47,
           84991.67,
           90831.83,
           93766.67,
           88338.72,
           94639.49,
           83709.28,
           96412.21,
           88432.86,
           71552.16)

#extracting the first digits
fd <- substr(format(checks,trim=TRUE),1,1)
tot <- table(factor(fd, levels=paste(f)))

Now Nigrini says this is too small a sample to perform actual statistical tests, so he just takes the pattern at face value. If you print out the tot object you will see that the series is dominated by high digits: three 7's, nine 8's, and nine 9's, with only a single 1 and a single 2.

> tot

1 2 3 4 5 6 7 8 9 
1 1 0 0 0 0 3 9 9 

Now, my work on small samples for day-of-week crime sprees showed that, given reasonably expected offender behavior, you only need as few as 8 crimes to have pretty good power to test for randomness in the series. Given that, I would expect that detecting significant deviations from the null Benford's distribution is not out of the question with this series of 23 checks. But first we need to figure out whether we can actually generate the exact distribution for 23 observations across 9 bins in memory.

m <- length(tot)
n <- sum(tot)
choose(m+n-1,m-1)

Which prints out just under 8 million, [1] 7888725. So we should be able to hold that in memory.
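
To get a rough sense of the memory burden (a back-of-the-envelope estimate of mine, assuming the compositions end up stored as a 9 row integer matrix of 4 byte integers):

#rough memory estimate in megabytes, assuming a 9 x 7888725 integer matrix
choose(m+n-1,m-1)*m*4/1024^2

That comes out to roughly 270 megabytes under that assumption, which a typical machine can hold comfortably.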

So now comes the actual test, and as my comment says this takes a few minutes for R to figure out — so feel free to go get a coffee.

#Takes a few minutes
resG <- SmallSampTest(d=tot,p=p_fd,type="G")
resG

Here I use the likelihood ratio G test instead of the more usual Chi-square test, as I found the G test always had more power in the day-of-the-week paper. Printing resG we subsequently get:

> resG
Small Sample Test Object 
Test Type is G 
Statistic is 73.4505062784174 
p-value is:  2.319579e-14  
Data are:  1 1 0 0 0 0 3 9 9 
Null probabilities are:  0.3 0.18 0.12 0.097 0.079 0.067 0.058 0.051 0.046 
Total permutations are:  7888725 
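
As a sanity check on that statistic (a snippet of mine, assuming SmallSampTest uses the usual likelihood ratio form with zero counts contributing nothing), you can reproduce the reported 73.45 directly:

#reproduce the G statistic by hand: 2*sum(observed*log(observed/expected))
obs <- as.numeric(tot)
expt <- sum(obs)*p_fd                 #expected counts under Benford's law
nz <- obs > 0                         #zero counts drop out of the sum
2*sum(obs[nz]*log(obs[nz]/expt[nz]))  #approximately 73.45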

Since the p-value is incredibly small, we reject the null that the first digit distribution of these checks follows Benford's law. So on its face we can reject the null with this dataset, but it would take more investigation to recommend how many observations you would need in practice before you can reasonably use the small sample distribution. The functions include code to estimate the power of the test against a given alternative distribution, so as a quick and quite conservative check I look at the power with our 23 observations if the alternative distribution were uniform instead of Benford's. The idea is: if people make up numbers uniformly instead of according to Benford's law, what is the probability we would catch them with 23 observations? I label this as conservative because I doubt people even do a good job of making their numbers uniform; most number fudging cases will be much more egregious, I imagine.

#power under null of equal probability
p_alt <- rep(1/9,9)
PowUni <- PowAlt(SST=resG,p_alt=p_alt) #again takes a few minutes
PowUni

So based on that we get a power estimate of:

> PowUni
Power for Small Sample Test 
Test statistic is: G  
Power is: 0.5276224  
Null is: 0.3 0.18 0.12 0.097 0.079 0.067 0.058 0.051 0.046  
Alt is: 0.11 0.11 0.11 0.11 0.11 0.11 0.11 0.11 0.11  
Alpha is: 0.05  
Number of Bins: 9  
Number of Observations: 23 

So the power is only around 0.53 in this example. Since you want to aim for power of ~0.80 or higher, this shows that you are not likely to uncover more subtle patterns of manipulation with this few observations. The power would go up for more realistic deviations though, so I don't think this idea is totally dead in the water.
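
As an illustration, you can plug in any alternative you like; here is one I made up that piles most of the probability on the 7 through 9 digits, loosely mimicking the pattern in the fraudulent checks (the specific probabilities are hypothetical, not taken from Nigrini):

#power under a made-up alternative skewed toward high digits
p_high <- c(0.02,0.02,0.02,0.02,0.02,0.05,0.15,0.35,0.35)
sum(p_high) #sanity check that the alternative sums to 1
PowHigh <- PowAlt(SST=resG,p_alt=p_high) #again takes a few minutes
PowHigh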

If you are like me and find it annoying to wait a few minutes for the results, a quick and dirty way to make the test go faster is to collapse bins that have zero observations. Our first digits have zero observations in the 3, 4, 5, and 6 bins, so I collapse those and do the test with only 6 bins, which makes the results return in around 10 seconds instead of a few minutes. Note that collapsing bins is better suited to the G or Chi-square test, because for the KS test or Kuiper's test the order of the bins matters.

#Smaller subset
UpProb <- c(p_fd[c(1,2,7,8,9)],sum(p_fd[c(3,4,5,6)]))
ZeroAdd <- c(table(fd),0)

resSmall <- SmallSampTest(d=ZeroAdd,p=UpProb,type="G")
resSmall

When you do this for the G or the Chi-square test you will get the same test statistic, but the p-value changes because the reference distribution changes (here it comes out a bit smaller, and is still incredibly small).

> resSmall
Small Sample Test Object 
Test Type is G 
Statistic is 73.4505062784174 
p-value is:  1.527991e-15  
Data are:  1 1 3 9 9 0 
Null probabilities are:  0.3 0.18 0.058 0.051 0.046 0.37 
Total permutations are:  98280  

I can't say for sure how these tests behave when you collapse categories, but offhand I think it is reasonable, especially if you have some substantive reason to collapse them before looking at the data.

In practice, the way I expect this would work is not just for testing one individual, but as a way to prioritize audits of many individuals. Say you had a large company and wanted to check the invoices of thousands of managers, but each manager may only have 20-some invoices. You would compute this test for each manager, and then rank them by their p-values (or apply some correction like the false discovery rate) for further scrutiny. That takes a bit more work to code up efficiently than what I have here though; for example, I may pre-compute the exact CDFs for each test statistic, aggregate them a little so they fit in memory, and then check each test against the relevant CDF.
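
Here is a minimal sketch of that batch workflow, assuming you have a named list manager_checks with one vector of dollar amounts per manager (the list, the helper function, and the pvalue element I pull out of the test object are all illustrative guesses, so check the names against the actual function code):

#sketch: prioritize managers for audit via first digit Benford tests
get_p <- function(amounts){
  fd <- substr(format(amounts,trim=TRUE),1,1)
  d <- table(factor(fd, levels=paste(1:9)))
  res <- SmallSampTest(d=d,p=p_fd,type="G")
  res$pvalue  #assumes the test object stores the p-value under this name
}
pvals <- sapply(manager_checks, get_p)
adj <- p.adjust(pvals, method="fdr") #false discovery rate correction
sort(adj)                            #smallest adjusted p-values get scrutinized first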

But feel free to bug me if you want to use this idea in your own work and want some help implementing it.


For some additional examples, here is some code to get the second digit expected probabilities:

#second digit probabilities
s <- 0:9
x <- expand.grid(f*10,s)
x$end <- log10(1 + (1/(x[,1]+x[,2])))
p_sd <- aggregate(end ~ Var2, data=x, sum)
p_sd

which are expected to be much more uniform than the first digits:

> p_sd
   Var2        end
1     0 0.11967927
2     1 0.11389010
3     2 0.10882150
4     3 0.10432956
5     4 0.10030820
6     5 0.09667724
7     6 0.09337474
8     7 0.09035199
9     8 0.08757005
10    9 0.08499735

And we can also test the check sample for deviations from Benford's law in the second digits. Here I show an example using the exact distribution for the Kolmogorov-Smirnov test. (There may be reasonable arguments for using Kuiper's test with digits as well, but for both the KS statistic and Kuiper's V you need to make sure the bins are in the correct order to conduct the tests.) To speed up the computation I only test the first 18 checks.

#second digits test for sample checks, but with smaller subset
sd <- substr(format(checks[1:18],trim=TRUE),2,2)
tot_sd <- table(factor(sd, levels=paste(s)))
resK_2 <- SmallSampTest(d=tot_sd,p=p_sd$end,type="KS") #pass the numeric probability column, not the whole data frame
resK_2

And the test results are:

> resK_2
Small Sample Test Object 
Test Type is KS 
Statistic is 0.222222222222222 
p-value is:  0.7603276  
Data are:  1 2 2 2 1 0 2 4 1 3 
Null probabilities are:  0.12 0.11 0.11 0.1 0.1 0.097 0.093 0.09 0.088 0.085 
Total permutations are:  4686825 

So for the second digits of our checks we fail to reject the null that the data follow Benford's distribution. Testing the full 23 checks would generate over 28 million permutations; I will update this post with how long that takes.
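
That permutation count comes from the same formula used earlier, now with 10 second digit bins:

#number of compositions of 23 observations across 10 bins
choose(23+10-1,10-1) #28048800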

The final example I will give is with another small example dataset — the last 12 purchases on my credit card.

#My last 12 purchases on credit card
purch <- c( 72.00,
           328.36,
            11.57,
            90.80,
            21.47,
             7.31,
             9.99,
             2.78,
            10.17,
             2.96,
            27.92,
            14.49)
#some values are artificially set prices, e.g. 72.00 is parking at DFW, 9.99 is Netflix

In reality, digits can deviate from Benford's law for reasons that have nothing to do with fraud, such as artificial constraints on the system. But when I test this set of values, I fail to reject the null that they follow Benford's distribution:

fdP <- substr(format(purch,trim=TRUE),1,1)
totP <- table(factor(fdP, levels=paste(f)))

resG_P <- SmallSampTest(d=totP,p=p_fd,type="G")
resG_P

So as a quick check, the first digits of my credit card purchases do approximately follow Benford's law:

> resG_P
Small Sample Test Object 
Test Type is G 
Statistic is 12.5740089945434 
p-value is:  0.1469451  
Data are:  3 4 1 0 0 0 2 0 2 
Null probabilities are:  0.3 0.18 0.12 0.097 0.079 0.067 0.058 0.051 0.046 
Total permutations are:  125970  
