Payday lending – how low can they go?

A post from TMM referenced the British payday lender Wonga. As this form of credit is either uncommon or illegal in Germany, a word about their business model: they give out very small loans (upper limit 400 pounds) for a short time, up to 30 days. For this they charge 5.50 pounds in fees and roughly 1% interest per day, based on a yearly rate of 365%.
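To make the pricing concrete, here is a minimal sketch of what a borrower owes under the terms described above. The helper name `loanCost` is made up for illustration, and simple (non-compounding) interest is an assumption.

```r
# Hypothetical helper: total amount owed on a payday loan.
# Assumes simple (non-compounding) interest of 1% per day plus a flat fee.
loanCost = function(loan, days, fee = 5.5, dailyRate = 0.01)
{
  interest = loan * dailyRate * days
  loan + interest + fee
}

# Largest and longest loan on offer: 400 pounds for 30 days
maxLoan = loanCost(400, 30)    # 400 + 120 + 5.5

# A small, short loan: 100 pounds for 7 days
smallLoan = loanCost(100, 7)   # 100 + 7 + 5.5
```

On the maximum loan, interest and fee together come to 125.50 pounds on 400 pounds borrowed for a month.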

Nice business, if you can get it. What piqued my interest was the question: how many losses can they take on these loans and still make a decent return on investment? After fiddling around with numbers I pulled from thin air, the result is: they can trawl the bottom of the sea and still make a living. Here is my reasoning:

I simulate a credit portfolio of 100,000 loans within a business year of 365 days, ignoring weekends and holidays. Loan size and duration are uniformly distributed within the limits set by Wonga. I assign each loan a probability of default between 80% and 100%. I will define the time reference to which this probability applies in the next step, so the value can be interpreted as “the borrower will go bust within the next (x) days with (y) likelihood”. I further assume an equity of 1 million pounds to start the business. This ensures that the lender himself does not go into the red.

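numYears = 1
numCost = 100000*numYears   # number of loans in the portfolio
numDays = 365*numYears      # length of the simulation in days
equity = 1e6                # starting equity in pounds

# as.integer() truncates, so duration runs from 1 to 29 and loanday from 1 to 364
credits = data.frame(id = 1:numCost,
                     loan = round(runif(numCost, min=1, max=400), 0),
                     duration = as.integer(runif(numCost, min=1, max=30)),
                     loanday = as.integer(runif(numCost, min=1, max=numDays)),
                     pd = runif(numCost, min=0.8, max=1))

# Day on which the loan falls due
credits$repayday = credits$loanday + credits$duration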

I calculate the value of the outstanding amount on the repayment day as the size of the loan plus 1% interest per day of duration. I assume that the fixed amount of 5.50 pounds covers the fixed costs of the loan. To calculate the actual repayment made to the lender I make a random draw to simulate the possibility of default, where the base probability of default is weighted by the duration of the loan relative to the time reference chosen for the probability of default. Usually this is one year, but I will vary it in the analysis.
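A minimal sketch of this weighting for a single loan, with made-up numbers (a 200-pound loan over 15 days with a base probability of default of 0.9). The helper name `scaledPd` is hypothetical; the linear scaling is the one described above.

```r
# Hypothetical helper: probability of default over the loan's duration,
# scaled linearly from a base probability that refers to `defaultDenum` days.
scaledPd = function(pd, duration, defaultDenum = 365)
{
  pd * duration / defaultDenum
}

pdYear  = scaledPd(0.9, 15, 365)  # base pd refers to one year
pdMonth = scaledPd(0.9, 15, 30)   # base pd refers to one month

# Repayment draw: the lender gets either the full amount due or nothing
set.seed(1)
repaid = rbinom(1, 1, 1 - pdYear)
repayment = 200 * (1 + 15 / 100) * repaid
```

With a one-year reference the loan defaults with a probability of under 4%; with a one-month reference the same loan defaults with a probability of 45%.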

In the next step I calculate the cashflow and the balance sheet during the business year, and calculate the return on investment as the value of the balance sheet at the end of the year, divided by the equity.

calcRoi = function(credits, defaultDenum=365)
{
  # Amount due (loan plus 1% per day) times a 0/1 draw: 1 = repaid, 0 = default.
  # The default probability is pd scaled by duration/defaultDenum.
  credits$repayment = (credits$loan * (1 + credits$duration/100)) *
    rbinom(nrow(credits), 1, (1 - credits$pd*credits$duration/defaultDenum))

  # Daily cash flows; the balance starts at the equity
  cashflow = data.frame(day = 1:numDays, outflow = 0, inflow = 0, balance = 0)
  cashflow$balance[1] = equity
  for(day in 1:numDays)
  {
    cashflow$outflow[day] = sum(credits$loan[credits$loanday == day])
    cashflow$inflow[day] = sum(credits$repayment[credits$repayday == day])
    if(day > 1)
    {
      cashflow$balance[day] = cashflow$balance[day-1] + cashflow$inflow[day] - cashflow$outflow[day]
    }
  }
  # Final balance relative to equity; repayments due after numDays are not counted
  cashflow$balance[nrow(cashflow)]/equity
}
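The bookkeeping is easier to follow on a toy example. The sketch below uses three made-up loans and a ten-day horizon, and replaces the loop with `cumsum`; all names and numbers are for illustration only. Unlike the loop, `cumsum` also books flows on day 1, which makes no difference here because the toy data has none.

```r
# Toy horizon and equity, not the values used in the simulation
toyDays = 10
toyEquity = 1000

# Three hypothetical loans: one repaid, one defaulted, one due after the horizon
toyLoans = data.frame(loan      = c(100, 200, 50),
                      loanday   = c(2, 3, 5),
                      repayday  = c(4, 8, 12),
                      repayment = c(102, 0, 53.5))

# Daily sums of money lent out and money coming back
outflow = sapply(1:toyDays, function(d) sum(toyLoans$loan[toyLoans$loanday == d]))
inflow  = sapply(1:toyDays, function(d) sum(toyLoans$repayment[toyLoans$repayday == d]))

# Running balance and final balance relative to equity
balance = toyEquity + cumsum(inflow - outflow)
toyRoi = balance[toyDays] / toyEquity
```

The third loan is paid out on day 5 but falls due on day 12, so its repayment never enters the balance. The same happens in the simulation to loans granted in the last weeks of the year, which makes the measured ROI somewhat conservative.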

I now calculate the ROI for different time references in the range from 1 month to 1 year and plot the result.
[Plot: ROI against the time reference of the probability of default, from 1 month to 1 year]
The red line is the break-even and the blue line marks 10% ROI.

As you can see, the payday lender will make a decent ROI of 10% even with an average probability of default of 90% within the next 3-4 months, even if he writes off the defaulted loans completely. The crucial part of the business will thus be walking the fine line of selecting borrowers who are nearly bust, so that they are desperate enough to apply for this loan, but not yet so bust that the lender loses too large a part of the credit portfolio.
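The 3-4 month figure can be cross-checked with a back-of-the-envelope calculation instead of a simulation. The sketch below computes the expected margin per pound lent and solves for the time reference at which it is zero. It assumes a mean probability of default of 0.9, durations of 1 to 29 days (which is what the truncation by `as.integer` produces), independence of default probability, duration and loan size, and it ignores the year-end effect. The function name `expectedMargin` is made up.

```r
# Expected net return per pound lent, for a given time reference D (in days).
# Assumptions: durations 1..29 equally likely, mean pd of 0.9,
# pd independent of duration and loan size, fee ignored.
expectedMargin = function(D, pdMean = 0.9, durations = 1:29)
{
  mean((1 + durations / 100) * (1 - pdMean * durations / D)) - 1
}

# Time reference at which the lender breaks even on average
breakEven = uniroot(expectedMargin, interval = c(30, 365))$root
breakEven
```

Under these assumptions the break-even lies at roughly 108 days, about three and a half months. Since the portfolio lends out around 20 million pounds in total (100,000 loans averaging about 200 pounds) against 1 million pounds of equity, a 10% ROI needs only a margin of about half a percent, so it is reached just beyond the break-even point.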

Re: Data Paranoia – maybe justified this time?


Menzie Chinn responded to a post on Zero Hedge in which someone claimed that the September spread between the gain in full-time jobs and the loss in part-time jobs was unusually high. Menzie responded that this is well within what is probable given the history of these time series. Thankfully he gave links to the data on the FRED database. As I was wasting time anyway, I thought I would take a look at the data.

## [1] "LNS12600000" "LNS12500000"

[Plot: the downloaded FRED series]

So, we have a set of 549 monthly observations from January 1968 to September 2013. As in Menzie’s piece, I take the first difference of the logs to get monthly rates of change.
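As a reminder of why the first difference of the logs can be read as a rate of change, here is a small sketch on a toy series (made-up numbers, not the FRED data). For small changes the log difference is close to the ordinary relative change.

```r
# Toy series, not the FRED data
x = c(100, 101, 99, 102)

logDiff   = diff(log(x))            # first difference of the logs
pctChange = diff(x) / head(x, -1)   # ordinary relative change

round(cbind(logDiff, pctChange), 4)
```

Note that differencing drops one observation, so 549 months of levels give 548 monthly rates.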

dfDL = as.data.frame(apply(df, 2, function(x)diff(log(x))))
names(dfDL) = c("FullTime", "PartTime")
summary(dfDL)
##     FullTime             PartTime        
##  Min.   :-0.0211697   Min.   :-0.032792  
##  1st Qu.:-0.0006873   1st Qu.:-0.005594  
##  Median : 0.0013157   Median : 0.001420  
##  Mean   : 0.0011179   Mean   : 0.001760  
##  3rd Qu.: 0.0032261   3rd Qu.: 0.007903  
##  Max.   : 0.0148232   Max.   : 0.108558

We define the September 2013 values as cut-off points, and get the data points with an even more extreme spread between full-time and part-time.

# Look the month up by date, in case someone tries to use the code at a later time.
# The -1 accounts for the observation lost by differencing.
sep13 = which(index(LNS12500000)=="2013-09-01")-1
cutOffs = dfDL[sep13,]
lowerRight = dfDL[which(dfDL$FullTime > cutOffs$FullTime & 
                          dfDL$PartTime< cutOffs$PartTime),]

Looking at the scatter-plot, we could expect a negative correlation between the two variables

[Scatter plot: monthly rates of change, full-time against part-time; September 2013 in red, months with a larger spread in blue]

And this is confirmed: the correlation is -0.3918705. So, at first glance it does not seem unlikely that there should be a large increase in full-time jobs and a large decrease in part-time jobs. In fact, it is the expected result that, given a high value for one series, we have a high value with the opposite sign for the other series.

Identifying the empirical likelihood of the event, we see that the red dot is September 2013 and the blue ones are larger spreads: 5 points, which gives us a frequency for this event of 1.0928962%. The next step is to see whether the event is statistically more unlikely than what we have observed.
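The reported frequency can be reproduced with simple arithmetic. The sketch below assumes that the count is the 5 blue points plus September 2013 itself, divided by the 549 months of data; that reading is an inference from the numbers, not something the code above states.

```r
nMonths  = 549   # months of data, as stated above
nExtreme = 6     # assumption: the 5 blue points plus September 2013 itself

freq = nExtreme / nMonths       # share of months with such a spread
yearsBetween = 1 / freq / 12    # average waiting time in years

c(percent = 100 * freq, years = yearsBetween)
```

This gives 1.0928962% and one such month every 7.625 years.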

We assume a bivariate normal distribution with observed mean values and covariance matrix:

mu = colMeans(dfDL)
sigma = cov(dfDL)
dfSum = data.frame(Mean = mu, StDev = apply(dfDL,2,sd))
dfSum
##                 Mean       StDev
## FullTime 0.001117893 0.003495736
## PartTime 0.001760472 0.011738748

As we can see, the mean rate of the part-time series is about 57% higher than that of the full-time series. Its standard deviation is also much larger, more than three times that of the full-time series.

To calculate the probability of the joint event that the full-time rate in a given month is ≥ 0.3702089% and that the part-time rate is ≤ -1.1011538%, we put the numbers into R:

# pmvnorm() comes from the mvtnorm package
probEvent = pmvnorm(lower = unlist(c(cutOffs[1], -Inf)),
                    upper = unlist(c(Inf, cutOffs[2])),
                    mean = mu, sigma = sigma)

And the result is 6.2379699%, or about once every 1.3359047 years, which is in the same ballpark as the observed frequency of 1.0928962%, or once every 7.625 years.
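The probability can be cross-checked without `pmvnorm` by simulating from the same bivariate normal distribution in base R. The sketch below uses the rounded means, standard deviations, correlation and cut-offs printed in the text; the variable names are made up, and taking the quoted correlation as the Pearson correlation of the two rate series is an assumption.

```r
set.seed(42)
n = 1e6

# Rounded inputs taken from the text above
simMu  = c(FullTime = 0.001117893, PartTime = 0.001760472)
simSd  = c(FullTime = 0.003495736, PartTime = 0.011738748)
simRho = -0.3918705
simCut = c(FullTime = 0.003702089, PartTime = -0.011011538)

# Correlated normal draws built from two independent standard normals
z1 = rnorm(n)
z2 = simRho * z1 + sqrt(1 - simRho^2) * rnorm(n)
fullTime = simMu["FullTime"] + simSd["FullTime"] * z1
partTime = simMu["PartTime"] + simSd["PartTime"] * z2

# Share of simulated months at least as extreme as September 2013
probSim = mean(fullTime >= simCut["FullTime"] & partTime <= simCut["PartTime"])
probSim
```

With a million draws the simulated share should land close to the 6.2% reported above, up to Monte Carlo noise in the third or fourth decimal place.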

We contrast it with the outlier on the other side of the distribution, which looks like a rebasing in January 1994, as we have a decrease of about 2,079 in the number of full-time jobs and an increase of about 2,577 in the number of part-time jobs (in the units of the FRED series).

We get the following results:

whichPT = which.max(dfDL$PartTime)
cutOffs2 = dfDL[whichPT,]
cutOffs2
##               FullTime  PartTime
## 1994-01-01 -0.02116966 0.1085579
probEvent2 = pmvnorm(lower = unlist(c(-Inf, cutOffs2[2])),
                     upper = unlist(c(cutOffs2[1], Inf)),
                     mean = mu, sigma = sigma)

And the result is 6.1032592 × 10^-21 %, which is a “not in the lifetime of the universe” kind of likelihood.

So, the answer to the leading question – is the paranoia justified this time – is no. This month’s development is not statistically unlikely, especially in contrast to a genuine man-made event.