Posted by Dariusz Dwornikowski on Sun 21 September 2014

Statistics of RFS bugs and the sponsoring process

For some days I have been working on statistics of the sponsoring process in Debian. I consider sponsoring one of the most important mechanisms Debian has for attracting and enabling new contributions. It is important to know how this process works, whether we need more sponsors, how effective sponsoring is, and what timings are connected to it.

How did I do this?

I used the Debbugs SOAP interface to get all bugs filed against the sponsorship-requests pseudo-package. SOAP adds some overhead, because the script has to download the complete list of bugs for the sponsorship-requests package and then process them according to the given date ranges. In the future the same information could be extracted from the UDD database, which should be faster, because an SQL query can filter by date range inside the database instead of downloading everything and filtering it in Python.
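The date-range step can be sketched roughly as below. This is only an illustration, not the actual script: the record shape (bug number, opening date) and the sample values are made up, and the download from Debbugs is left out entirely.

```python
from datetime import date

# Hypothetical records: (bug_number, date_opened). In practice they would be
# built from what the Debbugs SOAP interface returns; these values are made up.
bugs = [
    (1, date(2012, 1, 15)),
    (2, date(2012, 3, 2)),
    (3, date(2013, 7, 9)),
]


def opened_between(bugs, start, end):
    """Return the bugs whose opening date lies in [start, end], inclusive."""
    return [(nr, opened) for nr, opened in bugs if start <= opened <= end]


in_2012 = opened_between(bugs, date(2012, 1, 1), date(2012, 12, 31))
print([nr for nr, _ in in_2012])
```

The whole list has to be in memory before it can be filtered, which is the overhead mentioned above.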

The most problematic part was getting the "real done date" of a particular bug, and frankly I spent most of my time writing a rather dirty and complicated script for it. The script gets the log for a particular bug number and returns the "real done date". I published a proof of concept in a previous post.
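To give an idea of what such a script has to do, here is a minimal sketch. It is not the proof of concept mentioned above, and it rests on loudly stated assumptions: that the bug log is available as a chronological list of raw e-mail messages, and that the "real done date" is the date of the last message addressed to the bug's -done or -close address. The function name, the bug number and the sample messages are all hypothetical.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def real_done_date(bug_number, raw_messages):
    """Return the date of the last closing message in the log, or None.

    Assumption: a closing message is one whose To or Cc header mentions
    <bug_number>-done@ or <bug_number>-close@, and the log is chronological.
    """
    targets = (f"{bug_number}-done@", f"{bug_number}-close@")
    done = None
    for raw in raw_messages:
        msg = message_from_string(raw)
        recipients = " ".join(filter(None, [msg.get("To"), msg.get("Cc")]))
        if any(t in recipients for t in targets) and msg.get("Date"):
            # Later closing messages overwrite earlier ones (reopen + close again).
            done = parsedate_to_datetime(msg["Date"]).date()
    return done


# Made-up log for a hypothetical bug number.
sample_log = [
    "From: a@example.org\nTo: submit@bugs.debian.org\n"
    "Date: Mon, 06 May 2013 10:00:00 +0000\n\nRFS: example/1.0-1\n",
    "From: b@example.org\nTo: 999999-done@bugs.debian.org\n"
    "Date: Tue, 24 Sep 2013 12:30:00 +0000\n\nUploaded.\n",
]
print(real_done_date(999999, sample_log))
```

A real log contains control messages, reopenings and archived states, so a production script needs more cases than this sketch covers.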

What did I measure?

The RFS bugs form a queue, and in every queue one is interested in the mean time to get processed. In this case I called the metric global MTTGS (mean time to get sponsored). This metric gives an overall insight into the performance of the RFS queue. Time to get sponsored (TTGS) for a bug is the number of days that passed between filing the RFS bug and closing it (the bug was sponsored). Mean time to get sponsored is calculated as the sum of the TTGSs of all bugs divided by the number of bugs (in a given period of time). Global MTTGS is MTTGS calculated for the period from 2012-01-01 until today.
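The definitions above translate into a few lines of Python. This is a sketch under assumptions, not the published code: the sample dates are invented, and a bug is counted in a period when its closing date falls inside it (counting by filing date would be an equally valid reading of "in a given period of time").

```python
from datetime import date

# Hypothetical sponsored bugs: (date_filed, real_done_date).
sponsored = [
    (date(2012, 1, 10), date(2012, 1, 20)),  # TTGS = 10 days
    (date(2012, 2, 1), date(2012, 3, 2)),    # TTGS = 30 days
    (date(2012, 3, 1), date(2012, 5, 20)),   # TTGS = 80 days
]


def ttgs(filed, done):
    """Time to get sponsored, in days."""
    return (done - filed).days


def mttgs(bugs, start, end):
    """Mean TTGS over bugs closed within [start, end]; None if there are none."""
    times = [ttgs(filed, done) for filed, done in bugs if start <= done <= end]
    return sum(times) / len(times) if times else None


epoch = date(2012, 1, 1)
print(mttgs(sponsored, epoch, date.today()))  # global MTTGS
```

Calling `mttgs` with the same start date and a moving end date gives one point of the global MTTGS curve per day.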

Besides MTTGS, I have also measured typical bug-related metrics:

  • number of bugs closed on a given day,
  • number of bugs opened on a given day,
  • number of bugs with status open on a given day,
  • number of bugs with status closed on a given day.
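The four metrics in the list can be derived from opening and closing dates alone. The sketch below assumes a made-up record shape (opening date, closing date or None) and assumes that the start date is not later than the first opened bug, so that the running totals are complete.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical bugs: (date_opened, date_closed or None if still open).
bugs = [
    (date(2012, 1, 1), date(2012, 1, 3)),
    (date(2012, 1, 1), None),
    (date(2012, 1, 2), date(2012, 1, 3)),
]


def daily_metrics(bugs, start, end):
    """Return rows of (day, opened, closed, status_open, status_closed)."""
    opened = Counter(o for o, _ in bugs)
    closed = Counter(c for _, c in bugs if c is not None)
    rows, total_opened, total_closed = [], 0, 0
    day = start
    while day <= end:
        total_opened += opened[day]
        total_closed += closed[day]
        # Bugs with status open = everything opened so far minus everything closed.
        rows.append((day, opened[day], closed[day],
                     total_opened - total_closed, total_closed))
        day += timedelta(days=1)
    return rows


rows = daily_metrics(bugs, date(2012, 1, 1), date(2012, 1, 3))
for row in rows:
    print(row)
```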

Plots and graphs

Below is a plot of global MTTGS vs. time.

mttgs plot

As you can see, the trend is roughly exponential and MTTGS tends to settle around 60 days at the end of 2013. This does not mean that your package will nowadays wait 60 days on average to get sponsored. Remember that this is a global MTTGS, so even if the MTTGS of the last month was very low, the global MTTGS would decrease only slightly. It does, however, give a good glimpse into the performance of the process. Even though more packages are filed for sponsoring now than at the beginning of the epoch (see the next graphs), the sponsoring rate is high enough to flatten the global MTTGS, and maybe decrease it with time.
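A quick calculation shows why one good month barely moves the global value. The numbers below are invented purely for illustration and are not taken from the data: 1000 bugs sponsored so far with a mean of 60 days, followed by a month in which 50 bugs were sponsored in 10 days on average.

```python
def combined_mean(mean_a, n_a, mean_b, n_b):
    """Mean of two groups, given each group's mean and size."""
    return (mean_a * n_a + mean_b * n_b) / (n_a + n_b)


# Made-up numbers: history of 1000 bugs at 60 days, then 50 bugs at 10 days.
new_global = combined_mean(60, 1000, 10, 50)
print(round(new_global, 1))  # 57.6
```

Even though the hypothetical month was six times faster than the history, the global mean drops by less than three days.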

The image below shows how many bugs reside in the queue with status open or closed (calculated for each day). For closed we have an almost linear function, so each day more or less the same number of bugs is closed, and they add to the pool of bugs with status closed. For bugs with status open, the interesting part begins around May 2012, after the system became saturated or popular. The curve can be read as the number of bugs waiting in the queue; the important part is that it is stable and does not show a clear increasing trend.

open done plot

The last plot shows the arrival and departure rates of bugs in the RFS queue, i.e. how many bugs are opened and closed each day. The interesting parts here are the maxima. Let's look at them.

opened closed plot

The maximum number of opened bugs (21) was on 2012-05-06. As it turns out, it was a batch upload of RFSs for tryton-modules-*.

  706953  RFS: tryton-modules-account-stock-anglo-saxon/2.8.0-1 
  706954  RFS: tryton-modules-purchase-shipment-cost/2.8.0-1 
  706948  RFS: tryton-modules-production/2.8.0-1 
  706969  RFS: tryton-modules-account-fr/2.8.0-1 
  706946  RFS: tryton-modules-project-invoice/2.8.0-1 
  706950  RFS: tryton-modules-stock-supply-production/2.8.0-1 
  706942  RFS: tryton-modules-product-attribute/2.8.0-1 
  706957  RFS: tryton-modules-stock-lot/2.8.0-1 
  706958  RFS: tryton-modules-carrier-weight/2.8.0-1 
  706941  RFS: tryton-modules-stock-supply-forecast/2.8.0-1 
  706955  RFS: tryton-modules-product-measurements/2.8.0-1 
  706952  RFS: tryton-modules-carrier-percentage/2.8.0-1 
  706949  RFS: tryton-modules-account-asset/2.8.0-1 
  706904  RFS: chinese-checkers/0.4-1 
  706944  RFS: tryton-modules-stock-split/2.8.0-1 
  706981  RFS: distcc/3.1-6 
  706945  RFS: tryton-modules-sale-supply/2.8.0-1 
  706959  RFS: tryton-modules-carrier/2.8.0-1 
  706951  RFS: tryton-modules-sale-shipment-cost/2.8.0-1 
  706943  RFS: tryton-modules-account-stock-continental/2.8.0-1 
  706956  RFS: tryton-modules-sale-supply-drop-shipment/2.8.0-1

The maximum number of closed bugs (18) was on 2013-09-24 and, as you have probably guessed, the tryton modules had an impact on that as well.

  706953  RFS: tryton-modules-account-stock-anglo-saxon/2.8.0-1 
  706954  RFS: tryton-modules-purchase-shipment-cost/2.8.0-1 
  706948  RFS: tryton-modules-production/2.8.0-1 
  706969  RFS: tryton-modules-account-fr/2.8.0-1 
  706946  RFS: tryton-modules-project-invoice/2.8.0-1 
  706950  RFS: tryton-modules-stock-supply-production/2.8.0-1 
  706942  RFS: tryton-modules-product-attribute/2.8.0-1 
  706958  RFS: tryton-modules-carrier-weight/2.8.0-1 
  706941  RFS: tryton-modules-stock-supply-forecast/2.8.0-1 
  706955  RFS: tryton-modules-product-measurements/2.8.0-1 
  706952  RFS: tryton-modules-carrier-percentage/2.8.0-1 
  706949  RFS: tryton-modules-account-asset/2.8.0-1 
  706944  RFS: tryton-modules-stock-split/2.8.0-1 
  706959  RFS: tryton-modules-carrier/2.8.0-1 
  723991  RFS: mapserver/6.4.0-2 
  706951  RFS: tryton-modules-sale-shipment-cost/2.8.0-1 
  706943  RFS: tryton-modules-account-stock-continental/2.8.0-1 
  706956  RFS: tryton-modules-sale-supply-drop-shipment/2.8.0-1
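Maxima like the two above can be located by counting events per day. A minimal sketch with made-up dates (the helper name is hypothetical, and it raises IndexError on an empty list):

```python
from collections import Counter
from datetime import date

# Hypothetical opening dates, one entry per bug.
opened_dates = [
    date(2012, 1, 1),
    date(2012, 1, 2), date(2012, 1, 2), date(2012, 1, 2),
    date(2012, 1, 3),
]


def busiest_day(dates):
    """Return (day, count) for the day on which the most events happened."""
    return Counter(dates).most_common(1)[0]


day, count = busiest_day(opened_dates)
print(day, count)
```

Running the same function on the closing dates gives the departure maximum.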

The software

Most of the software was written in Python. Graphs were generated in R. After a code cleanup I will publish the complete solution on my GitHub account, free for everybody to use. If you would like to see other statistics, please let me know; I can create them if the data provides sufficient information.

© 2012-2014 Dariusz Dwornikowski.