Week #5: First Phase Ends…

What did you do this week?

The PR resolving the memory leaks is still up for review. While waiting, I thought it would be nice to look ahead at the potential performance of the methods I propose to add. So, I created a new branch, gsoc-unuran-bench, on my fork and started wrapping the remaining methods I had proposed in the Excel sheet that I wrote a couple of weeks ago. I then wrote a small Python script to benchmark all the wrapped methods against the NumPy random number generators. For now, I have only used two distributions: Standard Normal and Beta(2, 3). I plan to add more in the following weeks. Sampling was run 3 times per measurement. The results of the benchmark:

  • UNU.RAN’s methods (namely NumericalInversePolynomial and AutomaticRatioOfUniforms) were 3x faster than the NumPy RNG for the Beta(2, 3) distribution.
  • The NumPy RNG was slightly faster than UNU.RAN’s methods when sampling from the Standard Normal distribution. NumericalInversePolynomial and AutomaticRatioOfUniforms came closest to the performance of the NumPy RNG.

It is good to see that there is a possibility of improving the performance of sampling from some distributions once the methods from UNU.RAN are integrated into SciPy.
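For readers who want to try a similar comparison, here is a minimal sketch of the NumPy side of such a benchmark. It is not the actual script from the gsoc-unuran-bench branch: the helper name `time_sampler`, the sample size of 100,000, the seed, and the choice to report the best of the 3 runs are all my assumptions for illustration. The wrapped UNU.RAN methods would be added to the `samplers` dictionary as further callables; they are left out here because their API is still under review.

```python
import timeit

import numpy as np


def time_sampler(sampler, size, repeat=3):
    """Time sampler(size) `repeat` times and return the best time in seconds.

    Hypothetical helper: taking the minimum of the runs is an assumption,
    not necessarily how the original benchmark aggregated its measurements.
    """
    timings = timeit.repeat(lambda: sampler(size), repeat=repeat, number=1)
    return min(timings)


# Seed chosen arbitrarily so that runs draw the same samples.
rng = np.random.default_rng(12345)

# Each entry maps a label to a callable that draws `n` samples.
# Wrapped UNU.RAN methods could be registered here in the same way.
samplers = {
    "Standard Normal (NumPy)": lambda n: rng.standard_normal(n),
    "Beta(2, 3) (NumPy)": lambda n: rng.beta(2, 3, size=n),
}

# Sample size is an assumption; the post does not state the one used.
results = {name: time_sampler(fn, size=100_000) for name, fn in samplers.items()}

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

Keeping every sampler behind the same "callable taking a size" interface means the timing code does not need to know which library produced the samples, so the comparison stays like for like.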

What is coming up next?

There are already some reviews on the PR resolving the memory leaks, and I hope that by the end of next week there will be even more, so that we can decide whether we want to use that approach in SciPy. It’s a tricky and unconventional approach, so I am not sure how many reviews would be considered “enough” or how much time it will take for the maintainers to properly review it. While that is going on, I hope to start wrapping methods to sample from discrete distributions and benchmark them against the NumPy RNG.

Did you get stuck anywhere?

No. This was more or less a smooth week…

Marking the end of Phase 1

The GSoC page says that the first-phase reviews start tomorrow. So, this seems like a good time to summarize all the progress of Phase 1 here:

  • PR filed: Tests pass.
  • SciPy builds with UNU.RAN.
  • Separated UNU.RAN into its own submodule.
  • Created wrappers for one continuous and one discrete generator.
  • Basic benchmarks written.
  • Basic tests written.
  • A strong documentation suite and tutorials written.
  • Extra benchmarks written on my fork for all continuous methods in UNU.RAN.

According to my proposal, I have successfully achieved my first and second milestones and started on the third milestone ahead of schedule! Most of the points above are still under review and might change in the future if need be. It will still be a challenge, both for the maintainers and for me, to resolve the memory leaks issue, but I hope that is done before the end of the second phase so that we can merge and test out some of the new functionality and iterate on the design. Let’s hope for the best for what’s coming!