Tuesday, April 26, 2016

Global unroll

Spock is a great tool, but my coleague Marcin ZajÄ…czkowski invented a way to make it perfect. He published his lib  to unroll all tests by default. This is very nice feature, because this is probably the desired way of working with parametrized tests.
Just add this lib to testCompile:
now you can remove all @Unroll from your tests.

Simple pull request made by author:

Sunday, April 24, 2016

Shadows of BigDecimal

We are using BigDecimal in many places. We are using BigDecimal for modeling currencies, financial measures (interest rates, coefficients), maybe some other domain values that needs precision better than Double provides.

BigDecimal is common and straightforward answer for precision problem, but why it sucks?

First problem with BigDecimal is lack of power function - there is no BigDecimal^BigDecimal. Of course this is not a real problem - you can use some external library for doing this or write this one on your own. To do this we have to implement e^x ln(x) and n! - piece of cake. Frustration is bigger if you try to find a lib for that.

Second problem is that even you solve the first problem there is new - performance problem. This problem accompanies us as a humanity for ages. If we want to perform operations with precision better than float or double offers we have to struggle with numerical recipes and lack of low level support. This one is quite interesting. 

What is my understanding of low level support? Lets go back in time to 286 (Intel 286 not A.D. 286). There were no low level support for floating point operations at all. For 386SX there were new option to buy 387 floating point co-processor to perform floating point operations faster. There were also software emulators for 387 (to run apps written for 386DX with embedded 387 on plain 386SX). From that time later on Intel integrate those two chips in one. Modern IA-64 processors delivers floating point programming model based on separate instruction set to perform operations like add and multiply. As we know from math perspective it is enough to implement more sophisticated functions like e^x, ln x and others. We have also support to approximate reciprocal for floating point number and therefore we can divide by float because division is just a multiplication by reciprocal. Those operations are faster than those you can implement in plain assembly. But can we benefit from this mechanism in BigDecimal computations?

To answer this question we have to go deep to the assembly language. For java this is surprisingly quite easy. If we add "-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly" to java command line and make it work. We will get assembly for parts that was compiled with JIT. You should get something like this:

Decoding compiled method 0x000000010fdffd90:
[Entry Point]
  # {method} {0x0000000123bb5310} 'multiply' '(Ljava/math/BigDecimal;)Ljava/math/BigDecimal;' in 'java/math/BigDecimal'
  # this:     rsi:rsi   = 'java/math/BigDecimal'
  # parm0:    rdx:rdx   = 'java/math/BigDecimal'
  #           [sp+0xf0]  (sp of caller)
  0x000000010fdfffe0: mov    0x8(%rsi),%r10d
  0x000000010fdfffe4: shl    $0x3,%r10
  0x000000010fdfffe8: cmp    %rax,%r10
  0x000000010fdfffeb: jne    0x000000010fa45e20  ;   {runtime_call}
  0x000000010fdffff1: data32 data32 nopw 0x0(%rax,%rax,1)
  0x000000010fdffffc: data32 data32 xchg %ax,%ax
Code for "simple" multiplication fills few screens and takes a lot of compare and jump loops.
Long trace of this output is not interesting to follow, but there is no signs for floating point operations. It looks like our JIT don't find floating point operations useful for BigDecimal operations.

My personal experience with this effect was when I was implementing "better" way of computing APR. I have to implement a^x and therefore I was forced to implement e^x and ln x on my own. Of course it was based on multiplication/division and add operations. At the end - results was worse than I expected. For very common arguments for APR computations took more than second on my i5 CPU. During those computations one core was working on 100%. I won't merge this branch because it renders my library useless - I can't waste so much CPU time for computations of APR.

Third problem with BigDecimal is inconsistency of basic math operations like comparison value to another one. Lets focus on ZERO. This number is very important because we can't find reciprocal of ZERO. As a senior math user I was aware that division(x) will fail for x = ZERO and therefore I was checking if(x.equals(ZERO)) to prevent division. Unfortunately there is a case when x is not equal ZERO (x="OE-21") in terms of equals method, but it ends with "BigInteger divide by zero". Fortunately there is compareTo method which returns: -1, 0, or 1 as this BigDecimal is numerically * less than, equal to, or greater than val. To solve this issue you should compareTo val rather than check if it is equal to val.

Even though some problems of BigDecimal really annoys me I found it very useful for modeling math and financial domains. It has its own problems, but I will not implement this kind of functionality better on my own.

Saturday, April 9, 2016

APR Adventure

APR - part I

When you want to compare two things in Java you should implement Comparable<> interface (in math it is called total order). You don't have to compute single value for one object and then second for the other to compare them. Of course in some cases this approach is even simpler, but it brings some other problem - there is no easy way to check if you are comparing apples to apples because there is no such thing as a BigDecimal<Apple>.

In real world when we want to compare sticks we will probably use the length as a measure to ease entire process. For stones we can use length of stone as a measure of "good stick and stone". As you can imagine using one coefficient for comparing sticks and stones may break my bones.

EU enforces some regulations on every company that lend money to provide single coefficient to ease consumers compare their offers.

Main problem of this approach is that it is very hard to compare two loans, because we are comparing two cashflows (in Java Map<LocalDate, BigDecimal>) using one coefficient (Java type: BigDecimal). So we are mapping map to scalar and compare those scalars. As you can imagine this is universal way of comparing apples and oranges.

For unknown reasons EU choose not easy to solve numerically measure called Annual Percentage Rate - APR.
Ak is lets say money we pay to our Customer and A'k' is money paid by Customer to us.
i in this formula is APR itself.
How hard it can be to find value of i from this formula?

Here we have first problem - this is no simple and generic method/formula to find "i". It is that hard that a lot of companies on their websites deliver wrong results to their clients (this is prohibited by the law by the way - you have to provide valid APR according to UE regulations. Those regulations are rewritten in local law and translated to many european languages, but math stays the same). 

 Let's redefine this problem a little bit. We have a situation f(x)=g(x) which we can rewrite as a f(x)-g(x)=0. This is more general problem - we have to find root of equation expressed as a sum of all payments discounted to the first day of loan.

From the API perspective we have a cashFlow defined as a Map<LocalDate, BigDecimal> and thats it! There is an opensource library to perform all those computations and provide results for mentioned API: 

APR calculator

Source code you can find here.