Piotr Pietrzak: 2016

Tuesday, April 26, 2016

Global unroll

Spock is a great tool, but my coleague Marcin Zajączkowski invented a way to make it perfect. He published his lib to unroll all tests by default. This is very nice feature, because this is probably the desired way of working with parametrized tests.
Just add this lib to testCompile:
info.solidsoft.spock:spock-global-unroll:0.5.0
now you can remove all @Unroll from your tests.

Simple pull request made by author:
https://github.com/piotrpietrzak/aprcalc/pull/2/files

Sunday, April 24, 2016

Shadows of BigDecimal

We are using BigDecimal in many places. We are using BigDecimal for modeling currencies, financial measures (interest rates, coefficients), maybe some other domain values that needs precision better than Double provides.

BigDecimal is common and straightforward answer for precision problem, but why it sucks?

First problem with BigDecimal is lack of power function - there is no BigDecimal^BigDecimal. Of course this is not a real problem - you can use some external library for doing this or write this one on your own. To do this we have to implement e^x ln(x) and n! - piece of cake. Frustration is bigger if you try to find a lib for that.

Second problem is that even you solve the first problem there is new - performance problem. This problem accompanies us as a humanity for ages. If we want to perform operations with precision better than float or double offers we have to struggle with numerical recipes and lack of low level support. This one is quite interesting.

What is my understanding of low level support? Lets go back in time to 286 (Intel 286 not A.D. 286). There were no low level support for floating point operations at all. For 386SX there were new option to buy 387 floating point co-processor to perform floating point operations faster. There were also software emulators for 387 (to run apps written for 386DX with embedded 387 on plain 386SX). From that time later on Intel integrate those two chips in one. Modern IA-64 processors delivers floating point programming model based on separate instruction set to perform operations like add and multiply. As we know from math perspective it is enough to implement more sophisticated functions like e^x, ln x and others. We have also support to approximate reciprocal for floating point number and therefore we can divide by float because division is just a multiplication by reciprocal. Those operations are faster than those you can implement in plain assembly. But can we benefit from this mechanism in BigDecimal computations?

To answer this question we have to go deep to the assembly language. For java this is surprisingly quite easy. If we add "-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly" to java command line and make it work. We will get assembly for parts that was compiled with JIT. You should get something like this:

Decoding compiled method 0x000000010fdffd90:
Code:
[Entry Point]
[Constants]
  # {method} {0x0000000123bb5310} 'multiply' '(Ljava/math/BigDecimal;)Ljava/math/BigDecimal;' in 'java/math/BigDecimal'
  # this:     rsi:rsi   = 'java/math/BigDecimal'
  # parm0:    rdx:rdx   = 'java/math/BigDecimal'
  #           [sp+0xf0]  (sp of caller)
  0x000000010fdfffe0: mov    0x8(%rsi),%r10d
  0x000000010fdfffe4: shl    $0x3,%r10
  0x000000010fdfffe8: cmp    %rax,%r10
  0x000000010fdfffeb: jne    0x000000010fa45e20  ;   {runtime_call}
  0x000000010fdffff1: data32 data32 nopw 0x0(%rax,%rax,1)
  0x000000010fdffffc: data32 data32 xchg %ax,%ax

(...)
Code for "simple" multiplication fills few screens and takes a lot of compare and jump loops.

Long trace of this output is not interesting to follow, but there is no signs for floating point operations. It looks like our JIT don't find floating point operations useful for BigDecimal operations.

My personal experience with this effect was when I was implementing "better" way of computing APR. I have to implement a^x and therefore I was forced to implement e^x and ln x on my own. Of course it was based on multiplication/division and add operations. At the end - results was worse than I expected. For very common arguments for APR computations took more than second on my i5 CPU. During those computations one core was working on 100%. I won't merge this branch because it renders my library useless - I can't waste so much CPU time for computations of APR.

Third problem with BigDecimal is inconsistency of basic math operations like comparison value to another one. Lets focus on ZERO. This number is very important because we can't find reciprocal of ZERO. As a senior math user I was aware that division(x) will fail for x = ZERO and therefore I was checking if(x.equals(ZERO)) to prevent division. Unfortunately there is a case when x is not equal ZERO (x="OE-21") in terms of equals method, but it ends with "BigInteger divide by zero". Fortunately there is compareTo method which returns: -1, 0, or 1 as this BigDecimal is numerically * less than, equal to, or greater than val. To solve this issue you should compareTo val rather than check if it is equal to val.

Even though some problems of BigDecimal really annoys me I found it very useful for modeling math and financial domains. It has its own problems, but I will not implement this kind of functionality better on my own.

Saturday, April 9, 2016

APR Adventure

APR - part I

When you want to compare two things in Java you should implement Comparable<> interface (in math it is called total order). You don't have to compute single value for one object and then second for the other to compare them. Of course in some cases this approach is even simpler, but it brings some other problem - there is no easy way to check if you are comparing apples to apples because there is no such thing as a BigDecimal<Apple>.

In real world when we want to compare sticks we will probably use the length as a measure to ease entire process. For stones we can use length of stone as a measure of "good stick and stone". As you can imagine using one coefficient for comparing sticks and stones may break my bones.

EU enforces some regulations on every company that lend money to provide single coefficient to ease consumers compare their offers.

Main problem of this approach is that it is very hard to compare two loans, because we are comparing two cashflows (in Java Map<LocalDate, BigDecimal>) using one coefficient (Java type: BigDecimal). So we are mapping map to scalar and compare those scalars. As you can imagine this is universal way of comparing apples and oranges.

For unknown reasons EU choose not easy to solve numerically measure called Annual Percentage Rate - APR.

Ak is lets say money we pay to our Customer and A'k' is money paid by Customer to us.

i in this formula is APR itself.

How hard it can be to find value of i from this formula?

Here we have first problem - this is no simple and generic method/formula to find "i". It is that hard that a lot of companies on their websites deliver wrong results to their clients (this is prohibited by the law by the way - you have to provide valid APR according to UE regulations. Those regulations are rewritten in local law and translated to many european languages, but math stays the same).

Let's redefine this problem a little bit. We have a situation f(x)=g(x) which we can rewrite as a f(x)-g(x)=0. This is more general problem - we have to find root of equation expressed as a sum of all payments discounted to the first day of loan.

From the API perspective we have a cashFlow defined as a Map<LocalDate, BigDecimal> and thats it! There is an opensource library to perform all those computations and provide results for mentioned API:

APR calculator

Source code you can find here.

Wednesday, March 2, 2016

Schedulers in couchbase

Couchbase java diver by default provides threading pools that are quite interesting. Why "are" instead of "is"? Because there are two of them - one for I/O operations and another one for computations.

Lets focus first on I/O operations.

There is a special I/O Thread Pool provided by couchbase driver to deal with io intensive operations. You can define only pool size - there is small hint in documentation - try to set this value to number of processors in your systems. This default value is connected with a way that IO operations are performed (non blocking netty). So far so good - very easy configuration: only one int parameter to set with reasonable default.

There is no option in API to provide IO Thread Pool/Scheduler/whatever to manually tune it up.

We can do a lot more with Computational Thread Pool. First of all we can use default CTP with manually set computationPoolSize. This thread pool is produced by slightly modified default RX Scheduler (which provides custom names like cb-computational-%d).

There is more interesting thing we can do with CTP - we can provide our own RX Scheduler. We can do this like this:

https://gist.github.com/piotrpietrzak/9c6f11f8842184eae718

Now we have full control over creating each thread in our application. Why should we struggle with that?

There is multiple good reasons to control that, but this is only one way to make correlationId (aka trace and span in spring sleuth). We have to provide our trace and span context (for example via TraceContextHolder) from thread managed by spring mvc to thread provided by our scheduler.

Monday, February 15, 2016

How to store passwords

Hash function

Let's do some modeling - we will start with storing password problem in our Identity Management System.

First of all I want to start with a statement: we don't want to store plaintext user passwords. We want to know that provided password is valid. Basic idea behind this concept is to create a function to transform user password to less vulnerable and irreversible hash.

At this point I think we are all aware that we should do this that way and you probably know about existence of hash functions. Choosing a proper algorithm to provide such hash is just not so simple.
We can start with some basic model and we will define some rules to our hash algorithm later on.
PasswordHashProvider - basic interface to provide one or more hash algorithms should looks like this:

public interface PasswordHashProvider {
    String encode(String password, UUID salt);

    default boolean verify(String password, UUID salt, String hash) {
        return encode(password, salt).equals(hash);
    }
}

With this simple interface we can model our requirements and implement few tests.
First of all we want our function to provide some properties like:

it is quick to compute the hash value for any given message
it is infeasible to generate a message from its hash
it is infeasible to modify a message without changing the hash
it is infeasible to find two different messages with the same hash.

To validate first property: "it is quick to compute the hash value for any given message" we can generate multiple passwords and fail if our computations last a little bit too long. Lets write test for this:

@Timeout(value = 100, unit = TimeUnit.MILLISECONDS)
def "should generate hash during 100 milliseconds or faster"() {
 given:
  String simplePassword = "PASSWORD"
 when:
  String result = passwordHashProvider.encode(simplePassword, UUID.randomUUID())
 then:
  !result.isEmpty()
}

Second property of hash function - "infeasible to generate a message from its hash" is quite interesting from mathematical perspective. First of all this kind of function should be irreversible (I mean there should be no function f^-1 that f(f^-1(x)) = x ). This property is very easy to break- I mean every reversible function should be injective. To break injective easily we can assign same result for multiple values. This is the case for our password hash function - you can login with different password you defined but it should be infeasible to find the second one.

To express third property: "infeasible to modify a message without changing the hash" I wrote simple test:

def "should generate different hash for same passwords and different salt"() {
 given:
  List hashes = []
  Integer countSum = 0
  String password = "PASSWORD"
 when:
  (1..ITERATIONS).each {
   String hash = passwordHashProvider.encode(password, UUID.randomUUID())
   hashes.add(hash)
   countSum += hashes.count(hash)
  }
 then:
  countSum == ITERATIONS
}

Ok - I will explain one trick used here to validate this property. We want to check if change of salt will change the password hash. Salt is just some known value that is accessible during encryption of passwords to improve their quality and complicate hacker life because password rainbow tables (tables with common hash/password combinations) does not work for passwords with salt properly added during encoding.

Second trick of this test is question: how to verify that our hashes are unique? Simple idea behind this is to sum count of our hash among other hashes and after that check if our sum is equal to number of iterations. In case we provide only one hash for all values we will get sum of arithmetic progression. In our case for lets say 100 password hashes we will get sum of 5050. Therefore this sum should be from range 100 - 5050.

I can't find a way to test property "it is infeasible to find two different messages with the same hash". We have to assume that this property should be reported by security researchers if we use one of well known algorithms.

Well known algorithms:

MD5

This algorithm fails the last property: "it is infeasible to find two different messages with the same hash". You can find details searching for phrase: "MD5 is not collision resistant".

Bcrypt

Default password hash algorithm for BSD and other systems including some Linux distributions.

Scrypt

What a silly name! What should I put in google to find java implementation? Java scrypt? :)

This algorithm characterizes with some good properties like being quite heavy memory intensive (which is good, because it saves us from farms of nvidia cores ready to break our computation intensive algorithms - as a humanity we can't scale memory without huge costs (in early 2016) ).

PBKDF2

After wikipedia: "PBKDF2 (Password-Based Key Derivation Function 2) is a key derivation function that is part of RSA Laboratories' Public-Key Cryptography Standards (PKCS) series, specifically PKCS #5 v2.0, also published as Internet Engineering Task Force's RFC 2898."

Argon2

https://github.com/p-h-c/phc-winner-argon2

To sum up

Lots of algorithms to hash passwords exists but which one is best or just good enough? Of course Argon2 is quite interesting. We can define how hard it can be to cpu, memory and we can also define degree of parallelism. For now I can't find JVM implementation of this algorithm - only bindings to C lib.

PBKDF2 seems reasonable - it has java native support (javax.crypto package). It has quite interesting properties and is widely implemented. It has some troubles like being cpu rather than memory intensive (which in cloud environments is also crucial for us). Lets start with PBKDF2 and move to Argon2 someday (the day I will find the JVM implementation with Apache2 compatible licence).

Friday, February 12, 2016

Spring sleuth

Correlation Id

When you start with microservices the natural question comes to mind: how to debug all this stuff. Unfortunately some of problems appear only in production environments. How to trace them effectively? Solution is not always so simple, but correlation id should help you with debugging such things. I will show you logs and everything should become clear right now:

2016-02-12 19:48:41.116 [trace=6d9ed076-bb14-42e9-b537-3ae1e85441b0,span=6d9ed076-bb14-42e9-b537-3ae1e85441b0] [o-auto-1-exec-3] o.s.cloud.sleuth.util.ExceptionUtils    : Trace client warn: thread http-nio-auto-1-exec-3 tried to start a new Span with parent MilliSpan(begin=1455302921115, end=0, name=null, traceId=6d9ed076-bb14-42e9-b537-3ae1e85441b0, parents=[], spanId=6d9ed076-bb14-42e9-b537-3ae1e85441b0, remote=true, exportable=true, annotations={}, processId=null, timelineAnnotations=[]), but there is already a currentSpan MilliSpan(begin=1455302921054, end=0, name=http/api/user/januszkowalski, traceId=6d9ed076-bb14-42e9-b537-3ae1e85441b0, parents=[], spanId=6d9ed076-bb14-42e9-b537-3ae1e85441b0, remote=false, exportable=false, annotations={}, processId=null, timelineAnnotations=[])

2016-02-12 19:48:41.119 [trace=6d9ed076-bb14-42e9-b537-3ae1e85441b0,span=17a6a6c2-dba3-4ac1-a23b-2deb6371c08b] [o-auto-1-exec-3] o.s.cloud.sleuth.log.Slf4jSpanListener  : Starting span: MilliSpan(begin=1455302921119, end=0, name=http/api/user/januszkowalski, traceId=6d9ed076-bb14-42e9-b537-3ae1e85441b0, parents=[6d9ed076-bb14-42e9-b537-3ae1e85441b0], spanId=17a6a6c2-dba3-4ac1-a23b-2deb6371c08b, remote=false, exportable=true, annotations={}, processId=null, timelineAnnotations=[])

First idea is to add unique id to every request and trace this id between another microservices interactions. This trick gives us possibility to trace our logs with this uid. Quite simple, but how to make this without rewriting this feaature from scratch?

In Approdorix project I started with Spring sleuth. It was really simple, just one line in application.properties:

logging.pattern.console=%d{yyyy-MM-dd HH:mm:ss.SSS} [trace=%X{X-Trace-Id:-},span=%X{X-Span-Id:-}] [%15.15t] %-40.40logger{39}: %m%n

Observable

Why you should consider Couchbase as a KV or document database?

Tomek Nurkiewicz gave me access to his fresh book: "Reactive Programming with RxJava". I think that this is my chance to learn "something new" by designing fully reactive application.

Evolution - from callbacks to RX

As you probably know node.js and javascript in general is single threaded. You can ask question - how it is possible that most nowadays Single Page Applications do a lot of things under the hood? They can send few request to backend and redraw user interface in responsive timing. Of course there is a possibility to run multiple IO operations at the same time in background but our app should not actively wait and check IO operations statuses.
Key to the "success" or rather a way how they do this in javascript is callbacks. After our operation ends (with success or error) our app gets informed instantly.

Have you heard about callback hell? It all starts in javascript. For many years a lot of fresh js adepts create a lot of async code that tends to move to the right - I mean literally moves to the right because every callback moves their code 4 spaces right.

Now you can't enter modern JS without hearing about solutions to avoid callback hell (I mean promises and $q).

I was quite familiar with $q and promises in javascript when I found out CompletableFuture in java8. I like those new mechanism in java8 especially mixed with lambdas.

If you don't know much about RxJava you can imagine that RxJava is just CompletableFuture on steroids.

I found in Tomeks book a chapter named: "Couchbase client API" but there is no content behind this marker in current version. I can imagine that this chapter will be extremely interesting and especially if Tomek will compare RxJava driver with other db drivers - those blocking ones.

What is so special about relation of Couchbase with RxJava? Couchbase is first database I know to provide native RxJava driver. What does it mean? You can query your database asynchronously and be informed just after our data reach client side document by document until we get complete result or we end with any error. Lets look how we can write this in code:

    @RequestMapping(value = "/{login}", method = RequestMethod.GET)
    public DeferredResult<User> findByLogin(@PathVariable String login) {
        DeferredResult<User> deferred = new DeferredResult<>();
        userService
                .get(login)
                .subscribe(deferred::setResult,
                        e -> {
                            logger.error("Can't find user by login", e);
                            throw new UserDoesNotExist();
                        });
        return deferred;
    }

We have few mechanisms in this small piece of code:

DeferredResult - this is Spring way to defer returning of result (from Spring MVC 3.2). This mechanism benefit from servlet 3 and its ability to async request processing. This is quite important because it is very easy to transform RxJava into blocking and we should do such trick to avoid blocking any thread.

Observable<User> - this is RxJava stuff. UserService.get(login) method returns Observable<User>. We can subscribe on events from this observable and react to emitted events. If we get healthy user we can return our user as a result with deferred::setResult. If we get some nasty error we can put message in log and throw runtime error. With all those tricks during the time our database process our query no threads on our application server waits for result. Just after they will come our app will process them with new threads.

I wonder if I can develop entire API in this manner - I mean reactive way. I will get back to this post if I encounter any problems with this approach. I think that Tomeks book will help me to stay on course and avoid me from making mistakes that transforms RxJava into blocking one.

Approdorix

The idea behind the Approdorix.com

My blog looks dead for some time, but a lot of other stuff was done under the hood (this is called the life). Finally I found the brilliant idea to write entire cycle of blogposts about creating open source project. My idea is to build some general microservices to fulfill most general needs of future applications.

Is this possible?

I must admit that I am afraid about the limited usability of a general models, but let's consider some bounded contexts where we can apply general domain models without bringing much risk to the project.

For example in Identity Management domain there is a lot of stuff you can do without any changes in almost every application. Users can register, login, logout, edit some basic data, use other authorization providers like: facebook, gmail or twitter. Things gets even more complicated if you consider microservices architecture. It sounds like a lot of stuff that has to be done even before you actually provide any real business value. I think that Identity Management done right can be shared between the future projects and therefore IDM is good bounded context to start with.
You can find github repository here:
https://github.com/piotrpietrzak/approdorix-users

A friend of mine told me about some general model of accounting pattern described in Martin's Fowler paper. After reading this I think that it can bring some value to implement general accounting microservice to provide financial consequences of business events. I wonder if it is possible to implement "general accounting" module that can with very small adjustments work in any domain.

Start small and grow only if we have to

Of course I do it for fun and therefore I do not want to just do it, but I want to learn something new, discover the way of doing things right and write about it.

My plan is to run those microservices with all dependencies on AWS with Kubernetes. I think this assumption is quite useful for our open source project end users because they can start small and grow even if they don't want to spend a lot of $ on infrastructure. I think I will start with couchbase as a document database and kafka as message broker. Why? I will explain in following blog posts.