gt_dev: 2016

Wednesday, 7 December 2016

[Java / Assertj] How to create your own fluent assertions ?

Everyone should write unit tests. And I mean EVERYONE. Integration tests are very popular and useful, but unit tests are the fastest. Good unit tests let you check whether your class works within a second or two. UTs also tell you that something's wrong, but how do they do it? When a test fails you know that some part of your code doesn't work as you expect. You see the name of the failing test and the class it corresponds to, so you have a general idea of what might have gone wrong, but if you want to fix the bug you have to read both the test and the code. This is why your tests should be extremely readable. I use the following live template for all the tests I write (IntelliJ IDEA):

@Test
public void $METHOD_NAME$() throws Exception {
    // given
    $END$
    
    // when
    
    
    // then
    
}

The when section invokes the method under test. The given section prepares the input and records the mocks' behaviour. The then section checks the output. I really like the given-when-then template. It clearly separates those three sections, so you always know which object is the input and so on. The when section typically contains a single line of code. Actually I like it when each section is a one-liner, but that's usually impossible. A perfect test for me looks basically like this:

@Test
public void should_calculate_ceiling() throws Exception {
    // given
    double price = 7.8;
    
    // when
    double ceiling = Math.ceil(price);

    // then
    assertThat(ceiling).isEqualTo(8.0);
}

That case is actually too simple because typically you have to check the state of some object. Let's say that you're testing some class that returns an instance of Person:

@AllArgsConstructor
@Getter
public static class Person {
    private final String firstName;
    private final String lastName;
    private final int age;
    private final Sex sex;
    private final Optional<Job> job;

    public static enum Sex {
        MALE, FEMALE
    }

    public static class Job {}
}

I would never implement the Person class like this, but it's just an example, so forgive me :) With standard JUnit assertions you would probably write something like this:

@Test
public void should_return_joey() throws Exception {
    // when
    Person person = p();

    // then
    assertEquals(Sex.MALE, person.getSex());
    assertEquals("Joey", person.getFirstName());
    assertEquals("Tribiani", person.getLastName());
    assertTrue(person.getAge() > 18);
    assertEquals(Optional.empty(), person.getJob());
}

private Person p() {
    return new Person("Joey", "Tribiani", 25, Sex.MALE, Optional.empty());
}

I really don't like assertEquals(). First of all, you have to remember the order of the parameters: the first one is expected and the second is actual. If you get it wrong you will see a misleading message. Another thing: assertTrue() doesn't tell you why Joey has to be older than 18. This is why you should use a library that contains fluent assertions and allows you to create your own. The most popular assertions library for Java is probably AssertJ. Given a string, for instance, you can check many things at once:

assertThat("Joey Tribiani").hasSize(13)
                           .startsWith("Jo")
                           .contains("y T")
                           .endsWith("ani")
                           .doesNotContain("Rachel");

Same applies to collections:

List<String> friends = ImmutableList.of("Rachel", "Ross", "Joey", "Chandler", "Pheebs", "Monica");
assertThat(friends).hasSize(6)
                   .doesNotContain("Gunther")
                   .containsSequence("Rachel", "Ross")
                   .endsWith("Monica")
                   .allMatch(friend -> friend.matches("[A-Z][a-z]*"));

And so on... OK, let's write custom assertions for the Person class. First of all you have to create a class that extends AbstractAssert:

public class PersonAssertions extends AbstractAssert<PersonAssertions, Person> {
}

Then we must add a constructor that calls super() and a static assertThat method:

public class PersonAssertions extends AbstractAssert<PersonAssertions, Person> {
    public static PersonAssertions assertThat(final Person actual) {
        return new PersonAssertions(actual);
    }

    private PersonAssertions(final Person actual) {
        super(actual, PersonAssertions.class);
    }
}

Now we can start adding assertions. First of all let's create some methods that allow you to check first and last name.

public PersonAssertions hasFirstName(final String name) {
    Assertions.assertThat(name).isEqualTo(actual.getFirstName());
    return this;
}

public PersonAssertions hasLastName(final String lastName) {
    Assertions.assertThat(lastName).isEqualTo(actual.getLastName());
    return this;
}

Note that we always return this to make the assertions fluent. Now:

assertEquals("Joey", person.getFirstName());
assertEquals("Tribiani", person.getLastName());

Can be replaced by:

assertThat(person).hasFirstName("Joey")
                  .hasLastName("Tribiani");

Now let's check gender:

public PersonAssertions isFemale() {
    Assertions.assertThat(actual.getSex()).isSameAs(Sex.FEMALE);
    return this;
}

public PersonAssertions isMale() {
    Assertions.assertThat(actual.getSex()).isSameAs(Sex.MALE);
    return this;
}

Now we have:

assertThat(person).hasFirstName("Joey")
                  .hasLastName("Tribiani")
                  .isMale();

Those assertions are just fancy methods that check the internal state of a given object. All the methods created so far are way more readable, but we can still add more specific assertions that suit our domain. Let's assume that we're selling alcohol... In Poland you can buy alcohol if you're at least 18 years old.

public PersonAssertions canBuyBeerInPoland() {
    Assertions.assertThat(actual.getAge()).isGreaterThanOrEqualTo(18);
    return this;
}

assertThat(person).hasFirstName("Joey")
                  .hasLastName("Tribiani")
                  .isMale()
                  .canBuyBeerInPoland();

canBuyBeerInPoland() looks much better than assertTrue(person.getAge() > 18) because it uses our domain language. And the last one: let's check whether a given person is unemployed. The Person class contains Optional<Job>, so we have to check whether it's empty:

public PersonAssertions isUnemployed() {
    Assertions.assertThat(actual.getJob()).isEmpty();
    return this;
}

Now you can compare junit assertions to our custom assertj assertions:
JUnit:

assertEquals(Sex.MALE, person.getSex());
assertEquals("Joey", person.getFirstName());
assertEquals("Tribiani", person.getLastName());
assertTrue(person.getAge() > 18);
assertEquals(Optional.empty(), person.getJob());

AssertJ:

assertThat(person).hasFirstName("Joey")
                  .hasLastName("Tribiani")
                  .isMale()
                  .canBuyBeerInPoland()
                  .isUnemployed();

It looks much better and you can reuse all those methods. You should also think about overriding error messages. We're still using the same instance of Person:

private Person p() {
    return new Person("Joey", "Tribiani", 25, Sex.MALE, Optional.empty());
}

The following assertion:

assertThat(person).hasFirstName("Joeya");

generates error message like this:

org.junit.ComparisonFailure: 
Expected :"Joey"
Actual   :"Joeya"

You can always override the error message like this:

public PersonAssertions hasFirstName(final String name) {
    Assertions.assertThat(name).overridingErrorMessage("Given Person [" + actual + "] has name " + actual.getFirstName() + ". Expected: " + name)
            .isEqualTo(actual.getFirstName());
    return this;
}

Now the same test generates:

java.lang.AssertionError: Given Person [DistributorSalesScheduler.Person(firstName=Joey, lastName=Tribiani, age=25, sex=MALE, job=Optional.empty)] has name Joey. Expected: Joeya

The whole class looks as follows:

public static class PersonAssertions extends AbstractAssert<PersonAssertions, Person> {
    public static PersonAssertions assertThat(final Person actual) {
        return new PersonAssertions(actual);
    }

    private PersonAssertions(final Person actual) {
        super(actual, PersonAssertions.class);
    }

    public PersonAssertions hasFirstName(final String name) {
        Assertions.assertThat(name).overridingErrorMessage("Given Person [" + actual + "] has name " + actual.getFirstName() + ". Expected: " + name)
                .isEqualTo(actual.getFirstName());
        return this;
    }

    public PersonAssertions hasLastName(final String lastName) {
        Assertions.assertThat(lastName).isEqualTo(actual.getLastName());
        return this;
    }

    public PersonAssertions canBuyBeerInPoland() {
        Assertions.assertThat(actual.getAge()).isGreaterThanOrEqualTo(18);
        return this;
    }

    public PersonAssertions isFemale() {
        Assertions.assertThat(actual.getSex()).isSameAs(Sex.FEMALE);
        return this;
    }

    public PersonAssertions isMale() {
        Assertions.assertThat(actual.getSex()).isSameAs(Sex.MALE);
        return this;
    }

    public PersonAssertions hasJob() {
        Assertions.assertThat(actual.getJob()).isPresent();
        return this;
    }

    public PersonAssertions isUnemployed() {
        Assertions.assertThat(actual.getJob()).isEmpty();
        return this;
    }
}

As you can see, writing custom assertions is very easy and makes tests more readable. All the assertions are reusable and can be unit tested. The project I've been working on contains a module with all the test utilities, so other modules can import the assertions. I'm always very happy when I see that someone's created a new set of assertions for our domain objects :)
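The claim that assertions "can be unit tested" can be sketched without any dependencies. The block below is only an illustration: SimplePerson, SimplePersonAssert and failureMessageOf are hypothetical names, and the assertion class is a hand-rolled plain-Java stand-in, not an AssertJ AbstractAssert subclass. It shows the same fluent shape (every method returns this) and a check that a failing assertion produces the expected message.

```java
import java.util.Objects;

// Minimal stand-in for the Person class used above (only the fields needed here).
final class SimplePerson {
    final String firstName;
    final int age;

    SimplePerson(String firstName, int age) {
        this.firstName = firstName;
        this.age = age;
    }
}

// Hand-rolled fluent assertion: a plain-Java stand-in, NOT an AssertJ class.
final class SimplePersonAssert {
    private final SimplePerson actual;

    private SimplePersonAssert(SimplePerson actual) {
        this.actual = actual;
    }

    static SimplePersonAssert assertThat(SimplePerson actual) {
        if (actual == null) {
            throw new AssertionError("Expected a person but got null");
        }
        return new SimplePersonAssert(actual);
    }

    SimplePersonAssert hasFirstName(String expected) {
        // The value under test is compared against the expected one, in that order.
        if (!Objects.equals(actual.firstName, expected)) {
            throw new AssertionError(
                    "Expected first name <" + expected + "> but was <" + actual.firstName + ">");
        }
        return this; // returning this keeps the chain fluent
    }

    SimplePersonAssert canBuyBeerInPoland() {
        if (actual.age < 18) {
            throw new AssertionError("Expected age of at least 18 but was " + actual.age);
        }
        return this;
    }
}

final class SimplePersonAssertCheck {
    // Returns the failure message produced by the assertion, or null when it passes.
    static String failureMessageOf(Runnable assertion) {
        try {
            assertion.run();
            return null;
        } catch (AssertionError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        SimplePerson joey = new SimplePerson("Joey", 25);
        System.out.println(failureMessageOf(
                () -> SimplePersonAssert.assertThat(joey).hasFirstName("Joey").canBuyBeerInPoland()));
        System.out.println(failureMessageOf(
                () -> SimplePersonAssert.assertThat(joey).hasFirstName("Joeya")));
    }
}
```

With real AssertJ the same ordering matters: the argument of assertThat(...) is reported as the actual value and the argument of isEqualTo(...) as the expected one, so passing the person's field to assertThat keeps the labels in the failure message the right way round.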

Monday, 14 November 2016

[Java 8 / Parallel stream / Stream] Should I always use parallel stream instead of stream ?

Streams are probably one of the most commonly used features of Java 8. At first people discover the forEach() method, then map() and filter() and so on. Some of them start reading about functional programming, but from my experience I'd say that in general people still think that a stream is just an improved looping structure.

Then comes this exciting moment when they realize that it all can be much, much faster because there's also parallelStream. And then problems come...

Very often when I'm waiting for something I look into the code and try to fix some crappy parts. I found this one yesterday:

resource.setRegions(product.getRegions().parallelStream().map(Region::getName).collect(toList()));

Our database has something like ten regions. Let's see how long it takes to collect such items using stream and parallelStream.

public static void main(String[] args) {
    final List<Region> regions = IntStream.range(0, 10)
                        .mapToObj(i -> new Region("region:" + i))
                        .collect(toList());

    useLabel("stream()").andLogPerformanceOf(() -> regions.stream()
                                                         .map(Region::getName)
                                                         .collect(toList()));
    useLabel("parallelStream()").andLogPerformanceOf(() -> regions.parallelStream()
                                                              .map(Region::getName)
                                                              .collect(toList()));
}

useLabel(...).andLogPerformanceOf(...) is just a simple wrapper that runs a piece of code and logs time taken (I'll paste it at the end of the article). First run shows:

stream() started
stream() completed. Time elapsed = 1 millis
parallelStream() started
parallelStream() completed. Time elapsed = 9 millis

And some more results (time elapsed in milliseconds):

10 elements
Stream	Parallel stream
4	10
2	3
2	14
2	18
1	6
3	17
2	8
5	7
2	14
1	8

As you can see, in all cases stream() is faster than parallelStream(). A parallel stream has much higher overhead than a sequential stream, which uses a single thread. To split a collection's computation you need to divide the input so that the threads process similar amounts of data, run the threads, collect the results and so on.
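For anyone who wants to repeat the comparison without the logging helper, here is a rough sketch that uses only the JDK. The class and method names (StreamTimingSketch, averageNanos, names) are made up for this example, plain strings stand in for Region objects, and the warm-up loop is a simple attempt to reduce JIT noise rather than a proper benchmark harness, so treat the numbers as indicative only.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class StreamTimingSketch {
    // Runs the task a few times first so the measured runs are less affected by JIT warm-up,
    // then returns the average duration of the measured runs in nanoseconds.
    static long averageNanos(Supplier<List<String>> task, int warmUps, int runs) {
        for (int i = 0; i < warmUps; i++) {
            task.get();
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            total += System.nanoTime() - start;
        }
        return total / runs;
    }

    // Builds a small input list; strings stand in for the Region objects of the post.
    static List<String> names(int size) {
        return IntStream.range(0, size)
                .mapToObj(i -> "region:" + i)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> regions = names(10);
        long sequential = averageNanos(
                () -> regions.stream().map(String::toUpperCase).collect(Collectors.toList()),
                1_000, 1_000);
        long parallel = averageNanos(
                () -> regions.parallelStream().map(String::toUpperCase).collect(Collectors.toList()),
                1_000, 1_000);
        System.out.println("stream():         " + sequential + " ns on average");
        System.out.println("parallelStream(): " + parallel + " ns on average");
    }
}
```

Changing the argument of names(...) lets you look for the size at which the parallel version starts to pay off on your own machine.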

Let's make the input list bigger.

100 elements
Stream	Parallel stream
2	12
2	11
1	5
2	7
2	8
1	6
3	6
2	6
4	9
6	18

Parallel stream is still slower.

1000 elements
Stream	Parallel stream
9	14
2	20
2	7
9	23
3	9
2	20
3	6
2	5
3	9
3	5

Still slower.

10 000 elements
Stream	Parallel stream
8	7
6	23
12	9
5	10
6	9
16	19
7	9
14	14
11	22
20	18

For 10k elements the results are similar.

1 000 000 elements
Stream	Parallel stream
1423	65
1715	91
1244	63
1345	68
1458	91
1479	65
1415	48
1584	87
1425	61
1506	73

With a list that contains 1M elements the parallel stream is way faster, but how often do you work with such big collections?

Let's get back to the main question: Should I always use parallel stream instead of stream ?

Definitely not.

You should consider parallel version:

when you work with huge collections
when computation of single element takes much time
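The second case can be sketched with a deliberately slow per-element operation. Everything here is an assumption made for illustration: slowSquare is a hypothetical stand-in for an expensive computation or remote call, and the 50 ms sleep is arbitrary. How much faster the parallel run is depends on how many threads the common pool has on your machine.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class SlowElementSketch {
    // Simulates an expensive per-element computation (for example a remote call).
    static int slowSquare(int value) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return value * value;
    }

    // Squares the numbers 1..8, either sequentially or in parallel.
    static List<Integer> squares(boolean parallel) {
        IntStream numbers = IntStream.rangeClosed(1, 8);
        return (parallel ? numbers.parallel() : numbers)
                .map(SlowElementSketch::slowSquare)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (boolean parallel : new boolean[] {false, true}) {
            long start = System.nanoTime();
            List<Integer> result = squares(parallel);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println((parallel ? "parallel:   " : "sequential: ") + result + " in " + millis + " ms");
        }
    }
}
```

Both runs return the same list in the same order; only the elapsed time differs.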

I suppose that each case should be considered separately. Performance strongly depends on the operations you perform, so in my opinion trying to define precise conditions for when a parallel stream should be used simply doesn't make sense.

You've seen an example that transforms a huge collection. You can find another one, which processes a collection where computing a single element takes a long time, in my post: How to control pool size while using parallel stream.

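You should also remember that if you want to make your code parallel IT HAS TO work on immutable data: the operations you pass to the stream must not modify state that other threads read or write. I strongly recommend reading about functional programming principles.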
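A small sketch of why shared mutable state and parallel streams don't mix. The names are made up for this example. The first method lets several threads increment the same array slot without synchronization, so updates can get lost; the second keeps no shared state and lets the stream combine the partial results.

```java
import java.util.stream.IntStream;

final class SharedStateSketch {
    // Unsafe: several threads update the same array slot without synchronization,
    // so increments can be lost and the result may be lower than size.
    static int countWithSharedState(int size) {
        int[] counter = {0};
        IntStream.range(0, size).parallel().forEach(i -> counter[0]++);
        return counter[0];
    }

    // Safe: no shared mutable state; the stream combines the partial sums itself.
    static int countWithoutSharedState(int size) {
        return IntStream.range(0, size).parallel().map(i -> 1).sum();
    }

    public static void main(String[] args) {
        int size = 1_000_000;
        System.out.println("shared counter: " + countWithSharedState(size));
        System.out.println("stream sum:     " + countWithoutSharedState(size));
    }
}
```

The second number is always equal to size; the first one may or may not be, depending on how the threads happen to interleave.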

That's all. I've promised to paste the tool that logs performance so here you are:

/**
 * @author Grzegorz Taramina
 *         Created on: 13/06/16
 */
public class PerformanceLoggingBlock implements Logging {
    private final String label;

    public static PerformanceLoggingBlock useLabel(final String label) {
        return new PerformanceLoggingBlock(label);
    }

    private PerformanceLoggingBlock(final String label) {
        this.label = label;
    }

    public void andLogPerformanceOf(final Runnable runnable) {
        perfLog().info(label + " started");
        Stopwatch stopwatch = Stopwatch.createStarted();
        runnable.run();
        perfLog().info(label + " completed. Time elapsed = " + stopwatch.elapsed(MILLISECONDS) + " millis");
    }

    public <T> T andLogPerformanceOf(final Supplier<T> supplier) {
        System.out.println(label + " started");
        Stopwatch stopwatch = Stopwatch.createStarted();
        T result = supplier.get();
        System.out.println(label + " completed. Time elapsed = " + stopwatch.elapsed(MILLISECONDS) + " millis");
        return result;
    }
}

Wednesday, 9 November 2016

[Scala / Java / Gradle] How to add Scala to Java project and use both ?

I've been developing Java projects for a couple of years and I remember the day when I could finally use Java 8. I was pretty excited that I could abandon Guava's FluentIterable, the command pattern, anonymous classes and so on, but I soon realized that it's not enough when you want to write concise functional code.

Although Java 8 makes a significant step forward, it's nothing compared to Scala. So far I haven't heard of a company that decided to rewrite some huge Java project in Scala, but fortunately Scala runs on the JVM, so you can use both.

OK, I have a Java 8 + Gradle project. Let's add Scala :)

In build.gradle you need to apply the Scala plugin:

apply plugin: 'scala'

And scala lang/compiler dependencies:

compile group: 'org.scala-lang', name: 'scala-library', version: scalaVersion
compile group: 'org.scala-lang', name: 'scala-compiler', version: scalaVersion

You should also make sure that .java and .scala files are being built together so:

sourceSets.main.scala.srcDir "src/main/java"
sourceSets.test.scala.srcDir "src/test/java"
sourceSets.main.java.srcDirs = []
sourceSets.test.java.srcDirs = []

The project I've been working on has multiple modules, so I've added the sourceSets and dependencies in the subprojects section. Remember to install the Scala plugin in your IDE (in IntelliJ it's simply called Scala).

That's all :)

Now when I'm trying to build the project I get the following output:

➜  cw git:(develop) git status
On branch develop
Your branch is up-to-date with 'origin/develop'.
nothing to commit, working directory clean
➜  cw git:(develop) gradle build
:compileJava UP-TO-DATE
:compileScala UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:jar UP-TO-DATE
:assemble UP-TO-DATE
:compileTestJava UP-TO-DATE
:compileTestScala UP-TO-DATE
:processTestResources UP-TO-DATE
:testClasses UP-TO-DATE
:test UP-TO-DATE
:check UP-TO-DATE
:build UP-TO-DATE
...

As you can see, there's compileScala among the tasks, and it in fact builds both .java and .scala files. That's because of our sourceSets: the Java source directories are emptied and handed over to the Scala plugin, which compiles both languages together. This allows you to use Scala classes in .java files and Java classes in .scala files.

You should also make sure that you choose a proper version of Scala. I use 2.11.5. It works with Java 8 without any issues.

Friday, 23 September 2016

[Bash / Git] How to print number of commits done today ?

Quite simple stuff. You want to know how many commits you've done so far today. How to do that ?

➜  cw git:(develop) git log

It shows commits in the following format:

commit 2affec1442eb6b4ad21cd6d93c7670e49e4f3cba
Author: gt
Date:   Wed Sep 21 13:33:15 2016 +0200

    got rid of some warnings

commit 2bd8a52b239301adedeee8a36cff47c2599f992f
Author: gt
Date:   Wed Sep 21 13:25:24 2016 +0200

    bumped scala version in versions.gradle

You can see exactly when and by whom each commit was made.

Author: gt
Date:   Wed Sep 21 13:25:24 2016 +0200

We need to filter the git log output by author and date. It would be easier to grep the data if both the author and the date were on the same line. We also need the date only (the time isn't important here), so let's format the git log output like this:

git log --date=short --format=format:"%ad %aE %s"

It prints:

2016-09-23 mymaila@mycompany.pl commit message 1
2016-09-23 mymaila@mycompany.pl commit message 2
2016-09-23 mymaila@mycompany.pl commit message 3
2016-09-23 mymaila@mycompany.pl commit message 4
2016-09-23 mymaila@mycompany.pl commit message 5
2016-09-23 mymaila@mycompany.pl commit message 6
2016-09-23 mymaila@mycompany.pl commit message 7
2016-09-22 mymaila@mycompany.pl commit message 8
2016-09-22 mymaila@mycompany.pl commit message 9
2016-09-22 mymaila@mycompany.pl commit message 10

Now we can grep such output easily but we still need to know current date in the same format.

➜  cw git:(develop) date +'%Y-%m-%d'
2016-09-23

Let's put it together:

➜  cw git:(develop) ✗ git log --date=short --format=format:"%ad %aE %s" | grep "$(date +'%Y-%m-%d') mymaila@mycompany.pl"
2016-09-23 mymaila@mycompany.pl commit message 1
2016-09-23 mymaila@mycompany.pl commit message 2
2016-09-23 mymaila@mycompany.pl commit message 3
2016-09-23 mymaila@mycompany.pl commit message 4
2016-09-23 mymaila@mycompany.pl commit message 5
2016-09-23 mymaila@mycompany.pl commit message 6
2016-09-23 mymaila@mycompany.pl commit message 7

And the final step - count the lines:

➜  cw git:(develop) ✗ git log --date=short --format=format:"%ad %aE %s" | grep "$(date +'%Y-%m-%d') mymaila@mycompany.pl" | wc -l
8

I guess it's faster to count the commits in git log manually than to type the whole command above, so let's put it into a script:

#!/bin/bash

TODAY=$(date +'%Y-%m-%d')
AUTHOR="mymaila@mycompany.pl"
COMMITS=$(git log --date=short --format=format:"%ad %aE %s" | grep "$TODAY $AUTHOR" | wc -l)

echo "Commits today: $COMMITS"

I've also added alias in .zshrc:

alias hmc="/home/gt/tools/howManyCommitsToday.sh"

➜  cw git:(develop) hmc
Commits today: 8

Wednesday, 10 August 2016

[CleanCode / Patterns] How classes like Period should be constructed ?

I've been thinking a lot recently about constructing objects that contain only two fields. The system I develop contains many classes like this, which is rather problematic. I guess Period is a great example.

    public class Period {
        private final DateTime start;
        private final DateTime end;

        ...
    }

How can we create instance of Period ?

1. No-arg constructor + setters

This is probably the worst way. It would look like that:

final Period period = new Period();
period.setStart(startDate);
period.setEnd(endDate);

Advantages:

None

Disadvantages:

you create empty object
it takes three lines of code
it's mutable so you never know if the object is consistent
you can make stupid mistakes, like invoking the same setter twice with different parameters

Basically, for me a no-arg constructor with a set of getters and setters is a data structure. It isn't encapsulation, because you won't put any logic into the setters. You could make all the fields public and it would be the same. Let's talk about object consistency for a while. Consider an Address class:

    @AllArgsConstructor
    public static class Address {
        private final String street;
        private final String houseNumber;
        private final String postalCode;
    }

Some user (let's say Ben) registers in your system with the following address:

new Address("Long street", "16a", "50-500");

After a few days Ben finds a better flat on the same street, say: Long Street, 46d, 50-500. What now? Should we update the existing address or create a new one? Some people would say update, but what if this house belongs to a different post office and therefore should have a different postal code? I think that in such a case we should always create a new object. I have a validator which checks an Address instance, so when I create and validate an object I know it is consistent and correct, and it stays that way because nobody can change it afterwards. That's why immutable objects should be created whenever possible.
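The "validate once, never change" idea can be sketched like this. ValidatedAddress is a hypothetical class written for this illustration (the post's validator is a separate component that isn't shown), and the postal code pattern is an assumption based on the "50-500" example.

```java
import java.util.regex.Pattern;

// Immutable address that validates itself once, at construction time.
final class ValidatedAddress {
    // Assumed postal code format for this sketch: NN-NNN, as in "50-500".
    private static final Pattern POSTAL_CODE = Pattern.compile("\\d{2}-\\d{3}");

    private final String street;
    private final String houseNumber;
    private final String postalCode;

    ValidatedAddress(String street, String houseNumber, String postalCode) {
        this.street = requireNonBlank(street, "street");
        this.houseNumber = requireNonBlank(houseNumber, "houseNumber");
        if (postalCode == null || !POSTAL_CODE.matcher(postalCode).matches()) {
            throw new IllegalArgumentException("Invalid postal code: " + postalCode);
        }
        this.postalCode = postalCode;
    }

    private static String requireNonBlank(String value, String field) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(field + " must not be blank");
        }
        return value;
    }

    String getStreet() { return street; }
    String getHouseNumber() { return houseNumber; }
    String getPostalCode() { return postalCode; }

    public static void main(String[] args) {
        ValidatedAddress oldFlat = new ValidatedAddress("Long Street", "16a", "50-500");
        // Moving house creates a new, separately validated object; the old one is untouched.
        ValidatedAddress newFlat = new ValidatedAddress(oldFlat.getStreet(), "46d", "50-500");
        System.out.println(oldFlat.getHouseNumber() + " -> " + newFlat.getHouseNumber());
    }
}
```

Because there are no setters, every ValidatedAddress that exists has passed the checks in the constructor, and an invalid one can never be created by accident later.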

2. All-args constructor + getters

Advantages:

object is immutable
it takes one line to create instance

Disadvantages:

it's not very readable (especially for people who aren't developers)
when constructor's parameters are of the same type it's easy to swap them by mistake
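The second disadvantage is easy to demonstrate. FullName below is a hypothetical class used only for this illustration: both calls compile, and nothing warns you that the second one has the arguments the wrong way round.

```java
// Two parameters of the same type: the compiler cannot tell them apart.
final class FullName {
    final String firstName;
    final String lastName;

    FullName(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public static void main(String[] args) {
        FullName intended = new FullName("Joey", "Tribiani");
        FullName swapped = new FullName("Tribiani", "Joey"); // compiles, but the fields are mixed up
        System.out.println(intended.firstName + " vs " + swapped.firstName);
    }
}
```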

3. Builder

I think it's the best solution in this case. Most people use the builder pattern for classes which have a lot of fields. In my opinion a builder is also a perfect way to create an instance of a class which has only two fields. Let's get back to the Period class:

    @Getter
    public static class Period {
        private final DateTime startDate;
        private final DateTime endDate;

        public static Builder newPeriodThatStarts(final DateTime startDate) {
            return new Builder(startDate);
        }

        private Period(final DateTime startDate, final DateTime endDate) {
            this.startDate = startDate;
            this.endDate = endDate;
        }

        public static class Builder {
            private final DateTime startDate;
            
            private Builder(final DateTime startDate) {
                this.startDate = startDate;
            }
            
            public Period andEnds(final DateTime endDate) {
                return new Period(startDate, endDate);
            }
        }
    }

And this is how you create a Period instance:

    public static void main(String [] args) {
        Period period = newPeriodThatStarts(now()).andEnds(now().plusDays(1));
        Period anotherPeriod = newPeriodThatStarts(new DateTime(2016, 2, 15, 10, 0, 0)).andEnds(now());
    }

Advantages:

object is immutable
it takes one line to create instance
it's very readable even for someone who isn't a software developer

Disadvantages:

you have to create a builder (or use lombok)

The builder seems to be the best way for me. For Period you could also use an all-args constructor because the parameters have a natural order: the first one is startDate and the second is endDate (I can't really imagine the opposite), so you shouldn't swap the parameters by mistake. A lot of domain objects take two strings as parameters, and in that case I would definitely use a builder.
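For the two-strings case, the same two-step builder could look roughly like this. CustomerName and its method names are hypothetical and chosen only for illustration; the structure mirrors the Period builder above, written without Lombok so it compiles on its own.

```java
// Two-step builder for a class with two String fields, mirroring the Period builder above.
final class CustomerName {
    private final String firstName;
    private final String lastName;

    private CustomerName(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // The method name says which value is expected, so the two strings are hard to mix up.
    static Builder withFirstName(String firstName) {
        return new Builder(firstName);
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }

    static final class Builder {
        private final String firstName;

        private Builder(String firstName) {
            this.firstName = firstName;
        }

        CustomerName andLastName(String lastName) {
            return new CustomerName(firstName, lastName);
        }
    }

    public static void main(String[] args) {
        CustomerName joey = CustomerName.withFirstName("Joey").andLastName("Tribiani");
        System.out.println(joey.getFirstName() + " " + joey.getLastName());
    }
}
```

The constructor is private, so the only way to create an instance is through the call chain that names both values.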

Tuesday, 19 July 2016

[Java 8 / Threads / Parallel stream] How to control pool size while using parallel stream ?

Recently I had to implement a huge piece of functionality which, among other things, is responsible for automatically buying products available in some shop. The API allows putting many products into a single request, but due to some legal issues the items have to be bought one by one. So if someone wants to buy:

Product: Witcher 3, quantity: 15
Product: GTA 5, quantity: 5

I have to make 20 requests. It's a SOAP endpoint, so it takes a lot of time. Consider the following method:

    public OrderResult makeOrder(final List<ExternalSupplierOrderEntry> orderEntries, final String orderId, final String language) {
        final List<List<ExternalSupplierOrderEntry>> orderEntriesChunks = orderSplitter.split(orderEntries);
        final List<ExternalSupplierCode> boughtCodes = orderEntriesChunks.stream()
                .map(chunk -> Try.ofFailable(() -> buy(chunk, orderId, language))
                                 .whenFailure(t -> log().error("Something went wrong while making order", t))
                                 .orElse(markCodesAsFailed(chunk)))
                .flatMap(Collection::stream)
                .collect(toList());

        return new OrderResult(boughtCodes, orderId);
    }

It splits the order into chunks (one item per chunk) and buys each of them. The buy() method calls the SOAP endpoint, which returns the code. When I try to buy 50 codes it takes one minute to complete the order. That's way too long, so my first thought was: replace stream() with parallelStream(). And it actually works :)

    public OrderResult makeOrder(final List<ExternalSupplierOrderEntry> orderEntries, final String orderId, final String language) {
        final List<List<ExternalSupplierOrderEntry>> orderEntriesChunks = orderSplitter.split(orderEntries);
        final List<ExternalSupplierCode> boughtCodes = orderEntriesChunks.parallelStream()
                .map(chunk -> Try.ofFailable(() -> buy(chunk, orderId, language))
                                 .whenFailure(t -> log().error("Something went wrong while making order", t))
                                 .orElse(markCodesAsFailed(chunk)))
                .flatMap(Collection::stream)
                .collect(toList());

        return new OrderResult(boughtCodes, orderId);
    }

I'm trying to buy 50 codes, so the buy() method is invoked 50 times.

For stream() I get: 61.3s
For parallelStream() I get: 10.21s

10 seconds is a huge improvement, but it's still very long, so my second thought was to increase the number of threads in the pool. parallelStream() is OK, but there's no overloaded parallelStream(int threads) method. Since Java 7 the fork/join framework has been available directly in the JDK, and parallel streams use it to process the stream's elements on multiple threads. When you look into the ForkJoinPool class you'll see that the default constructor sets the default number of threads (the parallelism parameter) like this:

    public ForkJoinPool() {
        this(Math.min(MAX_CAP, Runtime.getRuntime().availableProcessors()),
             defaultForkJoinWorkerThreadFactory, null, false);
    }

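It takes min(availableProcessors, 0x7fff), where 0x7fff = 32767, so on an eight-core machine you get min(8, 32767) = 8. Parallel streams, however, don't create a pool with this constructor: they run on the shared ForkJoinPool.commonPool(), whose default parallelism is availableProcessors - 1, because the thread that starts the stream also takes part in the work. Let's run a quick test.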

    public static void main(String [] args) {
        final Set<Object> threadNames = IntStream.range(0, 10).parallel()
                .boxed()
                .peek(i -> Try.ofFailable(() -> { Thread.sleep(1000); return i; }))
                .map(i -> Thread.currentThread().getName())
                .collect(toSet());
        System.out.println(threadNames.size());
        System.out.println(threadNames);
    }

It prints:

4
[ForkJoinPool.commonPool-worker-1, ForkJoinPool.commonPool-worker-2, main, ForkJoinPool.commonPool-worker-3]

Note that the peek operation, which sleeps for a second, has been added to make each element take longer so that all the threads get used. Let's try to increase the number of threads in the pool using ForkJoinPool.

    public static void main(String [] args) throws ExecutionException, InterruptedException {
        final ForkJoinPool forkJoinPool = new ForkJoinPool(20);
        final Set<String> threadNames = forkJoinPool.submit(() -> IntStream.range(0, 20).parallel()
                .boxed()
                .peek(i -> Try.ofFailable(() -> { Thread.sleep(1000); return true; }).toOptional())
                .map(i -> Thread.currentThread().getName())
                .collect(toSet())).get();

        System.out.println(threadNames.size());
        System.out.println(threadNames);
    }

This one prints:

20
[ForkJoinPool-1-worker-8, ForkJoinPool-1-worker-30, ForkJoinPool-1-worker-9, ForkJoinPool-1-worker-23, ForkJoinPool-1-worker-12, ForkJoinPool-1-worker-22, ForkJoinPool-1-worker-11, ForkJoinPool-1-worker-20, ForkJoinPool-1-worker-1, ForkJoinPool-1-worker-4, ForkJoinPool-1-worker-5, ForkJoinPool-1-worker-2, ForkJoinPool-1-worker-16, ForkJoinPool-1-worker-27, ForkJoinPool-1-worker-15, ForkJoinPool-1-worker-26, ForkJoinPool-1-worker-25, ForkJoinPool-1-worker-19, ForkJoinPool-1-worker-18, ForkJoinPool-1-worker-29]

As you can see, the number of threads involved in producing the stream's output has increased to 20. It takes only one additional line, because you have to create the ForkJoinPool object. I'd really like to get rid of that line so that I don't have to remember about ForkJoinPool, so I've created this class:

/**
 * @author Grzegorz Taramina
 *         Created on: 18/07/16
 */
public class ForkJoinPoolInvoker {
    private final ForkJoinPool forkJoinPool;

    public static ForkJoinPoolInvoker usePoolWithSize(final int poolSize) {
        return new ForkJoinPoolInvoker(poolSize);
    }

    private ForkJoinPoolInvoker(final int poolSize) {
        this.forkJoinPool = new ForkJoinPool(poolSize);
    }

    public <T> T andInvoke(final Callable<T> task) {
        final ForkJoinTask<T> submit = forkJoinPool.submit(task);
        return Try.ofFailable(submit::get).orElseThrow(RuntimeException::new);
    }
}

Now the previous example would look like this:

    public static void main(String [] args) throws ExecutionException, InterruptedException {
        final Set<Object> threadNames = usePoolWithSize(20).andInvoke(() -> IntStream.range(0, 20).parallel()
                .boxed()
                .peek(i -> Try.ofFailable(() -> { Thread.sleep(1000); return true; }).toOptional())
                .map(i -> Thread.currentThread().getName())
                .collect(toSet()));

        System.out.println(threadNames.size());
        System.out.println(threadNames);
    }
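One thing worth keeping in mind with this approach: a ForkJoinPool created for a single call should be shut down once the work is done, otherwise every call leaves a pool behind. Below is a JDK-only sketch of the same idea with a shutdown in a finally block. The names (DedicatedPoolSketch, invokeInPool, workerNames) are made up for this example and it doesn't use the Try class. Note also that a parallel stream started from inside a ForkJoinPool task runs on that pool as a matter of how the JDK implementation behaves; as far as I know the stream API documentation doesn't promise it.

```java
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class DedicatedPoolSketch {
    // Runs the task in a dedicated pool and always shuts the pool down afterwards.
    static <T> T invokeInPool(int poolSize, Callable<T> task) {
        ForkJoinPool pool = new ForkJoinPool(poolSize);
        try {
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the task", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Collects the names of the threads that processed the stream's elements.
    static Set<String> workerNames(int poolSize, int elements) {
        return invokeInPool(poolSize, () -> IntStream.range(0, elements).parallel()
                .mapToObj(i -> Thread.currentThread().getName())
                .collect(Collectors.toSet()));
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        System.out.println("dedicated pool threads:  " + workerNames(4, 1_000));
    }
}
```

The printed thread names should belong to the dedicated pool (ForkJoinPool-N-worker-M) rather than to ForkJoinPool.commonPool.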

Let's get back to the main example that buys codes:

    public OrderResult makeOrder(final List<ExternalSupplierOrderEntry> orderEntries, final String language) {
        final List<List<ExternalSupplierOrderEntry>> orderEntriesChunks = orderSplitter.split(orderEntries);
        final List<ExternalSupplierCode> boughtCodes =  usePoolWithSize(nexwaySettings.getNumberOfBuyingThreads())
                                                        .andInvoke(() ->
             orderEntriesChunks.stream()
                .parallel()
                .map(chunk -> {
                    final String orderId = uuidProvider.randomUUID();
                    return Try.ofFailable(() -> buy(chunk, orderId, language))
                            .whenFailure(t -> log().error("Something went wrong while making order", t))
                            .orElse(markCodesAsFailed(chunk, orderId));
                })
                .flatMap(Collection::stream)
                .collect(toList())
        );

        return new OrderResult(boughtCodes);
    }

I've also checked how many threads would be sufficient to make order in a reasonable time:

Note that all the results presented in the chart have been averaged (for each number of threads the test was performed 10 times). The chart shows that using 15 threads is sufficient, because it then takes slightly more than 4 seconds to make 50 requests.

As you can see, changing the pool size is quite easy. I do realize that I could have done it differently, but in the end this solution looks good. All the calls to the API have a timeout, so I shouldn't experience the typical problems with parallel streams that people talk about.

Thursday, 14 July 2016

[Spring / Async] How to invoke method in separate thread ?

Some tasks need to be invoked in a separate thread, asynchronously, for instance when the operation needs a lot of time to finish. You just want to start it and return its id or some kind of status. In pure Java you would use Threads / Runnables and other stuff available in the low-level concurrency API which the JDK provides. That isn't very convenient, so the Spring developers have taken care of it. I've prepared a very simple Spring application which contains only two beans: RequestHandler and OperationRunner. The first one invokes the other. Here's the application runner:

    public class App {
        public static void main(final String [] args) throws InterruptedException {
            final AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext(AppConfig.class);
            final RequestHandler handler = context.getBean(RequestHandler.class);
            handler.handle();
        }
    }

And a configuration:

@Configuration
@ComponentScan("gt.dev.spring.async")
public class AppConfig {
}

Implementation of components:

    @Component
    public class RequestHandler {
        private static final Logger LOG = Logger.getLogger(RequestHandler.class);

        private final OperationRunner operationRunner;

        @Autowired
        public RequestHandler(final OperationRunner operationRunner) {
            this.operationRunner = operationRunner;
        }

        public void handle() throws InterruptedException {
            LOG.info("Request processing started. Invoking operation...");
            operationRunner.run();
            LOG.info("Request processed");
        }
    }

And OperationRunner:

    @Component
    public class OperationRunner {
        private static final Logger LOG = Logger.getLogger(OperationRunner.class);

        public void run() throws InterruptedException {
            LOG.info("Operation started. Sleeping...");
            Thread.sleep(5000);
            LOG.info("Operation finished");
        }
    }

After running main method in App class I get the following logs:

maj 07, 2015 3:39:28 PM gt.dev.spring.async.RequestHandler handle
INFO: Request processing started. Invoking operation...
maj 07, 2015 3:39:28 PM gt.dev.spring.async.OperationRunner run
INFO: Operation started. Sleeping...
maj 07, 2015 3:39:33 PM gt.dev.spring.async.OperationRunner run
INFO: Operation finished
maj 07, 2015 3:39:33 PM gt.dev.spring.async.RequestHandler handle
INFO: Request processed

It works as expected:

RequestHandler logs information which indicates that process has been started
RequestHandler invokes OperationRunner
OperationRunner logs that operation started
OperationRunner performs long operation (sleeps for 5 seconds)
OperationRunner logs that operation finished
control gets back to RequestHandler
RequestHandler logs that request has been processed

Every client has to wait until the operation is completed. In some cases we may want to return some kind of operation id to the client and do the processing asynchronously in the background. Let's make the run() method async. To do that we need the @Async annotation, which indicates that a particular method should be run in a separate thread.

    @Component
    public class OperationRunner {
        private static final Logger LOG = Logger.getLogger(OperationRunner.class);

        @Async
        public void run() throws InterruptedException {
            LOG.info("Operation started. Sleeping...");
            Thread.sleep(5000);
            LOG.info("Operation finished");
        }
    }

Spring also has to know that async calls are enabled, so we need one additional annotation in the Spring configuration: @EnableAsync. Lots of people forget about that. It's actually quite clever: you can put @Async on all the methods that need to be asynchronous and, when needed, disable all async calls by removing @EnableAsync.

    @Configuration
    @ComponentScan("gt.dev.spring.async")
    @EnableAsync
    public class AppConfig {
    }

After running the app I get the following log which proves that request has been processed before the async operation finished.

maj 07, 2015 3:54:13 PM gt.dev.spring.async.RequestHandler handle
INFO: Request processing started. Invoking operation...
maj 07, 2015 3:54:13 PM gt.dev.spring.async.RequestHandler handle
INFO: Request processed
maj 07, 2015 3:54:13 PM gt.dev.spring.async.OperationRunner run
INFO: Operation started. Sleeping...
maj 07, 2015 3:54:18 PM gt.dev.spring.async.OperationRunner run
INFO: Operation finished
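
For comparison, here is a minimal plain-JDK sketch of the "return an id and process in the background" idea mentioned above. It is not what Spring does internally; the class name BackgroundOperationSketch, the Submitted holder and the submit() method are all hypothetical names made up for this illustration.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BackgroundOperationSketch {
    /** Hypothetical handle given back to the caller right away. */
    static final class Submitted {
        final String id;
        final CompletableFuture<String> completion;

        Submitted(final String id, final CompletableFuture<String> completion) {
            this.id = id;
            this.completion = completion;
        }
    }

    // Assumption: a small fixed pool plays the role of the executor behind @Async.
    private final ExecutorService executor = Executors.newFixedThreadPool(2);

    /** Starts the long operation in the pool and returns immediately with its id. */
    Submitted submit(final long sleepMillis) {
        final String id = UUID.randomUUID().toString();
        final CompletableFuture<String> completion = CompletableFuture.supplyAsync(() -> {
            sleep(sleepMillis); // stands in for the long-running work
            return id;
        }, executor);
        return new Submitted(id, completion);
    }

    void shutdown() {
        executor.shutdown();
    }

    private static void sleep(final long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(final String[] args) {
        final BackgroundOperationSketch runner = new BackgroundOperationSketch();
        final Submitted submitted = runner.submit(500);
        System.out.println("Request processed, operation id: " + submitted.id);
        // join() blocks until the background work is done
        System.out.println("Operation finished: " + submitted.completion.join());
        runner.shutdown();
    }
}
```

The caller gets the id before the work is finished, which mirrors the log order above: "Request processed" shows up before "Operation finished".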

Tuesday, 12 July 2016

[Java 8 / Functional programming] Functional util that invokes a command n times.

Recently I've been doing major refactoring of integration tests. I've found many tests which do stuff like that:

for (int i = 0; i < 256; i++) {
    addProduct(UUID.randomUUID().toString());
}

It's pretty ugly, isn't it ? It would be nice to have a small tool that invokes given piece of code n times. In Java 8 we can use IntStream:

IntStream.range(0, 256).forEach(i -> addProduct(UUID.randomUUID().toString()));

Looks better but it's still not very readable. Again I've started with a test that specifies how the tool should work:

    @Test
    public void shouldInvokeCommandFiveTimes() throws Exception {
        // given
        final List<String> list = newArrayList();

        // when
        times(5).invoke(() -> list.add("item"));

        // then
        assertThat(list).containsExactly("item", "item", "item", "item", "item");
    }

I've come up with the following class:

/**
 * @author Grzegorz Taramina
 *         Created on: 12/07/16
 */
public class Times {
    private final int times;

    private Times(final int times) {
        this.times = times;
    }

    public static Times times(final int times) {
        Assert.isTrue(times >= 0, "times must be at least equal to zero");
        return new Times(times);
    }

    public void invoke(final Runnable runnable) {
        IntStream.range(0, times).forEach(i -> runnable.run());
    }
}

It's very simple but makes code concise and readable:

times(5).invoke(() -> addProduct(randomUUID().toString()));

I might have exaggerated by calling this a functional tool. It's simply a higher-order function, but a very useful one.
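
For anyone who wants to try this without Spring on the classpath, here is a minimal sketch of the same class. It assumes a plain IllegalArgumentException in place of the Assert.isTrue(...) call, and the class is renamed to TimesSketch only to mark it as an illustration. It also shows the two edge cases: zero runs nothing and a negative value is rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

final class TimesSketch {
    private final int times;

    private TimesSketch(final int times) {
        this.times = times;
    }

    static TimesSketch times(final int times) {
        // Assumption: plain-JDK replacement for the Assert.isTrue(...) check
        if (times < 0) {
            throw new IllegalArgumentException("times must be at least equal to zero");
        }
        return new TimesSketch(times);
    }

    void invoke(final Runnable runnable) {
        // range(0, times) is empty when times == 0, so nothing runs
        IntStream.range(0, times).forEach(i -> runnable.run());
    }

    public static void main(final String[] args) {
        final List<String> list = new ArrayList<>();
        times(5).invoke(() -> list.add("item"));
        times(0).invoke(() -> list.add("never added"));
        System.out.println(list.size()); // prints 5
    }
}
```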

Monday, 11 July 2016

[Java8 / Functional programming] How to create object that will be created lazily ?

Sometimes you may want to create some objects lazily, especially really heavy objects that may or may not be used at runtime. At one of the companies I used to work for we had to use Java 6. Some developers must have read some articles about laziness and started to create literally all the objects lazily, like that:

    public static class OldFashionedHeavyObjectHolder {
        private HeavyObject heavyObject;

        public synchronized HeavyObject getHeavyObject() {
            if (heavyObject == null) {
                heavyObject = new HeavyObject();
            }

            return heavyObject;
        }
    }

After a couple of months we had a lot of classes with tons of getters that check if an object is null and so on. I can see at least four disadvantages of that approach:

the method has to be synchronized because more than one thread can invoke the method when heavyObject == null
even if heavyObject has already been created you have to check that
it's extremely ugly
it's hard to test it

Luckily I've changed the company and now I can use all those fancy streams, lambdas and everything that Java 8 comes with. Basically I wanted to create a tool which works like that:

/**
 * @author Grzegorz Taramina
 *         Created on: 23/06/16
 */
public class LazyInstanceTest {
    @Test
    public void shouldCreateLazyInstance() throws Exception {
        // given
        LazyInstance<String> instance = LazyInstance.of(() -> "i'm lazy");

        // when
        String result = instance.get();

        // then
        assertThat(result).isEqualTo("i'm lazy");
    }
}

String is obviously just a simplification. What I wanted was some kind of factory that creates a holder for a heavy instance and takes care of creating it lazily. I came up with the following class:

/**
 * @author Grzegorz Taramina
 *         Created on: 23/06/16
 */
public class LazyInstance<T> {
    private final Supplier<T> instanceSupplier;
    private Supplier<T> instance = this::create;

    public static <T> LazyInstance<T> of(final Supplier<T> instanceSupplier) {
        return new LazyInstance<>(instanceSupplier);
    }

    /**
     * Creates LazyInstance
     * @param instanceSupplier supplier that will be lazily used while creating instance
     */
    private LazyInstance(final Supplier<T> instanceSupplier) {
        this.instanceSupplier = instanceSupplier;
    }

    public T get() {
        return instance.get();
    }

    private synchronized T create() {
        class InstanceFactory implements Supplier<T> {
            private final T instance = instanceSupplier.get();

            public T get() {
                return instance;
            }
        }

        if (!InstanceFactory.class.isInstance(instance)) {
            instance = new InstanceFactory();
        }

        return instance.get();
    }
}

It works like that:

final LazyInstance<HeavyObject> heavy = LazyInstance.of(HeavyObject::new);
HeavyObject heavyObject = heavy.get();

The main idea of this class is that the supplier is invoked lazily. The synchronized create() method returns the value returned by InstanceFactory (which is itself a Supplier). InstanceFactory in turn returns the value produced by the Supplier passed to LazyInstance. So basically we're invoking a supplier that invokes a supplier that creates the real instance. Another thing: instance = new InstanceFactory(); is a really important line because it swaps the suppliers, which means that once the object has been created, calls to get() no longer go through the synchronized method and the if check. After the object is created the instance field (in LazyInstance, not in InstanceFactory) holds an InstanceFactory which returns the real instance. It may look a bit complicated but I think it does all the stuff quite elegantly. Just to prove that the instance is created lazily:

    public static class HeavyObject {
        public HeavyObject() {
            System.out.println("heavy's being created...");
        }
    }

    public static void main(String [] args) {
        System.out.println("Started executing main method");
        final LazyInstance<HeavyObject> heavy = LazyInstance.of(HeavyObject::new);
        System.out.println("Created lazy instance");
        System.out.println("Calling heavy.get()");
        HeavyObject heavyObject = heavy.get();
        System.out.println("End of main");
    }

The output:

Started executing main method
Created lazy instance
Calling heavy.get()
heavy's being created...
End of main

I should also mention that it looks good in Java 8 because of lambdas but it can be also implemented in older versions using anonymous classes.
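
To check that the supplier really runs only once when several threads call get() at the same moment, a small harness like the one below can be used. It is only a sketch: LazyInstance is copied from above as a nested class so the example is self-contained, while LazyInstanceCheck and countSupplierCalls are hypothetical names introduced for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyInstanceCheck {
    // Copy of the LazyInstance class from the post
    static class LazyInstance<T> {
        private final Supplier<T> instanceSupplier;
        private Supplier<T> instance = this::create;

        static <T> LazyInstance<T> of(final Supplier<T> instanceSupplier) {
            return new LazyInstance<>(instanceSupplier);
        }

        private LazyInstance(final Supplier<T> instanceSupplier) {
            this.instanceSupplier = instanceSupplier;
        }

        T get() {
            return instance.get();
        }

        private synchronized T create() {
            class InstanceFactory implements Supplier<T> {
                private final T instance = instanceSupplier.get();

                public T get() {
                    return instance;
                }
            }

            if (!InstanceFactory.class.isInstance(instance)) {
                instance = new InstanceFactory();
            }
            return instance.get();
        }
    }

    /** Calls get() from the given number of threads and returns how many times the supplier ran. */
    static int countSupplierCalls(final int threads) {
        final AtomicInteger calls = new AtomicInteger();
        final LazyInstance<Object> lazy = LazyInstance.of(() -> {
            calls.incrementAndGet();
            return new Object();
        });
        final CountDownLatch start = new CountDownLatch(1);
        final ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await(); // hold all threads until the latch is released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                lazy.get();
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls.get();
    }

    public static void main(final String[] args) {
        System.out.println("Supplier calls: " + countSupplierCalls(16)); // prints "Supplier calls: 1"
    }
}
```

Threads that still see the initial supplier queue up on the synchronized create() method, and only the first of them builds the InstanceFactory. The others find it already in place and just read the value.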

Wednesday, 10 February 2016

[Java8 / Spring / Test] How to test TransactionTemplate's execute() method ?

If your app uses the Spring framework you may be familiar with either @Transactional or TransactionTemplate. Although TransactionTemplate couples your app to Spring, a lot of people use it. In most cases it's injected into DAOs or some kind of AbstractDAO. A DAO is typically tested by an integration test that runs against an in-memory database, so in most cases you won't need unit tests here. But what if TransactionTemplate has been injected into some kind of service, transaction etc., in general a class which has to be unit tested ? There is one problem with TransactionTemplate: the execute() method takes a TransactionCallback as a parameter. This is how you would invoke it:

transactionTemplate.execute((s) -> propertyDao.persist(copyOf(toSave).withNewResourceId(accountId)));

If you mock TransactionTemplate then propertyDao.persist() will never be invoked. In my unit test PropertyDao is a mock so now I cannot use Mockito.verify() to check whether persist method has been invoked (it returns void).

private final PropertyDao propertyDao = mock(PropertyDao.class);

Let's see how execute() method has been implemented:

    @Override
    public <T> T execute(TransactionCallback<T> action) throws TransactionException {
        if (this.transactionManager instanceof CallbackPreferringPlatformTransactionManager) {
            return ((CallbackPreferringPlatformTransactionManager) this.transactionManager).execute(this, action);
        }
        else {
            TransactionStatus status = this.transactionManager.getTransaction(this);
            T result;
            try {
                result = action.doInTransaction(status);
            }
            catch (RuntimeException ex) {
                // Transactional code threw application exception -> rollback
                rollbackOnException(status, ex);
                throw ex;
            }
            catch (Error err) {
                // Transactional code threw error -> rollback
                rollbackOnException(status, err);
                throw err;
            }
            catch (Exception ex) {
                // Transactional code threw unexpected exception -> rollback
                rollbackOnException(status, ex);
                throw new UndeclaredThrowableException(ex, "TransactionCallback threw undeclared checked exception");
            }
            this.transactionManager.commit(status);
            return result;
        }
    }

The most important line:

result = action.doInTransaction(status);

It simply means that our function:

transactionTemplate.execute((s) -> propertyDao.persist(copyOf(toSave).withNewResourceId(accountId)));

is invoked inside that method, so when the template is mocked it won't run at all. How to deal with that ? My first thought was to use ArgumentCaptor to capture the parameter passed to the execute method and invoke it, but I think I found a better way.

    class FunctionCallingTransactionTemplate extends TransactionTemplate {
        @Override public <T> T execute(TransactionCallback<T> action) throws TransactionException {
            final TransactionStatus irrelevantStatus = null;
            return action.doInTransaction(irrelevantStatus);
        }
    }

In the code above I extend TransactionTemplate so that it only invokes the action passed to the execute() method, without the other stuff. I guess I'm gonna need this in many tests, so we can create a simple trait:

public interface FunctionCallingTransactionTemplateTrait {
    default TransactionTemplate functionCallingTransactionTemplate() {
        return new FunctionCallingTransactionTemplate();
    }

    class FunctionCallingTransactionTemplate extends TransactionTemplate {
        @Override public <T> T execute(TransactionCallback<T> action) throws TransactionException {
            final TransactionStatus irrelevantStatus = null;
            return action.doInTransaction(irrelevantStatus);
        }
    }
}

Now in my test I have:

public class SaveAccountAttributesTransactionTest implements FunctionCallingTransactionTemplateTrait {
    private final ArgumentCaptor<Property> propertyCaptor = ArgumentCaptor.forClass(Property.class);
    private final PropertyDao propertyDao = mock(PropertyDao.class);
    private final TransactionTemplate transactionTemplate = functionCallingTransactionTemplate();

    private final SaveAccountAttributesTransaction transaction = new SaveAccountAttributesTransaction(propertyDao, transactionTemplate);
    ...
}

And some test:

    @Test
    public void shouldUpdateOneValueAndPersistOther() throws Exception {
        // given
        when(propertyDao.fetchResourceProperties("root", ACCOUNT)).thenReturn(Lists.newArrayList(
                propertyOf("firstProp", "2.21", "root"),
                propertyOf("secondProp", null, null),
                propertyOf("thirdProp", null, null),
                propertyOf("fourthProp", null, null)
        ));
        SaveAccountAttributesEvent event = new SaveAccountAttributesEvent("root", Lists.newArrayList(
                propertyOf("firstProp", "2.22", "root"),
                propertyOf("secondProp", null, null),
                propertyOf("thirdProp", "1.11", "root"),
                propertyOf("fourthProp","default", null)
        ));

        // when
        transaction.execute(event);

        // then
        verify(propertyDao).updatePropertyValue(anyString(), eq("2.22"));
        verify(propertyDao).persist(propertyCaptor.capture());
        assertThat(propertyCaptor.getAllValues()).extracting(Property::getResourceId, Property::getValue)
                .containsOnly(tuple("root", "1.11"));
    }

And it passes :) As you can see, I verify the behaviour of propertyDao, which is invoked by our extended TransactionTemplate. Hope it helps.
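
The difference between a mocked template and the function-calling one can also be shown without Spring or Mockito. The sketch below uses hand-written stand-ins, so every type in it (Template, MockLikeTemplate, FunctionCallingTemplate, SaveService) is hypothetical and only mimics the shape of the real classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class CallbackStubSketch {
    /** Hypothetical stand-in for TransactionTemplate; Object plays the role of TransactionStatus. */
    interface Template {
        <T> T execute(Function<Object, T> action);
    }

    /** Behaves like a default mock: the callback is ignored and null is returned. */
    static class MockLikeTemplate implements Template {
        @Override public <T> T execute(final Function<Object, T> action) {
            return null;
        }
    }

    /** Same idea as FunctionCallingTransactionTemplate: just run the callback. */
    static class FunctionCallingTemplate implements Template {
        @Override public <T> T execute(final Function<Object, T> action) {
            final Object irrelevantStatus = null;
            return action.apply(irrelevantStatus);
        }
    }

    /** Hypothetical class under test; the list stands in for the mocked DAO. */
    static class SaveService {
        private final Template template;
        private final List<String> persisted;

        SaveService(final Template template, final List<String> persisted) {
            this.template = template;
            this.persisted = persisted;
        }

        void save(final String value) {
            template.execute(status -> persisted.add(value));
        }
    }

    public static void main(final String[] args) {
        final List<String> withMock = new ArrayList<>();
        new SaveService(new MockLikeTemplate(), withMock).save("1.11");
        System.out.println(withMock); // prints []

        final List<String> withStub = new ArrayList<>();
        new SaveService(new FunctionCallingTemplate(), withStub).save("1.11");
        System.out.println(withStub); // prints [1.11]
    }
}
```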

Monday, 8 February 2016

[Bash / Gawk] How to print how long do you have to stay at work today ?

Today I'm going to show you a short script I wrote while I was running integration tests. It simply shows how many minutes you still have to be at work.

#!/bin/bash
START_HOUR=$1
START_MINUTE=$2

if [ -z "$START_HOUR" ] || [ -z "$START_MINUTE" ]; then
    START_HOUR="9"
    START_MINUTE="45"
fi

date | gawk '{print $4}' | gawk -F":" '{print 8 * 60 - (($1 * 60 + $2) - ('"$START_HOUR"' * 60 + '"$START_MINUTE"'))}'

It's actually very simple.

START_HOUR=$1
START_MINUTE=$2

Here I create two variables which contain hour and minute I came to the office.

[ -z "$START_HOUR" ]

This condition checks whether a variable is empty. If I don't pass the time I came in, the script falls back to my default, which is 9:45. Then a very simple one-liner:

date | gawk '{print $4}' | gawk -F":" '{print 8 * 60 - (($1 * 60 + $2) - ('"$START_HOUR"' * 60 + '"$START_MINUTE"'))}'

date

prints current date -> Mon 8 Feb 15:12:09 CET 2016

date | gawk '{print $4}'

prints the fourth field (split by whitespace) -> 15:12:55

gawk -F":"

-F allows you to specify how you want the input string to be split. In this case it's ':', so $1 now contains the current hour and $2 the current minute.

'"$START_HOUR"'

this is how you can access shell variables in gawk. And then some simple math (note that I assume that working day == 8h):

'{print 8 * 60 - (($1 * 60 + $2) - ('"$START_HOUR"' * 60 + '"$START_MINUTE"'))}'

8 * 60 = working day (minutes)

($1 * 60 + $2) - ('"$START_HOUR"' * 60 + '"$START_MINUTE"')

current minute of day minus minute I came to the office
The result of the script is: 148 which means I can go home after 148 minutes :)
You can obviously pass what time you came to work:
./howLong 10 0 prints 162
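
For readers who would like to check the arithmetic outside the shell, here is the same formula as a small Java sketch. The class and method names are made up for this illustration, and the inputs use the 15:12 timestamp from the date example above rather than the moment the script was actually run.

```java
class MinutesLeftSketch {
    // Same assumption as in the script: a working day is 8 hours
    static final int WORKING_DAY_MINUTES = 8 * 60;

    /** Minutes still to work, given the current time and the time of arrival. */
    static int minutesLeft(final int nowHour, final int nowMinute,
                           final int startHour, final int startMinute) {
        final int minutesWorked = (nowHour * 60 + nowMinute) - (startHour * 60 + startMinute);
        return WORKING_DAY_MINUTES - minutesWorked;
    }

    public static void main(final String[] args) {
        System.out.println(minutesLeft(15, 12, 9, 45)); // prints 153
        System.out.println(minutesLeft(15, 12, 10, 0)); // prints 168
    }
}
```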