gt_dev: Scala

Showing posts with label Scala. Show all posts

Tuesday, 10 October 2017

[Scala / Scalatest / JUnit] How to run Scalatests with existing JUnit tests ?

The project I've been working on has originally been created in Java. I've added Scala one year ago so it became a cross-language project. Since then all new features have been coded in Scala but I didn't have that much time to add any scala testing framework so all the tests have been written in Scala + JUnit like this one:

import com.wiso.cw.externalsupplier.genba.auth.GenbaHeader._
import org.assertj.core.api.Assertions.assertThat
import org.assertj.core.data.MapEntry.entry
import org.junit.Test
import org.mockito.Mockito._

class GenbaHttpEntityCreatorTest {
  private val defaultHeadersCreator = mock(classOf[GenbaDefaultHeadersCreator])

  private val entityCreator = new GenbaHttpEntityCreator(defaultHeadersCreator)

  @Test
  def should_create_http_entity_with_headers(): Unit = {
    // when
    val entity = entityCreator.createHttpEntity("some content", Token -> "token",
                                                                AppId -> "our app id",
                                                                Accept -> "application/json")

    // then
    assertThat(entity.getBody) isEqualTo "some content"
    assertThat(entity.getHeaders).containsOnly(entry("token", "token"),
                                               entry("appId", "our app id"),
                                               entry("Accept", "application/json"))
  }
}

Recently I decided to add Scalatest but we have thousands of JUnit tests so the only option is to run both Scalatest and JUnit tests when the project is being built by gradle or intellij. At first I added gradle dependency:

testCompile group: 'org.scalatest', name: 'scalatest_2.12', version: '3.0.1'

Note that the artifact name is scalatest_2.12 which means it's gonna work with Scala 2.12 so if you use other version make sure to find appropriate version of scalatest.

Basically if you want to run scalatest like any other junit test just add JUnitRunner:

import org.junit.runner.RunWith
import org.scalatest.junit.JUnitRunner

@RunWith(classOf[JUnitRunner])

It works but you have to add @RunWith annotation and extend proper spec (see: http://www.scalatest.org/user_guide/selecting_a_style) in every single test. I'd rather prefer creating my own spec. I've created two specs: UnitSpec - for all unit tests and IntegrationSpec - for tests that run spring context. The spec should extend any scalatest spec you'd like to use (for me it's FlatSpec).

@RunWith(classOf[JUnitRunner])
abstract class UnitSpec extends FlatSpec

Now every test that extend UnitSpec uses FlatSpec and can be run like JUnit test. You can obviously add some stuff here like Matchers, MockitoSugar and so on. My UnitSpec looks like that:

@RunWith(classOf[JUnitRunner])
abstract class UnitSpec extends FlatSpec
  with Matchers
  with MockitoSugar
  with OptionValues
  with GivenWhenThen
  with OneInstancePerTest
  with BeforeAndAfter {
  def when[T](methodcall: T): OngoingStubbing[T] = Mockito.when(methodcall)

  def verify[T](methodcall: T): T = Mockito.verify(methodcall)

  def verify[T](methodcall: T, verificationMode: VerificationMode): T = Mockito.verify(methodcall, verificationMode)

  def spy[T](o: T): T = Mockito.spy(o)

  def verifyZeroInterations(mocks: Object*): Unit = Mockito.verifyZeroInteractions(mocks: _*)

  def timeout(milis: Long): VerificationWithTimeout = Mockito.timeout(milis)
}

Note that we still use mockito and I don't want to import Mockito.when, Mockito.verify and so on in tests so it's more conveniant to have those methods in base class.

If you know Scala you may ask if I could use trait instead of abstract class. I was also thinking about that but it seems that annotations are not being inherited from traits.. Consider the following code:

@RunWith(classOf[JUnitRunner])
trait Trait

@RunWith(classOf[JUnitRunner])
abstract class AbstractClass

class ChildTrait extends Trait

class ChildClass extends AbstractClass

object A extends App {
  new ChildTrait().getClass.getAnnotations.foreach(println)

  new ChildClass().getClass.getAnnotations.foreach(println)
}

The output:

Child trait:
 s PU3g! )b#D    9"AA Ue LG C     !$  =S:LGO   7A Q  )
Child class:
@org.junit.runner.RunWith(value=class org.scalatest.junit.JUnitRunner)
^"mCN"B
   ! A  j]&$F  ! y  )

It shows that @RunWith annotation has not been inherited by ChildTrait class so if ChildClass were a test it wouldn't be run with other tests!!! Here's some simple test:

class RetryTest extends UnitSpec {
  private val someService = mock[Service]

  it should "retry twice and return value" in {
    Given("some service returns result after two errors")
    when(someService.doSth()).thenThrow(new RuntimeException)
      .thenThrow(new RuntimeException)
      .thenReturn("done")

    When("running it with double retry")
    val result = retry(3) {
      someService.doSth()
    }

    Then("result should be returned")
    result shouldBe "done"
  }
}

Friday, 28 July 2017

[Scala / Threads / Parallel collections] How to configure number of threads on parallel collection ?

I've already written an article about controlling pool size while using Java's parallel stream. Recently I needed exactly the same feature in Scala. Scala collection can be turned into parallel collection by invoking par method (introduced by Parallelizable trait).

trait Parallelizable[+A, +ParRepr <: Parallel] extends Any

so assuming you don't break basic rules of functional programming (immutablility, statelessness etc.) you can make your algorithms much faster by adding .par. Unfortunately .par method (exactly like .parallelStream() in Java) doesn't take any parameter which could controll the size of a pool.

/** Returns a parallel implementation of this collection.
   *
   *  For most collection types, this method creates a new parallel collection by copying
   *  all the elements. For these collection, `par` takes linear time. Mutable collections
   *  in this category do not produce a mutable parallel collection that has the same
   *  underlying dataset, so changes in one collection will not be reflected in the other one.
   *
   *  Specific collections (e.g. `ParArray` or `mutable.ParHashMap`) override this default
   *  behaviour by creating a parallel collection which shares the same underlying dataset.
   *  For these collections, `par` takes constant or sublinear time.
   *
   *  All parallel collections return a reference to themselves.
   *
   *  @return  a parallel implementation of this collection
   */
  def par: ParRepr = {
    val cb = parCombiner
    for (x <- seq) cb += x
    cb.result()
  }

However you can set the number of threads using ForkJoinTaskSupport like that:

object ParExample extends App {
  val friends = List("rachel", "ross", "joey", "chandler", "pheebs", "monica").par
  friends.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(6))
}

Let's prove it works:

object ParExample extends App {
  val friends = List("rachel", "ross", "joey", "chandler", "pheebs", "monica", "gunther", "mike", "richard").par
  val stopwatch = Stopwatch.createStarted()
  friends.foreach(f => Thread.sleep(5000))
  println(stopwatch.elapsed(MILLISECONDS))
}

It prints 10025 so the pool must have been smaller than the list which contains 9 strings. Now I'll set the pool size:

object ParExample extends App {
  val friends = List("rachel", "ross", "joey", "chandler", "pheebs", "monica", "gunther", "mike", "richard").par
  friends.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(9))
  val stopwatch = Stopwatch.createStarted()
  friends.foreach(f => Thread.sleep(5000))
  println(stopwatch.elapsed(MILLISECONDS))
}

This time it took only 5024 milliseconds because each element of the list has been processed by separate thread. The solution is ok for me but I'd rather want to use concise one-liners so I've created implicit conversion between ParSeq and RichParSeq that adds parallelismLevel(numberOfThreads: Int) method.

class RichParSeq[T](val p: ParSeq[T]) {
  def parallelismLevel(numberOfThreads: Int): ParSeq[T] = {
    p.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(numberOfThreads))
    p
  }
}

object ConfigurableParallelism {
  implicit def parIterableToRichParIterable[T](p: ParSeq[T]): RichParSeq[T] = new RichParSeq[T](p)
}

Now you can use it like that:

import ConfigurableParallelism._

object ParExample extends App {
  val stopwatch = Stopwatch.createStarted()
  val friends = List("rachel", "ross", "joey", "chandler", "pheebs", "monica", "gunther", "mike", "richard").par
    .parallelismLevel(9)
    .foreach(f => Thread.sleep(5000))
  println(stopwatch.elapsed(MILLISECONDS))
}

It took 5231 milliseconds.

Wednesday, 12 April 2017

[Scala] How to convert dates implicitly ?

If you read any book or tutorial about scala you probably know what implicit keyword means. Basically you can create a method that will be called when compiler thinks it's a good idea to do so. Although I'm not a fan of implicit conversions (when you overuse it the code becomes less readable and clear) it's very useful to get rid of method calls in some cases.

When you create a DAO you probably extend some abstract data access object that provides entity manager, jdbc template or other object that connects your application with a database.

I've been using Jooq for 16 months now to construct SQL queries. It's quite nice tool that prevents from typical sql typos. See example below:

def findForUpdate(searchKey: DateTime, productId: String): Option[SearchHistoryReport] = {
    val sql = dslContext.select(SEARCH_HISTORY_REPORT.SEARCH_DATE_AND_HOUR,
      SEARCH_HISTORY_REPORT.PRODUCT_ID,
      SEARCH_HISTORY_REPORT.SEARCH_SCORE)
      .from(SEARCH_HISTORY_REPORT)
      .where(SEARCH_HISTORY_REPORT.SEARCH_DATE_AND_HOUR.equal(new Timestamp(searchKey.getMillis)))
      .and(SEARCH_HISTORY_REPORT.PRODUCT_ID.equal(productId))
      .forUpdate().getSQL(INLINED)
    Try({
      npjt.queryForObject(sql, noParams(), (rs: ResultSet, rowNum: Int) => SearchHistoryReport.builder()
        .searchDateAndHour(fromTimestamp(rs.getTimestamp(SEARCH_HISTORY_REPORT.SEARCH_DATE_AND_HOUR.toString)))
        .productId(rs.getString(SEARCH_HISTORY_REPORT.PRODUCT_ID.toString))
        .searchScore(rs.getBigDecimal(SEARCH_HISTORY_REPORT.SEARCH_SCORE.toString)).build())
    }).toOption
  }

Search date and hour is a datetime column in the database:

MariaDB [_censored]> describe search_history_report;
+----------------------+---------------+------+-----+---------+-------+
| Field                | Type          | Null | Key | Default | Extra |
+----------------------+---------------+------+-----+---------+-------+
| product_id           | varchar(255)  | NO   | PRI | NULL    |       |
| search_date_and_hour | datetime      | NO   | PRI | NULL    |       |
| search_score         | decimal(19,2) | NO   |     | NULL    |       |
+----------------------+---------------+------+-----+---------+-------+

Datetime can be compared to java.sql.Timestamp but all dates in my domain objects are Joda's DateTime.

Let's see another example:

val sql = dslContext.select(sum(SEARCH_HISTORY_REPORT.SEARCH_SCORE).as(SCORE_SUM_ALIAS), SEARCH_HISTORY_REPORT.PRODUCT_ID)
      .from(SEARCH_HISTORY_REPORT)
      .where(SEARCH_HISTORY_REPORT.PRODUCT_ID.in(productIds))
      .and(SEARCH_HISTORY_REPORT.SEARCH_DATE_AND_HOUR.between(new Timestamp(startDate.getMillis)).and(new Timestamp(endDate.getMillis)))
      .groupBy(SEARCH_HISTORY_REPORT.PRODUCT_ID).getSQL(INLINED)

I simply want to get rid of DateTime => Timestamp conversion.

Let's assume that my abstract dao looks like that:

trait Dao {
  def dslContext(): DSLContext = ???

  def jdbcTemplate(): NamedParameterJdbcTemplate = ???
}

It provides methods that return dslContext (jooq) and jdbcTemplate (spring).

I'd be really happy if all my daos could automatically convert org.joda.time.DateTime to java.sql.Timestamp.

Let's create implicit conversion between those types.

trait Dao {
  implicit def asTimestamp(date: DateTime): Timestamp = new Timestamp(date.getMillis)

  def dslContext(): DSLContext = ???

  def jdbcTemplate(): NamedParameterJdbcTemplate = ???
}

That's all. Now when I pass DateTime to method that takes Timestamp the compiler knows that implicit conversion can be used. It calls asTimestamp method that returns Timestamp behind the hood. Programmer doesn't have to remember that jooq likes timestamps only.

Query that uses implicit conversion looks like that:

val sql = dslContext.select(sum(SEARCH_HISTORY_REPORT.SEARCH_SCORE).as(SCORE_SUM_ALIAS), SEARCH_HISTORY_REPORT.PRODUCT_ID)
      .from(SEARCH_HISTORY_REPORT)
      .where(SEARCH_HISTORY_REPORT.PRODUCT_ID.in(productIds))
      .and(SEARCH_HISTORY_REPORT.SEARCH_DATE_AND_HOUR.between(startDate).and(endDate))
      .groupBy(SEARCH_HISTORY_REPORT.PRODUCT_ID).getSQL(INLINED)

I guess it's perfect use case when implicit conversion can be used. The code is very consise and readable and it's natural to pass DateTime so noone has to remember about this conversion.

Wednesday, 22 March 2017

[Scala] How to transform tuple to class instance ?

In scala TupleN contains N fields and this is basically all that tuple can do (except swapping elements in Tuple2). Although it's really simple data structure it's extremely useful especially when you do some collection's processing.

Imagine that users in your system make orders and you want to find all users with their orders.

object A extends App {
  List("some@email.com", "another@email.com", "and@other.com")
    .map(email => (email, findOrders(email)))

  private def findOrders(email: String): List[Order] = List() // call some dao here...
}

case class Account(email: String, orders: List[Order])

As a result we have collection of tuples that contain email and list of orders. We could have created Account object instead of tuple like that:

.map(email => Account(email, findOrders(email))

but let's say we want only users who made at least one order so we have to make additional filtering:

List("some@email.com", "another@email.com", "and@other.com")
    .map(email => (email, findOrders(email)))
    .filterNot(_._2.isEmpty)

In the end we want to return list of accounts which means the tuple has to be transformed into Account instance. The easiest way would be something like that:

List("some@email.com", "another@email.com", "and@other.com")
    .map(email => (email, findOrders(email)))
    .filterNot(_._2.isEmpty)
    .map(t => Account(t._1, t._2))

but it just doesn't feel right. You might have noticed that we used case class instead of class so tupled method can be invoked (it comes from Function2 trait).

/** Creates a tupled version of this function: instead of 2 arguments,
   *  it accepts a single [[scala.Tuple2]] argument.
   *
   *  @return   a function `f` such that `f((x1, x2)) == f(Tuple2(x1, x2)) == apply(x1, x2)`
   */
  @annotation.unspecialized def tupled: Tuple2[T1, T2] => R = {
    case Tuple2(x1, x2) => apply(x1, x2)
  }

As you see in scaladoc it allows to create object's instance from tuple so considering our Account we can either call Account(email, orders) or Account((email, orders)). And this is exaclty what we're looking for:

List("some@email.com", "another@email.com", "and@other.com")
    .map(email => (email, findOrders(email)))
    .filterNot(_._2.isEmpty)
    .map(Account.tupled)

Wednesday, 9 November 2016

[Scala / Java / Gradle] How to add Scala to Java project and use both ?

I've been developing Java projects for couple of years and I remember the day when I finally could use Java 8. I was pretty excited that I can abandon Guava's FluentIterable, command pattern, anonymous classes and so on but I shortly realized that it's not enough when you want to write concise functional code.

Although Java8 makes significant step forward it's nothing compared to Scala. I haven't heard about a company that decided to rewrite some huge Java project to Scala so far but fortunately Scala runs on JVM so you can use both.

Ok I have Java8 + Gradle project. Let's add Scala :)

In build gradle you need to add scala plugin:

apply plugin: 'scala'

And scala lang/compiler dependencies:

compile group: 'org.scala-lang', name: 'scala-library', version: scalaVersion
compile group: 'org.scala-lang', name: 'scala-compiler', version: scalaVersion

You should also make sure that .java and .scala files are being built together so:

sourceSets.main.scala.srcDir "src/main/java"
sourceSets.test.scala.srcDir "src/test/java"
sourceSets.main.java.srcDirs = []
sourceSets.test.java.srcDirs = []

The project I've been working on has multiple modules so I've added sourceSets and dependencies in subprojects section. Remember about installing Scala plugin in your IDE (in intellij it's called Scala).

That's all :)

Now when I'm trying to build the project I get the following output:

➜  cw git:(develop) git status
On branch develop
Your branch is up-to-date with 'origin/develop'.
nothing to commit, working directory clean
➜  cw git:(develop) gradle build
:compileJava UP-TO-DATE
:compileScala UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:jar UP-TO-DATE
:assemble UP-TO-DATE
:compileTestJava UP-TO-DATE
:compileTestScala UP-TO-DATE
:processTestResources UP-TO-DATE
:testClasses UP-TO-DATE
:test UP-TO-DATE
:check UP-TO-DATE
:build UP-TO-DATE
...

As you can see there's compileScala among others which in fact builds both .java and .scala files. It's because of our sourceSets. It allows you to use Scala classes in .java files and Java classes in .scala files.

You should also make sure that you chose proper version of Scala. I use 2.11.5. It works without any issues with Java 8.