Then comes the exciting moment when they realize that it can all be much, much faster, because there's also parallelStream. And then the problems come...
Very often, when I'm waiting for something, I look through the code and try to fix some crappy parts. I found this one yesterday:
```java
resource.setRegions(product.getRegions().parallelStream().map(Region::getName).collect(toList()));
```

Our database has something like ten regions. Let's see how long it takes to collect such items using stream and parallelStream.
```java
public static void main(String[] args) {
    // Ten regions, roughly what our database holds.
    final List<Region> regions = IntStream.range(0, 10)
            .mapToObj(i -> new Region("region:" + i))
            .collect(toList());

    useLabel("stream()").andLogPerformanceOf(() -> regions.stream()
            .map(Region::getName)
            .collect(toList()));

    useLabel("parallelStream()").andLogPerformanceOf(() -> regions.parallelStream()
            .map(Region::getName)
            .collect(toList()));
}
```

useLabel(...).andLogPerformanceOf(...) is just a simple wrapper that runs a piece of code and logs the time taken (I'll paste it at the end of the article). The first run shows:
```
stream() started
stream() completed. Time elapsed = 1 millis
parallelStream() started
parallelStream() completed. Time elapsed = 9 millis
```

And some more results:

Stream (ms) | Parallel stream (ms) |
---|---|
4 | 10 |
2 | 3 |
2 | 14 |
2 | 18 |
1 | 6 |
3 | 17 |
2 | 8 |
5 | 7 |
2 | 14 |
1 | 8 |

Let's make the input list bigger.

Stream (ms) | Parallel stream (ms) |
---|---|
2 | 12 |
2 | 11 |
1 | 5 |
2 | 7 |
2 | 8 |
1 | 6 |
3 | 6 |
2 | 6 |
4 | 9 |
6 | 18 |

Stream (ms) | Parallel stream (ms) |
---|---|
9 | 14 |
2 | 20 |
2 | 7 |
9 | 23 |
3 | 9 |
2 | 20 |
3 | 6 |
2 | 5 |
3 | 9 |
3 | 5 |

Stream (ms) | Parallel stream (ms) |
---|---|
8 | 7 |
6 | 23 |
12 | 9 |
5 | 10 |
6 | 9 |
16 | 19 |
7 | 9 |
14 | 14 |
11 | 22 |
20 | 18 |

Stream (ms) | Parallel stream (ms) |
---|---|
1423 | 65 |
1715 | 91 |
1244 | 63 |
1345 | 68 |
1458 | 91 |
1479 | 65 |
1415 | 48 |
1584 | 87 |
1425 | 61 |
1506 | 73 |

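If you want to repeat this kind of measurement yourself, here is a minimal sketch that needs nothing beyond the JDK. The `Region` class below is only a stand-in for ours, and the input sizes are picked for illustration, not the ones behind the tables above. Single-shot timings like these are noisy (JIT warm-up, GC), so treat the numbers as a rough hint; a harness such as JMH would give more trustworthy results.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StreamVsParallel {

    /** Hypothetical stand-in for the Region class used in the article. */
    static final class Region {
        private final String name;
        Region(String name) { this.name = name; }
        String getName() { return name; }
    }

    /** Builds a list of the requested size, the same way the article's main() does. */
    static List<Region> regions(int size) {
        return IntStream.range(0, size)
                .mapToObj(i -> new Region("region:" + i))
                .collect(Collectors.toList());
    }

    static List<String> namesSequential(List<Region> regions) {
        return regions.stream().map(Region::getName).collect(Collectors.toList());
    }

    static List<String> namesParallel(List<Region> regions) {
        return regions.parallelStream().map(Region::getName).collect(Collectors.toList());
    }

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long millis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Sizes are arbitrary assumptions chosen for this sketch.
        for (int size : new int[] {10, 1_000, 100_000, 1_000_000}) {
            List<Region> input = regions(size);
            long sequential = millis(() -> namesSequential(input));
            long parallel = millis(() -> namesParallel(input));
            System.out.println(size + " elements: stream=" + sequential
                    + " ms, parallelStream=" + parallel + " ms");
        }
    }
}
```

Both variants return the names in the same order, so the only thing that changes between them is the time taken.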
Let's get back to the main question: should I always use a parallel stream instead of a stream?
Definitely not.
You should consider the parallel version:
- when you work with huge collections
- when computing a single element takes a long time
You've seen an example that transforms a huge collection. You can find another one, which processes a collection where computing a single element takes a long time, in my post here: How to control pool size while using parallel stream.
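As a rough illustration of that second case, here is a small sketch where every element is slow to compute. The 20 ms `Thread.sleep` is just a hypothetical placeholder for real work (a remote call, a heavy calculation), and `slowSquare` is a made-up name. How much the parallel run gains depends on how many threads the common pool has on your machine.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SlowElementDemo {

    /** Hypothetical expensive step: pretends each element needs 20 ms of work. */
    static int slowSquare(int n) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return n * n;
    }

    /** Squares 0..count-1, either sequentially or in parallel. */
    static List<Integer> squares(boolean parallel, int count) {
        IntStream source = IntStream.range(0, count);
        if (parallel) {
            source = source.parallel();
        }
        return source.map(SlowElementDemo::slowSquare)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (boolean parallel : new boolean[] {false, true}) {
            long start = System.nanoTime();
            List<Integer> result = squares(parallel, 16);
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            System.out.println((parallel ? "parallelStream" : "stream")
                    + ": " + elapsed + " ms for " + result.size() + " elements");
        }
    }
}
```

Here the collection is tiny, yet the parallel version can still pay off, because the cost sits in each element and not in the size of the list.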
You should also remember that if you want to make your code parallel, IT HAS TO BE immutable: no shared mutable state and no side effects in the lambdas you pass to the stream. I strongly recommend reading about functional programming principles.
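To make the point about mutable state concrete, here is a hedged sketch with invented names (`unsafeDoubles`, `safeDoubles`). The first method lets several threads write to one plain `ArrayList`, which is not thread-safe; the second keeps the lambda free of side effects and lets the collector assemble the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SideEffectDemo {

    /** Unsafe: several threads call add() on the same non-thread-safe ArrayList. */
    static List<Integer> unsafeDoubles(int count) {
        List<Integer> out = new ArrayList<>();
        IntStream.range(0, count).parallel().forEach(i -> out.add(i * 2));
        return out;
    }

    /** Safe: the lambda touches no shared state; the collector combines partial results. */
    static List<Integer> safeDoubles(int count) {
        return IntStream.range(0, count).parallel()
                .map(i -> i * 2)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int count = 100_000;
        try {
            // Depending on timing this may lose elements or fail with an exception.
            System.out.println("unsafe size = " + unsafeDoubles(count).size());
        } catch (RuntimeException e) {
            System.out.println("unsafe version failed: " + e);
        }
        System.out.println("safe size   = " + safeDoubles(count).size());
    }
}
```

The unsafe variant may even look fine on some runs, which is exactly what makes this kind of bug so annoying to track down.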
That's all. I promised to paste the tool that logs performance, so here you are:
```java
import static java.util.concurrent.TimeUnit.MILLISECONDS;

import java.util.function.Supplier;

import com.google.common.base.Stopwatch; // Guava

/**
 * @author Grzegorz Taramina
 * Created on: 13/06/16
 */
// Logging is a project-specific interface (not shown here) that provides perfLog().
public class PerformanceLoggingBlock implements Logging {
    private final String label;

    public static PerformanceLoggingBlock useLabel(final String label) {
        return new PerformanceLoggingBlock(label);
    }

    private PerformanceLoggingBlock(final String label) {
        this.label = label;
    }

    // Times a block that returns nothing and reports through the performance logger.
    public void andLogPerformanceOf(final Runnable runnable) {
        perfLog().info(label + " started");
        Stopwatch stopwatch = Stopwatch.createStarted();
        runnable.run();
        perfLog().info(label + " completed. Time elapsed = " + stopwatch.elapsed(MILLISECONDS) + " millis");
    }

    // Times a block that returns a value, prints to stdout and passes the result through.
    public <T> T andLogPerformanceOf(final Supplier<T> supplier) {
        System.out.println(label + " started");
        Stopwatch stopwatch = Stopwatch.createStarted();
        T result = supplier.get();
        System.out.println(label + " completed. Time elapsed = " + stopwatch.elapsed(MILLISECONDS) + " millis");
        return result;
    }
}
```
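The class above depends on Guava's `Stopwatch` and on a project-specific `Logging` interface. If you just want to try the experiment without those, a JDK-only variant could look roughly like this. `SimplePerformanceLog` is a name made up for this sketch, and it only covers the `Supplier` overload.

```java
import java.util.function.Supplier;
import java.util.stream.IntStream;

class SimplePerformanceLog {
    private final String label;

    private SimplePerformanceLog(String label) {
        this.label = label;
    }

    static SimplePerformanceLog useLabel(String label) {
        return new SimplePerformanceLog(label);
    }

    /** Runs the supplier once, prints the elapsed time and returns its result. */
    <T> T andLogPerformanceOf(Supplier<T> supplier) {
        System.out.println(label + " started");
        long start = System.nanoTime();
        T result = supplier.get();
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(label + " completed. Time elapsed = " + millis + " millis");
        return result;
    }

    public static void main(String[] args) {
        int sum = useLabel("sum").andLogPerformanceOf(
                () -> IntStream.rangeClosed(1, 100).sum());
        System.out.println("result = " + sum);
    }
}
```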