Joan and Andrei both provided alternative solutions to this problem. I wondered which of our solutions was the best in the worst-case, so I created a Java Microbenchmark Harness (JMH) example which runs each of these pieces of code to time them. So I present...
A mini-tutorial on using the Java Microbenchmark Harness to benchmark Java code
Setup

Use Maven to set up a simple JMH project. My package is daily, so I use that in place of <org.sample>, and I set the artifactId to 001 because this is the first Daily Coding Problem I'm solving in this fashion. This makes a directory called 001 with the following structure:
$ tree 001
001
├── pom.xml
└── src
└── main
└── java
└── daily
└── MyBenchmark.java
4 directories, 2 files
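For reference, the standard JMH archetype invocation that produces a layout like the one above looks something like this (the groupId and artifactId are the ones described above; the version string is just a placeholder):

```shell
# generate a skeleton JMH project from the official archetype
mvn archetype:generate \
  -DinteractiveMode=false \
  -DarchetypeGroupId=org.openjdk.jmh \
  -DarchetypeArtifactId=jmh-java-benchmark-archetype \
  -DgroupId=daily \
  -DartifactId=001 \
  -Dversion=1.0
```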
The contents of MyBenchmark.java are as follows:
/*
 * ...Oracle copyright...
 */
package daily;

import org.openjdk.jmh.annotations.Benchmark;

public class MyBenchmark {

    @Benchmark
    public void testMethod() {
        // This is a demo/sample template for building your JMH benchmarks.
        // Edit as needed.
        // Put your benchmark code here.
    }
}
...but we're going to delete all of this and write our own code.
Benchmarking
Christophe Schreiber has a good example of using JMH on Dev.To. We need to send parameters to our methods, though. In the worst-case scenario, we'll have a very long array in which no two numbers add to k. These numbers should also all be unique so that the compiler can't do any optimisations and so that we need to continually add them to Andrei's Set. I will be using this file on GitHub by Bruno Doolaeghe as a template:
package daily;

import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Fork(1)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class MyBenchmark {

    @State(Scope.Benchmark)
    public static class Setup {

        @Param({"10", "15", "20", "25", "30", "40", "60", "80", "100", "200",
                "400", "800", "1000", "10000", "100000", "1000000", "10000000"})
        int N;

        // create a List of numbers 1...length
        List<Integer> list = IntStream.rangeClosed(1, N).boxed().collect(Collectors.toList());

        // always larger than the largest possible sum of two elements
        int k = Integer.MAX_VALUE;

        // shuffle the array before returning it to the user
        int[] get_given() {
            // shuffle the List
            Collections.shuffle(list);

            // convert the List to an array
            int[] given = list.stream().mapToInt(i -> i).toArray();
            return given;
        }

        // return k
        int get_k() {
            return k;
        }
    }

    //--------------------------------------------------------------------------
    //
    //  TESTS
    //
    //--------------------------------------------------------------------------

    @Benchmark
    public boolean testAndrew(Setup d) {
        // setup
        int[] given = d.get_given();
        int k = d.get_k();

        // algorithm
        int len = given.length;
        for (int outer = 0; outer < (len - 1); ++outer)
            for (int inner = outer + 1; inner < len; ++inner)
                if (given[outer] + given[inner] == k)
                    return true;
        return false;
    }

    @Benchmark
    public boolean testAndrei(Setup d) {
        // setup
        int[] given = d.get_given();
        int k = d.get_k();

        // algorithm
        Set<Integer> set = new TreeSet<Integer>();
        for (int number : given) {
            if (set.contains(number))
                return true;
            set.add(k - number);
        }
        return false;
    }

    @Benchmark
    public boolean testJoan(Setup d) {
        // setup
        int[] given = d.get_given();
        int k = d.get_k();

        // algorithm
        Arrays.sort(given);
        int i = 0;
        int j = given.length - 1;
        while (i < given.length && j >= 0) {
            int sum = given[i] + given[j];
            if (sum == k)
                return true;
            else if (sum > k)
                --j;
            else
                ++i;
        }
        return false;
    }
}
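To build and run the benchmarks, assuming the standard pom.xml generated by the JMH archetype (which packages everything into a self-contained target/benchmarks.jar):

```shell
# build the benchmark uber-jar, then run every @Benchmark in it
mvn clean install
java -jar target/benchmarks.jar
```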
All code is available on my GitHub. The results of the benchmarking are below. Please note that I don't use JMH very often so this may not be the optimal way to lay out this benchmark. If you have any suggestions, be sure to let me (politely) know :)
Even in the worst-case scenario with a 10-million element array, I see no difference between the three methods in terms of time. Because of the uncertainties on the three benchmarks, they could all be the same (they could all be 0.053 us/op). That's anticlimactic!

Follow-up with arrays with more elements: does the double-for win?

Interesting stats, though. But could it be that the N-squared approach never really got to its worst-case scenario by not iterating through all the elements?
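One way to sanity-check that question is to count how many pairs the double-for loop actually inspects on a no-solution input. This standalone sketch (the class and method names are mine, not part of the benchmark) uses the same loop shape as testAndrew, with a counter added:

```java
import java.util.stream.IntStream;

public class WorstCaseCheck {

    // count how many pairs the double-for loop inspects before
    // returning; it exits early as soon as a pair sums to k
    static long countComparisons(int[] given, int k) {
        long comparisons = 0;
        for (int outer = 0; outer < given.length - 1; ++outer) {
            for (int inner = outer + 1; inner < given.length; ++inner) {
                ++comparisons;
                if (given[outer] + given[inner] == k) {
                    return comparisons; // early exit on a hit
                }
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] given = IntStream.rangeClosed(1, n).toArray();
        long c = countComparisons(given, Integer.MAX_VALUE);
        // with k = Integer.MAX_VALUE no pair can match,
        // so all n*(n-1)/2 pairs are inspected
        System.out.println(c == (long) n * (n - 1) / 2); // prints "true"
    }
}
```

If the printed check holds, the double-for really does visit every pair when k is unreachable, so any suspiciously flat timings would have to come from somewhere other than an early exit.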