How do you write a code in C# to split the list [a, b, c, d, e, f, g, h]
into sublists [a, b, c], [d, e, f], [g, h]
each with three elements and the last with less than three? The most popular answer is the following LINQ.
List<List<T>> Split<T>(this IList<T> source, int length)
{
return source
.Select((x, i) => new { Index = i, Value = x })
.GroupBy(x => x.Index / length)
.Select(x => x.Select(v => v.Value).ToList())
.ToList();
}
In StackOverflow, this LINQ got more than 900 upvotes as of the answer to two questions (Split a List into smaller lists of N size [duplicate], Split List into Sublists with LINQ).
However, this method has the worst performance in any assumed answers. It creates the same number of objects with Index and Value as the length of the source. Object creation is a costly operation in terms of both memory and speed. Although the cost is much smaller, the exact count of divisions and comparisons as the length of the source is also not good for performance.
If you need performance, you should not use LINQ in the first place, but if you insist on a concise LINQ, how about the following solution.
List<List<T>> Split<T>(this IList<T> source, int length)
{
return Enumerable
.Range(0, (source.Count + length - 1) / length)
.Select(n => source.Skip(n * length).Take(length).ToList())
.ToList();
}
The following are the benchmark results taken from BenchmarkDotNet, which splits a 1000-length array of ints into three pieces each. The first answer is Splitter1, and the above is Splitter2. Mean is the average, and the rests are measurement errors. The results show Splitter2 was more than 3.5 times faster than Splitter1.
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1766 (21H2)
AMD Ryzen 7 3800X, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.106
[Host] : .NET 6.0.6 (6.0.622.26707), X64 RyuJIT
LongRun : .NET 6.0.6 (6.0.622.26707), X64 RyuJIT
Job=LongRun IterationCount=100 LaunchCount=3
WarmupCount=15
Method | Mean | Error | StdDev |
---|---|---|---|
Splitter1 | 89.94 μs | 0.351 μs | 1.779 μs |
Splitter2 | 24.04 μs | 0.100 μs | 0.517 μs |
If the target framework of your projects is .NET 6 or later, you should use the Chunk method for this operation, which is introduced in .NET 6. The benchmark results, including the Chunk method, are shown below. It was more than 4,000 times faster than Splitter2.
Method | Mean | Error | StdDev |
---|---|---|---|
Splitter1 | 88,650.102 ns | 245.2557 ns | 1,251.8625 ns |
Splitter2 | 23,481.503 ns | 117.8934 ns | 600.6975 ns |
Chunk | 5.609 ns | 0.0198 ns | 0.0984 ns |
Top comments (0)