Harmful collection transformations. Part 3: collections

Published on Saturday, January 8, 2022

Having started with strings in the first post, we continue to study collection transformations and how they affect our applications.

Collections

.NET contains hundreds of different collections for any purpose, but developers seem to be so used to List<T> and arrays that they convert anything into an array just in case. In most cases this is unnecessary. Sometimes, as we found out earlier, it is harmful. Let's look at some examples:

Count elements

Sometimes we don't need the elements themselves, only their count. Yet we can find code like this:

return certificates.Where(x => certificateStatesByFormId.SafeGet(x.Id) == Form.CertificateState.Revoked || x.ValidToTicks <= nowTicks - 1)
   .ToArray().Length;

Or another example:

public int GetTotalAmount(Guid managerId)
{
   return filterSettingsHandler
       .Select<ClientFilterDbo>(managerId)
       .Where(IsVisible)
       .ToArray().Length;
}

Creating an array just to count elements is unnecessary. LINQ provides a Count() method that does what we need without the harmful overhead, and it accepts the predicate directly. We can easily rewrite both examples:

return certificates.Count(x => certificateStatesByFormId.SafeGet(x.Id) == Form.CertificateState.Revoked || x.ValidToTicks <= nowTicks - 1);

public int GetTotalAmount(Guid managerId)
{
   return filterSettingsHandler
       .Select<ClientFilterDbo>(managerId)
       .Count(IsVisible);
}

We get cleaner and more readable code. Besides that, we get some performance improvements.
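
The equivalence of the two forms can be checked with a minimal sketch. The sample data and variable names below are hypothetical, not taken from the original code base, and the snippet uses top-level statements, so it assumes C# 9 or later:

```csharp
using System;
using System.Linq;

// Hypothetical sample data, not from the article's code base.
string[] items = { "0", "1", "2", "1", "3" };

// Materializes a temporary array only to read its length.
int viaArray = items.Where(s => s != "1").ToArray().Length;

// Counts matching elements while enumerating; no intermediate array is created.
int viaCount = items.Count(s => s != "1");

Console.WriteLine($"{viaArray} == {viaCount}"); // both are 3
```

Both expressions return the same number; only the second one avoids the intermediate array.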

Benchmark code:
    namespace ToCharArrayBenchmark
    {
        using System.Linq;
        using BenchmarkDotNet.Attributes;
        using BenchmarkDotNet.Jobs;
    
        [HtmlExporter]
        [RPlotExporter]
        [SimpleJob(RuntimeMoniker.Net48, baseline: true)]
        [SimpleJob(RuntimeMoniker.NetCoreApp31)]
        [SimpleJob(RuntimeMoniker.Net50)]
        [MemoryDiagnoser]
        public class CollectionsCountBenchmark
        {
            [Params(10, 20, 50, 100, 500)]
            public int N { get; set; }
            
            private string[] array;
            
            [GlobalSetup]
            public void SetUp()
            {
                array = new string[N];
                for (int i = 0; i < N; i++)
                {
                    array[i] = i.ToString();
                }
            }
    
            [Benchmark(Baseline = true)]
            public int Count()
            {
                return array.Where(o => o != "1").Count();
            }
            
            [Benchmark]
            public int ArrayLength()
            {
                return array.Where(o => o != "1").ToArray().Length;
            }
        }
    }
    
Full results in table format:

    BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19043.1348 (21H1/May2021Update)
    AMD Ryzen 7 4700U with Radeon Graphics, 1 CPU, 8 logical and 8 physical cores
      [Host]             : .NET Framework 4.8 (4.8.4420.0), X64 RyuJIT
      .NET 5.0           : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
      .NET Core 3.1      : .NET Core 3.1.20 (CoreCLR 4.700.21.47003, CoreFX 4.700.21.47101), X64 RyuJIT
      .NET Framework 4.8 : .NET Framework 4.8 (4.8.4420.0), X64 RyuJIT
    
    
    
    | Method | Runtime | N | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Allocated |
    |---|---|---|---|---|---|---|---|---|---|
    | Count | .NET 5.0 | 10 | 106.4 ns | 1.78 ns | 1.67 ns | 0.70 | 0.02 | 0.0229 | 48 B |
    | ArrayLength | .NET 5.0 | 10 | 263.5 ns | 3.93 ns | 3.28 ns | 1.72 | 0.04 | 0.1798 | 376 B |
    | Count | .NET Core 3.1 | 10 | 108.2 ns | 1.66 ns | 1.39 ns | 0.71 | 0.02 | 0.0229 | 48 B |
    | ArrayLength | .NET Core 3.1 | 10 | 286.0 ns | 3.52 ns | 3.12 ns | 1.88 | 0.03 | 0.1798 | 376 B |
    | Count | .NET Framework 4.8 | 10 | 151.7 ns | 2.52 ns | 2.90 ns | 1.00 | 0.00 | 0.0267 | 56 B |
    | ArrayLength | .NET Framework 4.8 | 10 | 361.4 ns | 6.58 ns | 9.65 ns | 2.39 | 0.10 | 0.2141 | 449 B |
    | Count | .NET 5.0 | 20 | 139.5 ns | 0.79 ns | 0.74 ns | 0.61 | 0.01 | 0.0229 | 48 B |
    | ArrayLength | .NET 5.0 | 20 | 378.4 ns | 7.24 ns | 6.77 ns | 1.67 | 0.04 | 0.2713 | 568 B |
    | Count | .NET Core 3.1 | 20 | 136.5 ns | 0.63 ns | 0.59 ns | 0.60 | 0.01 | 0.0229 | 48 B |
    | ArrayLength | .NET Core 3.1 | 20 | 419.6 ns | 4.89 ns | 4.57 ns | 1.85 | 0.03 | 0.2713 | 568 B |
    | Count | .NET Framework 4.8 | 20 | 227.0 ns | 3.01 ns | 2.82 ns | 1.00 | 0.00 | 0.0267 | 56 B |
    | ArrayLength | .NET Framework 4.8 | 20 | 595.2 ns | 6.18 ns | 5.48 ns | 2.63 | 0.03 | 0.3862 | 810 B |
    | Count | .NET 5.0 | 50 | 253.4 ns | 0.99 ns | 0.83 ns | 0.59 | 0.00 | 0.0229 | 48 B |
    | ArrayLength | .NET 5.0 | 50 | 708.7 ns | 8.07 ns | 7.55 ns | 1.66 | 0.02 | 0.5121 | 1,072 B |
    | Count | .NET Core 3.1 | 50 | 248.0 ns | 2.19 ns | 2.05 ns | 0.58 | 0.00 | 0.0229 | 48 B |
    | ArrayLength | .NET Core 3.1 | 50 | 763.3 ns | 12.10 ns | 10.72 ns | 1.79 | 0.03 | 0.5121 | 1,072 B |
    | Count | .NET Framework 4.8 | 50 | 426.5 ns | 2.19 ns | 1.94 ns | 1.00 | 0.00 | 0.0267 | 56 B |
    | ArrayLength | .NET Framework 4.8 | 50 | 1,334.4 ns | 26.38 ns | 25.91 ns | 3.12 | 0.06 | 0.7572 | 1,589 B |
    | Count | .NET 5.0 | 100 | 458.0 ns | 2.64 ns | 2.34 ns | 0.60 | 0.00 | 0.0229 | 48 B |
    | ArrayLength | .NET 5.0 | 100 | 1,185.5 ns | 11.97 ns | 11.20 ns | 1.55 | 0.02 | 0.9060 | 1,896 B |
    | Count | .NET Core 3.1 | 100 | 414.5 ns | 1.15 ns | 0.96 ns | 0.54 | 0.00 | 0.0229 | 48 B |
    | ArrayLength | .NET Core 3.1 | 100 | 1,236.0 ns | 23.07 ns | 22.65 ns | 1.62 | 0.04 | 0.9060 | 1,896 B |
    | Count | .NET Framework 4.8 | 100 | 765.6 ns | 4.88 ns | 4.32 ns | 1.00 | 0.00 | 0.0267 | 56 B |
    | ArrayLength | .NET Framework 4.8 | 100 | 2,521.6 ns | 48.92 ns | 61.87 ns | 3.30 | 0.09 | 1.4496 | 3,041 B |
    | Count | .NET 5.0 | 500 | 2,010.3 ns | 7.45 ns | 6.97 ns | 0.61 | 0.02 | 0.0229 | 48 B |
    | ArrayLength | .NET 5.0 | 500 | 4,572.5 ns | 86.28 ns | 80.71 ns | 1.38 | 0.05 | 4.0283 | 8,432 B |
    | Count | .NET Core 3.1 | 500 | 1,893.9 ns | 9.48 ns | 8.40 ns | 0.57 | 0.02 | 0.0229 | 48 B |
    | ArrayLength | .NET Core 3.1 | 500 | 4,395.2 ns | 86.59 ns | 88.92 ns | 1.33 | 0.05 | 4.0283 | 8,432 B |
    | Count | .NET Framework 4.8 | 500 | 3,298.3 ns | 64.73 ns | 79.49 ns | 1.00 | 0.00 | 0.0267 | 56 B |
    | ArrayLength | .NET Framework 4.8 | 500 | 11,853.8 ns | 235.81 ns | 280.71 ns | 3.59 | 0.12 | 5.9204 | 12,470 B |
    // * Legends *
      N         : Value of the 'N' parameter
      Mean      : Arithmetic mean of all measurements
      Error     : Half of 99.9% confidence interval
      StdDev    : Standard deviation of all measurements
      Ratio     : Mean of the ratio distribution ([Current]/[Baseline])
      RatioSD   : Standard deviation of the ratio distribution ([Current]/[Baseline])
      Gen 0     : GC Generation 0 collects per 1000 operations
      Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
      1 us      : 1 Microsecond (0.000001 sec)
    

    The following diagram shows the execution time of both methods (in microseconds) on .NET 5:

    Time diagram

    As we can see, converting to an array is about 2.5 times slower even on .NET 5, the most optimized runtime of the three. On the other runtimes the gap grows to about 3 times (.NET Core 3.1) and about 3.6 times (.NET Framework 4.8, N = 500).

    And if this is not enough, let's look at memory allocations (in bytes):

    Allocated memory diagram

    Count() allocates a constant 48 bytes (56 bytes on .NET Framework 4.8) regardless of the size of the original collection, so it puts no additional pressure on the GC. The ToArray() version allocates more as the collection grows: from 376 bytes at N = 10 to 8,432 bytes at N = 500 on .NET 5.

    We can confidently say that creating an array to count elements is a harmful transformation.

    ToArray() with LINQ

    A very common pattern is to call ToArray() (just in case) before LINQ methods:

    var correctPaymentsByBillId = correctPayments
       .GroupBy(x => x.BillId)
       .ToDictionary(x => x.Key, x => x.ToArray().Last());
    

    Here GroupBy returns a sequence of IGrouping<TKey, TElement> groups, and each group in turn implements IEnumerable<TElement>. Last() can therefore be called on the group directly, and we can simply remove ToArray():

    var correctPaymentsByBillId = correctPayments
       .GroupBy(x => x.BillId)
       .ToDictionary(x => x.Key, x => x.Last());
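
A self-contained sketch of the same idea follows. The payment type is not shown in the article, so hypothetical (BillId, Amount) tuples stand in for it; the snippet uses top-level statements and assumes C# 9 or later:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical payments as (BillId, Amount) tuples; the real type is not shown in the article.
var payments = new List<(int BillId, decimal Amount)>
{
    (1, 10m), (2, 20m), (1, 15m), (2, 25m), (3, 30m),
};

// Each group is itself an IEnumerable of its elements, so Last() works on it directly.
var lastPaymentByBillId = payments
    .GroupBy(p => p.BillId)
    .ToDictionary(g => g.Key, g => g.Last());

foreach (var pair in lastPaymentByBillId)
{
    Console.WriteLine($"Bill {pair.Key}: {pair.Value.Amount}");
}
```

For in-memory sequences GroupBy keeps the elements of each group in source order, so Last() returns the most recently listed payment for each bill.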
    

    An array is often used as the return type of methods like this:

    private OrderPackage[] ReadRootPackages(params Guid[] packageIds)
    {
        var rootPackages = new Dictionary<Guid, OrderPackage>();
        //...
        return rootPackages.Values.ToArray();
    }
    

    But the caller does not need an array or any other specific collection:

    public OrderPackageDto[] SelectTariffForestByPackageIds(Guid[] packageIds)
    {
    	var rootOrderPackages = ReadRootPackages(packageIds);
    	var masterPackageIds = rootOrderPackages.Select(package => package.Id).ToArray();
    	...
    }
    

    As we can see, ReadRootPackages can return ICollection<OrderPackage> or IEnumerable<OrderPackage>, and the ToArray() call is redundant:

    private IEnumerable<OrderPackage> ReadRootPackages(params Guid[] packageIds)
    {
        var rootPackages = new Dictionary<Guid, OrderPackage>();
        //...
        return rootPackages.Values;
    }
    

    The most obvious example is declaring a method's return type as IEnumerable<T> but returning an array:

    public IEnumerable<Division> GetDivisions(HashSet<SupportUserAccessRight> accessRights)
    {
       return divisions.Where(d => d.AccessRights.Intersect(accessRights).Any()).OrderBy(d => d.Priority).ToArray();
    }
    

    As in the previous example, GetDivisions is used in a chain of LINQ methods, so we can remove the ToArray() call:

    public IEnumerable<Division> GetDivisions(HashSet<SupportUserAccessRight> accessRights)
    {
       return divisions.Where(d => d.AccessRights.Intersect(accessRights).Any()).OrderBy(d => d.Priority);
    }
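
One thing to keep in mind when dropping ToArray() from a returned sequence: the result becomes a deferred query that re-runs on every enumeration and reflects later changes to the source. A minimal sketch with hypothetical data (top-level statements, so it assumes C# 9 or later):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var source = new List<int> { 3, 1, 2 };
int predicateCalls = 0;

// Deferred: nothing runs until the sequence is enumerated.
IEnumerable<int> lazy = source.Where(x => { predicateCalls++; return x > 1; });

// Eager snapshot: the query runs once, right here.
int[] snapshot = lazy.ToArray();
int callsAfterSnapshot = predicateCalls;

source.Add(5);

int lazyCount = lazy.Count();        // re-runs the filter and sees the new element
int snapshotCount = snapshot.Length; // unaffected by the later Add

Console.WriteLine($"{callsAfterSnapshot} {lazyCount} {snapshotCount} {predicateCalls}");
```

If the caller enumerates the result once inside a larger LINQ chain, as in the examples above, the deferred version does strictly less work. A snapshot is only worth its cost when the result is enumerated repeatedly or must not change with the source.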
    

    Usually, you don't need an intermediate collection between LINQ methods, because LINQ is evaluated lazily. How does ToArray() affect performance for different source collections? Let's compare two versions: with ToArray() and without it.

Benchmark code:
    namespace ToCharArrayBenchmark
    {
        using System.Collections.Generic;
        using System.Linq;
        using BenchmarkDotNet.Attributes;
        using BenchmarkDotNet.Engines;
        using BenchmarkDotNet.Jobs;
    
        [HtmlExporter]
        [RPlotExporter]
        [SimpleJob(RuntimeMoniker.Net48, baseline: true)]
        [SimpleJob(RuntimeMoniker.NetCoreApp31)]
        [SimpleJob(RuntimeMoniker.Net50)]
        [MemoryDiagnoser]
        public class CollectionToArrayBenchmark
        {
            [Params(10, 20, 50, 100, 500)]
            public int N { get; set; }
    
            private List<string> list;
            private string[] array;
            private HashSet<string> hashSet;
            private Dictionary<long, string> dictionary;
            private readonly Consumer consumer = new Consumer();
    
            [GlobalSetup]
            public void SetUp()
            {
                array = new string[N];
                list = new List<string>(N);
                dictionary = new Dictionary<long, string>(N);
                hashSet = new HashSet<string>(N);
    
                for (int i = 0; i < N; i++)
                {
                    list.Add(i.ToString());
                    array[i] = i.ToString();
                    hashSet.Add(i.ToString());
                    dictionary[i] = i.ToString();
                }
            }
    
            [Benchmark(Baseline = true)]
            public void FilterArray()
            {
                GetFilteredEnumerable(array).Consume(consumer);
            }
    
            [Benchmark]
            public void FilterList()
            {
                GetFilteredEnumerable(list).Consume(consumer);
            }
    
            [Benchmark]
            public void FilterHashSet()
            {
                GetFilteredEnumerable(hashSet).Consume(consumer);
            }
    
            [Benchmark]
            public void FilterDictionary()
            {
                GetFilteredEnumerable(dictionary.Values).Consume(consumer);
            }
    
            [Benchmark]
            public void FilterArray_ToArray()
            {
                GetFilteredArray(array).Consume(consumer);
            }
    
            [Benchmark]
            public void FilterList_ToArray()
            {
                GetFilteredArray(list).Consume(consumer);
            }
    
            [Benchmark]
            public void FilterHashSet_ToArray()
            {
                GetFilteredArray(hashSet).Consume(consumer);
            }
    
            [Benchmark]
            public void FilterDictionary_ToArray()
            {
                GetFilteredArray(dictionary.Values).Consume(consumer);
            }
    
            private static IEnumerable<string> GetFilteredEnumerable(IEnumerable<string> strings)
            {
                return strings.Where(o => o.Length > 1).Select(o => o);
            }
    
            private static string[] GetFilteredArray(IEnumerable<string> strings)
            {
                return strings.Where(o => o.Length > 1).ToArray();
            }
        }
    }
    
Full results in table format:

    BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19043.1288 (21H1/May2021Update)
    AMD Ryzen 7 4700U with Radeon Graphics, 1 CPU, 8 logical and 8 physical cores
      [Host]             : .NET Framework 4.8 (4.8.4420.0), X64 RyuJIT
      .NET 5.0           : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
      .NET Core 3.1      : .NET Core 3.1.20 (CoreCLR 4.700.21.47003, CoreFX 4.700.21.47101), X64 RyuJIT
      .NET Framework 4.8 : .NET Framework 4.8 (4.8.4420.0), X64 RyuJIT
    
    
    | Method | Runtime | N | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0 | Allocated |
    |---|---|---|---|---|---|---|---|---|---|---|
    | FilterArray | .NET 5.0 | 10 | 88.86 ns | 0.937 ns | 0.876 ns | 88.73 ns | 0.80 | 0.01 | 0.0497 | 104 B |
    | FilterList | .NET 5.0 | 10 | 156.24 ns | 1.318 ns | 1.169 ns | 155.86 ns | 1.40 | 0.02 | 0.0725 | 152 B |
    | FilterHashSet | .NET 5.0 | 10 | 212.50 ns | 3.097 ns | 2.746 ns | 211.43 ns | 1.91 | 0.03 | 0.0763 | 160 B |
    | FilterDictionary | .NET 5.0 | 10 | 209.59 ns | 2.441 ns | 2.283 ns | 209.03 ns | 1.89 | 0.03 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET 5.0 | 10 | 88.34 ns | 1.332 ns | 1.180 ns | 88.73 ns | 0.79 | 0.01 | 0.0229 | 48 B |
    | FilterList_ToArray | .NET 5.0 | 10 | 100.15 ns | 1.151 ns | 0.961 ns | 100.02 ns | 0.90 | 0.01 | 0.0343 | 72 B |
    | FilterHashSet_ToArray | .NET 5.0 | 10 | 197.00 ns | 1.438 ns | 1.345 ns | 196.58 ns | 1.77 | 0.03 | 0.0458 | 96 B |
    | FilterDictionary_ToArray | .NET 5.0 | 10 | 206.34 ns | 3.901 ns | 3.649 ns | 204.96 ns | 1.86 | 0.04 | 0.0458 | 96 B |
    | FilterArray | .NET Core 3.1 | 10 | 100.98 ns | 1.096 ns | 0.972 ns | 100.91 ns | 0.91 | 0.01 | 0.0497 | 104 B |
    | FilterList | .NET Core 3.1 | 10 | 166.74 ns | 1.152 ns | 1.021 ns | 166.57 ns | 1.50 | 0.02 | 0.0725 | 152 B |
    | FilterHashSet | .NET Core 3.1 | 10 | 240.48 ns | 4.734 ns | 7.644 ns | 236.99 ns | 2.16 | 0.09 | 0.0763 | 160 B |
    | FilterDictionary | .NET Core 3.1 | 10 | 210.68 ns | 1.404 ns | 1.173 ns | 210.80 ns | 1.89 | 0.03 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET Core 3.1 | 10 | 94.07 ns | 1.784 ns | 2.615 ns | 93.07 ns | 0.84 | 0.03 | 0.0229 | 48 B |
    | FilterList_ToArray | .NET Core 3.1 | 10 | 97.77 ns | 1.811 ns | 1.606 ns | 97.25 ns | 0.88 | 0.02 | 0.0343 | 72 B |
    | FilterHashSet_ToArray | .NET Core 3.1 | 10 | 210.94 ns | 3.940 ns | 5.650 ns | 210.06 ns | 1.92 | 0.06 | 0.0458 | 96 B |
    | FilterDictionary_ToArray | .NET Core 3.1 | 10 | 204.44 ns | 3.835 ns | 3.587 ns | 203.66 ns | 1.83 | 0.04 | 0.0458 | 96 B |
    | FilterArray | .NET Framework 4.8 | 10 | 111.34 ns | 1.837 ns | 1.534 ns | 111.21 ns | 1.00 | 0.00 | 0.0573 | 120 B |
    | FilterList | .NET Framework 4.8 | 10 | 161.72 ns | 2.305 ns | 2.156 ns | 161.68 ns | 1.45 | 0.03 | 0.0725 | 152 B |
    | FilterHashSet | .NET Framework 4.8 | 10 | 213.74 ns | 3.929 ns | 3.675 ns | 213.37 ns | 1.92 | 0.05 | 0.0763 | 160 B |
    | FilterDictionary | .NET Framework 4.8 | 10 | 212.36 ns | 3.331 ns | 3.116 ns | 211.33 ns | 1.91 | 0.04 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET Framework 4.8 | 10 | 125.26 ns | 2.469 ns | 2.310 ns | 124.86 ns | 1.12 | 0.03 | 0.0381 | 80 B |
    | FilterList_ToArray | .NET Framework 4.8 | 10 | 182.54 ns | 2.822 ns | 2.640 ns | 182.31 ns | 1.64 | 0.04 | 0.0458 | 96 B |
    | FilterHashSet_ToArray | .NET Framework 4.8 | 10 | 223.26 ns | 4.104 ns | 3.638 ns | 223.15 ns | 2.01 | 0.05 | 0.0572 | 120 B |
    | FilterDictionary_ToArray | .NET Framework 4.8 | 10 | 214.85 ns | 1.389 ns | 1.084 ns | 214.84 ns | 1.93 | 0.03 | 0.0572 | 120 B |
    | FilterArray | .NET 5.0 | 20 | 204.77 ns | 2.235 ns | 1.981 ns | 204.25 ns | 0.81 | 0.02 | 0.0496 | 104 B |
    | FilterList | .NET 5.0 | 20 | 335.74 ns | 3.112 ns | 2.911 ns | 335.81 ns | 1.34 | 0.03 | 0.0725 | 152 B |
    | FilterHashSet | .NET 5.0 | 20 | 416.95 ns | 2.229 ns | 2.085 ns | 415.96 ns | 1.66 | 0.04 | 0.0763 | 160 B |
    | FilterDictionary | .NET 5.0 | 20 | 437.06 ns | 6.477 ns | 6.058 ns | 438.21 ns | 1.74 | 0.04 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET 5.0 | 20 | 338.20 ns | 5.587 ns | 4.953 ns | 339.08 ns | 1.34 | 0.04 | 0.1988 | 416 B |
    | FilterList_ToArray | .NET 5.0 | 20 | 358.90 ns | 4.098 ns | 3.422 ns | 358.15 ns | 1.42 | 0.04 | 0.2103 | 440 B |
    | FilterHashSet_ToArray | .NET 5.0 | 20 | 557.83 ns | 4.063 ns | 3.800 ns | 558.38 ns | 2.22 | 0.06 | 0.2213 | 464 B |
    | FilterDictionary_ToArray | .NET 5.0 | 20 | 574.07 ns | 7.046 ns | 6.246 ns | 574.29 ns | 2.28 | 0.05 | 0.2213 | 464 B |
    | FilterArray | .NET Core 3.1 | 20 | 215.45 ns | 1.700 ns | 1.419 ns | 215.22 ns | 0.86 | 0.02 | 0.0496 | 104 B |
    | FilterList | .NET Core 3.1 | 20 | 320.02 ns | 2.990 ns | 2.651 ns | 319.63 ns | 1.27 | 0.03 | 0.0725 | 152 B |
    | FilterHashSet | .NET Core 3.1 | 20 | 437.02 ns | 4.356 ns | 3.861 ns | 436.05 ns | 1.74 | 0.05 | 0.0763 | 160 B |
    | FilterDictionary | .NET Core 3.1 | 20 | 452.16 ns | 8.495 ns | 7.947 ns | 453.44 ns | 1.80 | 0.06 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET Core 3.1 | 20 | 340.41 ns | 4.727 ns | 4.190 ns | 339.43 ns | 1.35 | 0.04 | 0.1988 | 416 B |
    | FilterList_ToArray | .NET Core 3.1 | 20 | 353.83 ns | 6.133 ns | 7.532 ns | 354.83 ns | 1.42 | 0.04 | 0.2103 | 440 B |
    | FilterHashSet_ToArray | .NET Core 3.1 | 20 | 576.33 ns | 8.148 ns | 7.223 ns | 575.38 ns | 2.29 | 0.06 | 0.2213 | 464 B |
    | FilterDictionary_ToArray | .NET Core 3.1 | 20 | 584.74 ns | 11.442 ns | 10.143 ns | 580.87 ns | 2.32 | 0.08 | 0.2213 | 464 B |
    | FilterArray | .NET Framework 4.8 | 20 | 248.80 ns | 4.782 ns | 6.546 ns | 247.30 ns | 1.00 | 0.00 | 0.0572 | 120 B |
    | FilterList | .NET Framework 4.8 | 20 | 329.58 ns | 3.299 ns | 2.925 ns | 328.61 ns | 1.31 | 0.03 | 0.0725 | 152 B |
    | FilterHashSet | .NET Framework 4.8 | 20 | 443.56 ns | 8.577 ns | 9.177 ns | 441.12 ns | 1.77 | 0.07 | 0.0763 | 160 B |
    | FilterDictionary | .NET Framework 4.8 | 20 | 461.62 ns | 7.660 ns | 6.791 ns | 459.31 ns | 1.83 | 0.06 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET Framework 4.8 | 20 | 469.41 ns | 7.360 ns | 6.525 ns | 470.37 ns | 1.86 | 0.03 | 0.2332 | 489 B |
    | FilterList_ToArray | .NET Framework 4.8 | 20 | 575.07 ns | 6.210 ns | 5.809 ns | 575.53 ns | 2.29 | 0.06 | 0.2403 | 505 B |
    | FilterHashSet_ToArray | .NET Framework 4.8 | 20 | 686.90 ns | 13.321 ns | 14.253 ns | 687.12 ns | 2.75 | 0.10 | 0.2518 | 530 B |
    | FilterDictionary_ToArray | .NET Framework 4.8 | 20 | 677.82 ns | 10.143 ns | 9.488 ns | 675.06 ns | 2.70 | 0.08 | 0.2518 | 530 B |
    | FilterArray | .NET 5.0 | 50 | 519.68 ns | 8.640 ns | 7.215 ns | 517.65 ns | 0.89 | 0.01 | 0.0496 | 104 B |
    | FilterList | .NET 5.0 | 50 | 819.43 ns | 11.886 ns | 11.118 ns | 814.16 ns | 1.41 | 0.02 | 0.0725 | 152 B |
    | FilterHashSet | .NET 5.0 | 50 | 972.46 ns | 5.558 ns | 5.199 ns | 973.23 ns | 1.67 | 0.01 | 0.0763 | 160 B |
    | FilterDictionary | .NET 5.0 | 50 | 972.48 ns | 2.824 ns | 2.359 ns | 972.40 ns | 1.67 | 0.01 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET 5.0 | 50 | 810.60 ns | 12.451 ns | 11.647 ns | 808.18 ns | 1.39 | 0.02 | 0.4930 | 1,032 B |
    | FilterList_ToArray | .NET 5.0 | 50 | 851.79 ns | 11.612 ns | 10.294 ns | 848.90 ns | 1.46 | 0.02 | 0.5045 | 1,056 B |
    | FilterHashSet_ToArray | .NET 5.0 | 50 | 1,479.79 ns | 28.211 ns | 26.389 ns | 1,471.44 ns | 2.54 | 0.06 | 0.5684 | 1,192 B |
    | FilterDictionary_ToArray | .NET 5.0 | 50 | 1,401.39 ns | 26.216 ns | 24.522 ns | 1,396.34 ns | 2.41 | 0.05 | 0.5684 | 1,192 B |
    | FilterArray | .NET Core 3.1 | 50 | 534.26 ns | 5.393 ns | 4.781 ns | 533.28 ns | 0.92 | 0.01 | 0.0496 | 104 B |
    | FilterList | .NET Core 3.1 | 50 | 775.65 ns | 6.106 ns | 5.413 ns | 775.77 ns | 1.33 | 0.01 | 0.0725 | 152 B |
    | FilterHashSet | .NET Core 3.1 | 50 | 1,016.81 ns | 15.748 ns | 14.731 ns | 1,020.79 ns | 1.74 | 0.02 | 0.0763 | 160 B |
    | FilterDictionary | .NET Core 3.1 | 50 | 1,056.04 ns | 20.453 ns | 26.595 ns | 1,056.13 ns | 1.83 | 0.05 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET Core 3.1 | 50 | 940.75 ns | 14.501 ns | 13.564 ns | 942.47 ns | 1.61 | 0.03 | 0.4921 | 1,032 B |
    | FilterList_ToArray | .NET Core 3.1 | 50 | 977.71 ns | 19.087 ns | 34.417 ns | 983.53 ns | 1.67 | 0.08 | 0.5035 | 1,056 B |
    | FilterHashSet_ToArray | .NET Core 3.1 | 50 | 1,453.36 ns | 27.352 ns | 25.585 ns | 1,443.28 ns | 2.50 | 0.05 | 0.5684 | 1,192 B |
    | FilterDictionary_ToArray | .NET Core 3.1 | 50 | 1,368.01 ns | 11.451 ns | 10.151 ns | 1,365.62 ns | 2.35 | 0.02 | 0.5684 | 1,192 B |
    | FilterArray | .NET Framework 4.8 | 50 | 582.74 ns | 3.555 ns | 2.969 ns | 582.90 ns | 1.00 | 0.00 | 0.0572 | 120 B |
    | FilterList | .NET Framework 4.8 | 50 | 783.91 ns | 1.892 ns | 1.580 ns | 784.02 ns | 1.35 | 0.01 | 0.0725 | 152 B |
    | FilterHashSet | .NET Framework 4.8 | 50 | 1,071.60 ns | 21.374 ns | 28.533 ns | 1,072.29 ns | 1.83 | 0.05 | 0.0763 | 160 B |
    | FilterDictionary | .NET Framework 4.8 | 50 | 1,139.82 ns | 21.384 ns | 20.003 ns | 1,141.93 ns | 1.96 | 0.04 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET Framework 4.8 | 50 | 1,367.50 ns | 27.057 ns | 65.861 ns | 1,331.99 ns | 2.52 | 0.08 | 0.7381 | 1,549 B |
    | FilterList_ToArray | .NET Framework 4.8 | 50 | 1,597.52 ns | 23.829 ns | 19.899 ns | 1,603.19 ns | 2.74 | 0.04 | 0.7458 | 1,565 B |
    | FilterHashSet_ToArray | .NET Framework 4.8 | 50 | 1,808.94 ns | 23.400 ns | 21.888 ns | 1,811.74 ns | 3.11 | 0.04 | 0.7572 | 1,589 B |
    | FilterDictionary_ToArray | .NET Framework 4.8 | 50 | 1,832.78 ns | 34.906 ns | 34.282 ns | 1,840.36 ns | 3.14 | 0.07 | 0.7572 | 1,589 B |
    | FilterArray | .NET 5.0 | 100 | 1,049.10 ns | 10.403 ns | 8.687 ns | 1,047.02 ns | 0.87 | 0.02 | 0.0496 | 104 B |
    | FilterList | .NET 5.0 | 100 | 1,516.00 ns | 29.028 ns | 27.153 ns | 1,510.02 ns | 1.26 | 0.02 | 0.0725 | 152 B |
    | FilterHashSet | .NET 5.0 | 100 | 1,938.09 ns | 29.526 ns | 27.619 ns | 1,924.95 ns | 1.61 | 0.03 | 0.0763 | 160 B |
    | FilterDictionary | .NET 5.0 | 100 | 2,045.10 ns | 15.618 ns | 14.609 ns | 2,035.70 ns | 1.70 | 0.03 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET 5.0 | 100 | 1,499.65 ns | 21.245 ns | 19.872 ns | 1,498.60 ns | 1.25 | 0.03 | 0.8869 | 1,856 B |
    | FilterList_ToArray | .NET 5.0 | 100 | 1,555.10 ns | 31.036 ns | 31.871 ns | 1,560.42 ns | 1.30 | 0.03 | 0.8984 | 1,880 B |
    | FilterHashSet_ToArray | .NET 5.0 | 100 | 2,457.19 ns | 34.939 ns | 30.973 ns | 2,451.15 ns | 2.05 | 0.04 | 1.0147 | 2,128 B |
    | FilterDictionary_ToArray | .NET 5.0 | 100 | 2,599.85 ns | 39.720 ns | 37.155 ns | 2,585.06 ns | 2.17 | 0.04 | 1.0147 | 2,128 B |
    | FilterArray | .NET Core 3.1 | 100 | 1,102.10 ns | 4.401 ns | 4.117 ns | 1,101.34 ns | 0.92 | 0.01 | 0.0496 | 104 B |
    | FilterList | .NET Core 3.1 | 100 | 1,423.55 ns | 3.328 ns | 2.599 ns | 1,423.44 ns | 1.19 | 0.02 | 0.0725 | 152 B |
    | FilterHashSet | .NET Core 3.1 | 100 | 2,160.32 ns | 42.036 ns | 51.624 ns | 2,159.38 ns | 1.81 | 0.06 | 0.0763 | 160 B |
    | FilterDictionary | .NET Core 3.1 | 100 | 2,190.92 ns | 42.564 ns | 53.830 ns | 2,194.61 ns | 1.81 | 0.05 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET Core 3.1 | 100 | 1,530.44 ns | 20.335 ns | 19.021 ns | 1,526.51 ns | 1.28 | 0.03 | 0.8869 | 1,856 B |
    | FilterList_ToArray | .NET Core 3.1 | 100 | 1,583.65 ns | 31.314 ns | 36.061 ns | 1,568.48 ns | 1.32 | 0.05 | 0.8984 | 1,880 B |
    | FilterHashSet_ToArray | .NET Core 3.1 | 100 | 2,711.94 ns | 54.067 ns | 127.443 ns | 2,673.45 ns | 2.33 | 0.16 | 1.0147 | 2,128 B |
    | FilterDictionary_ToArray | .NET Core 3.1 | 100 | 2,509.05 ns | 43.450 ns | 40.643 ns | 2,508.83 ns | 2.09 | 0.04 | 1.0147 | 2,128 B |
    | FilterArray | .NET Framework 4.8 | 100 | 1,199.84 ns | 21.589 ns | 19.139 ns | 1,195.87 ns | 1.00 | 0.00 | 0.0572 | 120 B |
    | FilterList | .NET Framework 4.8 | 100 | 1,614.03 ns | 31.386 ns | 27.823 ns | 1,602.08 ns | 1.35 | 0.04 | 0.0725 | 152 B |
    | FilterHashSet | .NET Framework 4.8 | 100 | 2,035.95 ns | 39.087 ns | 32.640 ns | 2,023.10 ns | 1.70 | 0.04 | 0.0763 | 160 B |
    | FilterDictionary | .NET Framework 4.8 | 100 | 2,295.34 ns | 9.565 ns | 8.947 ns | 2,296.43 ns | 1.91 | 0.03 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET Framework 4.8 | 100 | 2,767.37 ns | 17.560 ns | 16.426 ns | 2,768.58 ns | 2.31 | 0.04 | 1.4305 | 3,001 B |
    | FilterList_ToArray | .NET Framework 4.8 | 100 | 3,164.04 ns | 11.241 ns | 9.386 ns | 3,164.16 ns | 2.64 | 0.04 | 1.4381 | 3,017 B |
    | FilterHashSet_ToArray | .NET Framework 4.8 | 100 | 3,660.94 ns | 70.542 ns | 65.985 ns | 3,654.40 ns | 3.06 | 0.08 | 1.4458 | 3,041 B |
    | FilterDictionary_ToArray | .NET Framework 4.8 | 100 | 3,699.41 ns | 21.608 ns | 18.044 ns | 3,693.30 ns | 3.08 | 0.04 | 1.4458 | 3,041 B |
    | FilterArray | .NET 5.0 | 500 | 5,357.99 ns | 103.933 ns | 149.057 ns | 5,286.35 ns | 0.93 | 0.03 | 0.0458 | 104 B |
    | FilterList | .NET 5.0 | 500 | 7,633.34 ns | 81.790 ns | 72.504 ns | 7,610.14 ns | 1.33 | 0.02 | 0.0610 | 152 B |
    | FilterHashSet | .NET 5.0 | 500 | 9,670.59 ns | 67.964 ns | 53.062 ns | 9,665.57 ns | 1.68 | 0.01 | 0.0763 | 160 B |
    | FilterDictionary | .NET 5.0 | 500 | 9,571.07 ns | 112.495 ns | 93.938 ns | 9,531.54 ns | 1.67 | 0.03 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET 5.0 | 500 | 6,904.08 ns | 66.606 ns | 62.303 ns | 6,874.80 ns | 1.20 | 0.01 | 4.0054 | 8,392 B |
    | FilterList_ToArray | .NET 5.0 | 500 | 6,968.44 ns | 83.854 ns | 70.022 ns | 6,975.29 ns | 1.21 | 0.02 | 4.0131 | 8,416 B |
    | FilterHashSet_ToArray | .NET 5.0 | 500 | 10,971.07 ns | 109.378 ns | 96.961 ns | 10,951.74 ns | 1.91 | 0.03 | 4.0741 | 8,536 B |
    | FilterDictionary_ToArray | .NET 5.0 | 500 | 11,998.38 ns | 211.175 ns | 197.533 ns | 11,998.62 ns | 2.09 | 0.03 | 4.0741 | 8,536 B |
    | FilterArray | .NET Core 3.1 | 500 | 5,347.93 ns | 33.686 ns | 28.129 ns | 5,338.97 ns | 0.93 | 0.01 | 0.0458 | 104 B |
    | FilterList | .NET Core 3.1 | 500 | 7,483.89 ns | 45.716 ns | 35.692 ns | 7,498.14 ns | 1.30 | 0.01 | 0.0610 | 152 B |
    | FilterHashSet | .NET Core 3.1 | 500 | 9,377.21 ns | 116.563 ns | 97.335 ns | 9,337.82 ns | 1.63 | 0.02 | 0.0763 | 160 B |
    | FilterDictionary | .NET Core 3.1 | 500 | 9,933.11 ns | 61.978 ns | 51.755 ns | 9,907.55 ns | 1.73 | 0.01 | 0.0763 | 160 B |
    | FilterArray_ToArray | .NET Core 3.1 | 500 | 7,688.61 ns | 149.282 ns | 223.439 ns | 7,655.68 ns | 1.35 | 0.04 | 4.0054 | 8,392 B |
    | FilterList_ToArray | .NET Core 3.1 | 500 | 8,520.42 ns | 170.171 ns | 315.423 ns | 8,541.70 ns | 1.48 | 0.05 | 4.0131 | 8,416 B |
    | FilterHashSet_ToArray | .NET Core 3.1 | 500 | 11,357.74 ns | 167.598 ns | 139.952 ns | 11,288.67 ns | 1.98 | 0.03 | 4.0741 | 8,536 B |
    | FilterDictionary_ToArray | .NET Core 3.1 | 500 | 11,174.15 ns | 183.748 ns | 180.465 ns | 11,096.25 ns | 1.95 | 0.04 | 4.0741 | 8,536 B |
    | FilterArray | .NET Framework 4.8 | 500 | 5,735.48 ns | 35.151 ns | 39.070 ns | 5,726.53 ns | 1.00 | 0.00 | 0.0534 | 120 B |
    | FilterList | .NET Framework 4.8 | 500 | 8,031.11 ns | 136.072 ns | 186.257 ns | 7,934.82 ns | 1.40 | 0.04 | 0.0610 | 153 B |
    | FilterHashSet | .NET Framework 4.8 | 500 | 10,124.26 ns | 199.856 ns | 266.803 ns | 10,041.80 ns | 1.78 | 0.04 | 0.0763 | 161 B |
    | FilterDictionary | .NET Framework 4.8 | 500 | 11,615.37 ns | 147.565 ns | 123.223 ns | 11,591.23 ns | 2.02 | 0.02 | 0.0763 | 161 B |
    | FilterArray_ToArray | .NET Framework 4.8 | 500 | 13,573.07 ns | 128.080 ns | 113.540 ns | 13,595.41 ns | 2.36 | 0.03 | 5.9052 | 12,434 B |
    | FilterList_ToArray | .NET Framework 4.8 | 500 | 17,131.55 ns | 340.744 ns | 510.009 ns | 17,036.32 ns | 2.98 | 0.09 | 5.9204 | 12,445 B |
    | FilterHashSet_ToArray | .NET Framework 4.8 | 500 | 18,801.99 ns | 353.812 ns | 295.449 ns | 18,881.88 ns | 3.28 | 0.06 | 5.9204 | 12,470 B |
    | FilterDictionary_ToArray | .NET Framework 4.8 | 500 | 18,045.81 ns | 133.857 ns | 118.661 ns | 18,032.49 ns | 3.14 | 0.04 | 5.9204 | 12,470 B |
    // * Legends *
      N         : Value of the 'N' parameter
      Mean      : Arithmetic mean of all measurements
      Error     : Half of 99.9% confidence interval
      StdDev    : Standard deviation of all measurements
      Ratio     : Mean of the ratio distribution ([Current]/[Baseline])
      RatioSD   : Standard deviation of the ratio distribution ([Current]/[Baseline])
      Gen 0     : GC Generation 0 collects per 1000 operations
      Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
      1 us      : 1 Microsecond (0.000001 sec)
    

    The following diagram shows the execution time of both variants (in microseconds) on .NET 5:

    Time diagram

    The results are quite comparable there, but .NET Framework does not fare as well:

    Time diagram

    On .NET Framework 4.8 the additional ToArray() call makes filtering about 1.5 to 2.4 times slower for N >= 20.

    .NET made huge improvements in the Core 3.1 and 5 versions. The following diagram shows the execution time (in microseconds) on different .NET versions depending on the number of elements:

    Time diagram

    Array arguments

    We are used to declaring method parameters as arrays, but this is another place where an unnecessary transformation can sneak in. Let's look at an example:

    private async Task<OrderPackage[]> ReadRootPackagesAsync(params Guid[] packageIds)
    {
    	var queryPackageIds = new HashSet<Guid>(packageIds);
    	...
    	// The set is converted back to an array only to satisfy the callee's signature.
    	var queryPackageIdsArray = queryPackageIds.ToArray();
    	var orderPackages = await orderPackageService.SelectByIdsAsync(queryPackageIdsArray).ConfigureAwait(false);
    	...
    }
    
    public Task<OrderPackage[]> SelectByIdsAsync(params Guid[] packageIds)
    {
    	...
    	return GetTable().Where(o => packageIds.Contains(o.Id));
    }
    

    GetTable() is a method that returns IQueryable<OrderPackage>, so the Where expression is translated to SQL and no specific collection type is required. We could use IEnumerable<Guid> here again, but I prefer the IReadOnlyCollection<T> interface, which was introduced in .NET Framework 4.5. Unlike IEnumerable<T>, IReadOnlyCollection<T> signals that the collection is already materialized, while leaving its concrete structure open (HashSet, List, array, it does not matter). It also removes the concern about enumerating a lazy IEnumerable<T> more than once. So, our modified example:

    private async Task<OrderPackage[]> ReadRootPackagesAsync(params Guid[] packageIds)
    {
    	var queryPackageIds = new HashSet<Guid>(packageIds);
    	...
    	var orderPackages = await orderPackageService.SelectByIdsAsync(queryPackageIds).ConfigureAwait(false);
    	...
    }
    
    public Task<OrderPackage[]> SelectByIdsAsync(IReadOnlyCollection<Guid> packageIds)
    {
    	...
    	return GetTable().Where(o => packageIds.Contains(o.Id));
    }
    

    We pass the HashSet<Guid> directly to the SelectByIdsAsync method without using intermediate variables.
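
A small sketch of why the IReadOnlyCollection<T> parameter is convenient for callers. CountKnownIds is a hypothetical in-memory stand-in for SelectByIdsAsync (no database involved), and the snippet uses top-level statements, so it assumes C# 9 or later:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for SelectByIdsAsync: filters in memory instead of querying a table.
static int CountKnownIds(IReadOnlyCollection<Guid> ids, IEnumerable<Guid> candidates)
{
    // Count is available without enumerating, so an empty input is rejected cheaply.
    if (ids.Count == 0)
    {
        return 0;
    }

    return candidates.Count(id => ids.Contains(id));
}

var a = Guid.NewGuid();
var b = Guid.NewGuid();
var knownIds = new[] { a, b, Guid.NewGuid() };

// All three callers pass their collection as is, with no ToArray() in between.
int fromHashSet = CountKnownIds(new HashSet<Guid> { a, b }, knownIds);
int fromList = CountKnownIds(new List<Guid> { a }, knownIds);
int fromArray = CountKnownIds(Array.Empty<Guid>(), knownIds);

Console.WriteLine($"{fromHashSet} {fromList} {fromArray}"); // 2 1 0
```

HashSet<T>, List<T> and arrays all implement IReadOnlyCollection<T>, so none of the call sites needs a conversion.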

    But we should be careful when using general interfaces like IEnumerable<T> or IReadOnlyCollection<T>, to avoid antipatterns like this:

    public UserSearchModel BuildUserSearchModel(IEnumerable<IUserProps> userPropses)
    {
    	var users = new List<UserModel>();
    	var allUserPropses = userPropses.ToArray();
    	foreach (var userProps in allUserPropses)
    	{
    		...
    	}
    	
    	return new UserSearchModel
    	{
    		Users = users.ToArray(),
    		QueryModel = queryModel,
    		TotalCount = total,
    		AvailableCount = users.Count,
    		HasMore = allUserPropses.Length >= queryModel.PageSize,
    		QueryForm = queryFormBuilder.Build(queryModel)
    	};
    }
    

    The argument userPropses is converted to a local array, allUserPropses, to avoid multiple enumeration, as we learned before. userPropses has the type IEnumerable<IUserProps>, so the precaution looks justified. But what do we see if we look at the call site of BuildUserSearchModel?

    public async Task<SearchModel> SearchAsync(SearchQuery searchQuery)
    {
    	var userPropses = new List<IUserProps>();
    	...
    	return searchModelBuilder.BuildUserSearchModel(userPropses);
    }
    

    It becomes obvious that IEnumerable<IUserProps> is not the best choice. It confuses developers, because the signature suggests that userPropses may not be a materialized collection when in fact it is one. A better option is to replace it with IReadOnlyCollection<IUserProps>:

    public UserSearchModel BuildUserSearchModel(IReadOnlyCollection<IUserProps> userPropses)
    {
    	var users = new List<UserModel>();
    	foreach (var userProps in userPropses)
    	{
    		...
    	}
    	
    	return new UserSearchModel
    	{
    		Users = users.ToArray(),
    		QueryModel = queryModel,
    		TotalCount = total,
    		AvailableCount = users.Count,
    		HasMore = userPropses.Count >= queryModel.PageSize,
    		QueryForm = queryFormBuilder.Build(queryModel)
    	};
    }
    

    The code became cleaner: no conversions, no overhead.
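
The multiple-enumeration concern that motivated the defensive ToArray() can be sketched as follows. Produce is a hypothetical lazy source that counts how many times it is enumerated; the snippet uses top-level statements and assumes C# 9 or later:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int timesProduced = 0;

// Hypothetical lazy source: the iterator body re-runs on every enumeration.
IEnumerable<int> Produce()
{
    timesProduced++;
    yield return 1;
    yield return 2;
}

IEnumerable<int> lazy = Produce();
int sum = 0;
foreach (var x in lazy) sum += x;       // first enumeration
int count = lazy.Count();               // second enumeration
int enumerationsOfLazy = timesProduced;

// A materialized collection exposes Count without touching the source again.
IReadOnlyCollection<int> materialized = Produce().ToList();
int materializedCount = materialized.Count;
int totalEnumerations = timesProduced;

Console.WriteLine($"{enumerationsOfLazy} {totalEnumerations} {materializedCount}");
```

With an IReadOnlyCollection<T> parameter the caller decides where materialization happens, and the method can read Count and iterate without re-running anything.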

    Fluent assertions in unit tests

    I would like to make a small note about the Fluent Assertions library. We write many tests where we need to compare two collections:

    [Test]
    public void TestSingleWriteInSameTimeline()
    {
    	var testData = new List<TestTimelineData>();
    	...
    	
    	var actual = timeline.Select<TestTimelineData>(timelineId, ticks, TestWriteCount);
    
    	actual.Select(x => x.Data).ToArray().Should().BeEquivalentTo(testData.ToArray());
    }
    

    But the BeEquivalentTo method works fine with any enumerable. We can verify this by looking at its definition:

    public AndConstraint<TAssertions> BeEquivalentTo<TExpectation>(IEnumerable<TExpectation> expectation,
    	string because = "", params object[] becauseArgs)
    {
    	return BeEquivalentTo(expectation, config => config, because, becauseArgs);
    }
    

    Fluent Assertions will create arrays anyway, so you won't gain much performance, but you can still keep your code cleaner:

    [Test]
    public void TestSingleWriteInSameTimeline()
    {
    	var testData = new List<TestTimelineData>();
    	...
    	
    	var actual = timeline.Select<TestTimelineData>(timelineId, ticks, TestWriteCount);
    
    	actual.Select(x => x.Data).Should().BeEquivalentTo(testData);
    }
    

    Analyzer

    If you care about performance and memory management, or just about clean code, you should pay close attention to collection manipulations. Sometimes a problem is hard to notice, especially after refactoring: you change a method signature from int[] to IEnumerable<int> but don't check all call sites, and that can lead to issues like the ones we reviewed earlier.

    Unfortunately, there is no built-in tool to detect problems like these, because many of them require complex analysis. I started the Collections.Analyzer project, which can warn about potential problems:

    LINQ example

    And more obvious cases:

    LINQ example

    Links