List initializer

Published on Saturday, January 6, 2024

We all used to type new List<int> { 1, 2, 3, 4 } or new int[] { 1, 2, 3, 4} to initialize collections with some values. It looks similar in syntax but differs in behavior and you should be careful if you are concerned about performance.

Array

As we know, arrays contains sequence of elements with a fixed size. Once it was created it can't be changed during the lifetime of the instance.

var array = new int[] { 1, 2, 3, 4};

List<T>

When we don't know final size of a collection or we need add/delete elements during the lifetime, it's suitable to use the List<T> type.

var list = new List<int>();
list.Add(1);
list.Add(2);

At first glance it may seem that we can always use the list instead of the array - it has all array's features, but also can be dynamically changed. But to decide whether to use the list we need to know more about it internal structure.

Part of the source code:

public class List<T> : IList<T>, IList, IReadOnlyList<T>
{
    private const int DefaultCapacity = 4;

    internal T[] _items;
    internal int _size;

    private static readonly T[] s_emptyArray = new T[0];

    // Constructs a List. The list is initially empty and has a capacity
    // of zero. Upon adding the first element to the list the capacity is
    // increased to DefaultCapacity, and then increased in multiples of two
    // as required.
    public List()
    {
        _items = s_emptyArray;
    }
    ...

    public int Capacity
    {
        get => _items.Length;
        set {...}
    }

    public int Count => _size;
}
  • _items - array with stored elements;
  • _size - count of elements stored in the list and size of the whole list;
  • Capacity - size of _items and maximum amount of elements can be stored without resize.

List resize

To put it simply, we can say that the list is just the array that can resize when need. Every time we try to add one more item the list ensures _items has enough free space, otherwise it sets new capacity:

internal void Grow(int capacity)
{
    ...
    int newCapacity = _items.Length == 0 ? DefaultCapacity : 2 * _items.Length;
    ...
    Capacity = newCapacity;
}

Let's look closer to the Capacity property:

// Gets and sets the capacity of this list.  The capacity is the size of
// the internal array used to hold items.  When set, the internal
// array of the list is reallocated to the given capacity.
public int Capacity
{
    get => _items.Length;
    set
    {
        if (value < _size)
        {
            ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.value, ExceptionResource.ArgumentOutOfRange_SmallCapacity);
        }

        if (value != _items.Length)
        {
            if (value > 0)
            {
                T[] newItems = new T[value];
                if (_size > 0)
                {
                    Array.Copy(_items, newItems, _size);
                }
                _items = newItems;
            }
            else
            {
                _items = s_emptyArray;
            }
        }
    }
}

So, what do we see? Initially every list is created with an empty internal array. After adding first element the list creates the new array for 4 elements (DefaultCapacity is 4). And when the current array is exhausted, the new one is created with doubled size, and all elements are copied.

Performance

What happens, when we create new list and initialize it with some values?

var list = new List<int> { 1, 2, 3, 4, 5 };

It looks like array initialization, but works very different way. According to the documentation any type that implements IEnumerable and has Add method can be used with collection initializer.

So the previous example is just short form of serial calls of Add method:

var list = new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
list.Add(5);

The compiler just transforms short collection initializer and adds necessary calls automatically. And this may cause performance degradation.

Let's take a closer look at the process of adding 5 elements:

  • Before the first Add call the list is empty, the internal array is empty.
  • First added element creates the new internal array for 4 elements.
  • Elements 2, 3, 4 change nothing.
  • When we add the fifth element, the internal array is full and needs to be resized. The new array of size 8 is created, all elements are copied from the previous array and the fifth element is added.

At the and we have the list with 5 elements and internal array for 8 elements. At the same time, we created 2 arrays, and the final one wastes 37.5% of it's space for nothing.

As you might guess, creating new arrays and copying elements causes memory allocations and takes additional time.

This could be an unpleasant surprise for critical places. Do we have a solution? Yes!

Capacity

If we know or suggest the final size of the list, we can create it with initial capacity.

var list = new List<int>(5);
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
list.Add(5);

Or

var list = new List<int>(5) { 1, 2, 3, 4, 5 };

Now we immediately create one array for 5 elements and no more unnecessary allocations.

Benchmarks

Let's compare creating lists with initial capacity and without it.

  • Benchmark code. Click to expand.
  • using System.Collections.Generic;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Jobs;
    
    namespace Benchmarks
    {
        [ShortRunJob(RuntimeMoniker.Net48)]
        [ShortRunJob(RuntimeMoniker.Net60)]
        [ShortRunJob(RuntimeMoniker.Net80)]
        [MemoryDiagnoser]
        public class InitListBenchmark
        {
            [BenchmarkCategory("One")]
            [Benchmark]
            public List<int> InitList1()
            {
                return new List<int> {1};
            }
    
            [BenchmarkCategory("One")]
            [Benchmark]
            public List<int> InitListWithSize1()
            {
                return new List<int>(1) {1};
            }
    
            [BenchmarkCategory("Five")]
            [Benchmark]
            public List<int> InitList5()
            {
                return new List<int> {1, 2, 3, 4, 5};
            }
            
            [BenchmarkCategory("Five")]
            [Benchmark]
            public List<int> InitListWithSize5()
            {
                return new List<int>(5) {1, 2, 3, 4, 5};
            }
            
            [BenchmarkCategory("Ten")]
            [Benchmark]
            public List<int> InitList10()
            {
                return new List<int> {1, 2, 3, 4, 5, 6, 7, 8, 9, 0};
            }
            
            [BenchmarkCategory("Ten")]
            [Benchmark]
            public List<int> InitListWithSize10()
            {
                return new List<int>(10) {1, 2, 3, 4, 5, 6, 7, 8, 9, 0};
            }
        }
    }
    
    
    BenchmarkDotNet v0.13.8, Windows 10 (10.0.19045.3930/22H2/2022Update)
    AMD Ryzen 7 7840H with Radeon 780M Graphics, 1 CPU, 16 logical and 8 physical cores
    .NET SDK 8.0.101
      [Host]                      : .NET 6.0.26 (6.0.2623.60508), X64 RyuJIT AVX2
      ShortRun-.NET 6.0           : .NET 6.0.26 (6.0.2623.60508), X64 RyuJIT AVX2
      ShortRun-.NET 8.0           : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
      ShortRun-.NET Framework 4.8 : .NET Framework 4.8.1 (4.8.9195.0), X64 RyuJIT VectorSize=256
    
    IterationCount=3  LaunchCount=1  WarmupCount=3  
    
    
    Method Job Runtime Mean Error StdDev Gen0 Allocated
    InitList1 ShortRun-.NET 6.0 .NET 6.0 10.599 ns 4.114 ns 0.2255 ns 0.0086 72 B
    InitListWithSize1 ShortRun-.NET 6.0 .NET 6.0 5.727 ns 1.351 ns 0.0741 ns 0.0076 64 B
    InitList1 ShortRun-.NET 8.0 .NET 8.0 9.515 ns 8.786 ns 0.4816 ns 0.0086 72 B
    InitListWithSize1 ShortRun-.NET 8.0 .NET 8.0 6.638 ns 4.749 ns 0.2603 ns 0.0076 64 B
    InitList1 ShortRun-.NET Framework 4.8 .NET Framework 4.8 14.523 ns 4.407 ns 0.2416 ns 0.0127 80 B
    InitListWithSize1 ShortRun-.NET Framework 4.8 .NET Framework 4.8 6.678 ns 6.997 ns 0.3835 ns 0.0115 72 B
    InitList5 ShortRun-.NET 6.0 .NET 6.0 22.426 ns 4.313 ns 0.2364 ns 0.0153 128 B
    InitListWithSize5 ShortRun-.NET 6.0 .NET 6.0 8.611 ns 5.117 ns 0.2805 ns 0.0096 80 B
    InitList5 ShortRun-.NET 8.0 .NET 8.0 23.910 ns 15.573 ns 0.8536 ns 0.0153 128 B
    InitListWithSize5 ShortRun-.NET 8.0 .NET 8.0 9.153 ns 9.765 ns 0.5352 ns 0.0096 80 B
    InitList5 ShortRun-.NET Framework 4.8 .NET Framework 4.8 33.884 ns 14.387 ns 0.7886 ns 0.0216 136 B
    InitListWithSize5 ShortRun-.NET Framework 4.8 .NET Framework 4.8 12.232 ns 2.418 ns 0.1325 ns 0.0140 88 B
    InitList10 ShortRun-.NET 6.0 .NET 6.0 37.512 ns 30.673 ns 1.6813 ns 0.0258 216 B
    InitListWithSize10 ShortRun-.NET 6.0 .NET 6.0 12.740 ns 14.638 ns 0.8024 ns 0.0115 96 B
    InitList10 ShortRun-.NET 8.0 .NET 8.0 38.765 ns 25.316 ns 1.3877 ns 0.0258 216 B
    InitListWithSize10 ShortRun-.NET 8.0 .NET 8.0 12.288 ns 11.804 ns 0.6470 ns 0.0115 96 B
    InitList10 ShortRun-.NET Framework 4.8 .NET Framework 4.8 54.156 ns 16.285 ns 0.8926 ns 0.0357 225 B
    InitListWithSize10 ShortRun-.NET Framework 4.8 .NET Framework 4.8 18.609 ns 2.785 ns 0.1527 ns 0.0166 104 B

    When we set initial capacity, it doesn't lead to unnecessary allocations and we see better performance. And there is no excess memory traffic. The more elements we add to the list, the more difference we see in benchmarks.

    Analyzer

    If performance is critical to your project, you should pay attention on these situations. Automatic diagnostics can help you. I support a set of roslyn-based diagnostics - Collections.Analyzer and now it can detects lists with collection initializer and without initial capacity.

    Opinion

    • If you need a static collection you won't change (add or remove elements) - use arrays.
    • If you create a list and you know exactly its future size - set initial capacity with this size.
    • If you create a list and you don't know its future size - set expected size.

    Links