We all used to type new List<int> { 1, 2, 3, 4 }
or new int[] { 1, 2, 3, 4}
to initialize collections with some values. It looks similar in syntax but differs in behavior and you should be careful if you are concerned about performance.
Array
As we know, arrays contains sequence of elements with a fixed size. Once it was created it can't be changed during the lifetime of the instance.
var array = new int[] { 1, 2, 3, 4};
List<T>
When we don't know final size of a collection or we need add/delete elements during the lifetime, it's suitable to use the List<T>
type.
var list = new List<int>();
list.Add(1);
list.Add(2);
At first glance it may seem that we can always use the list instead of the array - it has all array's features, but also can be dynamically changed. But to decide whether to use the list we need to know more about it internal structure.
Part of the source code:
public class List<T> : IList<T>, IList, IReadOnlyList<T>
{
private const int DefaultCapacity = 4;
internal T[] _items;
internal int _size;
private static readonly T[] s_emptyArray = new T[0];
// Constructs a List. The list is initially empty and has a capacity
// of zero. Upon adding the first element to the list the capacity is
// increased to DefaultCapacity, and then increased in multiples of two
// as required.
public List()
{
_items = s_emptyArray;
}
...
public int Capacity
{
get => _items.Length;
set {...}
}
public int Count => _size;
}
_items
- array with stored elements;_size
- count of elements stored in the list and size of the whole list;Capacity
- size of_items
and maximum amount of elements can be stored without resize.
List resize
To put it simply, we can say that the list is just the array that can resize when need.
Every time we try to add one more item the list ensures _items
has enough free space, otherwise it sets new capacity:
internal void Grow(int capacity)
{
...
int newCapacity = _items.Length == 0 ? DefaultCapacity : 2 * _items.Length;
...
Capacity = newCapacity;
}
Let's look closer to the Capacity
property:
// Gets and sets the capacity of this list. The capacity is the size of
// the internal array used to hold items. When set, the internal
// array of the list is reallocated to the given capacity.
public int Capacity
{
get => _items.Length;
set
{
if (value < _size)
{
ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.value, ExceptionResource.ArgumentOutOfRange_SmallCapacity);
}
if (value != _items.Length)
{
if (value > 0)
{
T[] newItems = new T[value];
if (_size > 0)
{
Array.Copy(_items, newItems, _size);
}
_items = newItems;
}
else
{
_items = s_emptyArray;
}
}
}
}
So, what do we see? Initially every list is created with an empty internal array. After adding first element the list creates the new array for 4 elements (DefaultCapacity
is 4). And when the current array is exhausted, the new one is created with doubled size, and all elements are copied.
Performance
What happens, when we create new list and initialize it with some values?
var list = new List<int> { 1, 2, 3, 4, 5 };
It looks like array initialization, but works very different way. According to the documentation any type that implements IEnumerable
and has Add
method can be used with collection initializer.
So the previous example is just short form of serial calls of Add
method:
var list = new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
list.Add(5);
The compiler just transforms short collection initializer and adds necessary calls automatically. And this may cause performance degradation.
Let's take a closer look at the process of adding 5 elements:
- Before the first
Add
call the list is empty, the internal array is empty. - First added element creates the new internal array for 4 elements.
- Elements 2, 3, 4 change nothing.
- When we add the fifth element, the internal array is full and needs to be resized. The new array of size 8 is created, all elements are copied from the previous array and the fifth element is added.
At the and we have the list with 5 elements and internal array for 8 elements. At the same time, we created 2 arrays, and the final one wastes 37.5% of it's space for nothing.
As you might guess, creating new arrays and copying elements causes memory allocations and takes additional time.
This could be an unpleasant surprise for critical places. Do we have a solution? Yes!
Capacity
If we know or suggest the final size of the list, we can create it with initial capacity.
var list = new List<int>(5);
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
list.Add(5);
Or
var list = new List<int>(5) { 1, 2, 3, 4, 5 };
Now we immediately create one array for 5 elements and no more unnecessary allocations.
Benchmarks
Let's compare creating lists with initial capacity and without it.
Benchmark code. Click to expand.
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;
namespace Benchmarks
{
[ShortRunJob(RuntimeMoniker.Net48)]
[ShortRunJob(RuntimeMoniker.Net60)]
[ShortRunJob(RuntimeMoniker.Net80)]
[MemoryDiagnoser]
public class InitListBenchmark
{
[BenchmarkCategory("One")]
[Benchmark]
public List<int> InitList1()
{
return new List<int> {1};
}
[BenchmarkCategory("One")]
[Benchmark]
public List<int> InitListWithSize1()
{
return new List<int>(1) {1};
}
[BenchmarkCategory("Five")]
[Benchmark]
public List<int> InitList5()
{
return new List<int> {1, 2, 3, 4, 5};
}
[BenchmarkCategory("Five")]
[Benchmark]
public List<int> InitListWithSize5()
{
return new List<int>(5) {1, 2, 3, 4, 5};
}
[BenchmarkCategory("Ten")]
[Benchmark]
public List<int> InitList10()
{
return new List<int> {1, 2, 3, 4, 5, 6, 7, 8, 9, 0};
}
[BenchmarkCategory("Ten")]
[Benchmark]
public List<int> InitListWithSize10()
{
return new List<int>(10) {1, 2, 3, 4, 5, 6, 7, 8, 9, 0};
}
}
}
BenchmarkDotNet v0.13.8, Windows 10 (10.0.19045.3930/22H2/2022Update)
AMD Ryzen 7 7840H with Radeon 780M Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.101
[Host] : .NET 6.0.26 (6.0.2623.60508), X64 RyuJIT AVX2
ShortRun-.NET 6.0 : .NET 6.0.26 (6.0.2623.60508), X64 RyuJIT AVX2
ShortRun-.NET 8.0 : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
ShortRun-.NET Framework 4.8 : .NET Framework 4.8.1 (4.8.9195.0), X64 RyuJIT VectorSize=256
IterationCount=3 LaunchCount=1 WarmupCount=3
Method | Job | Runtime | Mean | Error | StdDev | Gen0 | Allocated |
---|---|---|---|---|---|---|---|
InitList1 | ShortRun-.NET 6.0 | .NET 6.0 | 10.599 ns | 4.114 ns | 0.2255 ns | 0.0086 | 72 B |
InitListWithSize1 | ShortRun-.NET 6.0 | .NET 6.0 | 5.727 ns | 1.351 ns | 0.0741 ns | 0.0076 | 64 B |
InitList1 | ShortRun-.NET 8.0 | .NET 8.0 | 9.515 ns | 8.786 ns | 0.4816 ns | 0.0086 | 72 B |
InitListWithSize1 | ShortRun-.NET 8.0 | .NET 8.0 | 6.638 ns | 4.749 ns | 0.2603 ns | 0.0076 | 64 B |
InitList1 | ShortRun-.NET Framework 4.8 | .NET Framework 4.8 | 14.523 ns | 4.407 ns | 0.2416 ns | 0.0127 | 80 B |
InitListWithSize1 | ShortRun-.NET Framework 4.8 | .NET Framework 4.8 | 6.678 ns | 6.997 ns | 0.3835 ns | 0.0115 | 72 B |
InitList5 | ShortRun-.NET 6.0 | .NET 6.0 | 22.426 ns | 4.313 ns | 0.2364 ns | 0.0153 | 128 B |
InitListWithSize5 | ShortRun-.NET 6.0 | .NET 6.0 | 8.611 ns | 5.117 ns | 0.2805 ns | 0.0096 | 80 B |
InitList5 | ShortRun-.NET 8.0 | .NET 8.0 | 23.910 ns | 15.573 ns | 0.8536 ns | 0.0153 | 128 B |
InitListWithSize5 | ShortRun-.NET 8.0 | .NET 8.0 | 9.153 ns | 9.765 ns | 0.5352 ns | 0.0096 | 80 B |
InitList5 | ShortRun-.NET Framework 4.8 | .NET Framework 4.8 | 33.884 ns | 14.387 ns | 0.7886 ns | 0.0216 | 136 B |
InitListWithSize5 | ShortRun-.NET Framework 4.8 | .NET Framework 4.8 | 12.232 ns | 2.418 ns | 0.1325 ns | 0.0140 | 88 B |
InitList10 | ShortRun-.NET 6.0 | .NET 6.0 | 37.512 ns | 30.673 ns | 1.6813 ns | 0.0258 | 216 B |
InitListWithSize10 | ShortRun-.NET 6.0 | .NET 6.0 | 12.740 ns | 14.638 ns | 0.8024 ns | 0.0115 | 96 B |
InitList10 | ShortRun-.NET 8.0 | .NET 8.0 | 38.765 ns | 25.316 ns | 1.3877 ns | 0.0258 | 216 B |
InitListWithSize10 | ShortRun-.NET 8.0 | .NET 8.0 | 12.288 ns | 11.804 ns | 0.6470 ns | 0.0115 | 96 B |
InitList10 | ShortRun-.NET Framework 4.8 | .NET Framework 4.8 | 54.156 ns | 16.285 ns | 0.8926 ns | 0.0357 | 225 B |
InitListWithSize10 | ShortRun-.NET Framework 4.8 | .NET Framework 4.8 | 18.609 ns | 2.785 ns | 0.1527 ns | 0.0166 | 104 B |
When we set initial capacity, it doesn't lead to unnecessary allocations and we see better performance. And there is no excess memory traffic. The more elements we add to the list, the more difference we see in benchmarks.
Analyzer
If performance is critical to your project, you should pay attention on these situations. Automatic diagnostics can help you. I support a set of roslyn-based diagnostics - Collections.Analyzer and now it can detects lists with collection initializer and without initial capacity.
Opinion
- If you need a static collection you won't change (add or remove elements) - use arrays.
- If you create a list and you know exactly its future size - set initial capacity with this size.
- If you create a list and you don't know its future size - set expected size.