LINQ: Any vs Count

Published on Saturday 16 June 2018

This post started from discussion on SO: Why LINQ method Any does not check Count? This article shows comparison of performance methods Any and Count = 0.

UPD: The second part of the article has been published: Any() vs Count: part 2, which compares implementations in new versions of .NET. This article is about .NET 4.7.1 only.

First of all, we create new extension method:

public static class EnumerableExtensions
{
    public static bool CustomAny<TSource>(this IEnumerable<TSource> source)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        if (source is ICollection<TSource> collection)
        {
            return collection.Count != 0;
        }

        using (var enumerator = source.GetEnumerator())
        {
            if (enumerator.MoveNext())
                return true;
        }

        return false;
    }
}

This method checks Count property for ICollection<TSource> type instead of creating enumerator. Let's check if it will be faster than Any.

For benchmarking we use BenchmarkDotNet.

Collections to test:

  • List
  • Array
  • HashSet
  • Dictionary
  • Stack
  • Queue
  • SortedList
  • SortedSet

For testing we create collections with 1000 elements and compare methods Any and CustomAny:

[CategoriesColumn]
[GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByCategory)]
public class SimpleCollections
{
    [Params(1000)]
    public int _size;

    private List<int> _list;
    private int[] _array;
    private HashSet<int> _hashSet;
    private Dictionary<int, int> _dictionary;
    private Stack<int> _stack;
    private Queue<int> _queue;
    private SortedList<int, int> _sortedList;
    private SortedSet<int> _sortedSet;

    [GlobalSetup]
    public void SetUp()
    {
        _list = new List<int>();
        _array = new int[_size];
        _hashSet = new HashSet<int>();
        _dictionary = new Dictionary<int, int>();
        _stack = new Stack<int>();
        _queue = new Queue<int>();
        _sortedList = new SortedList<int, int>();
        _sortedSet = new SortedSet<int>();

        for (int i = 0; i < _size; i++)
        {
            _list.Add(i);
            _array[i] = i;
            _hashSet.Add(i);
            _dictionary[i] = i;
            _stack.Push(i);
            _queue.Enqueue(i);
            _sortedList.Add(i, i);
            _sortedSet.Add(i);
        }
    }
    
    [Benchmark(Baseline = true, Description = "Any"), BenchmarkCategory("List")]
    public bool ListAny()
    {
        return _list.Any();
    }

    [Benchmark(Description = "Custom"), BenchmarkCategory("List")]
    public bool ListCount()
    {
        return _list.CustomAny();
    }
    
    ///other collections
}

Benchmarks are grouped by the category - type of collection. Method Any is default Enumerable.Any, Custom - our custom implementation EnumerableExtensions.CustomAny.

Any is a baseline benchmark in the category.

The first run result

BenchmarkDotNet=v0.10.14, OS=Windows 10.0.17134
Intel Core i7-2670QM CPU 2.20GHz (Sandy Bridge), 1 CPU, 8 logical and 4 physical cores
Frequency=2143565 Hz, Resolution=466.5126 ns, Timer=TSC
  [Host]     : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.3110.0
  DefaultJob : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.3110.0
Method Categories _size Mean Median Scaled ScaledSD
Any List 1000 38.88 ns 38.32 ns 1.00 0.00
Custom List 1000 16.49 ns 16.50 ns 0.43 0.02
Any Array 1000 27.46 ns 27.39 ns 1.00 0.00
Custom Array 1000 51.21 ns 50.60 ns 1.87 0.05
Any HashSet 1000 39.87 ns 39.89 ns 1.00 0.00
Custom HashSet 1000 15.06 ns 15.01 ns 0.38 0.00
Any Dictionary 1000 51.14 ns 50.83 ns 1.00 0.00
Custom Dictionary 1000 17.41 ns 17.36 ns 0.34 0.01
Any Stack 1000 42.10 ns 42.29 ns 1.00 0.00
Custom Stack 1000 49.41 ns 48.60 ns 1.17 0.06
Any Queue 1000 44.68 ns 44.41 ns 1.00 0.00
Custom Queue 1000 50.68 ns 50.66 ns 1.14 0.04
Any SortedList 1000 42.89 ns 42.55 ns 1.00 0.00
Custom SortedList 1000 17.48 ns 17.17 ns 0.41 0.02
Any SortedSet 1000 220.19 ns 219.66 ns 1.00 0.00
Custom SortedSet 1000 18.46 ns 18.34 ns 0.08 0.00

Here and later this legend is used:

  Categories : Collection type
  _size      : Number of elements in collection
  Mean       : Arithmetic mean of all measurements
  Error      : Half of 99.9% confidence interval
  StdDev     : Standard deviation of all measurements
  Median     : Value separating the higher half of all measurements (50th percentile)
  Scaled     : Mean(CurrentBenchmark) / Mean(BaselineBenchmark)
  ScaledSD   : Standard deviation of ratio of distribution of [CurrentBenchmark] and [BaselineBenchmark]
  1 ns       : 1 Nanosecond (0.000000001 sec)

As we see, in common the new method with Count is faster than the old one or the same, except some collections. Let's examine them in details.

Array

Our new method gives much worse results. Why? Answer is not obvious. Array can be converted to ICollection<T>, but it takes much time due to internal implementation of array in .NET.

So, we need to add array check:

public static class EnumerableExtensions
{
    public static bool CustomAny<TSource>(this IEnumerable<TSource> source)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        if (source is TSource[] array)
        {
            return array.Length != 0;
        }

        if (source is ICollection<TSource> collection)
        {
            return collection.Count != 0;
        }

        using (var enumerator = source.GetEnumerator())
        {
            if (enumerator.MoveNext())
                return true;
        }

        return false;
    }
}

Stack and Queue

These collections do not implement ICollection<T> interface:

public class Stack<T> : IEnumerable<T>, IEnumerable, ICollection, IReadOnlyCollection<T>
public class Queue<T> : IEnumerable<T>, IEnumerable, ICollection, IReadOnlyCollection<T>

But they have ICollection, we can use it as the most common:

public static class EnumerableExtensions
{
    public static bool CustomAny<TSource>(this IEnumerable<TSource> source)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        if (source is TSource[] array)
        {
            return array.Length != 0;
        }

        if (source is ICollection<TSource> collection)
        {
            return collection.Count != 0;
        }

        if (source is ICollection baseCollection)
        {
            return baseCollection.Count != 0;
        }

        using (var enumerator = source.GetEnumerator())
        {
            if (enumerator.MoveNext())
                return true;
        }

        return false;
    }
}

The second run result

Method Categories _size Mean Scaled ScaledSD ScaledSD
Any List 1000 37.52 ns 1.00 0.00 0.00
Custom List 1000 19.69 ns 0.53 0.01 0.02
Any Array 1000 27.11 ns 1.00 0.00 0.00
Custom Array 1000 18.87 ns 0.70 0.02 0.05
Any HashSet 1000 39.42 ns 1.00 0.00 0.00
Custom HashSet 1000 18.76 ns 0.48 0.00 0.00
Any Dictionary 1000 50.13 ns 1.00 0.00 0.00
Custom Dictionary 1000 19.95 ns 0.40 0.01 0.01
Any Stack 1000 39.56 ns 1.00 0.00 0.00
Custom Stack 1000 31.66 ns 0.80 0.01 0.06
Any Queue 1000 42.90 ns 1.00 0.00 0.00
Custom Queue 1000 31.69 ns 0.74 0.00 0.04
Any SortedList 1000 40.53 ns 1.00 0.00 0.00
Custom SortedList 1000 19.71 ns 0.49 0.00 0.02
Any SortedSet 1000 219.88 ns 1.00 0.00 0.00
Custom SortedSet 1000 20.68 ns 0.09 0.00 0.00

Ok, in all tests our method is better than default one. But there is one collection, where result is ~10 times better - SortedSet.

SortedSet

Why method Any dramatically worse than Count? As we know, Any just creates enumerator and tries to take the first element. Let's look at source code of SortedSet:

IEnumerator<T> IEnumerable<T>.GetEnumerator()
{
  return (IEnumerator<T>) new SortedSet<T>.Enumerator(this);
}

GetEnumerator creates new struct of type Enumerator, constructor gets SortedSet<T> as a parameter:

internal Enumerator(SortedSet<T> set)
{
  this.tree = set;
  this.tree.VersionCheck();
  this.version = this.tree.version;
  this.stack = new Stack<SortedSet<T>.Node>(2 * SortedSet<T>.log2(set.Count + 1));
  this.current = (SortedSet<T>.Node) null;
  this.reverse = false;
  this.siInfo = (SerializationInfo) null;
  this.Intialize();
}

Open method Intialize (misprint in source code):

private void Intialize()
{
  this.current = (SortedSet<T>.Node) null;
  SortedSet<T>.Node node1 = this.tree.root;
  while (node1 != null)
  {
    SortedSet<T>.Node node2 = this.reverse ? node1.Right : node1.Left;
    SortedSet<T>.Node node3 = this.reverse ? node1.Left : node1.Right;
    if (this.tree.IsWithinRange(node1.Item))
    {
      this.stack.Push(node1);
      node1 = node2;
    }
    else
      node1 = node2 == null || !this.tree.IsWithinRange(node2.Item) ? node3 : node2;
  }
}

It is worth noting that SortedSet is built on the binary tree, and method Intialize bypasses all nodes in tree and puts elements into the stack. So, when we create enumerator of SortedSet, internal implementation creates new collection with all elements. The more elements we have, the more time it takes to copy:

Method _size Mean Scaled
Any 1000 230.64 ns 1.00
Custom 1000 22.23 ns 0.10
Any 10000 288.27 ns 1.00
Custom 10000 21.72 ns 0.08
Any 100000 343.54 ns 1.00
Custom 100000 21.70 ns 0.06

Ok, we checked the most popular collections, and we can use Count property instead of Enumerable.Any. But is it safe to use Count every time? Is there a collection, where Count takes more time than creating enumerator? You may be wondering.

Concurrent collections

All collections we checked before are not thread-safe, and can not be used in multi-threading applications. There is a special class of collections - concurrent collections:

  • ConcurrentBag
  • ConcurrentDictionary
  • ConcurrentQueue
  • ConcurrentStack

Let's fill collections from different threads to simulate multi-threading environment:

[GlobalSetup]
public void SetUp()
{
    _bag = new ConcurrentBag<int>();
    _dictionary = new ConcurrentDictionary<int, int>();
    _queue = new ConcurrentQueue<int>();
    _stack = new ConcurrentStack<int>();

    var tasksCount = 10;
    var batch = _size / tasksCount;

    var tasks = new Task[tasksCount];

    for (int i = 0; i < tasksCount; i++)
    {
        var task = i;

        tasks[task] = Task.Run(() =>
        {
            var from = task * batch;
            var to = (task + 1) * batch;

            for (int j = from; j < to; j++)
            {
                _bag.Add(j);
                _dictionary[j] = j;
                _queue.Enqueue(j);
                _stack.Push(j);
            }
        });
    }

    Task.WaitAll(tasks);
}

Concurrent collections results

Method Categories _size Mean Scaled ScaledSD
Any ConcurrentBag 1000 7,065.75 ns 1.00 0.00
Custom ConcurrentBag 1000 267.89 ns 0.04 0.00
Any ConcurrentDictionary 1000 39.29 ns 1.00 0.00
Custom ConcurrentDictionary 1000 6,319.42 ns 160.88 2.45
Any ConcurrentStack 1000 28.08 ns 1.00 0.00
Custom ConcurrentStack 1000 3,179.18 ns 113.23 0.93
Any ConcurrentQueue 1000 72.12 ns 1.00 0.00
Custom ConcurrentQueue 1000 48.28 ns 0.67 0.02

We have got unpredictable results. Somewhere Any is very slow, and vice versa.

ConcurrentBag

Method GetEnumerator of ConcurrentBag works like in SortedSet. It locks collection and copies all elements into new list:

private List<T> ToList()
{
  List<T> objList = new List<T>();
  for (ConcurrentBag<T>.ThreadLocalList threadLocalList = this.m_headList; threadLocalList != null; threadLocalList = threadLocalList.m_nextList)
  {
    for (ConcurrentBag<T>.Node node = threadLocalList.m_head; node != null; node = node.m_next)
      objList.Add(node.m_value);
  }
  return objList;
}

Count property locks collection and just sum up all internal lists counts:

private int GetCountInternal()
{
  int num = 0;
  for (ConcurrentBag<T>.ThreadLocalList threadLocalList = this.m_headList; threadLocalList != null; threadLocalList = threadLocalList.m_nextList)
    checked { num += threadLocalList.Count; }
  return num;
}

So, Count is much faster than Any.

ConcurrentDictionary

GetEnumerator method does not create new collection and uses yield to iterate through collection, so, Any just do one iteration:

public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
{
  foreach (ConcurrentDictionary<TKey, TValue>.Node bucket in this.m_tables.m_buckets)
  {
    ConcurrentDictionary<TKey, TValue>.Node current;
    for (current = Volatile.Read<ConcurrentDictionary<TKey, TValue>.Node>(ref bucket); current != null; current = current.m_next)
      yield return new KeyValuePair<TKey, TValue>(current.m_key, current.m_value);
    current = (ConcurrentDictionary<TKey, TValue>.Node) null;
  }
}

Count locks collection and sums up counts of internal tables:

public int Count
{
  [__DynamicallyInvokable] get
  {
    int locksAcquired = 0;
    try
    {
      this.AcquireAllLocks(ref locksAcquired);
      return this.GetCountInternal();
    }
    finally
    {
      this.ReleaseLocks(0, locksAcquired);
    }
  }
}

private int GetCountInternal()
{
  int num = 0;
  for (int index = 0; index < this.m_tables.m_countPerLock.Length; ++index)
    num += this.m_tables.m_countPerLock[index];
  return num;
}

And the more elements we have the more time it takes:

Method _size Mean Scaled ScaledSD
Any 10 37.79 ns 1.00 0.00
Custom 10 239.63 ns 6.34 0.18
Any 100 34.46 ns 1.00 0.00
Custom 100 823.62 ns 23.90 0.58
Any 1000 38.22 ns 1.00 0.00
Custom 1000 6,264.17 ns 163.92 2.09
Any 10000 35.57 ns 1.00 0.00
Custom 10000 24,819.31 ns 697.82 11.97

Profiler shows that locking takes a lot of time:

Dictionary

ConcurrentStack

GetEnumerator uses lazy enumeration and it's cheap to use:

private IEnumerator<T> GetEnumerator(ConcurrentStack<T>.Node head)
{
  for (ConcurrentStack<T>.Node current = head; current != null; current = current.m_next)
    yield return current.m_value;
}

But Count just counts all elements even without locking:

public int Count
{
  [__DynamicallyInvokable] get
  {
    int num = 0;
    for (ConcurrentStack<T>.Node node = this.m_head; node != null; node = node.m_next)
      ++num;
    return num;
  }
}

ConcurrentQueue

GetEnumerator implementation is a bit complicated, but it uses lazy yield internal. Count uses simple calculations without enumerating all elements. We can use both methods.

As we see, concurrent collections are complicated and internal implementation differs very much.

Universal method for everything?

No, it's impossible. We should use Count or Any methods depending on collection.

Links

Source code of performance tests