9

Add PriorityQueue<T> to Collections · Issue #14032 · dotnet/runtime · GitH...

 3 years ago
source link: https://github.com/dotnet/runtime/issues/14032
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

See LATEST Proposal in corefxlab repo.

Second Proposal options

Proposal from https://github.com/dotnet/corefx/issues/574#issuecomment-307971397

Assumptions

Elements in priority queue are unique. If they are not, we would have to introduce 'handles' of items to enable their update/remove. Or the update/remove semantics would have to apply to first/all, which is weird.

Modeled after Queue<T> (MSDN link)

public class PriorityQueue<TElement, TPriority>
    : IEnumerable,
    IEnumerable<(TElement element, TPriority priority)>,
    IReadOnlyCollection<(TElement element, TPriority priority)>
    // ICollection not included on purpose
{
    public PriorityQueue();
    public PriorityQueue(IComparer<TPriority> comparer);

    public IComparer<TPriority> Comparer { get; }
    public int Count { get; }
    public bool IsEmpty { get; }

    public bool Contains(TElement element);

    // Peek & Dequeue
    public (TElement element, TPriority priority) Peek(); // Throws if empty
    public (TElement element, TPriority priority) Dequeue(); // Throws if empty
    public bool TryPeek(out TElement element, out TPriority priority); // Returns false if empty
    public bool TryDequeue(out TElement element, out TPriority priority); // Returns false if empty

    // Enqueue & Update
    public void Enqueue(TElement element, TPriority priority); // Throws if it is duplicate
    public void Update(TElement element, TPriority priority); // Throws if element does not exist
    public void EnqueueOrUpdate(TElement element, TPriority priority);
    public bool TryEnqueue(TElement element, TPriority priority); // Returns false if it is duplicate (does NOT update it)
    public bool TryUpdate(TElement element, TPriority priority); // Returns false if element does not exist (does NOT add it)
    
    public void Remove(TElement element); // Throws if element does not exist
    public bool TryRemove(TElement element); // Returns false if element does not exist

    public void Clear();

    public IEnumerator<(TElement element, TPriority priority)> GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator();

//
// Selector part
//
    public PriorityQueue(Func<TElement, TPriority> prioritySelector);
    public PriorityQueue(Func<TElement, TPriority> prioritySelector, IComparer<TPriority> comparer);

    public Func<TElement, TPriority> PrioritySelector { get; }

    public void Enqueue(TElement element);
    public void Update(TElement element);
}

Open questions:

  1. Class name PriorityQueue vs. Heap
  2. Introduce IHeap and constructor overload? (Should we wait for later?)
  3. Introduce IPriorityQueue? (Should we wait for later - IDictionary example)
  4. Use selector (of priority stored inside the value) or not (5 APIs difference)
  5. Use tuples (TElement element, TPriority priority) vs. KeyValuePair<TPriority, TElement>
    • Should Peek and Dequeue rather have out argument instead of tuple?
  6. Is Peek and Dequeue throwing useful at all?

Original Proposal

Issue https://github.com/dotnet/corefx/issues/163 requested the addition of a priority queue to the core .NET collection data structures.

This post, while a duplicate, is intended to act the formal submission to the corefx API Review Process. The issue contents are the speclet for a new System.Collections.Generic.PriorityQueue type.

I will be contributing the PR, if approved.

Rationale and Usage

The .NET Base Class Libraries (BCL) currently lacks support for ordered producer-consumer collections. A common requirement of many software applications is the ability generate a list of items over time and process them in an order different from the order they were received in.

There are three generic data structures within the System.Collections hierarchy of namespaces that supported a sorted collection of items; System.Collections.Generic.SortedList, System.Collections.Generic.SortedSet, and System.Collections.Generic.SortedDictionary.

Of these, SortedSet and SortedDictionary are not appropriate for producer-consumer patterns that generate duplicate values. The complexity of SortedList is Θ(n) worst case for both Add and Remove.

A much more memory and time efficient data structure for ordered collections with producer-consumer usage patterns is a priority queue. Other than when capacity resizing is necessary, worse case insertion (enqueue) and remove top (dequeue) performance is Θ(log n) - far better than the existing options that exist in the BCL.

Priority queues have a wide degree of applicability across different classes of applications. The Wikipedia page on Priority Queues offers a list of many different well understand use cases. While highly specialized implementations may still require custom priority queue implementations, a standard implementation would cover a broad range of usage scenarios. Priority queues are particularly useful in scheduling the output of multiple producers, which is an important pattern in highly parallelized software.

It's worth noting that both the C++ standard library and Java offer priority queue functionality as part of their basic APIs.

Proposed API

namespace System.Collections.Generic
{
    /// <summary>
    /// Represents a collection of objects that are removed in a sorted order.
    /// </summary>
    /// <typeparam name="T">Specifies the type of elements in the queue.</typeparam>    
    [DebuggerDisplay("Count = {count}")]
    [DebuggerTypeProxy(typeof(System_PriorityQueueDebugView<>))]
    public class PriorityQueue<T> : IEnumerable<T>, ICollection, IEnumerable, IReadOnlyCollection<T>
    {
        /// <summary>
        /// Initializes a new instance of the <see cref="PriorityQueue{T}"/> class 
        /// that uses a default comparer.
        /// </summary>
        public PriorityQueue();

        /// <summary>
        /// Initializes a new instance of the <see cref="PriorityQueue{T}"/> class 
        /// that has the specified initial capacity.
        /// </summary>
        /// <param name="capacity">The initial number of elements that the <see cref="PriorityQueue{T}"/> can contain.</param>
        /// <exception cref="T:System.ArgumentOutOfRangeException"><paramref name="capacity"/> is less than zero.</exception>
        public PriorityQueue(int capacity);

        /// <summary>
        /// Initializes a new instance of the <see cref="PriorityQueue{T}"/> class 
        /// that uses a specified comparer.
        /// </summary>
        /// <param name="comparer">The <see cref="T:System.Collections.Generic.IComparer{T}"/> to use when comparing elements.</param>
        /// <exception cref="T:System.ArgumentNullException"><paramref name="comparer"/> is null.</exception>
        public PriorityQueue(IComparer<T> comparer);

        /// <summary>
        /// Initializes a new instance of the <see cref="PriorityQueue{T}"/> class 
        /// that contains elements copied from the specified collection and uses a default comparer.
        /// </summary>
        /// <param name="collection">The collection whose elements are copied to the new <see cref="PriorityQueue{T}"/>.</param>
        /// <exception cref="T:System.ArgumentNullException"><paramref name="collection"/> is null.</exception>
        public PriorityQueue(IEnumerable<T> collection);

        /// <summary>
        /// Initializes a new instance of the <see cref="PriorityQueue{T}"/> class 
        /// that contains elements copied from the specified collection and uses a specified comparer.
        /// </summary>
        /// <param name="collection">The collection whose elements are copied to the new <see cref="PriorityQueue{T}"/>.</param>
        /// <param name="comparer">The <see cref="T:System.Collections.Generic.IComparer{T}"/> to use when comparing elements.</param>
        /// <exception cref="T:System.ArgumentNullException">
        /// <paramref name="collection"/> is null. -or-
        /// <paramref name="comparer"/> is null.
        /// </exception>
        public PriorityQueue(IEnumerable<T> collection, IComparer<T> comparer);

        /// <summary>
        /// Initializes a new instance of the <see cref="PriorityQueue{T}"/> class that is empty,
        /// has the specified initial capacity, and uses a specified comparer.
        /// </summary>
        /// <param name="capacity">The initial number of elements that the <see cref="PriorityQueue{T}"/> can contain.</param>
        /// <param name="comparer">The <see cref="T:System.Collections.Generic.IComparer{T}"/> to use when comparing elements.</param>
        /// <exception cref="T:System.ArgumentOutOfRangeException"><paramref name="capacity"/> is less than zero.</exception>
        /// <exception cref="T:System.ArgumentNullException"><paramref name="comparer"/> is null.</exception>
        public PriorityQueue(int capacity, IComparer<T> comparer);

        /// <summary>
        /// Gets the <see cref="IComparer{T}"/> for the <see cref="PriorityQueue{T}"/>. 
        /// </summary>
        /// <value>
        /// The <see cref="T:System.Collections.Generic.IComparer{T}"/> that is used when
        /// comparing elements in the <see cref="PriorityQueue{T}"/>. 
        /// </value>
        public IComparer<T> Comparer 
        { 
            get;
        }

        /// <summary>
        /// Gets the number of elements contained in the <see cref="PriorityQueue{T}"/>.
        /// </summary>
        /// <value>The number of elements contained in the <see cref="PriorityQueue{T}"/>.</value>
        public int Count 
        { 
            get;
        }

        /// <summary>
        /// Adds an object to the into the <see cref="PriorityQueue{T}"/> by its priority.
        /// </summary>
        /// <param name="item">
        /// The object to add to the <see cref="PriorityQueue{T}"/>. 
        /// The value can be null for reference types.
        /// </param>
        public void Enqueue(T item);

        /// <summary>
        /// Removes and returns the object with the lowest priority in the <see cref="PriorityQueue{T}"/>.
        /// </summary>
        /// <returns>The object with the lowest priority that is removed from the <see cref="PriorityQueue{T}"/>.</returns>
        /// <exception cref="InvalidOperationException">The <see cref="PriorityQueue{T}"/> is empty.</exception>
        public T Dequeue();

        /// <summary>
        /// Returns the object with the lowest priority in the <see cref="PriorityQueue{T}"/>.
        /// </summary>
        /// <exception cref="InvalidOperationException">The <see cref="PriorityQueue{T}"/> is empty.</exception>
        public T Peek();

        /// <summary>
        /// Removes all elements from the <see cref="PriorityQueue{T}"/>.
        /// </summary>
        public void Clear();

        /// <summary>
        /// Determines whether an element is in the <see cref="PriorityQueue{T}"/>.
        /// </summary>
        /// <param name="item">
        /// The object to add to the end of the <see cref="PriorityQueue{T}"/>. 
        /// The value can be null for reference types.
        /// </param>
        /// <returns>
        /// true if item is found in the <see cref="PriorityQueue{T}"/>;  otherwise, false.
        /// </returns>
        public bool Contains(T item);

        /// <summary>
        /// Copies the elements of the <see cref="PriorityQueue{T}"/> to an  <see cref="T:System.Array"/>, 
        /// starting at a particular <see cref="T:System.Array"/> index.
        /// </summary>
        /// <param name="array">
        /// The one-dimensional <see cref="T:System.Array">Array</see> that is the
        /// destination of the elements copied from the <see cref="PriorityQueue{T}"/>. 
        /// The <see cref="T:System.Array">Array</see> must have zero-based indexing.
        /// </param>
        /// <param name="arrayIndex">The zero-based index in <paramref name="array"/> at which copying begins.</param>
        /// <exception cref="T:System.ArgumentNullException"><paramref name="array"/> is null.</exception>
        /// <exception cref="T:System.ArgumentOutOfRangeException">
        /// <paramref name="arrayIndex"/> is less than zero. -or- 
        /// <paramref name="arrayIndex"/> is equal to or greater than the length of the <paramref name="array"/>
        /// </exception>
        /// <exception cref="ArgumentException">
        /// The number of elements in the source <see cref="T:System.Collections.ICollection"/> is
        /// greater than the available space from <paramref name="index"/> to the end of the destination
        /// <paramref name="array"/>.
        /// </exception>
        public void CopyTo(T[] array, int arrayIndex);

        /// <summary>
        /// Copies the elements of the <see cref="T:System.Collections.ICollection"/> to an 
        /// <see cref="T:System.Array"/>, starting at a particular <see cref="T:System.Array"/> index.
        /// </summary>
        /// <param name="array">
        /// The one-dimensional <see cref="T:System.Array">Array</see> that is the
        /// destination of the elements copied from the <see cref="PriorityQueue{T}"/>. 
        /// The <see cref="T:System.Array">Array</see> must have zero-based indexing.
        /// </param>
        /// <param name="index">The zero-based index in <paramref name="array"/> at which copying begins.</param>
        /// <exception cref="T:System.ArgumentNullException"><paramref name="array"/> is null.</exception>
        /// <exception cref="T:System.ArgumentOutOfRangeException"><paramref name="index"/> is less than zero.</exception>
        /// <exception cref="ArgumentException">
        /// <paramref name="array"/> is multidimensional. -or-
        /// <paramref name="array"/> does not have zero-based indexing. -or-
        /// <paramref name="index"/> is equal to or greater than the length of the <paramref name="array"/> -or- 
        /// The number of elements in the source <see cref="T:System.Collections.ICollection"/> is
        /// greater than the available space from <paramref name="index"/> to the end of the destination
        /// <paramref name="array"/>. -or- 
        /// The type of the source <see cref="T:System.Collections.ICollection"/> cannot be cast automatically 
        /// to the type of the destination <paramref name="array"/>.
        /// </exception>
        void ICollection.CopyTo(Array array, int index);

        /// <summary>
        /// Copies the elements stored in the <see cref="PriorityQueue{T}"/> to a new array.
        /// </summary>
        /// <returns>
        /// A new array containing a snapshot of elements copied from the <see cref="PriorityQueue{T}"/>.
        /// </returns>
        public T[] ToArray();

        /// <summary>
        /// Returns an enumerator that iterates through the <see cref="PriorityQueue{T}"/>
        /// </summary>
        /// <returns>An enumerator for the contents of the <see cref="PriorityQueue{T}"/>.</returns>
        public Enumerator GetEnumerator();

        /// <summary>
        /// Returns an enumerator that iterates through the <see cref="PriorityQueue{T}"/>
        /// </summary>
        /// <returns>An enumerator for the contents of the <see cref="PriorityQueue{T}"/>.</returns>
        IEnumerator<T> IEnumerable<T>.GetEnumerator();

        /// <summary>
        /// Returns an enumerator that iterates through the <see cref="PriorityQueue{T}"/>.
        /// </summary>
        /// <returns>An <see cref="T:System.Collections.IEnumerator"/> that can be used to iterate through the collection.</returns>
        IEnumerator IEnumerable.GetEnumerator();

        /// <summary>
        /// Sets the capacity to the actual number of elements in the <see cref="PriorityQueue{T}"/>, 
        /// if that number is less than than a threshold value.
        /// </summary>
        public void TrimExcess();

        /// <summary>
        /// Gets a value that indicates whether access to the <see cref="ICollection"/> is 
        /// synchronized with the SyncRoot.
        /// </summary>
        /// <value>true if access to the <see cref="T:System.Collections.ICollection"/> is synchronized
        /// with the SyncRoot; otherwise, false. For <see cref="PriorityQueue{T}"/>, this property always
        /// returns false.</value>
        bool ICollection.IsSynchronized
        {
            get;
        }

        /// <summary>
        /// Gets an object that can be used to synchronize access to the 
        /// <see cref="T:System.Collections.ICollection"/>.
        /// </summary>
        /// <value>
        /// An object that can be used to synchronize access to the 
        /// <see cref="T:System.Collections.ICollection"/>.
        /// </value>
        object ICollection.SyncRoot
        {
            get;
        }

        public struct Enumerator : IEnumerator<T>
        {
            public T Current { get; }
            object IEnumerator.Current { get; }
            public bool MoveNext();
            public void Reset();
            public void Dispose();
        }
    }
}

Details

  • Implementation data structure will be a binary heap. Items with a greater comparison value will be returned first. (descending order)
  • Time complexities:

Operation Complexity Notes Construct Θ(1)

Construct Using IEnumerable Θ(n)

Enqueue Θ(log n)

Dequeue Θ(log n)

Peek Θ(1)

Count Θ(1)

Clear Θ(N)

Contains Θ(N)

CopyTo Θ(N) Uses Array.Copy, actual complexity may be lower ToArray Θ(N) Uses Array.Copy, actual complexity may be lower GetEnumerator Θ(1)

Enumerator.MoveNext Θ(1)

  • Additional constructor overloads that take the System.Comparison delegate were intentionally omitted in favor of a simplified API surface area. Callers can use Comparer.Create to convert a function or Lambda expression to an IComparer interface if necessary. This does require the caller to incur a one-time heap allocation.
  • Although System.Collections.Generic is not yet part of corefx, I propose that this class be added to corefxlab in the meantime. It can be moved to the primary corefx repository once System.Collections.Generic are added and there is consensus that its status should be elevated from experimental to an official API.
  • An IsEmpty property was not included, since there is no additional performance penalty calling Count. The majority of the collection data structures do not include IsEmpty.
  • The IsSynchronized and SyncRoot properties of ICollection were implemented explicitly as they are effectively obsolete. This also follows the pattern used for the other System.Collection.Generic data structures.
  • Dequeue and Peek throw an InvalidOperationException when the queue is empty to match the established behavior of System.Collections.Queue.
  • IProducerConsumerCollection was not implemented as its documentation states that it is only intended for thread-safe collections.

Open Questions

  • Is avoiding an additional heap allocation during calls to GetEnumerator when using foreach a strong enough rationale for including the nested public enumerator structure?
  • Should CopyTo, ToArray, and GetEnumerator return results in prioritized (sorted) order, or the internal order used by the data structure? My assumption is that the internal order should be returned, as it doesn't incur any additional performance penalties. However, this is a potential usability issue if a developer thinks of the class as a "sorted queue" rather a priority queue.
  • Does adding a type named PriorityQueue to System.Collections.Generic cause a potentially breaking change? The namespace is heavily used, and could cause a source compatibility problem for projects that include their own priority queue type.
  • Should items be dequeued in ascending or descending order, based on the output of IComparer? (my assumption is ascending order, to match the normal sorting convention of IComparer).
  • Should the collection be 'stable'? In other words, should two items with equal IComparison results be dequeued in the exact same order they are enqueued in? (my assumption is this isn't needed)

Updates

  • Fixed complexity of 'Construct Using IEnumerable' to Θ(n). Thanks @svick.
  • Added another option question regarding whether the priority queue should be ordered in ascending or descending order compared to the IComparer.
  • Removed NotSupportedException from explicit SyncRoot property to match behavior of other System.Collection.Generic types instead of using the newer pattern.
  • Made the public GetEnumerator method return a nested Enumerator struct instead of IEnumerable, similar to the existing System.Collections.Generic types. This is an optimization to avoid a heap (GC) allocation when using a foreach loop.
  • Removed ComVisible attribute.
  • Changed complexity of Clear to Θ(n). Thanks @mbeidler.

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK