Add "sortedPrefix(_:by)" to Collection (#9)
rakaramos authored Dec 4, 2020
1 parent e1c421c commit 3864606
Showing 6 changed files with 452 additions and 0 deletions.
Binary file added Guides/Resources/SortedPrefix/FewElements.png
Binary file added Guides/Resources/SortedPrefix/ManyElements.png
48 changes: 48 additions & 0 deletions Guides/SortedPrefix.md
@@ -0,0 +1,48 @@
# Sorted Prefix

[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/SortedPrefix.swift) |
[Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/PartialSortTests.swift)]

Returns the first k elements of this collection when it's sorted.

If you need to sort a collection but only need access to a prefix of its elements, using this method can give you a performance boost over sorting the entire collection. The order of equal elements is guaranteed to be preserved.

```swift
let numbers = [7,1,6,2,8,3,9]
let smallestThree = numbers.sortedPrefix(3, by: <)
// [1, 2, 3]
```

## Detailed Design

This adds the `Collection` method shown below:

```swift
extension Collection {
  public func sortedPrefix(_ count: Int, by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows -> [Element]
}
```

Additionally, a version of this method for `Comparable` types is also provided:

```swift
extension Collection where Element: Comparable {
  public func sortedPrefix(_ count: Int) -> [Element]
}
```
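
As a rough illustration of the contract, the result should match sorting the whole collection and then taking a prefix. The sketch below uses only the standard library; `referenceSortedPrefix(_:of:)` is a hypothetical helper written for this guide, not part of the package, and it assumes that a `count` larger than the collection simply yields every element in sorted order.

```swift
// Hypothetical reference helper (not part of swift-algorithms):
// sort everything, then keep at most `count` elements.
func referenceSortedPrefix<T: Comparable>(_ count: Int, of elements: [T]) -> [T] {
    // `prefix(_:)` requires a non-negative length, so clamp negative counts to 0.
    return Array(elements.sorted().prefix(max(count, 0)))
}

let sampleNumbers = [7, 1, 6, 2, 8, 3, 9]
let threeSmallest = referenceSortedPrefix(3, of: sampleNumbers)   // [1, 2, 3]
let allSorted = referenceSortedPrefix(10, of: sampleNumbers)      // [1, 2, 3, 6, 7, 8, 9]
let nothing = referenceSortedPrefix(0, of: sampleNumbers)         // []
```

A helper like this is convenient as an oracle in tests, since any faster implementation should agree with it on every input.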

### Complexity

The algorithm used is based on [Soroush Khanlou's research on this matter](https://khanlou.com/2018/12/analyzing-complexity/). The total complexity is `O(k log k + nk)`, which results in a runtime close to `O(n)` when k is small. When k is large (at least 10% of the collection), we fall back to sorting the entire collection and taking its prefix. Realistically, this means the worst case is `O(n log n)`.

Here are some benchmarks we made that demonstrate how this implementation (SmallestM) behaves as k increases (before implementing the fallback):

![Benchmark](Resources/SortedPrefix/FewElements.png)
![Benchmark 2](Resources/SortedPrefix/ManyElements.png)
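
The approach described in this section can be sketched as a stand-alone function. This is a minimal illustration under stated assumptions, not the package's source: `smallest(_:of:by:)` is a hypothetical name, it works on arrays only, and it uses a hand-written binary search in place of the package's `partitioningIndex`. It skips any element that is not strictly smaller than the largest element kept so far, so that equal elements keep their original relative order.

```swift
// Hypothetical stand-alone sketch of the insertion-based approach.
func smallest<T>(
    _ count: Int,
    of elements: [T],
    by areInIncreasingOrder: (T, T) -> Bool
) -> [T] {
    let k = min(count, elements.count)
    guard k > 0 else { return [] }

    // Fallback: when k is at least 10% of n, sort everything and take a prefix.
    guard k < elements.count / 10 else {
        return Array(elements.sorted(by: areInIncreasingOrder).prefix(k))
    }

    // Keep a sorted buffer holding the k smallest elements seen so far.
    var result = elements.prefix(k).sorted(by: areInIncreasingOrder)
    for element in elements.dropFirst(k) {
        // Skip anything that is not strictly smaller than the current largest.
        guard let last = result.last, areInIncreasingOrder(element, last) else {
            continue
        }
        // Binary search for the first position whose value is ordered after `element`.
        var low = 0
        var high = result.count
        while low < high {
            let mid = low + (high - low) / 2
            if areInIncreasingOrder(element, result[mid]) {
                high = mid
            } else {
                low = mid + 1
            }
        }
        // `low` is at most `result.count - 1`, so it stays valid after the removal.
        result.removeLast()
        result.insert(element, at: low)
    }
    return result
}

let descending = Array((1...100).reversed())
let lowestThree = smallest(3, of: descending, by: <)          // insertion path
let viaFallback = smallest(3, of: [7, 1, 6, 2, 8, 3, 9], by: <) // fallback path
```

Each insertion needs about `log k` comparisons but may shift up to k elements, which is where the `nk` term comes from.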

### Comparison with other languages

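**C++:** The `<algorithm>` header defines `std::partial_sort`, which rearranges a range in place so that its first k elements are the smallest ones in sorted order. The remaining elements are left in an unspecified order, and the order of equal elements is not guaranteed to be preserved. Implementations typically use a heap-based approach.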

**Python:** The `heapq` module provides heap operations on lists, and its `heapq.nsmallest(n, iterable)` function returns the n smallest elements in sorted order.
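
For comparison, a heap-based variant in the spirit of the approaches above might look like the following in Swift. This is an illustrative sketch, not what this package does: `heapSmallest(_:of:)` is a hypothetical name, and it keeps the k smallest elements in a bounded max-heap before sorting them. It does roughly `O(n log k)` work, and unlike `sortedPrefix` it makes no promise about the relative order of equal elements.

```swift
// Hypothetical heap-based alternative (not part of swift-algorithms).
func heapSmallest<T: Comparable>(_ count: Int, of elements: [T]) -> [T] {
    guard count > 0 else { return [] }
    var heap: [T] = []   // max-heap: heap[0] is the largest element kept so far

    for element in elements {
        if heap.count < count {
            // Still filling up: append, then sift the new element up.
            heap.append(element)
            var child = heap.count - 1
            while child > 0 {
                let parent = (child - 1) / 2
                guard heap[parent] < heap[child] else { break }
                heap.swapAt(parent, child)
                child = parent
            }
        } else if element < heap[0] {
            // Replace the current maximum, then sift the new root down.
            heap[0] = element
            var parent = 0
            while true {
                let left = 2 * parent + 1
                let right = left + 1
                var largest = parent
                if left < heap.count && heap[largest] < heap[left] { largest = left }
                if right < heap.count && heap[largest] < heap[right] { largest = right }
                if largest == parent { break }
                heap.swapAt(parent, largest)
                parent = largest
            }
        }
    }
    // The heap holds the right elements but not in sorted order.
    return heap.sorted()
}

let heapResult = heapSmallest(3, of: [7, 1, 6, 2, 8, 3, 9])   // [1, 2, 3]
```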

4 changes: 4 additions & 0 deletions README.md
@@ -28,6 +28,10 @@ Read more about the package, and the intent behind it, in the [announcement on s
- [`randomStableSample(count:)`, `randomStableSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection, preserving their original relative order.
- [`uniqued()`, `uniqued(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Unique.md): The unique elements of a collection, preserving their order.

#### Partial sorting

- [`sortedPrefix(_:by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/SortedPrefix.md): Returns the first k elements of a collection when it's sorted.

#### Other useful operations

- [`chunked(by:)`, `chunked(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on either a binary predicate or when the result of a projection changes.
99 changes: 99 additions & 0 deletions Sources/Algorithms/SortedPrefix.swift
@@ -0,0 +1,99 @@
//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift Algorithms open source project
//
// Copyright (c) 2020 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
//
//===----------------------------------------------------------------------===//

extension Collection {
  /// Returns the first k elements of this collection when it's sorted using
  /// the given predicate as the comparison between elements.
  ///
  /// This example partially sorts an array of integers to retrieve its three
  /// smallest values:
  ///
  ///     let numbers = [7,1,6,2,8,3,9]
  ///     let smallestThree = numbers.sortedPrefix(3, by: <)
  ///     // [1, 2, 3]
  ///
  /// If you need to sort a collection but only need access to a prefix of its
  /// elements, using this method can give you a performance boost over sorting
  /// the entire collection. The order of equal elements is guaranteed to be
  /// preserved.
  ///
  /// - Parameter count: The number of elements to return (k).
  /// - Parameter areInIncreasingOrder: A predicate that returns true if its
  /// first argument should be ordered before its second argument;
  /// otherwise, false.
  ///
  /// - Complexity: O(k log k + nk), or O(n log n) when `count` is at least
  /// 10% of the collection's length.
  public func sortedPrefix(
    _ count: Int,
    by areInIncreasingOrder: (Element, Element) throws -> Bool
  ) rethrows -> [Self.Element] {
    assert(count >= 0, """
      Cannot prefix with a negative number of elements!
      """
    )

    // Do nothing if we're prefixing nothing.
    guard count > 0 else {
      return []
    }

    // Make sure we are within bounds.
    let prefixCount = Swift.min(count, self.count)

    // If we're attempting to prefix 10% or more of the collection, it's
    // faster to sort everything.
    guard prefixCount < (self.count / 10) else {
      return Array(try sorted(by: areInIncreasingOrder).prefix(prefixCount))
    }

    var result = try self.prefix(prefixCount).sorted(by: areInIncreasingOrder)
    for e in self.dropFirst(prefixCount) {
      // Skip elements that are not strictly smaller than the largest element
      // kept so far; replacing an equal element would reorder equal elements.
      if let last = result.last, try !areInIncreasingOrder(e, last) {
        continue
      }
      // `e` is ordered before `last`, so the insertion index is always within
      // `result` and stays valid after the last element is removed.
      let insertionIndex =
        try result.partitioningIndex { try areInIncreasingOrder(e, $0) }
      result.removeLast()
      result.insert(e, at: insertionIndex)
    }

    return result
  }
}

extension Collection where Element: Comparable {
  /// Returns the first k elements of this collection when it's sorted in
  /// ascending order.
  ///
  /// This example partially sorts an array of integers to retrieve its three
  /// smallest values:
  ///
  ///     let numbers = [7,1,6,2,8,3,9]
  ///     let smallestThree = numbers.sortedPrefix(3)
  ///     // [1, 2, 3]
  ///
  /// If you need to sort a collection but only need access to a prefix of its
  /// elements, using this method can give you a performance boost over sorting
  /// the entire collection. The order of equal elements is guaranteed to be
  /// preserved.
  ///
  /// - Parameter count: The number of elements to return (k).
  ///
  /// - Complexity: O(k log k + nk)
  public func sortedPrefix(_ count: Int) -> [Element] {
    return sortedPrefix(count, by: <)
  }
}