5

GA: 4. Quicksort

 2 years ago
source link: https://mingzhi198.github.io/p/ga-4.-quicksort/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Statement

  • This article is my study notes from the book: grokking algorithm: an illustrated guide for programmers and other curious people.
  • Please refer to the original work for more details and indicate the source for reprinting.

This chapter I learned divide-and-conquer and quicksort, both of which can help me solve problems elegently.

1. Divide & Conquer

It can take us some time to understand D&C. Therefore, we will see some problems to learn about it.

Example

If you are a farmer with a large land and plan to divide this farmland evenly into square plots.
Besides, the plots must be as big as possible.
Before continuing your reading, think about how can you solve this problem?

We can use D&C(Divde & Conquer) to solve this problem.
To solve a problem using D&C, we should follow two steps:

1. Figure out the base case, which should be the simplest possible case.
2. Divide or decrease you problem until it becomes the base case.

Suppose you have a land of 30 * 50, now you have the largest square with side of 30.
We have the rest of the land(20 * 30) and have the largest square of 20 * 20.
Then we are left with a land of 10 * 10, which is the answer.

  • Golang
func findMaxEvenSquareR(width, height int) int {
	if 0 == height || 0 == width {
		return height + width
	}

	if height > width {
		height = height % width
	} else {
		width = width % height
	}
	return findMaxEvenSquareR(width, height)
}

2. Exercises:

  1. Add up all numbers and return the total.
  • Python
def sumR(arr):
    if len(arr) == 1:
        return arr[0]
    return arr[0] + sumR(arr[1:])
  • Golang
func sumR(arr []int) (total int) {
	if len(arr) == 1 {
		return arr[0]
	}
	return arr[0] + sumR(arr[1:])
}
  1. Write a recursive function to count the number of elements in a list
  • Python
def countEleR(arr):
    if len(arr) == 1:
        return 1
    return 1 + countEleR(arr[1:])
  1. Find the maximum number in a list
  • Python
def findMaxR(arr):
    if len(arr) == 1:
        return arr[0]
    ret = findMaxR(arr[1:])
    if arr[0] > ret:
        return arr[0]
    else:
        return ret
  1. Write the D&C code for Binary Search
  • Python
def binarySearchR(list, item, low, high):
    if low > high:
        return None

    mid = (low + high) / 2
    guess = list[mid]
    if guess > item:
        high = mid -1
    else:
        low = mid + 1
    return binarySearchR(list, item, low, high)
  • Golang
func binarySearchR(list []int, item, low, high int) int {
	// 1. figure out the base case
	if low > high {
		return -1
	}

	//. 2. reduce the problem and get to the base case
	mid := (low + high) / 2
	guess := list[mid]
	if guess == item {
		return mid
	}
	if guess > item {
		high = mid - 1
	} else {
		low = mid + 1
	}
	return binarySearchR(list, item, low, high)
}

Tip: When we got a recursive function involving an array, the base case is often an empty array or an array with one element.

3. Quick Sort

3.1 Quicksort is a sorting algorithm, which is much faster than selection sort and commonly used in life.

The base case for quicksort are arrays that are empty and arrays with just one element.
Using D&C, we can break down a array until we get the base case.

  • Here are the steps for quick sort:
  1. Pick an element from the array as the pivot
  2. Find the elements smaller than the privot and the elements larger than it(This is called Partitioning).
  3. Call quicksort recursively on the two sub-arrays.
  4. The sub-arrays get sorted, and then we combine the whole thing to get a sorted array.

3.2 Here we can see one example from the book.

After partitioning, we get three parts: numbers less than the pivot, the pivot and numbers greater than the pivot.

quick_sort_m1.png

Next, we’ll quick sort the two sub-arrays.

Even though the two sub-arrays are not sorted, they’re partitioned.
However, sorting the whole array would be pretty easy, if they were sorted.

Then, we can combine the whole thing together, like this: left array + pivot + right array.
We do this recursively and will get a sorted array.
Here is the code, which would be easy to understand(In order to save space, We provide only the Golang code).

  • Golang
func quickSort(arr []int) (ret []int) {
	if len(arr) < 2 {
		ret = append(ret, arr...)
		return
	}

	pivot := arr[0]
	less, greater := []int{}, []int{}
	for _, v := range arr[1:] {
		if v <= pivot {
			less = append(less, v)
		} else {
			greater = append(greater, v)
		}
	}
	fmt.Println("less: ", less, ", pivot: ", pivot, ", greater: ", greater)
	lessRet := quickSort(less)
	greaterRet := quickSort(greater)

	ret = append(ret, lessRet...)
	ret = append(ret, pivot)
	ret = append(ret, greaterRet...)
	return
}

If you’d come through the explanation above, you would find this code easy to understand.
Nevertheless, it’s not efficent in memory.
Here you can see codes optimized by using no extra memory space.

func quickSortE(arr []int) {
	if len(arr) < 2 {
		return
	}

	l := len(arr)
	pivot := arr[l-1]
	j := 0 // idx of element bigger than pivot
	for i := 0; i < l-1; i++ {
		if arr[i] > pivot {
			continue
		}
		swap(arr, i, j)
		j++
	}
	swap(arr, j, l-1)
	quickSortE(arr[:j])
	quickSortE(arr[j+1:])
	return
}

func swap(arr []int, i, j int) {
	arr[i], arr[j] = arr[j], arr[i]
}

In order to have better performance, we use random element as the pivot:

func quickSortERand(arr []int) {
	if len(arr) < 2 {
		return
	}

	l := len(arr)
	rand.Seed(time.Now().UnixNano())
	pIdx := rand.Intn(l - 1)
	pivot := arr[pIdx]
	j := 0 // idx of element bigger than pivot
	for i := 0; i < l-1; i++ {
		if arr[i] > pivot {
			continue
		}
		swap(arr, i, j)
		j++
	}
	swap(arr, j, l-1)
	quickSortERand(arr[:j])
	quickSortERand(arr[j+1:])
	return
}

4. Big O notation

Quicksort is unique since its speed depends on the pivot we use.
Here we can see the most common Big O run times.

search_time_m1.png

Please pay attention to the example times in the chart because they are estimates of seconds per operation.
Compared with Merged Sort, which is O(nlogn) and faster, quicksort is different:
In the worst case, quicksorts takes O(n^2)time, which is as slow as selection sort.
In the average case, quicksorts takes O(nlogn)time.

5. Merge Sort vs. Quicksort

For algorithms with O(n), there are a constant for every operation.
Commonly, the constant is ignored, however, it’ll make a difference in some cases.

  • Quick sort has a smaller constant than merge sort.

Therefore, if they’re both O(nlogn)time, quicksort will be faster.
Besides, quicksort is faster in practice since it hits the average case way more often than the worst case.

6. Averge case vs. Worst case

The performance of quicksort is heavily affected by the pivot.
Suppose we always choose the first element as the pivot, and call quick with an sorted array.
Since quicksort doesn’t to see whether the input array is already sorted, it will still try to sort it.

As we can see in the following figure:

quick_sort_worst.png

Since we’re not splitting the array into two parts, one of the sub-arrays is always empty.
As we always get O(n) in every level of the stack, we finally get O(n)*O(n) = O(n^2) time.
Therefore, the call stack is so long that it takes O(n^2) time.

On the other hand, if we make the middle element as the pivot, the call stack would be much shorter(O(logn).
Therefore, we get operation time O(nlogn).

quick_sort_avg.png
  • However, the best case is also the average case.
  • If we choose a random element in the array as the pivot, quicksort will have O(nlogn)time on average.
Thank you for your effort to read this part of this passage.
What are included in this passage are my study notes about algorithm.
If you want to know more details about it, I strongly recommend you this book:
grokking algoriths: An illustrated guide for programmers and other curious people.
The author illustrates algorithms with pictures, which make people easily understand difficult questions.

Summary

  • D&C works by splitting a problem into smaller parts. And the base case is usually an array with no or only one element.
  • Choosing a random element as the pivot can make quicksort complete in O(nlog n) time on average.
  • Quicksort is faster than merge sort since the constant in Big O notation matters.
  • Since O(log n) is pretty faster than O(n) with a large list, the constant never matters for simple search versus binary search.

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK