CPT204-Advanced OO Programming: Sorting排序
目录
1.排序算法
1.1冒泡排序Bubble Sort
1.2归并排序Merge Sort
1.3快速排序Quick Sort
1.4堆排序:Heap Sort:
1.4.1二叉树Binary tree
1.4.2完全二叉树Complete Binary Tree
1.4.3二叉堆Binary Heap
1.4.4堆排序Heap sort
1.4.4.1存储堆Storing a Heap
1.4.4.2向堆中添加元素Adding Elements to a Heap
1.4.4.3移除根节点并重建堆Removing the Root and Rebuild the Heap
1.4.5堆类
1.4.6时间复杂度
1.排序算法
1.1冒泡排序Bubble Sort
- 定义:对待排序的列表反复进行遍历,比较每一对相邻元素,如果它们的顺序错误就将它们交换。Repeatedly steps through the list to be sorted, compares each pair of adjacent items and swaps them ifthey are in the wrong order
- 问题:如果列表几乎有序或已经有序,典型的冒泡排序会在 k 小于 list.length 时一直进行遍历,即使在这些遍历中实际上没有任何元素需要交换,也是在浪费时间和资源。A typical bubble sort would keep doing passes as long as k < list.length, even though no swap is actually needed in any of these passes. Just wasting time and resources.
- 优化:Optimization
·示例: [1, 2, 3, 4] --> 不会存在 list[i] > list[i + 1] 的情况;代码块中的 if 语句不会执行,因此 needNextPass 保持为 false;在第二趟(k = 2)时,由于 `k < list.length && needNextPass` 中 needNextPass 为 false,所以停止排序。
- 时间复杂度分析
·在最优情况下,冒泡排序算法只需要第一趟就能发现数组已经有序——不需要进行下一趟。由于第一趟中比较的次数是 n - 1,所以冒泡排序的最优时间复杂度为 O(n)。In the best case, the bubble sort algorithm needsjust the first pass to find that the array is already sorted—no next pass is needed. Since the number of comparisons is n - 1 in the first pass, the best- case time for a bubble sort is O(n).
·最坏情况示例:初始列表:[5, 4, 3, 2, 1]
>因此,对于 5 个元素,我们总共做了 4 + 3 + 2 + 1 次比较。如果有 n 个元素,在最坏情况下,我们需要比较 (n - 1) + (n - 2) + ... + 2 + 1 = n/(n-1)/2次。So, 5 elements, we do 4+3+2+1 comaprisons. If n elements, in the worst case, we compare (n - 1) + (n - 2) + ... + 2 + 1 = n(n - 1) / 2 times.
·因此,冒泡排序在最坏情况下的时间复杂度为 $O(n^2)$。
- 代码:
1.2归并排序Merge Sort
- 定义:
The merge-sort algorithm can be described recursively as follows:
·算法将数组分成两个子数组,对每个子数组分别递归地执行归并排序。The algorithm divides the array into two halves and applies a merge sort on each half recursively.
·当两个子数组都排序完成后,算法将它们合并。After the two halves are sorted, the algorithm then merges them.
- 代码:
public static void mergeSort(int[ ] list) {
if (list.length > 1) {// Recursive base case: stop when condition unsatisfied
// Split the 1st half (recursive step)
int[] firstHalf = new int[list.length / 2];
System.arraycopy(list,0,firstHalf,0,list.length / 2);
mergeSort(firstHalf);
// Split the 2nd half (recursive step)
int secondHalfLength = list.length - list.length / 2;
int[ ] secondHalf = new int[secondHalfLength] ;
System.arraycopy(list,list.length / 2 ,secondHalf, 0 , secondHalfLength);
mergeSort(secondHalf);
// SortMerge (only happens AFTER both recursive calls finish)
merge( firstHalf, secondHalf, list) ;
}
}
public static void merge(int[] list1, int[] list2, int[] temp){
int current1 = 0; // Current index in list1, the first half
int current2 = 0; // Current index in list2, the 2nd half
int current3 = 0; // Current index in temp, storing data temporarily
// While the indices are in the list
while (current1 < list1.length && current2 < list2.length) {
if (list1[current1] < list2[current2])
// If current element in list1 is smaller, add it to temp
temp[current3++] = list1[current1++];
else
// Otherwise, add the current element in list2 to temp
temp[current3++] = list2[current2++];
}
// list2 finished, but there are remaining elements in list1, add them to temp
while (current1 < list1.length)
temp[current3++] = list1[current1++];
// list1 finished, but there are remaining elements in list2, add them to temp
while (current2 < list2.length)
temp[current3++] = list2[current2++];
}
- 时间复杂度:
·设 T(n)为使用归并排序对n个元素的数组进行排序所需的时间 Let T(n) denote the time required for sorting an array of n elements using merge sort.
>第一个 T(n/2)是对数组前半部分进行排序的时间,第二个 T(n/2)是对数组后半部分进行排序的时间。The first T(n/2) is the time for sorting the first half ofthe array and the second T(n/2) is the time for sorting the second half
>合并的代价为 2n - 1,其中包含 n - 1次比较(用于比较两个子数组中的元素)和 n次移动(将每个元素放入临时数组)。Megering cost 2n-1 because n - 1 comparisons (for comparing the elements of the two subarrays) and n moves (to place each element into the temporary array)
>反复将 T(n/2) 代入公式,就可以得出归并排序的时间复杂度为 O(n log n)。Repeatedly substitute T(n/2) into the formula we will find the time complexity of merge sort is O(nlogn).
1.3快速排序Quick Sort
- 定义:
·算法从数组中选择一个元素,称为枢轴。The algorithm selects an element, called the pivot, in the array.
·它将数组划分为两个部分,使得第一部分中的所有元素都小于或等于枢轴,而第二部分中的所有元素都大于枢轴。It partitions (divides) the array into two parts so all the elements in the first part are less than or equal to the pivot, and all the elements in the second part are greater than the pivot.
·然后,递归地对第一部分和第二部分分别应用快速排序,以对它们进行排序。The quick-sort algorithm is then recursively applied to the first part and then the second part to sort them out.
- 代码:
public static void quickSort(int[] list, int first, int last) {
if (last > first) {
int pivotIndex = partition(list, first, last);
quickSort(list, first, pivotIndex - 1);
quickSort(list, pivotIndex + 1, last);
}
}
public static int partition(int[] list, int first, int last) {
int pivot = list[first];
int low = first + 1; // position, not value
int high = last; // position, not value
while (high > low) {
//low pointer moves right when
//1. low <=high and 2. list[low] <= pivot
while (low <= high && list[low] <= pivot)
low++;
//high pointer moves left when
//1. low <=high and 2. list[high] > pivot
while (low <= high && list[high] > pivot)
high--;
// If low < high, swap the two elements.
if (high > low) {
int temp = list[high];
list[high] = list[low];
list[low] = temp;
}
}
// Ensure high pointer point at an element
// less than or equal to pivot
while (high > first && list[high] >= pivot){
high--;
}
// Swap pivot with list[high]
if (pivot > list[high]) {
list[first] = list[high];
list[high] = pivot;
return high;
} else // e.g., a sotred list
return first;
}
·这个 quickSort() 定义了快速排序的递归结构,因为它不断调用自身。partition() 也在这个过程中被递归调用,直到满足基准情况(last > first)。这样就保证了列表可以递归地划分为左子列表和右子列表、子子列表、子子…子列表。This quickSort() defines the recursive structure of Quick Sort, as it keeps calling itself. The partition() is also recursively called in the process, until the base case (last>first). This ensures the list can be recursively divided into left and right sublist, subsublist, subsub...sublist.
- 时间复杂度分析
·对于一个包含n个元素的数组,进行划分操作需要n次比较和n次移动。因此,划分操作所需的时间是 O(n)。To partition an array of n elements, it takes n comparisons and nmoves. Thus, the time required for partition is O(n).
·在最佳情况下,每次枢轴将数组划分为两个大小相同的部分。In the best case, each time the pivot divides the array into two parts of the same size
·在平均情况下,虽然不一定完全相同,但两个子数组的大小非常接近。In the avearge case, maybe not exactly the same, but the size of the two sub arrays are very close
·在最坏情况下,每次枢轴将数组划分为一个大的子数组和一个空的子数组。大的子数组的大小比上一次划分时少一个元素。例如(假设第一个元素是划分时的枢轴):In the worst case, the pivot divides the array each time into one big subarray with the other array empty. The size of the big subarray is one less than the one before divided
>[1, 2, 3, 4, 5, ..., n],大小为n。
>在第一次划分时,枢轴为 1,我们得到:In the first partition, pivot=1, we have ->
左侧:[空],右侧:[2, 3, 4, 5, ... , n],右侧子数组大小为n-1。
left: [empty], right [2,3,4,5...n], right sub array size = n-1
> 在下一次划分时,枢轴为 2,我们得到:In the next partition, pivot=2, we have ->
左侧:[空],右侧:[3, 4, 5, 6, ... , n],右侧子数组大小为n-2。
left: [empty], right [3,4,5,6...n], right sub array size = n-2
>以此类推,直到数组被递归地划分为大小为 1 的子数组,总共需要进行n-1次划分。continues this way, we will have to recursively divide the arrary n-1 times till the sub array size =1
>回想一下,每次划分操作的时间复杂度是 O(n),因为每次划分时都需要比较所有元素。Recall that the time complexity of each partition is O(n), as it compares all elements
during each division.
>因此,我们进行了n-1次O(n)的操作,所以最坏情况的时间复杂度是O(n^2)。So we did n-1 times O(n), so, we get the worst time complexity to be O(n^2)
1.4堆排序:Heap Sort:
1.4.1二叉树Binary tree
- 定义:二叉树是一种层级结构:它要么是空的,要么由一个元素组成,称为根节点(root),以及两个不同的二叉树,分别称为左子树和右子树。A binary tree is a hierarchical structure: it either is empty or it consists of an element, called the root, and two distinct binary trees, called the left subtree and right subtree
·路径的长度是路径中边的数量The length of a path is the number ofthe edgesin the path
·节点的深度是从根节点到该节点的路径长度。The depth of a node is the length ofthe path from the root to that node
1.4.2完全二叉树Complete Binary Tree
- 一个二叉树是完全二叉树,当且仅当:A binary tree is complete if:
·除了可能是最后一层外,所有层都被完全填满。All levels are completely filled except possibly the last level
·即使最后一层可能没有填满,但必须从左到右填充,没有空隙。Even though the last level may not be full, it must be filled from left to right, without gaps
- 例子
·第一个树是完全二叉树,因为所有层都被完全填满。1st - complete, as all levels are completely filled
·第二个树是完全二叉树,虽然最后一层没有完全填满(缺少一个节点,位于 14 旁边),但它是从左到右填充的,没有空隙。2nd - complete, although the last level is not full (missing one node next to 14), it is filled from left to right, without gaps
·第三个树是非完全二叉树,从左到右存在一个空隙(在节点 22 旁边)。3rd, incomplete, a gap from left to right (next to node 22)
·第四个树是非完全二叉树,第二层没有填满,而我们只允许最后一层(在这个例子中是第三层)没有填满。4th, incomplete, the 2nd level is not full, while we only allow the last level (level 3 in this case) not to be full
1.4.3二叉堆Binary Heap
- 二叉堆是具有以下性质的二叉树:A binary heap is a binary tree with the following properties:
·它是一个完全二叉树,It is a complete binary tree, and
·每个节点都大于或等于其任何子节点Each node is greater than or equal to any ofits children
- 示例堆:示例不是堆,因为根节点(39)小于其右子节点(42)。
1.4.4堆排序Heap sort
- 堆排序使用二叉堆,过程包括两个主要阶段:Heap sort uses a binary heap and the process consists of two main phases:
·堆构建:Heap construction
>所有元素首先被插入到一个最大堆中All the elements are first inserted into a max heap
·重复移除:Repeated removal
>反复移除根节点,它是堆中当前最大的元素。被移除的元素实际上被移动到数组的末尾,形成一个从后往前增长的已排序数组。Repeatedly remove the root node, which is the current largest element in the heap. The removed element is actually moved to the end of the array, forming a sorted array that grows from the back.
- 示例:
·例如:[10, 5, 3, 4]
·第一次移除:[..., 10]
·第二次移除:[..., 5, 10]
·第三次移除:[..., 4, 5, 10]
·最终结果:[3, 4, 5, 10]
1.4.4.1存储堆Storing a Heap
- 如果堆的大小事先已知,堆可以存储在ArrayList或数组中。A heap can be stored in an ArrayList or an array ifthe heap size is known in advance
·对于位于位置 i的节点,它的左子节点位于位置2i + 1,右子节点位于位置 2i + 2,而它的父节点位于索引 (i - 1) / 2。For a node at position i, its left child is at position 2i+1 and its right child is at position 2i+2, and its parent is at index (i-1)/2
- 例如:对于位置 4 的节点,它的两个子节点位于位置 2·4 + 1 = 9 和 2·4 + 2 = 10,它的父节点位于索引 (4 - 1) / 2 = 1(不是 1.5,使用整数除法)。 For example: the for a nood at position 4, and its two children are at positions 2*4+1=9 and 2*4+2 =10, and its parent is at index (4-1)/2=1 (not 1.5, integer division)
1.4.4.2向堆中添加元素Adding Elements to a Heap
- 向堆中添加新节点时,首先将其添加到堆的末尾,然后通过以下算法重新构建堆结构:To add a new node to a heap, first add it to the end of the heap and then rebuild the tree with this algorithm:
Let the last node be the current node;
while (the current node is greater than its parent) {
Swap the current node with its parent;
Now the current node is one level up;
}
1.4.4.3移除根节点并重建堆Removing the Root and Rebuild the Heap
- 通常我们需要移除最大元素,它是堆中的根节点。Often we need to remove the maximum element, which is the root in a heap
- 根节点被移除后,必须重建堆以保持堆的性质(父节点 ≥ 子节点),使用以下算法: After the root is removed, the tree must be rebuilt to maintain the heap property (e.g., parent>=child) using this algorithm:
- 例子:从堆中移除根节点 62Removing root 62 from the heap
·用堆中的最后一个节点 9 替换它。replaces it with the last node in the heap: 9
1.4.5堆类
- 堆类
- 移除方法:removw
- 堆排序Heap Sort
1.4.6时间复杂度
- 堆排序时间复杂度:O(nlog n)
- 空间复杂度Space Complexity
·归并排序和堆排序的时间复杂度都是 O(n log n)。Both merge and heap sorts require O(n logn) time.
·归并排序在合并两个子数组时需要一个临时数组;而堆排序不需要额外的数组空间。A merge sort requires a temporary array for merging two subarrays; a heap sort does not need additional array space.
·因此,堆排序在空间上比归并排序更高效。 a heap sort is more space efficient than a merge sort.