Sorting algorithms in data structures (learning notes and summary)

1, Introduction

Sorting, also called a sort algorithm, is the process of arranging a group of data in a specified order.

2, Classification of sorting:

  1. Internal sorting method: all the data to be processed is loaded into internal memory (RAM) for sorting.
  2. External sorting method: the amount of data is too large to be loaded into memory, so it is necessary to sort with the help of external storage (files, etc.).
  3. Common sorting algorithm categories (see the following figure):

3, Time complexity of algorithm

Two methods of measuring the execution time of a program (algorithm)

1) Post-hoc statistical method
This method is feasible, but it has two problems: first, to evaluate the performance of the designed algorithm, the program must actually be run; second, the measured times depend on the computer's hardware, software, and other environmental factors. Results are only comparable when obtained on the same machine under the same conditions.
2) A-priori estimation method
Analyze the time complexity of an algorithm in advance to determine which algorithm is better.

1. Time frequency

  • Basic introduction
    Time frequency: the time an algorithm takes is directly proportional to the number of statements it executes. The more statements an algorithm executes, the more time it takes. The number of statements executed in an algorithm is called its statement frequency or time frequency, denoted T(n).
  • For example, to calculate the sum of all numbers from 1 to 100, we design two algorithms:
  1. Use a for loop to calculate
 int sum = 0;
   for (int i = 1; i <= 100; i++) {
	   sum += i;
   }
  2. Direct calculation
 int sum = (1 + 100) * 100 / 2;
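Both fragments can be wrapped into a small runnable program to confirm they agree; the class and method names below are my own choice, not from the notes:

```java
public class SumDemo {
    // O(n): one loop iteration per number, so roughly n statements are executed
    static int sumLoop(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // O(1): Gauss's formula, a constant number of statements regardless of n
    static int sumFormula(int n) {
        return (1 + n) * n / 2;
    }

    public static void main(String[] args) {
        System.out.println(sumLoop(100));    // 5050
        System.out.println(sumFormula(100)); // 5050
    }
}
```

Both return the same answer, but the loop's statement count grows with n while the formula's does not, which is exactly the difference time frequency measures.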

Example: ignoring the constant term


As can be seen from the figure:

  1. 2n+20 and 2n: as n becomes larger, their execution curves become infinitely close, so the constant 20 can be ignored
  2. 3n+10 and 3n: as n becomes larger, their execution curves become infinitely close, so the constant 10 can be ignored

Ignoring lower-order terms


As can be seen from the figure:

  1. 2n^2+3n+10 and 2n^2: as n becomes larger, their execution curves become infinitely close, so 3n+10 can be ignored
  2. n^2+5n+20 and n^2: as n becomes larger, their execution curves become infinitely close, so 5n+20 can be ignored

Ignoring coefficients


As shown in the figure:

  1. As n becomes larger, the execution curves of 5n^2+7n and 3n^2+2n follow the same growth trend, so the coefficients 5 and 3 can be ignored.
  2. By contrast, the curves of n^3+5n and 6n^3+4n separate from the quadratic ones, showing that the order (power) of n is what really matters.
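These "ignore the constant / ignore the lower-order term" observations can be checked numerically. The sketch below (function names are my own) prints the ratio T(n)/f(n) for growing n; the ratio approaching 1 is exactly what "the curves become infinitely close" means:

```java
public class GrowthDemo {
    // Ratio of T(n) = 2n + 20 to f(n) = 2n; approaches 1 as n grows
    static double constantRatio(long n) {
        return (2.0 * n + 20) / (2.0 * n);
    }

    // Ratio of T(n) = 2n^2 + 3n + 10 to f(n) = 2n^2; approaches 1 as n grows
    static double lowOrderRatio(long n) {
        return (2.0 * n * n + 3 * n + 10) / (2.0 * n * n);
    }

    public static void main(String[] args) {
        for (long n = 10; n <= 1_000_000; n *= 100) {
            System.out.printf("n=%-8d (2n+20)/2n=%.5f   (2n^2+3n+10)/2n^2=%.5f%n",
                    n, constantRatio(n), lowOrderRatio(n));
        }
    }
}
```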

2. Time complexity

  1. In general, the number of times the basic operation statement in an algorithm is repeated is a function of the problem scale n, written T(n). If there is an auxiliary function f(n) such that, as n approaches infinity, the limit of T(n)/f(n) is a nonzero constant, then f(n) is a function of the same order of magnitude as T(n). This is written T(n) = O(f(n)), and O(f(n)) is called the asymptotic time complexity of the algorithm, or time complexity for short.
  2. Different T(n) may have the same time complexity. For example, T(n) = n² + 7n + 6 and T(n) = 3n² + 2n + 2 are different, but their time complexity is the same: both are O(n²).
  3. Method for calculating time complexity:
  • Replace all additive constants in the running-time function with the constant 1: T(n) = n² + 7n + 6 => T(n) = n² + 7n + 1
  • In the modified function, keep only the highest-order term: T(n) = n² + 7n + 1 => T(n) = n²
  • Remove the coefficient of the highest-order term: T(n) = n² => O(n²)
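As a concrete illustration (my own example, not from the notes), the nested loop below executes its innermost statement exactly n × n times, so T(n) grows as n² and the rules give O(n²):

```java
public class CountDemo {
    // Counts how many times the innermost statement runs for input size n.
    static long countOps(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {        // outer loop: n iterations
            for (int j = 0; j < n; j++) {    // inner loop: n iterations each
                count++;                     // executed n * n times in total
            }
        }
        return count;                        // dominant term n^2  =>  O(n^2)
    }

    public static void main(String[] args) {
        System.out.println(countOps(10)); // 100
    }
}
```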

Common time complexity:

  1. Constant order O(1)
  2. Logarithmic order O(log2n)
  3. Linear order O(n)
  4. Linear logarithmic order O(nlog2n)
  5. Square order O(n^2)
  6. Cubic order O(n^3)
  7. K-th power order O(n^k)
  8. Exponential order O(2^n)

Diagram corresponding to common time complexity:


Description:

  1. The order of common time complexities is: O(1) < O(log2n) < O(n) < O(nlog2n) < O(n^2) < O(n^3) < O(n^k) < O(2^n). As the problem scale n increases, the time complexity grows and the execution efficiency of the algorithm decreases.
  2. As can be seen from the figure, we should avoid exponential-order algorithms as far as possible.

4. Average time complexity and worst time complexity

  1. The average time complexity is the expected running time of the algorithm when all possible input instances occur with equal probability.
  2. The running time in the worst case is called the worst-case time complexity. The time complexity discussed is generally the worst-case time complexity. The reason is that the worst case is an upper bound on the running time for any input instance, which guarantees the algorithm will never run longer than this.
  3. Whether the average time complexity equals the worst-case time complexity depends on the algorithm.
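Linear search is a classic illustration of the difference (this example is my own, not from the notes): the worst case scans all n elements, while the average case, with equally likely target positions, scans about n/2:

```java
public class SearchDemo {
    // Returns the number of comparisons a linear search performs
    // to find target (or n if the target is absent).
    static int comparisons(int[] a, int target) {
        int c = 0;
        for (int x : a) {
            c++;
            if (x == target) return c;
        }
        return c;
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 9};
        System.out.println(comparisons(a, 5)); // best case: 1 comparison
        System.out.println(comparisons(a, 9)); // worst case: 5 comparisons
    }
}
```

Averaged over all positions, the cost is about n/2 comparisons; both n/2 and n are O(n) here, so for linear search the average and worst-case complexities coincide.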

5. Introduction to the space complexity of the algorithm

  1. Similar to the discussion of time complexity, the space complexity of an algorithm is defined as the storage space it consumes, which is also a function of the problem scale n.
  2. Space complexity measures the amount of storage space an algorithm temporarily occupies while running. For some algorithms, the number of temporary work units needed is related to the problem scale n: it grows as n grows, and when n is large it occupies many storage units. Quick sort, merge sort, and radix sort are examples.
  3. In algorithm analysis we mainly discuss time complexity, because from the user's perspective the speed of program execution matters more. Cache products such as Redis and Memcache, and algorithms such as radix sort, essentially trade space for time.
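As a toy illustration of trading space for time (a sketch of my own, with a HashMap standing in for a cache product like Redis or Memcache):

```java
import java.util.HashMap;
import java.util.Map;

public class CacheDemo {
    private static final Map<Integer, Long> cache = new HashMap<>();

    // O(n) time on every call: recomputes the sum of squares from scratch
    static long slowSquareSum(int n) {
        long s = 0;
        for (int i = 1; i <= n; i++) s += (long) i * i;
        return s;
    }

    // O(1) time after the first call for a given n: spends O(k) extra space
    // (k = number of distinct inputs seen) to avoid repeating the work
    static long cachedSquareSum(int n) {
        return cache.computeIfAbsent(n, CacheDemo::slowSquareSum);
    }

    public static void main(String[] args) {
        System.out.println(cachedSquareSum(3)); // 14, computed once
        System.out.println(cachedSquareSum(3)); // 14, served from the cache
    }
}
```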

Bubble sort algorithm

  • Introduction: bubble sort repeatedly visits the element list to be sorted, compares adjacent elements in turn, and exchanges them if they are in the wrong order (e.g. from large to small, or from Z to A). The passes are repeated until no adjacent elements need to be exchanged, i.e. the list is sorted. The name comes from the way smaller elements slowly "float" to the top of the sequence through exchanges, just as the carbon-dioxide bubbles in a carbonated drink eventually float to the top; hence "bubble sort".
  • Optimization: during sorting, each element keeps approaching its final position. If a whole pass completes without any exchange, the sequence is already ordered. Therefore a flag can be set during sorting to record whether any elements were exchanged in a pass, which avoids unnecessary comparisons. (This optimization can be added after the basic bubble sort is written.)
    Demo:

    Idea: starting from the element with the smaller subscript, compare the values of adjacent elements in turn, and exchange them if they are in reverse order

    (1) There are (array length - 1) passes in total
    (2) The number of comparisons in each pass gradually decreases
    (3) If no exchange takes place during a pass, the bubble sort can end early. This is the optimization
import java.util.Arrays;

public class BubbleSort {

	public static void main(String[] args) {
		int arr[] = {3, 9, -1, 10, 20};
		int temp = 0;

		// The first pass moves the largest number to the last position
		for (int j = 0; j < arr.length - 1; j++) {
			// If the preceding number is larger than the following number, swap
			if (arr[j] > arr[j + 1]) {
				temp = arr[j];
				arr[j] = arr[j + 1];
				arr[j + 1] = temp;
			}
		}

		System.out.println("Array after the first pass");
		System.out.println(Arrays.toString(arr));

		// The second pass moves the second largest number to the second-to-last position
		for (int j = 0; j < arr.length - 1 - 1; j++) {
			if (arr[j] > arr[j + 1]) {
				temp = arr[j];
				arr[j] = arr[j + 1];
				arr[j + 1] = temp;
			}
		}

		System.out.println("Array after the second pass");
		System.out.println(Arrays.toString(arr));

		// The third pass moves the third largest number to the third-to-last position
		for (int j = 0; j < arr.length - 1 - 2; j++) {
			if (arr[j] > arr[j + 1]) {
				temp = arr[j];
				arr[j] = arr[j + 1];
				arr[j + 1] = temp;
			}
		}

		System.out.println("Array after the third pass");
		System.out.println(Arrays.toString(arr));

		// The fourth pass moves the fourth largest number to the fourth-to-last position
		for (int j = 0; j < arr.length - 1 - 3; j++) {
			if (arr[j] > arr[j + 1]) {
				temp = arr[j];
				arr[j] = arr[j + 1];
				arr[j + 1] = temp;
			}
		}

		System.out.println("Array after the fourth pass");
		System.out.println(Arrays.toString(arr));
	}
}

From the above code, we can see a common pattern in the loop conditions:

    j < arr.length - 1 - 0;
	j < arr.length - 1 - 1;
	j < arr.length - 1 - 2;
	j < arr.length - 1 - 3;

These can all be replaced by j < arr.length - 1 - i;

int arr[] = {3, 9, -1, 10, 20};
int temp = 0;
int i = 0;
for (i = 0; i < arr.length - 1; i++) {
	// In pass i, the largest remaining number bubbles up to position arr.length - 1 - i
	for (int j = 0; j < arr.length - 1 - i; j++) {
		// If the preceding number is larger than the following number, swap
		if (arr[j] > arr[j + 1]) {
			temp = arr[j];
			arr[j] = arr[j + 1];
			arr[j + 1] = temp;
		}
	}
	System.out.println("Array after pass " + (i + 1));
	System.out.println(Arrays.toString(arr));
}


However, this version is somewhat redundant: as the output shows, the array is already sorted before the later passes, which then do no useful work. To solve this problem, we add one more judgment step.

int arr[] = {3, 9, -1, 10, 20};
int temp = 0;
int i = 0;
boolean flag = false; // Flag variable, indicating whether an exchange has taken place
for (i = 0; i < arr.length - 1; i++) {
	for (int j = 0; j < arr.length - 1 - i; j++) {
		// If the preceding number is larger than the following number, swap
		if (arr[j] > arr[j + 1]) {
			flag = true;
			temp = arr[j];
			arr[j] = arr[j + 1];
			arr[j + 1] = temp;
		}
	}
	if (!flag) { // No exchange occurred in this pass: the array is already sorted
		break;
	} else {
		flag = false; // Reset the flag for the next pass
	}
	System.out.println("Array after pass " + (i + 1));
	System.out.println(Arrays.toString(arr));
}
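Putting the pieces together, the optimized bubble sort can be packaged as one reusable, runnable class; the class and method names below are my own choice:

```java
import java.util.Arrays;

public class BubbleSortFinal {

    // Bubble sort with the shrinking bound (arr.length - 1 - i) and the
    // early-exit flag optimization described above.
    public static void bubbleSort(int[] arr) {
        boolean flag = false; // whether any swap happened in the current pass
        for (int i = 0; i < arr.length - 1; i++) {
            // After pass i, the last i elements are already in their final places
            for (int j = 0; j < arr.length - 1 - i; j++) {
                if (arr[j] > arr[j + 1]) {
                    flag = true;
                    int temp = arr[j];
                    arr[j] = arr[j + 1];
                    arr[j + 1] = temp;
                }
            }
            if (!flag) {
                break; // no swaps in this pass: the array is already sorted
            }
            flag = false; // reset for the next pass
        }
    }

    public static void main(String[] args) {
        int[] arr = {3, 9, -1, 10, 20};
        bubbleSort(arr);
        System.out.println(Arrays.toString(arr)); // [-1, 3, 9, 10, 20]
    }
}
```

Worst-case time complexity is O(n^2); on an already-sorted array the flag ends the sort after a single O(n) pass.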

Tags: Java Algorithm data structure

Posted by nutshell on Mon, 30 May 2022 20:56:02 +0530