[Data Structures for Beginners - Complexity] It runs in only 3ms... I'm so proud

Preface

This post is our first real study of data structures. Let's take a look~

1. Data structures and algorithms

1.1 Data structure

: the way a computer stores and organizes data. The data elements are related to one another in specific ways, and together they form a collection.

Simply put:

A data structure is how we manage data in memory.

1.2 Algorithm

: a procedure that turns raw input data into the desired output.

In short, an algorithm is "processing".

So how do we measure the quality of an algorithm?

The same task handled by different people produces different results; likewise, the same pile of data handled by different algorithms is processed with different efficiency.

To measure the quality of an algorithm, our predecessors introduced the concept of complexity.

2. Complexity

Processing a pile of data takes time to execute the code and space to run the algorithm.

Hence we have time complexity and space complexity.

How should complexity be expressed? By timing the program on a computer? By precisely observing memory usage?

Obviously not: different machines run at different speeds, and there are other disturbances. Instead, our predecessors expressed complexity with functions.

2.1 Representing complexity

Big O notation: describes the asymptotic behavior of a function.

With it, we can describe an algorithm's overall complexity without getting bogged down in detail.

Using functions to express complexity, we get expressions like:

  • O( 1 )
  • O( N )
  • O( N^2)
  • ...

These functions are commonly called Big O orders,

and this notation is called Big O asymptotic notation.

So how do you get the N in a Big O expression?

Read on~

2.2 Time complexity

: a measure of how fast an algorithm executes.

Can execution speed be measured in wall-clock time? No: different machines run at different speeds. My program ran in only 3ms, but maybe that's just because my hardware is fast.

So how do we measure an algorithm's speed?

By the number of statements executed. No matter what the machine, the more statements executed, the more time it takes.

Therefore:

The time complexity of an algorithm is expressed as the number of times its basic operations are executed.

Let's take an example:

How many times is the statement cnt++; executed in the code below?

int cnt = 0;

void f1(int n)
{
	int i = 0;
	for (i = 0; i < n; i++)
	{
		int j = 0;
		for (j = 0; j < n; j++)
		{
			cnt++;
		}
	}
}

void f2(int n)
{
	int i = 0;
	for (i = 0; i < 2 * n; i++)
	{
		cnt++;
	}
}

int m = 20;

void f3(int m)
{
	while (m--)
	{
		cnt++;
	}
}

int n = 10;

int main()
{
	f1(n);

	f2(n);

	f3(m);
	return 0;
}
  • f1(n): a two-level for loop controlled by n -- n^2 times
  • f2(n): a single for loop controlled by 2*n -- 2*n times
  • f3(m): m is 20, and while (m--) runs m times -- 20 times
  • Total: n^2 + 2*n + m times

So is the complexity written as O(n^2 + 2*n + m)?

We mentioned that the Big O symbol describes a function's asymptotic behavior. What does "asymptotic" actually mean? As I understand it: what happens as n approaches infinity.

As n approaches infinity:

  • Next to n^2, the term 2*n is negligible
  • There is no essential difference between n and 2*n
  • A constant term hardly matters at all
  • ...

2.2.1 Deriving the Big O order

In fact, this also reflects how the Big O order is derived:

  • Replace every additive constant in the execution count with the constant 1
  • Keep only the highest-order term
  • If the highest-order term exists and its coefficient is not 1, drop the coefficient

By these rules, the time complexity of the program above, expressed in Big O order, is O(n^2).
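For example, applying the three rules directly:

  • n^2 + 2*n + m: keep only the highest-order term n^2; its coefficient is already 1 -- O(n^2)
  • 2*n + 10: the additive constant 10 drops out, and the coefficient 2 is removed -- O(n)
  • 100: a pure additive constant becomes 1 -- O(1)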

2.2.2 Common time complexities
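Roughly ordered from best to worst, the orders you will meet most often are:

O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(2^n) < O(n!)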

2.2.3 Worst case

Besides Big O, there is one more convention when computing time complexity:

For an algorithm, we generally focus on how it behaves in the worst case; individual algorithms, depending on their characteristics, may be judged by other measures, such as the average case.

For example, Shell sort is judged by its average case, because its worst case rarely occurs.

Quoting the worst case up front is what you might call expectation management, hahaha.

Suppose an algorithm is O(n^2) in the worst case but usually runs in O(n), and you are told its complexity is O(n^2).

You take the program and run it: O(n). "Oh, pretty fast!" Then you happen to hit the worst case: O(n^2). "Well... it really is O(n^2), as promised."

2.2.4 Worked examples

In the next few examples, let's calculate the time complexity.

Example 1

void f1(int n)
{
	int count = 0;
	for (int k = 0; k < 2 * n; ++k)
	{
		++count;
	}
	int m = 10;
	while (m--)
	{
		++count;
	}
	printf("%d\n", count);
}
  • The first for loop is controlled by 2*n; as n grows, it executes 2*n times
  • The second loop is controlled by the constant m, executing 10 times -- a constant count
  • So the time complexity of f1(int n) is O(n) -- the constant term is negligible, and the coefficient 2 is dropped

Example 2

void f2(int n, int m)
{
	int count = 0;
	for (int k = 0; k < m; ++k)
	{
		++count;
	}
	for (int k = 0; k < n; ++k)
	{
		++count;
	}
	printf("%d\n", count);
}
  • The first for loop is controlled by m; as m grows, it executes m times
  • The second for loop is controlled by n; as n grows, it executes n times
  • So the time complexity of f2(int n, int m) is O(m+n)

Example 3

void BubbleSort(int* a, int n)
{
	assert(a);
	for (size_t end = n; end > 0; --end)
	{
		int exchange = 0;
		for (size_t i = 1; i < end; ++i)
		{
			if (a[i-1] > a[i])
			{
				Swap(&a[i-1], &a[i]);
				exchange = 1;
			}
		}
		if (exchange == 0)
			break;
	}
}
  • The outer for loop is controlled by n; as n grows, it runs n times
  • The inner for loop is controlled by end; over all passes its body executes (n-1)+(n-2)+...+2+1 times, an arithmetic series summing to n*(n-1)/2, which is on the order of n^2
  • So the time complexity of BubbleSort is O(n^2) -- everything else is negligible next to the n^2 term
  • *Interestingly, the exchange flag we added to detect an already-sorted array means the best case of BubbleSort is O(n): a single pass over the array

Example 4

const char * strchr ( const char * str, int character );

strchr searches a string for a character. Since it is a search, several cases arise:

  • Best case: the character is found after a constant number of steps, O(1)

  • Average case: about n/2 steps, O(n/2)

  • Worst case: n steps, O(n)

  • Nothing about this algorithm is special, so we take the worst case: the time complexity is O(n)
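For reference, here is a minimal sketch of how a strchr-style search might look (an illustration, not the library's actual implementation):

#include <stddef.h>

// Illustrative strchr-style linear search: returns a pointer to the first
// occurrence of character in str, or NULL if it is absent.
const char* my_strchr(const char* str, int character)
{
	while (*str != '\0')
	{
		if (*str == (char)character)
			return str;    // best case: found near the front, O(1)
		++str;
	}
	if ((char)character == '\0')
		return str;        // the real strchr also matches the terminator
	return NULL;           // worst case: scanned all n characters, O(n)
}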

Having practiced this far, many readers may sense a pattern: just look at the variables that control the loops! n or m corresponds to O(n) or O(m); 100 corresponds to O(1).

But is it really that simple? In fact, the execution count is determined not only by the loop-control variables but also by the logic of the algorithm.

Just look at binary search below.

Example 5

int BinarySearch(int* a, int n, int x)
{
	assert(a);
	int left = 0;
	int right = n - 1;
	// [left, right]: both ends are included in the interval, hence the = sign
	while (left <= right)
	{
		int mid = left + ((right - left) >> 1);
		if (a[mid] < x)
			left = mid + 1;
		else if (a[mid] > x)
			right = mid - 1;
		else
			return mid;
	}
	return -1;
}

The while loop is controlled by left and right. The search range keeps shrinking: n /2 /2 /... /2 /2, until only one number remains. In other words, each pass halves the range, and we stop at 1.

  • Suppose it takes x searches: n /2 /2 /... /2 = 1 --> n = 2^x --> x = log_2(n), so the count is O(log_2 n)

  • When calculating complexity, we usually omit the base 2

  • So the time complexity of binary search is O(log n)
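As a quick sanity check, here is a tiny sketch (my own illustration) that counts how many halvings it takes to shrink a range of size n down to 1:

#include <stdio.h>

// Counts how many times a range of size n can be halved before only
// one element remains -- exactly floor(log2(n)).
int halvings(int n)
{
	int count = 0;
	while (n > 1)
	{
		n /= 2;
		++count;
	}
	return count;
}

int main()
{
	// Prints 9: 2^9 = 512 <= 1000 < 1024 = 2^10, so binary search over
	// 1000 elements needs only about 10 comparisons at most.
	printf("%d\n", halvings(1000));
	return 0;
}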


Example 6

long long Fac(size_t n)
{
	if (1 == n)
		return 1;

	return Fac(n - 1) * n;
}
  • n! = 1*2*3*...*n
  • As n grows, the number of recursive calls grows, and each call executes only a constant amount of code
  • So the time complexity is O(n)

What about this version?

long long Fac(size_t n) 
{
    size_t M = n;
	while (M--)
	{
		printf("%d ", n);
	}
    
	if (1 == n)
		return 1;

	return Fac(n - 1) * n;
}
  • As n grows, the number of calls grows, and the call with argument k runs its loop k times, for a total of n + (n-1) + ... + 1 executions
  • So the time complexity is O(n^2)

Example 7

long long Fib(size_t n)
{
	if (n < 3)
		return 1;

	return Fib(n - 1) + Fib(n - 2);
}

  • It can be seen that the number of recursive calls is roughly 2^0 + 2^1 + ... + 2^(n-1); this geometric series sums to 2^n - 1

  • So the time complexity is O(2^n)

That complexity is far too high: give n even a modestly large value and the program becomes hopeless. I can't bear to watch. Let's optimize it.

//Iterative Fib
long long Fib(size_t n)
{
	if (n < 3)
		return 1;

	long long f1 = 1, f2 = 1, f3;
	size_t i = 0;
	for (i = 3; i <= n; i++)
	{
		//Find the ith Fibonacci number
		f3 = f1 + f2;
		//iteration
		f1 = f2;
		f2 = f3;
	}

	return f3;
}
  • The loop body executes only n-2 times
  • So the time complexity is O(n)

2.3 Space complexity

: a measure of the space an algorithm needs while running.

One thing to know: time cannot be reused, but space can.

In other words:

allocate + free, or create + destroy == no additional space used

  • Space created inside a code block generally does not count as additional space: it is destroyed on leaving the block, as with function calls and loops

With that foundation, we can state:

Space complexity is measured by the additional space an algorithm requests.
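As a quick illustration (my own sketch, not from the examples above): reversing an array in place uses O(1) extra space, while reversing into a newly allocated copy uses O(n):

#include <stdlib.h>

// In-place reversal: only a few scalar variables -- O(1) extra space.
void ReverseInPlace(int* a, int n)
{
	int left = 0, right = n - 1;
	while (left < right)
	{
		int tmp = a[left];    // the same small amount of space, reused
		a[left] = a[right];
		a[right] = tmp;
		++left;
		--right;
	}
}

// Reversal into a fresh copy: an n-element buffer -- O(n) extra space.
int* ReversedCopy(const int* a, int n)
{
	int* b = (int*)malloc(n * sizeof(int));
	if (b == NULL)
		return NULL;
	for (int i = 0; i < n; ++i)
		b[i] = a[n - 1 - i];
	return b;                 // the caller is responsible for freeing
}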

2.3.1 Worked examples

Example 1

void BubbleSort(int* a, int n)
{
	assert(a);
	for (size_t end = n; end > 0; --end)
	{
		int exchange = 0;
		for (size_t i = 1; i < end; ++i)
		{
			if (a[i-1] > a[i])
			{
				Swap(&a[i-1], &a[i]);
				exchange = 1;
			}
		}
		if (exchange == 0)
			break;
	}
}

Space usage of the loops:

end is clearly created only once, but aren't exchange and i created over and over?

They are, but look carefully at where they live: inside inner code blocks. They are destroyed the moment control leaves the braces -- create, destroy, create, destroy... they keep reusing the same space.

  • end, exchange, i: a constant number of variables
  • So the space complexity is O(1)

Summary: the space used inside a loop is constantly recycled and reused.

Example 2

long long Fib(size_t n)
{
	if (n < 3)
		return 1;

	return Fib(n - 1) + Fib(n - 2);
}

Space usage of the function:

The stack frames of the recursive calls do not all exist at once. They are opened left to right: only after the left branch finishes and its frame is destroyed is space opened for the right branch to run.

So once the frame for Fib(n-1) finishes, it is Fib(n-2)'s turn; by then Fib(n-1)'s frame has been destroyed, and Fib(n-2) opens its frame in the very memory Fib(n-1) just vacated. Likewise, each later call opens its frame in memory the earlier calls have already given back -- the calls bounce back and forth within the same region of space.

  • The leftmost chain of calls, from Fib(n) down to Fib(1), opens n-1 frames at its deepest point; all subsequent calls reuse that space

  • So the space complexity of the recursive Fib is O(n)

Summary: a function's stack frame is destroyed when the call returns, and the same space is frequently reused for later frames.
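To make this concrete, here is a small instrumented sketch (my own illustration; depth and max_depth are hypothetical names) that tracks the maximum recursion depth of Fib -- it grows linearly with n even though the number of calls grows exponentially:

#include <stdio.h>

static int depth = 0;      // current recursion depth (live stack frames)
static int max_depth = 0;  // deepest point reached, i.e. peak stack usage

long long FibTracked(size_t n)
{
	++depth;
	if (depth > max_depth)
		max_depth = depth;

	long long result;
	if (n < 3)
		result = 1;
	else
		result = FibTracked(n - 1) + FibTracked(n - 2);

	--depth;               // this frame is about to be destroyed
	return result;
}

int main()
{
	FibTracked(20);
	// Prints 19: the peak number of live frames is n - 1, hence O(n) space.
	printf("max depth = %d\n", max_depth);
	return 0;
}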

That's all for this post. Corrections for any shortcomings are welcome!

Bacon's blog -- making progress with you!
