Re-learning C++ series: from C to C++

🔥 Hi, I'm Xiaoyu. This article is included in the Androider-Planet repository on GitHub, which collects an advanced Android knowledge system. Follow the official account [ Xiaoyu's study room ] so you don't get lost on the road to success!

foreword

As an Android developer, you may think I am on the wrong track, because Android development does not require C++ knowledge.

If you think so, you are probably still working on basic Android development. C++ is used in the advanced areas: performance optimization, the NDK, audio and video, the framework, the ART virtual machine, and so on. Learning C++ is therefore important, and in practice necessary, for Android developers.

This article is the first in the re-learning C++ series. I hope it inspires you.


1. Variable initialization problems of char type and char* type

### case:

void charBug() 
{
	char c1 = 'yes'; // Multi-character literal: type int, implementation-defined value; truncated to char, typically the last character 's'
	char c2 = "yes"; // Error: a value of type const char* cannot be used to initialize an entity of type char
	char c3 = &c1;   // Error: a value of type char* cannot be used to initialize an entity of type char

	const char* cs1 = '/'; // Error: a value of type char cannot be used to initialize an entity of type const char*
	const char* cs2 = "/"; // OK: points to the two characters '/' and '\0'
	const char* cs3 = &c1; // OK: holds the address of c1 (a single char, not a null-terminated string)

}

The code above shows:

  • 1. When a string literal appears on the right side of the equals sign, the address of the string constant is what gets assigned to the variable on the left.
  • 2. A const char* or char* value (such as a string literal or the address of a character) cannot be assigned to a variable of type char.
  • 3. A value of type char cannot be assigned directly to a variable of type char* or const char*.

So how do we avoid these mistakes in C++? Use string.

string s1(1, 'yes'); // "s": one copy of the truncated multi-character literal (typically 's')
string s2(3, 'yes'); // "sss"

string s3("yes");    // "yes"
string s4("/");      // "/"
string s5(1, '/');   // "/"

Using string in C++ avoids the compile errors caused by mixing up char and char* in assignments.
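
As a minimal sketch of the point above (the helper names fromChar and fromLiteral are made up for illustration and are not from the original article), std::string has a separate constructor for each case, so a single character and a string literal can both become a string without any pointer confusion:

```cpp
#include <string>

// Hypothetical helper: build a string from a single character.
std::string fromChar(char c) {
    return std::string(1, c);   // (count, char) constructor
}

// Hypothetical helper: build a string from a null-terminated literal.
std::string fromLiteral(const char* s) {
    return std::string(s);      // copies the characters up to, not including, '\0'
}
```

With these helpers, fromChar('/') and fromLiteral("/") compare equal, and both have size 1.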

2. C arrays decay to pointers when passed as parameters

case:

#include <iostream>
using namespace std;

double average(int arr[]) {
	double result = 0.0;
	// Bug: arr is really an int* here, so sizeof(arr) is the size of a pointer
	int len = sizeof(arr) / sizeof(arr[0]);
	for (int i = 0; i < len; i++) {
		result += arr[i];
	}
	return result/len;
}
int main()
{
   cout << "Hello World!\n";
   //charBug();
   int arr[] = { 1,2,3,4,5,6,7,8,9,10 };
   cout<<average(arr)<<endl;

}
The printed result is: 1 (on a 32-bit build)

This shows that inside the function arr is actually a pointer, which would be better named p_arr. On a 32-bit platform the sizeof of a pointer is 4 bytes and sizeof(arr[0]) is the 4 bytes of an int, so len = 1 and only the first element is averaged. (On a 64-bit platform the pointer is 8 bytes, so len = 2 and the result is 1.5.) The designers of C decided that a large array should not be copied into a function; only its address is passed, which saves space.

Let's see how C++ addresses this problem.
Use a container from the STL to compute the average:

// The vector carries its own size, so nothing is lost when it is passed by reference
double average2(const vector<int>& vec) {
	double result = 0.0;

	vector<int>::const_iterator it = vec.begin();
	for (; it != vec.end(); ++it) {
		result += *it;
	}
	return result / vec.size();

}
// In main():
vector<int> vec{ 1,2,3,4,5,6,7,8,9,10 };
cout << average2(vec) << endl;

Print result: 5.5

The correct average is printed.
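
If a plain array has to stay, one possible alternative (a sketch that is not part of the original article; the name averageOf is hypothetical) is to pass the array by reference to a function template. The element count N is then deduced at compile time and no decay happens:

```cpp
#include <cstddef>
#include <numeric>

// Hypothetical helper: the array is taken by reference, so N is part of the
// parameter type and the array does not decay to a pointer.
template <std::size_t N>
double averageOf(const int (&arr)[N]) {
    // Accumulate into a double (initial value 0.0) to avoid integer division.
    double sum = std::accumulate(arr, arr + N, 0.0);
    return sum / N;
}
```

With int arr[] = { 1,2,3,4,5,6,7,8,9,10 }, averageOf(arr) returns 5.5. Passing an int* does not compile, which turns the silent bug into a compile error.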

3. Bit shift problems in C

Case 1:

void bitMove() {
	char c1 = 0x63;//0110 0011
	c1 = c1 << 4;//shift left 4 bits
	printf("0x%02x\n",c1); //Expected: 0011 0000 -> 0x30

	c1 = 0x63;
	c1 = c1 >> 4;//shift right 4 bits
	printf("0x%02x\n", c1);//Expected: 0000 0110 -> 0x06
	
	char c2 = 0x93;//1001 0011, negative when char is signed
	c2 = c2 << 4;//shift left 4 bits
	printf("0x%02x\n", c2); //Expected: 0011 0000 -> 0x30
	
	c2 = 0x93;
	c2 = c2 >> 4;//shift right 4 bits
	printf("0x%02x\n", c2);//Expected: 0000 1001 -> 0x09?

}

result:
0x30 0x06 0x30 0xfffffff9

The last value is 0xfffffff9, although we expected 0x09.
This is due to the difference between arithmetic shift and logical shift.

logical shift

For unsigned numbers, 0 is added to the low bit when shifting left, and 0 is also added to the high bit when shifting right.

arithmetic shift

For non-negative signed numbers, the sign-magnitude, ones' complement, and two's complement representations are identical, so the fill rule is the same as for a logical shift. For negative signed numbers, 0 is filled into the low bits when shifting left, and 1 is filled into the high bits when shifting right. That is why 0x93, which is negative as a signed char, has its high bits filled with 1 during the right shift and ends up in the form 0xfffffff9: negative numbers use an arithmetic shift. (Before C++20, right-shifting a negative value is implementation-defined; mainstream compilers use an arithmetic shift.)

How can this kind of problem be avoided in C? Convert the signed number to an unsigned one: char -> unsigned char

One more point: why is it 0xfffffff9 rather than 0xf9? The code clearly uses the char type, so where do the 4 bytes come from?
The reason is that %x expects an unsigned int. A char passed to printf goes through integer promotion first. Since 0xf9 is a negative value as a signed char, the high bits are filled with 1 (sign extension) during promotion, so the result is 0xfffffff9, not 0xf9.

The fix is the same: convert the signed type to an unsigned one (char -> unsigned char), so that the high bits are filled with 0.
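
The fix described above can be sketched as follows. The helper names shiftLeft4 and shiftRight4 are hypothetical, and the sketch assumes the usual 8-bit char:

```cpp
// Hypothetical helpers: the same shifts as in bitMove(), done on unsigned char.
unsigned char shiftLeft4(unsigned char value) {
    // The shift is computed in int; the cast keeps only the low 8 bits.
    return static_cast<unsigned char>(value << 4);
}

unsigned char shiftRight4(unsigned char value) {
    // value is promoted to a non-negative int, so zeros fill the high bits.
    return static_cast<unsigned char>(value >> 4);
}
```

Here shiftRight4(0x93) is 0x09, and because an unsigned char is promoted to a non-negative int, printf("0x%02x\n", shiftRight4(0x93)) prints 0x09 instead of 0xfffffff9.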

Case 2:

void bitMove1() {
	unsigned char x = 0xFF;
	const unsigned char BACK_UP = (1 << 7);
	const unsigned char ADMIN = (1 << 8);
	//printf("0x%x\n", BACK_UP);
	//printf("0x%x\n", ADMIN);

	if (x & BACK_UP) {
		cout << "BACK_UP" << endl;
	}
	if (x & ADMIN) {
		cout << "ADMIN" << endl;
	}

}  

Only BACK_UP is printed. Why?
Uncomment the two printf calls to see:
Result: one is 0x80, the other 0x00

The reason is that the shifted bit must still fit within the width of the destination type. 1 << 8 is evaluated as an int with the value 0x100, which does not fit in an 8-bit unsigned char; the set bit is truncated away on conversion and the stored value is 0.

Both of the cases above are anomalies caused by shifting values held in narrow char types.

So how do we avoid this in C++? Use bitset.

void bitMove2() {
	bitset<10> x = 0xFF;
	const bitset<10> BACK_UP = (1 << 7);
	const bitset<10> ADMIN = (1 << 8);
	// A bitset cannot be passed to printf directly; convert it with to_ulong() first
	printf("BACK_UP:0x%lx\n", BACK_UP.to_ulong());
	cout<<"BACK_UP:binary:" << BACK_UP << endl;
	printf("ADMIN:0x%lx\n", ADMIN.to_ulong());
	cout << "ADMIN:binary:" << ADMIN << endl;

	if ((x & BACK_UP) == BACK_UP) {
		cout << "BACK_UP" << endl;
	}
	if ((x & ADMIN) == ADMIN) {
		cout << "ADMIN" << endl;
	}

}
result:
BACK_UP:0x80
BACK_UP:binary:0010000000
ADMIN:0x100
ADMIN:binary:0100000000
BACK_UP

As you can see, bitset<10> holds an unsigned value that is 10 bits wide, so after shifting by 8 bits the result is 0x100 rather than 0.
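
A slightly different way to use bitset, shown here only as a sketch, is to address flags by bit position instead of building masks with shifts. The constants BACK_UP_BIT and ADMIN_BIT and the helper hasFlag are hypothetical names chosen to mirror the example above:

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical flag positions, mirroring BACK_UP (bit 7) and ADMIN (bit 8).
const std::size_t BACK_UP_BIT = 7;
const std::size_t ADMIN_BIT = 8;

// Returns true when the given flag bit is set.
bool hasFlag(const std::bitset<10>& flags, std::size_t bit) {
    return flags.test(bit);   // test() throws std::out_of_range if bit >= 10
}
```

For std::bitset<10> x(0xFF), hasFlag(x, BACK_UP_BIT) is true and hasFlag(x, ADMIN_BIT) is false; after x.set(ADMIN_BIT) the second check is true as well. An out-of-range position is reported by an exception rather than silently producing 0.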

4. Type conversion problems in C

Case 1:

void cast1() {
	int arr[] = { 1,2,3,4 };
	cout <<"arr size:" << sizeof(arr) / sizeof(arr[0]) << endl;
	int threshold = -1;
	// Bug: threshold is implicitly converted to an unsigned type for the comparison
	if (sizeof(arr) / sizeof(arr[0]) > threshold) {
		cout << "arr len is bigger than threshold" << endl;
	}
	else
	{
		cout << "threshold is bigger than arr len" << endl;
	}
	cout << (unsigned int)(-1) << endl;
}
result:
arr size:4
threshold is bigger than arr len
4294967295

So -1 apparently compares as bigger than 4, which confuses readers who are not familiar with the underlying details.

The key is the comparison sizeof(arr) / sizeof(arr[0]) > threshold, which involves an implicit conversion.

  • 1. sizeof(arr) / sizeof(arr[0]) yields a value of type size_t, an unsigned integer type, while threshold is an int.

When an unsigned value is compared with a signed value of the same or lower conversion rank, the signed operand is converted to the unsigned type before the comparison. This is an implicit conversion.

The value of -1 converted to unsigned int is 4294967295 (and even larger when converted to a 64-bit size_t), so the comparison is false and the else branch runs.

  • 2. In this case, it is enough to store sizeof(arr) / sizeof(arr[0]) in an int variable:

    void cast1() {
    	int arr[] = { 1,2,3,4 };
    	cout <<"arr size:" << sizeof(arr) / sizeof(arr[0]) << endl;
    	int threshold = -1;
    	int len = sizeof(arr) / sizeof(arr[0]);
    	if (len > threshold) {
    		cout << "arr len is bigger than threshold" << endl;
    	}
    	else
    	{
    		cout << "threshold is bigger than arr len" << endl;
    	}
    }
    

    Both operands are signed, so no conversion takes place and the correct result is obtained.
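
When the length has to stay unsigned, another option is to handle the negative case explicitly before comparing, so that a negative value is never converted implicitly. This is only a sketch, and the helper name lengthExceeds is hypothetical:

```cpp
#include <cstddef>

// Hypothetical helper: compares an unsigned length with a signed threshold
// without letting a negative threshold turn into a huge unsigned number.
bool lengthExceeds(std::size_t len, int threshold) {
    if (threshold < 0) {
        return true;   // every length is greater than a negative threshold
    }
    // threshold is non-negative here, so the explicit conversion is value-preserving.
    return len > static_cast<std::size_t>(threshold);
}
```

lengthExceeds(4, -1) is true, which matches the mathematical expectation, while lengthExceeds(4, 10) is false.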

Case 2:

void cast2() {
	double result = 0.0;
	int arr[] = { 10,20,30,40 };
	unsigned int len  = sizeof(arr) / sizeof(arr[0]);
	for (unsigned int i = 0; i < len; i++) {
		result += (1 / arr[i]);
	}
	cout << result << endl;
	return;
}
The result is 0.

**The problem lies in "1 / arr[i]": when the numerator and denominator are both of type int, the result is also an int, which means the fractional part is lost.** With a numerator of 1 and denominators greater than 1, every quotient is 0.
**How to correct it?** Just change (1 / arr[i]) to (1.0 / arr[i]). Since 1.0 / arr[i] is a floating-point division, the fractional part is kept and the final result is a correct floating-point number.

From the two cases above, it can be seen that some implicit conversions in C become hidden bugs and traps that are hard to spot.

What does C++ offer here?
Explicit conversions can be used in C++: static_cast, const_cast, dynamic_cast, reinterpret_cast

  • static_cast: writes out explicitly a conversion that could also happen implicitly (plus a few related ones, such as a downcast without a runtime check)
  • const_cast: removes (or adds) the const and volatile qualifiers
  • dynamic_cast: runtime-checked conversion between base class and derived class pointers or references
  • reinterpret_cast: reinterprets one pointer type as another unrelated type; risky
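
The four casts listed above can be sketched in a few lines. All types and function names here (Shape, Circle, Square, half, isCircle, legacyLength, lengthOf, addressBits) are hypothetical and exist only for illustration:

```cpp
#include <cstdint>

struct Shape { virtual ~Shape() {} };   // polymorphic base, needed for dynamic_cast
struct Circle : Shape {};
struct Square : Shape {};

// static_cast: explicit numeric conversion, so the division is done in double.
double half(int n) { return static_cast<double>(n) / 2; }

// dynamic_cast: yields nullptr when the object is not actually a Circle.
bool isCircle(Shape* s) { return dynamic_cast<Circle*>(s) != nullptr; }

// A stand-in for an old API that takes char* but never writes through it.
int legacyLength(char* s) {
    int n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// const_cast: removes const only to call the legacy function above.
int lengthOf(const char* s) { return legacyLength(const_cast<char*>(s)); }

// reinterpret_cast: views a pointer as an integer; the value is implementation-defined.
std::uintptr_t addressBits(const int* p) { return reinterpret_cast<std::uintptr_t>(p); }
```

Because each cast is spelled out, a reader (or a text search) can find every conversion in the code, which is exactly what the implicit C conversions do not allow.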

For case 2:

void cast2() {
	double result = 0.0;
	int arr[] = { 10,20,30,40 };
	unsigned int len  = sizeof(arr) / sizeof(arr[0]);
	for (unsigned int i = 0; i < len; i++) {
		result += (static_cast<double>(1))/ arr[i];
	}
	cout << result << endl;
	return;
}
Result: 0.208333

correct result

5. Integer overflow in C

Case 1:

void intOverflow() {
	int i = 0x7ffffff0;//2147483632
	for (; i > 0; i++) {
		cout << "adding" << endl;
	}
	cout << "end ??" << i << endl;
}

At first glance, i starts positive and only ever increases, so the condition i > 0 looks permanently true and the for loop should never end.
Actual results:

adding
adding
adding
adding
adding
adding
adding
adding
adding
adding
adding
adding
adding
adding
adding
adding
end ??-2147483648

After 16 additions the loop still exits. This is because i is an int, and a signed 32-bit integer ranges from -2147483648 to 2147483647. After the value reaches 2147483647, adding 1 wraps around to -2147483648 on typical two's complement machines.
It's like a clock that starts again from 0 after it has gone all the way around. Strictly speaking, signed integer overflow is undefined behavior in C and C++, so this output is not guaranteed.
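
One way to avoid relying on wrap-around is to check the range before adding. The following is a minimal sketch; the helper name safeAdd is hypothetical:

```cpp
#include <limits>

// Hypothetical helper: stores a + b in out and returns true only when the
// sum fits in an int; otherwise leaves out untouched and returns false.
bool safeAdd(int a, int b, int& out) {
    if (b > 0 && a > std::numeric_limits<int>::max() - b) {
        return false;   // the sum would exceed the largest int
    }
    if (b < 0 && a < std::numeric_limits<int>::min() - b) {
        return false;   // the sum would fall below the smallest int
    }
    out = a + b;
    return true;
}
```

The checks themselves never overflow, because they subtract b from the limit on the side where there is room. Calling safeAdd with 2147483647 and 1 returns false instead of producing -2147483648.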

Case 2:

void intOverflow2() {
	int a = 200, b = 300, c = 400, d = 500;
	cout << a * b * c * d << endl;
}
result:-884901888

Here too, an integer overflow occurs and the result becomes negative.

How can this kind of large-number overflow be solved in C++? Use an extension library, such as the Boost library: https://www.boost.org.

#include <boost/multiprecision/cpp_int.hpp>
using namespace boost::multiprecision;

void intOverflow2() {
	cpp_int a = 200, b = 300, c = 400, d = 500;
	cout << a * b * c * d << endl;
}
Result: 12000000000
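
For this particular product a big-number library is not strictly required, because 12000000000 fits in a 64-bit integer. A minimal sketch (the function name product64 is hypothetical) widens the first operand before multiplying:

```cpp
// Hypothetical helper: multiplies four ints using 64-bit arithmetic.
long long product64(int a, int b, int c, int d) {
    // Converting the first operand makes every multiplication happen in long long.
    return static_cast<long long>(a) * b * c * d;
}
```

product64(200, 300, 400, 500) returns 12000000000. A 64-bit integer can still overflow for larger inputs, which is where an arbitrary-precision type such as cpp_int remains the safer choice.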

6. Defects of C strings

Case 1:

void strTest() {
	char str1[] = "abcdef";
	cout << "str1_strlen:" << strlen(str1) << endl;
	cout <<"str1_sizeof:" << sizeof(str1) / sizeof(str1[0]) << endl;

	char str2[] = "abc\0def";
	cout << "str2_strlen:" << strlen(str2) << endl;
	cout << "str2_sizeof:" << sizeof(str2) / sizeof(str2[0]) << endl;

}
result:
str1_strlen:6
str1_sizeof:7
str2_strlen:3
str2_sizeof:8

The results above show that C strings are terminated by \0. strlen stops at the first \0, so a C string cannot carry embedded zero bytes (str2 is cut off at "abc"), and finding the length requires scanning the string each time, which is inefficient.

Solution in C++
Use the string class in C++, or an open source solution such as the string implementation in Redis (SDS, simple dynamic strings, which stores the length explicitly): https://redis.com https://github.com/redis

Redis is an open source (BSD licensed) in-memory data structure store used as a database, cache, message broker, and streaming engine. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams.

For more information about Redis, refer to the official Redis website.

summary

  • 1. C is the lowest-level of the high-level languages. It is small, efficient, and close to the hardware, but it has many details and traps.
  • 2. C++ inherits almost all the features of C and adds a series of more modern, engineering-oriented features. It is very inclusive, but the language itself is more complicated.
  • 3. This article covered the representation of characters, strings, pointers, arrays, and integers, along with type conversion and shift operations. It explained the design choices in C and the corresponding solutions in C++, to help everyone build a solid C/C++ foundation. I'm Xiaoyu, see you next time.

Tags: C Android C++

Posted by varai on Fri, 13 Jan 2023 17:17:04 +0530