Character functions, string functions and memory functions


This chapter explains how to use the character, string and memory functions of the C standard library, and writes simplified versions of several of them to show how they work.

Character and string processing is very common in C, but the language has no built-in string type. A string is stored either as a string literal or in a character array. A string literal must not be modified, so it is only suitable for functions that read it. A character array behaves like a string variable and can be modified.

Let's learn about string functions in detail.

Find string length

strlen function

First, let's look at the declaration of the strlen function

size_t strlen ( const char * str );

The parameter str points to a string that ends with '\0'. Its type is const char *, which means strlen will not modify the characters that str points to.

The return value is the number of characters in the string, not counting the terminating '\0'. Its type is size_t, an unsigned integer type, because a length can never be negative.

Note:

  • A string ends with '\0'. strlen returns the number of characters that appear before the first '\0'; the '\0' itself is not counted.

  • The string pointed to by the parameter must end with '\0'.

  • Note that the return type is size_t, which is unsigned (error-prone).

If the array passed in does not contain a '\0', strlen keeps reading past the end of the array. The behavior is undefined, and in practice the result is usually an unpredictable number.

Both the sizeof operator and the strlen function produce a value of type size_t, an unsigned integer type.

Look at the following code. What is the answer?

#include <stdio.h>
#include <string.h>
int main()
{
	const char*str1 = "abcdef";
	const char*str2 = "bbb";
	//size_t - size_t is evaluated in unsigned arithmetic and cannot be negative
	if(strlen(str2)-strlen(str1)>0)
	{
		printf("str2>str1\n");
	}
	else
	{
		printf("str1>str2\n");
	}
	return 0;
}

Many people expect "str1>str2" to be printed, but it is not. strlen returns size_t, an unsigned integer type, so the subtraction strlen(str2) - strlen(str1) is carried out in unsigned arithmetic and its result is also a size_t. The mathematical result 3 - 6 = -3 wraps around to a very large positive value, so the condition is true and the program prints "str2>str1". (The condition is false only when the two lengths are equal.)

Simulate the implementation of strlen function

Next, let's write our own version of strlen. We already implemented it once in the chapter on practical debugging skills, so this is a review.

Three methods

1. Use a counter variable

#include <stdio.h>
#include <assert.h>
size_t my_strlen(const char* str)
{
	assert(str);
	size_t count = 0;
	//Count characters until the terminating '\0'
	while (*str != '\0')
	{
		str++;
		count++;
	}
	return count;
}
int main()
{
	char arr[] = "abcdef";

	size_t len = my_strlen(arr);
	printf("%zu\n",len);//6
	return 0;
}

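2. Pointer minus pointer

size_t my_strlen(const char* str)
{
	assert(str);
	const char* ret = str;//Remember the start of the string
	while (*str != '\0')
	{
		str++;
	}
	//The distance between the '\0' and the start is the length
	return (size_t)(str - ret);
}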

3. Recursion

size_t my_strlen(const char* str)
{
	assert(str);
	if (*str != '\0')
	{
		return 1 + my_strlen(str + 1);
	}
	return 0;
}

String copy

strcpy function

char* strcpy(char * destination, const char * source);

Purpose: copy the string in the source space to the destination space.

There are two parameters, destination and source.
The source is only read and never changed, so its type is const char *. The destination is written to, so its type is char *.

The return value is the address of the first character of the destination space.

Note:

  • Copies the C string pointed by source into the array pointed by destination, including the terminating null character (and stopping at that point).

  • The source string must end with '\0'. (If the source has no '\0', strcpy reads past its end, which easily causes an illegal memory access.)

  • The '\0' in the source string is copied too. (Copying continues up to and including the first '\0'.)

  • The destination space must be large enough to hold the source string, including its '\0'.

  • The destination space must be modifiable.

#include <stdio.h>
#include <string.h>
int main()
{
	//str points to a string literal, which must not be modified,
	//so it may be a source for strcpy but never a destination
	const char* str = "abcdef";

	char arr1[] = "xxxxxxxx";
	char arr2[] = "abc";

	//Copy the contents of arr2 to arr1
	strcpy(arr1,arr2);
	printf("%s\n",arr1);//abc

	char arr3[] = "xx";
	//strcpy(arr3,arr2); //wrong: arr3 has 3 bytes but "abc" needs 4, including '\0'

	char arr4[] = {'a','b','c','d'};
	//strcpy(arr1,arr4); //wrong: arr4 has no '\0', so strcpy reads past its end

	return 0;
}

Simulated implementation of strcpy function

Version 1

#include <stdio.h>
#include <assert.h>
char* my_strcpy(char* dest,const char* src)
{
	assert(dest && src);
	char* ret = dest;

	while (*src != '\0')
	{
		*dest = *src;
		dest++;
		src++;
	}
	*dest = *src;//Copy the terminating '\0' as well
	return ret;
}


int main()
{
	char arr1[] = "xxxxxx";
	char arr2[] = "abc";

	my_strcpy(arr1,arr2);

	printf("%s\n",arr1);

	return 0;
}

Version 2

char* my_strcpy(char* dest, const char* src)
{
	assert(dest && src);
	char* ret = dest;

	//Assign first, then test the assigned value: the loop ends right after '\0' is copied
	while (*dest++ = *src++)
	{
		;
	}
	return ret;
}

int main()
{
	char arr1[] = "xxxxxx";
	char arr2[] = "abc";

	my_strcpy(arr1,arr2);

	printf("%s\n",arr1);

	return 0;
}

String concatenation

strcat function

char *strcat( char *strDestination, const char *strSource );

Purpose: append a copy of the source string to the destination string. The '\0' at the end of the destination string is overwritten by the first character of the source, and a '\0' is placed at the end of the new string formed by the concatenation.

#include <stdio.h>
#include <string.h>
int main()
{
	char arr1[20] = "abc";
	strcat(arr1,"def");
	printf("%s\n",arr1);
	return 0;
}

Note:

  • Appending starts at the '\0' of the destination string.
  • The source string must end with '\0'.
  • The destination space must be large enough to hold the combined string, including its '\0'.
  • The destination space must be modifiable.

Simulated implementation of strcat function

#include <assert.h>
char* my_strcat(char* dest,const char* src)
{
	assert(dest && src);
	char* ret = dest;

	//1. Find the '\0' at the end of the destination string
	while (*dest)
	{
		dest++;
	}

	//2. Append the source string, up to and including its '\0'
	while (*dest++ = *src++)
	{
		;
	}
	//3. Return the starting address of the target space
	return ret;
}

int main()
{
	char arr1[20] = "abc";
	char arr2[] = "def";

	my_strcat(arr1,arr2);
	printf("%s\n",arr1);
	return 0;
}

String comparison

strcmp function

int strcmp( const char *string1, const char *string2 );

Purpose: compare two strings character by character, by character code rather than by length.

The return value indicates the lexicographic relation of string1 to string2.

Value   Relationship of string1 to string2
< 0     string1 less than string2
0       string1 identical to string2
> 0     string1 greater than string2

The standard specifies:

  • If the first string is greater than the second string, a number greater than 0 is returned
  • If the first string is equal to the second string, 0 is returned
  • If the first string is less than the second string, a number less than 0 is returned

#include <stdio.h>
#include <string.h>
int main()
{
	char arr1[] = "abcdef";
	char arr2[] = "abcq";
	char arr3[] = {'a','b','q'};

	//arr3 is not a string (no '\0'), so passing it to strcmp is undefined
	//int ret = strcmp(arr1,arr3);

	//arr1 and arr2 first differ at index 3: 'd' < 'q'
	int ret = strcmp(arr1,arr2);

	if (ret > 0)
	{
		printf(">\n");
	}
	else if (ret < 0)
	{
		printf("<\n");//this branch runs
	}
	else
	{
		printf("=\n");
	}
	return 0;
}

Simulated implementation of strcmp function

#include <stdio.h>
#include <assert.h>
int my_strcmp(const char* s1,const char* s2)
{
	assert(s1 && s2);
	while (*s1 == *s2)
	{
		//Both strings ended at the same position, so s1 == s2
		if (*s1 == '\0')
		{
			return 0;
		}
		s1++;
		s2++;
	}
	//s1 != s2: compare the first differing characters as unsigned char
	return (unsigned char)*s1 - (unsigned char)*s2;
}

int main()
{
	char arr1[] = "abcd";
	char arr2[] = "dfe";

	//Both arguments must be '\0'-terminated strings
	int ret = my_strcmp(arr1,arr2);
	printf("%d\n",ret);//Less than 0, because 'a' < 'd'
	return 0;
}

Drawback: strcpy, strcat and strcmp place no limit on the number of characters they process, so nothing stops them from running past the end of a buffer. They are not safe enough!

Length-limited functions

Next come the length-limited string functions, which are relatively safer.

strncpy function

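char * strncpy ( char * destination, const char * source, size_t num );

  • Copies the first num characters of the source string to the destination space.

  • If the source string is shorter than num, the rest of the destination is filled with '\0' until a total of num characters have been written.

  • If the source string has num or more characters, no '\0' is appended, so the destination does not hold a terminated string unless you add the '\0' yourself.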

Simulate the implementation of strncpy function

Version 1

#include <stdio.h>
#include <assert.h>
#include <string.h>
char* my_strncpy(char* dest,const char* src,size_t num)
{
	assert(dest && src);
	char* ret = dest;

	//1. num <= strlen(src): copy exactly num characters, no '\0' is added
	if (num <= strlen(src))
	{
		while (num)
		{
			*dest++ = *src++;
			num--;
		}
	}
	//2. num > strlen(src): copy the whole string, then pad with '\0'
	else
	{
		while ((*dest++ = *src++) != '\0')
		{
			num--;
		}
		num--;//Count the '\0' that was just copied

		//Pad with '\0' until num characters have been written in total
		while (num > 0)
		{
			*dest++ = '\0';
			num--;
		}
	}
	return ret;
}


int main()
{
	char arr1[] = "abcdef";
	char arr2[] = "xxxx";

	char* ret = my_strncpy(arr1,arr2,6);
	printf("%s\n",ret);//xxxx
	return 0;
}

Version 2

#include <assert.h>
#include <string.h>
char* my_strncpy(char* dest, const char* src, size_t num)
{
	assert(dest && src);
	char* ret = dest;

	while ((num) && (*src != '\0'))
	{
		*dest++ = *src++;
		num--;
	}
	while (num > 0)
	{
		*dest = 0;
		dest++;
		num--;
	}

	return ret;
}

int main()
{
	char arr1[] = "abcdef";
	char arr2[] = "xxxx";

	char* ret = my_strncpy(arr1,arr2,6);
	printf("%s\n",ret);
	return 0;
}

strncat function

char * strncat ( char * destination, const char * source, size_t num );

  • Appends the first num characters of source to destination, plus a terminating null character.
  • If the length of the C string in source is less than num, only the content up to the terminating null character is copied.

In other words, at most num characters are taken from the source string and appended to the destination, and a '\0' is always added after them. If the source string is shorter than num, only the string itself (with its '\0') is copied, and nothing is padded.

#include <stdio.h>
#include <string.h>
int main()
{
	//arr1 has 10 bytes; as a string it is "abc", followed by spare room
	char arr1[] = "abc\0xxxxx";
	char arr2[] = "defghi";

	strncat(arr1,arr2,3);
	printf("%s\n",arr1);//abcdef
	return 0;
}

Simulate the implementation of strncat function

#include <stdio.h>
#include <assert.h>
//Simulated implementation of the strncat function
char* my_strncat(char* dest,const char* src,size_t num)
{
	assert(dest && src);
	char* ret = dest;

	//1. Find the '\0' at the end of dest
	while (*dest)
	{
		dest++;
	}

	//2. Append at most num characters of src, starting at that position
	while (*src != '\0' && num)
	{
		*dest++ = *src++;
		num--;
	}

	//3. Always terminate the result
	*dest = '\0';

	return ret;
}


int main()
{
	char arr1[30] = "abcdefghi";
	char arr2[] = "jkl";
	my_strncat(arr1,arr2,4);
	printf("%s\n",arr1);//abcdefghijkl
	return 0;
}

strncmp function

int strncmp ( const char * str1, const char * str2, size_t num );

Characters are compared one pair at a time until a pair differs, one of the strings ends, or num characters have been compared.

#include <stdio.h>
#include <string.h>
int main()
{
	char arr1[] = "abcdef";
	char arr2[] = "abcdeg";

	//Only the first 4 characters ("abcd") are compared
	int ret = strncmp(arr1,arr2,4);

	printf("%d\n",ret);//0
	return 0;
}

Simulated implementation of strncmp function

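#include <stdio.h>
#include <assert.h>
//Simulated implementation of the strncmp function
int my_strncmp(const char* s1,const char* s2,size_t num)
{
	assert(s1 && s2);

	//Check num first so that no character beyond the limit is read
	while (num && *s1 == *s2)
	{
		if (*s1 == '\0')
		{
			return 0;
		}
		s1++;
		s2++;
		num--;
	}

	//All num characters matched
	if (num == 0)
	{
		return 0;
	}
	return (unsigned char)*s1 - (unsigned char)*s2;
}


int main()
{
	char arr1[] = "abcdefgh";
	char arr2[] = "abcd";
	int ret = my_strncmp(arr1,arr2,5);
	printf("%d\n",ret);//Greater than 0: 'e' is compared with '\0'
	return 0;
}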

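strcpy_s is provided by the library that ships with the VS compiler. The C11 standard lists it only in an optional annex, and many other C libraries do not provide it, so code that uses it is not portable.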

String search function

strstr function

char * strstr ( const char * str, const char * substr);

Purpose: determine whether one string is a substring of another.

Returns a pointer to the first occurrence of substr in str, or a null pointer if substr does not occur in str.

Simulated implementation of the strstr function

The task is to determine whether a string str2 occurs inside another string str1.

1. Compare the first character of str1 with the first character of str2. If they differ, compare the second character of str1 with the first character of str2, and so on, until a character of str1 equals the first character of str2.

2. That position has to be remembered, so a pointer cp marks it. If this attempt fails, matching restarts from the position after cp. One failed attempt does not mean that str2 is not a substring of str1.

3. Starting from cp, a pointer s1 moves through str1 and a pointer s2 moves through str2, comparing one character at a time. If str1 ends before all of str2 has been compared, or two characters differ, this attempt fails.

4. After a failed attempt, cp advances by one, s1 is reset to cp and s2 to the first character of str2, and steps 1 to 3 are repeated in a loop. If cp reaches the end of str1 without a successful match, the loop ends and str2 was not found.

Let's simulate the implementation of the strstr function

#include <stdio.h>
#include <string.h>
#include <assert.h>

char* my_strstr(const char* str1,const char* str2)
{
	assert(str1 && str2);

	const char* s1;
	const char* s2;
	const char* cp = str1;//Position in str1 where the current attempt starts

	//An empty str2 matches at the start of str1
	if (*str2 == '\0')
	{
		return (char*)str1;
	}

	while (*cp != '\0')
	{
		s1 = cp;
		s2 = str2;

		//Advance both pointers while the characters match and neither string has ended
		while (*s1 && *s2 && (*s1 == *s2))
		{
			s1++;
			s2++;
		}
		if (*s2 == '\0')
		{
			//All of str2 matched: return the position where the match starts
			return (char*)cp;
		}
		//No complete match here: move cp forward one character and try again
		cp++;
	}
	//Not found
	return NULL;
}

int main()
{
	char arr1[] = "i am good student,she is a student";
	char arr2[] = "student";

	//Find the first occurrence of arr2 in arr1
	char* ret = strstr(arr1,arr2);
	if (ret == NULL)
	{
		printf("can't find\n");
	}
	else
	{
		printf("eureka,%s\n",ret);
	}
	return 0;
}

Splitting a string

strtok function

char * strtok ( char * str, const char * sep );

  • The sep parameter is a string that defines the set of characters used as delimiters.

  • The first parameter specifies a string that contains zero or more tokens separated by one or more of the delimiter characters in sep.

  • strtok finds the next token in str, terminates it with '\0', and returns a pointer to that token. (Note: strtok modifies the string it works on, so the string passed to it is usually a temporary copy that may be modified.)

  • When the first parameter is not NULL, strtok finds the first token in str and saves its position in the string.

  • When the first parameter is NULL, strtok starts from the position saved for the same string and finds the next token.

  • If there are no more tokens in the string, a NULL pointer is returned.

So the first call passes the string as the first argument, and every later call for the same string passes NULL.

strtok remembers where it stopped: it keeps the position of the last token in internal static storage, so the next call can continue from there.

Code 1

#include <stdio.h>
#include <string.h>
int main()
{
	//For an address such as 192.168.1.207, "." would be the only delimiter;
	//here both '@' and '.' are delimiters

	char arr1[] = "ly@qq.com";
	const char* sep = "@.";

	char* ret = NULL;
	//First use, the first parameter is not NULL
	ret = strtok(arr1,sep);
	while (ret != NULL)
	{
		printf("%s\n", ret);
		//For later use, the first parameter is NULL
		ret = strtok(NULL,sep);
	}


	return 0;
}

Code 2

#include <stdio.h>
#include <string.h>
int main()
{
	const char *p = "zhangpengwei@bitedu.tech";
	const char* sep = ".@";
	char arr[30];
	char *str = NULL;
	strcpy(arr, p);//Copy the data and let strtok work on the copy in arr
	for(str=strtok(arr, sep); str != NULL; str=strtok(NULL, sep))
	{
		printf("%s\n", str);
	}
	return 0;
}

strerror function

char* strerror ( int errnum );

Purpose: return the error message that corresponds to an error code.
The parameter errnum is the error code.
The return value is a pointer to the first character of the error message string.

#include <stdio.h>
#include <string.h>
int main()
{
	//The message text depends on the C library; the comments show typical output
	printf("%s\n",strerror(0));//No error
	printf("%s\n",strerror(1));//Operation not permitted
	printf("%s\n",strerror(2));//No such file or directory
	printf("%s\n",strerror(3));//No such process
	return 0;
}

When a C library call fails, many functions store an error code in the errno variable, which is declared in <errno.h>.

#include <stdio.h>
#include <string.h>
#include <errno.h> //required for errno

int main ()
{
	FILE * pFile;
	pFile = fopen ("unexist.ent","r");
	if (pFile == NULL)
		printf ("Error opening file unexist.ent: %s\n",strerror(errno));
		//errno: Last error number
	return 0;
}

Character functions

Character conversion:
int tolower ( int c );
int toupper ( int c );

Take the isupper and tolower functions as an example.
The header file is <ctype.h>.

#include <stdio.h>
#include <ctype.h>
int main()
{
	int i = 0;
	char str[] = "Test String.\n";
	char c;
	while (str[i])
	{
		c = str[i];
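		//The <ctype.h> functions expect an unsigned char value (or EOF)
		if (isupper((unsigned char)c))
			c = (char)tolower((unsigned char)c);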
		putchar(c);
		i++;
	}
	return 0;
}

Memory operation function

strcpy can only copy strings. What about data of other types? For that, the C library has a family of memory functions.

Memory copy memcpy

void *memcpy( void *dest, const void *src, size_t count );

count is the number of bytes to copy.
void * is a generic pointer type that can receive a pointer to any object type.

  • memcpy copies count bytes, starting at src, to the memory starting at dest.

  • The function does not stop when it encounters '\0'.

  • If the source and destination overlap, the behavior is undefined.

#include <string.h>
int main()
{
	int arr1[] = {1,2,3,4,5,6,7,8,9,10};
	int arr2[20] = {0};

	//Copy the first 5 ints of arr1 (5 * sizeof(int) bytes) into arr2
	memcpy(arr2,arr1,sizeof(int)*5);

	return 0;
}

Simulated implementation of the memcpy function

Copy one byte at a time

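#include <assert.h>
#include <stddef.h>
void* my_memcpy(void* dest,const void* src,size_t count)
{
	assert(dest && src);
	void* ret = dest;
	while (count--)
	{
		//Convert to char* so that exactly one byte is copied per step
		*(char*)dest = *(const char*)src;
		dest = (char*)dest + 1;
		src = (const char*)src + 1;
	}
	return ret;
}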

int main()
{
	int arr1[] = {1,2,3,4,5,6,7,8,9,10};
	int arr2[20] = {0};

	//Memory Copy 
	//memcpy(arr2,arr1,sizeof(int)*5);
	my_memcpy(arr2,arr1,sizeof(int)*5);

	return 0;
}

When the source and destination overlap, the result of memcpy is undefined.

Memory copy function memmove

void * memmove ( void * destination, const void * source, size_t num );

  • Unlike memcpy, memmove works correctly when the source and destination memory blocks overlap.

  • If the source space and destination space overlap, use memmove.

Simulated implementation of the memmove function

There are two cases:

1. The two memory blocks do not overlap

In this case, the result is the same whether copying from front to back or from back to front.

2. The two memory blocks overlap

This splits into two sub-cases.
(1) dest > src

Copying from front to back would overwrite the later part of src before it has been read, so it does not work here.

Instead we copy from back to front, so the bytes that are about to be overwritten are copied first.

(2) dest < src

Here copying from back to front would overwrite the earlier part of src before it has been read, so we copy from front to back.

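#include <stdio.h>
#include <assert.h>
void* my_memmove(void* dest,const void* src,size_t num)
{
	assert(dest && src);
	void* ret = dest;
	if (dest < src)
	{
		//Copy from front to back
		while (num--)
		{
			*(char*)dest = *(const char*)src;
			dest = (char*)dest + 1;
			src = (const char*)src + 1;
		}
	}
	else
	{
		//Copy from back to front: after num--, num is the index
		//of the last byte that has not been copied yet
		while (num--)
		{
			*((char*)dest + num) = *((const char*)src + num);
		}

	}
	return ret;
}

int main()
{
	char arr1[] = "abcdef";

	my_memmove(arr1 + 2,arr1, 3);//arr1 is now "ababcf"
	printf("%s\n",arr1);

	//The library call memmove(arr1, arr1 + 2, 3) would do the same as the next line
	my_memmove(arr1, arr1 + 2, 3);//arr1 is now "abcbcf"
	printf("%s\n",arr1);

	return 0;
}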

The memcpy provided with the VS compiler happens to handle overlapping blocks the same way memmove does, but there is no guarantee that memcpy in other implementations behaves like memmove.

Memory comparison memcmp function

int memcmp ( const void * ptr1, const void * ptr2, size_t num );
Compares the first num bytes of the two memory blocks, byte by byte.

#include <stdio.h>
#include <string.h>
int main()
{
	char buffer1[] = "ABCDEFGHI";
	char buffer2[] = "ABCDEFGIH";
	int n;
	n = memcmp(buffer1, buffer2, sizeof(buffer1));
	if (n > 0) 
		printf("'%s' is greater than '%s'.\n", buffer1, buffer2);
	else if (n < 0) 
		printf("'%s' is less than '%s'.\n", buffer1, buffer2);
	else
		printf("'%s' is the same as '%s'.\n", buffer1, buffer2);
	return 0;
}

Memory setting memset function

void *memset( void *dest, int c, size_t count );

Purpose: fill a block of memory.
Sets each of the first count bytes of the destination space to the value c (converted to unsigned char).

/* MEMSET.C: This program uses memset to
 * set the first four bytes of buffer to "*".
 */

#include <string.h>
#include <stdio.h>

int main( void )
{
   char buffer[] = "This is a test of the memset function";

   printf( "Before: %s\n", buffer );
   memset( buffer, '*', 4 );
   printf( "After:  %s\n", buffer );
   return 0;
}

//Output
//Before: This is a test of the memset function
//After:  **** is a test of the memset function

#include <stdio.h>
#include <string.h>
int main()
{
	char arr1[] = "abcdefgh";

	//Set 3 bytes to 0, starting at arr1[2], so the string now ends after "ab"
	memset(arr1 + 2,0,3);
	printf("%s\n",arr1);//ab

	return 0;
}

Tags: C string

Posted by el_quijote on Wed, 22 Sep 2021 20:19:39 +0530