C++ multithreading development

Preface

Thread generation (NPTL)

  • When Linux was first developed, the kernel had no real support for threads. It could, however, treat a process as a schedulable entity through the clone() system call, which creates a copy of the calling process that shares the caller's address space. The LinuxThreads project used this call to simulate thread support in user space. Unfortunately, this approach had several disadvantages, especially in signal handling, scheduling and inter-process synchronization, and the resulting threading model did not conform to POSIX.
  • Improving on LinuxThreads required kernel support and a rewritten thread library. Two competing projects set out to meet these requirements: a team including IBM developers worked on NGPT (Next Generation POSIX Threads), while developers at Red Hat started NPTL. NGPT was abandoned in mid-2003, leaving the field entirely to NPTL.
  • NPTL, or Native POSIX Thread Library, is a new implementation of Linux threads. It overcomes the shortcomings of LinuxThreads and meets the requirements of POSIX. Compared with LinuxThreads, it provides significant improvements in performance and stability.
  • Check the current pthread library version: getconf GNU_LIBPTHREAD_VERSION

The concept of thread

  • Like processes, threads are a mechanism that lets an application perform multiple tasks concurrently. A process can contain multiple threads. All threads in the same process execute the same program independently and share the same global memory, including the initialized data segment, the uninitialized data segment and the heap. (A traditional UNIX process is simply a special case of a multithreaded program, one that contains a single thread.)
  • A process is the smallest unit of resource allocation by the operating system; a thread is the smallest unit of scheduling and execution.
  • On Linux, threads are implemented as lightweight processes (LWP: Light Weight Process); to the kernel, a thread is still essentially a process.
  • View the LWP (thread) IDs of a given process: ps -Lf pid

1, The difference between thread and process

  • Sharing information between processes is difficult. Apart from the read-only code segment, parent and child processes do not share memory, so some form of inter-process communication is needed to exchange information.
  • Creating a process with fork() is relatively expensive. Even with copy-on-write, process attributes such as the memory page tables and the file descriptor table still have to be copied, so a fork() call remains costly in time.
  • Threads can share information easily and quickly: the data only has to be placed in shared (global or heap) variables.
  • Creating a thread is usually 10 times or more faster than creating a process. Threads share the virtual address space, so no memory needs to be duplicated with copy-on-write and no page tables need to be copied.
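
The difference in memory sharing can be sketched as follows. This is only an illustration, assuming a Linux system; the names counter, bump, counter_after_fork and counter_after_thread are made up for this example.

```c
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;   /* global variable in the data segment */

static void *bump(void *arg) {
    (void)arg;
    counter += 1;          /* the thread writes to the creator's memory */
    return NULL;
}

/* Returns the parent's view of counter after a child PROCESS changed its copy. */
int counter_after_fork(void) {
    counter = 0;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {        /* child: the write goes to a private copy-on-write page */
        counter += 1;
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return counter;        /* unchanged in the parent */
}

/* Returns counter after a THREAD of the same process changed it. */
int counter_after_thread(void) {
    pthread_t tid;
    counter = 0;
    if (pthread_create(&tid, NULL, bump, NULL) != 0) return -1;
    pthread_join(tid, NULL);
    return counter;        /* the thread's write is visible */
}
```

Called from main(), counter_after_fork() is expected to return 0 and counter_after_thread() to return 1 (compile with -pthread).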

Resources shared by threads

Process ID and parent process ID
Process group ID and session ID
User ID and user group ID
File descriptor table
Signal dispositions (signal handlers)
File system related information: file permission mask (umask), current working directory
Virtual address space (each thread has its own stack within it)

Resources not shared by threads

Thread ID
Signal mask
Thread-specific data
errno variable
Real-time scheduling policy and priority
Stack: local variables and function call linkage information
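
One item from each list can be checked with a short sketch: the process ID is shared, while every thread has its own errno. The code assumes Linux; struct report, worker and shares_pid_but_not_errno are hypothetical names used only here.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

struct report {
    pid_t pid;             /* PID as seen by the child thread */
    uintptr_t errno_addr;  /* address of the child thread's errno */
};

static void *worker(void *arg) {
    struct report *r = arg;
    r->pid = getpid();                   /* shared: same process ID */
    r->errno_addr = (uintptr_t)&errno;   /* private: per-thread errno storage */
    return NULL;
}

/* Returns 1 if the child saw the same PID but a different errno location. */
int shares_pid_but_not_errno(void) {
    struct report r = { 0, 0 };
    pthread_t tid;
    if (pthread_create(&tid, NULL, worker, &r) != 0) return -1;
    pthread_join(tid, NULL);
    return r.pid == getpid() && r.errno_addr != (uintptr_t)&errno;
}
```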

2, C++ thread function

Thread creation

The thread that runs the main function is called the main thread; the other threads it creates are called child threads.
By default a program has a single process; after a call to fork() there are two processes.
By default a program has a single thread; after a call to pthread_create() there are two threads.

#include <pthread.h>
int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);
    - Function: create a child thread
    - Parameters:
        - thread: output parameter. After the thread is created successfully, the ID of the child thread is written to this variable.
        - attr: attributes for the new thread; usually NULL to use the defaults
        - start_routine: function pointer; the code the child thread will execute
        - arg: argument passed to start_routine (the third parameter)
    - Return value:
        Success: 0
        Failure: an error number is returned. Unlike most system calls, errno is not set; the return value itself is the error number.
        Get the message for an error number: char * strerror(int errnum);
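
A minimal sketch of the call, including argument passing and error reporting. The function names square and square_in_thread are invented for this example, and pthread_join() (described later in these notes) is used to wait for the result.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Thread function: arg is whatever the creator passed as the 4th parameter. */
static void *square(void *arg) {
    int *n = arg;
    *n = *n * *n;
    return NULL;
}

/* Runs square() in a new thread; returns the result, or -1 on error. */
int square_in_thread(int n) {
    pthread_t tid;
    int ret = pthread_create(&tid, NULL, square, &n);
    if (ret != 0) {
        /* ret is the error number itself; errno is not consulted */
        fprintf(stderr, "pthread_create: %s\n", strerror(ret));
        return -1;
    }
    pthread_join(tid, NULL);   /* n must stay alive until the thread is done */
    return n;
}
```

For instance, square_in_thread(7) is expected to return 49.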

Thread termination, thread id acquisition and comparison

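When the main thread exits by calling pthread_exit(), the other running threads are not affected. Returning from main() or calling exit(), by contrast, terminates the whole process and all of its threads.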

#include <pthread.h>
void pthread_exit(void *retval);
    Function: terminate the calling thread, that is, the thread in which the function is called
    Parameters:
        retval: a pointer passed back as the thread's return value; another thread can obtain it with pthread_join().
        Pass NULL if no return value is needed.

pthread_t pthread_self(void);
    Function: get the thread ID of the calling thread

int pthread_equal(pthread_t t1, pthread_t t2);
    Function: compare two thread IDs for equality (non-zero if equal, 0 otherwise)
    The implementation of pthread_t differs between operating systems: some use an unsigned long integer,
    others use a structure, so thread IDs should be compared with pthread_equal() rather than ==.
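
The three functions can be combined in a small sketch. The helper names compare_ids and id_comparison are hypothetical, and packing the two answers into the exit value is just a convenience for this example.

```c
#include <pthread.h>
#include <stdint.h>

/* Child thread: compares its own ID with itself and with the creator's ID. */
static void *compare_ids(void *arg) {
    pthread_t *creator = arg;
    intptr_t same_as_self    = pthread_equal(pthread_self(), pthread_self()) != 0;
    intptr_t same_as_creator = pthread_equal(pthread_self(), *creator) != 0;
    /* bit 0: equal to itself, bit 1: equal to the creator */
    pthread_exit((void *)(same_as_self | (same_as_creator << 1)));
}

/* Returns the bits produced by the child, or -1 if no thread could be created. */
int id_comparison(void) {
    pthread_t me = pthread_self();
    pthread_t tid;
    void *result = NULL;
    if (pthread_create(&tid, NULL, compare_ids, &me) != 0) return -1;
    pthread_join(tid, &result);   /* receives the value given to pthread_exit() */
    return (int)(intptr_t)result;
}
```

The expected result is 1: the child's ID equals itself and differs from the creator's ID.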

Joining a terminated thread (reclaiming thread resources)

If a joinable thread is not reclaimed after it finishes, it becomes a "zombie thread" whose resources are still held. Any thread can reclaim another thread, although this is usually done by the main thread. pthread_join() reclaims a thread's resources; it is a blocking function and reclaims only one thread per call.

#include <pthread.h>
int pthread_join(pthread_t thread, void **retval);
    - Function: join with a terminated thread;
            reclaim the resources of the child thread;
            the function blocks and reclaims only one child thread per call;
            it is generally called from the main thread.
    - Parameters:
        - thread: ID of the child thread to reclaim
        - retval: receives the return value of the child thread when it exits (a pointer to a pointer)
    - Return value:
        0: success
        Non-zero: failure; the value is the error number

Two points deserve attention in the demo that reclaims a child thread's resources:

  1. Each thread has its own stack. If the pointer passed to pthread_exit(void *retval) refers to a local variable (stored on that stack), the stack is released when the thread exits and the contents of that memory become indeterminate. The value passed to pthread_exit() must therefore point to storage that outlives the thread, such as a global or static variable or heap memory.
  2. The second parameter of pthread_join() is a pointer to a pointer. This design lets the function modify the caller's pointer, as explained next.

Given int *a = xx; and a function change(int *a), the pointer is passed by value: the function receives a copy of the argument, so it can modify the int that a points to but cannot change a itself. To let a function change the caller's pointer, the address of that pointer (an int **) must be passed. pthread_join() has to store a void * in the caller's variable, which is why its parameter is a void **.
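
As a standalone illustration of the pass-by-value point, the sketch below contrasts a function that receives a pointer with one that receives the address of a pointer. All names (target, change_by_value, change_by_address and the two demo functions) are made up for this example.

```c
#include <stddef.h>

static int target = 42;

/* Receives a COPY of the caller's pointer: the assignment is lost on return. */
void change_by_value(int *p) {
    p = &target;
    (void)p;
}

/* Receives the ADDRESS of the caller's pointer: the caller's pointer is updated. */
void change_by_address(int **pp) {
    *pp = &target;
}

/* Returns 1 if the caller's pointer is still NULL after change_by_value(). */
int demo_by_value(void) {
    int *a = NULL;
    change_by_value(a);
    return a == NULL;
}

/* Returns 1 if the caller's pointer now points at target. */
int demo_by_address(void) {
    int *a = NULL;
    change_by_address(&a);
    return a != NULL && *a == 42;
}
```

pthread_join(tid, &retval) follows the second pattern: it writes the thread's exit pointer into the caller's variable.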

#include <stdio.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

int value = 10;

void * callback(void * arg) {
    // pthread_t is an unsigned long on Linux; the cast is not portable
    printf("child thread id : %lu\n", (unsigned long)pthread_self());
    // sleep(3);
    // return NULL; 
    // int value = 10; //  local variable
    pthread_exit((void *)&value);   // return (void *)&value;
} 

int main() {

    // Create a child thread
    pthread_t tid;
    int ret = pthread_create(&tid, NULL, callback, NULL);

    if(ret != 0) {
        char * errstr = strerror(ret);
        printf("error : %s\n", errstr);
        return 1;   // no thread was created, so there is nothing to join
    }

    // Main thread
    for(int i = 0; i < 5; i++) {
        printf("%d\n", i);
    }

    printf("tid : %lu, main thread id : %lu\n", (unsigned long)tid, (unsigned long)pthread_self());

    // Main thread calls pthread_join() reclaims resources of child threads
    int * thread_retval;
    ret = pthread_join(tid, (void **)&thread_retval);

    if(ret != 0) {
        char * errstr = strerror(ret);
        printf("error : %s\n", errstr);
    }

    printf("exit data : %d\n", *thread_retval);

    printf("Recycling sub thread resources succeeded!\n");

    // Let the main thread exit. When the main thread exits, it will not affect other normal running threads.
    pthread_exit(NULL);

    return 0; 
}

Detaching threads (reclaiming thread resources)

Detach a thread. When a detached thread terminates, its resources are released automatically and returned to the system.

#include <pthread.h>
int pthread_detach(pthread_t thread);
    - Function: detach a thread. When a detached thread terminates, its resources are released automatically and returned to the system.
          1. A thread must not be detached more than once; doing so results in unpredictable behavior.
          2. A detached thread cannot be joined; attempting to join it is an error.
    - Parameter: ID of the thread to detach
    - Return value:
            Success: 0
            Failure: an error number is returned
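
A minimal sketch of detaching a thread. Because a detached thread cannot be joined, this example uses a semaphore (covered later in these notes) to learn that the thread has run; the names done, work and run_detached are hypothetical.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t done;

static void *work(void *arg) {
    (void)arg;
    sem_post(&done);   /* tell the creator we ran; nobody will join us */
    return NULL;
}

/* Returns pthread_detach()'s result (0 on success), or -1 if setup failed. */
int run_detached(void) {
    pthread_t tid;
    if (sem_init(&done, 0, 0) != 0) return -1;
    if (pthread_create(&tid, NULL, work, NULL) != 0) return -1;
    int ret = pthread_detach(tid);   /* resources are released when work() ends */
    sem_wait(&done);                 /* wait for the thread without joining it */
    sem_destroy(&done);
    return ret;
}
```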

Thread cancellation (stopping a thread partway through execution)

A thread can be asked to terminate, but it does not stop immediately: with the default settings the request takes effect only when the target thread reaches a cancellation point. Cancellation points are specific functions designated by the system, many of them blocking system calls; as a rough guide, think of the places where execution switches from user space to kernel space.

#include <pthread.h>
int pthread_cancel(pthread_t thread);
    - Function: request cancellation of a thread (ask the thread to terminate)
    - Parameters: thread: ID of the thread to cancel
    - Return value: 0 on success; a non-zero error number on failure
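
The deferred nature of cancellation can be sketched like this. The thread function spin and the wrapper cancel_demo are invented names; the sketch relies on sleep() being a cancellation point and on pthread_join() reporting PTHREAD_CANCELED for a cancelled thread.

```c
#include <pthread.h>
#include <unistd.h>

/* Loops forever; the only way out is cancellation inside sleep(). */
static void *spin(void *arg) {
    (void)arg;
    for (;;) {
        sleep(1);   /* sleep() is a cancellation point */
    }
}

/* Returns 1 if the thread was cancelled, 0 if it ended some other way, -1 on error. */
int cancel_demo(void) {
    pthread_t tid;
    void *result = NULL;
    if (pthread_create(&tid, NULL, spin, NULL) != 0) return -1;
    /* Only a request: it is acted on when the thread reaches a cancellation point */
    if (pthread_cancel(tid) != 0) return -1;
    if (pthread_join(tid, &result) != 0) return -1;
    return result == PTHREAD_CANCELED;
}
```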

Thread properties

Type man pthread_attr_ in the terminal and press the Tab key twice to list the functions related to thread attributes. The following functions operate on thread attributes:

int pthread_attr_init(pthread_attr_t *attr);
    - Initialize thread property variables

int pthread_attr_destroy(pthread_attr_t *attr);
    - Release resources for thread properties

int pthread_attr_getdetachstate(const pthread_attr_t *attr, int *detachstate);
    - Get the detach state attribute

int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);
    - Set the detach state attribute (PTHREAD_CREATE_JOINABLE or PTHREAD_CREATE_DETACHED)

int pthread_attr_getstacksize(const pthread_attr_t *attr, size_t *stacksize);
    - Get the thread stack size

int pthread_attr_setstacksize(pthread_attr_t *attr, size_t stacksize);
    - Set the thread stack size
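
The getter and setter pairs can be exercised without creating any thread. The following is a sketch under the assumption that the requested stack size is acceptable to the system (it must be at least PTHREAD_STACK_MIN); stacksize_roundtrip and detachstate_roundtrip are hypothetical helper names.

```c
#include <pthread.h>
#include <stddef.h>

/* Stores a stack size in an attribute object and reads it back.
   Returns the size read, or 0 on error. */
size_t stacksize_roundtrip(size_t wanted) {
    pthread_attr_t attr;
    size_t got = 0;
    if (pthread_attr_init(&attr) != 0) return 0;
    if (pthread_attr_setstacksize(&attr, wanted) == 0)
        pthread_attr_getstacksize(&attr, &got);
    pthread_attr_destroy(&attr);
    return got;
}

/* Returns 1 if a freshly initialized attribute object is joinable by default
   and reports DETACHED after pthread_attr_setdetachstate(). */
int detachstate_roundtrip(void) {
    pthread_attr_t attr;
    int before = -1, after = -1;
    if (pthread_attr_init(&attr) != 0) return -1;
    pthread_attr_getdetachstate(&attr, &before);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_attr_getdetachstate(&attr, &after);
    pthread_attr_destroy(&attr);
    return before == PTHREAD_CREATE_JOINABLE && after == PTHREAD_CREATE_DETACHED;
}
```

An attribute object only describes how future threads will be created; passing it to pthread_create() is what applies the settings.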

Demo: setting the detach state attribute

#include <stdio.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

void * callback(void * arg) {
    printf("child thread id : %lu\n", (unsigned long)pthread_self());
    return NULL;
}

int main() {

    // Create a thread Attribute Variable
    pthread_attr_t attr;
    // Initialize attribute variables
    pthread_attr_init(&attr);

    // set a property
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    // Create a child thread
    pthread_t tid;

    int ret = pthread_create(&tid, &attr, callback, NULL);
    if(ret != 0) {
        char * errstr = strerror(ret);
        printf("error1 : %s\n", errstr);
    }

    // Get the stack size stored in the attribute object
    size_t size;
    pthread_attr_getstacksize(&attr, &size);
    printf("thread stack size : %zu\n", size);

    // Output the id of main thread and child thread
    printf("tid : %lu, main thread id : %lu\n", (unsigned long)tid, (unsigned long)pthread_self());

    // Release thread attribute resources
    pthread_attr_destroy(&attr);

    pthread_exit(NULL);

    return 0;
}

3, Thread synchronization

  • The main advantage of threads is that they can share information through global variables. However, this convenient sharing comes at a cost: you must ensure that multiple threads do not modify the same variable at the same time, or that a thread does not read variables being modified by other threads.
  • A critical section is a code fragment that accesses a shared resource and whose execution must be atomic: while one thread is executing it, no other thread that accesses the same shared resource may interrupt it.
  • Thread synchronization means that while one thread is operating on a piece of memory, other threads may not operate on that memory address; they wait until the first thread has finished and only then proceed.

Mutex (mutex lock)

  • To avoid problems when threads update shared variables, mutex (the abbreviation of mutual exclusion) can be used to ensure that only one thread can access a shared resource at the same time. Mutexes can be used to guarantee atomic access to any shared resource.
  • Mutexes have two states: locked and unlocked. At most one thread can lock the mutex at any time. Trying to lock a locked mutex again may block the thread or fail with an error, depending on the method used when locking.
  • Once a thread locks a mutex, it becomes the owner of the mutex. Only the owner can unlock the mutex. Generally, different mutexes will be used for each shared resource (which may be composed of multiple related variables). When each thread accesses the same resource, the following protocol will be adopted:
    1. Lock the mutex that protects the shared resource
    2. Access the shared resource (critical section code)
    3. Unlock the mutex
  • If multiple threads try to execute this piece of code (a critical section), only one of them can hold the mutex while the others block, so only one thread at a time can be inside the section.

Mutex type: pthread_mutex_t
int pthread_mutex_init(pthread_mutex_t *restrict mutex, const pthread_mutexattr_t *restrict attr);
    - Initialize the mutex
    - Parameters:
        - mutex: the mutex variable to initialize
        - attr: mutex attributes; NULL for the defaults
    - restrict: a C qualifier. The object the pointer refers to must be accessed only through that pointer, not through any other pointer.

int pthread_mutex_destroy(pthread_mutex_t *mutex);
    - Destroy the mutex and release its resources

int pthread_mutex_lock(pthread_mutex_t *mutex);
    - Lock the mutex, blocking if necessary. While one thread holds the lock, other threads that try to lock it block and wait.

int pthread_mutex_trylock(pthread_mutex_t *mutex);
    - Try to lock the mutex. If it is already locked, the call does not block and returns immediately with an error (EBUSY).

int pthread_mutex_unlock(pthread_mutex_t *mutex);
    - Unlock the mutex
Deadlock

  • Sometimes a thread needs to access two or more shared resources at the same time, each protected by its own mutex. When several threads lock the same set of mutexes, a deadlock can occur.
  • A deadlock is a situation in which two or more threads (or processes), competing for shared resources, end up waiting for each other. Without outside intervention none of them can make progress; the system is then said to be deadlocked.
  • Common deadlock scenarios:
    1. Forgetting to release a lock
    2. Locking the same mutex twice in the same thread
    3. Multiple threads and multiple locks, with each thread holding one lock while waiting for a lock held by another
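
One common way to avoid the third scenario, not taken from the original notes, is to make every thread acquire the locks in the same global order. The sketch below orders two mutexes by address; struct account, lock_both, transfer and run_transfers are hypothetical names, and PTHREAD_MUTEX_INITIALIZER is the standard static initializer for a default mutex.

```c
#include <pthread.h>
#include <stdint.h>

struct account {
    pthread_mutex_t lock;
    long balance;
};

static struct account a = { PTHREAD_MUTEX_INITIALIZER, 1000 };
static struct account b = { PTHREAD_MUTEX_INITIALIZER, 1000 };

/* Lock both accounts in a fixed order (lower address first), whatever the
   direction of the transfer, so two threads never wait on each other. */
static void lock_both(struct account *x, struct account *y) {
    if ((uintptr_t)x > (uintptr_t)y) { struct account *t = x; x = y; y = t; }
    pthread_mutex_lock(&x->lock);
    pthread_mutex_lock(&y->lock);
}

static void transfer(struct account *from, struct account *to, long amount) {
    lock_both(from, to);
    from->balance -= amount;
    to->balance   += amount;
    pthread_mutex_unlock(&from->lock);
    pthread_mutex_unlock(&to->lock);
}

static void *a_to_b(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) transfer(&a, &b, 1);
    return NULL;
}

static void *b_to_a(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) transfer(&b, &a, 1);
    return NULL;
}

/* Runs transfers in both directions concurrently; returns the combined balance. */
long run_transfers(void) {
    pthread_t t1, t2;
    if (pthread_create(&t1, NULL, a_to_b, NULL) != 0) return -1;
    if (pthread_create(&t2, NULL, b_to_a, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return a.balance + b.balance;
}
```

If transfer() locked from first and to second instead, the two threads could each hold one lock and wait forever for the other.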

Read-write lock

  • When a thread holds a mutex, the mutex blocks every other thread that tries to enter the critical section. Consider the case where the thread holding the mutex only wants to read the shared resource and several other threads also only want to read it. Because a mutex is exclusive, the other threads cannot obtain the lock and cannot read, even though several threads reading the shared resource at the same time would cause no problem.
  • Many workloads read data far more often than they write it, for example applications that read and write database records. Read-write locks exist to meet this requirement of allowing many simultaneous readers but only one writer.
  • Features of a read-write lock:
    1. While some threads are reading, other threads may also read, but writing is not allowed.
    2. While a thread is writing, no other thread may read or write.
    3. Writing is exclusive. Whether a waiting writer takes priority over newly arriving readers is implementation-defined, so code should not depend on it.

Read-write lock type: pthread_rwlock_t
int pthread_rwlock_init(pthread_rwlock_t *restrict rwlock, const pthread_rwlockattr_t *restrict attr);
    - Initialize the read-write lock
int pthread_rwlock_destroy(pthread_rwlock_t *rwlock);
    - Destroy the read-write lock
int pthread_rwlock_rdlock(pthread_rwlock_t *rwlock);
    - Acquire a read lock
int pthread_rwlock_tryrdlock(pthread_rwlock_t *rwlock);
    - Try to acquire a read lock without blocking
int pthread_rwlock_wrlock(pthread_rwlock_t *rwlock);
    - Acquire the write lock
int pthread_rwlock_trywrlock(pthread_rwlock_t *rwlock);
    - Try to acquire the write lock without blocking
int pthread_rwlock_unlock(pthread_rwlock_t *rwlock);
    - Unlock

Condition variable

A condition variable is not a lock. It lets a thread block until some condition on shared data becomes true, and it is always used together with a mutex that protects that data.

Condition variable type: pthread_cond_t
int pthread_cond_init(pthread_cond_t *restrict cond, const pthread_condattr_t *restrict attr);
    - Initialize the condition variable

int pthread_cond_destroy(pthread_cond_t *cond);
    - Destroy the condition variable and release its resources

int pthread_cond_wait(pthread_cond_t *restrict cond, pthread_mutex_t *restrict mutex);
    - Wait: the calling thread blocks until the condition variable is signaled.
    - The caller must hold the mutex. It is unlocked while the thread is blocked and locked again before the call returns.
int pthread_cond_timedwait(pthread_cond_t *restrict cond, pthread_mutex_t *restrict mutex, const struct timespec *restrict abstime);
    - Wait until the absolute time abstime at the latest; if that time passes without a wakeup, the call returns ETIMEDOUT.
int pthread_cond_signal(pthread_cond_t *cond);
    - Wake up at least one waiting thread
int pthread_cond_broadcast(pthread_cond_t *cond);
    - Wake up all waiting threads
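
The usual waiting pattern re-checks the condition in a loop, because a wakeup does not guarantee that the condition holds. The sketch below uses hypothetical names (m, c, ready, payload, setter, wait_for_payload) and the standard static initializers.

```c
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
static int ready = 0;     /* the condition being waited for */
static int payload = 0;   /* data published together with the condition */

static void *setter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    payload = 123;
    ready = 1;                    /* change the shared state under the mutex */
    pthread_cond_signal(&c);      /* then wake a waiter */
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Waits until setter() has published the payload and returns it. */
int wait_for_payload(void) {
    pthread_t tid;
    ready = 0;
    if (pthread_create(&tid, NULL, setter, NULL) != 0) return -1;

    pthread_mutex_lock(&m);
    while (!ready)                    /* loop: a wakeup can be spurious */
        pthread_cond_wait(&c, &m);    /* unlocks m while blocked, relocks before returning */
    int value = payload;
    pthread_mutex_unlock(&m);

    pthread_join(tid, NULL);
    return value;
}
```

The loop also covers the case where setter() runs first: ready is already 1, so the waiter never blocks and no signal is missed.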

Producer-consumer model demo (condition variable version; imperfect, with known problems).

#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

// Create a mutex
pthread_mutex_t mutex;
// Create condition variables
pthread_cond_t cond;

struct Node{
    int num;
    struct Node *next;
};

// Head node
struct Node * head = NULL;

void * producer(void * arg) {

    // Constantly create new nodes and add them to the linked list
    while(1) {
        pthread_mutex_lock(&mutex);
        struct Node * newNode = (struct Node *)malloc(sizeof(struct Node));
        newNode->next = head;
        head = newNode;
        newNode->num = rand() % 1000;
        printf("add node, num : %d, tid : %lu\n", newNode->num, (unsigned long)pthread_self());
        
        // As long as one is produced, consumers will be notified to consume
        pthread_cond_signal(&cond);

        pthread_mutex_unlock(&mutex);
        usleep(100);
    }

    return NULL;
}

void * customer(void * arg) {

    while(1) {
        pthread_mutex_lock(&mutex);
        // Save pointer of header node
        struct Node * tmp = head;
        // Judge whether there is data
        if(head != NULL) {
            // Data available
            head = head->next;
            printf("del node, num : %d, tid : %lu\n", tmp->num, (unsigned long)pthread_self());
            free(tmp);
            pthread_mutex_unlock(&mutex);
            usleep(100);
        } else {
            // No data, need to wait.
            // pthread_cond_wait() unlocks the mutex while the thread is blocked and
            // locks it again before returning, so it must be unlocked here.
            pthread_cond_wait(&cond, &mutex);
            pthread_mutex_unlock(&mutex);
        }
    }
    return  NULL;
}

int main() {

    pthread_mutex_init(&mutex, NULL);
    pthread_cond_init(&cond, NULL);

    // Create 5 producer threads and 5 consumer threads
    pthread_t ptids[5], ctids[5];

    for(int i = 0; i < 5; i++) {
        pthread_create(&ptids[i], NULL, producer, NULL);
        pthread_create(&ctids[i], NULL, customer, NULL);
    }

    for(int i = 0; i < 5; i++) {
        pthread_detach(ptids[i]);
        pthread_detach(ctids[i]);
    }

    while(1) {
        sleep(10);
    }

    pthread_mutex_destroy(&mutex);
    pthread_cond_destroy(&cond);

    pthread_exit(NULL);

    return 0;
}

Semaphore

Semaphore type: sem_t
#include <semaphore.h>
int sem_init(sem_t *sem, int pshared, unsigned int value);
   - Initialize the semaphore
   - Parameters:
       - sem: address of the semaphore variable
       - pshared: 0 to share the semaphore between the threads of one process, non-zero to share it between processes
       - value: initial value of the semaphore

int sem_destroy(sem_t *sem);
   - Release the semaphore's resources

int sem_wait(sem_t *sem);
   - Lock the semaphore: each call decrements its value by 1; if the value is 0, the call blocks

int sem_trywait(sem_t *sem);
   - Like sem_wait, but if the value is 0 the call fails with EAGAIN instead of blocking

int sem_timedwait(sem_t *sem, const struct timespec *abs_timeout);
   - Like sem_wait, but blocks only until the absolute time abs_timeout; after that the call fails with ETIMEDOUT

int sem_post(sem_t *sem);
   - Unlock the semaphore: each call increments its value by 1

int sem_getvalue(sem_t *sem, int *sval);
   - Get the current value of the semaphore
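
The counting behavior can be traced in a single thread. This sketch uses a made-up function name, sem_demo, and starts the semaphore at 2.

```c
#include <errno.h>
#include <semaphore.h>

/* Returns the semaphore value after the sequence below, or -1 on an
   unexpected result. */
int sem_demo(void) {
    sem_t s;
    int value = -1;
    if (sem_init(&s, 0, 2) != 0) return -1;   /* pshared = 0: shared between threads */

    if (sem_wait(&s) != 0) return -1;         /* 2 -> 1 */
    if (sem_trywait(&s) != 0) return -1;      /* 1 -> 0 */
    /* The value is 0 now: sem_wait() would block, sem_trywait() fails instead */
    if (sem_trywait(&s) != -1 || errno != EAGAIN) return -1;

    sem_post(&s);                             /* 0 -> 1 */
    sem_getvalue(&s, &value);
    sem_destroy(&s);
    return value;
}
```

Unlike the pthread functions, the sem_* functions report failure by returning -1 and setting errno.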

Producer consumer model (semaphore) demo

#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <semaphore.h>

// Create a mutex
pthread_mutex_t mutex;
// Create two semaphores
sem_t psem;
sem_t csem;

struct Node{
    int num;
    struct Node *next;
};

// Head node
struct Node * head = NULL;

void * producer(void * arg) {

    // Constantly create new nodes and add them to the linked list
    while(1) {
        sem_wait(&psem);
        pthread_mutex_lock(&mutex);
        struct Node * newNode = (struct Node *)malloc(sizeof(struct Node));
        newNode->next = head;
        head = newNode;
        newNode->num = rand() % 1000;
        printf("add node, num : %d, tid : %lu\n", newNode->num, (unsigned long)pthread_self());
        pthread_mutex_unlock(&mutex);
        sem_post(&csem);
    }

    return NULL;
}

void * customer(void * arg) {

    while(1) {
        sem_wait(&csem);
        pthread_mutex_lock(&mutex);
        // Save pointer of header node
        struct Node * tmp = head;
        head = head->next;
        printf("del node, num : %d, tid : %lu\n", tmp->num, (unsigned long)pthread_self());
        free(tmp);
        pthread_mutex_unlock(&mutex);
        sem_post(&psem);
       
    }
    return  NULL;
}

int main() {

    pthread_mutex_init(&mutex, NULL);
    sem_init(&psem, 0, 8);
    sem_init(&csem, 0, 0);

    // Create 5 producer threads and 5 consumer threads
    pthread_t ptids[5], ctids[5];

    for(int i = 0; i < 5; i++) {
        pthread_create(&ptids[i], NULL, producer, NULL);
        pthread_create(&ctids[i], NULL, customer, NULL);
    }

    for(int i = 0; i < 5; i++) {
        pthread_detach(ptids[i]);
        pthread_detach(ctids[i]);
    }

    while(1) {
        sleep(10);
    }

    pthread_mutex_destroy(&mutex);
    sem_destroy(&psem);
    sem_destroy(&csem);

    pthread_exit(NULL);

    return 0;
}

Summary

Niuke C++ learning notes

Posted by csj16 on Sat, 23 Jul 2022 21:30:49 +0530