Analysis and design of high concurrency seckill system architecture in mall

introduction

High concurrency and high availability are the key points of the interview. If the items in your resume are related to shopping malls, seckill, or large and medium factories, you will not be unfamiliar with such questions.

In an interview with a leading manufacturer, both sides mentioned the consistency of inventory in the mall and the prevention of oversold, oversold and inventory reduction in high concurrency. Probably, the three sides will also ask relevant questions. Take this opportunity to sort out the knowledge points about the high concurrency seckill system.

This article mainly analyzes how the system can provide normal and stable services when 1million people rob 10000 train tickets at the same time.

1. high concurrency system architecture

High concurrency systems generally adopt distributed cluster deployment. The service layer has layer by layer load balancing, and provides various disaster recovery means (dual active data center, remote multi active, node fault tolerance, server disaster recovery, etc.) to ensure high availability of the system. The traffic will also be balanced to different servers according to different load capacities and configuration strategies.

1.1 load balancing

In the figure, three kinds of load balancing are used: OSPF load balancing, Lvs load balancing and Nginx load balancing.

  • OSPF (open shortest path first) is an internal gateway protocol (IGP). OSPF establishes the link state database by announcing the status of the network interface between routers to generate the shortest path tree. OSPF will automatically calculate the Cost value on the routing interface, but it can also manually specify (prioritize) the Cost value of the interface. The Cost calculated by OSPF is also inversely proportional to the interface bandwidth. The higher the bandwidth, the smaller the Cost value. Load balancing can be performed for paths that reach the same Cost value of the target. The default is load balancing for 4 lines, and the maximum support is load balancing for 6 lines. We can modify the number of load balanced lines in the OSPF routing process at maximum paths = 6.

  • LVS(Linux Virtual Server), a cluster technology, has been integrated into the Linux kernel module to realize the scheduling and content request distribution technology based on IP data request load balancing. The scheduler has good throughput, transfers requests to different servers for execution in a balanced manner, and can shield server failures, so as to form a group of servers into a high-performance, highly available virtual server.

  • Nginx is a high-performance HTTP proxy / anti proxy server, which is often used for load balancing in server development. The main methods of nginx load balancing include: ① polling ② weighted polling ③ ip hash polling

1.2 Nginx weighted polling

Nginx implements load balancing through the upstream module. The configuration of weighted polling can add a weight value to related services. During configuration, the response load may be set according to the server performance and load capacity.

Configure the weighted polling load through an example. Listen to ports 8001-8004 locally. The weights are configured as 2, 4, 6, and 8.

#Configure load balancing
    upstream load_rule {
       server 127.0.0.1:8001 weight=2;
       server 127.0.0.1:8002 weight=4;
       server 127.0.0.1:8003 weight=6;
       server 127.0.0.1:8004 weight=8;
    }
    ...
    server {
    listen       80;
    server_name  baidu.com www.baidu.com;
    location / {
       proxy_pass http://load_rule;
    }
}

Other more detailed polling policies and other issues can be viewed Understand Nginx source code This one in: Load balancing of upstream mechanism in Nginx

1.3 supplement: pressure test

www.baidu.com is configured in the local /etc/hosts directory Com. Next, use the go language to start the four http port listening services. The following is the Go program listening on port 8001. For the others, just modify the port:

package main

import (
	"net/http"
	"os"
	"strings"
)

func main() {
	http.HandleFunc("/buy/ticket", handleReq)
	http.ListenAndServe(":8001", nil)
}

//Process the request function and write the response result information to the log according to the request
func handleReq(w http.ResponseWriter, r *http.Request) {
	failedMsg :=  "handle in port:"
	writeLog(failedMsg, "./stat.log")
}

//Write log
func writeLog(msg string, logPath string) {
	fd, _ := os.OpenFile(logPath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
	defer fd.Close()
	content := strings.Join([]string{msg, "\r\n"}, "8001")
	buf := []byte(content)
	fd.Write(buf)
}

I wrote the requested port log information to/ stat.log file, and then use ab pressure measurement tool to perform pressure measurement:

ab -n 1000 -c 100 http://www.baidu.com/buy/ticket

According to the statistics in the log, ports 8001-8004 have received 200, 400, 600 and 800 requests respectively, which is in good agreement with the weight ratio I configured in nginx, and the traffic after load is very uniform and random.

2. spike system mechanism

Question: if 1million people grab a train ticket at the same time, how can we provide normal and stable services?

When load balancing has been used, the user's spike traffic passes through the load balancing of each layer and is distributed evenly to different servers. Even so, the QPS borne by a single machine in the cluster is very high.

Generally speaking, the booking system needs to deal with three basic stages: order generation, inventory reduction and user payment. What our system needs to do is to ensure that the train ticket orders are not oversold and not oversold. Each train ticket sold must be paid to be effective, and to ensure stable operation in a continuous high concurrency environment.

Let's analyze the sequence of these three basic stages:

2.1 order less inventory

When the user requests to arrive at the server concurrently, first create an order, then deduct the inventory and wait for the user to pay. This is the most common idea, which can ensure that the order is not oversold, because reducing inventory after creating an order (using the Redis increment atomic operation) is an atomic operation.

But there are also some problems:

  • **The first is: * * in the case of extreme concurrency, the details of any memory operation are crucial to the performance. In particular, the logic of creating an order generally needs to be stored in the disk database, which is very stressful to the database;
  • **The second is: * * if the user places an order maliciously and only places an order without paying, the inventory will be reduced and many orders will be sold. Although the server can limit the number of IP and user purchase orders, this is not a good method.

2.2 payment minus inventory

If you wait for the user to pay the order and then reduce the inventory, you can avoid selling less. This is a big taboo of concurrency architecture, because in the case of extreme concurrency, users may continue to click a single button, resulting in multiple orders. When the inventory is reduced to 0, many users will find that the orders they grab cannot be paid. This is the so-called * * "oversold" **. At the same time, this cannot avoid concurrent operation of database and disk IO.

2.3 withholding inventory

Considering the above two schemes, we conclude that as long as an order is created, the database IO must be operated frequently.

So our optimization point is: is there a solution that does not need to directly operate database IO?

This is the withholding Inventory (refer to 3. withholding inventory optimization for specific implementation and Optimization):

  1. Ensure that it is not oversold, and then generate user orders asynchronously, so that the response speed to users will be much faster;
  2. Ensure a lot of sales, add the validity period to the order, and add new inventory without payment after the expiration;
  3. Orders are generated asynchronously. They are usually processed in real-time consumption queues such as MQ and KAFKA. When there are few orders, orders are generated very quickly and users hardly need to queue up;

2.4 comparison of three methods

Inventory reduction methodadvantageshortcoming
Order less inventoryAvoid oversoldResulting in less sales; Placing orders maliciously; Great pressure on Database
Payment minus inventoryAvoid selling lessOversold; Too many orders, unable to pay later; Huge database and disk IO pressure
Withholding inventoryAvoid oversold and oversoldThere is also room for optimization (how to ensure high and issue correct inventory deduction and quickly respond to user requests)

3. withholding inventory optimization

From the above example, withholding inventory is the best choice, which can avoid oversold and oversold, but there is still room for optimization.

Problems to be solved in the optimization scheme:

  1. Where does inventory exist?
  2. How to ensure the correct inventory deduction and rapid response to user requests under high concurrency?

Let's analyze them one by one:

In the case of high concurrency of a single machine, we usually implement the inventory deduction as follows:

In order to ensure the atomicity of inventory deduction and order generation, transaction processing is required, then inventory judgment is taken, inventory is reduced, and finally the transaction is submitted. The whole process has a lot of IO, The operation on the database is blocked (the resource occupied by the first connection is not released, and the second connection needs to obtain this resource. If the first connection is not committed or rolled back, the second connection will wait until the first connection releases the resource).

In other words, because MySQL stores data, the same data must be stored in one row in the database, so there will be a large number of threads competing for InnoDB row locks. The higher the concurrency, the more waiting threads will be, the TPS (messages processed per second) will decrease, the response time (RT) will increase, and the database throughput will be seriously affected.

This method is not suitable for implementing high concurrency seckill system.

3.1 local inventory deduction

For the previous common inventory deduction method, our optimization scheme is: local inventory deduction. We allocate a certain amount of inventory to the local machine, directly reduce inventory in memory, and then create orders asynchronously according to the previous logic. After improvement, our scheme is as follows:

This avoids frequent IO operations on the database and only performs operations in memory, which greatly improves the anti concurrency ability of a single machine. However, a single machine with millions of users can not handle it anyway. Although Nginx uses epoll model to process network requests, the C10K problem has also been solved. However, in the Linux system, all resources are files, and so are network requests. A large number of file descriptors will make the operating system lose response instantly. Above, we use the weighted balancing strategy of Nginx. According to this idea, we can balance the requests of 100W users to 100 servers, so that the concurrency of a single machine will be much less. Then we give each machine a local inventory of 100 train tickets, The total inventory of 100 servers is still 10000 train tickets, which ensures that inventory orders are not oversold. The following figure describes the cluster architecture we just described:

3.2 remote unified inventory reduction

Here comes the problem! In the case of high concurrency, we may not be able to guarantee the high availability of the system. If 2 or 3 of the 100 servers are down because they can't handle the concurrent traffic or for other reasons. Then the orders on these servers cannot be sold, resulting in less sales of orders.

To solve this problem, we need to do a unified management of the total order volume. This is the next fault-tolerant solution. The server not only needs to reduce inventory locally, but also needs to reduce inventory remotely. With the remote unified inventory reduction operation, we can allocate some redundant "buffer inventory" for each machine according to the machine load to prevent machine downtime.

We analyze the architecture of remote unified inventory reduction through a structure chart:

We use Redis to store unified inventory. Because Redis has very high performance, it is claimed that single QPS can resist 10W concurrency. After the local inventory reduction, if there is an order in the local area, we will request Redis to reduce the inventory remotely. Only after the local inventory reduction and remote inventory reduction are successful, the prompt of successful ticket grabbing will be returned to the user. This can also effectively ensure that the order will not be oversold. When one of the machines is down, because there are reserved buffer tickets on each machine, the remaining tickets on the down machine can still be made up on other machines to ensure that they will not be sold less.

How reasonable is the buffer balance set? Theoretically, the more buffer balance is set, the more downtime the system can tolerate. However, setting the buffer too large will also have a certain impact on Redis. Although Redis has a very high anti concurrency ability, the request will still go through a network IO. In fact, the number of requests for Redis in the process of ticket grabbing is the total amount of local inventory and buffer inventory. When the local inventory is insufficient, the system directly returns the message "sold out! Sold out", so the logic of unified inventory will not be deducted. To a certain extent, it also avoids the huge network requests crushing Redis, Therefore, how much the buffer value is set is the architect's careful consideration of the system load capacity.

4. summary and review

Generally speaking, secsha system is very complex. This article simply introduces and simulates some strategies on how to optimize a single machine to the highest performance, how to avoid a single point of failure in a cluster, and how to ensure that orders are not oversold and not oversold.

The complete order system also has a task to view the order progress. Each server has a task: ① synchronize the remaining tickets and inventory information from the total inventory to the user regularly; ② release the order and replenish the inventory when the user does not pay within the order validity period.

In this article, the core logic of high concurrency ticket grabbing is simply implemented. It can be said that the system design is very ingenious, skilfully avoiding the operation of DB database io. For the high concurrency request of Redis network IO, almost all the calculations are completed in memory, which effectively ensures that it is not oversold and not oversold. It can also provide some disaster recovery, and some machines can be allowed to go down.

There are two design ideas worth learning from:

  • Load balancing, divide and Conquer:

    Different traffic is divided into different machines. Each machine handles its own requests and gives full play to its own performance. In this way, the overall anti concurrency ability is also high.

    12306 will also adopt such a design idea, and divide the 10 o'clock ticket grabbing every day into 9 o'clock, 10 o'clock and 11 o'clock, so as to reduce the pressure of the system.

  • Rational use of concurrency and asynchrony:

    What can be done asynchronously can be done asynchronously. Disassembling functions can achieve unexpected results. This is true in Nginx and node JS and Redis. The epoll model they use to process network model requests tells us that single thread can still play a powerful role. Go, a language born for high concurrency, perfectly gives full play to the advantage of multi-core servers. Many tasks that can be processed concurrently can be solved by using concurrency. For example, when go processes http requests, each request will be executed in a goroutine. It is a direction we need to learn and explore to squeeze the CPU reasonably and make it play its due value.

4.1 comparison of withholding forecasting methods

Inventory reduction methodadvantageshortcoming
Order less inventoryAvoid oversoldResulting in less sales; Placing orders maliciously; Great pressure on Database
Payment minus inventoryAvoid selling lessOversold; Too many orders, unable to pay later; Huge database and disk IO pressure
General withholding inventoryAvoid oversold and oversold① Transaction processing ② judge the inventory and reduce the inventory ③ submit the transaction. There are a lot of IO and the database is blocked. It is not suitable for the seckill system at all
Local withholding inventory① Avoid oversold and oversold ② avoid frequent IO operations of the database and operate in memory ③ improve the concurrency of a single machine① In the case of high concurrency, a large number of requests will cause the operating system to lose response. ② when a server goes down, it will cause less sales
Remote unified inventory reduction① Avoid oversold and undersold ② avoid frequent IO operations of the database and operate in memory ③ the cluster has strong concurrency capability ④ can tolerate machine downtime ⑤ high concurrency will not cause oversold and undersold ⑥ avoid high concurrency requests of Redis network IOWonderful~~~

5. appendix: code demonstration

The go language is native to concurrent design. I will use the go language to show you the specific process of single machine ticket grabbing.

5.1 initialization

The init function in the go package is executed before the main function. Some preparatory work is mainly done at this stage. Our system needs to do the following preparations: initialize the local inventory, initialize the hash key value of the remote redis storage unified inventory, and initialize the redis connection pool; In addition, it is necessary to initialize an int type channel with a size of 1. The purpose is to realize the distributed lock function. You can also directly use read-write locks or redis and other methods to avoid resource competition. However, using channel is more efficient. This is the philosophy of go language: do not communicate through shared memory, but share memory through communication. The redis library uses redigo. The following is the code implementation:

...
//localSpike package structure definition
package localSpike

type LocalSpike struct {
	LocalInStock     int64
	LocalSalesVolume int64
}
...
//remoteSpike's definition of hash structure and redis connection pool
package remoteSpike
//Remote order storage key value
type RemoteSpikeKeys struct {
	SpikeOrderHashKey string	//Second kill order hash structure key in redis
	TotalInventoryKey string	//Total order inventory key in hash structure
	QuantityOfOrderKey string	//Existing order quantity key in hash structure
}

//Initialize the redis connection pool
func NewPool() *redis.Pool {
	return &redis.Pool{
		MaxIdle:   10000,
		MaxActive: 12000, // max number of connections
		Dial: func() (redis.Conn, error) {
			c, err := redis.Dial("tcp", ":6379")
			if err != nil {
				panic(err.Error())
			}
			return c, err
		},
	}
}
...
func init() {
	localSpike = localSpike2.LocalSpike{
		LocalInStock:     150,
		LocalSalesVolume: 0,
	}
	remoteSpike = remoteSpike2.RemoteSpikeKeys{
		SpikeOrderHashKey:  "ticket_hash_key",
		TotalInventoryKey:  "ticket_total_nums",
		QuantityOfOrderKey: "ticket_sold_nums",
	}
	redisPool = remoteSpike2.NewPool()
	done = make(chan int, 1)
	done <- 1
}

5.2 local inventory deduction and unified inventory deduction

The logic of local inventory deduction is very simple. The user requests to add the sales volume, then compares whether the sales volume is greater than the local inventory, and returns the bool value:

package localSpike
//Deduct inventory locally and return bool value
func (spike *LocalSpike) LocalDeductionStock() bool{
	spike.LocalSalesVolume = spike.LocalSalesVolume + 1
	return spike.LocalSalesVolume < spike.LocalInStock
}

Note that the operation of shared data LocalSalesVolume here is implemented by using locks. However, because local inventory deduction and unified inventory deduction are atomic operations, channel is used at the top layer, which will be discussed later. Redis is a unified inventory deduction operation. Because redis is a single thread, we need to implement the steps of fetching data, writing data and calculating some columns. We need to package commands with lua scripts to ensure the atomicity of the operation:

package remoteSpike
......
const LuaScript = `
        local ticket_key = KEYS[1]
        local ticket_total_key = ARGV[1]
        local ticket_sold_key = ARGV[2]
        local ticket_total_nums = tonumber(redis.call('HGET', ticket_key, ticket_total_key))
        local ticket_sold_nums = tonumber(redis.call('HGET', ticket_key, ticket_sold_key))
		-- Check whether there are any remaining tickets,Increase order quantity,Return result value
        if(ticket_sold_nums > ticket_total_nums) then
            return redis.call('HINCRBY', ticket_key, ticket_sold_key, 1)
        end
        return 0
`
//Remote unified inventory deduction
func (RemoteSpikeKeys *RemoteSpikeKeys) RemoteDeductionStock(conn redis.Conn) bool {
	lua := redis.NewScript(1, LuaScript)
	result, err := redis.Int(lua.Do(conn, RemoteSpikeKeys.SpikeOrderHashKey, RemoteSpikeKeys.TotalInventoryKey, RemoteSpikeKeys.QuantityOfOrderKey))
	if err != nil {
		return false
	}
	return result != 0
}

We use the hash structure to store the information of unified inventory and total sales volume, request it, judge whether the total sales volume is greater than the inventory, and then return the relevant bool value. Before starting the service, we need to initialize the initial inventory information of redis:

 hmset ticket_hash_key "ticket_total_nums" 10000 "ticket_sold_nums" 0

5.3 responding to user information

We start an http service and listen on one port:

package main
...
func main() {
	http.HandleFunc("/buy/ticket", handleReq)
	http.ListenAndServe(":3005", nil)
}

We have finished all the initialization work above. The logic of handleReq is very clear. It is OK to judge whether the ticket is successfully robbed and return the user information.

package main
//Process the request function and write the response result information to the log according to the request
func handleReq(w http.ResponseWriter, r *http.Request) {
	redisConn := redisPool.Get()
	LogMsg := ""
	<-done
	//Global read / write lock
	if localSpike.LocalDeductionStock() && remoteSpike.RemoteDeductionStock(redisConn) {
		util.RespJson(w, 1,  "Ticket grabbing succeeded", nil)
		LogMsg = LogMsg + "result:1,localSales:" + strconv.FormatInt(localSpike.LocalSalesVolume, 10)
	} else {
		util.RespJson(w, -1, "Sold out", nil)
		LogMsg = LogMsg + "result:0,localSales:" + strconv.FormatInt(localSpike.LocalSalesVolume, 10)
	}
	done <- 1

	//Write the ticket grabbing status to the log
	writeLog(LogMsg, "./stat.log")
}

func writeLog(msg string, logPath string) {
	fd, _ := os.OpenFile(logPath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
	defer fd.Close()
	content := strings.Join([]string{msg, "\r\n"}, "")
	buf := []byte(content)
	fd.Write(buf)
}

As mentioned earlier, we need to consider static conditions when we deduct inventory. Here we use channel to avoid concurrent reading and writing and ensure efficient sequential execution of requests. We write the return information of the interface to/ stat.log file is convenient for pressure measurement statistics.

5.4 single machine service pressure test

Start the service, we use ab pressure test tool to test:

ab -n 10000 -c 100 http://127.0.0.1:3005/buy/ticket

The following is the pressure measurement information of my local low configuration mac

This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests


Server Software:
Server Hostname:        127.0.0.1
Server Port:            8005

Document Path:          /buy/ticket
Document Length:        29 bytes

Concurrency Level:      100
Time taken for tests:   2.339 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1370000 bytes
HTML transferred:       290000 bytes
Requests per second:    4275.96 [#/sec] (mean)
Time per request:       23.387 [ms] (mean)
Time per request:       0.234 [ms] (mean, across all concurrent requests)
Transfer rate:          572.08 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    8  14.7      6     223
Processing:     2   15  17.6     11     232
Waiting:        1   11  13.5      8     225
Total:          7   23  22.8     18     239

Percentage of the requests served within a certain time (ms)
  50%     18
  66%     24
  75%     26
  80%     28
  90%     33
  95%     39
  98%     45
  99%     54
 100%    239 (longest request)

According to the indicators, my single machine can process 4000+ requests per second. The normal servers are all multi-core configurations, and there is no problem processing 1W+ requests at all. In addition, it is found that throughout the service process, the requests are normal, the traffic is uniform, and redis is also normal:

//stat.log
...
result:1,localSales:145
result:1,localSales:146
result:1,localSales:147
result:1,localSales:148
result:1,localSales:149
result:1,localSales:150
result:0,localSales:151
result:0,localSales:152
result:0,localSales:153
result:0,localSales:154
result:0,localSales:156
...

6. reference

  1. https://github.com/GuoZhaoran/spikeSystem
  2. Geek time - how to design a seckill system





For the follow-up updates of the blog, please follow my personal blog: Stardust blog

Tags: MySQL Redis Distribution Concurrent Programming

Posted by qaokpl on Fri, 03 Jun 2022 13:19:21 +0530