Introduction
High concurrency and high availability are frequent interview topics. If your resume mentions e-commerce projects, seckill (flash-sale) systems, or experience at large or mid-sized tech companies, questions like these will be familiar.
In a recent interview at a leading company, both the first and second rounds asked about inventory consistency in an e-commerce system and how to prevent overselling and deduct inventory under high concurrency; the third round will probably touch on related questions as well. Take this opportunity to sort out the knowledge points of a high-concurrency seckill system.
This article mainly analyzes how a system can provide normal, stable service when 1 million people grab 10,000 train tickets at the same time.
1. High-concurrency system architecture
High-concurrency systems generally adopt distributed cluster deployment. Load balancing is applied layer by layer above the service layer, and various disaster-recovery measures (dual-active data centers, multi-site active-active deployment, node fault tolerance, server disaster recovery, etc.) guarantee high availability. Traffic is also balanced across different servers according to their load capacities and the configured policies.

1.1 Load balancing
Three kinds of load balancing are used in this architecture: OSPF load balancing, LVS load balancing, and Nginx load balancing.
- OSPF (Open Shortest Path First) is an interior gateway protocol (IGP). OSPF builds a link-state database by advertising the state of the network interfaces between routers and generates a shortest-path tree from it. OSPF automatically calculates the Cost of a routed interface, but the Cost can also be specified manually (manual values take precedence). The Cost calculated by OSPF is inversely proportional to interface bandwidth: the higher the bandwidth, the smaller the Cost. Paths that reach the target with the same Cost can be load-balanced; by default 4 equal-cost paths are balanced, and up to 6 are supported. The number of balanced paths can be changed with `maximum-paths 6` in the OSPF routing process (a configuration sketch follows after this list).
- LVS (Linux Virtual Server) is a cluster technology that has been integrated into the Linux kernel. It implements scheduling and request distribution for IP-based load balancing. The scheduler has good throughput, spreads requests evenly across different servers, and can shield server failures, turning a group of servers into one high-performance, highly available virtual server.
- Nginx is a high-performance HTTP and reverse-proxy server, often used for load balancing in server-side development. The main Nginx load-balancing methods are: ① round-robin ② weighted round-robin ③ IP-hash.
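For reference, adjusting the number of equal-cost paths mentioned in the OSPF item above typically looks like the snippet below on Cisco-style routers. Treat it as an illustrative sketch: the exact syntax and the process ID (100 here) depend on the vendor and your own configuration.

```
! Hypothetical Cisco-IOS-style snippet: allow OSPF to install up to 6
! equal-cost paths instead of the default 4.
router ospf 100
 maximum-paths 6
```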
1.2 Nginx weighted round-robin
Nginx implements load balancing through the upstream module. The weighted round-robin configuration simply adds a weight to each upstream server; weights are usually set according to each server's performance and load capacity.
Let's configure weighted round-robin with an example: listen on local ports 8001-8004 with weights 2, 4, 6, and 8 respectively.
```nginx
# Configure load balancing
upstream load_rule {
    server 127.0.0.1:8001 weight=2;
    server 127.0.0.1:8002 weight=4;
    server 127.0.0.1:8003 weight=6;
    server 127.0.0.1:8004 weight=8;
}
...
server {
    listen       80;
    server_name  baidu.com www.baidu.com;
    location / {
        proxy_pass http://load_rule;
    }
}
```
For more detail on the round-robin policies and related issues, see the chapter "Load balancing of the upstream mechanism in Nginx" in Understanding Nginx Source Code.
1.3 Supplement: load test
Map www.baidu.com to 127.0.0.1 in the local /etc/hosts file, then start four HTTP listening services in Go. The following Go program listens on port 8001; for the other three, just change the port:
```go
package main

import (
	"net/http"
	"os"
	"strings"
)

func main() {
	http.HandleFunc("/buy/ticket", handleReq)
	http.ListenAndServe(":8001", nil)
}

// Handle the request and write the response result to the log
func handleReq(w http.ResponseWriter, r *http.Request) {
	failedMsg := "handle in port:"
	writeLog(failedMsg, "./stat.log")
}

// Write a log line; joining with "8001" as the separator embeds the port number in the line
func writeLog(msg string, logPath string) {
	fd, _ := os.OpenFile(logPath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
	defer fd.Close()
	content := strings.Join([]string{msg, "\r\n"}, "8001")
	buf := []byte(content)
	fd.Write(buf)
}
```
The port that handled each request is written to the ./stat.log file, and then the ab benchmarking tool is used for the load test:
```bash
ab -n 1000 -c 100 http://www.baidu.com/buy/ticket
```
According to the statistics in the log, ports 8001-8004 received 200, 400, 600, and 800 requests respectively, which matches the weight ratio configured in Nginx well, and the balanced traffic is uniform and fairly random.
2. Seckill system mechanism
Question: if 1 million people grab train tickets at the same time, how can we provide normal, stable service?
With load balancing in place, the seckill traffic passes through each load-balancing layer and is distributed evenly to different servers. Even so, the QPS borne by a single machine in the cluster is still very high.
Generally speaking, a booking system has to handle three basic stages: order generation, inventory deduction, and user payment. Our system must ensure that train-ticket orders are neither oversold nor undersold, that every ticket sold only becomes valid once paid, and that the system runs stably under sustained high concurrency.
Let's analyze the sequence of these three basic stages:
2.1 Deduct inventory when the order is placed
When concurrent user requests reach the server, first create the order, then deduct the inventory, and wait for the user to pay. This is the most common approach, and it guarantees no overselling, because the deduction right after order creation is performed atomically (for example with a Redis atomic operation).
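As a rough illustration of this "order first, then deduct" flow, here is a minimal Go sketch. It assumes the redigo client, a Redis counter named ticket_stock, and hypothetical createOrder / cancelOrder helpers standing in for the database writes; it is not the article's final implementation.

```go
package spike

import (
	"errors"

	"github.com/gomodule/redigo/redis"
)

var errSoldOut = errors.New("sold out")

// buyTicket creates the order first, then deducts inventory atomically in Redis.
func buyTicket(conn redis.Conn, userID int64) error {
	// 1. Create the order (a disk database write on every request,
	//    which is exactly the pressure point described below).
	orderID, err := createOrder(userID)
	if err != nil {
		return err
	}
	// 2. Deduct inventory; DECR executes atomically inside Redis,
	//    so two requests can never both take the last ticket.
	left, err := redis.Int64(conn.Do("DECR", "ticket_stock"))
	if err != nil {
		return err
	}
	if left < 0 {
		// Decremented past zero: restore the counter and void the order.
		conn.Do("INCR", "ticket_stock")
		cancelOrder(orderID)
		return errSoldOut
	}
	// 3. The order now waits for the user to pay.
	return nil
}

// Hypothetical persistence helpers, stubbed so the sketch compiles.
func createOrder(userID int64) (int64, error) { return 1, nil }
func cancelOrder(orderID int64)               {}
```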
But there are also some problems:
- **First:** under extreme concurrency, the cost of every operation matters. Creating an order in particular usually means writing to a disk-based database, which puts enormous pressure on the database;
- **Second:** if users place orders maliciously, ordering without paying, inventory is deducted but tickets go unsold, so the system undersells. The server can limit the number of orders per IP or per user, but that is not an elegant solution.

2.2 Deduct inventory on payment
If we wait until the user pays before deducting inventory, we avoid underselling. But this is a big taboo in concurrent architectures: under extreme concurrency, users may keep clicking the buy button and generate many orders, and once the inventory actually reaches 0, many users will find that the orders they grabbed cannot be paid. This is the so-called **overselling**. Meanwhile, the concurrent database and disk IO is still unavoidable.

2.3 Withholding inventory
Looking at the two schemes above, we can conclude: as long as we create an order on the request path, we end up hitting database IO frequently.
So our optimization point is: is there a scheme that avoids operating on database IO directly?
This is withholding inventory (see section 3, "Withholding inventory optimization", for the concrete implementation and optimization):
- Withhold (pre-deduct) the inventory first to guarantee no overselling, then generate user orders asynchronously, so responses to users are much faster;
- Guarantee no underselling: give each order a validity period, and if it is not paid before it expires, put the withheld inventory back;
- Orders are generated asynchronously, usually by consuming a real-time queue such as MQ or Kafka. When the queue is short, orders are generated very quickly and users hardly have to wait (a minimal sketch follows after this list).
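To make the asynchronous part concrete, here is a minimal Go sketch, with a buffered channel standing in for MQ/Kafka and a hypothetical persistOrder helper for the database write; a real system would use a proper message queue with delivery guarantees.

```go
package spike

import "log"

// OrderMsg is the message produced on the request path after the inventory
// has been withheld.
type OrderMsg struct {
	UserID   int64
	TicketID int64
}

// orderQueue stands in for MQ/Kafka in this sketch.
var orderQueue = make(chan OrderMsg, 10000)

// EnqueueOrder is called after the pre-deduction succeeds; it returns
// immediately, so the user gets a fast response.
func EnqueueOrder(msg OrderMsg) {
	orderQueue <- msg
}

// StartOrderWorker consumes the queue and writes orders to the database
// off the request path.
func StartOrderWorker() {
	go func() {
		for msg := range orderQueue {
			if err := persistOrder(msg); err != nil {
				log.Printf("persist order failed: %v", err)
			}
		}
	}()
}

// persistOrder is a hypothetical database write that would also attach the
// order's validity period.
func persistOrder(msg OrderMsg) error { return nil }
```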

2.4 Comparison of the three methods
Inventory deduction method | Advantages | Shortcomings
---|---|---
Deduct inventory on order | Avoids overselling | Can undersell; vulnerable to malicious orders; heavy pressure on the database
Deduct inventory on payment | Avoids underselling | Oversells; too many orders that can never be paid; heavy database and disk IO pressure
Withholding inventory | Avoids overselling and underselling | Still has room for optimization (how to deduct inventory correctly under high concurrency while responding to users quickly)
3. Withholding inventory optimization
From the comparison above, withholding inventory is clearly the best choice: it avoids both overselling and underselling, but there is still room for optimization.
Problems to be solved in the optimization scheme:
- Where should the inventory be stored?
- How to ensure the correct inventory deduction and rapid response to user requests under high concurrency?
Let's analyze them one by one:
On a single machine under high concurrency, inventory deduction is usually implemented as follows:

To keep inventory deduction and order generation atomic, we need a transaction: check the inventory, deduct the inventory, then commit the transaction. The whole process involves a lot of IO, and the operations block each other on the database (the resources held by the first connection are not released; a second connection that needs the same rows has to wait until the first connection commits or rolls back).
In other words, because the inventory for a given ticket lives in a single MySQL row, a large number of threads compete for the same InnoDB row lock. The higher the concurrency, the more threads wait, TPS (transactions per second) drops, response time (RT) rises, and database throughput suffers badly.
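The conventional flow just described looks roughly like the following Go sketch, assuming database/sql against MySQL with hypothetical stock and orders tables; the SELECT ... FOR UPDATE is what takes the InnoDB row lock that serializes concurrent buyers.

```go
package spike

import (
	"database/sql"
	"errors"
)

var errNoStock = errors.New("sold out")

// buyTicketTx is the conventional, fully transactional deduction:
// every step holds the row lock until Commit.
func buyTicketTx(db *sql.DB, ticketID, userID int64) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op if Commit has already succeeded

	// 1. Lock the stock row; other transactions on this row now wait.
	var count int64
	if err := tx.QueryRow(
		"SELECT count FROM stock WHERE ticket_id = ? FOR UPDATE", ticketID,
	).Scan(&count); err != nil {
		return err
	}
	if count <= 0 {
		return errNoStock
	}
	// 2. Deduct the inventory.
	if _, err := tx.Exec(
		"UPDATE stock SET count = count - 1 WHERE ticket_id = ?", ticketID,
	); err != nil {
		return err
	}
	// 3. Create the order.
	if _, err := tx.Exec(
		"INSERT INTO orders (ticket_id, user_id) VALUES (?, ?)", ticketID, userID,
	); err != nil {
		return err
	}
	// 4. Commit; only now is the row lock released.
	return tx.Commit()
}
```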
This approach is not suitable for a high-concurrency seckill system.
3.1 Local inventory deduction
Compared with the common deduction method above, our optimization is local inventory deduction: allocate a certain amount of inventory to each machine, deduct it directly in memory, and then create the order asynchronously as described earlier. The improved scheme looks like this:

This avoids frequent database IO and performs the deduction purely in memory, which greatly improves a single machine's ability to withstand concurrency. However, a single machine still cannot handle millions of users. Although Nginx uses the epoll model for network requests and the C10K problem has long been solved, in Linux everything is a file, and so is a network request: a huge number of file descriptors can make the operating system unresponsive in an instant. Using the weighted balancing strategy of Nginx described above, we can spread the requests of 1 million users across 100 servers, so the concurrency on any single machine is much lower. We then give each machine a local inventory of 100 train tickets; the total inventory across the 100 servers is still 10,000 tickets, which guarantees that orders are not oversold. The cluster architecture just described is shown below:

3.2 Remote unified inventory deduction
Here comes the problem: under high concurrency we cannot guarantee the high availability of every node. If 2 or 3 of the 100 servers go down because they cannot handle the traffic or for other reasons, the tickets allocated to those servers cannot be sold, and the system undersells.
To solve this, we need unified management of the total inventory, which is the next fault-tolerance scheme. Each server not only deducts inventory locally, it also deducts from a remote unified inventory. Thanks to the remote unified deduction, we can also allocate some extra "buffer inventory" to each machine, according to its load, to compensate for machines that go down.
Let's analyze the remote unified deduction architecture through a diagram:

We use Redis to store the unified inventory, since Redis performance is very high (a single instance is claimed to withstand 100K QPS). After the local deduction succeeds and a local order is registered, we request Redis to deduct the unified inventory remotely. Only when both the local deduction and the remote deduction succeed do we return a "ticket grabbed" response to the user. This also effectively guarantees that orders are not oversold. When one machine goes down, because every machine holds some buffer tickets, the remaining tickets of the dead machine can be made up on the other machines, guaranteeing no underselling.
How large should the buffer be? Theoretically, the more buffer, the more downed machines the system can tolerate, but an overly large buffer also has some impact on Redis: although Redis can withstand very high concurrency, each request is still a network IO. In fact, the number of Redis requests during the ticket rush is bounded by the sum of local inventory and buffer inventory, because once the local inventory is exhausted the machine returns "Sold out!" directly and no longer touches the unified inventory; for example, 100 servers each holding 100 local tickets plus a 10-ticket buffer produce at most 11,000 Redis deduction requests for 10,000 tickets. To some extent this prevents a huge flood of network requests from crushing Redis, so the buffer size is something the architect must weigh carefully against the system's load capacity.
4. Summary and review
Generally speaking, a seckill system is very complex. This article only introduces and simulates, in a simple way, some strategies for squeezing the best performance out of a single machine, avoiding single points of failure in the cluster, and guaranteeing that orders are neither oversold nor undersold.
A complete order system also needs a task for tracking order progress. Each server runs two background jobs: ① periodically sync the remaining tickets and inventory information from the unified inventory and show them to users; ② when a user does not pay within the order's validity period, release the order and put the inventory back (a minimal sketch follows below).
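Those two background jobs could look roughly like this Go sketch; syncRemainingTickets and releaseExpiredOrders are hypothetical helpers, and the interval is arbitrary.

```go
package spike

import "time"

// StartHousekeeping launches the two periodic tasks each server needs:
// ① regularly sync remaining tickets / inventory from the unified stock,
// ② release unpaid orders whose validity period has expired and put the
//    withheld inventory back.
func StartHousekeeping(interval time.Duration) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for range ticker.C {
			syncRemainingTickets()
			releaseExpiredOrders()
		}
	}()
}

// Hypothetical helpers, stubbed for the sketch.
func syncRemainingTickets() {} // e.g. read the unified stock from Redis and publish it to users
func releaseExpiredOrders() {} // e.g. void expired orders and add the stock back
```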
This article implements the core logic of high-concurrency ticket grabbing in a simple way. The design is quite clever: it skilfully avoids database IO, and apart from the Redis network IO for the unified deduction, almost all computation is done in memory. It effectively guarantees no overselling and no underselling, and it also provides a degree of disaster tolerance, allowing some machines to go down.
There are two design ideas worth learning from:
- **Load balancing, divide and conquer:** different traffic is routed to different machines; each machine handles its own requests and gives full play to its own performance, so the overall ability to withstand concurrency is high. 12306 adopts a similar idea, splitting the daily 10 o'clock ticket release across 9 o'clock, 10 o'clock, and 11 o'clock to reduce the pressure on the system.
- **Reasonable use of concurrency and asynchrony:** whatever can be done asynchronously should be done asynchronously; splitting functionality apart can yield unexpected gains. This is true of Nginx, Node.js, and Redis: the epoll model they use for network requests shows that a single thread can still be extremely powerful. Go, a language born for high concurrency, makes full use of multi-core servers; many tasks that can be processed concurrently should be handled with concurrency, for example Go runs each HTTP request in its own goroutine. Squeezing the CPU sensibly so it delivers its full value is a direction worth studying and exploring.
4.1 Comparison of inventory deduction methods
Inventory deduction method | Advantages | Shortcomings
---|---|---
Deduct inventory on order | Avoids overselling | Can undersell; vulnerable to malicious orders; heavy pressure on the database
Deduct inventory on payment | Avoids underselling | Oversells; too many orders that can never be paid; heavy database and disk IO pressure
Ordinary withholding inventory | Avoids overselling and underselling | ① needs a transaction ② check inventory then deduct ③ commit the transaction; lots of IO and the database blocks, so it is not suitable for a seckill system at all
Local withholding inventory | ① avoids overselling and underselling ② avoids frequent database IO by operating in memory ③ improves single-machine concurrency | ① under very high concurrency, a flood of requests can make the operating system unresponsive ② a downed server causes underselling
Remote unified inventory deduction | ① avoids overselling and underselling ② avoids frequent database IO by operating in memory ③ strong cluster-wide concurrency ④ tolerates machine downtime ⑤ no overselling or underselling under high concurrency ⑥ limits the high-concurrency Redis network IO | Wonderful~~~
5. Appendix: code demonstration
Go was designed with concurrency in mind, so I will use Go to demonstrate the concrete single-machine ticket-grabbing flow.
5.1 Initialization
In Go, a package's init function runs before main, which is where we do the preparatory work. Our system needs to: initialize the local inventory, initialize the hash keys of the unified inventory stored in remote Redis, and initialize the Redis connection pool. We also need to initialize an int channel with capacity 1, whose purpose is to act as a lock; read-write locks or Redis could also be used to avoid resource contention, but the channel is more efficient. This reflects the Go philosophy: do not communicate by sharing memory, share memory by communicating. The Redis library used is redigo. Here is the code:
```go
... // localSpike package structure definition
package localSpike

type LocalSpike struct {
	LocalInStock     int64
	LocalSalesVolume int64
}

... // remoteSpike: hash structure definition and redis connection pool
package remoteSpike

// Keys of the remotely stored order data
type RemoteSpikeKeys struct {
	SpikeOrderHashKey  string // key of the seckill order hash structure in redis
	TotalInventoryKey  string // key of the total inventory inside the hash
	QuantityOfOrderKey string // key of the current order quantity inside the hash
}

// Initialize the redis connection pool
func NewPool() *redis.Pool {
	return &redis.Pool{
		MaxIdle:   10000,
		MaxActive: 12000, // max number of connections
		Dial: func() (redis.Conn, error) {
			c, err := redis.Dial("tcp", ":6379")
			if err != nil {
				panic(err.Error())
			}
			return c, err
		},
	}
}

...
func init() {
	localSpike = localSpike2.LocalSpike{
		LocalInStock:     150,
		LocalSalesVolume: 0,
	}
	remoteSpike = remoteSpike2.RemoteSpikeKeys{
		SpikeOrderHashKey:  "ticket_hash_key",
		TotalInventoryKey:  "ticket_total_nums",
		QuantityOfOrderKey: "ticket_sold_nums",
	}
	redisPool = remoteSpike2.NewPool()
	done = make(chan int, 1)
	done <- 1
}
```
5.2 Local inventory deduction and unified inventory deduction
The local deduction logic is very simple: on each user request, increment the sales volume, compare it with the local inventory, and return a bool:
```go
package localSpike

// Deduct inventory locally and return a bool
func (spike *LocalSpike) LocalDeductionStock() bool {
	spike.LocalSalesVolume = spike.LocalSalesVolume + 1
	return spike.LocalSalesVolume < spike.LocalInStock
}
```
Note that access to the shared data LocalSalesVolume needs a lock; because the local deduction and the unified deduction together must behave as one atomic operation, the lock is implemented with a channel at the top layer, which we will come to later. The Redis side is the unified inventory deduction. Redis is single-threaded, but we need to fetch data, write data, and do calculations as one unit, so the commands are wrapped in a Lua script to guarantee the atomicity of the operation:
```go
package remoteSpike
......

const LuaScript = `
	local ticket_key = KEYS[1]
	local ticket_total_key = ARGV[1]
	local ticket_sold_key = ARGV[2]
	local ticket_total_nums = tonumber(redis.call('HGET', ticket_key, ticket_total_key))
	local ticket_sold_nums = tonumber(redis.call('HGET', ticket_key, ticket_sold_key))
	-- Check whether any tickets remain; if so, increase the order count and return the result
	if (ticket_total_nums > ticket_sold_nums) then
		return redis.call('HINCRBY', ticket_key, ticket_sold_key, 1)
	end
	return 0
`

// Remote unified inventory deduction
func (RemoteSpikeKeys *RemoteSpikeKeys) RemoteDeductionStock(conn redis.Conn) bool {
	lua := redis.NewScript(1, LuaScript)
	result, err := redis.Int(lua.Do(conn, RemoteSpikeKeys.SpikeOrderHashKey, RemoteSpikeKeys.TotalInventoryKey, RemoteSpikeKeys.QuantityOfOrderKey))
	if err != nil {
		return false
	}
	return result != 0
}
```
We use a hash structure to store the unified inventory and total sales. On each request, we check whether sales are still below the inventory, increment the sold count, and return the corresponding bool. Before starting the service, the initial Redis inventory needs to be set up:
```bash
hmset ticket_hash_key "ticket_total_nums" 10000 "ticket_sold_nums" 0
```
5.3 Responding to the user
We start an HTTP service listening on a port:
```go
package main
...

func main() {
	http.HandleFunc("/buy/ticket", handleReq)
	http.ListenAndServe(":3005", nil)
}
```
All the initialization work was done above, so the logic of handleReq is clear: judge whether the ticket grab succeeded and return the result to the user.
```go
package main

// Handle the request and write the result to the log
func handleReq(w http.ResponseWriter, r *http.Request) {
	redisConn := redisPool.Get()
	LogMsg := ""
	<-done // take the lock: a global channel used as a mutex
	// Deduct locally first, then deduct the remote unified inventory
	if localSpike.LocalDeductionStock() && remoteSpike.RemoteDeductionStock(redisConn) {
		util.RespJson(w, 1, "Ticket grabbing succeeded", nil)
		LogMsg = LogMsg + "result:1,localSales:" + strconv.FormatInt(localSpike.LocalSalesVolume, 10)
	} else {
		util.RespJson(w, -1, "Sold out", nil)
		LogMsg = LogMsg + "result:0,localSales:" + strconv.FormatInt(localSpike.LocalSalesVolume, 10)
	}
	done <- 1 // release the lock
	// Write the ticket-grabbing status to the log
	writeLog(LogMsg, "./stat.log")
}

func writeLog(msg string, logPath string) {
	fd, _ := os.OpenFile(logPath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
	defer fd.Close()
	content := strings.Join([]string{msg, "\r\n"}, "")
	buf := []byte(content)
	fd.Write(buf)
}
```
As mentioned earlier, we have to consider race conditions when deducting inventory, so we use the channel to avoid concurrent reads and writes and guarantee that requests execute sequentially and efficiently. The interface's response is written to the ./stat.log file for later load-test statistics.
5.4 Single-machine load test
Start the service and test it with the ab benchmarking tool:
```bash
ab -n 10000 -c 100 http://127.0.0.1:3005/buy/ticket
```
Below is the benchmark output from my low-spec local Mac:
```
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:
Server Hostname:        127.0.0.1
Server Port:            8005

Document Path:          /buy/ticket
Document Length:        29 bytes

Concurrency Level:      100
Time taken for tests:   2.339 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1370000 bytes
HTML transferred:       290000 bytes
Requests per second:    4275.96 [#/sec] (mean)
Time per request:       23.387 [ms] (mean)
Time per request:       0.234 [ms] (mean, across all concurrent requests)
Transfer rate:          572.08 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    8  14.7      6     223
Processing:     2   15  17.6     11     232
Waiting:        1   11  13.5      8     225
Total:          7   23  22.8     18     239

Percentage of the requests served within a certain time (ms)
  50%     18
  66%     24
  75%     26
  80%     28
  90%     33
  95%     39
  98%     45
  99%     54
 100%    239 (longest request)
```
According to these figures, my single machine can handle 4000+ requests per second; a normal multi-core production server would have no problem handling 10K+. Moreover, throughout the run the requests behaved normally, the traffic was even, and Redis stayed healthy:
```
// stat.log
...
result:1,localSales:145
result:1,localSales:146
result:1,localSales:147
result:1,localSales:148
result:1,localSales:149
result:1,localSales:150
result:0,localSales:151
result:0,localSales:152
result:0,localSales:153
result:0,localSales:154
result:0,localSales:156
...
```
6. References
- https://github.com/GuoZhaoran/spikeSystem
- Geek time - how to design a seckill system
For the follow-up updates of the blog, please follow my personal blog: Stardust blog