In September, datawhale formed a team to learn task04 set operation
1, Addition and subtraction of tables
1. What is set operation
-
The execution results of tables, views and queries learned earlier are collections
- The element is the record, which is each row of the table or query result
-
Set operation
- Set operators
- UNION Union operation
- INTERSECT intersection operation
- Excel difference operation
- (however, MySQL 8.0 does not support INTERSECT and excel)
- When solving the problem, the table can be regarded as a set for set operation
- Set operators
2. Addition UNION of table
-
Mutuality is one of the three elements of a set
- So merging two tables must have the function of de duplication
-
Let's try to combine product with product of product2_ id,product_name, merge it
-
First, let's see what they look like
-
According to the basic mathematical knowledge, we can predict what will happen after UNION. Let's test it
SELECT product_id,product_name FROM product UNION SELECT product_id,product_name FROM product2;
Expected ðŸ¤
The three records of "0001" ~ "0003" exist in both tables, but they will not appear again after UNION, which plays a role
-
-
The above is the union of two tables. In fact, it is also OK for one table. How do you understand it?
-
When you query by some criteria, you can merge records that meet condition A or condition B into records that meet condition A or condition B
-
Like this question
-
When we don't learn set operations, we use OR
SELECT product_id,product_name,product_id, product_name, product_type, sale_price, purchase_price FROM product WHERE sale_price < 800 OR sale_price > 1.5*purchase_price;
-
You can use UNION now!
SELECT product_id,product_name,product_id, product_name, product_type, sale_price, purchase_price FROM product WHERE sale_price < 800 UNION SELECT product_id,product_name,product_id, product_name, product_type, sale_price, purchase_price FROM product WHERE sale_price > 1.5*purchase_price;
-
The results are the same!
-
-
UNION and OR predicates
- Does the same result above mean that UNION is redundant? You see, there are two more lines to write sql statements with UNION
- No!
- For two tables, use UNION
- Even for a table, UNION is sometimes used for query efficiency
- No!
Set operation with duplicate rows UNION ALL
-
Although set merging is generally de duplication, sometimes we don't want to de duplication
-
UNION ALL will be used at this time
SELECT product_id,product_name FROM product UNION ALL SELECT product_id,product_name FROM product2;
-
Implicit data type conversion
Generally speaking, we will use UNION to merge columns with exactly the same type and representing the same attribute. However, sometimes, even if the data types are not exactly the same, we will display two columns with different types in one column through implicit type conversion, such as string and numeric types
3. Intersection join
(MySQL 8.0 does not support INTERSECT)
SELECT p1.product_id, p1.product_name FROM Product p1 INNER JOIN Product2 p2 ON p1.product_id=p2.product_id
AS is omitted above
It is best not to omit:
SELECT p1.product_id,p1.product_name FROM product AS p1 INNER JOIN product2 AS p2 ON p1.product_id=p2.product_id
4. Subtraction of difference set, complement set and table
NOT predicate
MySQL 8.0 does not support excel operation yet
However, NOT IN can also achieve the same effect as the excel operation in SQL standard syntax
-
Find items that exist only in the Product table but not in the Product2 table
SELECT * FROM product WHERE product_id NOT IN (SELECT product_id FROM product2);
-
Use the NOT predicate to subtract the set to find the goods in the Product table with a selling price of more than 2000 but a profit of less than 30%
SELECT * FROM product WHERE sale_price > 2000 AND product_id NOT IN (SELECT product_id FROM product WHERE sale_price < 1.3*purchase_price);
AND predicate
-
For two query results of the same table, their INTERSECT can actually connect the search conditions of the two queries with AND predicate equivalently
-
Find products in the product table with a profit margin of more than 50% and a selling price of less than 1500
SELECT * FROM product WHERE sale_price > 1.5*purchase_price AND sale_price < 1500
5. Symmetry difference
-
In other databases, it can also be used to simply realize the symmetric difference operation of tables or query results: first, use UNION to find the UNION of two tables, then use INTERSECT to find the intersection of two tables, and then subtract the intersection from the UNION to get the symmetric difference
-
However, in MySQL 8.0, because the of two tables or query results cannot be calculated directly, it is not suitable to use the above idea to calculate the symmetry difference
-
So you can use A-B and B-A
-
example
-
Use the symmetry difference between the Product table and the Product2 table to query which products are in only one table
SELECT * FROM product WHERE product_id NOT IN (SELECT product_id FROM product2) UNION SELECT * FROM product2 WHERE product_id NOT IN (SELECT product_id FROM product)
-
2, Link
The addition and subtraction method in the previous table is the change of rows, and the connection is the change of columns
Join is to add columns from other tables by using some association condition (generally using the equality judgment predicate "=") to perform the set operation of "add columns"
1. INNER JOIN
-
Syntax format
-- Inner link FROM <tb_1> INNER JOIN <tb_2> ON <condition(s)>
Use inner links to get information from two tables
-
Let's look at the product table first
SELECT * FROM product;
-
Let's look at the shopproduct table
SELECT * FROM shopproduct;
-
The two tables have different information columns. If we want to analyze them together, we need to link them!
-
operation
SELECT SP.shop_id ,SP.shop_name ,SP.product_id ,P.product_name ,P.product_type ,P.sale_price ,SP.quantity FROM ShopProduct AS SP INNER JOIN Product AS P ON SP.product_id = P.product_id;
be careful
- The join condition must be specified with the ON clause
- The columns in the SELECT clause are best used in the form of table name and column name
- In this way, you can quickly see which column comes from which table
- If two tables have other columns with the same name, you must use the above format to select the column name, otherwise the query statement will report an error
Use inner join with WHERE clause
-
If you need to use the WHERE clause to filter the search results when using inner links, you need to write the WHERE clause after the ON clause
-
Follow the example above
SELECT SP.shop_id ,SP.shop_name ,SP.product_id ,P.product_name ,P.product_type ,P.sale_price ,SP.quantity FROM ShopProduct AS SP INNER JOIN Product AS P ON SP.product_id = P.product_id WHERE SP.shop_name = 'Tokyo' AND P.product_type = 'clothes' ;
-
Note the execution order of the above queries
-
Use inner join with GROUP BY clause
- What are the selling prices of the highest priced goods in each store
SELECT SP.shop_id ,SP.shop_name ,MAX(P.sale_price) AS max_price FROM shopproduct AS SP INNER JOIN product AS P ON SP.product_id = P.product_id GROUP BY SP.shop_id,SP.shop_name
Self linking
The previous inner links linked two different tables. However, in fact, a table can also be connected with itself. This connection is called self connection. It should be noted that self connection is not the third connection to distinguish between internal connection and external connection. Self connection can be external connection or internal connection. It is another classification method of connection different from internal connection and external connection
Inner link and associated subquery
-
Identify goods whose selling price is higher than the average price of such goods
SELECT P1.product_id, P1.product_name, P1.product_type, P1.sale_price, P2.avg_price FROM product AS P1 INNER JOIN (SELECT product_type,AVG(sale_price) AS avg_price FROM product GROUP BY product_type) AS P2 ON P1.product_type = P2.product_type WHERE P1.sale_price > P2.avg_price;
When two tables are connected naturally, the equivalent inner connection will be performed according to the column names contained in both tables. At this time, it is not necessary to use ON to specify the connection conditions
Natural join
SELECT * FROM shopproduct NATURAL JOIN product;
Intersection using links
SELECT * FROM product AS P1 INNER JOIN product2 AS P2 ON P1.product_id = P2.product_id;
2. External connection
-
The inner join discards the rows in the two tables that do not meet the ON condition. The opposite to the inner join is the outer join
-
The outer link will selectively reserve rows that cannot be matched according to the type of outer link.
There are three forms of external links according to the table in which the reserved rows are located
-
Left link
-
The left join will save the rows in the left table that cannot be matched according to the ON clause. At this time, the rows in the right table are missing values
-
Right link
-
The right join will save the rows in the right table that cannot be matched according to the ON clause. At this time, the rows corresponding to the left table are all missing values
-
External link
-
The full outer join will save the rows in the two tables that cannot be matched according to the ON clause at the same time, and the rows in the corresponding other table will be filled with missing values
Unfortunately, MySQL 8.0 does not support all external links at present, but we can unify the results of left and right links to realize all external links
-
The corresponding grammars of the three external links are:
-- Left link FROM <tb_1> LEFT OUTER JOIN <tb_2> ON <condition(s)> -- Right link FROM <tb_1> RIGHT OUTER JOIN <tb_2> ON <condition(s)> -- Total external connection FROM <tb_1> FULL OUTER JOIN <tb_2> ON <condition(s)>
3. Multi table link
Multiple intra table links
-
Use the inner link to find out which goods each store has, and the total inventory of each kind of goods is separately
SELECT SP.shop_id ,SP.shop_name ,SP.product_id ,P.product_name ,P.sale_price ,IP.inventory_quantity FROM ShopProduct AS SP INNER JOIN Product AS P ON SP.product_id = P.product_id INNER JOIN InventoryProduct AS IP ON SP.product_id = IP.product_id WHERE IP.inventory_id = 'P001';
Multiple off table links
- External links can generally have more rows than internal links, so they can give more information about the main table than internal links. Using external links in multi table links also has the same effect
SELECT P.product_id ,P.product_name ,P.sale_price ,SP.shop_id ,SP.shop_name ,IP.inventory_quantity FROM Product AS P LEFT OUTER JOIN ShopProduct AS SP ON SP.product_id = P.product_id LEFT OUTER JOIN InventoryProduct AS IP ON SP.product_id = IP.product_id