# In September, datawhale formed a team to learn task04 set operation

## 1, Addition and subtraction of tables

### 1. What is set operation

• The execution results of tables, views and queries learned earlier are collections

• The element is the record, which is each row of the table or query result
• Set operation

• Set operators
• UNION Union operation
• INTERSECT intersection operation
• Excel difference operation
• (however, MySQL 8.0 does not support INTERSECT and excel)
• When solving the problem, the table can be regarded as a set for set operation

### 2. Addition UNION of table

• Mutuality is one of the three elements of a set

• So merging two tables must have the function of de duplication
• Let's try to combine product with product of product2_ id,product_name, merge it

• First, let's see what they look like

• According to the basic mathematical knowledge, we can predict what will happen after UNION. Let's test it

```SELECT product_id,product_name
FROM product
UNION
SELECT product_id,product_name
FROM product2;
```

Expected ðŸ¤­

The three records of "0001" ~ "0003" exist in both tables, but they will not appear again after UNION, which plays a role

• The above is the union of two tables. In fact, it is also OK for one table. How do you understand it?

• When you query by some criteria, you can merge records that meet condition A or condition B into records that meet condition A or condition B

• Like this question

• When we don't learn set operations, we use OR

```SELECT product_id,product_name,product_id, product_name, product_type,
sale_price, purchase_price
FROM product
WHERE sale_price < 800
OR sale_price > 1.5*purchase_price;
```
• You can use UNION now!

```SELECT product_id,product_name,product_id, product_name, product_type,
sale_price, purchase_price
FROM product
WHERE sale_price < 800

UNION
SELECT product_id,product_name,product_id, product_name, product_type,
sale_price, purchase_price
FROM product
WHERE sale_price > 1.5*purchase_price;
```
• The results are the same!

#### UNION and OR predicates

• Does the same result above mean that UNION is redundant? You see, there are two more lines to write sql statements with UNION
• No!
• For two tables, use UNION
• Even for a table, UNION is sometimes used for query efficiency

#### Set operation with duplicate rows UNION ALL

• Although set merging is generally de duplication, sometimes we don't want to de duplication

• UNION ALL will be used at this time

```SELECT product_id,product_name
FROM  product
UNION ALL
SELECT product_id,product_name
FROM product2;
```

#### Implicit data type conversion

Generally speaking, we will use UNION to merge columns with exactly the same type and representing the same attribute. However, sometimes, even if the data types are not exactly the same, we will display two columns with different types in one column through implicit type conversion, such as string and numeric types

### 3. Intersection join

(MySQL 8.0 does not support INTERSECT)

```SELECT p1.product_id, p1.product_name
FROM Product p1
INNER JOIN Product2 p2
ON p1.product_id=p2.product_id
```

AS is omitted above

It is best not to omit:

```SELECT p1.product_id,p1.product_name
FROM product AS p1
INNER JOIN product2 AS p2
ON p1.product_id=p2.product_id
```

### MySQL 8.0 does not support excel operation yet

However, NOT IN can also achieve the same effect as the excel operation in SQL standard syntax

• Find items that exist only in the Product table but not in the Product2 table

```SELECT *
FROM product
WHERE product_id NOT IN (SELECT product_id FROM product2);
```

• Use the NOT predicate to subtract the set to find the goods in the Product table with a selling price of more than 2000 but a profit of less than 30%

```SELECT *
FROM product
WHERE sale_price > 2000
AND product_id NOT IN (SELECT product_id FROM product WHERE sale_price < 1.3*purchase_price);
```

#### AND predicate

• For two query results of the same table, their INTERSECT can actually connect the search conditions of the two queries with AND predicate equivalently

• Find products in the product table with a profit margin of more than 50% and a selling price of less than 1500

```SELECT *
FROM product
WHERE sale_price > 1.5*purchase_price
AND sale_price < 1500
```

### 5. Symmetry difference

• In other databases, it can also be used to simply realize the symmetric difference operation of tables or query results: first, use UNION to find the UNION of two tables, then use INTERSECT to find the intersection of two tables, and then subtract the intersection from the UNION to get the symmetric difference

• However, in MySQL 8.0, because the of two tables or query results cannot be calculated directly, it is not suitable to use the above idea to calculate the symmetry difference

• So you can use A-B and B-A

• example

• Use the symmetry difference between the Product table and the Product2 table to query which products are in only one table

```SELECT *
FROM product
WHERE product_id NOT IN (SELECT product_id FROM product2)
UNION
SELECT *
FROM product2
WHERE product_id NOT IN (SELECT product_id FROM product)
```

The addition and subtraction method in the previous table is the change of rows, and the connection is the change of columns

Join is to add columns from other tables by using some association condition (generally using the equality judgment predicate "=") to perform the set operation of "add columns"

### 1. INNER JOIN

• Syntax format

```-- Inner link
FROM <tb_1> INNER JOIN <tb_2> ON <condition(s)>
```

#### Use inner links to get information from two tables

• Let's look at the product table first

```SELECT * FROM product;
```

• Let's look at the shopproduct table

```SELECT *
FROM shopproduct;
```

• The two tables have different information columns. If we want to analyze them together, we need to link them!

• operation

```SELECT SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.product_type
,P.sale_price
,SP.quantity
FROM ShopProduct AS SP
INNER JOIN Product AS P
ON SP.product_id = P.product_id;

```
##### be careful
• The join condition must be specified with the ON clause
• The columns in the SELECT clause are best used in the form of table name and column name
• In this way, you can quickly see which column comes from which table
• If two tables have other columns with the same name, you must use the above format to select the column name, otherwise the query statement will report an error

#### Use inner join with WHERE clause

• If you need to use the WHERE clause to filter the search results when using inner links, you need to write the WHERE clause after the ON clause

```SELECT  SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.product_type
,P.sale_price
,SP.quantity
FROM ShopProduct AS SP
INNER JOIN Product AS P
ON SP.product_id = P.product_id
WHERE SP.shop_name = 'Tokyo'
AND P.product_type = 'clothes' ;
```
• Note the execution order of the above queries

#### Use inner join with GROUP BY clause

• What are the selling prices of the highest priced goods in each store
```SELECT SP.shop_id
,SP.shop_name
,MAX(P.sale_price) AS max_price
FROM shopproduct AS SP
INNER JOIN product AS P
ON SP.product_id = P.product_id
GROUP BY SP.shop_id,SP.shop_name
```

The previous inner links linked two different tables. However, in fact, a table can also be connected with itself. This connection is called self connection. It should be noted that self connection is not the third connection to distinguish between internal connection and external connection. Self connection can be external connection or internal connection. It is another classification method of connection different from internal connection and external connection

#### Inner link and associated subquery

• Identify goods whose selling price is higher than the average price of such goods

```SELECT  P1.product_id,
P1.product_name,
P1.product_type,
P1.sale_price,
P2.avg_price
FROM product AS P1
INNER JOIN
(SELECT product_type,AVG(sale_price) AS avg_price
FROM product
GROUP BY  product_type) AS P2
ON P1.product_type = P2.product_type
WHERE P1.sale_price > P2.avg_price;
```

When two tables are connected naturally, the equivalent inner connection will be performed according to the column names contained in both tables. At this time, it is not necessary to use ON to specify the connection conditions

#### Natural join

```SELECT *  FROM shopproduct NATURAL JOIN product;
```

```SELECT *
FROM product AS P1
INNER JOIN product2 AS P2
ON P1.product_id = P2.product_id;
```

### 2. External connection

• The inner join discards the rows in the two tables that do not meet the ON condition. The opposite to the inner join is the outer join

• The outer link will selectively reserve rows that cannot be matched according to the type of outer link.

There are three forms of external links according to the table in which the reserved rows are located

• The left join will save the rows in the left table that cannot be matched according to the ON clause. At this time, the rows in the right table are missing values

• The right join will save the rows in the right table that cannot be matched according to the ON clause. At this time, the rows corresponding to the left table are all missing values

• The full outer join will save the rows in the two tables that cannot be matched according to the ON clause at the same time, and the rows in the corresponding other table will be filled with missing values

Unfortunately, MySQL 8.0 does not support all external links at present, but we can unify the results of left and right links to realize all external links

The corresponding grammars of the three external links are:

```-- Left link
FROM <tb_1> LEFT  OUTER JOIN <tb_2> ON <condition(s)>
FROM <tb_1> RIGHT OUTER JOIN <tb_2> ON <condition(s)>
-- Total external connection
FROM <tb_1> FULL  OUTER JOIN <tb_2> ON <condition(s)>
```

• Use the inner link to find out which goods each store has, and the total inventory of each kind of goods is separately

```SELECT SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.sale_price
,IP.inventory_quantity
FROM ShopProduct AS SP
INNER JOIN Product AS P
ON SP.product_id = P.product_id
INNER JOIN InventoryProduct AS IP
ON SP.product_id = IP.product_id
WHERE IP.inventory_id = 'P001';
```

```SELECT P.product_id
,P.product_name
,P.sale_price
,SP.shop_id
,SP.shop_name
,IP.inventory_quantity
FROM Product AS P
LEFT OUTER JOIN ShopProduct AS SP
ON SP.product_id = P.product_id
LEFT OUTER JOIN InventoryProduct AS IP
ON SP.product_id = IP.product_id
```

Tags: Database MySQL SQL

Posted by mcirl2 on Wed, 22 Sep 2021 22:40:36 +0530