In September, datawhale formed a team to learn task04 set operation

In September, datawhale formed a team to learn task04 set operation

1, Addition and subtraction of tables

1. What is set operation

  • The execution results of tables, views and queries learned earlier are collections

    • The element is the record, which is each row of the table or query result
  • Set operation

    • Set operators
      • UNION Union operation
      • INTERSECT intersection operation
      • Excel difference operation
      • (however, MySQL 8.0 does not support INTERSECT and excel)
    • When solving the problem, the table can be regarded as a set for set operation

2. Addition UNION of table

  • Mutuality is one of the three elements of a set

    • So merging two tables must have the function of de duplication
  • Let's try to combine product with product of product2_ id,product_name, merge it

    • First, let's see what they look like

    • According to the basic mathematical knowledge, we can predict what will happen after UNION. Let's test it

      SELECT product_id,product_name
      FROM product
      UNION
      SELECT product_id,product_name
      FROM product2;
      

      Expected 🤭

      The three records of "0001" ~ "0003" exist in both tables, but they will not appear again after UNION, which plays a role

  • The above is the union of two tables. In fact, it is also OK for one table. How do you understand it?

    • When you query by some criteria, you can merge records that meet condition A or condition B into records that meet condition A or condition B

    • Like this question

      • When we don't learn set operations, we use OR

        SELECT product_id,product_name,product_id, product_name, product_type,
               sale_price, purchase_price
        FROM product
        WHERE sale_price < 800
           OR sale_price > 1.5*purchase_price;
        
      • You can use UNION now!

        SELECT product_id,product_name,product_id, product_name, product_type,
               sale_price, purchase_price
        FROM product
        WHERE sale_price < 800
        
        UNION
        SELECT product_id,product_name,product_id, product_name, product_type,
               sale_price, purchase_price
        FROM product
        WHERE sale_price > 1.5*purchase_price;
        
      • The results are the same!

UNION and OR predicates

  • Does the same result above mean that UNION is redundant? You see, there are two more lines to write sql statements with UNION
    • No!
      • For two tables, use UNION
      • Even for a table, UNION is sometimes used for query efficiency

Set operation with duplicate rows UNION ALL

  • Although set merging is generally de duplication, sometimes we don't want to de duplication

    • UNION ALL will be used at this time

      SELECT product_id,product_name
      FROM  product
      UNION ALL
      SELECT product_id,product_name
      FROM product2;
      

Implicit data type conversion

Generally speaking, we will use UNION to merge columns with exactly the same type and representing the same attribute. However, sometimes, even if the data types are not exactly the same, we will display two columns with different types in one column through implicit type conversion, such as string and numeric types

3. Intersection join

(MySQL 8.0 does not support INTERSECT)

SELECT p1.product_id, p1.product_name
  FROM Product p1
INNER JOIN Product2 p2
ON p1.product_id=p2.product_id

AS is omitted above

It is best not to omit:

SELECT p1.product_id,p1.product_name
FROM product AS p1
INNER JOIN product2 AS p2
ON p1.product_id=p2.product_id

4. Subtraction of difference set, complement set and table

NOT predicate

MySQL 8.0 does not support excel operation yet

However, NOT IN can also achieve the same effect as the excel operation in SQL standard syntax

  • Find items that exist only in the Product table but not in the Product2 table

    SELECT *
    FROM product
    WHERE product_id NOT IN (SELECT product_id FROM product2);
    

  • Use the NOT predicate to subtract the set to find the goods in the Product table with a selling price of more than 2000 but a profit of less than 30%

    SELECT *
    FROM product
    WHERE sale_price > 2000
    AND product_id NOT IN (SELECT product_id FROM product WHERE sale_price < 1.3*purchase_price);
    

AND predicate

  • For two query results of the same table, their INTERSECT can actually connect the search conditions of the two queries with AND predicate equivalently

  • Find products in the product table with a profit margin of more than 50% and a selling price of less than 1500

    SELECT *
    FROM product
    WHERE sale_price > 1.5*purchase_price
    AND sale_price < 1500
    

5. Symmetry difference

  • In other databases, it can also be used to simply realize the symmetric difference operation of tables or query results: first, use UNION to find the UNION of two tables, then use INTERSECT to find the intersection of two tables, and then subtract the intersection from the UNION to get the symmetric difference

  • However, in MySQL 8.0, because the of two tables or query results cannot be calculated directly, it is not suitable to use the above idea to calculate the symmetry difference

  • So you can use A-B and B-A

  • example

    • Use the symmetry difference between the Product table and the Product2 table to query which products are in only one table

      SELECT *
      FROM product
      WHERE product_id NOT IN (SELECT product_id FROM product2)
      UNION
      SELECT *
      FROM product2
      WHERE product_id NOT IN (SELECT product_id FROM product)
      

2, Link

The addition and subtraction method in the previous table is the change of rows, and the connection is the change of columns

Join is to add columns from other tables by using some association condition (generally using the equality judgment predicate "=") to perform the set operation of "add columns"

1. INNER JOIN

  • Syntax format

    -- Inner link
    FROM <tb_1> INNER JOIN <tb_2> ON <condition(s)>
    

Use inner links to get information from two tables

  • Let's look at the product table first

    SELECT * FROM product;
    

  • Let's look at the shopproduct table

    SELECT *
    FROM shopproduct;
    

  • The two tables have different information columns. If we want to analyze them together, we need to link them!

  • operation

    SELECT SP.shop_id
           ,SP.shop_name
           ,SP.product_id
           ,P.product_name
           ,P.product_type
           ,P.sale_price
           ,SP.quantity
     FROM ShopProduct AS SP
    INNER JOIN Product AS P
       ON SP.product_id = P.product_id;
    
    
be careful
  • The join condition must be specified with the ON clause
  • The columns in the SELECT clause are best used in the form of table name and column name
    • In this way, you can quickly see which column comes from which table
    • If two tables have other columns with the same name, you must use the above format to select the column name, otherwise the query statement will report an error

Use inner join with WHERE clause

  • If you need to use the WHERE clause to filter the search results when using inner links, you need to write the WHERE clause after the ON clause

  • Follow the example above

    SELECT  SP.shop_id
           ,SP.shop_name
           ,SP.product_id
           ,P.product_name
           ,P.product_type
           ,P.sale_price
           ,SP.quantity
      FROM ShopProduct AS SP
     INNER JOIN Product AS P
        ON SP.product_id = P.product_id
     WHERE SP.shop_name = 'Tokyo'
       AND P.product_type = 'clothes' ;
    
    • Note the execution order of the above queries

Use inner join with GROUP BY clause

  • What are the selling prices of the highest priced goods in each store
SELECT SP.shop_id
      ,SP.shop_name
      ,MAX(P.sale_price) AS max_price
 FROM shopproduct AS SP
INNER JOIN product AS P
ON SP.product_id = P.product_id
GROUP BY SP.shop_id,SP.shop_name

Self linking

The previous inner links linked two different tables. However, in fact, a table can also be connected with itself. This connection is called self connection. It should be noted that self connection is not the third connection to distinguish between internal connection and external connection. Self connection can be external connection or internal connection. It is another classification method of connection different from internal connection and external connection

Inner link and associated subquery

  • Identify goods whose selling price is higher than the average price of such goods

    SELECT  P1.product_id,
           P1.product_name,
           P1.product_type,
           P1.sale_price,
           P2.avg_price
    FROM product AS P1
    INNER JOIN
        (SELECT product_type,AVG(sale_price) AS avg_price
            FROM product
            GROUP BY  product_type) AS P2
        ON P1.product_type = P2.product_type
    WHERE P1.sale_price > P2.avg_price;
    

    When two tables are connected naturally, the equivalent inner connection will be performed according to the column names contained in both tables. At this time, it is not necessary to use ON to specify the connection conditions

Natural join

SELECT *  FROM shopproduct NATURAL JOIN product;

Intersection using links

SELECT *
FROM product AS P1
INNER JOIN product2 AS P2
ON P1.product_id = P2.product_id;

2. External connection

  • The inner join discards the rows in the two tables that do not meet the ON condition. The opposite to the inner join is the outer join

  • The outer link will selectively reserve rows that cannot be matched according to the type of outer link.

    There are three forms of external links according to the table in which the reserved rows are located

    • Left link

    • The left join will save the rows in the left table that cannot be matched according to the ON clause. At this time, the rows in the right table are missing values

    • Right link

    • The right join will save the rows in the right table that cannot be matched according to the ON clause. At this time, the rows corresponding to the left table are all missing values

    • External link

    • The full outer join will save the rows in the two tables that cannot be matched according to the ON clause at the same time, and the rows in the corresponding other table will be filled with missing values

      Unfortunately, MySQL 8.0 does not support all external links at present, but we can unify the results of left and right links to realize all external links

The corresponding grammars of the three external links are:

-- Left link     
FROM <tb_1> LEFT  OUTER JOIN <tb_2> ON <condition(s)>
-- Right link     
FROM <tb_1> RIGHT OUTER JOIN <tb_2> ON <condition(s)>
-- Total external connection
FROM <tb_1> FULL  OUTER JOIN <tb_2> ON <condition(s)>

3. Multi table link

Multiple intra table links

  • Use the inner link to find out which goods each store has, and the total inventory of each kind of goods is separately

    SELECT SP.shop_id
           ,SP.shop_name
           ,SP.product_id
          ,P.product_name
           ,P.sale_price
           ,IP.inventory_quantity
    FROM ShopProduct AS SP
    INNER JOIN Product AS P
    ON SP.product_id = P.product_id
     INNER JOIN InventoryProduct AS IP
    ON SP.product_id = IP.product_id
    WHERE IP.inventory_id = 'P001';
    

Multiple off table links

  • External links can generally have more rows than internal links, so they can give more information about the main table than internal links. Using external links in multi table links also has the same effect
SELECT P.product_id
       ,P.product_name
       ,P.sale_price
       ,SP.shop_id
       ,SP.shop_name
       ,IP.inventory_quantity
  FROM Product AS P
  LEFT OUTER JOIN ShopProduct AS SP
ON SP.product_id = P.product_id
LEFT OUTER JOIN InventoryProduct AS IP
ON SP.product_id = IP.product_id

Tags: Database MySQL SQL

Posted by mcirl2 on Wed, 22 Sep 2021 22:40:36 +0530