Notes on understanding Hive configuration from today's installation

hive.metastore.local

Before reinstalling Hive today, I had always assumed that this parameter described the relationship between Hive and MySQL. Only today did I think to read the installation page on the official website before installing, and I found that I had seriously misunderstood Hive's three installation modes.

The following link goes to the installation introduction page on the official website.
link

Local Multi-User Mode


As the official introduction says, the metadata of Hive tables is stored in a relational database. We usually integrate Hive with MySQL (that is, copy the MySQL connector jar into the HIVE_HOME/lib directory), so the upper part of the screenshot covers where the metadata is stored. The lower part covers how the Hive server and client are configured in integrated mode. The hive.metastore.local parameter, which I had been confusing with metadata storage, actually configures the relationship between the Hive server and the client. If it is true, there is no need to configure a URI, because the server and client run together and the client connects directly.
So the configuration for the simplest local multi-user mode should look like this:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>hive.metastore.warehouse.dir</name>
                <value>/user/hive_local/warehouse</value>
        </property>
        <property>
                <name>hive.metastore.local</name>
                <value>true</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionURL</name>
                <value>jdbc:mysql://node01:3306/hive_local?createDatabaseIfNotExist=true</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionDriverName</name>
                <value>com.mysql.jdbc.Driver</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionUserName</name>
                <value>root</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionPassword</name>
                <value>123456</value>
        </property>
</configuration>
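
A quick way to tell which mode a given hive-site.xml will run in is to look at hive.metastore.uris rather than hive.metastore.local. The Python sketch below assumes Hive 0.10 or later, where an empty or missing hive.metastore.uris means local mode and a non-empty one means remote mode. The helper names load_properties and metastore_mode and the trimmed LOCAL_SITE sample are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Return the <property> name/value pairs of a hive-site.xml as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def metastore_mode(xml_text):
    """Assumption: Hive 0.10+ semantics, where only hive.metastore.uris decides the mode."""
    uris = load_properties(xml_text).get("hive.metastore.uris", "")
    return "remote" if uris else "local"


# Trimmed copy of the local configuration above (illustration only).
LOCAL_SITE = """<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive_local/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node01:3306/hive_local?createDatabaseIfNotExist=true</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(metastore_mode(LOCAL_SITE))  # prints "local"
```

The function deliberately ignores hive.metastore.local, so the same check also works on configurations that leave that parameter out.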

Remote Separate Mode

I won’t go into detail about the remote integrated mode here. Just remember that "integrated" refers to the Hive server and client being integrated, so the configuration is exactly the same as above. Let's focus on the configuration of the remote separate mode. The screenshot from the official website is as follows:

What we see here is interesting. The parameter we have been discussing, hive.metastore.local, is marked as no longer needed throughout the official page. Here you only need to set the metastore URIs (hive.metastore.uris). Next, look at the part I marked with the highlighter in the screenshot. The HDFS path set on the server side is the default location for Hive tables. The same parameter is also set on the client side, but its description is different: it is where the data of non-external Hive tables is stored. So it is best to set these two paths to the same value, to avoid data that cannot be found, or accidental deletion, when you later create an external table without specifying a path.
An example configuration for the remote separate mode follows.

Client
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>hive.metastore.warehouse.dir</name>
                <value>/user/hive_remote2/warehouse</value>
        </property>
        <property>
                <name>hive.metastore.uris</name>
                <value>thrift://node03:9083</value>
        </property>
</configuration>
Server

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>hive.metastore.warehouse.dir</name>
                <value>/user/hive_remote2/warehouse</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionURL</name>
                <value>jdbc:mysql://node01:3306/hive_remote2?createDatabaseIfNotExist=true</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionDriverName</name>
                <value>com.mysql.jdbc.Driver</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionUserName</name>
                <value>root</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionPassword</name>
                <value>123456</value>
        </property>
</configuration>
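
Since the client and server files are maintained separately, the two hive.metastore.warehouse.dir values can drift apart without anyone noticing. A minimal Python sketch of a consistency check follows. The helper names get_property and warehouse_dirs_match and the trimmed CLIENT_SITE and SERVER_SITE samples are hypothetical, and the comparison is a plain string match that does not normalize trailing slashes or URI schemes.

```python
import xml.etree.ElementTree as ET

WAREHOUSE_KEY = "hive.metastore.warehouse.dir"


def get_property(xml_text, key):
    """Return the value of one property from a hive-site.xml, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == key:
            return (prop.findtext("value") or "").strip()
    return None


def warehouse_dirs_match(client_xml, server_xml):
    """True only if both files set the warehouse dir to the same string."""
    client_dir = get_property(client_xml, WAREHOUSE_KEY)
    server_dir = get_property(server_xml, WAREHOUSE_KEY)
    return client_dir is not None and client_dir == server_dir


# Trimmed copies of the client and server configurations above (illustration only).
CLIENT_SITE = """<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive_remote2/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://node03:9083</value>
  </property>
</configuration>"""

SERVER_SITE = """<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive_remote2/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node01:3306/hive_remote2?createDatabaseIfNotExist=true</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(warehouse_dirs_match(CLIENT_SITE, SERVER_SITE))  # prints "True"
```

In practice the two strings would be read from the hive-site.xml on each machine instead of the inline samples.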

When I installed Hive a long time ago, I also had to make the jline jar version in the HIVE_HOME/lib directory match the one under HADOOP_PREFIX. Nowadays skipping this step no longer causes an error, but I still do it out of habit. If other problems come up later, I will keep updating this article.

Posted by EricC on Mon, 19 Dec 2022 12:08:40 +0530