Blue whale CMDB deployment notes

System introduction

The Tencent Blue Whale (BlueKing) smart cloud system is composed of platform-level products and general SaaS services. The platform products include the control platform, configuration platform, operation platform, data platform, container management platform, mining platform, PaaS platform, mobile platform, etc. The general SaaS services include node management, standard operation and maintenance, log retrieval, Blue Whale monitoring, fault self-healing, etc. Together they give users of all kinds of clouds (public, private, and hybrid) one-stop technical operations solutions for different scenarios and needs.

Relying on the concepts of enterprise SOA and integration, the Tencent Blue Whale smart cloud system builds a new operation and maintenance model using Docker and other cloud technologies. It aims to put DevOps into practice through "atomic service integration" and "low-cost tool construction", helping operations teams quickly achieve "unattended basic services" and "value-added services", and in turn bringing the enterprise more comprehensive and sustainable efficiency gains.

Architecture diagram

[Architecture diagram: image not available]

Environmental preparation

System environment

CentOS 7 or later is required

IP address      Host name   Configuration  Description               System version
192.168.31.221  cmdb_node1  8c/16G         Central control computer  CentOS 7.6.1810
192.168.31.223  cmdb_node2  4c/8G          -                         CentOS 7.6.1810
192.168.31.224  cmdb_node3  4c/8G          -                         CentOS 7.6.1810

Environment preparation (run on all nodes)

Disable SELinux

setenforce 0
echo "/usr/sbin/setenforce 0" >> /etc/rc.local
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
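The two sed commands above only rewrite a line that reads SELINUX=enforcing, so a host that is already set to permissive is left unchanged. As a minimal sketch (the function name set_selinux_disabled is hypothetical and not part of any Blue Whale tooling), the helper below rewrites whatever value is present in the file it is given. It writes the result back with cat, so a path that is a symlink (on CentOS 7, /etc/sysconfig/selinux normally is) keeps pointing at its target:

```shell
# Hypothetical helper: force SELINUX=disabled in the given config file,
# whatever the current value is (enforcing or permissive).
set_selinux_disabled() {
    local file="$1" tmp
    tmp=$(mktemp) || return 1
    # Only the SELINUX= line is rewritten; SELINUXTYPE= is left untouched.
    if sed 's/^SELINUX=.*/SELINUX=disabled/' "$file" > "$tmp"; then
        cat "$tmp" > "$file"      # write through a symlink instead of replacing it
    fi
    rm -f "$tmp"
    # Print the value now stored in the file.
    sed -n 's/^SELINUX=//p' "$file"
}

# Example (as root): set_selinux_disabled /etc/selinux/config
```

The change still only takes effect after the reboot described later in this section.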

Stop and disable NetworkManager

systemctl stop NetworkManager
systemctl disable NetworkManager

Set DNS

cat > /etc/resolv.conf << EOF
nameserver 127.0.0.1
nameserver 114.114.114.114
nameserver 8.8.8.8
EOF

Set hosts file

cat >> /etc/hosts << EOF
192.168.31.221 cmdb_node1
192.168.31.223 cmdb_node2
192.168.31.224 cmdb_node3
EOF
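Because the heredoc above appends with >>, running it a second time leaves duplicate lines in /etc/hosts. A small sketch of an idempotent alternative follows; add_host_entry is a hypothetical helper name, and the file path is passed in so it can be tried on a scratch file first:

```shell
# Hypothetical helper: append "IP NAME" to a hosts file only when that
# IP/name pair is not already present on some line.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    if awk -v ip="$ip" -v name="$name" '
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }' "$file" 2>/dev/null; then
        return 0                      # already there, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example (as root), using the addresses from this article:
# add_host_entry /etc/hosts 192.168.31.221 cmdb_node1
# add_host_entry /etc/hosts 192.168.31.223 cmdb_node2
# add_host_entry /etc/hosts 192.168.31.224 cmdb_node3
```

The same helper would also work for the paas, cmdb, job, and bknetwork domain entries added later in these notes.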

Turn off firewall

systemctl stop iptables
systemctl disable iptables
systemctl stop firewalld
systemctl disable firewalld

Clock calibration

/sbin/ntpdate ntp2.aliyun.com;/sbin/hwclock -w

echo "#Clock synchronization" >> /var/spool/cron/root 
echo "01 00 * * * /sbin/ntpdate ntp2.aliyun.com;/sbin/hwclock -w" >> /var/spool/cron/root

Raise the open file limit

cat <<EOF > /etc/security/limits.d/99-nofile.conf
root soft nofile 102400
root hard nofile 102400
EOF

Restart the machine

reboot

#After the restart, use the sestatus command to confirm that SELinux is disabled
sestatus

#Check whether the number of open files has been modified
ulimit -n

Install rsync

yum -y install rsync

Replace yum source

wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.cloud.tencent.com/repo/centos7_base.repo
wget -O /etc/yum.repos.d/epel.repo http://mirrors.cloud.tencent.com/repo/epel-7.repo
yum clean all
yum makecache

Download the blue whale community version

Download address: https://bk.tencent.com/download/
Select the full version to download, and then upload it to the central control computer 192.168.31.221


Download certificate

Before downloading the certificate, you need to obtain the MAC address of the machine

Get the mac addresses of the three machines

cat /sys/class/net/eth0/address

Fill in the MAC addresses, separated by ASCII semicolons (;), download the certificate, and then upload it to the central control computer
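To avoid typing mistakes when filling in the form, the addresses can be joined into the semicolon-separated form mechanically. This is only a sketch: join_macs is a hypothetical helper name, and the interface name eth0 is taken from the command above and may differ on your machines.

```shell
# Hypothetical helper: read one MAC address per line on stdin and print
# them on a single line separated by semicolons.
join_macs() {
    paste -s -d ';' -
}

# Example: gather the address from each node (ssh will prompt for the root
# password, since password-free login has not been configured yet):
# for h in 192.168.31.221 192.168.31.223 192.168.31.224; do
#     ssh "root@$h" cat /sys/class/net/eth0/address
# done | join_macs
```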

Central computer operation

Extract the downloaded Blue Whale source package to the /data directory

mkdir /data
tar xf bkce_src-4.1.16.tgz -C /data/

After extraction, there are two directories: src and install
src: stores the Blue Whale product software and the open source components it depends on
install: stores the installation and deployment scripts, the parameter configuration used during installation, daily operation and maintenance scripts, etc.

install.config
install.config is the configuration file of the corresponding relationship between the module and the server, which describes which modules are installed on which machines.
Each line has two columns. The first column is the IP address; the second column is a list of module names separated by ASCII commas.
For details, refer to the install.config.3IP.sample file (you can copy install.config.3IP.sample to install.config).

cp -rf /data/install/install.config.3IP.sample /data/install/install.config

#Change the IP addresses in this file to match your machines first
cat /data/install/install.config
192.168.31.221 nginx,appt,rabbitmq,kafka,zk,es,bkdata,consul,fta
192.168.31.223 mongodb,appo,kafka,zk,es,mysql,beanstalk,consul
192.168.31.224 paas,cmdb,job,gse,license,kafka,zk,es,redis,consul,influxdb

Notes:
In the configuration file, the IP address is followed by a space and then the list of services to install on that machine. For machines with multiple intranet IPs, the first intranet IP in the /sbin/ifconfig output is used by default. Deployment assumes standard private addresses by default. If your environment uses non-standard private addresses, refer to the handling method for non-standard intranet IPs.
zk stands for zookeeper and es stands for elasticsearch
gse and redis must be deployed on the same machine
If gse needs cross-cloud support, the machine where gse is located must have an external network IP
When adding machines, you can move services from the configuration above to the new machines to share the load. Make sure that kafka, es, and zk each have exactly 3 instances in total
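Two of the rules above (exactly three instances each of kafka, es, and zk, and gse on the same machine as redis) are easy to break when services are moved between machines. The sketch below checks just those two rules. check_install_config is a hypothetical helper, not something shipped in the install directory, and it assumes the two-column layout described earlier.

```shell
# Hypothetical helper: check an install.config file against two rules from
# the notes above. Prints each problem found and returns non-zero if any.
check_install_config() {
    awk '
    NF == 0 || $1 ~ /^#/ { next }         # skip blank lines and comments
    {
        n = split($2, mods, ",")
        has_gse = 0; has_redis = 0
        for (i = 1; i <= n; i++) {
            count[mods[i]]++
            if (mods[i] == "gse")   has_gse = 1
            if (mods[i] == "redis") has_redis = 1
        }
        if (has_gse != has_redis) {
            print "gse and redis are not on the same machine: " $1
            bad = 1
        }
    }
    END {
        split("kafka es zk", need, " ")
        for (i = 1; i <= 3; i++)
            if (count[need[i]] != 3) {
                print need[i] " appears " (count[need[i]] + 0) " times, expected 3"
                bad = 1
            }
        exit (bad ? 1 : 0)
    }' "$1"
}

# Example: check_install_config /data/install/install.config
```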

globals.env
This file defines the account and password information, feature switches, and other options for the various components. Set them here; the installation scripts read this file during the steps that follow.

vim /data/install/globals.env
# Domain name information
export BK_DOMAIN="abcops.com"            # Blue Whale root domain name (excluding host name). This domain need not exist; later we resolve it temporarily through the hosts file
export PAAS_FQDN="paas.$BK_DOMAIN"       # PaaS full domain name
export CMDB_FQDN="cmdb.$BK_DOMAIN"       # CMDB full domain name
export JOB_FQDN="job.$BK_DOMAIN"         # JOB full domain name
export APPO_FQDN="o.$BK_DOMAIN"          # Official environment full domain name
export APPT_FQDN="t.$BK_DOMAIN"          # Test environment full domain name

# HAS_DNS_SERVER: whether domain names are resolved through a DNS server or through hosts entries
# The default value 0 means you do not have your own DNS server and the mappings are configured through hosts
# In that case, the mappings for paas, cmdb, job and other platforms are added to /etc/hosts on all machines
export HAS_DNS_SERVER=0

# DB information
export MYSQL_USER="root"                # mysql user name
export MYSQL_PASS='123456'              # mysql password
export REDIS_PASS='123456'              # redis password
export MONGODB_USER="root"              # mongodb user name
export MONGODB_PASS='123456'            # mongodb password

# Account information (suggested modification)
export MQ_USER=admin
export MQ_PASS='123456'                  # MQ password
export ZK_USER=bkzk
export ZK_PASS='123456'                  # zookeeper password

export PAAS_ADMIN_USER=admin
export PAAS_ADMIN_PASS='123456'          # PaaS platform login password


Configure password free login

Configure password-free login on the central control computer. Root must be able to log in to each machine at this point; root login must not be disabled.

cd /data/install
bash configure_ssh_without_pass
Generating public/private rsa key pair.
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:uCDzX7a3zHr7uugh5ylZpH69ce/Qa+JXCuEJCjgoDM8 root@cmdb_node1
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|.                |
|oo . .           |
|..E o ... . .    |
| .o ...+S. o o   |
|   + ...o   +.  .|
|    ..oo=.. o..o |
|     .+*.B+o.o+. |
|      .+B+OOo=+  |
+----[SHA256]-----+
Warning: Permanently added '192.168.31.221' (ECDSA) to the list of known hosts.
Warning: Permanently added '192.168.31.223' (ECDSA) to the list of known hosts.
root@192.168.31.223's password:         #Enter the password for 192.168.31.223
Warning: Permanently added '192.168.31.224' (ECDSA) to the list of known hosts.
root@192.168.31.224's password:         #Enter the password for 192.168.31.224

Import the SSL certificates

tar xf /usr/local/src/ssl_certificates.tar.gz -C /data/src/cert/
ls /data/src/cert/
gse_agent.crt       gse_api_client.key      gse_esb_api_client.key  gse_server.key          job_esb_api_client.key  license_prv.key  platform.key
gse_agent.key       gseca.crt               gse_job_api_client.p12  job_ca.crt              job_server.p12          md5.txt
gse_api_client.crt  gse_esb_api_client.crt  gse_server.crt          job_esb_api_client.crt  license_cert.cert       platform.cert

Check whether the environment meets the requirements before installation

After preparing the environment and the deployment configuration as described above, run the following script to verify that everything meets the requirements:

cd /data/install
bash precheck.sh

#If the output looks like the following, the environment is ready. If any item is not OK, investigate it or review the steps above
<<check_ssh_nopass>> has been checked successfully... SKIP
<<check_password>> has been checked successfully... SKIP
start <<check_cert_mac>> ... [OK]
start <<check_selinux>> ... [OK]
start <<check_umask>> ... [OK]
start <<check_get_lan_ip>> ... [OK]
start <<check_rabbitmq_version>> ... Repository cr is listed more than once in the configuration
Repository fasttrack is listed more than once in the configuration
[OK]
start <<check_http_proxy>> ... [OK]
start <<check_open_files_limit>> ... [OK]
start <<check_domain>> ... [OK]
start <<check_rsync>> ... [OK]
start <<check_networkmanager>> ... [OK]
start <<check_firewalld>> ... [OK]
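As the sample output shows, a status can land on a later line than the check name when other messages are printed in between, which makes a long log awkward to scan by eye. The sketch below lists the checks for which no [OK] or SKIP was seen. It relies only on the line shapes shown above; how precheck.sh reports a failing check is not shown in these notes, so treating "no [OK] or SKIP" as "needs attention" is an assumption, and list_unconfirmed_checks is a hypothetical name.

```shell
# Hypothetical helper: read precheck output (files or stdin) and print the
# name of every check that was not followed by [OK] or SKIP.
list_unconfirmed_checks() {
    awk '
    {
        # A line containing <<name>> starts the report for a new check.
        if (match($0, /<<[^>]+>>/)) {
            name = substr($0, RSTART + 2, RLENGTH - 4)
            order[++n] = name
        }
        # The status may be on the same line or on a later one.
        if (name != "" && ($0 ~ /\[OK\]/ || $0 ~ /SKIP$/)) done[name] = 1
    }
    END {
        for (i = 1; i <= n; i++) if (!(order[i] in done)) print order[i]
    }' "$@"
}

# Example:
# cd /data/install && bash precheck.sh 2>&1 | tee /tmp/precheck.log
# list_unconfirmed_checks /tmp/precheck.log
```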


Deploy blue whale

Perform the following operations in sequence to install the Blue Whale base platform.
If a step reports an error or fails, fix the error as prompted and rerun the same command (the installation resumes from the breakpoint).
Every step must complete successfully before you continue, because the Blue Whale platforms depend on one another in installation order. If an earlier platform did not install successfully, continuing only produces more errors.
For the commands needed to repair errors, refer to the maintenance documentation.
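The "stop at the first failure, fix it, rerun the same step" rule can be sketched as a small wrapper. This is illustrative only: run_in_order is a hypothetical helper, and it assumes the installer exits with a non-zero status when a step fails, which these notes do not confirm. Running each command by hand, as done below, remains the safer route.

```shell
# Hypothetical helper: run "INSTALLER STEP" for each step in order and stop
# at the first one that fails.
# Assumption: the installer returns a non-zero exit status on failure.
run_in_order() {
    local installer="$1" step
    shift
    for step in "$@"; do
        echo "==> $installer $step"
        # $step is deliberately unquoted so that "saas-o bk_nodeman"
        # is passed to the installer as two arguments.
        if ! "$installer" $step; then
            echo "step failed: $step (fix the error, then rerun this step)" >&2
            return 1
        fi
    done
}

# Example, following the order used in these notes:
# cd /data/install && run_in_order ./bk_install paas cmdb job app_mgr
```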

Deploy paas platform

Central computer operation

cd /data/install/
./bk_install paas

where do you want to install blueking products.
enter a absolute path [/data/bkce]:                     #Press enter here to confirm the installation path of paas

If you agree to the blue whale agreement, enter "yes"


After the PaaS platform is installed, the script prints a completion message. To access the platform, add a hosts entry on your own computer.

This example is on a Mac, so the entry can be added from the command line. On Windows, open the hosts file and add the same entry.

tail -1 /etc/hosts
192.168.31.221 paas.abcops.com

On the login page, log in with the account and password "admin/123456" specified earlier in the globals.env configuration file

After the deployment completes, log in to the three machines again to reload the environment variables, because the host names have been changed

#cmdb_node1 is changed to
[root@nginx-1 ~]# hostname
nginx-1

#cmdb_node2 is changed to
[root@mongodb-1 ~]# hostname
mongodb-1

#cmdb_node3 is changed to
[root@paas-1 ~]# hostname
paas-1

Deploy cmdb platform

Central computer operation

[root@nginx-1 ~]# cd /data/install/
[root@nginx-1 install]# ./bk_install cmdb


[192.168.31.224] server      cmdb_adminserver                 RUNNING   pid 32168, uptime 0:14:53
[192.168.31.224] server      cmdb_apiserver                   RUNNING   pid 32160, uptime 0:14:53
[192.168.31.224] server      cmdb_auditcontoller              RUNNING   pid 32149, uptime 0:14:53
[192.168.31.224] server      cmdb_datacollection              RUNNING   pid 32163, uptime 0:14:53
[192.168.31.224] server      cmdb_eventserver                 RUNNING   pid 32161, uptime 0:14:53
[192.168.31.224] server      cmdb_hostcontroller              RUNNING   pid 32143, uptime 0:14:53
[192.168.31.224] server      cmdb_hostserver                  RUNNING   pid 32144, uptime 0:14:53
[192.168.31.224] server      cmdb_objectcontroller            RUNNING   pid 32146, uptime 0:14:53
[192.168.31.224] server      cmdb_proccontroller              RUNNING   pid 32170, uptime 0:14:53
[192.168.31.224] server      cmdb_procserver                  RUNNING   pid 32148, uptime 0:14:53
[192.168.31.224] server      cmdb_toposerver                  RUNNING   pid 32145, uptime 0:14:53
[192.168.31.224] server      cmdb_webserver                   RUNNING   pid 32147, uptime 0:14:53

If no errors were reported in the steps above, you can now visit the configuration platform at http://cmdb.abcops.com:80

[Screenshot of the output returned after installation: image not available]

Add the hosts entries. This example is on a Mac, so they can be added from the command line. On Windows, open the hosts file and add the same entries.

tail -2 /etc/hosts
192.168.31.221 paas.abcops.com
192.168.31.221 cmdb.abcops.com

Log in on the login page, then click the configuration platform


Deploy job platform

Central computer operation

[root@nginx-1 install]# cd /data/install/
[root@nginx-1 install]# ./bk_install job

[Screenshot of the successful job installation: image not available]

Add the hosts entries

tail -3 /etc/hosts
192.168.31.221 paas.abcops.com
192.168.31.221 cmdb.abcops.com
192.168.31.221 job.abcops.com

After logging in, click the operation platform

Deploy app_mgr platform

[root@nginx-1 install]# cd /data/install/
[root@nginx-1 install]# ./bk_install app_mgr

After this step completes, the successfully activated servers are shown under server information and third-party service information in the developer center
SaaS applications (except Blue Whale monitoring and log retrieval) can now also be uploaded and deployed

[Screenshot of the completed installation: image not available]

If there are no errors in the steps above, the official environment and the test environment are now deployed. You can:

  1. Deploy the node management app with ./bk_install saas-o bk_nodeman, or
  2. Deploy apps through the developer center
    To install Blue Whale monitoring and log retrieval, you must first run ./bk_install bkdata

Deploy the node management app (saas-o bk_nodeman)

[root@nginx-1 install]# ./bk_install saas-o bk_nodeman

Deploy bkdata

Install the basic module of blue whale data platform and its dependent services

[root@nginx-1 install]# ./bk_install bkdata

If no errors were reported in the steps above, the bkdata deployment is complete. You can now:
 1. Deploy the Blue Whale monitoring app with ./bk_install saas-o bk_monitor, or
 2. Deploy the Blue Whale monitoring app through the developer center

Install blue whale monitoring app
Use ./bk_install saas-o bk_monitor to deploy the Blue Whale monitoring app

[root@nginx-1 install]# ./bk_install saas-o bk_monitor

Deploy fta background

Install self-healing background service

[root@nginx-1 install]# ./bk_install fta
 If no errors were reported in the steps above, the fault self-healing background deployment is complete. You can now:
 1. Deploy the fault self-healing app with ./bk_install saas-o bk_fta, or
 2. Deploy the fault self-healing app through the developer center

Install fault self-healing app
Note that the command printed by the script is "./bk_install saas-o bk_fta", but this package name is incomplete. The installation packages are stored in the /data/src/official_saas directory

[root@nginx-1 install]# ls /data/src/official_saas/
bk_fta_solutions_V4.1.15.tar.gz  bk_log_search_V1.1.24.tar.gz  bk_monitor_V1.4.73.tar.gz  bk_nodeman_V1.0.80.tar.gz  bk_sops_V3.1.32-ce.tar.gz

The following command should be executed

[root@nginx-1 install]# ./bk_install saas-o bk_fta_solutions
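In the listing above, each package file name appears to be the app name followed by _V and a version number, which is how bk_fta_solutions was read off by eye. A small sketch that derives the names the same way follows. saas_app_code is a hypothetical helper, and the naming pattern is inferred only from the five files shown, so treat the result as a hint rather than an official mapping.

```shell
# Hypothetical helper: derive the app name from a SaaS package file name by
# dropping the directory and everything from "_V<digit>" onward.
saas_app_code() {
    basename "$1" | sed 's/_V[0-9].*$//'
}

# Example: list the app names available for "./bk_install saas-o <name>"
# for pkg in /data/src/official_saas/*.tar.gz; do saas_app_code "$pkg"; done
```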

Deploy gse_agent

Reinstall gse_agent and register the correct cluster module to the configuration platform

[root@nginx-1 install]# ./bkcec install gse_agent

Deploy saas

Deploy the official SaaS apps to the official environment (the command automatically deploys the SaaS packages found in the /data/src/official_saas/ directory)

[root@nginx-1 install]# ./bkcec install saas-o

Refresh the platform page now; 7 modules are listed

Installation of third party cooperation components

Install network management platform

Download address: https://bk.tencent.com/download_sdk/

After downloading, upload it to the central control computer

#Extract the archives
[root@nginx-1 src]# tar xf bknetwork.tgz -C /data/src/
[root@nginx-1 src]# tar xf /data/src/bknetwork/bknetwork-3.6.1.tgz -C /data/src/

#Synchronize package content
[root@nginx-1 src]# rsync -a /data/src/bknetwork/install/  /data/install/

This file stores the full domain name of the Blue Whale network management platform. Modify it to suit your environment, or leave it unchanged

[root@nginx-1 src]# cat /data/install/third/globals_bknetwork.env
# vim:ft=sh

# Domain name information (blue whale partner application)
export BKNETWORK_FQDN="bknetwork.$BK_DOMAIN"          # Complete domain name of blue whale network management platform

Deploy network management

[root@nginx-1 src]# cd /data/install/
[root@nginx-1 install]# ./bkco_install bknetwork
[192.168.31.221]20190926-144533 43   please add 'bknetwork' to 'install.config'     #The error message says that bknetwork should be synchronized to install.config

[root@nginx-1 install]# cat install.config
192.168.31.221 nginx,appt,rabbitmq,kafka,zk,es,bkdata,consul,fta
192.168.31.223 mongodb,appo,kafka,zk,es,mysql,beanstalk,consul,bknetwork            #I added it here to deploy bknetwork to the 31.223 machine
192.168.31.224 paas,cmdb,job,gse,license,kafka,zk,es,redis,consul,influxdb

#Install again
[root@nginx-1 install]# ./bkco_install bknetwork

After a successful installation, add the following hosts entry as the last line on the three cmdb machines and on your own device
192.168.31.221 bknetwork.abcops.com

[root@nginx-1 nginx]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.31.221 cmdb_node1
192.168.31.223 cmdb_node2
192.168.31.224 cmdb_node3
192.168.31.221   nginx-1
192.168.31.221 paas.abcops.com
192.168.31.221 cmdb.abcops.com
192.168.31.221 job.abcops.com
192.168.31.221   rbtnode1
192.168.31.221 bknetwork.abcops.com

[root@mongodb-1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.31.221 cmdb_node1
192.168.31.223 cmdb_node2
192.168.31.224 cmdb_node3
192.168.31.223   mongodb-1
192.168.31.221 paas.abcops.com
192.168.31.221 cmdb.abcops.com
192.168.31.221 job.abcops.com
192.168.31.221 bknetwork.abcops.com

[root@paas-1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.31.217    cmdb_node1
192.168.31.221 cmdb_node1
192.168.31.223 cmdb_node2
192.168.31.224 cmdb_node3
192.168.31.224   paas-1
192.168.31.221 paas.abcops.com
192.168.31.221 cmdb.abcops.com
192.168.31.221 job.abcops.com
192.168.31.221 bknetwork.abcops.com

Then access bknetwork.abcops.com from your own device to verify that the hosts entry works.

Tags: Linux Operation & Maintenance Docker

Posted by shawngibson on Mon, 20 Sep 2021 10:44:46 +0530