prometheus-operator provides a Probe CRD that can be used for black-box monitoring; the actual probing is performed by blackbox-exporter.
blackbox-exporter is the black-box monitoring solution provided by the Prometheus community. It lets users probe network targets over HTTP, HTTPS, TCP, ICMP, etc.
1. Overall structure
The workflow is as follows:
- First, the user creates a Probe CRD object that specifies parameters such as the probe module and the probe targets;
- Then, prometheus-operator, which watches Probe objects, notices the new object, generates the corresponding Prometheus scrape configuration, and reloads it into Prometheus;
- Finally, Prometheus scrapes blackbox-exporter at the URL /probe?target={probe target}&module={probe module}. blackbox-exporter then probes the target and returns the result in metrics format.
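The last step can be sketched in Go. This is only an illustration of how the scrape URL is put together from the exporter address, the target, and the module; the helper name probeURL and the example address are assumptions made for this sketch, not part of Prometheus or blackbox-exporter.

```go
package main

import (
	"fmt"
	"net/url"
)

// probeURL builds the URL requested from blackbox-exporter.
// exporterAddr is host:port; target and module become query parameters.
// (Hypothetical helper, written for illustration only.)
func probeURL(exporterAddr, target, module string) string {
	q := url.Values{}
	q.Set("target", target)
	q.Set("module", module)
	u := url.URL{
		Scheme:   "http",
		Host:     exporterAddr,
		Path:     "/probe",
		RawQuery: q.Encode(), // Encode sorts the keys and escapes the values
	}
	return u.String()
}

func main() {
	fmt.Println(probeURL("blackbox-exporter.monitoring:19115", "prometheus.io", "http_2xx"))
}
```

Because the query string is escaped, a target that is itself a URL (for example https://www.baidu.com) is passed through safely as a single parameter value.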
2. Deploy prometheus-operator
Use kube-prometheus to deploy prometheus-operator.
    # git clone -b release-0.8 git@github.com:prometheus-operator/kube-prometheus.git
    # cd kube-prometheus
First, deploy the CRD:
    # kubectl apply -f manifests/setup
    # kubectl get crd |grep coreos
    alertmanagerconfigs.monitoring.coreos.com   2022-05-19T06:44:00Z
    alertmanagers.monitoring.coreos.com         2022-05-19T06:44:01Z
    podmonitors.monitoring.coreos.com           2022-05-19T06:44:01Z
    probes.monitoring.coreos.com                2022-05-19T06:44:01Z
    prometheuses.monitoring.coreos.com          2022-05-19T06:45:04Z
    prometheusrules.monitoring.coreos.com       2022-05-19T06:44:01Z
    servicemonitors.monitoring.coreos.com       2022-05-19T06:44:01Z
    thanosrulers.monitoring.coreos.com          2022-05-19T06:44:02Z
As you can see, the CRD of probes.monitoring.coreos.com is deployed.
Then, deploy prometheus-operator:
    # kubectl apply -f manifests/
    # kubectl get pods -n monitoring
    NAME                                  READY   STATUS    RESTARTS      AGE
    alertmanager-main-0                   2/2     Running   0             46m
    alertmanager-main-1                   2/2     Running   0             46m
    alertmanager-main-2                   2/2     Running   0             46m
    blackbox-exporter-5cb5d7479d-mznws    3/3     Running   0             49m
    grafana-d595885ff-cf49m               1/1     Running   0             49m
    kube-state-metrics-685d769786-tkv7l   3/3     Running   0             22m
    node-exporter-4d6mq                   2/2     Running   0             49m
    node-exporter-8cr4v                   2/2     Running   0             49m
    node-exporter-krr2h                   2/2     Running   0             49m
    prometheus-adapter-6fd94587c9-6tsgb   0/1     Running   0             3s
    prometheus-adapter-6fd94587c9-8zm2l   1/1     Running   4 (13m ago)   13m
    prometheus-k8s-0                      2/2     Running   0             46m
    prometheus-k8s-1                      2/2     Running   0             46m
    prometheus-operator-7684989c7-qt2sp   2/2     Running   0             49m
After the deployment is complete, expose the prometheus-k8s Service as a NodePort so that the Prometheus UI can be accessed.
3. Configuration of Blackbox-exporter
blackbox-exporter needs a configuration file to be passed in when it runs.
The configuration file lists the probes that blackbox-exporter supports, such as icmp and tcp:
- Each probe configuration is called a module and is written in YAML;
Each module contains:
- prober: the probe type (http, tcp, dns, icmp, ...);
- timeout: the probe timeout;
- ...
A typical blackbox-exporter configuration file:
    apiVersion: v1
    data:
      config.yml: |-
        "modules":
          "http_2xx":            # module name
            "http":
              "preferred_ip_protocol": "ip4"
            "prober": "http"
          "http_post_2xx":
            "http":
              "method": "POST"   # POST request
              "preferred_ip_protocol": "ip4"
            "prober": "http"
          "tcp_connect":         # tcp connection
            "prober": "tcp"
            "timeout": "10s"
            "tcp":
              "preferred_ip_protocol": "ip4"
          "dns":
            "prober": "dns"
            "dns":
              "transport_protocol": "udp"
              "preferred_ip_protocol": "ip4"
              "query_name": "kubernetes.default.svc.cluster.local"
          "icmp":
            "prober": "icmp"
    kind: ConfigMap
    metadata:
      labels:
        app.kubernetes.io/component: exporter
        app.kubernetes.io/name: blackbox-exporter
        app.kubernetes.io/part-of: kube-prometheus
        app.kubernetes.io/version: 0.18.0
      name: blackbox-exporter-configuration
      namespace: monitoring
4. Create a Probe object
1. Probe ping
Create a ping task:
    apiVersion: monitoring.coreos.com/v1
    kind: Probe
    metadata:
      name: ping
      namespace: monitoring
    spec:
      jobName: ping    # job name
      prober:          # address of blackbox-exporter
        url: blackbox-exporter.monitoring:19115
      module: icmp     # probe module defined in the configuration file
      targets:         # targets (static configuration or ingress configuration)
        # ingress <Object>
        staticConfig:  # if ingress is also configured, the static configuration takes precedence
          static:
            - https://www.baidu.com
After a short wait, the job appears in the Prometheus UI.
The corresponding configuration generated for Prometheus:
    - job_name: probe/monitoring/ping
      honor_timestamps: true
      params:
        module:
        - icmp
      scrape_interval: 30s
      scrape_timeout: 10s
      metrics_path: /probe
      scheme: http
      follow_redirects: true
      relabel_configs:
      - source_labels: [job]
        separator: ;
        regex: (.*)
        target_label: __tmp_prometheus_job_name
        replacement: $1
        action: replace
      - separator: ;
        regex: (.*)
        target_label: job
        replacement: ping
        action: replace
      - source_labels: [__address__]
        separator: ;
        regex: (.*)
        target_label: __param_target
        replacement: $1
        action: replace
      - source_labels: [__param_target]
        separator: ;
        regex: (.*)
        target_label: instance
        replacement: $1
        action: replace
      - separator: ;
        regex: (.*)
        target_label: __address__
        replacement: blackbox-exporter.monitoring:19115
        action: replace
      static_configs:
      - targets:
        - https://www.baidu.com
        labels:
          namespace: monitoring
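The relabel_configs in the generated job are what turn a static target into a request to blackbox-exporter. The following Go sketch walks through the five rules by hand for a single target. It is a simplified illustration, not Prometheus' relabeling engine, and the function name relabelProbeTarget is an assumption made for this example.

```go
package main

import "fmt"

// relabelProbeTarget mimics, step by step, the effect of the five relabel
// rules of a generated probe job on one static target.
// (Hypothetical helper; Prometheus applies these rules internally.)
func relabelProbeTarget(target, scrapeJob, jobName, exporterAddr string) map[string]string {
	labels := map[string]string{
		"__address__": target,    // initial value taken from static_configs
		"job":         scrapeJob, // initial value is the job_name
	}
	// Rule 1: keep the original job name in a temporary label.
	labels["__tmp_prometheus_job_name"] = labels["job"]
	// Rule 2: set job to the jobName from the Probe spec.
	labels["job"] = jobName
	// Rule 3: the target becomes the ?target= query parameter.
	labels["__param_target"] = labels["__address__"]
	// Rule 4: instance shows the probed target, not the exporter.
	labels["instance"] = labels["__param_target"]
	// Rule 5: the address actually scraped is blackbox-exporter.
	labels["__address__"] = exporterAddr
	return labels
}

func main() {
	l := relabelProbeTarget("https://www.baidu.com", "probe/monitoring/ping",
		"ping", "blackbox-exporter.monitoring:19115")
	for _, k := range []string{"job", "instance", "__param_target", "__address__"} {
		fmt.Printf("%s=%s\n", k, l[k])
	}
}
```

The net effect is that the instance label keeps the probed target while the HTTP request itself goes to blackbox-exporter.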
2. Probe HTTP
Create an HTTP task:
    apiVersion: monitoring.coreos.com/v1
    kind: Probe
    metadata:
      name: domain-probe
      namespace: monitoring
    spec:
      jobName: domain-probe  # job name
      prober:                # address of blackbox-exporter
        url: blackbox-exporter:19115
      module: http_2xx       # probe module defined in the configuration file
      targets:               # targets (static configuration or ingress configuration)
        # ingress <Object>
        staticConfig:        # if ingress is also configured, the static configuration takes precedence
          static:
            - prometheus.io
After a short wait, the job appears in the Prometheus UI.
The corresponding configuration generated for Prometheus:
    - job_name: probe/monitoring/domain-probe
      honor_timestamps: true
      params:
        module:
        - http_2xx
      scrape_interval: 30s
      scrape_timeout: 10s
      metrics_path: /probe
      scheme: http
      follow_redirects: true
      relabel_configs:
      - source_labels: [job]
        separator: ;
        regex: (.*)
        target_label: __tmp_prometheus_job_name
        replacement: $1
        action: replace
      - separator: ;
        regex: (.*)
        target_label: job
        replacement: domain-probe
        action: replace
      - source_labels: [__address__]
        separator: ;
        regex: (.*)
        target_label: __param_target
        replacement: $1
        action: replace
      - source_labels: [__param_target]
        separator: ;
        regex: (.*)
        target_label: instance
        replacement: $1
        action: replace
      - separator: ;
        regex: (.*)
        target_label: __address__
        replacement: blackbox-exporter:19115
        action: replace
      static_configs:
      - targets:
        - prometheus.io
        labels:
          namespace: monitoring
3. View the scraped metrics
You can also query blackbox-exporter directly with curl, passing the probe module and the probe target. blackbox-exporter runs the probe and returns the result in metrics format (the URL is quoted so that the shell does not interpret the &):
    curl 'http://192.168.0.1:31392/probe?target=prometheus.io&module=http_2xx'
    # HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
    # TYPE probe_dns_lookup_time_seconds gauge
    probe_dns_lookup_time_seconds 0.275433879
    # HELP probe_duration_seconds Returns how long the probe took to complete in seconds
    # TYPE probe_duration_seconds gauge
    probe_duration_seconds 2.373368898
    # HELP probe_failed_due_to_regex Indicates if probe failed due to regex
    # TYPE probe_failed_due_to_regex gauge
    probe_failed_due_to_regex 0
    # HELP probe_http_content_length Length of http content response
    # TYPE probe_http_content_length gauge
    probe_http_content_length -1
    # HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
    # TYPE probe_http_duration_seconds gauge
    probe_http_duration_seconds{phase="connect"} 0.400100412
    probe_http_duration_seconds{phase="processing"} 0.509387522
    probe_http_duration_seconds{phase="resolve"} 0.365111732
    probe_http_duration_seconds{phase="tls"} 1.200170298
    probe_http_duration_seconds{phase="transfer"} 0.000451343
    # HELP probe_http_redirects The number of redirects
    # TYPE probe_http_redirects gauge
    probe_http_redirects 1
    # HELP probe_http_ssl Indicates if SSL was used for the final redirect
    # TYPE probe_http_ssl gauge
    probe_http_ssl 1
    # HELP probe_http_status_code Response HTTP status code
    # TYPE probe_http_status_code gauge
    probe_http_status_code 200
    # HELP probe_http_uncompressed_body_length Length of uncompressed response body
    # TYPE probe_http_uncompressed_body_length gauge
    probe_http_uncompressed_body_length 15757
    # HELP probe_http_version Returns the version of HTTP of the probe response
    # TYPE probe_http_version gauge
    probe_http_version 2
    # HELP probe_ip_addr_hash Specifies the hash of IP address. It's useful to detect if the IP address changes.
    # TYPE probe_ip_addr_hash gauge
    probe_ip_addr_hash 2.590428662e+09
    # HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
    # TYPE probe_ip_protocol gauge
    probe_ip_protocol 4
    # HELP probe_ssl_earliest_cert_expiry Returns earliest SSL cert expiry in unixtime
    # TYPE probe_ssl_earliest_cert_expiry gauge
    probe_ssl_earliest_cert_expiry 1.686095999e+09
    # HELP probe_ssl_last_chain_expiry_timestamp_seconds Returns last SSL chain expiry in timestamp seconds
    # TYPE probe_ssl_last_chain_expiry_timestamp_seconds gauge
    probe_ssl_last_chain_expiry_timestamp_seconds 1.686095999e+09
    # HELP probe_ssl_last_chain_info Contains SSL leaf certificate information
    # TYPE probe_ssl_last_chain_info gauge
    probe_ssl_last_chain_info{fingerprint_sha256="99ac7e7bf8d38ce32c95b2b3c965a9d2b479b0bf2e3b40c576173131a249f877"} 1
    # HELP probe_success Displays whether or not the probe was a success
    # TYPE probe_success gauge
    probe_success 1
    # HELP probe_tls_version_info Contains the TLS version used
    # TYPE probe_tls_version_info gauge
    probe_tls_version_info{version="TLS 1.3"} 1
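To show how such a response can be consumed, here is a minimal Go sketch that reads the text format and pulls out individual values such as probe_success. It is a deliberately naive parser written for this article (the function name parseSamples is an assumption); it assumes one sample per line with no timestamps and no spaces inside label values, which holds for the output above. Real code would normally use the Prometheus client libraries instead.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSamples reads Prometheus text-format output and returns a map from
// the series (metric name plus label set, exactly as printed) to its value.
// Comment lines (# HELP / # TYPE) and blank lines are skipped.
// Simplification: lines are assumed to carry no trailing timestamp.
func parseSamples(text string) (map[string]float64, error) {
	samples := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The value is everything after the last space on the line.
		i := strings.LastIndex(line, " ")
		if i < 0 {
			return nil, fmt.Errorf("malformed sample line: %q", line)
		}
		v, err := strconv.ParseFloat(line[i+1:], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", line, err)
		}
		samples[strings.TrimSpace(line[:i])] = v
	}
	return samples, sc.Err()
}

func main() {
	out := "# HELP probe_success Displays whether or not the probe was a success\n" +
		"# TYPE probe_success gauge\n" +
		"probe_success 1\n" +
		"probe_http_status_code 200\n"
	samples, err := parseSamples(out)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("success:", samples["probe_success"] == 1,
		"status:", samples["probe_http_status_code"])
}
```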
5. Source code analysis of Probe
prometheus-operator handles Probe CRD objects in much the same way as its other CRD objects:
- First, an informer watches for changes to Probe CRD objects;
- Then, the operator generates a new Prometheus configuration from the changed objects and reloads it into Prometheus.
1. Watch the Probe CRD object
Changes to Probe CRD objects are watched through an Informer.
First, create the Informer:
    // prometheus-operator/pkg/prometheus/operator.go
    // New creates a new controller.
    func New(ctx context.Context, conf operator.Config, logger log.Logger, r prometheus.Registerer) (*Operator, error) {
        ...
        c := &Operator{
            ...
        }
        ...
        c.probeInfs, err = informers.NewInformersForResource(
            informers.NewMonitoringInformerFactories(
                c.config.Namespaces.AllowList,
                c.config.Namespaces.DenyList,
                mclient,
                resyncPeriod,
                nil,
            ),
            monitoringv1.SchemeGroupVersion.WithResource(monitoringv1.ProbeName),
        )
        if err != nil {
            return nil, errors.Wrap(err, "error creating probe informers")
        }
        ...
        return c, nil
    }
Then, add an event handler for the Informer:
    // prometheus-operator/pkg/prometheus/operator.go
    // addHandlers adds the eventhandlers to the informers.
    func (c *Operator) addHandlers() {
        ...
        c.probeInfs.AddEventHandler(cache.ResourceEventHandlerFuncs{
            AddFunc:    c.handleBmonAdd,
            UpdateFunc: c.handleBmonUpdate,
            DeleteFunc: c.handleBmonDelete,
        })
        ...
    }
Take a look at the Add event handler:
- It enqueues the namespace in which the object resides;
    // TODO: Don't enqueue just for the namespace
    func (c *Operator) handleBmonAdd(obj interface{}) {
        if o, ok := c.getObject(obj); ok {
            level.Debug(c.logger).Log("msg", "Probe added")
            c.metrics.TriggerByCounter(monitoringv1.ProbesKind, "add").Inc()
            c.enqueueForMonitorNamespace(o.GetNamespace())
        }
    }
2. Generate Prometheus configuration
In prometheus-operator, worker goroutines take the changed objects off the queue and reconcile them.
    // prometheus-operator/pkg/prometheus/operator.go
    func (c *Operator) sync(ctx context.Context, key string) error {
        ...
        // Handle Probe objects here
        if err := c.createOrUpdateConfigurationSecret(ctx, p, ruleConfigMapNames, assetStore); err != nil {
            return errors.Wrap(err, "creating config failed")
        }
        ...
    }
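The event-handler-plus-worker pattern can be illustrated with a toy example. The sketch below is not the operator's code and does not use client-go; keyQueue, drain, and the key "monitoring/k8s" are names invented here. It only shows the idea that several events which map to the same key lead to a single reconcile, under the assumption that the queue drops keys that are already pending.

```go
package main

import "fmt"

// keyQueue is a toy, single-goroutine stand-in for a deduplicating work
// queue: a key that is already pending is not queued a second time.
type keyQueue struct {
	pending map[string]bool
	order   []string
}

func newKeyQueue() *keyQueue {
	return &keyQueue{pending: make(map[string]bool)}
}

// Add queues key unless it is already waiting to be processed.
func (q *keyQueue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest pending key; ok is false when the queue is empty.
func (q *keyQueue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

// drain plays the role of a worker: it takes keys off the queue, calls
// reconcile for each one, and returns the keys that were processed.
func drain(q *keyQueue, reconcile func(key string) error) []string {
	var processed []string
	for {
		key, ok := q.Get()
		if !ok {
			return processed
		}
		if err := reconcile(key); err != nil {
			fmt.Println("reconcile failed:", key, err)
			continue
		}
		processed = append(processed, key)
	}
}

func main() {
	q := newKeyQueue()
	// Three Probe events that all map to the same (hypothetical) key.
	for i := 0; i < 3; i++ {
		q.Add("monitoring/k8s")
	}
	processed := drain(q, func(key string) error {
		fmt.Println("regenerating config for", key)
		return nil
	})
	fmt.Println(len(processed), "reconcile(s)")
}
```

Keeping the handlers cheap and doing the real work in the worker means a burst of Probe changes does not trigger a configuration rebuild per event.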
For Probe objects, the operator generates the Prometheus configuration from their content and writes it into a Secret.
In other words, the Prometheus configuration is stored in a Secret object, and the reloader sidecar then reloads the content of that Secret into Prometheus.
    func (c *Operator) createOrUpdateConfigurationSecret(ctx context.Context, p *monitoringv1.Prometheus, ruleConfigMapNames []string, store *assets.Store) error {
        ...
        // Get the Probe object
        bmons, err := c.selectProbes(ctx, p, store)
        if err != nil {
            return errors.Wrap(err, "selecting Probes failed")
        }
        ...
        // generate new configuration
        conf, err := c.configGenerator.generateConfig(
            p,
            smons,
            pmons,
            bmons,
            store.BasicAuthAssets,
            store.BearerTokenAssets,
            additionalScrapeConfigs,
            additionalAlertRelabelConfigs,
            additionalAlertManagerConfigs,
            ruleConfigMapNames,
        )
        if err != nil {
            return errors.Wrap(err, "generating config failed")
        }
        // Write the configuration to the Secret object
        s := makeConfigSecret(p, c.config)
        ...
    }
The specific process of generating the Prometheus configuration from the Probe object:
    // pkg/prometheus/promcfg.go
    func (cg *configGenerator) generateProbeConfig(
        version semver.Version,
        m *v1.Probe,
        apiserverConfig *v1.APIServerConfig,
        basicAuthSecrets map[string]assets.BasicAuthCredentials,
        bearerTokens map[string]assets.BearerToken,
        ignoreHonorLabels bool,
        overrideHonorTimestamps bool,
        ignoreNamespaceSelectors bool,
        enforcedNamespaceLabel string) yaml.MapSlice {

        jobName := fmt.Sprintf("probe/%s/%s", m.Namespace, m.Name)
        cfg := yaml.MapSlice{
            {
                Key:   "job_name",
                Value: jobName,
            },
        }
        ...
        // Configuration of metrics_path
        path := "/probe"
        if m.Spec.ProberSpec.Path != "" {
            path = m.Spec.ProberSpec.Path
        }
        cfg = append(cfg, yaml.MapItem{Key: "metrics_path", Value: path})
        ...
        // configuration of params
        cfg = append(cfg, yaml.MapItem{Key: "params", Value: yaml.MapSlice{
            {Key: "module", Value: []string{m.Spec.Module}},
        }})
        ...
        // Configuration of static_configs
        if m.Spec.Targets.StaticConfig != nil {
            staticConfig := yaml.MapSlice{
                {Key: "targets", Value: m.Spec.Targets.StaticConfig.Targets},
            }
            if m.Spec.Targets.StaticConfig.Labels != nil {
                if _, ok := m.Spec.Targets.StaticConfig.Labels["namespace"]; !ok {
                    m.Spec.Targets.StaticConfig.Labels["namespace"] = m.Namespace
                }
            } else {
                m.Spec.Targets.StaticConfig.Labels = map[string]string{"namespace": m.Namespace}
            }
            staticConfig = append(staticConfig, yaml.MapSlice{
                {Key: "labels", Value: m.Spec.Targets.StaticConfig.Labels},
            }...)
            cfg = append(cfg, yaml.MapItem{
                Key:   "static_configs",
                Value: []yaml.MapSlice{staticConfig},
            })
            ...
        }
        ...
        return cfg
    }
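To make the excerpt easier to experiment with, the following is a self-contained Go sketch of the same steps: job name, metrics_path default, params, and static_configs with the namespace label filled in. The types item, mapSlice, and probe are simplified stand-ins invented for this example (they replace yaml.MapItem, yaml.MapSlice, and the Probe CRD type), and everything elided in the excerpt above is left out.

```go
package main

import "fmt"

// item and mapSlice are local stand-ins for an ordered YAML map.
type item struct {
	Key   string
	Value interface{}
}
type mapSlice []item

// probe is a hypothetical, trimmed-down view of the Probe fields used here.
type probe struct {
	Namespace  string
	Name       string
	ProberPath string // optional; "/probe" is used when empty
	Module     string
	Targets    []string
	Labels     map[string]string
}

// generateProbeConfig builds the ordered scrape-config entries for one probe.
func generateProbeConfig(p probe) mapSlice {
	cfg := mapSlice{
		{Key: "job_name", Value: fmt.Sprintf("probe/%s/%s", p.Namespace, p.Name)},
	}

	// metrics_path defaults to /probe.
	path := "/probe"
	if p.ProberPath != "" {
		path = p.ProberPath
	}
	cfg = append(cfg, item{Key: "metrics_path", Value: path})

	// The module is passed to blackbox-exporter as a query parameter.
	cfg = append(cfg, item{Key: "params", Value: mapSlice{
		{Key: "module", Value: []string{p.Module}},
	}})

	// Copy the labels and add the namespace label if the user did not set it.
	labels := make(map[string]string, len(p.Labels)+1)
	for k, v := range p.Labels {
		labels[k] = v
	}
	if _, ok := labels["namespace"]; !ok {
		labels["namespace"] = p.Namespace
	}
	cfg = append(cfg, item{Key: "static_configs", Value: []mapSlice{{
		{Key: "targets", Value: p.Targets},
		{Key: "labels", Value: labels},
	}}})
	return cfg
}

func main() {
	cfg := generateProbeConfig(probe{
		Namespace: "monitoring",
		Name:      "ping",
		Module:    "icmp",
		Targets:   []string{"https://www.baidu.com"},
	})
	for _, it := range cfg {
		fmt.Printf("%s: %v\n", it.Key, it.Value)
	}
}
```

One deliberate difference from the excerpt: the sketch copies the label map instead of writing the namespace label back into the input object, which keeps the function free of side effects.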