Usage and internals of the Probe CRD in prometheus-operator

prometheus-operator provides a Probe CRD for black-box monitoring; the probing itself is performed by blackbox-exporter.

blackbox-exporter is a black-box monitoring solution provided by the Prometheus community. It lets users probe targets over HTTP, HTTPS, TCP, ICMP, and other protocols.

1. Overall structure

In practice, the flow is:

  • First, the user creates a Probe object that specifies parameters such as the probe module and the probe targets;
  • Then, prometheus-operator watches for the new Probe object, generates the corresponding Prometheus scrape configuration, and reloads it into Prometheus;
  • Finally, Prometheus scrapes blackbox-exporter at /probe?target={probe target}&module={probe module}. On each scrape, blackbox-exporter probes the target and returns the result in metrics format.
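
The scrape URL from the last step can be sketched in Go as follows. This is only an illustration of how the target and module end up as query parameters; the function name probeURL and its parameters are assumptions made for this example and are not part of prometheus-operator or Prometheus.

```go
package main

import (
	"fmt"
	"net/url"
)

// probeURL builds the URL that would be requested on blackbox-exporter.
// proberAddr, target and module are illustrative parameters (hypothetical names).
func probeURL(proberAddr, target, module string) string {
	q := url.Values{}
	q.Set("target", target)
	q.Set("module", module)
	u := url.URL{
		Scheme:   "http",
		Host:     proberAddr,
		Path:     "/probe",
		RawQuery: q.Encode(), // Encode sorts the keys and escapes the values
	}
	return u.String()
}

func main() {
	fmt.Println(probeURL("blackbox-exporter.monitoring:19115", "prometheus.io", "http_2xx"))
}
```

Because the values are query-escaped, a target that is itself a URL (for example https://www.baidu.com) is passed safely as a single parameter.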

2. Deploy prometheus-operator

Use kube-prometheus to deploy prometheus-operator.

# git clone -b release-0.8 git@github.com:prometheus-operator/kube-prometheus.git
# cd kube-prometheus

First, deploy the CRD:

# kubectl apply -f manifests/setup
# kubectl get crd |grep coreos
alertmanagerconfigs.monitoring.coreos.com            2022-05-19T06:44:00Z
alertmanagers.monitoring.coreos.com                  2022-05-19T06:44:01Z
podmonitors.monitoring.coreos.com                    2022-05-19T06:44:01Z
probes.monitoring.coreos.com                         2022-05-19T06:44:01Z
prometheuses.monitoring.coreos.com                   2022-05-19T06:45:04Z
prometheusrules.monitoring.coreos.com                2022-05-19T06:44:01Z
servicemonitors.monitoring.coreos.com                2022-05-19T06:44:01Z
thanosrulers.monitoring.coreos.com                   2022-05-19T06:44:02Z

As you can see, the probes.monitoring.coreos.com CRD has been deployed.

Then, deploy prometheus-operator:

# kubectl apply -f manifests/
# kubectl get pods -n monitoring
NAME                                  READY   STATUS    RESTARTS      AGE
alertmanager-main-0                   2/2     Running   0             46m
alertmanager-main-1                   2/2     Running   0             46m
alertmanager-main-2                   2/2     Running   0             46m
blackbox-exporter-5cb5d7479d-mznws    3/3     Running   0             49m
grafana-d595885ff-cf49m               1/1     Running   0             49m
kube-state-metrics-685d769786-tkv7l   3/3     Running   0             22m
node-exporter-4d6mq                   2/2     Running   0             49m
node-exporter-8cr4v                   2/2     Running   0             49m
node-exporter-krr2h                   2/2     Running   0             49m
prometheus-adapter-6fd94587c9-6tsgb   0/1     Running   0             3s
prometheus-adapter-6fd94587c9-8zm2l   1/1     Running   4 (13m ago)   13m
prometheus-k8s-0                      2/2     Running   0             46m
prometheus-k8s-1                      2/2     Running   0             46m
prometheus-operator-7684989c7-qt2sp   2/2     Running   0             49m

After the deployment is complete, change the prometheus-k8s Service to type NodePort so that the Prometheus UI can be accessed.

3. Configuration of blackbox-exporter

blackbox-exporter needs a configuration file at startup.

The configuration file lists the probes supported by blackbox-exporter, such as icmp and tcp:

  • Each probe configuration is called a module and is written in YAML;
  • Each module contains:

    • prober: the probe type
    • timeout: the probe timeout
    • ...

A typical blackbox-exporter configuration file:

apiVersion: v1
data:
  config.yml: |-
    "modules":
      "http_2xx":                # module name
        "http":
          "preferred_ip_protocol": "ip4"
        "prober": "http"
      "http_post_2xx":
        "http":
          "method": "POST"        # POST request
          "preferred_ip_protocol": "ip4"
        "prober": "http"
      "tcp_connect":            # TCP connection check
        "prober": "tcp"
        "timeout": "10s"
        "tcp":
          "preferred_ip_protocol": "ip4"
      "dns":
        "prober": "dns"
        "dns":
          "transport_protocol": "udp"
          "preferred_ip_protocol": "ip4"   # valid values are "ip4" and "ip6"
          "query_name": "kubernetes.default.svc.cluster.local"
      "icmp":
        "prober": "icmp"
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: blackbox-exporter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.18.0
  name: blackbox-exporter-configuration
  namespace: monitoring

4. Create a Probe object

1. Probe ping

Create a ping task:

apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: ping
  namespace: monitoring
spec:
  jobName: ping # job name
  prober: # address of blackbox-exporter
    url: blackbox-exporter.monitoring:19115
  module: icmp # probe module defined in the configuration file
  targets: # probe targets (static configuration or ingress configuration)
    # ingress <Object>
    staticConfig: # if both are configured, the static configuration takes precedence
      static:
        - https://www.baidu.com # note: the icmp prober resolves the target as a hostname or IP, so a bare host such as www.baidu.com is the usual form

After a short wait, the task appears in the Prometheus UI.

The corresponding Prometheus configuration generated by prometheus-operator:

- job_name: probe/monitoring/ping
  honor_timestamps: true
  params:
    module:
    - icmp
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /probe
  scheme: http
  follow_redirects: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: job
    replacement: ping
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    target_label: __param_target
    replacement: $1
    action: replace
  - source_labels: [__param_target]
    separator: ;
    regex: (.*)
    target_label: instance
    replacement: $1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: blackbox-exporter.monitoring:19115
    action: replace
  static_configs:
  - targets:
    - https://www.baidu.com
    labels:
      namespace: monitoring

2. Probe HTTP

Create an HTTP task:

apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: domain-probe
  namespace: monitoring
spec:
  jobName: domain-probe # job name
  prober: # address of blackbox-exporter
    url: blackbox-exporter:19115
  module: http_2xx # probe module defined in the configuration file
  targets: # probe targets (static configuration or ingress configuration)
    # ingress <Object>
    staticConfig: # if both are configured, the static configuration takes precedence
      static:
        - prometheus.io

After a short wait, the task appears in the Prometheus UI.

The corresponding Prometheus configuration generated by prometheus-operator:

- job_name: probe/monitoring/domain-probe
  honor_timestamps: true
  params:
    module:
    - http_2xx
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /probe
  scheme: http
  follow_redirects: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: job
    replacement: domain-probe
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    target_label: __param_target
    replacement: $1
    action: replace
  - source_labels: [__param_target]
    separator: ;
    regex: (.*)
    target_label: instance
    replacement: $1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: blackbox-exporter:19115
    action: replace
  static_configs:
  - targets:
    - prometheus.io
    labels:
      namespace: monitoring

3. View the scraped metrics

You can also query blackbox-exporter directly with curl, passing the probe module and the probe target. blackbox-exporter runs the probe and returns the result in metrics format (the URL is quoted so that the shell does not interpret the & character):

curl 'http://192.168.0.1:31392/probe?target=prometheus.io&module=http_2xx'
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.275433879
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 2.373368898
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# HELP probe_http_content_length Length of http content response
# TYPE probe_http_content_length gauge
probe_http_content_length -1
# HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0.400100412
probe_http_duration_seconds{phase="processing"} 0.509387522
probe_http_duration_seconds{phase="resolve"} 0.365111732
probe_http_duration_seconds{phase="tls"} 1.200170298
probe_http_duration_seconds{phase="transfer"} 0.000451343
# HELP probe_http_redirects The number of redirects
# TYPE probe_http_redirects gauge
probe_http_redirects 1
# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 1
# HELP probe_http_status_code Response HTTP status code
# TYPE probe_http_status_code gauge
probe_http_status_code 200
# HELP probe_http_uncompressed_body_length Length of uncompressed response body
# TYPE probe_http_uncompressed_body_length gauge
probe_http_uncompressed_body_length 15757
# HELP probe_http_version Returns the version of HTTP of the probe response
# TYPE probe_http_version gauge
probe_http_version 2
# HELP probe_ip_addr_hash Specifies the hash of IP address. It's useful to detect if the IP address changes.
# TYPE probe_ip_addr_hash gauge
probe_ip_addr_hash 2.590428662e+09
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_ssl_earliest_cert_expiry Returns earliest SSL cert expiry in unixtime
# TYPE probe_ssl_earliest_cert_expiry gauge
probe_ssl_earliest_cert_expiry 1.686095999e+09
# HELP probe_ssl_last_chain_expiry_timestamp_seconds Returns last SSL chain expiry in timestamp seconds
# TYPE probe_ssl_last_chain_expiry_timestamp_seconds gauge
probe_ssl_last_chain_expiry_timestamp_seconds 1.686095999e+09
# HELP probe_ssl_last_chain_info Contains SSL leaf certificate information
# TYPE probe_ssl_last_chain_info gauge
probe_ssl_last_chain_info{fingerprint_sha256="99ac7e7bf8d38ce32c95b2b3c965a9d2b479b0bf2e3b40c576173131a249f877"} 1
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
# HELP probe_tls_version_info Contains the TLS version used
# TYPE probe_tls_version_info gauge
probe_tls_version_info{version="TLS 1.3"} 1
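
To check such a response programmatically, the text format can be read line by line. The following Go sketch uses only the standard library and handles just the simplest case (samples without labels and without timestamps); the function name parseUnlabeledMetrics is an assumption for this example, and a real client would more likely rely on an existing Prometheus text-format parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseUnlabeledMetrics extracts samples of the form "name value" from
// Prometheus text exposition output. Comment lines, labeled samples and
// lines with extra fields are skipped to keep the sketch short.
func parseUnlabeledMetrics(body string) map[string]float64 {
	out := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || strings.Contains(line, "{") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			continue
		}
		out[fields[0]] = v
	}
	return out
}

func main() {
	// A shortened copy of the response shown above.
	sample := "# TYPE probe_success gauge\n" +
		"probe_success 1\n" +
		"probe_http_status_code 200\n" +
		"probe_http_duration_seconds{phase=\"tls\"} 1.200170298\n" +
		"probe_duration_seconds 2.373368898\n"
	m := parseUnlabeledMetrics(sample)
	fmt.Println("probe succeeded:", m["probe_success"] == 1)
	fmt.Println("status code:", m["probe_http_status_code"])
	fmt.Println("duration (s):", m["probe_duration_seconds"])
}
```

In most setups probe_success is the metric to alert on, while probe_duration_seconds and probe_http_status_code help explain why a probe failed.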

5. Source code analysis of Probe

prometheus-operator handles Probe objects in the same way as its other CRD objects:

  • First, an informer watches for changes to Probe objects;
  • Then, the operator generates a new Prometheus configuration from the changed objects and reloads it into Prometheus.

1. Watch the Probe CRD object

Changes to Probe objects are watched through an Informer.

First, create the Informer:

// prometheus-operator/pkg/prometheus/operator.go
// New creates a new controller.
func New(ctx context.Context, conf operator.Config, logger log.Logger, r prometheus.Registerer) (*Operator, error) {
    ...
    c := &Operator{
        ...
    }
    ...
    c.probeInfs, err = informers.NewInformersForResource(
        informers.NewMonitoringInformerFactories(
            c.config.Namespaces.AllowList,
            c.config.Namespaces.DenyList,
            mclient,
            resyncPeriod,
            nil,
        ),
        monitoringv1.SchemeGroupVersion.WithResource(monitoringv1.ProbeName),
    )
    if err != nil {
        return nil, errors.Wrap(err, "error creating probe informers")
    }
    ...
    return c, nil
}

Then, add an event handler for the Informer:

// prometheus-operator/pkg/prometheus/operator.go
// addHandlers adds the eventhandlers to the informers.
func (c *Operator) addHandlers() {
    ...
    c.probeInfs.AddEventHandler(cache.ResourceEventHandlerFuncs{
        AddFunc:    c.handleBmonAdd,
        UpdateFunc: c.handleBmonUpdate,
        DeleteFunc: c.handleBmonDelete,
    })
    ...
}

Take a look at the Add event handler, which enqueues the namespace that the object belongs to:

// TODO: Don't enqueue just for the namespace
func (c *Operator) handleBmonAdd(obj interface{}) {
   if o, ok := c.getObject(obj); ok {
      level.Debug(c.logger).Log("msg", "Probe added")
      c.metrics.TriggerByCounter(monitoringv1.ProbesKind, "add").Inc()
      c.enqueueForMonitorNamespace(o.GetNamespace())
   }
}
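
Enqueuing by namespace means that several Probe events in the same namespace can collapse into a single reconcile. The Go sketch below illustrates that idea with a tiny deduplicating queue. The type namespaceQueue and its methods are hypothetical and written for this example only; the operator itself uses a proper controller work queue, which this sketch does not reproduce.

```go
package main

import (
	"fmt"
	"sync"
)

// namespaceQueue is a hypothetical, much simplified stand-in for a
// controller work queue. Keys that are already waiting are not added again.
type namespaceQueue struct {
	mu      sync.Mutex
	pending map[string]bool
	order   []string
}

func newNamespaceQueue() *namespaceQueue {
	return &namespaceQueue{pending: make(map[string]bool)}
}

// Enqueue adds a namespace unless it is already waiting to be processed.
func (q *namespaceQueue) Enqueue(ns string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.pending[ns] {
		return
	}
	q.pending[ns] = true
	q.order = append(q.order, ns)
}

// Drain returns the waiting namespaces in arrival order and empties the queue.
func (q *namespaceQueue) Drain() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	out := q.order
	q.order = nil
	q.pending = make(map[string]bool)
	return out
}

func main() {
	q := newNamespaceQueue()
	// Three Probe events, two of them in the same namespace.
	for _, ns := range []string{"monitoring", "monitoring", "default"} {
		q.Enqueue(ns)
	}
	for _, ns := range q.Drain() {
		fmt.Println("reconcile for namespace:", ns)
	}
}
```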

2. Generate Prometheus configuration

In prometheus-operator, worker goroutines take changed objects from the queue and reconcile them.

// prometheus-operator/pkg/prometheus/operator.go
func (c *Operator) sync(ctx context.Context, key string) error {
    ...
    // Handle Probe objects here
    if err := c.createOrUpdateConfigurationSecret(ctx, p, ruleConfigMapNames, assetStore); err != nil {
        return errors.Wrap(err, "creating config failed")
    }
    ...
}

For Probe objects, the operator generates the Prometheus configuration from their content and writes it into a Secret.
In other words, the Prometheus configuration is stored in a Secret object, and the reloader sidecar then reloads the content of that Secret into Prometheus:

func (c *Operator) createOrUpdateConfigurationSecret(ctx context.Context, p *monitoringv1.Prometheus, ruleConfigMapNames []string, store *assets.Store) error {
    ...
    // Get the Probe object
    bmons, err := c.selectProbes(ctx, p, store)
    if err != nil {
        return errors.Wrap(err, "selecting Probes failed")
    }
    ...
    // generate new configuration
    conf, err := c.configGenerator.generateConfig(
        p,
        smons,
        pmons,
        bmons,
        store.BasicAuthAssets,
        store.BearerTokenAssets,
        additionalScrapeConfigs,
        additionalAlertRelabelConfigs,
        additionalAlertManagerConfigs,
        ruleConfigMapNames,
    )
    if err != nil {
        return errors.Wrap(err, "generating config failed")
    }
    // Write the configuration to the Secret object
    s := makeConfigSecret(p, c.config)
    ...
}

The specific process of generating the Prometheus configuration from the Probe object:

// pkg/prometheus/promcfg.go
func (cg *configGenerator) generateProbeConfig(
    version semver.Version,
    m *v1.Probe,
    apiserverConfig *v1.APIServerConfig,
    basicAuthSecrets map[string]assets.BasicAuthCredentials,
    bearerTokens map[string]assets.BearerToken,
    ignoreHonorLabels bool,
    overrideHonorTimestamps bool,
    ignoreNamespaceSelectors bool,
    enforcedNamespaceLabel string) yaml.MapSlice {

    jobName := fmt.Sprintf("probe/%s/%s", m.Namespace, m.Name)
    cfg := yaml.MapSlice{
        {
            Key:   "job_name",
            Value: jobName,
        },
    }
    ...
    // Configuration of metrics_path
    path := "/probe"
    if m.Spec.ProberSpec.Path != "" {
        path = m.Spec.ProberSpec.Path
    }
    cfg = append(cfg, yaml.MapItem{Key: "metrics_path", Value: path})
    ...
    // configuration of params
    cfg = append(cfg, yaml.MapItem{Key: "params", Value: yaml.MapSlice{
        {Key: "module", Value: []string{m.Spec.Module}},
    }})
    ...
    // Configuration of static_configs
    if m.Spec.Targets.StaticConfig != nil {
        staticConfig := yaml.MapSlice{
            {Key: "targets", Value: m.Spec.Targets.StaticConfig.Targets},
        }
        if m.Spec.Targets.StaticConfig.Labels != nil {
            if _, ok := m.Spec.Targets.StaticConfig.Labels["namespace"]; !ok {
                m.Spec.Targets.StaticConfig.Labels["namespace"] = m.Namespace
            }
        } else {
            m.Spec.Targets.StaticConfig.Labels = map[string]string{"namespace": m.Namespace}
        }
        staticConfig = append(staticConfig, yaml.MapSlice{
            {Key: "labels", Value: m.Spec.Targets.StaticConfig.Labels},
        }...)
        cfg = append(cfg, yaml.MapItem{
            Key:   "static_configs",
            Value: []yaml.MapSlice{staticConfig},
        })
        ...
    }    
    ...
    return cfg
}    
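
Two details of the excerpt above, the job name format and the default namespace label, can be tried out in isolation. The Go sketch below is a standalone illustration; probeJobName and staticLabels are hypothetical helpers written for this example, and unlike the excerpt, staticLabels copies the label map instead of modifying the Probe object in place.

```go
package main

import "fmt"

// probeJobName mirrors the fmt.Sprintf call in the excerpt above.
func probeJobName(namespace, name string) string {
	return fmt.Sprintf("probe/%s/%s", namespace, name)
}

// staticLabels returns a copy of the user-supplied labels and adds a
// "namespace" label when the user did not set one (hypothetical helper).
func staticLabels(userLabels map[string]string, probeNamespace string) map[string]string {
	out := make(map[string]string, len(userLabels)+1)
	for k, v := range userLabels { // ranging over a nil map is safe
		out[k] = v
	}
	if _, ok := out["namespace"]; !ok {
		out["namespace"] = probeNamespace
	}
	return out
}

func main() {
	fmt.Println(probeJobName("monitoring", "ping"))
	fmt.Println(staticLabels(nil, "monitoring"))
	fmt.Println(staticLabels(map[string]string{"namespace": "custom", "env": "prod"}, "monitoring"))
}
```

This matches the generated configurations shown earlier, where the job is named probe/monitoring/ping and the static target carries the label namespace: monitoring.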

Posted by BillyT on Sat, 11 Feb 2023 19:37:05 +0530