Fixing the "unknown authority" error when blackbox-exporter probes an HTTPS site

Discovering the problem

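Following the earlier article "Automatically renewing free HTTPS certificates in ingress with acme.sh + ZeroSSL", I requested a free ZeroSSL certificate and deployed it to the development site https://authentication.dev.ityoudao.cn. The site works fine in Chrome, but after it was added to the black-box probes of Prometheus's blackbox-exporter, the alerting system kept reporting a non-200 error for it.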

Diagnosing the problem

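  1. Log in to the Prometheus UI and check the target status. The target itself is healthy (the up metric is 1), but the values of the three metrics probe_http_status_code, probe_http_ssl and probe_success are clearly wrong, which shows that the problem is not in Prometheus: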
# {job="blackbox-http-d0-dev", des="https://authentication.dev.ityoudao.cn/"}
probe_http_ssl{des="https://authentication.dev.ityoudao.cn/", instance="blackbox-exporter:9115", job="blackbox-http-d0-dev"} 0
probe_http_status_code{des="https://authentication.dev.ityoudao.cn/", instance="blackbox-exporter:9115", job="blackbox-http-d0-dev"} 0
probe_success{des="https://authentication.dev.ityoudao.cn/", instance="blackbox-exporter:9115", job="blackbox-http-d0-dev"} 0
up{des="https://authentication.dev.ityoudao.cn/", instance="blackbox-exporter:9115", job="blackbox-http-d0-dev"} 1
  2. Check blackbox-exporter by calling its /probe endpoint with curl and the debug=true request parameter (alternatively, start blackbox-exporter with the --log.level=debug command-line flag to enable debug logging):
[root@k8s-master ~]# blackbox_exporter_ip=$(kubectl get service -n monitoring blackbox-exporter --output=jsonpath={.spec.clusterIP})
[root@k8s-master ~]# curl "http://${blackbox_exporter_ip}:9115/probe?target=https://authentication.dev.ityoudao.cn&module=http_2xx&debug=true"
Logs for the probe:
ts=2024-01-20T12:47:21.501587248Z caller=main.go:181 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Beginning probe" probe=http timeout_seconds=119.5
ts=2024-01-20T12:47:21.502147655Z caller=http.go:328 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Resolving target address" target=authentication.dev.ityoudao.cn ip_protocol=ip4
ts=2024-01-20T12:47:21.587710386Z caller=http.go:328 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Resolved target address" target=authentication.dev.ityoudao.cn ip=192.168.56.20
ts=2024-01-20T12:47:21.587889776Z caller=client.go:252 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Making HTTP request" url=https://192.168.56.20 host=authentication.dev.ityoudao.cn
ts=2024-01-20T12:47:21.594168047Z caller=handler.go:120 module=http_2xx target=https://authentication.dev.ityoudao.cn level=error msg="Error for HTTP request" err="Get "https://192.168.56.20": tls: failed to verify certificate: x509: certificate signed by unknown authority"
ts=2024-01-20T12:47:21.594366738Z caller=handler.go:120 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Response timings for roundtrip" roundtrip=0 start=2024-01-20T12:47:21.588062955Z dnsDone=2024-01-20T12:47:21.588062955Z connectDone=2024-01-20T12:47:21.589454793Z gotConn=0001-01-01T00:00:00Z responseStart=0001-01-01T00:00:00Z tlsStart=2024-01-20T12:47:21.589540428Z tlsDone=2024-01-20T12:47:21.5941391Z end=0001-01-01T00:00:00Z
ts=2024-01-20T12:47:21.594425635Z caller=main.go:181 module=http_2xx target=https://authentication.dev.ityoudao.cn level=error msg="Probe failed" duration_seconds=0.092664024
...
  • This reveals the problem: the blackbox-exporter probe fails on https://authentication.dev.ityoudao.cn with "tls: failed to verify certificate: x509: certificate signed by unknown authority". In other words, blackbox-exporter does not trust the ZeroSSL ECC Domain Secure Site CA.
  • That is easy to understand: it is the same failure as running curl directly against https://authentication.dev.ityoudao.cn, which reports "curl: (60) Peer's Certificate issuer is not recognized." because curl does not trust the ZeroSSL ECC Domain Secure Site CA either:
[root@k8s-master ~]# curl https://authentication.dev.ityoudao.cn
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.
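
To see what the server actually presents, one option is to dump the certificates it sends during the TLS handshake and look at their subjects and issuers. The sketch below assumes openssl is installed on the host; count_pem_certs and show_chain are hypothetical helper names, not part of curl or blackbox-exporter.

```shell
#!/bin/sh
# Count the PEM certificates in the text read from stdin.
count_pem_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----'
}

# Print what a server sends during the TLS handshake, including all
# certificates in PEM form.
# $1: host name (also used for SNI), $2: port (defaults to 443).
show_chain() {
  openssl s_client -connect "$1:${2:-443}" -servername "$1" -showcerts \
    </dev/null 2>/dev/null
}

# Example (needs network access to the site):
#   show_chain authentication.dev.ityoudao.cn | count_pem_certs
#   show_chain authentication.dev.ityoudao.cn | grep -E '^ *[0-9]* ?[si]:'
```

If the count is 1, the server is sending only the leaf certificate. Neither curl nor Go's TLS stack downloads missing intermediates, so a missing intermediate and an outdated CA bundle can both produce this kind of error; which of the two applies here is not established in this article.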

Solving the problem

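First, consider how to fix the certificate error that curl reports for https://authentication.dev.ityoudao.cn. The hint printed after the curl error suggests at least three options: use a certificate issued by a CA that is in the trust bundle; pass --cacert to specify the server's CA certificate; or pass --insecure to skip verification of the server certificate. Our certificate was already issued by a certificate authority; that CA is simply not trusted by curl or blackbox-exporter. So we try the latter two options.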

  1. Edit the blackbox-exporter configuration file to add custom modules

In the blackbox-exporter configuration file /etc/blackbox_exporter/config.yml, add the custom modules under the modules key:

Method 1: skip verification of the server certificate

  http_insecure_skip_verify:
    prober: http
    timeout: 5s
    http:
      preferred_ip_protocol: "ip4"
      method: GET
      tls_config:
        insecure_skip_verify: true

Method 2: specify the server's CA certificate

  http_zerossl_ca:
    prober: http
    timeout: 5s
    http:
      preferred_ip_protocol: "ip4"
      method: GET
      tls_config:
        ca_file: "/certs/zerossl_ca.cert"

For the full set of options, see the tls_config section of the Blackbox exporter configuration documentation.

  2. Build a custom Docker image

Build a custom Docker image that contains the modified blackbox-exporter configuration file and the zerossl_ca.cert certificate:

mkdir blackbox-exporter && cd blackbox-exporter
cp "/opt/acme.sh/certs/*.dev.ityoudao.cn_ecc/ca.cer" zerossl_ca.cert
cat > Dockerfile <<"EOM"
FROM registry.cn-hangzhou.aliyuncs.com/ityoudao/blackbox-exporter:v0.24.0
COPY zerossl_ca.cert /certs/
RUN echo -e '  http_insecure_skip_verify:\n\
    prober: http\n\
    timeout: 5s\n\
    http:\n\
      preferred_ip_protocol: "ip4"\n\
      method: GET\n\
      tls_config:\n\
        insecure_skip_verify: true\n\
  http_zerossl_ca:\n\
    prober: http\n\
    timeout: 5s\n\
    http:\n\
      preferred_ip_protocol: "ip4"\n\
      method: GET\n\
      tls_config:\n\
        ca_file: "/certs/zerossl_ca.cert"' >> /etc/blackbox_exporter/config.yml
EOM
docker build --tag registry.cn-hangzhou.aliyuncs.com/ityoudao/blackbox-exporter:v0.24.0_zerossl_ca .
docker push registry.cn-hangzhou.aliyuncs.com/ityoudao/blackbox-exporter:v0.24.0_zerossl_ca

Update the image used by the blackbox-exporter Deployment:

kubectl patch deployment blackbox-exporter -n monitoring --type json -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"registry.cn-hangzhou.aliyuncs.com/ityoudao/blackbox-exporter:v0.24.0_zerossl_ca"}]'
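
After patching, it can help to wait for the rollout to finish and confirm which image the Deployment now references before re-running the probe. A minimal sketch, assuming the Deployment name and namespace used above; rollout_and_show_image is a hypothetical helper name and the 120s timeout is an arbitrary choice.

```shell
#!/bin/sh
# Wait for a Deployment rollout to complete, then print the image of its
# first container.
# $1: namespace, $2: deployment name.
rollout_and_show_image() {
  kubectl rollout status "deployment/$2" -n "$1" --timeout=120s &&
    kubectl get deployment "$2" -n "$1" \
      --output=jsonpath='{.spec.template.spec.containers[0].image}'
}

# Example:
#   rollout_and_show_image monitoring blackbox-exporter
```

The printed image should end in v0.24.0_zerossl_ca once the patch has been applied.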
  3. Call the blackbox-exporter /probe endpoint again with curl and the debug=true request parameter
  • Skipping verification of the server certificate: no error:
[root@k8s-master blackbox-exporter]# curl "http://${blackbox_exporter_ip}:9115/probe?target=https://authentication.dev.ityoudao.cn&module=http_insecure_skip_verify&debug=true"
Logs for the probe:
ts=2024-01-20T14:26:33.052242857Z caller=main.go:181 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Beginning probe" probe=http timeout_seconds=5
ts=2024-01-20T14:26:33.052608267Z caller=http.go:328 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Resolving target address" target=authentication.dev.ityoudao.cn ip_protocol=ip4
ts=2024-01-20T14:26:33.104520836Z caller=http.go:328 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Resolved target address" target=authentication.dev.ityoudao.cn ip=192.168.56.20
ts=2024-01-20T14:26:33.104715389Z caller=client.go:252 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Making HTTP request" url=https://192.168.56.20 host=authentication.dev.ityoudao.cn
ts=2024-01-20T14:26:33.119631133Z caller=handler.go:120 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Received HTTP response" status_code=200
ts=2024-01-20T14:26:33.120647135Z caller=handler.go:120 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Response timings for roundtrip" roundtrip=0 start=2024-01-20T14:26:33.104951586Z dnsDone=2024-01-20T14:26:33.104951586Z connectDone=2024-01-20T14:26:33.106256513Z gotConn=2024-01-20T14:26:33.11458881Z responseStart=2024-01-20T14:26:33.119521801Z tlsStart=2024-01-20T14:26:33.106358579Z tlsDone=2024-01-20T14:26:33.114542253Z end=2024-01-20T14:26:33.120568776Z
ts=2024-01-20T14:26:33.120872731Z caller=main.go:181 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Probe succeeded" duration_seconds=0.068531396
...
Module configuration:
prober: http
timeout: 5s
http:
  preferred_ip_protocol: ip4
  ip_protocol_fallback: true
  method: GET
  tls_config:
    insecure_skip_verify: true
  follow_redirects: true
  enable_http2: true
tcp:
  ip_protocol_fallback: true
icmp:
  ip_protocol_fallback: true
  ttl: 64
dns:
  ip_protocol_fallback: true
  recursion_desired: true
  • Specifying the server's CA certificate: no error either:
[root@k8s-master blackbox-exporter]# curl "http://${blackbox_exporter_ip}:9115/probe?target=https://authentication.dev.ityoudao.cn&module=http_zerossl_ca&debug=true"
Logs for the probe:
ts=2024-01-20T14:27:50.281277378Z caller=main.go:181 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Beginning probe" probe=http timeout_seconds=5
ts=2024-01-20T14:27:50.281698692Z caller=http.go:328 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Resolving target address" target=authentication.dev.ityoudao.cn ip_protocol=ip4
ts=2024-01-20T14:27:50.302748401Z caller=http.go:328 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Resolved target address" target=authentication.dev.ityoudao.cn ip=192.168.56.20
ts=2024-01-20T14:27:50.304247388Z caller=client.go:252 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Making HTTP request" url=https://192.168.56.20 host=authentication.dev.ityoudao.cn
ts=2024-01-20T14:27:50.321878323Z caller=handler.go:120 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Received HTTP response" status_code=200
ts=2024-01-20T14:27:50.322254169Z caller=handler.go:120 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Response timings for roundtrip" roundtrip=0 start=2024-01-20T14:27:50.304858139Z dnsDone=2024-01-20T14:27:50.304858139Z connectDone=2024-01-20T14:27:50.306266153Z gotConn=2024-01-20T14:27:50.318531157Z responseStart=2024-01-20T14:27:50.321791451Z tlsStart=2024-01-20T14:27:50.306348812Z tlsDone=2024-01-20T14:27:50.318493066Z end=2024-01-20T14:27:50.322009365Z
ts=2024-01-20T14:27:50.322546935Z caller=main.go:181 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Probe succeeded" duration_seconds=0.041141285
...
Module configuration:
prober: http
timeout: 5s
http:
  preferred_ip_protocol: ip4
  ip_protocol_fallback: true
  method: GET
  tls_config:
    ca_file: /certs/zerossl_ca.cert
    insecure_skip_verify: false
  follow_redirects: true
  enable_http2: true
tcp:
  ip_protocol_fallback: true
icmp:
  ip_protocol_fallback: true
  ttl: 64
dns:
  ip_protocol_fallback: true
  recursion_desired: true
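
The manual checks above can also be scripted so that each module is probed in turn and only the verdict is printed. A sketch, assuming blackbox_exporter_ip is set as in the diagnosis step; probe_ok is a hypothetical helper. It relies on the probe_success line in the metrics that the /probe endpoint returns, and uses curl's -G and --data-urlencode options so that the target URL is encoded properly.

```shell
#!/bin/sh
# Return 0 if blackbox-exporter reports probe_success 1 for the target.
# $1: exporter host:port, $2: module name, $3: target URL.
probe_ok() {
  curl -sS -G "http://$1/probe" \
    --data-urlencode "module=$2" \
    --data-urlencode "target=$3" |
    grep -q '^probe_success 1$'
}

# Example:
#   for m in http_2xx http_insecure_skip_verify http_zerossl_ca; do
#     if probe_ok "${blackbox_exporter_ip}:9115" "$m" https://authentication.dev.ityoudao.cn
#     then echo "$m: ok"; else echo "$m: failed"; fi
#   done
```

Running the same target through all three modules makes it easy to confirm that only certificate verification differs between them.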
  4. Create a new Prometheus job that uses the http_zerossl_ca module
    - job_name: 'blackbox-http-zerossl-ca-d0-dev'
      honor_timestamps: true
      scrape_interval: 30s
      scrape_timeout: 5s
      metrics_path: /probe
      params:
        module: [http_zerossl_ca]
      static_configs:
      - targets:
        - https://authentication.dev.ityoudao.cn/
      relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: des
      - target_label: __address__
        replacement: blackbox-exporter:9115

Soon after the Prometheus configuration took effect, the alerting system stopped reporting errors for https://authentication.dev.ityoudao.cn/. Problem solved!
[Figure: Prometheus Graph]

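In summary, black-box probing showed that a site using a ZeroSSL certificate failed in Prometheus's blackbox-exporter, and diagnosis showed that blackbox-exporter did not trust the issuing certificate authority. The fix is to change the blackbox-exporter configuration to either skip verification of the server certificate or specify the server's CA certificate. We built a custom Docker image, updated the blackbox-exporter Deployment, and created a new Prometheus job that uses the new module, which resolved the problem of blackbox-exporter not trusting the ZeroSSL CA.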