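Discovering the Problem
Following the earlier article "Using acme.sh + ZeroSSL to automatically renew free HTTPS certificates in ingress" (《使用acme.sh+ZeroSSL自动更新ingress中的免费https证书》), I requested a free ZeroSSL certificate and deployed it to the website in the development environment.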
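Diagnosing the Problem
- Log in to the Prometheus console and check the target status. The target itself is healthy (the up metric is 1), but the values of probe_http_status_code, probe_http_ssl, and probe_success are clearly abnormal, which shows that the problem is not in Prometheus: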
    # {job="blackbox-http-d0-dev", des="https://authentication.dev.ityoudao.cn/"}
    probe_http_ssl{des="https://authentication.dev.ityoudao.cn/", instance="blackbox-exporter:9115", job="blackbox-http-d0-dev"} 0
    probe_http_status_code{des="https://authentication.dev.ityoudao.cn/", instance="blackbox-exporter:9115", job="blackbox-http-d0-dev"} 0
    probe_success{des="https://authentication.dev.ityoudao.cn/", instance="blackbox-exporter:9115", job="blackbox-http-d0-dev"} 0
    up{des="https://authentication.dev.ityoudao.cn/", instance="blackbox-exporter:9115", job="blackbox-http-d0-dev"} 1
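When many targets are probed, it can help to pull the failing ones out of such output mechanically. The following is a minimal sketch, assuming the samples have been saved as text in the `name{labels} value` form shown above. The function name `failed_probes` is hypothetical and is not part of Prometheus or blackbox-exporter.

```shell
#!/usr/bin/env bash
# failed_probes (hypothetical helper): read samples of the form
#   name{label="...", ...} value
# on stdin and print the des label of every probe_success sample whose value is 0.
failed_probes() {
  awk 'index($0, "probe_success{") == 1 && $NF == "0" {
    # Locate des="..." and strip the leading des=" (5 chars) and the closing quote.
    if (match($0, /des="[^"]*"/)) print substr($0, RSTART + 5, RLENGTH - 6)
  }'
}
```

Piping the four samples above into `failed_probes` would print only the `des` value of the `probe_success` line, because `up` is 1 and the other metrics are ignored.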
- Check blackbox-exporter: use curl with the debug=true request parameter to call the /probe endpoint of blackbox-exporter (alternatively, add the --log.level=debug command-line flag to turn on blackbox-exporter's debug logging):
    [root@k8s-master ~]# blackbox_exporter_ip=$(kubectl get service -n monitoring blackbox-exporter --output=jsonpath={.spec.clusterIP})
    [root@k8s-master ~]# curl "http://${blackbox_exporter_ip}:9115/probe?target=https://authentication.dev.ityoudao.cn&module=http_2xx&debug=true"
    Logs for the probe:
    ts=2024-01-20T12:47:21.501587248Z caller=main.go:181 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Beginning probe" probe=http timeout_seconds=119.5
    ts=2024-01-20T12:47:21.502147655Z caller=http.go:328 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Resolving target address" target=authentication.dev.ityoudao.cn ip_protocol=ip4
    ts=2024-01-20T12:47:21.587710386Z caller=http.go:328 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Resolved target address" target=authentication.dev.ityoudao.cn ip=192.168.56.20
    ts=2024-01-20T12:47:21.587889776Z caller=client.go:252 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Making HTTP request" url=https://192.168.56.20 host=authentication.dev.ityoudao.cn
    ts=2024-01-20T12:47:21.594168047Z caller=handler.go:120 module=http_2xx target=https://authentication.dev.ityoudao.cn level=error msg="Error for HTTP request" err="Get "https://192.168.56.20": tls: failed to verify certificate: x509: certificate signed by unknown authority"
    ts=2024-01-20T12:47:21.594366738Z caller=handler.go:120 module=http_2xx target=https://authentication.dev.ityoudao.cn level=info msg="Response timings for roundtrip" roundtrip=0 start=2024-01-20T12:47:21.588062955Z dnsDone=2024-01-20T12:47:21.588062955Z connectDone=2024-01-20T12:47:21.589454793Z gotConn=0001-01-01T00:00:00Z responseStart=0001-01-01T00:00:00Z tlsStart=2024-01-20T12:47:21.589540428Z tlsDone=2024-01-20T12:47:21.5941391Z end=0001-01-01T00:00:00Z
    ts=2024-01-20T12:47:21.594425635Z caller=main.go:181 module=http_2xx target=https://authentication.dev.ityoudao.cn level=error msg="Probe failed" duration_seconds=0.092664024
    ...
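The debug output is verbose, and only the level=error lines matter for diagnosis. As a rough sketch, the filter below keeps just those lines; `probe_errors` is a hypothetical helper name, and the only assumption is that each log line carries a `level=...` field as in the output above.

```shell
#!/usr/bin/env bash
# probe_errors (hypothetical helper): read blackbox-exporter debug output on
# stdin and print, for each level=error log line, everything after "level=error ".
probe_errors() {
  sed -n 's/^.*level=error //p'
}

# Example (not run here):
#   curl -s "http://${blackbox_exporter_ip}:9115/probe?target=https://authentication.dev.ityoudao.cn&module=http_2xx&debug=true" | probe_errors
```

Applied to the log above, this would leave the "Error for HTTP request" line with its err field and the final "Probe failed" line.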
- Sure enough, this reveals the problem: when the blackbox-exporter probe accesses https://authentication.dev.ityoudao.cn, it fails with tls: failed to verify certificate: x509: certificate signed by unknown authority. In other words, blackbox-exporter does not trust ZeroSSL ECC Domain Secure Site CA.
- This is easy to understand. It is the same failure as accessing https://authentication.dev.ityoudao.cn directly with curl, which reports curl: (60) Peer's Certificate issuer is not recognized. curl does not trust ZeroSSL ECC Domain Secure Site CA either:
    [root@k8s-master ~]# curl https://authentication.dev.ityoudao.cn
    curl: (60) Peer's Certificate issuer is not recognized.
    More details here: http://curl.haxx.se/docs/sslcerts.html

    curl performs SSL certificate verification by default, using a "bundle" of Certificate Authority (CA) public keys (CA certs). If the default bundle file isn't adequate, you can specify an alternate file using the --cacert option.
    If this HTTPS server uses a certificate signed by a CA represented in the bundle, the certificate verification probably failed due to a problem with the certificate (it might be expired, or the name might not match the domain name in the URL).
    If you'd like to turn off curl's verification of the certificate, use the -k (or --insecure) option.
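Solving the Problem
First, let's think about how to solve this.
- Modify the blackbox-exporter configuration file to add custom modules
In the blackbox-exporter configuration file, add a module for one of the following two approaches.
Method 1: skip verification of the server's CA certificate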
    http_insecure_skip_verify:
      prober: http
      timeout: 5s
      http:
        preferred_ip_protocol: "ip4"
        method: GET
        tls_config:
          insecure_skip_verify: true
Method 2: specify the server's CA certificate
    http_zerossl_ca:
      prober: http
      timeout: 5s
      http:
        preferred_ip_protocol: "ip4"
        method: GET
        tls_config:
          ca_file: "/certs/zerossl_ca.cert"
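For the full set of blackbox-exporter options, see Blackbox exporter configuration - tls_config.
- Build a custom Docker image
Build an image that contains the modified blackbox-exporter configuration file and the CA certificate: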
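    mkdir blackbox-exporter && cd blackbox-exporter
    # The path is quoted, so "*" is taken literally rather than expanded as a glob.
    # This assumes the acme.sh certificate directory is literally named "*.dev.ityoudao.cn_ecc".
    cp "/opt/acme.sh/certs/*.dev.ityoudao.cn_ecc/ca.cer" zerossl_ca.cert
    # Each YAML line below ends with \n\ so that the Dockerfile sees a single RUN instruction
    # and echo -e turns every \n into a newline. The two-space base indent places the new
    # modules under the existing "modules:" key, assumed to be the last top-level key in config.yml.
    cat > Dockerfile <<"EOM"
    FROM registry.cn-hangzhou.aliyuncs.com/ityoudao/blackbox-exporter:v0.24.0
    COPY zerossl_ca.cert /certs/
    RUN echo -e '\n\
      http_insecure_skip_verify:\n\
        prober: http\n\
        timeout: 5s\n\
        http:\n\
          preferred_ip_protocol: "ip4"\n\
          method: GET\n\
          tls_config:\n\
            insecure_skip_verify: true\n\
      http_zerossl_ca:\n\
        prober: http\n\
        timeout: 5s\n\
        http:\n\
          preferred_ip_protocol: "ip4"\n\
          method: GET\n\
          tls_config:\n\
            ca_file: "/certs/zerossl_ca.cert"' >> /etc/blackbox_exporter/config.yml
    EOM
    docker build --tag registry.cn-hangzhou.aliyuncs.com/ityoudao/blackbox-exporter:v0.24.0_zerossl_ca .
    docker push registry.cn-hangzhou.aliyuncs.com/ityoudao/blackbox-exporter:v0.24.0_zerossl_ca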
Update the image used by the blackbox-exporter deployment:
    kubectl patch deployment blackbox-exporter -n monitoring --type json -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"registry.cn-hangzhou.aliyuncs.com/ityoudao/blackbox-exporter:v0.24.0_zerossl_ca"}]'
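If the image tag changes often, the JSON Patch can be generated instead of typed by hand. This is only a sketch: `image_patch` is a hypothetical helper, and it assumes the image reference contains no characters that would need JSON escaping.

```shell
#!/usr/bin/env bash
# image_patch (hypothetical helper): print a JSON Patch that replaces the image
# of the first container in a Deployment's pod template.
# Assumes the image reference needs no JSON escaping (no quotes or backslashes).
image_patch() {
  printf '[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "%s"}]\n' "$1"
}

# Example (not run here):
#   image=registry.cn-hangzhou.aliyuncs.com/ityoudao/blackbox-exporter:v0.24.0_zerossl_ca
#   kubectl patch deployment blackbox-exporter -n monitoring --type json -p "$(image_patch "$image")"
#   kubectl rollout status deployment/blackbox-exporter -n monitoring
```

Waiting on `kubectl rollout status` before probing again avoids testing against a pod that still runs the old image.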
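- Call the /probe endpoint of blackbox-exporter again, using curl with the debug=true request parameter
- With verification of the server's CA certificate skipped, there is no error: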
    [root@k8s-master blackbox-exporter]# curl "http://${blackbox_exporter_ip}:9115/probe?target=https://authentication.dev.ityoudao.cn&module=http_insecure_skip_verify&debug=true"
    Logs for the probe:
    ts=2024-01-20T14:26:33.052242857Z caller=main.go:181 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Beginning probe" probe=http timeout_seconds=5
    ts=2024-01-20T14:26:33.052608267Z caller=http.go:328 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Resolving target address" target=authentication.dev.ityoudao.cn ip_protocol=ip4
    ts=2024-01-20T14:26:33.104520836Z caller=http.go:328 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Resolved target address" target=authentication.dev.ityoudao.cn ip=192.168.56.20
    ts=2024-01-20T14:26:33.104715389Z caller=client.go:252 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Making HTTP request" url=https://192.168.56.20 host=authentication.dev.ityoudao.cn
    ts=2024-01-20T14:26:33.119631133Z caller=handler.go:120 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Received HTTP response" status_code=200
    ts=2024-01-20T14:26:33.120647135Z caller=handler.go:120 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Response timings for roundtrip" roundtrip=0 start=2024-01-20T14:26:33.104951586Z dnsDone=2024-01-20T14:26:33.104951586Z connectDone=2024-01-20T14:26:33.106256513Z gotConn=2024-01-20T14:26:33.11458881Z responseStart=2024-01-20T14:26:33.119521801Z tlsStart=2024-01-20T14:26:33.106358579Z tlsDone=2024-01-20T14:26:33.114542253Z end=2024-01-20T14:26:33.120568776Z
    ts=2024-01-20T14:26:33.120872731Z caller=main.go:181 module=http_insecure_skip_verify target=https://authentication.dev.ityoudao.cn level=info msg="Probe succeeded" duration_seconds=0.068531396
    ...
    Module configuration:
    prober: http
    timeout: 5s
    http:
      preferred_ip_protocol: ip4
      ip_protocol_fallback: true
      method: GET
      tls_config:
        insecure_skip_verify: true
      follow_redirects: true
      enable_http2: true
    tcp:
      ip_protocol_fallback: true
    icmp:
      ip_protocol_fallback: true
      ttl: 64
    dns:
      ip_protocol_fallback: true
      recursion_desired: true
- With the server's CA certificate specified, there is no error either:
    [root@k8s-master blackbox-exporter]# curl "http://${blackbox_exporter_ip}:9115/probe?target=https://authentication.dev.ityoudao.cn&module=http_zerossl_ca&debug=true"
    Logs for the probe:
    ts=2024-01-20T14:27:50.281277378Z caller=main.go:181 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Beginning probe" probe=http timeout_seconds=5
    ts=2024-01-20T14:27:50.281698692Z caller=http.go:328 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Resolving target address" target=authentication.dev.ityoudao.cn ip_protocol=ip4
    ts=2024-01-20T14:27:50.302748401Z caller=http.go:328 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Resolved target address" target=authentication.dev.ityoudao.cn ip=192.168.56.20
    ts=2024-01-20T14:27:50.304247388Z caller=client.go:252 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Making HTTP request" url=https://192.168.56.20 host=authentication.dev.ityoudao.cn
    ts=2024-01-20T14:27:50.321878323Z caller=handler.go:120 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Received HTTP response" status_code=200
    ts=2024-01-20T14:27:50.322254169Z caller=handler.go:120 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Response timings for roundtrip" roundtrip=0 start=2024-01-20T14:27:50.304858139Z dnsDone=2024-01-20T14:27:50.304858139Z connectDone=2024-01-20T14:27:50.306266153Z gotConn=2024-01-20T14:27:50.318531157Z responseStart=2024-01-20T14:27:50.321791451Z tlsStart=2024-01-20T14:27:50.306348812Z tlsDone=2024-01-20T14:27:50.318493066Z end=2024-01-20T14:27:50.322009365Z
    ts=2024-01-20T14:27:50.322546935Z caller=main.go:181 module=http_zerossl_ca target=https://authentication.dev.ityoudao.cn level=info msg="Probe succeeded" duration_seconds=0.041141285
    ...
    Module configuration:
    prober: http
    timeout: 5s
    http:
      preferred_ip_protocol: ip4
      ip_protocol_fallback: true
      method: GET
      tls_config:
        ca_file: /certs/zerossl_ca.cert
        insecure_skip_verify: false
      follow_redirects: true
      enable_http2: true
    tcp:
      ip_protocol_fallback: true
    icmp:
      ip_protocol_fallback: true
      ttl: 64
    dns:
      ip_protocol_fallback: true
      recursion_desired: true
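Reading the full debug log for every module gets tedious, so the success check can be reduced to a yes/no test. The sketch below is one way to do that; `probe_succeeded` is a hypothetical helper that relies only on the "Probe succeeded" message seen in the logs above.

```shell
#!/usr/bin/env bash
# probe_succeeded (hypothetical helper): exit 0 if the debug output on stdin
# contains the "Probe succeeded" log message, non-zero otherwise.
probe_succeeded() {
  grep -q 'msg="Probe succeeded"'
}

# Example (not run here): check both new modules in one loop.
#   for module in http_insecure_skip_verify http_zerossl_ca; do
#     if curl -s "http://${blackbox_exporter_ip}:9115/probe?target=https://authentication.dev.ityoudao.cn&module=${module}&debug=true" | probe_succeeded; then
#       echo "${module}: ok"
#     else
#       echo "${module}: failed"
#     fi
#   done
```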
- Create a new Prometheus job that uses the http_zerossl_ca module
    - job_name: 'blackbox-http-zerossl-ca-d0-dev'
      honor_timestamps: true
      scrape_interval: 30s
      scrape_timeout: 5s
      metrics_path: /probe
      params:
        module: [http_zerossl_ca]
      static_configs:
        - targets:
          - https://authentication.dev.ityoudao.cn/
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: des
        - target_label: __address__
          replacement: blackbox-exporter:9115
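The relabel rules in this job copy the site URL into the target query parameter and the des label, then point __address__ at the exporter, so Prometheus scrapes blackbox-exporter rather than the site itself. As an illustration, the sketch below builds the roughly equivalent request by hand; `probe_url` is a hypothetical helper, and unlike Prometheus it does not URL-encode the parameters.

```shell
#!/usr/bin/env bash
# probe_url (hypothetical helper): print the /probe URL for a given exporter
# address, module, and target. No URL encoding is applied.
probe_url() {
  local exporter=$1 module=$2 target=$3
  printf 'http://%s/probe?module=%s&target=%s\n' "$exporter" "$module" "$target"
}

# Example (not run here):
#   curl -s "$(probe_url blackbox-exporter:9115 http_zerossl_ca https://authentication.dev.ityoudao.cn/)"
```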
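Once the Prometheus configuration took effect, the alerting system soon stopped reporting the failure.
This article used black-box probing to find that a website using a ZeroSSL certificate was failing in Prometheus's blackbox-exporter. Diagnosis showed that blackbox-exporter did not trust the certificate authority. The fix is to change the blackbox-exporter configuration so that it either skips verification of the server's CA certificate or is given the server's CA certificate explicitly. Building a custom Docker image, updating the blackbox-exporter service, and creating a new Prometheus job that uses the new module resolved the problem of blackbox-exporter not trusting the ZeroSSL CA.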