i run nginx/1.23.2 on linux
after a clean reboot, on first access to my site's front page, I see in the log
==> /var/log/nginx/example.com.443.error.log <==
2022/11/09 12:38:15 [info] 1460#1460: *2 SSL_do_handshake() failed (SSL: error:0A000412:SSL routines::sslv3 alert bad certificate:SSL alert number 42) while SSL handshaking, client: 2601:...:xxx1, server: [2600:...:xxx6]:443
if I immediately reload the page in the browser, the problem is gone; the page renders ok, SSL checks out, all site nav is fine
subsequent hits to the front page are also OK
i use letsencrypt certs.
digging around, i found this from 2013
Can't get OCSP stapling to work, despite openssl working fine
https://success.qualys.com/discussions/s/question/0D52L00004TnuFdSAJ/cant-get-ocsp-stapling-to-work-despite-openssl-working-fine
my config includes,
ssl_stapling on;
ssl_stapling_verify on;
ssl_stapling_responder http://r3.o.lencr.org/;
server {
ssl_trusted_certificate ...;
}
checking, after a cold reboot, the 1st connect returns no OCSP response
echo | openssl s_client -connect example.com:443 -servername example.com -tls1_3 -tlsextdebug -status
CONNECTED(00000003)
...
depth=0 CN = example.com
verify return:1
!! OCSP response: no response sent
...
---
SSL handshake has read 4384 bytes and written 318 bytes
Verification: OK
---
New, TLSv1.3, Cipher is TLS_CHACHA20_POLY1305_SHA256
Server public key is 384 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
DONE
but an immediately subsequent 2nd try returns a response
echo | openssl s_client -connect example.com:443 -servername example.com -tls1_3 -tlsextdebug -status
CONNECTED(00000003)
...
verify return:1
OCSP response:
======================================
OCSP Response Data:
OCSP Response Status: successful (0x0)
Response Type: Basic OCSP Response
Version: 1 (0x0)
Responder Id: C = US, O = Let's Encrypt, CN = R3
Produced At: Nov 9 17:09:00 2022 GMT
Responses:
Certificate ID:
Hash Algorithm: sha1
Issuer Name Hash: 48D...3D1
Issuer Key Hash: 142...2BC
Serial Number: 022...84E
Cert Status: good
This Update: Nov 9 17:00:00 2022 GMT
Next Update: Nov 16 16:59:58 2022 GMT
Signature Algorithm: sha256WithRSAEncryption
Signature Value:
09:...:cf
======================================
...
---
SSL handshake has read 4894 bytes and written 318 bytes
Verification: OK
---
New, TLSv1.3, Cipher is TLS_CHACHA20_POLY1305_SHA256
Server public key is 384 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
DONE
so far, this^^ is 100% reproducible for me; always/only on first load after boot
this 'feels' like a timeout before OCSP is cached, and no issues after.
not sure
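to make the two-try check above repeatable, here's a minimal sketch. `staple_status` and `probe_twice` are hypothetical helper names of my own, not part of openssl or nginx; the openssl flags are the same ones used above. it only looks for the two marker strings seen in the s_client output, so treat it as a rough check, not a validator.

```shell
#!/bin/sh
# Sketch: report whether a TLS handshake carried a stapled OCSP response.
# staple_status and probe_twice are hypothetical helpers for illustration.

# Read `openssl s_client -status` output on stdin and print one of:
# "missing", "stapled", or "unknown".
staple_status() {
    _out=$(cat)
    case $_out in
        *'OCSP response: no response sent'*) echo missing ;;
        *'OCSP Response Status: successful'*) echo stapled ;;
        *) echo unknown ;;
    esac
}

# Probe one host twice in a row, mirroring the manual test.
probe_twice() {
    for _try in 1 2; do
        printf 'try %s: ' "$_try"
        echo | openssl s_client -connect "$1:443" -servername "$1" \
            -tls1_3 -tlsextdebug -status 2>/dev/null | staple_status
    done
}

# Only touch the network when a hostname is passed as an argument.
if [ "$#" -gt 0 ]; then
    probe_twice "$1"
fi
```

run right after a cold reboot with the hostname as the only argument; if the behavior described above holds, i'd expect the first try to report missing and the second stapled.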
reading up at
https://nginx.org/en/docs/http/ngx_http_ssl_module.html
i see
ssl_stapling_responder
"Overrides the URL of the OCSP responder specified in the “Authority Information Access” certificate extension."
which i use, but also
ssl_ocsp_responder
"Overrides the URL of the OCSP responder specified in the “Authority Information Access” certificate extension for validation of client certificates. "
which I don't currently.
what's the difference in function/usage between those two?
As far as caching, I also see
ssl_ocsp_cache
which i haven't defined, so it's at default
ssl_ocsp_cache off
any clues as to what's missing/misconfig'd and responsible for the 1st-time-only fails I see?
an old, 2015 post from Caddy Webserver's author,
OCSP Stapling Robustness in Apache and nginx
https://gist.github.com/mholt/3b4910c802b2ed7e92294e26a1ae8551
comments,
"...
nginx's logic is a lot more robust than Apache's in this regard. Good OCSP responses are cached for an hour, but are not replaced until a successful new response has been received, meaning nginx can weather temporary OCSP responder outages. Unfortunately, nginx's logic is drastically worse in a different way: nginx kicks off OCSP queries on-demand, during the TLS handshake, but continues the handshake without waiting for the OCSP response to return. And since the OCSP response caches are unique per worker process, the first TLS connection handled by any given worker process never has a response stapled! (By the way, this makes testing whether you've properly enabled OCSP stapling rather annoying and confusing if you don't know about this.) This behavior also means that if a worker process sits idle for a long time, it doesn't refresh its OCSP responses and could staple an expired OCSP response on the next request it handles. [Update: the expired response issue is fixed in nginx 1.9.2. Now, if the cached OCSP response is expired, no response at all is stapled. A query to the OCSP responder is still initiated in the background, so subsequent handshakes should have a fresh stapled response.]
..."
that suggests an 'updated' (back then, as of v >= 1.9.2) behavior: no OCSP response on the 1st try, then a background-queried-and-cached ok response on subsequent ones.
which sounds like what i'm seeing.
> i run nginx/1.23.2 on linux
>
> after a clean reboot, on first access to my site's front page, I see in the log
>
> ==> /var/log/nginx/example.com.443.error.log <==
> 2022/11/09 12:38:15 [info] 1460#1460: *2 SSL_do_handshake() failed (SSL: error:0A000412:SSL routines::sslv3 alert bad certificate:SSL alert number 42) while SSL handshaking, client: 2601:...:xxx1, server: [2600:...:xxx6]:443
>
> if I immediately reload the page in the browser, the problem is gone; the page renders ok, SSL checks out, all site nav is fine
>
> subsequent hits to the front page are also OK
...
is that (still?) the current mode of operation in nginx's OCSP logic?
This 2012 post
Priming the OCSP cache in Nginx
https://unmitigatedrisk.com/?p=241
comments
"...
in Nginx 1.3.7, unfortunately architectural restrictions made it impractical to make it so that pre-fetching the OCSP response on server start-up so instead the first connection to the server primes the cache that is used for later connections.
This is a fine compromise but what if you really want the first connection to have the benefit too? Well there are two approaches you can take:
..."
where OCSP pre-fetching is a challenge that Cloudflare similarly took up in 2017 outside of its then-Nginx usage,
High-reliability OCSP stapling and why it matters
https://blog.cloudflare.com/high-reliability-ocsp-stapling/
Adding a post-start hook to the systemd unit,
edit /etc/systemd/system/nginx.service
+ ExecStartPost=/bin/bash /etc/nginx/scripts/ocsp_prefetch.sh
where the script
cat /etc/nginx/scripts/ocsp_prefetch.sh
iterates over the served domains, running
echo QUIT | openssl s_client -connect "${_thisDom}:443" -servername "${_thisDom}" -tls1_3 -tlsextdebug -status 2> /dev/null
does the trick. After a cold reboot, 1st hits to the site(s) no longer fail in-browser, and openssl s_client gets an OCSP response on the first query.
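For anyone wanting to try the same thing, here is a minimal sketch of what such a prefetch script could look like. It is not my exact script: taking the domains from the command line and the DRY_RUN switch are assumptions added for illustration; the openssl invocation is the one shown above.

```shell
#!/bin/bash
# Sketch of an OCSP warm-up script run after nginx starts.
# Assumptions: domains are passed as arguments, and DRY_RUN is a
# hypothetical switch that prints what would be done instead of connecting.

# Make one throwaway TLS handshake so nginx fetches and caches the
# OCSP response for this domain.
prefetch_ocsp() {
    _thisDom=$1
    if [ -n "${DRY_RUN:-}" ]; then
        echo "would prefetch ${_thisDom}"
        return 0
    fi
    echo QUIT | openssl s_client -connect "${_thisDom}:443" \
        -servername "${_thisDom}" -tls1_3 -tlsextdebug -status \
        >/dev/null 2>&1 || echo "prefetch failed for ${_thisDom}" >&2
}

# Iterate over the served domains given on the command line.
for _dom in "$@"; do
    prefetch_ocsp "$_dom"
done
```

With this variant the domains would be appended to the ExecStartPost line, or hardcoded in the loop. Going by the gist quoted earlier, the stapling cache is per worker process, so a single handshake per domain may warm only one worker; repeating the handshake a few times per domain might be needed with several workers.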
Is there a native nginx prefetch mechanism available in the current version?
I found this 7 yr old enhancement request,
Fetch OCSP responses on startup, and store across restarts
https://trac.nginx.org/nginx/ticket/812
which afaict wasn't resolved.