Hello everyone,
My nginx worker process segfaults frequently on this code path
(ngx_quic_create_stream).
Here are the observations I have made so far:
1. The faults happen with both tcmalloc and the system malloc, so the
allocator is not the issue.
2. master_process is on.
3. A single worker is enough to reproduce it.
4. HTTP/3 requests need to come in fairly frequently, at least 2 per second.
5. At least one HTTP/1.x or HTTP/2 request needs to be received as well,
on any port.
I would really appreciate any suggestions on where to continue
investigating this.
#0 0x00007ffb0b01876a in (anonymous namespace)::do_memalign(unsigned long, unsigned long) () from /lib64/libtcmalloc.so.4
#1 0x00007ffb0b037010 in tc_posix_memalign () from /lib64/libtcmalloc.so.4
#2 0x00000000005a7041 in ngx_memalign (alignment=alignment@entry=16, size=size@entry=16384, log=log@entry=0x60b44d8) at src/os/unix/ngx_alloc.c:57
#3 0x000000000058067c in ngx_create_pool (size=size@entry=16384, log=0x60b44d8) at src/core/ngx_palloc.c:23
#4 0x00000000005ca7f0 in ngx_quic_create_stream () at src/event/quic/ngx_event_quic_streams.c:685
#5 0x00000000005cb1f6 in ngx_quic_get_stream () at src/event/quic/ngx_event_quic_streams.c:458
#6 0x00000000005cc745 in ngx_quic_handle_stream_frame (c=c@entry=0x8417068, pkt=pkt@entry=0x7ffef1048250, frame=frame@entry=0x7ffef1048140) at src/event/quic/ngx_event_quic_streams.c:1265
#7 0x00000000005bd5f3 in ngx_quic_handle_frames (c=0x8417068, pkt=0x7ffef1048250) at src/event/quic/ngx_event_quic.c:1254
#8 0x00000000005bf022 in ngx_quic_handle_packet (pkt=0x7ffef1048250, conf=0x0, c=0x8417068) at src/event/quic/ngx_event_quic.c:850
#9 ngx_quic_handle_datagram (c=c@entry=0x8417068, b=0x7ffef1048480, conf=conf@entry=0x0) at src/event/quic/ngx_event_quic.c:700
#10 0x00000000005bff6b in ngx_quic_input_handler (rev=0x95965a0) at src/event/quic/ngx_event_quic.c:443
#11 0x00000000005c0884 in ngx_quic_recvmsg (ev=0x95963c0) at src/event/quic/ngx_event_quic_udp.c:195
#12 0x00000000005af0c8 in ngx_epoll_process_events (cycle=0x57aa050, timer=<optimized out>, flags=<optimized out>) at src/event/modules/ngx_epoll_module.c:901
#13 0x00000000005a2941 in ngx_process_events_and_timers (cycle=cycle@entry=0x57aa050) at src/event/ngx_event.c:251
#14 0x00000000005abd19 in ngx_worker_process_cycle (cycle=0x57aa050, data=<optimized out>) at src/os/unix/ngx_process_cycle.c:1135
#15 0x00000000005aa323 in ngx_spawn_process (cycle=cycle@entry=0x57aa050, proc=proc@entry=0x5abc00 <ngx_worker_process_cycle>, data=data@entry=0x15, name=name@entry=0xbc6066 "worker process", respawn=respawn@entry=-4) at src/os/unix/ngx_process.c:209
#16 0x00000000005ac798 in ngx_start_worker_processes (cycle=cycle@entry=0x57aa050, n=54, type=type@entry=-4, worker_spawn_start_index=worker_spawn_start_index@entry=0x0, workers_to_exclude=workers_to_exclude@entry=0x0) at src/os/unix/ngx_process_cycle.c:600
#17 0x00000000005ae069 in ngx_master_process_cycle (cycle=0x57aa050) at src/os/unix/ngx_process_cycle.c:424
#18 0x000000000057d19c in main (argc=1, argv=<optimized out>) at src/core/nginx.c:523
On Thu, Dec 21, 2023 at 7:35 AM Clima Gabriel
<clima.gabrielphoto at gmail.com> wrote:
> [...]
It's good you have debug symbols available.
A 'bt full' may be more helpful so parameter values are available in
the stack trace.
Jeff
Thanks.
I ended up using valgrind and got much closer to the answer.
Arguments I used:
valgrind --leak-check=full --show-leak-kinds=all --trace-children=yes \
    --track-origins=yes --verbose --log-file=valgrind-output.txt \
    /root/nginx/objs/nginx -c /etc/nginx/nginx.conf
That led me to find that the HTTP and QUIC requests somehow end up using the
same pointer to http_connection_t, which seems obviously wrong.
#0 ngx_SSL_early_cb_fn (s=0x55ae3ad5cae0, al=0x7fff1cf4c8f4, arg=0x7f3417bf4b00) at src/event/ngx_event_openssl.c:1949
#1 0x00007f3578bc2eba in ?? () from /lib/x86_64-linux-gnu/libssl.so.3
#2 0x00007f3578bb4b8c in ?? () from /lib/x86_64-linux-gnu/libssl.so.3
#3 0x00007f3578bb6608 in ?? () from /lib/x86_64-linux-gnu/libssl.so.3
#4 0x00007f3578b89d42 in SSL_read_early_data () from /lib/x86_64-linux-gnu/libssl.so.3
*#5 0x000055ae379f85cb in ngx_ssl_try_early_data (c=0x7f3417bf4b00) at src/event/ngx_event_openssl.c:2229*
#6 0x000055ae379f7f6f in ngx_ssl_handshake (c=0x7f3417bf4b00) at src/event/ngx_event_openssl.c:1998
#7 0x000055ae37a3f01d in ngx_http_ssl_handshake (rev=0x7f34175f33d0) at src/http/ngx_http_request.c:785
#8 0x000055ae379f0cca in ngx_epoll_process_events (cycle=0x55ae39d80390, timer=3101, flags=1) at src/event/modules/ngx_epoll_module.c:901
#9 0x000055ae379daacb in ngx_process_events_and_timers (cycle=0x55ae39d80390) at src/event/ngx_event.c:251
#10 0x000055ae379ed424 in ngx_worker_process_cycle (cycle=0x55ae39d80390, data=0x0) at src/os/unix/ngx_process_cycle.c:936
#11 0x000055ae379e8c49 in ngx_spawn_process (cycle=0x55ae39d80390, proc=0x55ae379ed2c3 <ngx_worker_process_cycle>, data=0x0, name=0x55ae37fa1cc5 "worker process", respawn=-3) at src/os/unix/ngx_process.c:209
#12 0x000055ae379ebf69 in ngx_start_worker_processes (cycle=0x55ae39d80390, n=1, type=-3) at src/os/unix/ngx_process_cycle.c:525
#13 0x000055ae379eb480 in ngx_master_process_cycle (cycle=0x55ae39d80390) at src/os/unix/ngx_process_cycle.c:279
#14 0x000055ae3799d8d1 in main (argc=3, argv=0x7fff1cf4e498) at src/core/nginx.c:489
(gdb) continue
Continuing.
Breakpoint 1, ngx_SSL_early_cb_fn (s=0x55ae3ad5cae0, al=0x7fff1cf4d214, arg=0x7f3417bf4b00) at src/event/ngx_event_openssl.c:1949
1949 ngx_SSL_early_cb_fn(SSL *s, int *al, void *arg) {
#0 ngx_SSL_early_cb_fn (s=0x55ae3ad5cae0, al=0x7fff1cf4d214, arg=0x7f3417bf4b00) at src/event/ngx_event_openssl.c:1949
#1 0x00007f3578bc2eba in ?? () from /lib/x86_64-linux-gnu/libssl.so.3
#2 0x00007f3578bb4b8c in ?? () from /lib/x86_64-linux-gnu/libssl.so.3
#3 0x00007f3578bb6608 in ?? () from /lib/x86_64-linux-gnu/libssl.so.3
*#4 0x000055ae37a1e1c4 in ngx_quic_crypto_input (c=0x7f3417bf4b00, data=0x7fff1cf4d680, level=ssl_encryption_initial) at src/event/quic/ngx_event_quic_ssl.c:412*
#5 0x000055ae37a1dffc in ngx_quic_handle_crypto_frame (c=0x7f3417bf4b00, pkt=0x7fff1cf4d880, frame=0x7fff1cf4d6e0) at src/event/quic/ngx_event_quic_ssl.c:358
#6 0x000055ae37a0af0f in ngx_quic_handle_frames (c=0x7f3417bf4b00, pkt=0x7fff1cf4d880) at src/event/quic/ngx_event_quic.c:1242
#7 0x000055ae37a0a744 in ngx_quic_handle_payload (c=0x7f3417bf4b00, pkt=0x7fff1cf4d880) at src/event/quic/ngx_event_quic.c:1054
#8 0x000055ae37a0a12b in ngx_quic_handle_packet (c=0x7f3417bf4b00, conf=0x55ae39e42ae8, pkt=0x7fff1cf4d880) at src/event/quic/ngx_event_quic.c:946
#9 0x000055ae37a09630 in ngx_quic_handle_datagram (c=0x7f3417bf4b00, b=0x55ae3ac65f30, conf=0x55ae39e42ae8) at src/event/quic/ngx_event_quic.c:700
#10 0x000055ae37a07eb1 in ngx_quic_run (c=0x7f3417bf4b00, conf=0x55ae39e42ae8) at src/event/quic/ngx_event_quic.c:204
#11 0x000055ae37aace43 in ngx_http_v3_init_stream (c=0x7f3417bf4b00) at src/http/v3/ngx_http_v3_request.c:75
#12 0x000055ae37a3db4a in ngx_http_init_connection (c=0x7f3417bf4b00) at src/http/ngx_http_request.c:329
#13 0x000055ae37a0c3cd in ngx_quic_recvmsg (ev=0x7f34175f30d0) at src/event/quic/ngx_event_quic_udp.c:339
#14 0x000055ae379f0cca in ngx_epoll_process_events (cycle=0x55ae39d80390, timer=5653, flags=1) at src/event/modules/ngx_epoll_module.c:901
#15 0x000055ae379daacb in ngx_process_events_and_timers (cycle=0x55ae39d80390) at src/event/ngx_event.c:251
#16 0x000055ae379ed424 in ngx_worker_process_cycle (cycle=0x55ae39d80390, data=0x0) at src/os/unix/ngx_process_cycle.c:936
#17 0x000055ae379e8c49 in ngx_spawn_process (cycle=0x55ae39d80390, proc=0x55ae379ed2c3 <ngx_worker_process_cycle>, data=0x0, name=0x55ae37fa1cc5 "worker process", respawn=-3) at src/os/unix/ngx_process.c:209
#18 0x000055ae379ebf69 in ngx_start_worker_processes (cycle=0x55ae39d80390, n=1, type=-3) at src/os/unix/ngx_process_cycle.c:525
#19 0x000055ae379eb480 in ngx_master_process_cycle (cycle=0x55ae39d80390) at src/os/unix/ngx_process_cycle.c:279
#20 0x000055ae3799d8d1 in main (argc=3, argv=0x7fff1cf4e498) at src/core/nginx.c:489
On Thu, Dec 21, 2023 at 5:03 PM Jeffrey Walton <noloader at gmail.com> wrote:
>
> It's good you have debug symbols available.
>
> A 'bt full' may be more helpful so parameter values are available in
> the stack trace.
>
> Jeff
> _______________________________________________
> nginx mailing list
> nginx at nginx.org
> https://mailman.nginx.org/mailman/listinfo/nginx
>
Hello!
On Fri, Dec 22, 2023 at 11:59:22AM +0200, Clima Gabriel wrote:
> Thanks.
> I ended up using valgrind and got much closer to the answer.
> Arguments I used:
> valgrind --leak-check=full --show-leak-kinds=all --trace-children=yes
> --track-origins=yes --verbose --log-file=valgrind-output.txt
> /root/nginx/objs/nginx -c /etc/nginx/nginx.conf
> That led me to find that the HTTP and QUIC requests somehow end up using the
> same pointer to http_connection_t, which seems obviously wrong.
This might be perfectly correct as long as the original connection
was closed.
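
To illustrate why a shared address can be harmless: nginx hands out
connection structures from a preallocated array and recycles them through
a free list, so a new connection can legitimately receive the address of
one that was just closed. The sketch below models that behaviour in plain
C. It is not nginx code; conn_t, pool_init, conn_get, conn_free and the
generation counter are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a connection object. */
typedef struct conn_s conn_t;

struct conn_s {
    conn_t *next_free;   /* link used only while the slot is on the free list */
    int     generation;  /* incremented on every reuse of this slot */
    void   *data;        /* per-connection protocol state */
};

#define POOL_SIZE 4

static conn_t  pool[POOL_SIZE];   /* all slots are allocated once, up front */
static conn_t *free_list;         /* head of the list of unused slots */

/* Reset every slot and push it onto the free list. */
static void pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool[i].generation = 0;
        pool[i].data = NULL;
        pool[i].next_free = free_list;
        free_list = &pool[i];
    }
}

/* Pop a slot from the head of the free list; NULL when none are left. */
static conn_t *conn_get(void)
{
    conn_t *c = free_list;

    if (c == NULL) {
        return NULL;
    }

    free_list = c->next_free;
    c->next_free = NULL;
    c->data = NULL;        /* state of the previous user must not leak through */
    c->generation++;
    return c;
}

/* Push a closed connection back onto the head of the free list. */
static void conn_free(conn_t *c)
{
    assert(c != NULL);
    c->next_free = free_list;
    free_list = c;
}
```

Calling pool_init(), then conn_get(), conn_free() on the result, and
conn_get() again yields the same address with a different generation.
Identical pointer values in two backtraces therefore do not by themselves
indicate a bug; trouble starts only when some code keeps using the pointer
after the connection it referred to has been closed.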
> #0 ngx_SSL_early_cb_fn (s=0x55ae3ad5cae0, al=0x7fff1cf4c8f4,
> arg=0x7f3417bf4b00) at src/event/ngx_event_openssl.c:1949
There is no such function in nginx, so it looks like you are using
some 3rd party modifications.
You may want to start with compiling vanilla nginx as available
from nginx.org without any 3rd party modules and/or patches and
testing if you are able to reproduce the problem.
[...]
--
Maxim Dounin
http://mdounin.ru/
Hello Maxim,
You're right.
Disabling the ssl-ja3 module was sufficient to stop the segfaults.
Thanks!
On Fri, Dec 22, 2023 at 4:14 PM Maxim Dounin <mdounin at mdounin.ru> wrote:
> [...]