Debugging Nginx Memory Spikes on Production Servers

L
  • 20 Sep '23
Are there any best practices or processes for debugging sudden memory
spikes in Nginx on production servers?  We have a few very high-traffic
servers that are encountering events where the Nginx process memory
suddenly spikes from around 300 MB to 12 GB before being shut down
by an out-of-memory termination script.  We don't have Nginx compiled with
debug mode and even if we did, I'm not sure that we could enable that
without overly taxing the server due to the constant high traffic load that
the server is under.  Since it's a server with public websites on it, I
don't know that we could filter the debug log to a single IP either.

Access, error, and info logs all seem to be pretty normal.  Internal
monitoring of the Nginx process doesn't suggest that there are major
connection spikes either.  Theoretically, it is possible that there is just
a very large sudden burst of traffic coming in that is hitting our rate
limits very hard and bumping the memory that Nginx is using until the OOM
termination process closes Nginx (which would prevent Nginx from logging
the traffic).  We just don't have a good way to see where the memory in
Nginx is being allocated when these sorts of spikes occur and are looking
for any good insight into how to go about debugging that sort of thing on a
production server.

Any insights into how to go about troubleshooting it?

-- 
Lance Dockins
M
  • 20 Sep '23
Hello,

Apparently you could look into dmesg. The OOM killer should have logged details about the killed process there.

You could also try starting nginx under gdb.

You could also log all requests and then, once the server has crashed, replay them to confirm that the crash is reproducible.

What does ChatGPT say?

Do you run the latest nginx version?

Any obscure modules / extensions?

Kind regards,
Manuel

> On 20.09.2023 at 18:56, Lance Dockins <lance at wordkeeper.com> wrote:
> [...]
M
  • 20 Sep '23
Hello!

On Wed, Sep 20, 2023 at 11:55:39AM -0500, Lance Dockins wrote:

> Are there any best practices or processes for debugging sudden memory
> spikes in Nginx on production servers? [...]
> 
> Any insights into how to go about troubleshooting it?

In no particular order:

- Make sure you are monitoring connection and request numbers as 
  reported by the stub_status module as well as memory usage.

- Check 3rd party modules you are using, if there are any - try 
  disabling them.

- If you are using subrequests, such as with SSI, make sure these 
  won't generate an enormous number of subrequests.

- Check your configuration for buffer sizes and connection limits, 
  and make sure that your server can handle maximum memory 
  allocation without invoking the OOM Killer, that is: 
  worker_processes * worker_connections * (total amount of various 
  buffers as allocated per connection).  If not, consider reducing 
  various parts of the equation.
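
As a purely illustrative example: 8 worker_processes * 10240 
worker_connections * ~150k of per-connection buffers works out to 
roughly 12G, the same order as the spikes you describe.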

Hope this helps.

-- 
Maxim Dounin
http://mdounin.ru/
L
  • 21 Sep '23
Thank you, Maxim.

I’ve been doing some testing since I reached out earlier, and I’m not sure whether I’m looking at a memory leak in Nginx/NJS or at some quirk in how memory stats are reported by Nginx. All I know is that my testing looks like a memory leak, and under the right conditions I've seen what appears to be a single Nginx worker process run away with its memory use until my OOM monitor terminates it (the runaway also seems to have some connection to file I/O). While trying to use some buffers for large file reads in NJS, I started noticing strange memory behavior in basic file operations.

To keep a long story short, I use NJS to control some elements of Nginx and it seems like any form of file I/O in NJS is causing NJS to leak memory. As it stands, I'm not really using many Nginx modules to begin with but to reduce the potential for 3rd party module problems, I recompiled Nginx with nothing but Nginx and NJS. I’m using Nginx 1.23.4 and NJS 0.8.1 but I’ve seen the same behavior with earlier versions of Nginx and NJS.

I’ve tried this with several different tests and I see the same thing in all variations. Any form of repeated file I/O “seems” to leak memory. Here is some sample code that I used in one test.

In the http block, I’ve imported a test.js script that I then use to set a variable with js_set:
js_import test.js;
js_set $test test.test;

At the top of the server block, after the minimum set of needed server definitions (server_name, etc):
if ($test = 1) { return 200; }

In the test.js file:

function test(r) {
    let i = 0;
    while (i < 500) {
        i++;
        r.log(njs.memoryStats.size);
    }
    return 1;
}

export default {test};

Checking the memory use in the info logs after this shows the following.

Start of loop:
2023/09/20 21:42:15 [info] 1394272#1394272: *113 js: 32120
2023/09/20 21:42:15 [info] 1394272#1394272: *113 js: 40312

End of loop:
2023/09/20 21:42:15 [info] 1394272#1394272: *113 js: 499064
2023/09/20 21:42:15 [info] 1394272#1394272: *113 js: 499064

If you increase the loop to a higher number of iterations, it just keeps going. Here’s the end of the loop at 10,000 iterations:
2023/09/20 21:57:04 [info] 1404965#1404965: *4 js: 4676984
2023/09/20 21:57:04 [info] 1404965#1404965: *4 js: 4676984

The moment that I move the r.log statements out of the loop, the start/end memory use appears to be about the same as the start-of-loop memory above, so this seems to correlate with the amount of data being written to the log. Given that Nginx log writes are supposed to be buffered according to the Nginx docs, I would expect the maximum memory used during log writes to cap out at some much lower value. We’re not specifying a buffer size, so the default of 64k should apply here, but by the end of the test loop above we’re sitting at either 0.5 MB or 4.6 MB depending on which loop size (500 vs 10,000) we’re looking at.
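
For comparison, here’s the variant that stays flat (logging once after the loop instead of inside it):

function test(r) {
    let i = 0;
    while (i < 500) {
        i++;
    }
    // Logged once here, the reported size stays near the starting ~32k.
    r.log(njs.memoryStats.size);
    return 1;
}

export default {test};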

The problem is that I am actually trying to sort out a memory issue that I think has to do with large file reads rather than writes. Since I’m getting this sort of high memory use when just writing to log files to test things out, it makes it appear as if the problem affects both file reads and file writes, so I have no idea whether buffered file reads are using less memory than reading the entire file into memory. A buffered read “should” use less total memory. But since the end memory stats in any test that I do look the same either way, I can’t tell.

I’ve seen the exact same memory behavior with fs.appendFileSync. So regardless of whether I use r.log, r.error, or fs.appendFileSync to write to some file that isn’t a default Nginx log file, I’m getting this output that suggests a memory leak. So it’s not specific to log file writes.

I realize that these test cases aren’t necessarily realistic as large batches of file writes (or just large file writes) from NJS are likely going to be far less common than large file reads. But either way, whether it’s a large file read that isn’t constricting its memory footprint to the buffer that it’s assigned or whether it’s file writes doing the same, it seems like a problem.

So I guess my question at the moment is whether endless memory use growth being reported by njs.memoryStats.size after file writes is some sort of false positive tied to quirks in how memory use is being reported or whether this is indicative of a memory leak? Any insight would be appreciated.

—
Lance Dockins

> On Wednesday, Sep 20, 2023 at 2:07 PM, Maxim Dounin <mdounin at mdounin.ru> wrote:
>
> [...]
D
  • 21 Sep '23
On 20.09.2023 20:37, Lance Dockins wrote:
> So I guess my question at the moment is whether endless memory use 
> growth being reported by njs.memoryStats.size after file writes is 
> some sort of false positive tied to quirks in how memory use is being 
> reported or whether this is indicative of a memory leak?  Any insight 
> would be appreciated.

Hi Lance,
The reason njs.memoryStats.size keeps growing is that NJS uses an arena 
memory allocator tied to the current request, and a new object 
representing the memoryStats structure is returned every time 
njs.memoryStats is accessed. Currently NJS does not free most of its 
internal objects and structures until the current request is destroyed, 
because it is not intended for long-running code.
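
A minimal illustration (assuming a js_content handler; each access to 
njs.memoryStats builds a fresh object from the request's arena, so 
repeated reads nudge the counter upward even without any file I/O):

function demo(r) {
    var first = njs.memoryStats.size;
    var second = njs.memoryStats.size; // typically >= first
    r.return(200, 'first: ' + first + ', second: ' + second);
}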

Regarding the sudden memory spikes, please share some details about the 
JS code you are using.
One place to start is to analyze the amount of traffic that goes to the 
NJS locations and what exactly those locations do.
L
  • 21 Sep '23
Thank you, Dmitry.

One question before I describe what we are doing with NJS. I did read about the VM handling process before switching from Lua to NJS and it sounded very practical, but my current understanding is that there could be multiple VMs instantiated for a single request. A js_set, js_content, and js_header_filter directive that applies to a single request, for example, would instantiate 3 VMs, and were you to set multiple variables with js_set, the VM count would keep growing. My original understanding was that those VMs would be destroyed once they exited, so even with multiple VMs instantiated per request, the memory impact would not be cumulative within a single request. Is that understanding correct? Or are you saying that each VM accumulates more and more memory until the entire request completes?

As far as how we’re using NJS, we’re mostly using it for header filters, internal redirection, and access control. So there really shouldn’t be a threat to memory in most instances unless we’re not just dealing with a single request memory leak inside of a VM but also a memory leak that involves every VM that NJS instantiates just accumulating memory until the request completes.

Right now, my working theory about what is most likely to be creating the memory spikes has to do with POST body analysis. Unfortunately, some of the requests that I have to deal with are POSTs that have to either be denied access or routed differently depending on the contents of the POST body. Unfortunately, these same routes can vary in the size of the POST body and I have no control over how any of that works because the way it works is controlled by third parties. One of those third parties has significant market share on the internet so we can’t really avoid dealing with it.

In any case, before we switched to NJS, we were using Lua to do the same things and that gave us the advantage of doing both memory cleanup if needed and also doing easy analysis of POST body args. I was able to do this sort of thing with Lua before:
local post_args, post_err = ngx.req.get_post_args()
if post_args.arg_name == something then

But in NJS, there’s no such POST body utility, so I had to write my own. The code that I use to parse out the POST body works for both URL-encoded POST bodies and multipart POST bodies, but it has to read the entire POST into a variable before I can use it. For small POSTs, that’s not a problem. For larger POSTs that contain a big attachment, it would be. Ultimately, I only care about the string key/value pairs for my purposes (not file attachments), so I was hoping to discard attachment data while parsing the body. I think that that is actually how Lua’s version of this works too. So my next thought was that I could use a Buffer and fs.readSync to read the POST body in buffer-sized frames to keep memory minimal, so that I could discard any file attachments from the POST body and just evaluate the key/value data that uses simple strings. But from what you’re saying, it sounds like there’s basically no difference between fs.readSync with a Buffer and fs.readFileSync in terms of actual memory use. So either way, with a large POST body, you’d be steamrolling the memory use in a single Nginx worker process. When I had to deal with stuff like this in Lua, I’d just run collectgarbage() to clean up memory and it seemed to work fine. But then I also wasn’t having to parse out the POST body myself in Lua either.
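
For the URL-encoded case, the gist of what I’m doing looks roughly like this (a condensed js_content-style sketch; the @backend location name and arg values are made up):

import qs from 'querystring';

function route(r) {
    // Only parse bodies that are plain key/value string data.
    let ct = r.headersIn['Content-Type'] || '';
    if (ct.startsWith('application/x-www-form-urlencoded') && r.requestText) {
        // maxKeys bounds the parse on abusive inputs.
        let args = qs.parse(r.requestText, '&', '=', {maxKeys: 64});
        if (args.arg_name === 'something') {
            r.return(403);
            return;
        }
    }
    r.internalRedirect('@backend');
}

export default {route};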

It’s possible that something else is going on other than that. qs.parse seems like it could get us into some trouble if the query_string that was passed was unusually long too, from what you’re saying about how memory is handled. None of the situations that I’m handling are long-running requests. They’re all designed for very fast requests that come into the servers that I manage on a constant basis.

If you can shed some light on how VMs and their memory are handled, per my question above, and any insights into what to do about this type of situation, that would help a lot. I don’t know if there are any plans to offer a POST body parsing feature in NJS for those who need to evaluate POST body data like Lua did, but if there were some way to do that at the Nginx layer instead of at the NJS layer, it seems like that could be a lot more sensitive to memory use. Right now, if my understanding is correct, the only option that I’d even have would be to just stop doing POST body handling if the POST body is above a certain total size. I guess if there were some way to forcibly free memory, that would help too. But I don’t think that that is as common of a problem as having to deal with very large query strings that some third party appends to a URL (probably maliciously) and/or a very large file upload attached to a multipart POST. So the only remaining concern that I’d have about memory, in a situation where I don’t have to worry about memory when parsing a larger file, would be if multiple js_sets and such would just keep spawning VMs and accumulating memory during a single request.

Any thoughts?

—
Lance Dockins

> On Thursday, Sep 21, 2023 at 1:45 AM, Dmitry Volyntsev <xeioex at nginx.com> wrote:
>
> [...]
D
  • 21 Sep '23
On 9/21/23 6:50 AM, Lance Dockins wrote:

Hi Lance,

See my comments below.

> Thank you, Dmitry.
>
> One question before I describe what we are doing with NJS.  [...] my 
> current understanding is that there could be multiple VMs instantiated 
> for a single request.  A js_set, js_content, and js_header_filter 
> directive that applies to a single request, for example, would 
> instantiate 3 VMs.  [...]
>
This is not correct: for js_set, js_content, and js_header_filter there 
is only a single VM.
internalRedirect() is the exception, because a VM does not survive it, 
but the previous VMs will not be freed until the current request is 
finished. BTW, a VM instance itself is pretty small (~2kb), so it 
should not be a problem if you have a reasonable number of redirects.

>
> [...] But in NJS, there’s no such POST body utility, so I had to write my 
> own.  [...] Ultimately, I only care about the string key/value pairs 
> for my purposes (not file attachments), so I was hoping to discard 
> attachment data while parsing the body.
>
Thank you for the feedback, I will add it to the future feature list.

>
> [...] qs.parse seems like it could get us into some trouble if the 
> query_string that was passed was unusually long too, from what you’re 
> saying about how memory is handled.
>
For qs.parse() there is a limit on the number of arguments, which you 
can specify.
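
E.g. (assuming the Node-style arguments that njs implements):

import qs from 'querystring';

// Parse at most 2 keys; later pairs are dropped.
var args = qs.parse('a=1&b=2&c=3', '&', '=', {maxKeys: 2});
// args should be {a: '1', b: '2'}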

L
  • 21 Sep '23
That’s good info. Thank you.

I have been doing some additional testing since my email last night and I have seen enough evidence to believe that file I/O in NJS is basically the source of the memory issues. I did some testing with very basic commands like readFileSync and Buffer + readSync and in all cases, the memory footprint when doing file handling in NJS is massive.

Just doing this:

import fs from 'fs';

let content = fs.readFileSync('/path/to/file');
let parts = content.split(boundary);

Resulted in memory use that was close to a minimum of 4-8x the size of the file during my testing. We do have an upper bound on files that can be uploaded, and that does contain this somewhat, but it’s not hard for a larger request that is 99% file attachment to use exorbitant amounts of memory. I actually tried doing a Buffer + readSync variation on the same thing, and the memory footprint was actually FAR FAR worse when I did that.

The 4-8x minimum memory commit seems like a problem to me just generally. But the fact that readSync doesn’t seem to be any better on memory (much worse, actually) basically means that NJS is only safe to use for processing smaller files (or POST bodies) right now. There’s just no good way to keep data that you don’t care about in a file from occupying excessive amounts of memory that can’t be reclaimed. If there is no way to improve the memory footprint when handling files (or big strings), no memory-conservative way to stream a file through some sort of buffer, and no first-party utility for providing parsed POST bodies right now, then it might be worth the time to put some notes in the NJS docs saying that the fs module may not be appropriate for larger files (e.g. files over 1mb).

For what it’s worth, I’d also love to see some examples of how to properly use fs.readSync in the NJS examples docs. There really wasn’t much out there for that for NJS (or even in a lot of the Node docs) so I can’t say that my specific test implementation for that was ideal. But that’s just above and beyond the basic problems that I’m seeing with memory use with any form of file I/O at all (since the memory problems seem to be persistent whether doing reads or even log writes).
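
For reference, this is roughly the chunked-read pattern I was attempting (a sketch only; it assumes njs’s fs.openSync/fs.readSync/fs.closeSync follow the Node signatures I was calling them with):

import fs from 'fs';

function readInChunks(path, chunkSize) {
    // Reuse one Buffer so that, in theory, only chunkSize bytes of the
    // file are live at any moment.
    let fd = fs.openSync(path, 'r');
    let buf = Buffer.alloc(chunkSize);
    let total = 0;
    try {
        let n;
        while ((n = fs.readSync(fd, buf, 0, chunkSize, null)) > 0) {
            // Inspect buf.subarray(0, n) here, then let it go.
            total += n;
        }
    } finally {
        fs.closeSync(fd);
    }
    return total;
}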

—
Lance Dockins

> On Thursday, Sep 21, 2023 at 5:01 PM, Dmitry Volyntsev <xeioex at nginx.com> wrote:
>
> [...]
D
  • 22 Sep '23
On 9/21/23 4:41 PM, Lance Dockins wrote:
> That’s good info.  Thank you.
>
> I have been doing some additional testing [...] and in all 
> cases, the memory footprint when doing file handling in NJS is massive.
>
> Just doing this:
>
> let content = fs.readFileSync('/path/to/file');
> let parts = content.split(boundary);
>
> Resulted in memory use that was close to a minimum of 4-8x the size of 
> the file during my testing.  [...]
>
Regarding the task at hand, do you check the Content-Type of the POST 
body? That way you could exclude anything except 
application/x-www-form-urlencoded. At least from what I see in Lua, the 
handler only looks for application/x-www-form-urlencoded, not 
multipart/form-data.

https://github.com/openresty/lua-nginx-module/blob/c89469e920713d17d703a5f3736c9335edac22bf/src/ngx_http_lua_args.c#L171

>  I actually tried doing a Buffer + readSync variation on the same 
> thing and the memory footprint was actually FAR FAR worse when I did that.
>
>
>
As of now, the resulting memory consumption will depend heavily on the 
boundary.

In the worst case, a 1mb file split into an array of one-character 
strings will consume ~16x the memory, because every one-byte character 
is put into its own njs_value_t structure.

With larger chunks the situation is less extreme. Right now we are 
implementing a way to deduplicate identical strings, which may help in 
some situations.
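
For illustration (reusing the content variable from your example; the 
boundary string here is made up):

// A realistic multipart boundary yields a handful of large parts:
let parts = content.split('--------------------------d74496d66958873e');

// A one-byte separator yields roughly content.length tiny strings,
// each carried by its own njs_value_t:
let chars = content.split('\n');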

L
  • 22 Sep '23
I am checking the content type, yes, but in my case I’m just switching between body parsing methodologies depending on the body type. I do actually have a few scenarios where I have to evaluate body data in multipart submissions, but I only need the parts of the multipart form that do not include attachments (basic key/value string data). I can’t say for sure that this is the common use case for POST body parsing, but when I’ve read the Lua GitHub issues and various examples and discussions in the past, it has always seemed to me like the only common use cases were for URL-encoded POST bodies or for the plain key/value string data (not attachments) in multipart bodies. I can’t say that I’ve seen many discussions asking for access to the file attachment data in POST bodies - just for what it is worth.

I did notice that the boundary had an effect on memory. It seems like memory is reasonably contained as long as we’re talking about an actual multipart boundary. When it’s a single character or something similarly small, the memory use is extreme, but that’s partly inherent to splitting data that way.

For now, I think that I’m just going to have to use a workaround that limits when POST body parsing triggers. There’s just no way to do it at all under certain conditions right now.
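
Roughly, the guard looks like this (the location name is hypothetical):

function route(r) {
    // Skip POST body inspection entirely above a size cutoff.
    let len = parseInt(r.headersIn['Content-Length'] || '0', 10);
    if (len > 1048576) { // 1mb cap; tune as needed
        r.internalRedirect('@skip_body_checks');
        return;
    }
    // ...normal POST body parsing path...
}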

Thank you for all of your feedback and work on NJS, and for filing the POST body parsing idea as a feature request.

—
Lance Dockins

> On Thursday, Sep 21, 2023 at 7:47 PM, Dmitry Volyntsev <xeioex at nginx.com (mailto:xeioex at nginx.com)> wrote:
>
> On 9/21/23 4:41 PM, Lance Dockins wrote:
> > That’s good info. Thank you.
> >
> > I have been doing some additional testing since my email last night
> > and I have seen enough evidence to believe that file I/O in NJS is
> > basically the source of the memory issues. I did some testing with
> > very basic commands like readFileSync and Buffer + readSync and in all
> > cases, the memory footprint when doing file handling in NJS is massive.
> >
> > Just doing this:
> >
> > let content = fs.readFileSync(path/to//file);
> > let parts = content.split(boundary);
> >
> > Resulted in memory use that was close to a minimum of 4-8x the size of
> > the file during my testing. We do have an upper bound on files that
> > can be uploaded and that does contain this somwhat but it’s not hard
> > for a larger request that is 99% file attachment to use exhorbitant
> > amounts of memory.
> >
> >
> >
> Regarding the task at hand, do you check for Content-Type of the POST
> body? So you can exclude anything except probably
> application/x-www-form-urlencoded. At least what I see in lua: the
> handler is only looking for application/x-www-form-urlencoded and not
> multipart/form-data.
>
> https://github.com/openresty/lua-nginx-module/blob/c89469e920713d17d703a5f3736c9335edac22bf/src/ngx_http_lua_args.c#L171
>
>
> > I actually tried doing a Buffer + readSync variation on the same
> > thing and the memory footprint was actually FAR FAR worse when I did that.
> >
> >
> >
> As of now, the resulting memory consumption will depend heavily on the
> boundary.
>
> In worst case, for 1mb of memory file that is split into 1 character
> array, You will get ~16x memory consumed, because every 1 byte character
> will be put into a njs_value_t structure.
>
> With larger chunks the situation will be less extreme. Right now we are
> implementing a way to deduplicate identical strings, this may help in
> some situations.
>
> > The 4-8x minimum memory commit seems like a problem to me just
> > generally. But the fact that readSync doesn’t seem to be any better
> > on memory (much worse actually) basically means that NJS is only safe
> > to use for processing smaller files (or POST bodies) right now.
> > There’s just no good way to keep data that you don’t care about in a
> > file from occupying excessive amounts of memory that can’t be
> > reclaimed. If there is no way to improve the memory footprint when
> > handling files (or big strings), no memory conservative way to stream
> > a file through some sort of buffer, and no first-party utility for
> > providing parsed POST bodies right now,
> > then it might be worth the time to put some notes in the NJS docs that
> > the fs module may not be appropriate for larger files (e.g. files over
> > 1mb).
> >
> > For what it’s worth, I’d also love to see some examples of how to
> > properly use fs.readSync in the NJS examples docs. There really
> > wasn’t much out there for that for NJS (or even in a lot of the Node
> > docs) so I can’t say that my specific test implementation for that was
> > ideal. But that’s just above and beyond the basic problems that I’m
> > seeing with memory use with any form of file I/O at all (since the
> > memory problems seem to be persistent whether doing reads or even log
> > writes).
> >
> > —
> > Lance Dockins
> >
> >
> > On Thursday, Sep 21, 2023 at 5:01 PM, Dmitry Volyntsev
> > <xeioex at nginx.com> wrote:
> >
> > On 9/21/23 6:50 AM, Lance Dockins wrote:
> >
> > Hi Lance,
> >
> > See my comments below.
> >
> > > Thanky you, Dmitry.
> > >
> > > One question before I describe what we are doing with NJS. I did
> > > read
> > > about the VM handling process before switching from Lua to NJS
> > > and it
> > > sounded very practical but my current understanding is that there
> > > could be multiple VM’s instantiated for a single request. A js_set,
> > > js_content, and js_header_filter directive that applies to a single
> > > request, for example, would instantiate 3 VMs. And were you to need
> > > to set multiple variables with js_set, then keep adding to that #
> > > of VMs.
> > >
> > >
> > This is not correct. For js_set, js_content and js_header_filter
> > there
> > is only a single VM.
> > The internalRedirect() is the exception, because a VM does not
> > survive
> > it, but the previous VMs will not be freed until current request is
> > finished. BTW, a VM instance itself is pretty small in size (~2kb)
> > so it
> > should not be a problem if you have a reasonable number of redirects.
> >
> >
> > >
> > > My original understanding of that was that those VMs would be
> > > destroyed once they exited so even if you had multiple VMs
> > > instantiated per request, the memory impact would not be
> > > cumulative in
> > > a single request. Is that understanding correct? Or are you saying
> > > that each VM accumulates more and more memory until the entire
> > > request
> > > completes?
> > >
> > > As far as how we’re using NJS, we’re mostly using it for header
> > > filters, internal redirection, and access control. So there really
> > > shouldn’t be a threat to memory in most instances unless we’re not
> > > just dealing with a single request memory leak inside of a VM but
> > > also
> > > a memory leak that involves every VM that NJS instantiates just
> > > accumulating memory until the request completes.
> > >
> > > Right now, my working theory about what is most likely to be
> > > creating
> > > the memory spikes has to do with POST body analysis. Unfortunately,
> > > some of the requests that I have to deal with are POSTs that have to
> > > either be denied access or routed differently depending on the
> > > contents of the POST body. Unfortunately, these same routes can
> > > vary
> > > in the size of the POST body and I have no control over how any of
> > > that works because the way it works is controlled by third parties.
> > > One of those third parties has significant market share on the
> > > internet so we can’t really avoid dealing with it.
> > >
> > > In any case, before we switched to NJS, we were using Lua to do the
> > > same things and that gave us the advantage of doing both memory
> > > cleanup if needed and also doing easy analysis of POST body args. I
> > > was able to do this sort of thing with Lua before:
> > > local post_args, post_err = ngx.req.get_post_args()
> > > if post_args.arg_name == something then
> > >
> > > But in NJS, there’s no such POST body utility so I had to write my
> > > own. The code that I use to parse out the POST body works for both
> > > URL encoded POST bodies and multipart POST bodies, but it has to
> > > read
> > > the entire POST into a variable before I can use it. For small
> > > POSTs,
> > > that’s not a problem. For larger POSTs that contain a big
> > > attachment,
> > > it would be. Ultimately, I only care about the string key/value
> > > pairs
> > > for my purposes (not file attachments) so I was hoping to discard
> > > attachment data while parsing the body.
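> > >
> > > (For the URL-encoded case, the shape of it is roughly this - an
> > > untested sketch, assuming r.requestText is available because the
> > > body was read into memory:)
> > >
> > > import qs from 'querystring';
> > >
> > > function postArgs(r) {
> > >     var ct = r.headersIn['Content-Type'] || '';
> > >     // skip anything that is not a plain urlencoded form
> > >     if (!ct.startsWith('application/x-www-form-urlencoded')) {
> > >         return {};
> > >     }
> > >     return qs.parse(r.requestText || '');
> > > }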
> > >
> > >
> > >
> > Thank you for the feedback, I will add it to a future feature
> > list.
> >
> > > I think that that is actually how Lua’s version of this works too.
> > > So my next thought was that I could use a Buffer and fs.readSync to
> > > read the POST body in buffer frames to keep memory minimal so that I
> > > could discard any file attachments from the POST body and
> > > just evaluate the key/value data that uses simple strings. But from
> > > what you’re saying, it sounds like there’s basically no difference
> > > between fs.readSync w/ a Buffer and fs.readFileSync in terms of
> > > actual
> > > memory use. So either way, with a large POST body, you’d be
> > > steamrolling the memory use in a single Nginx worker thread. When I
> > > had to deal with stuff like this in Lua, I’d just run
> > > collectgarbage()
> > > to clean up memory and it seemed to work fine. But then I also
> > > wasn’t
> > > having to parse out the POST body myself in Lua either.
> > >
> > > It’s possible that something else is going on other than that.
> > > qs.parse seems like it could get us into some trouble if the
> > > query_string that was passed was unusually long too from what you’re
> > > saying about how memory is handled.
> > >
> > >
> > for qs.parse() there is a limit on the number of arguments, which
> > you can
> > specify.
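> >
> > for example (a sketch; assuming the Node-style options argument):
> >
> > // cap how many key/value pairs qs.parse() will produce
> > var args = qs.parse(body, '&', '=', { maxKeys: 32 });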
> >
> > >
> > > None of the situations that I’m handling are for long running
> > > requests. They’re all designed for very fast requests that come
> > > into
> > > the servers that I manage on a constant basis.
> > >
> > > If you can shed some light on the way that VM’s and their memory are
> > > handled per my question above and any insights into what to do about
> > > this type of situation, that would help a lot. I don’t know if
> > > there
> > > are any plans to offer a POST body parsing feature in NJS for those
> > > that need to evaluate POST body data like how Lua did it, but if
> > > there
> > > was some way to be able to do that at the Nginx layer instead of at
> > > the NJS layer, it seems like that could be a lot more sensitive to
> > > memory use. Right now, if my understanding is correct, the only
> > > option that I’d even have would be to just stop doing POST body
> > > handling if the POST body is above a certain total size. I guess if
> > > there was some way to forcibly free memory, that would help too.
> > > But
> > > I don’t think that that is as common of a problem as having to deal
> > > with very large query strings that some third party appends to a URL
> > > (probably maliciously) and/or a very large file upload attached to a
> > > multipart POST. So the only concern that I’d have about memory in a
> > > situation where I don’t have to worry about memory when parsing a
> > > larger file would be if multiple js_sets and such would just keep
> > > spawning VMs and accumulating memory during a single request.
> > >
> > > Any thoughts?
> > >
> > > —
> > > Lance Dockins
> > >
> > >
> > > On Thursday, Sep 21, 2023 at 1:45 AM, Dmitry Volyntsev
> > > <xeioex at nginx.com> wrote:
> > >
> > > On 20.09.2023 20:37, Lance Dockins wrote:
> > > > So I guess my question at the moment is whether endless memory use
> > > > growth being reported by njs.memoryStats.size after file writes is
> > > > some sort of false positive tied to quirks in how memory use is
> > > > being
> > > > reported or whether this is indicative of a memory leak? Any
> > > > insight
> > > > would be appreciated.
> > >
> > > Hi Lance,
> > > The reason njs.memoryStats.size keeps growing is because NJS uses
> > > an arena
> > > memory allocator linked to the current request, and a new object
> > > representing the memoryStats structure is returned every time
> > > njs.memoryStats is accessed. Currently NJS does not free most of the
> > > internal objects and structures until the current request is
> > > destroyed
> > > because it is not intended for long-running code.
> > >
> > > Regarding the sudden memory spikes, please share some details
> > > about JS
> > > code you are using.
> > > One place to look is to analyze the amount of traffic that goes to
> > > NJS
> > locations and what exactly those locations do.
> > >
L
  • 26 Sep '23
Dmitry,

I’ve been testing this more and I think that there’s more going on here than I originally thought. I have some js_set code that I have to run to properly route and filter requests, and I noticed that it was consuming around 1mb per request (which would then carry over to the later header filters and anything else that NJS had to do). Given that it’s just a bunch of if/then statements, that seemed very odd to me. In very concurrent, high-traffic environments, that’s a little much for extra memory use - particularly in environments where you’ll have a mix of requests to slower upstreams or bigger file downloads. Since it’s hard to get a totally accurate read on memory use, I ran two tests - one measuring how much memory a single if statement used, and one over 10 if statements (averaging out the reported memory increase).

From what I can see, it looks like basic statements like these:

if (r.variables.some_variable.match(/someregex or string/i)) {
    // block content
}

Each such “if” or “match” statement consumes about 17k to 20k at minimum (even if there is nothing in the block content). I’m fairly certain that it’s mostly the string.match statements that are causing this. Since that memory isn’t freed, it just accumulates.

If you have a lot of if/then statements (particularly with string matches), it’s pretty easy to burn through 1mb of memory that then carries through the entire request even for very small files unless you do a forced internalRedirect to regenerate the VM.

Obviously variable declarations and assignments are going to consume more memory. That’s to be expected. But for if statements and string matches to use (but not free) memory seems like a bit of a problem. If you have a complex routing table with a lot of if/thens, it’s going to create some problems (particularly if you have to do more than one js_set or add in something like js_content or header filters as well). To be clear, I’m not substituting NJS scripting for location blocks. The only way to replicate the logic that I need with core Nginx directives is through complex use of the Nginx if directive (and even that doesn’t fully replicate it).

Up until now, I had assumed that string.match types of statements were just transparently calling PCRE behind the scenes so that the associated memory from the PCRE call was being freed after use. Maybe that’s not even an accurate read on how Nginx is using PCRE but that’s how I envisioned it.

In any case, that led me to a few questions.
1. How does keepalive affect these VMs? I realize that keepalive is for the connection rather than the request, but I still wanted to confirm that VMs are not sticking around occupying memory after the request completes due to keepalive settings. We have a mix of fast requests and long-running requests (e.g. 5-10s), so if enough long-running requests build up at the same time that a flood of very fast requests comes in, NJS could certainly burn out a lot of memory under the current conditions as I understand them. If there were a lot of large file downloads, those would also occupy memory for the entirety of the download if I understand correctly.
2. Is it possible to adjust the NJS codebase to free memory for the actual if condition and for string matches after they’ve run, as long as they didn’t include a variable assignment inside of the actual condition? I’m not talking about whatever is in the block content - just the actual condition. Given how likely it is for people to use basic if conditions and string matches in NJS, that seems like it might be a stopgap measure that reduces the memory footprint without having to build full garbage collection.
3. Is there any workaround for this type of memory problem other than just forcing the use of an internalRedirect to drop the existing VM and create a new one (see the sketch below)?
4. Could it make sense to allow for a directive that would force the destruction and regeneration of a new JS VM? That wouldn’t solve the memory leak that builds to 1mb per request, but it would shorten its lifetime (which could ease memory pressure in situations where long-running requests are holding requests open).
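
For reference, the internalRedirect workaround in question 3 is just something like this ('@fallback' being a placeholder for whatever named location should finish the request - a sketch only):

function gate(r) {
    // ...all of the expensive matching happens up here...

    // restart processing in a fresh location; per your earlier note,
    // the old VM's memory still isn't freed until the whole request
    // finishes, so this only stops the active VM from growing further
    r.internalRedirect('@fallback');
}

export default { gate };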

Thanks again for all of your feedback.

—
Lance Dockins

> On Friday, Sep 22, 2023 at 8:34 AM, Me <lance at wordkeeper.com (mailto:lance at wordkeeper.com)> wrote:
> I am checking the content type, yes. But in my case, I’m just switching between body parsing methodologies depending on the body type. I do actually have a few scenarios where I have to evaluate body data in multipart submissions, but I only actually need the parts of the multipart form that are not including attachments (basic key/value string data). I can’t say for sure that this is the common use case for POST body parsing but when I’ve read the Lua GitHub issues and various examples and discussions in the past, it has always seemed to me like the only common use cases were for url encoded POST bodies OR for the plain key/value string data (not attachments) in multipart bodies. I can’t say that I’ve seen many discussions that were asking for access to the file attachment data in POST bodies - just for what it is worth.
>
> I did notice that the boundary had an effect on memory. It seems like memory is sort of contained as long as we’re talking about an actual multipart boundary. When it’s a single char or something otherwise smaller, the memory use is extreme but that’s partly inherent to even splitting data that way.
>
> For now, I think that I’m just going to have to use a workaround that limits when POST body parsing triggers. There’s just no way to do it at all under certain conditions right now.
>
> Thank you for all of your feedback and work on NJS and for filing the POST body provision in NJS into a feature request.
>
> —
> Lance Dockins
>
> > On Thursday, Sep 21, 2023 at 7:47 PM, Dmitry Volyntsev <xeioex at nginx.com (mailto:xeioex at nginx.com)> wrote:
> >
> > On 9/21/23 4:41 PM, Lance Dockins wrote:
> > > That’s good info. Thank you.
> > >
> > > I have been doing some additional testing since my email last night
> > > and I have seen enough evidence to believe that file I/O in NJS is
> > > basically the source of the memory issues. I did some testing with
> > > very basic commands like readFileSync and Buffer + readSync and in all
> > > cases, the memory footprint when doing file handling in NJS is massive.
> > >
> > > Just doing this:
> > >
> > > let content = fs.readFileSync('/path/to/file');
> > > let parts = content.split(boundary);
> > >
> > > Resulted in memory use that was close to a minimum of 4-8x the size of
> > > the file during my testing. We do have an upper bound on files that
> > > can be uploaded and that does contain this somewhat but it’s not hard
> > > for a larger request that is 99% file attachment to use exorbitant
> > > amounts of memory.
> > >
> > >
> > >
> > Regarding the task at hand, do you check for Content-Type of the POST
> > body? So you can exclude anything except probably
> > application/x-www-form-urlencoded. At least what I see in lua: the
> > handler is only looking for application/x-www-form-urlencoded and not
> > multipart/form-data.
> >
> > https://github.com/openresty/lua-nginx-module/blob/c89469e920713d17d703a5f3736c9335edac22bf/src/ngx_http_lua_args.c#L171
> >
> >
> > > I actually tried doing a Buffer + readSync variation on the same
> > > thing and the memory footprint was actually FAR FAR worse when I did that.
> > >
> > >
> > >
> > As of now, the resulting memory consumption will depend heavily on the
> > boundary.
> >
> > In the worst case, for a 1mb file that is split into a one-character
> > array, you will get ~16x memory consumed, because every single-byte
> > character will be put into an njs_value_t structure.
> >
> > With larger chunks the situation will be less extreme. Right now we are
> > implementing a way to deduplicate identical strings, this may help in
> > some situations.
> >
> > > The 4-8x minimum memory commit seems like a problem to me just
> > > generally. But the fact that readSync doesn’t seem to be any better
> > > on memory (much worse actually) basically means that NJS is only safe
> > > to use for processing smaller files (or POST bodies) right now.
> > > There’s just no good way to keep data that you don’t care about in a
> > > file from occupying excessive amounts of memory that can’t be
> > > reclaimed. If there is no way to improve the memory footprint when
> > > handling files (or big strings), no memory conservative way to stream
> > > a file through some sort of buffer, and no first-party utility for
> > > providing parsed POST bodies right now,
> > > then it might be worth the time to put some notes in the NJS docs that
> > > the fs module may not be appropriate for larger files (e.g. files over
> > > 1mb).
> > >
> > > [...]
D
  • 26 Sep '23
On 9/26/23 8:30 AM, Lance Dockins wrote:
> Up until now, I had assumed that string.match types of statements were 
> just transparently calling PCRE behind the scenes so that the 
> associated memory from the PCRE call was being freed after use.  Maybe 
> that’s not even an accurate read on how Nginx is using PCRE but that’s 
> how I envisioned it.

String.prototype.match() returns an array of matched elements. See 
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match. 
It means that many calls to that method that have positive matches 
consume memory proportional to the size of the resulting chunks. But if 
the match fails it should not consume (much) memory. So if you have a 
bunch of if/then clauses until the first match it should not be a 
problem (if not let me know).

In general, doing routing in NJS (or any other scripting) is not 
recommended, because it complicates the native nginx location routing. 
Could you please explain what exactly you want to do with NJS RegExp 
that nginx cannot do with regexp locations?

>
> In any case, that lead me to a few questions.
>
>  1. How does keepalive affect these VMs?  I realize that keepalive is
>     for the connection rather than the request, but I still wanted to
>     confirm that VMs are not sticking around occupying memory after
>     the request completes due to keepalive settings.
>
>
All the current VMs are destroyed when the current HTTP request is finalized.

>     We have a mix of fast requests and long-running requests (e.g.
>     5-10s) so if enough long running requests build up at the same
>     time that a flood of very fast requests come in, NJS could
>     certainly burn out a lot of memory under the current conditions as
>     I understand it.  If there were a lot of large file downloads,
>     those would also occupy memory for the entirety of the download if
>     I understand correctly.
>  2. Is it possible to adjust the NJS codebase to free memory for the
>     actual if condition and for string matches after they’ve run as
>     long as they didn’t include a variable assignment inside of the
>     actual if condition?  I’m not talking about whatever is in the
>     block content - just the actual condition.  Given how likely it is
>     for people to use basic if conditions and string matches in NJS,
>     that seems like it might be a stopgap measure that reduces the
>     memory footprint without having to build full garbage collection.
>
>
The problem with JS Regexps is that they produce a lot of intermediary 
objects, because most of the RegExp-related calls end up in 
RegExp.prototype.exec(), which returns a large object representing the 
result of PCRE matching.

One way to mitigate the problem could be using RegExp.prototype.test(), 
which returns just a boolean.
But I need to improve it first, because right now it uses 
RegExp.prototype.exec() internally. The reason RegExp.prototype.test() 
will be easier to improve is that I can be sure that the resulting 
object can be freed right away. In addition to that, I plan to improve 
the memory consumption of the RegExp.prototype.exec() result; thanks for 
reporting.
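
For illustration, the two forms would look like this (a sketch; the 
pattern is just hypothetical):

// match: builds the array of matched substrings via exec() internally
if (r.uri.match(/^\/admin\//)) { /* ... */ }

// test: only a boolean is needed, so the intermediate match object
// could be freed right away once test() stops calling exec()
if (/^\/admin\//.test(r.uri)) { /* ... */ }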

>
>  3. Is there any workaround for this type of memory problem other than
>     just to force the use of an internalRedirect to drop the existing
>     VM and create a new one?
>
>  4. Could it make sense to allow for a directive that would force the
>     destruction and regeneration of a new JS VM?  That wouldn’t solve
>     the memory leak that builds to 1mb per request but it would
>     shorten its lifetime (which could ease memory pressure in
>     situations where there are some long running requests holding open
>     the requests)
>
I do not think so.
L
  • 27 Sep '23
To clarify, I am NOT using regexp as a replacement for locations. Some of the things that I need to do might be possible with locations. Most aren’t. I have to test more than just the request_uri and logically I have to use AND/OR and nested conditions. So it’s largely unfeasible with core Nginx directives. That’s why I moved to scripting alternatives.

As for RegExp.test vs String.match, that’s true. I guess technically I should be using RegExp.test since I only care whether the value matches or not for most instances but it’s much uglier and harder to read. If it’s usable, though, that’s better than consuming loads of memory.

As a rough test, I converted most of the String.match references to RegExp.test and it did reduce the memory - but only by around 40kb. That’s certainly better, but on a 1mb memory footprint, not by much. Maybe that is because of what you mentioned about the internal call to RegExp.exec; that would make sense if .test is only mildly different from .match under the hood. So if .test is likely to diverge from .match on consumed memory over time, it’s a better fit when you need to do a lot of things that might invoke the regex engine. I’m guessing that String.replace (which I also use in a number of places) is also running its regexes through RegExp.exec, so if that’s where some of this memory leak is coming from, improvements there might help a lot.

For whatever it’s worth, the current math seems to play out like this:

(/someregex/).test(string)

is about 8kb smaller than:

let regex = new RegExp(/someregex/);
regex.test(string)

That makes sense since one is also creating a variable but the first route seems to be the only viable option if you need to do anything that is going to touch regex under the hood.
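
If top-level code in a js_import-ed module runs once per VM (I’m assuming it does), then hoisting the literal should at least avoid rebuilding it on every call within a request - something like:

// assumption: module top-level runs once per VM, so this regex is
// constructed once per request rather than once per function call
var SOME_RE = /someregex/;

function check(r) {
    return SOME_RE.test(r.variables.some_variable || '') ? '1' : '0';
}

export default { check };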

Thus far, the only reliable solution that I’ve found for reducing memory is to carefully craft the order of if/then statements to ensure that no more string replacements or regex tests run than absolutely have to for a given type of request. That’s made a significant dent in specific use cases but mostly it’s just been mild improvements. Between what you’ve shared about the regex internals and what I’ve seen doing basic tests around conditions (either using .match or .test), I do think that that is where the bulk of the problem is. Every time that I do anything that involves a regex (string replacement, etc), it seems to burn at least 20k.
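
Concretely, that ordering just looks something like this (a sketch with made-up paths and variables):

function route(r) {
    // cheap, non-regex checks first; most requests exit here
    if (r.uri.startsWith('/static/')) {
        return 'static';
    }
    if (r.method !== 'POST') {
        return 'app';
    }
    // only the few requests that get this far pay for the regex
    if (/(^|&)action=upload(&|$)/.test(r.variables.args || '')) {
        return 'upload';
    }
    return 'app';
}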

Given that, it seems like I only have a few options for addressing this at all right now and most of them are just workarounds (e.g. moving some of these regexes to maps, maybe splitting some of the logic into a few different js_set directives and localizing those to specific contexts to make them function in a bit more of a JIT way, or using an internalRedirect just for the sake of freeing memory). I’ll try to figure out what I can. Hopefully whatever I come up with is enough to keep Nginx from spontaneously crashing during some types of concurrent traffic load. That’s my only real goal at the moment. :)

Thank you again for the feedback on all of this. It has helped to find at least a few ways to contain this even if just to improve it mildly. Every little bit seems to count for me at the moment.

—
Lance Dockins

> On Tuesday, Sep 26, 2023 at 5:51 PM, Dmitry Volyntsev <xeioex at nginx.com (mailto:xeioex at nginx.com)> wrote:
>
> [...]