From: arcege at shore.net (Michael P. Reilly)
Date: Wed, 26 May 1999 22:17:56 GMT
Subject: Bug report: memory leak in python 1.5.2
References: <19990526160643.54023@nms.otc.com.au> <374BAB31.2F8FA25C@lemburg.com> <14156.1920.64148.193367@weyr.cnri.reston.va.us> <LKY23.786$nn.228437@news.shore.net> <14156.25520.795257.69628@weyr.cnri.reston.va.us>
Message-ID: <oq_23.797$nn.233472@news.shore.net>
Content-Length: 5456
X-UID: 1820

Fred L. Drake <fdrake at cnri.reston.va.us> wrote:

: Michael P. Reilly writes:
: > There is also the situation where some UNIX systems put the environment
: > initially in the u area, and it is difficult to programmatically determine
: > where different runtime segments are (where is the heap vs. where is the
: > u area).
: >
: > Fred, your solution should work because it takes the problem case: what
: > to do with the string initially, but I think it might be better to copy
: > the values at module initialization time. I've included an addition to
: > Fred's patch to be called instead of the PyDict_New() function (in the
: > module init function).
:
: If I understand correctly, your patch avoids the problem of memory
: leaked from the initial environment (a static size). Is this correct?

My initializer only deals with the possible UNIX implementations of the
environment, by copying the environment (supposedly) before it is
used. There are no guarantees about how putenv will modify the
environment. AIX 4.2.1 documents that the environment is expanded as
needed for new values, but my tests show that putenv/getenv still use
the passed (borrowed) values. If a blanket, system-independent
implementation mixes environment variables changed from Python with
ones that were not, I foresee problems. If the environment is going to
be treated as volatile memory, we should initialize it in a more
appropriate manner (in my thinking).
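
The "borrowed" behaviour described above can be checked with a small
test. This is only a sketch: the helper name putenv_borrows and the
variable PUTENV_DEMO are made up for illustration, and the result is
whatever the local C library does, not a statement about any
particular platform.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of any patch in this thread.
 * Returns 1 if putenv() keeps pointing at the caller's buffer (so
 * editing the buffer changes what getenv() reports), 0 if the library
 * made its own copy, and -1 if putenv() itself failed. */
int putenv_borrows(void)
{
    /* static: the buffer must outlive this call if it is borrowed */
    static char entry[] = "PUTENV_DEMO=before";

    if (putenv(entry) != 0)
        return -1;

    /* Overwrite the value in place, without calling putenv() again.
     * "after!" has the same length as "before". */
    memcpy(entry + strlen("PUTENV_DEMO="), "after!", 6);

    const char *seen = getenv("PUTENV_DEMO");
    return seen != NULL && strcmp(seen, "after!") == 0;
}
```

A result of 1 means the caller still owns the memory that the
environment points at, which is exactly why freeing or reusing that
string at the wrong time is dangerous.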

Remember that the original definition of UNIX put the environment (and
argument list) in the u area, inaccessible to the process except
through the copies passed to main() and through putenv/getenv. I've
never used Linux, but I wouldn't be surprised if it used something
along those lines. Mixing dynamic strings with static ones could get
dangerous (I recently debugged a library that would return a malloc'd
string or a static string indiscriminately).

Regardless, it was just a suggestion for consistency. Your patch would
work except on a few very odd systems where the environment isn't
quite what people expect; I was trying to handle that case. But... read
on, MacDuff.

: If so, I'm not sure it's worth the extra code. My intention was to
: avoid the Python-induced leak that would allow a long-running Python
: script that occasionally created a subprocess to become a MemoryError
: traceback. ;-) In the case of systems without a lot of memory
: available, the environment should be kept small to begin with (making
: the additional data structures created by the startup code more of a
: problem).
: I don't think I've ever checked the size of the "typical" UNIX
: environment; "printenv | wc -c" tells me I'm running under 2Kb in a
: fresh shell. Is that enough to worry about, and slow down
: initialization?

A common bare minimum environment is: HOME, USER, SHELL, MAIL. That's
not all that much (should be less than 256 bytes on virtually every
system).

Most systems have an upper bound on the combined size of the argument
list and the environment (ARG_MAX). POSIX only guarantees 4k, but most
systems allow more (SunOS is at 1Mb). This means that there's a
limit to how much we would be initializing anyway (with my addition).
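
For anyone who wants numbers for their own system, the limit and the
current environment size can both be read at runtime. A minimal
sketch, assuming a POSIX system; the function names environ_bytes and
arg_max are invented here for illustration.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <string.h>
#include <unistd.h>

extern char **environ;

/* Bytes taken by the "NAME=value" strings of the current environment,
 * counting each terminating NUL (roughly what "printenv | wc -c"
 * reports). */
size_t environ_bytes(void)
{
    size_t total = 0;
    for (char **p = environ; p != NULL && *p != NULL; p++)
        total += strlen(*p) + 1;
    return total;
}

/* Runtime limit on argument list plus environment for exec;
 * sysconf() returns -1 if the limit is indeterminate. */
long arg_max(void)
{
    return sysconf(_SC_ARG_MAX);
}
```

Comparing environ_bytes() with arg_max() gives a rough idea of the
worst case a copy-at-initialization step would ever have to handle.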

: > Also, how should we deal with this in terms of C applications that might
: > change the environment? (Embedders beware!)
:
: In this case, we may not clear all the possible garbage, but we only
: leak for keys that are:
:
: 1. Changed from Python at least once, then
: 2. Changed from C, and
: 3. Never changed from Python again.
:
: Note that only one copy of the variable gets leaked, not an infinite
: succession.
: In the case of two Python putenv() calls with C putenv() calls
: in between, we don't introduce any new leaks; the effect is that the
: data from the first Python putenv() isn't collected until the second
: Python putenv(). This is acceptable.

True; my thought was of a system where a cowboy programmer decides to
take control of the environment, making everything borrowed in his
library. Then you get:

* cowboy initializes
* python changes (with borrowed memory)
* cowboy frees python-borrowed memory and makes change
* python changes again attempting to free already freed memory
...

My point wasn't that we should handle this case, just that it should be
made known somewhere that Python is now attempting to borrow some
aspects of the environment (depending on the platform). We can't
babysit everything, but taking control of a system-managed facility
does take responsibility.

(I try not to go so far as making these types of statements anymore,
however hypothetical - too many "people" take it personally. It's
better to let intelligent people read and make their own conclusions.)

: > From a programming standpoint, I don't think that it should be "proper"
: > to be changing the environment all that much. Its purpose is to
: > propagate values to child processes, not to store runtime values.
:
: I agree. Processes that run a lot of children, like HTTP servers
: running CGI scripts, won't be using their own environment to do this,
: but will create the desired environments on the fly. (Especially if
: they're threaded!)

Yes, my point with that statement was that people shouldn't be leaking
much with the current implementation - assuming they use it
"properly". Maybe the docs should do a better job of explaining its
purpose.
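
As an illustration of the "proper" use, a parent can hand each child
its own environment without ever calling putenv. This is a sketch
only, assuming a POSIX system with /bin/sh; run_with_env is a
hypothetical helper, not code from Python or from Fred's patch.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "/bin/sh -c command" with exactly the
 * environment given in envp. The parent's own environment is never
 * modified, so nothing is borrowed and nothing leaks there.
 * Returns the child's exit status, or -1 on failure. */
int run_with_env(const char *command, char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char *const argv[] = { "sh", "-c", (char *)command, NULL };
        execve("/bin/sh", argv, envp);
        _exit(127);            /* only reached if execve() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The envp array is built per child and can be freed by the caller as
soon as run_with_env returns.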

-Arcege