=============
Version 4.4.0
=============

Version 4.4.0 of mod_wsgi can be obtained from:

https://codeload.github.com/GrahamDumpleton/mod_wsgi/tar.gz/4.4.0

Known Issues
------------

1. The makefiles for building mod_wsgi on Windows are currently broken and
need updating. As most new changes relate to mod_wsgi daemon mode, which is
not supported under Windows, you should keep using the last available
binary for version 3.X on Windows instead.

Bugs Fixed
----------

1. When an exception occurred while yielding data from a generator returned
by the WSGI application, and chunked transfer encoding was used for the
response, a '0' chunk would be erroneously added at the end of the response
content even though the response was likely incomplete. As a result,
clients could not properly detect that the response had been truncated due
to an error. This issue is now fixed for embedded mode. Fixing it for
daemon mode is a bit trickier.
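
As an illustration only, the following minimal sketch shows the kind of WSGI
application that hits this case. The names ``application`` and
``failing_body`` are hypothetical and not part of mod_wsgi:

```python
def failing_body():
    # The first block of data is yielded successfully.
    yield b"first part of the response\n"
    # The failure happens after the response has started, so the
    # status line and headers have already been handed to the server.
    raise RuntimeError("failure while generating response")


def application(environ, start_response):
    # No Content-Length is supplied, so the server cannot know the
    # size of the response up front and may use chunked encoding.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return failing_body()
```

With the fix, a response cut short like this in embedded mode should no
longer be terminated with a final '0' chunk.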

2. Response headers returned from the WSGI application running in daemon
mode were being wrongly attached to the internal Apache data structure for
``err_headers_out`` instead of ``headers_out``. This meant that the
``Header`` directive of the ``mod_headers`` module, with its default
condition of only checking ``onsuccess`` headers, would not work as
expected.

To check for or modify the response headers, one would have had to use the
``Header`` directive with the ``always`` condition and, if also working
with an embedded WSGI application, define a parallel ``Header`` directive
with the ``onsuccess`` condition.

For daemon mode, response headers will now be correctly associated with
``headers_out`` and the ``onsuccess`` condition of the ``Header`` directive.
The only exception to this in either embedded or daemon mode now is that
of the ``WWW-Authenticate`` header, which remains associated with
``err_headers_out`` so that the header will survive an internal redirect
such as to an ``ErrorDocument``.

3. When optional support for chunked requests was enabled, it was only
working properly for embedded mode. The feature now also works properly for
daemon mode.

The directive to enable support for chunked request content is
``WSGIChunkedRequest``. The command line option when using mod_wsgi express
is ``--chunked-request``.

This is an optional feature, as the WSGI specification is arguably broken
in not catering properly for mutating input filters or chunked request
content. Support for chunked request content could be enabled by default,
but then WSGI applications which don't simply read all available content,
and instead rely entirely on ``CONTENT_LENGTH``, would likely see a chunked
request as having no content at all, as they would interpret the lack of
``CONTENT_LENGTH`` as meaning the length of the content is zero.

An attempt to get the WSGI specification amended to be more sensible and
allow what is a growing requirement to support chunked request content was
ignored. Thus support is optional. You will need to enable this if you wish
to rely on features of any WSGI framework that take the more sensible
approach of ignoring ``CONTENT_LENGTH`` as a true indicator of content
length. One such WSGI framework which provides some support for chunked
request content is Flask/Werkzeug. Check the Flask/Werkzeug documentation
or code to see whether any additional ``SetEnv`` directive is required to
enable that support.
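
As a rough sketch of the difference, the following compares reading all
available content with trusting ``CONTENT_LENGTH``. The helper name
``read_request_body`` is hypothetical, and the approach relies on the server
signalling the end of input with an empty read, which the WSGI specification
does not guarantee for every server:

```python
import io


def read_request_body(environ, block_size=8192):
    """Read request content until end of input, ignoring CONTENT_LENGTH."""
    stream = environ["wsgi.input"]
    chunks = []
    while True:
        data = stream.read(block_size)
        if not data:
            # An empty read is taken to mark the end of the content.
            break
        chunks.append(data)
    return b"".join(chunks)


# Simulated chunked request: content is present, CONTENT_LENGTH is not.
environ = {"wsgi.input": io.BytesIO(b"hello chunked world")}
body = read_request_body(environ)

# An application trusting CONTENT_LENGTH alone would assume no content.
assumed_length = int(environ.get("CONTENT_LENGTH") or 0)
```

Here ``body`` holds the full content while ``assumed_length`` is zero, which
is the mismatch described above.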

4. Fixed a potential request content data corruption issue when running a
WSGI application in daemon mode. The bug in the code is quite obvious, yet
it could not be triggered on older mod_wsgi versions. Prior to being fixed,
it was however triggered quite easily in the current release on MacOS X,
due to the changes made to support chunked request content for daemon
processes.

It is suspected to still be a latent bug in older mod_wsgi versions, but
the conditions under which it would trigger must have been harder to
induce. The lack of reported problems may have been helped by the UNIX
socket buffer size on Linux being quite large in comparison to MacOS X,
making it harder to create a condition where not all data could be written
onto the UNIX socket in one call. Yet, when buffer sizes for the UNIX
socket on MacOS X were increased, it was still possible to induce the bug.
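
The mod_wsgi code involved is written in C and is not reproduced here. Purely
to illustrate the general condition of not all data being written in one
call, the following Python sketch shows a write loop that copes with short
writes. The helper names ``send_all`` and ``recv_exactly`` are hypothetical:

```python
import socket


def send_all(sock, data):
    """Write all of data, coping with short writes."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        # send() may accept fewer bytes than offered when the socket
        # buffer is nearly full; it returns the number actually taken.
        sent = sock.send(view[total:])
        total += sent
    return total


def recv_exactly(sock, count):
    """Read exactly count bytes, or fewer if the peer closes early."""
    chunks = []
    remaining = count
    while remaining > 0:
        data = sock.recv(remaining)
        if not data:
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


# A connected pair of sockets stands in for the link between the
# Apache child process and the daemon process.
left, right = socket.socketpair()
payload = b"x" * 4096
written = send_all(left, payload)
received = recv_exactly(right, len(payload))
left.close()
right.close()
```

Code that assumes a single write always takes the whole buffer only works
while the socket buffer is large enough, which fits the platform differences
noted above.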

5. When the ``--working-directory`` option for ``mod_wsgi-express`` was
given a relative path name, it wasn't being translated to an absolute
path name when substituted into the ``home`` option of
``WSGIDaemonProcess``, causing server startup to fail.

6. When using ``--debug-mode`` of ``mod_wsgi-express`` the working
directory for the application was not being added to ``sys.path``. This
meant that if the WSGI script was referenced from a different directory,
any module imports for other modules in that directory would fail.

Features Changed
----------------

1. Until recently, a failed attempt to change the working directory of a
daemon process to the home directory of the user the process runs as would
be ignored. More recently it became a hard failure that prevents the daemon
process from starting. This caused issues where the user, usually the
default Apache user, has no valid home directory. Now an attempt to change
the working directory to the home directory of the user the daemon process
runs as is only made if the 'user' option was explicitly set and the user
is different to the user that the Apache child worker processes run as. In
other words, is different to the default Apache user.

2. Support for the ``wdb`` debugger was removed. It was decided that it
wasn't mainstream enough, and it was not ideal that it still required a
separate service and port to handle debugging sessions.

New Features
------------

1. Added a new feature to ``mod_wsgi-express`` implementing timeouts on the
reading of the request, including headers, and the request body. The
Apache module ``mod_reqtimeout`` is used to implement this.

By default a read timeout on the initial request including headers of 15
seconds is used. This can dynamically increase up to a maximum of 30
seconds if the request data is received at a minimum required rate.

By default a read timeout on the request body of 15 seconds is used. This
can dynamically increase if the request data is received at a minimum
required rate.

The options to override the defaults are ``--header-timeout``,
``--header-max-timeout``, ``--header-min-rate``, ``--body-timeout``,
``--body-max-timeout`` and ``--body-min-rate``. For a more detailed
explanation of this feature, consult the documentation for the Apache
``mod_reqtimeout`` module.
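
As a rough model of how the timeout grows, ``mod_reqtimeout`` extends the
timeout by one second for each block of bytes received equal to the minimum
rate, up to the maximum. The sketch below assumes that behaviour. The
function name ``effective_timeout`` is hypothetical, and the minimum rate of
500 bytes used in the examples is an assumed value, not one stated here:

```python
def effective_timeout(initial, maximum, min_rate, bytes_received):
    """Seconds allowed for a read phase after bytes_received bytes."""
    # One extra second is granted for every min_rate bytes received.
    extended = initial + bytes_received // min_rate
    if maximum is None:
        # No upper bound configured for this phase.
        return extended
    return min(extended, maximum)


# Header phase with the defaults described above: 15 up to 30 seconds.
# The min_rate of 500 is an assumption made for this illustration.
header_at_start = effective_timeout(15, 30, 500, 0)
header_after_2000 = effective_timeout(15, 30, 500, 2000)
header_capped = effective_timeout(15, 30, 500, 100000)
```

For the actual defaults and exact semantics, the ``mod_reqtimeout``
documentation remains the authority.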

2. Added a new ``%{HOST}`` label that can be used when specifying the
application group (Python sub interpreter context) to run the WSGI
application in, via the ``WSGIApplicationGroup`` directive, or the
``application-group`` option to ``WSGIScriptAlias``.

This new label will result in an application group being used with a name
that corresponds to the name of the site as identified by the HTTP request
``Host`` header. Where the accepting port number is other than 80 or 443,
then the name of the application group will be suffixed with the port
number separated by a colon.

Note that extreme care must be exercised when using this new label to
specify the application group. This is because the HTTP request ``Host``
header is under the control of the user of the site.

As such, it should only be used in conjunction with a configuration which
adequately blocks access to anything but the expected hosts.

For example, it would be dangerous to use this inside of a ``VirtualHost``
where the ``ServerAlias`` directive is used with a wildcard. This is
because a user could pick arbitrary host names matching the wildcard and so
force a new sub interpreter context to be created each time and so blow out
memory usage.

Similarly, caution should be exercised with ``mod_vhost_alias``: the
configuration should forbid any host which doesn't specifically match some
specified resource, such as a directory.

Finally, this should probably never be used when not using either
``VirtualHost`` or ``mod_vhost_alias`` as in that case the server is likely
going to accept any ``Host`` header value without exclusions.
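
The naming rule described above can be sketched as follows. This is only an
illustration. The function name ``host_application_group`` is hypothetical
and any normalisation mod_wsgi may apply to the host name is ignored:

```python
def host_application_group(host, port):
    """Application group name for a site, following the rule above."""
    name = host
    if port not in (80, 443):
        # Non standard ports are appended, separated by a colon.
        name = "%s:%d" % (name, port)
    return name


standard = host_application_group("www.example.com", 80)
alternate = host_application_group("www.example.com", 8000)

# Each distinct host name maps to a distinct application group, which
# is why arbitrary wildcard matched hosts can blow out memory usage.
groups = {host_application_group("h%d.example.com" % i, 443)
          for i in range(3)}
```

Every new name accepted by the server means another sub interpreter, so the
set of accepted hosts needs to be bounded.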

3. Allow ``%{RESOURCE}``, ``%{SERVER}`` and ``%{HOST}`` labels to be used
with the ``WSGIProcessGroup`` directive, or the ``process-group`` option of
the ``WSGIScriptAlias`` directive.

For this to work, it is still necessary to have set up an appropriate
mod_wsgi daemon process group using the ``WSGIDaemonProcess`` directive,
with a name that will match the expanded value for the respective labels.
If there is no matching mod_wsgi daemon process group specified, then
a generic HTTP 500 internal server error response will be returned and
the reason, the lack of a matching mod_wsgi daemon process group, will be
logged in the Apache error log.

4. Error messages and exceptions raised when there is a failure to read
request content, or write back a response, now provide the internal error
indication from Apache as to why. For the ``IOError`` exceptions which are
raised, the description associated with the exception now flags that the
exception originates within Apache/mod_wsgi.

5. When using mod_wsgi daemon mode and there is a timeout when reading
request content in order to proxy it to the daemon process, a 408 request
timeout HTTP response is now returned, whereas previously a generic 500
internal server error HTTP response was returned.

Note that this doesn't mean that the WSGI application wasn't actually run.
The WSGI application in the daemon process would have run as soon as the
headers had been received.

If the WSGI application had actually attempted to read the request content,
it should also have eventually received an exception of type ``IOError``
when accessing ``wsgi.input`` to read the request content, due to a
timeout or due to the proxy connection being closed before all request
content was able to be read.

If the WSGI application wasn't expecting any request content and had
ignored it, even though some was present, it would still have run to
completion and generated a response, but because the Apache child worker
process was blocked waiting for content, when the timeout occurred the
client would get the 408 HTTP response rather than the actual response
generated by the WSGI application.
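
A minimal sketch of a WSGI application that guards its read of
``wsgi.input`` is shown below. The ``FailingInput`` class and its error
message are made up for the illustration and do not reflect the text
mod_wsgi uses:

```python
class FailingInput:
    """Hypothetical stand-in for wsgi.input after a read timeout."""

    def read(self, size=-1):
        # The message text is invented for this sketch.
        raise IOError("simulated failure reading request content")


def application(environ, start_response):
    try:
        content = environ["wsgi.input"].read()
    except IOError:
        # Request content could not be read in full. The client may
        # already have been sent a 408 response by Apache, so this is
        # mainly about the application cleaning up in a controlled way.
        body = b"request content could not be read\n"
        start_response("400 Bad Request",
                       [("Content-Type", "text/plain"),
                        ("Content-Length", str(len(body)))])
        return [body]

    body = ("received %d bytes\n" % len(content)).encode("ascii")
    start_response("200 OK",
                   [("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body)))])
    return [body]
```

Whether to respond, log, or simply re-raise in the ``except`` branch is a
choice for the application.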

6. Added the ``--log-to-terminal`` option to ``mod_wsgi-express`` to allow
the error log output to be directed to standard error for the controlling
terminal, and the access log output, if enabled, to be directed to standard
output. Similarly, the startup log output, if enabled, will be sent to
standard error also.

This should not be used in conjunction with the ``--setup-only`` option when
using the generated ``apachectl`` script, unless the ``-DFOREGROUND``
option is also being supplied to ``apachectl`` at the time it is run with
the ``start`` command.

7. Added the ``--access-log-format`` option to ``mod_wsgi-express``. By
default if the access log is enabled, entries will follow the 'common' log
format as typically used by Apache. There are two ways to use the
``--access-log-format`` option. The first is to give it the argument
'combined', which will then cause it to use this alternate log format
which is again often used with Apache. The other is to specify the log
format string yourself.

The format string can contain format string components as would be used
with the ``LogFormat`` directive. For example, to specify the equivalent to
the 'common' log format, you could use::

    --access-log-format "%h %l %u %t \"%r\" %>s %b"

This 'common' log format is identified via a nickname in the same way
'combined' is, so if you did have to specify it explicitly for some reason,
you could just have instead used::

    --access-log-format common

8. Added the ``--newrelic-config-file`` and ``--newrelic-environment``
options to ``mod_wsgi-express``. This allows these to be set using command
line options rather than requiring the New Relic environment variables.
Importantly, when the options are used, the values will be embedded in the
generated files if using ``--setup-only``. Thus they will still be set when
later using the ``apachectl`` control script to start the server.

Note that when these options are used, they will cause the equivalent New
Relic environment variable for that option to be ignored, whether running
the server immediately, or using ``--setup-only`` and running the server
later using ``apachectl``.

9. Added the ``--enable-debugger`` option to ``mod_wsgi-express``. When
specified together with the ``--debug-mode`` option, post mortem debugging
of an exception will be triggered when the exception is raised from the
initial execution of the WSGI application, when consuming the response
iterable, or when calling any ``close()`` method of the response iterable.
Post mortem debugging is performed using the Python debugger (pdb).
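
The three points at which an exception is caught can be sketched with a
small WSGI wrapper. This is not the implementation used by
``mod_wsgi-express``. The class name ``PostMortemDebugger`` and its
``post_mortem`` parameter are hypothetical:

```python
import pdb
import sys


class PostMortemDebugger:
    """Wrap a WSGI application and debug any exception it raises."""

    def __init__(self, application, post_mortem=pdb.post_mortem):
        self.application = application
        # The hook is injectable so the sketch can be exercised
        # without opening an interactive debugger session.
        self.post_mortem = post_mortem

    def __call__(self, environ, start_response):
        iterable = None
        try:
            # Covers the initial execution of the application.
            iterable = self.application(environ, start_response)
            # Covers consumption of the response iterable.
            for data in iterable:
                yield data
        except Exception:
            self.post_mortem(sys.exc_info()[2])
            raise
        finally:
            # Covers any close() method of the response iterable.
            close = getattr(iterable, "close", None)
            if close is not None:
                try:
                    close()
                except Exception:
                    self.post_mortem(sys.exc_info()[2])
                    raise
```

An interactive debugger needs a controlling terminal, which is presumably
why the option only takes effect together with ``--debug-mode``.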

10. Added the ``--enable-coverage`` option to ``mod_wsgi-express``. When
specified and at the same time the ``--debug-mode`` option is specified,
then coverage analysis is enabled. When the server is exited, then the HTML
reports will be output to the ``htmlcov`` directory under the server
working directory, or the directory specified using the
``--coverage-directory`` option. The ``coverage`` module must be installed
for this feature to work.

11. Added the ``--enable-profiler`` option to ``mod_wsgi-express``. When
specified and at the same time the ``--debug-mode`` option is specified,
then profiling is enabled. When the server is exited, then the
profiler data will be output to the ``pstats.dat`` file under the server
working directory, or the file specified using the ``--profiler-output-file``
option.
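
Assuming the output file is in the format written by the standard Python
profilers, which the ``pstats.dat`` name suggests but which is not stated
here, it could be inspected with the ``pstats`` module. The helper name
``summarize_profile`` is hypothetical and the sample data below is generated
only so the sketch is self-contained:

```python
import cProfile
import io
import os
import pstats
import tempfile


def summarize_profile(path, limit=10):
    """Return a text report of the top calls by cumulative time."""
    output = io.StringIO()
    stats = pstats.Stats(path, stream=output)
    stats.sort_stats("cumulative").print_stats(limit)
    return output.getvalue()


def busy_work():
    return sum(i * i for i in range(1000))


# Stand-in data: profile a small function and dump the result, since
# a real pstats.dat only exists after a server run.
profiler = cProfile.Profile()
profiler.runcall(busy_work)
sample_path = os.path.join(tempfile.mkdtemp(), "pstats.dat")
profiler.dump_stats(sample_path)

report = summarize_profile(sample_path)
```

With a real run, ``summarize_profile`` would be pointed at the file under
the server working directory instead of the sample.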

12. Added the ``--python-path`` option to ``mod_wsgi-express`` to specify
additional directories that should be added to the Python module search path.

Note that these directories will not be processed for ``.pth`` files. If
processing of ``.pth`` files is required, then the ``PYTHONPATH`` environment
variable should be set and exported in a script file referred to using the
``--envvars-script`` option.
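
To show what processing of ``.pth`` files means in general Python terms, the
sketch below contrasts a plain addition to ``sys.path`` with
``site.addsitedir()``. It says nothing about how ``mod_wsgi-express`` adds
the directories internally, and the directory and file names are made up:

```python
import os
import site
import sys
import tempfile

base = tempfile.mkdtemp()
extra = os.path.join(base, "extra")
os.mkdir(extra)

# A .pth file listing one additional directory, relative to base.
with open(os.path.join(base, "example.pth"), "w") as handle:
    handle.write("extra\n")

# Plain addition to the module search path: the .pth file is ignored.
sys.path.append(base)
seen_after_append = extra in sys.path

# site.addsitedir() also processes .pth files found in the directory.
site.addsitedir(base)
seen_after_addsitedir = extra in sys.path
```

Only after the second step does the directory named in the ``.pth`` file
appear on the module search path.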