Log formatters which use newlines to separate messages should quote newlines for security reasons · Issue #94503 · python/cpython · GitHub

Log formatters which use newlines to separate messages should quote newlines for security reasons #94503

Open
@glyph

The logging module, in most common configurations, is vulnerable to log injection attacks.

For example:

import logging
logging.basicConfig(format='%(asctime)s %(message)s')
logging.warning('message\n2022-06-17 15:15:15,123 was logged.message')

results in

2022-06-16 14:03:06,858 message
2022-06-17 15:15:15,123 was logged.message

All available log formatters in the standard library should provide a straightforward way to tell the difference between log message contents and log file format framing. For example, if your output format is newline-delimited, then it cannot allow raw newlines in messages and should "sanitize" by quoting them somehow.
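
A minimal sketch of what "sanitize by quoting" could look like for a newline-delimited format. `NewlineEscapingFormatter` and `demo` are hypothetical names invented for this illustration, not anything in the stdlib; the only stdlib pieces used are `logging.Formatter`, `logging.StreamHandler` and `io.StringIO`. It escapes backslashes first so the escaping stays reversible.

```python
import io
import logging


class NewlineEscapingFormatter(logging.Formatter):
    """Hypothetical formatter: one record always yields exactly one output line."""

    def format(self, record):
        text = super().format(record)
        # Escape backslashes first so a literal "\n" in the message can be
        # told apart from an escaped line break when reading the log back.
        return (
            text.replace("\\", "\\\\")
            .replace("\r", "\\r")
            .replace("\n", "\\n")
        )


def demo():
    """Log the injection attempt from the example above into a string buffer."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(NewlineEscapingFormatter("%(levelname)s %(message)s"))
    logger = logging.getLogger("newline_escape_demo")
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addHandler(handler)
    try:
        logger.warning("message\n2022-06-17 15:15:15,123 was logged.message")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

With this sketch the forged timestamp stays inside the first record instead of starting a line of its own. One trade-off to be aware of: because the escaping runs on the fully formatted text, tracebacks are flattened onto a single line as well.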

Twisted deals with this by quoting them with trailing tabs (each newline inside a message is followed by a tab, so continuation lines are visibly indented), so, for example, the following code:

from twisted.logger import globalLogBeginner, textFileLogObserver, Logger
import sys

globalLogBeginner.beginLoggingTo(
    [textFileLogObserver(sys.stdout)], redirectStandardIO=False
)

log = Logger()
log.info("regular log message\nhaha i tricked you this isn't a log message")
log.info("second log message")

Produces this output:

2022-06-17T15:35:13-0700 [__main__#info] regular log message
	haha i tricked you this isn't a log message
2022-06-17T15:35:13-0700 [__main__#info] second log message

I'd suggest that the stdlib do basically the same thing.
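
As a rough sketch of what "basically the same thing" might look like on top of the existing stdlib API, here is a `logging.Formatter` subclass that indents continuation lines with a tab. `IndentContinuationFormatter`, `count_records` and `demo` are hypothetical names made up for this example, and this is an approximation of the Twisted behaviour shown above, not Twisted's actual code.

```python
import io
import logging


class IndentContinuationFormatter(logging.Formatter):
    """Hypothetical formatter: every line after the first starts with a tab."""

    def format(self, record):
        text = super().format(record)
        # splitlines() also splits on "\r" and other line boundaries, so a
        # bare carriage return cannot be used to fake a new record either.
        return "\n\t".join(text.splitlines())


def count_records(log_text):
    """A reader can now find record boundaries: they never start with a tab."""
    return sum(1 for line in log_text.splitlines() if not line.startswith("\t"))


def demo():
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        IndentContinuationFormatter("[%(name)s#%(levelname)s] %(message)s")
    )
    logger = logging.getLogger("indent_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addHandler(handler)
    try:
        logger.info("regular log message\nhaha i tricked you this isn't a log message")
        logger.info("second log message")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

The output still spans three physical lines, but `count_records(demo())` sees two records, because the injected line is marked as a continuation. Compared with escaping, this keeps multi-line messages and tracebacks readable.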

One alternate solution is just documenting that no application or framework is ever allowed to log a newline without doing this manually themselves (and unfortunately this seems to be where the Java world has ended up, see for example spring-projects/spring-framework@e9083d7), but putting the responsibility on individual projects to do this themselves means making app and library authors predict all possible Formatters that they might have applied to them, then try to avoid any framing characters that that Formatter might use to indicate a message boundary. Today the most popular default formatter uses newlines. But what if some framework were to try to make parsing easier by using RFC2822? Now every application has to start avoiding colons as well as newlines. CSV? Better make sure you don't use commas. Et cetera, et cetera.

Pushing this up to the app or framework means that every library that wants to log anything derived from user data can't log the data in a straightforward structured way that will be useful to sophisticated application consumers, because they have to mangle their output in a way which won't trigger any log-parsing issues with the naive formats. In practice this really just means newlines, but if we make newlines part of the contract here, that also hems in any future Formatter improvements the stdlib might want to make.

I suspect that the best place to handle this would be logging.Formatter.format; there's even some precedent for this, since it already has a tiny bit of special-cased handling of newlines (albeit only when logging exceptions).
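
To illustrate the precedent mentioned here: when a record carries exception information, `logging.Formatter.format` makes sure the formatted message ends with a newline before appending the traceback text. The snippet below only demonstrates that existing behaviour; `demo_exception_newline` is a hypothetical helper name and the `"demo.py"` path is a placeholder.

```python
import logging
import sys


def demo_exception_newline():
    """Format a record with exc_info and return the resulting text."""
    try:
        1 / 0
    except ZeroDivisionError:
        record = logging.LogRecord(
            name="demo",
            level=logging.ERROR,
            pathname="demo.py",  # placeholder path, only used for %(pathname)s
            lineno=0,
            msg="failed",
            args=None,
            exc_info=sys.exc_info(),
        )
    # Formatter.format() inserts "\n" between the message and the traceback.
    return logging.Formatter("%(message)s").format(record)
```

The returned text starts with `failed`, followed by a newline and the usual `Traceback (most recent call last):` header, so the formatter already reasons about newlines at exactly the point where message quoting could be added.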

(The right thing to do to avoid logging injection robustly is to emit all your logs as JSON, dump them into Cloudwatch or Honeycomb or something like that and skip this problem entirely. The more that the standard logging framework can encourage users to get onto that happy path quickly, the less Python needs to worry about trying to support scraping stuff out of text files with no schema as an interesting compatibility API surface, but this is a really big problem that I think spans more than one bug report.)
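
For comparison, a minimal sketch of the JSON approach using only the stdlib. `JsonLineFormatter` and `demo` are hypothetical names, and the field names in the payload are an arbitrary choice for this example, not an established schema. Since `json.dumps` escapes line breaks inside strings, each record occupies exactly one line and parses back to the original message.

```python
import io
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per line."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        # json.dumps escapes "\n" (and, by default, all non-ASCII characters),
        # so message contents can never collide with the line framing.
        return json.dumps(payload)


def demo():
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger("json_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addHandler(handler)
    try:
        logger.info("regular log message\nhaha i tricked you this isn't a log message")
        logger.info("second log message")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

Reading the log back is then a matter of calling `json.loads` on each line, with no guessing about where one message ends and the next begins.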

Labels

stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)