Optimize the JIT's low-level assembly control flow · Issue #135904 · python/cpython · GitHub

Optimize the JIT's low-level assembly control flow #135904

Open
@brandtbucher

Currently, our JIT backend mostly just takes the machine code that Clang gives us, and emits it without modification. The one exception to this is removing zero-length jumps to the next uop at the end of each blob of machine code, which is both fragile and extremely limited in what it can do.

For example, consider _GUARD_NOS_NULL. This is the (16-byte) sequence that Clang gives us:

cmpq    $0x1, -0x10(%r13)
je      _JIT_CONTINUE
jmp     _JIT_JUMP_TARGET

And this is the (11-byte) sequence we want:

cmpq    $0x1, -0x10(%r13)
jne     _JIT_JUMP_TARGET

We should do a bit more here, and doing more is a lot easier if we're modifying textual assembly at build time. Thankfully, this is pretty straightforward: we just compile to assembly using Clang (-S), modify it, and finish compiling using Clang again.
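The three build steps described above could be wired together roughly as follows. This is only a sketch, not the actual build code: the function name `compile_with_asm_pass`, its parameters, and the injectable `run` hook are hypothetical, and a real build would also pass target and optimization flags. The only Clang options used are `-S` (stop after emitting assembly), `-c` (assemble without linking), and `-o`.

```python
import pathlib
import subprocess
import tempfile
from typing import Callable


def compile_with_asm_pass(
    source: pathlib.Path,
    output: pathlib.Path,
    optimize: Callable[[str], str],
    clang: str = "clang",
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Compile to assembly, rewrite the assembly text, then assemble it.

    Hypothetical helper: `optimize` is any str -> str rewrite of the
    assembly, and `run` exists only so the steps can be tested without Clang.
    """
    with tempfile.TemporaryDirectory() as work:
        asm = pathlib.Path(work) / f"{source.stem}.s"
        # Step 1: stop after code generation and emit textual assembly.
        run([clang, "-S", str(source), "-o", str(asm)], check=True)
        # Step 2: modify the assembly text in place.
        asm.write_text(optimize(asm.read_text()))
        # Step 3: hand the modified assembly back to Clang to assemble.
        run([clang, "-c", str(asm), "-o", str(output)], check=True)
```

Keeping the rewrite as a plain string-to-string function means the pass itself never needs to know how Clang was invoked.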

We should intentionally not take on too much complexity here, especially since we support a few different platforms. There's a lot that we can do by only reasoning about labels, jumps, branches, and returns and treating all other sequences of instructions as black boxes. Once we have the assembly parsed into basic blocks, we can do things like:

  • Inverting the direction of branches (from branch-hot/jump-cold to branch-cold/jump-hot), as in _GUARD_NOS_NULL above.
  • Having the assembler encode all _JIT_CONTINUE jumps during this step (by just adding the label at the end of the assembly) instead of doing it at runtime. As an additional benefit, the assembler will use more efficient "short" jump encodings most of the time.
  • Removing zero-length jumps, as we do now, which becomes trivial.
  • Later: splitting the stencils into "hot" (core uop logic) and "cold" (deopts, error handling, etc.) code. The JIT will emit all "hot" code for a trace, followed by all "cold" code for that trace, keeping the cold code out of line.
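To make the first three items concrete, here is a minimal sketch of such a pass over x86-64 AT&T-syntax text. It is not the implementation from the branch mentioned below. It is a peephole over adjacent lines rather than a real basic-block parser, it assumes no comments or directives sit between the instructions it inspects, and the names `INVERSE`, `jump`, `label`, and `optimize` are made up for illustration. The condition-code table is deliberately incomplete.

```python
import re

# Some x86-64 conditional branches and their inverses (not exhaustive).
INVERSE = {
    "je": "jne", "jne": "je",
    "jl": "jge", "jge": "jl",
    "jg": "jle", "jle": "jg",
    "jb": "jae", "jae": "jb",
    "ja": "jbe", "jbe": "ja",
}

JUMP = re.compile(r"^\s*(j[a-z]+)\s+([\w.$]+)\s*$")
LABEL = re.compile(r"^\s*([\w.$]+):\s*$")


def jump(line):
    """Return (opcode, target) for a direct jump or branch, else None."""
    match = JUMP.match(line)
    return (match[1], match[2]) if match else None


def label(line):
    """Return the label name if the line defines a label, else None."""
    match = LABEL.match(line)
    return match[1] if match else None


def optimize(assembly, hot="_JIT_CONTINUE"):
    lines = [line for line in assembly.splitlines() if line.strip()]
    # Define the "next uop" label at the end, so the assembler can resolve
    # jumps to it itself.
    lines.append(f"{hot}:")
    # Pass 1: turn "jCC hot; jmp cold" into "jNCC cold; jmp hot".
    for i in range(len(lines) - 1):
        first, second = jump(lines[i]), jump(lines[i + 1])
        if (first and second and first[0] in INVERSE
                and first[1] == hot and second[0] == "jmp"):
            lines[i] = f"{INVERSE[first[0]]:<8}{second[1]}"
            lines[i + 1] = f"{'jmp':<8}{hot}"
    # Pass 2: drop any jmp whose target label is on the very next line.
    result = []
    for i, line in enumerate(lines):
        target = jump(line)
        if (target and target[0] == "jmp" and i + 1 < len(lines)
                and label(lines[i + 1]) == target[1]):
            continue
        result.append(line)
    return "\n".join(result) + "\n"
```

Feeding the three-instruction _GUARD_NOS_NULL sequence shown earlier through `optimize` leaves the `cmpq`, a single `jne` to `_JIT_JUMP_TARGET`, and the trailing `_JIT_CONTINUE:` label. The inverted `jmp _JIT_CONTINUE` becomes a zero-length jump and is removed by the second pass.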

I have a branch to do all but the last of these, and will open a PR soon.

Labels

  • interpreter-core (Objects, Python, Grammar, and Parser dirs)
  • performance (Performance or resource usage)
  • topic-JIT
  • type-feature (A feature request or enhancement)

