Codegen optimizations in LLVM

Tags:  llvm clang codegen optimizations

The IR-level optimizations in LLVM seem fairly well documented in the list of Transform passes. However, the optimizations applied after IR generation (i.e., just before the IR is lowered to machine code) were not easy to find. There are a few target-independent optimizations, defined in lib/CodeGen, and there are target-dependent optimizations as well, defined in the source files for each specific target (for instance, X86-specific optimizations are implemented in lib/Target/X86). Here's an attempt to document the target-independent optimizations. I cannot guarantee that this list is complete, for I certainly cannot claim deep knowledge of LLVM.

Read more…

Compiler optimizations in the presence of function calls

Tags:  optimization pure const

What happens when computational loops contain function calls, and how does the compiler react? Using the GNU gcc compiler (v4.8.2) and the Intel icc compiler (v14.0), I wrote a few pathological code snippets that bring out the general behavior built into each compiler.
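As a rough illustration of the kind of snippet involved, here is a minimal sketch (not taken from the post itself) of a loop that calls a function with a loop-invariant argument. The names scale and sum_scaled are hypothetical. The GCC attribute const promises that the result depends only on the arguments, which may allow the compiler to hoist the call out of the loop; noinline is added here only so that the call is not simply inlined away.

```c
#include <stddef.h>

/* Hypothetical helper. "const" tells the compiler the result depends only on
   the argument; "noinline" keeps the call visible so hoisting can be observed. */
__attribute__((const, noinline)) int scale(int x)
{
    return x * 3;
}

/* The call scale(k) is loop-invariant: with the const attribute the compiler
   is free to evaluate it once before the loop instead of n times. */
long sum_scaled(const int *a, size_t n, int k)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (long)a[i] * scale(k);
    return total;
}
```

Whether the call is actually hoisted depends on the compiler and optimization level; inspecting the assembly (for example with -O2 -S) is the way to check.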

Read more…

Machine code randomization in JIT compilers

Tags:  JIT memory safety

Recently, I happened to stumble upon a patent from Microsoft that lists possible defenses against attacks on memory safety in JIT compilers. While I never understood the reason behind filing a patent on this, I documented it nevertheless because it has a fairly long laundry list of randomizations applied to machine code generation in modern-day JIT compilers.

Read more…

Indirect jmp instructions in GCC

Tags:  control flow integrity indirect jumps

As part of understanding techniques to implement Control Flow Integrity, I am studying indirect jmp, indirect call and return instructions. In doing so, I wanted to understand the source of these instructions. The source of return instructions is fairly obvious (they are found in function epilogues). But the indirect jump and call instructions eluded me: what source code pattern generates these instructions? In some sense, you can hazard a guess that indirect call instructions arise from the use of function pointers (although I have been having a tough time figuring this out from the GCC source code). But indirect jumps?

Turns out, there are four sources of indirect jump instructions in code generated by GCC:

  • Computed goto statements.
  • Use of __builtin_nonlocal_goto.
  • Use of __builtin_longjmp.
  • Unconditional jump related optimization.
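The first item in the list can be sketched as follows. This is a minimal, hypothetical bytecode interpreter (the opcodes and the name run are assumptions made for illustration) that uses GCC's labels-as-values extension. Each goto *expr is a computed goto, which GCC typically compiles to an indirect jmp through a register or memory operand.

```c
/* Hypothetical opcodes: 0 = increment, 1 = double, 2 = halt. */
int run(const unsigned char *code)
{
    /* GCC extension: &&label yields the address of a label as a void *. */
    static void *dispatch[] = { &&op_inc, &&op_dbl, &&op_halt };
    int acc = 0;

    goto *dispatch[*code++];    /* computed goto: jump target known only at run time */

op_inc:
    acc += 1;
    goto *dispatch[*code++];
op_dbl:
    acc *= 2;
    goto *dispatch[*code++];
op_halt:
    return acc;
}
```

Calling run on the opcode sequence 0, 0, 1, 0, 2 increments twice, doubles, increments once more and halts, returning 5. The sketch assumes every opcode is in range and that the sequence ends with a halt.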

Why is this relevant? Good question. I found this in my notes today (13th Feb 2014) and have forgotten why I recorded it. Anyway, I am posting it here for the sake of completeness, in case I ever go back to reading about indirect jumps.

Resurrecting libdwarf

Tags:  download libdwarf

libdwarf seems to have disappeared from SGI's website. The last known location of libdwarf was here, but the domain. There appears to be an alternate implementation of libdwarf on SourceForge, but it seems to house more than I currently require. To get around the situation, I extracted libdwarf from hpctoolkit-externals.

Update [Sep 2013]: Thanks to Bill Williams from for pointing out that libdwarf is available for download from prevanders.

Memory maps in 64-bit Linux

Tags:  x86-64 linux segmentation

Today I learnt that everything I had studied about segmented memory while programming the 8085 and 8086 microprocessors was useless. Well, to be precise, segmentation has been almost abandoned in 64-bit x86 processors and the operating systems that run on them. And if you, like me, were trying to wrap your head around the complexities and inconsistencies that surround memory mapping on 64-bit processors in Linux (and partly, Windows as well), then here's something that might help.
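One quick way to look at the flat address space of a 64-bit Linux process is to read /proc/self/maps. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name dump_memory_map is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Copy this process's memory map to `out` and return the number of mappings,
   or -1 if /proc/self/maps cannot be opened (e.g. on a non-Linux system). */
int dump_memory_map(FILE *out)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char chunk[512];
    int count = 0;
    while (fgets(chunk, sizeof chunk, maps) != NULL) {
        fputs(chunk, out);
        /* A long line may arrive in several chunks; count it only once,
           when the chunk that ends in a newline is read. */
        if (strchr(chunk, '\n') != NULL)
            count++;
    }
    fclose(maps);
    return count;
}
```

Calling dump_memory_map(stdout) prints one line per mapping (address range, permissions, offset, device, inode and, where present, the backing path).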

Read more…

Hello, World!

Tags:  hello-world intro

Need I say more?