aengelke 6 days ago

Nice summary! Additional changes I have planned:

- Removing per-instruction timers, which add a measurable overhead even when disabled (https://github.com/llvm/llvm-project/pull/97046)

- Splitting AsmPrinterHandler (used for unwind info) and DebugHandler (used also for per-instruction location information) to avoid two virtual function calls per instruction (https://github.com/llvm/llvm-project/pull/96785) -- see the toy sketch after this list

- Removing several maps from ELFObjectWriter, including some std::map (changed locally, need to make a PR)

- Faster section allocation, removing the ELF "mergeable section info" hash maps (although this is called just ~40 times per object file, it is very measurable in JIT use cases when compiling many small objects) (planned)

- X86 encoding in general; this consumes quite some time and looks very inefficient -- having written my own x86 encoder, I'm confident that there's a lot of improvement potential. (not started)
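To make the virtual-call cost concrete, here is a toy illustration -- my reading of the handler-split point above, with invented class and function names, not the actual AsmPrinterHandler/DebugHandler code: with a single combined handler interface, the per-instruction emission loop has to call into every registered handler, even ones that only care about function-level events.

    // Hypothetical sketch, not LLVM code: the per-instruction virtual-call cost
    // of one combined handler interface versus a split one.
    #include <memory>
    #include <vector>

    struct Instr {};

    // Combined interface: unwind-info handlers inherit per-instruction hooks
    // they don't actually need.
    struct Handler {
      virtual ~Handler() = default;
      virtual void beginFunction() {}
      virtual void beginInstruction(const Instr &) {} // hot path
      virtual void endInstruction() {}                // hot path
    };

    void emitCombined(const std::vector<Instr> &Body,
                      const std::vector<std::unique_ptr<Handler>> &Handlers) {
      for (const Instr &I : Body)
        for (const auto &H : Handlers) {
          H->beginInstruction(I); // virtual call per instruction per handler
          H->endInstruction();    // and a second one
        }
    }

    // Split interface: only debug handlers expose per-instruction hooks, so the
    // hot loop only visits the handlers that need them.
    struct DebugHandler {
      virtual ~DebugHandler() = default;
      virtual void beginInstruction(const Instr &) = 0;
      virtual void endInstruction() = 0;
    };

    void emitSplit(const std::vector<Instr> &Body,
                   const std::vector<std::unique_ptr<DebugHandler>> &Debug) {
      for (const Instr &I : Body)
        for (const auto &H : Debug) {
          H->beginInstruction(I);
          H->endInstruction();
        }
    }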

Some takeaways on a higher level -- most of these aren't really surprising, but nonetheless are very frequent problems(/patterns) in the LLVM code base:

- Maps/hash maps/sets are quite expensive when used frequently, and sometimes can be easily avoided, e.g., with a vector or, for pointer keys, a pointer dereference -- see the map sketch after this list

- Virtual function(/abstraction) calls come at a cost, especially when made frequently

- raw_svector_ostream is slow, because writes are virtual function calls and don't get inlined (I previously replaced raw_svector_ostream with a SmallVector&: https://reviews.llvm.org/D145792) -- see the SmallVector sketch after this list

- Frequent heap allocations are costly, especially with glibc's malloc

- Many small inefficiencies add up (=> many small improvements do, too)
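To make the map point concrete, a minimal sketch (generic C++, not MC code; Symbol, SectionIndex, and the lookup* names are invented): when the key is a pointer you already hold, the lookup can become a plain field load, and a map keyed by a small dense integer ID can become a vector.

    // Hedged sketch of the "avoid the hash map" pattern; all names are
    // hypothetical.
    #include <cstdint>
    #include <map>
    #include <vector>

    struct Symbol {
      // Cache the value on the object itself instead of in a side map.
      uint32_t SectionIndex = ~0u;
    };

    // Before: one tree/hash lookup per query on a hot path.
    uint32_t lookupViaMap(const std::map<const Symbol *, uint32_t> &Indices,
                          const Symbol *S) {
      return Indices.at(S);
    }

    // After: the pointer key already leads to the data; this is a field load.
    uint32_t lookupViaMember(const Symbol *S) { return S->SectionIndex; }

    // Likewise, a map keyed by a small dense integer ID can be a plain vector.
    uint32_t lookupViaVector(const std::vector<uint32_t> &Table, uint32_t ID) {
      return Table[ID];
    }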
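And for the raw_svector_ostream point, roughly the shape of what D145792 does (the emit* names are invented, and this needs LLVM headers to build): appending to the SmallVector directly is an inlinable call, whereas raw_svector_ostream routes small writes through raw_ostream's virtual write path.

    // Hedged sketch: byte emission through raw_svector_ostream vs. directly
    // into the underlying SmallVector.
    #include "llvm/ADT/SmallVector.h"
    #include "llvm/Support/raw_ostream.h"
    #include <cstdint>

    // Before: each byte goes through raw_ostream; for raw_svector_ostream the
    // write ends up in a virtual write_impl call the caller cannot inline.
    void emitByteStream(llvm::raw_svector_ostream &OS, uint8_t B) {
      OS << static_cast<char>(B);
    }

    // After: push_back is inlined; the common case is a capacity check plus a
    // store, with no virtual dispatch.
    void emitByteVector(llvm::SmallVectorImpl<char> &Code, uint8_t B) {
      Code.push_back(static_cast<char>(B));
    }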

  • MaskRay 5 days ago

    Big thanks for the recent performance changes! The "many small inefficiencies" point resonates – it definitely shows how performance is hurt in many small areas.

    (I aim to write blog posts every 2-3 weeks, but this latest one was postponed... I wrote this one in a relatively short time so that the gap would not be too long, and I really should take time to refine the post.)

Keyframe 6 days ago

Side note, but I was looking for pre-built binaries in the releases of the LLVM project. Specifically, I was looking for clang+llvm releases for x86_64 Linux (Ubuntu preferably) in order to save some time (I've always had trouble compiling it) and to put it into my own `prefix` directory. It's kind of wild to see aarch64, armv7, powerpc64, x86_64_windows... but not something like this. I am aware of https://apt.llvm.org/ and its llvm.sh - but as I said, I'd prefer it to live in its own `prefix`. Does anyone know where else there might be pre-builts? There used to be something just like that for v17, like https://github.com/llvm/llvm-project/releases/download/llvmo...

matrix_overload 6 days ago

TLDR: building projects with Clang is now about 4% faster due to optimizations in the way it internally handles assembly.

  • JonChesterfield 6 days ago

    Perhaps more important, someone is going through MC and simplifying it. Decent chance that's a net reduction in bugs as well.

brcmthrowaway 6 days ago

[flagged]

  • Smaug123 6 days ago

    Did you have anything in mind? I must say, "we added LLMs to LLVM" is a scenario that fills me with horror.

    • pjmlp 6 days ago

      Actually, I expect that eventually we will have such a scenario.

      Instead of LLM => some language generated output => its compiler => executable

      We will get LLM => magic pixie dust => executable

      The dream of many corporate overlords.

      • plingbang 6 days ago

        Why not just an LLM-based interpreter that directly executes a PDF spec plus edits received by email? No need to recompile and restart the app. A DB is also not required - the LLM will naturally remember all user requests and figure out the current state. (We'll solve the limitations of context later)

      • wyldfire 6 days ago

        > We will get LLM => magic pixie dust => executable

        Indeed one of the challenges with using machine learning as a part of compilation is reasoning about it when trying to investigate reported defects.

        Some of the research focuses on simpler/more practical domains, such as the ordering of the compiler passes.

      • JonChesterfield 6 days ago

        Have it emit plausible-looking x64 instructions by training on lots of executables, get a program out which has some behaviour. Might be worth seed funding at the moment.

        • pjmlp 6 days ago

          Yeah, this is, however, the same kind of discussion as back in the day when Assembly developers didn't trust FORTRAN compilers, so it is a matter of time, and funding.

          • JonChesterfield 6 days ago

            The Fortran compilers were trying to get the answer right whereas the proposed funding void would at best be trying to avoid a segv.

            What probably does have real merit is tying a superoptimiser to an LLM, provided you've got the SAT solver included in the mix as well to know if it worked.

      • abainbridge 6 days ago

        You missed out the input to the LLM, which would presumably be a requirements spec with all behaviour specified in exact detail, including all the tricky corner cases where someone has to think hard about which solution is most useful and least confusing to the customer. Natural language isn't great for expressing such things. A formal notation would be easier. Perhaps something that makes it easy to express if-this-then-that kinds of things. I wonder if a programming language would be good for that.

        • pjmlp 6 days ago

          Indeed, that is why, based on offshoring experience, I see a future where the developers of tomorrow are mostly technical architects, with Star Trek style "Computer do XYZ".

          This has been tried before with UML (see Rational, Together, or Enterprise Architect); however, LLMs bring an additional automation step to the whole thing.

    • adrianN 6 days ago

      If you have a verification step behind the LLM that proves semantic equivalence between the original code and the LLM output, I could imagine scenarios where it can be beneficial.

    • binary132 6 days ago

      Error: I’m afraid I can’t let you compile that.

  • superb_dev 6 days ago

    Finally, LLVM can hallucinate brand new instruction sets