NLnet Grant Proposal -- Libre-Chip's CPU with a Programmable Decoder to Run Multiple ISAs at Full Speed -- 2025-12-681
Project Name: Libre-Chip's CPU with a Programmable Decoder to Run Multiple ISAs at Full Speed
Website/Wiki: https://libre-chip.org/
Abstract
Modern computers are built on several popular, mutually incompatible ISAs such as x86_64, PowerISA, AArch64, and RISC-V. Many of the most popular ISAs have no high-speed libre/open-source implementations, which makes them much harder to trust since you can't inspect their source code to look for bugs or secret backdoors. Additionally, there are essentially no modern CPUs that can run more than one of those ISAs without software emulation, which is slow and often buggy.
To solve those issues, we are building a libre-licensed CPU with speculative out-of-order superscalar execution and a programmable decoder (loosely inspired by FPGAs) followed by a µOp cache, so the CPU can be programmed to decode and run just about any ISA you select at full speed. The most common instructions are handled entirely in hardware. Some less common instructions that can still easily be executed in hardware are decoded by fallback software, which stores the decoded instructions in the µOp cache. The remaining instructions are fully emulated in software.
Relevant Previous Involvement
- Jacob Lifshay -- Currently working on https://nlnet.nl/project/Libre-Chip-proof/. Worked on designing PowerISA CPUs with Libre-SOC for 5 years, built a simple OoO superscalar CPU simulator (https://salsa.debian.org/Kazan-team/power-cpu-sim), and built an RV32I CPU with VGA output in a few weeks that runs a 2.5D maze game (https://github.com/programmerjake/rv32). He is also the main author of the Fayalite HDL library. Has some experience writing compilers: Fayalite's simulator is implemented as a simple compiler to an internal p-code; he wrote a compiler from a language based on QuickBASIC to x86 (it successfully compiled a program he also wrote that animates a 3D diagram of a molecule); and he wrote a JavaScript interpreter that properly handles running generators, compiles to an internal p-code, and has some optimization passes for specialization based on deduced types. Has experience writing and using LLVM IR, and is a contributor to LLVM and Rust.
- Cesar Strauss -- Currently working on https://nlnet.nl/project/Libre-Chip-proof/. Contributed to the Libre-SOC project, mostly on digital design and formal verification. Presented the talk "An introduction to Formal Verification of Digital Circuits" at FOSDEM 2024 (https://archive.fosdem.org/2024/schedule/event/fosdem-2024-2215-an-introduction-to-formal-verification-of-digital-circuits/).
- Tobias Platen -- Currently working on https://nlnet.nl/project/Libre-Chip-proof/. Contributed to the Libre-SOC project, mostly on ECP5 FPGA prototypes and DDR SDRAM memory interfaces. Presented the talk "Using the ECP5 for Libre-SOC prototyping" at FOSDEM 2024 (https://archive.fosdem.org/2024/schedule/event/fosdem-2024-2060-using-the-ecp5-for-libre-soc-prototyping/).
Requested support
Requested Amount €105000
Cost Explanation
Libre-Chip is currently funded by NLnet for https://nlnet.nl/project/Libre-Chip-proof/, which we expect to be mostly completed by the time we can start working on this grant.
We are requesting more than €50000 based on Jacob Lifshay having previously completed a grant as part of Libre-SOC. If that is not allowed, we can adjust our budget down by removing some tasks, and possibly by not implementing as much of the custom compiler, leaving some of it as future work.
We're aiming for an FTE rate of $69305.60/yr per person, which is the rate used in our grant application for https://nlnet.nl/project/Libre-Chip-proof/.
Estimated Budget:
- € 40000 Add missing features to our CPU, such as memory paging, floating-point instructions, a better cache hierarchy, and better compatibility with the PowerISA specification.
- € 20000 Add the programmable decoder and µOp cache to our CPU design.
- € 20000 Build a compiler that can extract the decoder portion of QEMU using pattern matching and some symbolic execution of LLVM IR, converting it to an HDL IR more suitable for hardware.
- € 15000 Write code to convert the HDL IR to a bitstream we can program into the decoder.
- € 10000 Get the fallback software decoder and the software instruction emulator to work, as well as misc. other parts of the compiler needed to make the whole system work together.
Compare with existing/historical efforts
The Transmeta Crusoe is somewhat similar in that it implemented x86 by using software JIT compilation to translate to an internal VLIW instruction set. Our proposal differs in two ways: we have hardware to handle the most common instructions instead of relying on software for everything, and we aim for compatibility with much more than just x86.
Technical challenges we expect to solve during the project
We expect to solve 3 technical challenges:
- To design and write a working programmable decoder and µOp cache such that our CPU can run arbitrary ISAs. For now we're planning on it being able to support older x86_64 (where the patents have expired), PowerISA, and RISC-V, though we don't necessarily expect to have complete support for all of those within the scope of this grant.
- To build a custom compiler that can successfully extract a decoder for the most common instructions of the user's selected ISA, as well as generate a bitstream to program our decoder.
- To continue work on getting our CPU to be more complete and able to run more complex software.
A WIP high-level design of our CPU: https://libre-chip.org/first_arch/index.html
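As a rough illustration of the three decode paths described in the abstract, here is a minimal behavioral sketch in Python. It is not our actual design: the `Pattern` and `DecodeModel` names, the mask/match table format, the address-keyed µOp cache, and the choice of which instructions land in which path are all assumptions made for this example. The RISC-V encodings are used only as sample data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Pattern:
    """One row of a hypothetical mask/match decode table."""
    mask: int
    match: int
    uop: str

    def matches(self, word: int) -> bool:
        return (word & self.mask) == self.match


@dataclass
class DecodeModel:
    hw_table: List[Pattern]   # patterns programmed into the hardware decoder
    sw_table: List[Pattern]   # patterns known only to the fallback software decoder
    uop_cache: Dict[int, str] = field(default_factory=dict)  # address -> µOp (assumed keying)

    def decode(self, addr: int, word: int) -> Tuple[str, Optional[str]]:
        """Return (path taken, µOp name or None if the instruction is emulated)."""
        if addr in self.uop_cache:
            return ("uop-cache", self.uop_cache[addr])
        for pattern in self.hw_table:
            if pattern.matches(word):
                self.uop_cache[addr] = pattern.uop
                return ("hardware", pattern.uop)
        for pattern in self.sw_table:
            if pattern.matches(word):
                # The slow software decode runs once; later fetches hit the cache.
                self.uop_cache[addr] = pattern.uop
                return ("software-decode", pattern.uop)
        # No µOp exists for this instruction: fall back to full software emulation.
        return ("software-emulate", None)


# RISC-V encodings as sample data; the path assigned to each one is an assumption.
ADDI = Pattern(mask=0x0000707F, match=0x00000013, uop="addi")
ADD = Pattern(mask=0xFE00707F, match=0x00000033, uop="add")
DIV = Pattern(mask=0xFE00707F, match=0x02004033, uop="div")

model = DecodeModel(hw_table=[ADDI, ADD], sw_table=[DIV])
for addr, word in [(0x1000, 0x00500093), (0x1004, 0x002081B3),
                   (0x1008, 0x0220C1B3), (0x1008, 0x0220C1B3),
                   (0x100C, 0x00000073)]:
    print(hex(addr), model.decode(addr, word))
```

In this model the second fetch of the `div` at address 0x1008 is served from the µOp cache, which is the reason software-decoded instructions can still run at close to hardware speed once they are warm.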
Text that was intended to be part of the Abstract, but didn't fit within the 1200-character length constraint: Additionally, you can easily switch between selected ISAs by reprogramming the decoder, allowing you to e.g. run an x86_64 program alongside a RISC-V program in the same OS and quickly context-switch between them.
We are planning on building a custom compiler so the user can select an ISA and have that compiler process the source code of QEMU to automatically generate the required bitstream for the programmable decoder, as well as the software required for decoding and/or emulating the remaining parts of the chosen ISA.
Our CPU is a continuation of the CPU from https://nlnet.nl/project/Libre-Chip-proof/, which we will also be extending to support more features such as memory paging, floating-point instructions, and better compatibility with the PowerISA specification.
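To make the compiler's job more concrete, the sketch below shows one possible way to split an ISA's instructions between the hardware decoder, the fallback software decoder, and the software emulator. This is only a hypothetical heuristic, not the algorithm we have committed to: the `InsnInfo` fields, the greedy most-frequent-first strategy, the `decoder_capacity` budget, and all numbers are illustrative assumptions rather than measurements.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class InsnInfo:
    name: str
    frequency: float    # assumed share of executed instructions (made-up values)
    has_uops: bool      # True if the CPU backend could execute it directly
    decoder_cost: int   # hypothetical amount of decoder resources it would need


def partition(insns: List[InsnInfo],
              decoder_capacity: int) -> Tuple[List[str], List[str], List[str]]:
    """Greedily split instructions into (hardware, software-decode, emulate)."""
    hardware: List[str] = []
    software_decode: List[str] = []
    emulate: List[str] = []
    remaining = decoder_capacity
    # Most frequently executed instructions get first pick of decoder resources.
    for insn in sorted(insns, key=lambda i: i.frequency, reverse=True):
        if not insn.has_uops:
            emulate.append(insn.name)          # no µOps exist: emulate in software
        elif insn.decoder_cost <= remaining:
            hardware.append(insn.name)         # fits in the programmable decoder
            remaining -= insn.decoder_cost
        else:
            software_decode.append(insn.name)  # executable, but decoded by software
    return hardware, software_decode, emulate


example = [
    InsnInfo("add", 0.30, True, 1),
    InsnInfo("load", 0.25, True, 2),
    InsnInfo("branch", 0.20, True, 2),
    InsnInfo("div", 0.01, True, 2),
    InsnInfo("csr_access", 0.001, False, 0),
]
print(partition(example, decoder_capacity=5))
```

With a capacity of 5 in this toy model, `add`, `load`, and `branch` fill the hardware decoder, `div` is left to the software decoder, and `csr_access` is emulated.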
Ecosystem
We are likely to work with QEMU upstream, as well as LLVM and Clang. We are already working with the FIRRTL specification GitHub repo to resolve problems we encounter, as well as with LLVM CIRCT and the Rust language.
This project benefits Europeans (as well as everyone else) by providing a libre/open-source CPU with good performance that supports many of the most popular ISAs all on the same CPU.