diff --git a/nlnet-2024-12-324/plan-amendment-1.txt b/nlnet-2024-12-324/plan-amendment-1.txt index 9a5b3a1..db673e5 100644 --- a/nlnet-2024-12-324/plan-amendment-1.txt +++ b/nlnet-2024-12-324/plan-amendment-1.txt @@ -21,14 +21,14 @@ This is for adding the tooling to run all the right programs to generate FPGA bi - {1500} Add support for the Orange Crab since both Cesar and Jacob have one. * {1000} Add support for the Arty A7 100T since that's what we're using for CI. -## {14250} Register Renaming, Execution, and Instruction Retire +## {11750} Register Renaming, Execution, and Instruction Retire This covers getting register renaming working, as well as scheduling, executing simple ALU and Branch instructions, and properly handling instruction retire. (Some of that work is already done.) A lot of this is the work to come up with a detailed low-level plan for the CPU, so I don't have a good idea of how complex or not this is, though I expect it to be probably 40% of the CPU's complexity. * {1500} Add to the simulator in Fayalite the ability to transfer non-HDL data (e.g. HashMap) through the digital signalling mechanism, this allows using those data types when writing procedural models. -- {9000} Create a model of the whole rename/execute/retire control system, using procedural implementations of the most complex HDL modules where appropriate. +- {6500} Create a model of the whole rename/execute/retire control system, using procedural implementations of the most complex HDL modules where appropriate. - {3750} Translate the procedural model to use actual synthesizeable HDL. includes a proof of correctness of the out-of-order CPU in relation to a sequential CPU (probably most easily done by adding the proof to the retire stage). ## {9000} Instruction Fetch/Decode @@ -43,7 +43,7 @@ https://git.libre-chip.org/libre-chip/parse_powerisa_pdf * {2000} Create a model of the instruction fetch/decode control system, using procedural implementations of the most complex HDL modules where appropriate. - {3000} Translate the procedural model to use actual synthesizeable HDL. -## {15000} Load/Store instructions +## {17500} Load/Store instructions This covers implementing the load/store hierarchy, including an L1 cache. For now, the CPU will only target on-FPGA memory blocks, as well as simple I/O devices. (Support for DRAM can be added at a later point outside of this grant.) @@ -51,18 +51,18 @@ It should include d-cache, some kind of memory, and at least one IO device. It should include at least lr/sc, some atomic fetch-op, cached load/store, and IO load/store (IO needs to wait until non-speculative to start executing). -- {1500} memory system: main memory and IO devices -- I'm expecting just a big sram to be good enough for simulation of memory, on the fpga we could probably get away with a relatively small sram and put off a dram interface for later. for the IO device, I'm thinking we'd have a simple fixed-frequency uart. +- {4000} memory system: main memory and IO devices -- I'm expecting just a big sram to be good enough for simulation of memory, on the fpga we could probably get away with a relatively small sram and put off a dram interface for later. for the IO device, I'm thinking we'd have a simple fixed-frequency uart. - {1500} d-cache - {3750} memory load execution unit (we'll want to be able to do more than one load at once) - {3750} memory store execution unit - {3000} adding atomics: lr/sc, atomic fetch-add (or other fetch-op) - {1500} adding order-violation detection logic, so we can make memory look like it has total-store-order (for x86), or even sequential consistency (meaning we can ignore all non-IO fences) -## {0} Work towards the Formal Proof of No Spectre bugs +## Work towards the Formal Proof of No Spectre bugs (Deprecated, left as future work) This covers working on the Formal Proof of No Spectre bugs as well as improvements to other software/hardware needed for that. A major portion of this is to figure out what exact properties we need to formally prove for each part of the CPU, so we can put those parts together to make a working proof for the entire CPU. I'm hoping this will be enough work to get nearly all of the proof written out, even if there are some flaws we discover that we'll have to put off fixing for a later grant. -! {0} Write Rocq and HDL logic for tracking which instructions will eventually be cancelled and which will eventually be retired. -! {0} (Partial) proof that our CPU but with zeroed outputs for all eventually-cancelled instructions is equivalent to our real CPU design. This may need significant modifications to the CPU. This task may be too big and need further subdividing once we've made some progress on the proof so we know where to subdivide it. \ No newline at end of file +- Write Rocq and HDL logic for tracking which instructions will eventually be cancelled and which will eventually be retired. +- (Partial) proof that our CPU but with zeroed outputs for all eventually-cancelled instructions is equivalent to our real CPU design. This may need significant modifications to the CPU. This task may be too big and need further subdividing once we've made some progress on the proof so we know where to subdivide it. \ No newline at end of file