Projects (individually/group)
Students will cooperate/collaborate to design different components of an
advanced out-of-order processor. Projects are composed of four major
components:
- Setup
- Design
- Implementation
- Testing
Setup (individually)
To understand the SYNOS infrastructure, there is a small group of setup
projects done at the beginning of the quarter. These projects will teach you
how to use the SYNOS infrastructure. Remember that these projects should be
done individually.
Before you start coding, several small projects are designed to get you
used to the SCOORE infrastructure. Our infrastructure uses the following
tools:
- VCS: Synopsys Verilog behavioral simulator, used to test our
infrastructure. Most testbenches use PLI.
- Synplicity Pro: an FPGA synthesis tool. We'll use Synplicity Pro to check
that our design is synthesizable and that we meet our target of 125MHz for
Stratix-II FPGAs.
- dc_shell: Synopsys Design Compiler with either 90nm ASIC or 180nm
student technology files. For 90nm the frequency target is 900MHz (550MHz
for 180nm).
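The frequency targets above translate directly into the clock period that a synthesis run has to meet. The following is only a small arithmetic sketch in Python; the dictionary keys and the `period_ns` helper are illustrative names, not part of the SYNOS/SCOORE infrastructure.

```python
# Frequency targets quoted in the tool list above, in MHz.
# The labels are illustrative; only the numbers come from the text.
TARGETS_MHZ = {
    "Stratix-II FPGA": 125.0,
    "90nm ASIC": 900.0,
    "180nm student": 550.0,
}


def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / freq_mhz


if __name__ == "__main__":
    for name, mhz in TARGETS_MHZ.items():
        print(f"{name}: {mhz:.0f} MHz -> {period_ns(mhz):.3f} ns")
```

For example, the 125MHz FPGA target corresponds to an 8 ns clock period, so every register-to-register path has to fit in that budget.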
More details about the setup projects are given on the lab slides.
- Setup Project 1: check lab01 slides
- Setup Project 2: get ps2.tar.gz and create a new testbench (implement a
testbench for synos/storage/rtl/scfifo).
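When writing a FIFO testbench, it can help to keep a small software reference model next to the simulation and compare the values the design produces against it. The sketch below is a generic bounded FIFO in Python. It does not reflect the actual scfifo interface: the class name, the `depth` parameter, and the push/pop methods are all assumptions made for illustration.

```python
from collections import deque


class FifoModel:
    """Hypothetical reference model of a bounded first-in first-out queue."""

    def __init__(self, depth: int):
        if depth <= 0:
            raise ValueError("depth must be positive")
        self.depth = depth
        self._q = deque()

    def full(self) -> bool:
        return len(self._q) >= self.depth

    def empty(self) -> bool:
        return len(self._q) == 0

    def push(self, value) -> bool:
        """Try to enqueue; return False (and drop nothing) when full."""
        if self.full():
            return False
        self._q.append(value)
        return True

    def pop(self):
        """Dequeue the oldest value, or return None when empty."""
        if self.empty():
            return None
        return self._q.popleft()


if __name__ == "__main__":
    model = FifoModel(depth=2)
    accepted = [model.push(v) for v in (10, 20, 30)]
    print(accepted)                    # third push is rejected: FIFO is full
    print(model.pop(), model.pop())    # values come out in insertion order
```

A testbench would typically drive the same push/pop sequence into the design and flag any cycle where the design's output or full/empty status disagrees with the model.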
Project (group)
While you are getting used to the SYNOS/SCOORE infrastructure, you should
start designing your project components on paper. Once the number of
students is clear, the instructor will assign projects in the second week
of class. The objective is to maximize the
chances of having a working processor by the end of the quarter.
Projects have one or two members. At every weekly lab, each student should
present a 5-minute summary of the project status.
List of projects: (check the synos/scoore directory for more details)
- IF/ID (Keertika): SCOORE instruction fetch.
  - Task 1: IF/ID without branches (handle stalls)
  - Task 2: handle branches
  - Task 3: call/returns and BTB predictions
  - Task 4: branch miss (update, recovery)
  - Task 5: Integrate with QEMU (assume all branches resolved in n-cycles)
  - Task 6: tune branch with different QEMU traces to maximize E*D^1.5/A
- Crack (Wael): SCOORE instruction crack stage (decode)
  - Task 1: BTAA (Branch Target Address ALU)
  - Task 2: crack0 (split instructions)
  - Task 3: crack0 (nullify some delay slots)
  - Task 4: crack1 (save/restore, 1 branch)
  - Task 5: crack1 (cluster selection)
  - Task 6: Integrate with QEMU
- RAT/ROB (Melisa): SCOORE rename/retire stage
  - Task 1: Free register list
  - Task 2: rename rat (front_rat)
  - Task 3: 1 + retirement rat (recycle registers)
  - Task 4: 1,2,3 + handle branch mispredictions
  - Task 5: ROB state, finish, and retire in-order
  - Task 6: 5 + verify in-order
  - Task 7: integration with QEMU
- SEED (Carlos): SCOORE issue logic
  - Task 1: dep bank insert
  - Task 2: dep bank remove (executed)
  - Task 3: dep table (multiple dep banks)
  - Task 4: wakeup loop (load speculation)
  - Task 5: pseudo-VLIW scheduler
  - Task 6: integration with QEMU
  - Task 7: Tune parameters so that QEMU traces maximize E*D^1.5/A
- Support (vacant): some basic functionality shared by several modules
  - Task 1: verify/implement cfifo when inputs == outputs
  - Task 2: implement cfifo when inputs != outputs
  - Task 3: simple_alu and simple_alu_icc
- CE (Andrew): SCOORE compute engine
  - Task 1: finish cunit
  - Task 2: cunit testbench
  - Task 3: bunit (branch unit)
  - Task 4: munit (memory unit, interface to l0_dcache)
  - Task 5: integrate with QEMU
  - Task 6: Check max IPCs for several QEMU traces
- L2/pseudo-memory (Hari): L2 (single proc) and memory interface
  - Task 1: L2 cache plug to mnet
  - Task 2: pseudo-memory plug to L2 (no DDR, just PLI interface rd/wr)
  - Task 3: recover memory state (pseudo-mem and L2) from QEMU
- MNET (vacant): mnet interconnect
  - Task 1: build mnet with virtual channels and 5 nodes per switch
  - Task 2: integrate with QEMU (loads/stores) check IPCs for several traces
- L0 & L1 data caches (Suraj): SCOORE memory subsystem
  - Task 1: L0 data cache
  - Task 2: L1 cache
  - Task 4: integrate with QEMU (loads/stores) check IPCs for several traces
- L0 instruction cache (Matt): SCOORE instruction cache
  - Task 1: L0 instruction cache
  - Task 2: flush/stall support
  - Task 3: next cache line prefetch
  - Task 4: integrate with QEMU (1 taken branch per cycle) check max IPCs for several traces
- BugHunt (Sangeetha): IVM bug hunt
  - Task 1: scvtools parse the IVM project
  - Task 2: introduce bugs on IVM. Check accuracy
  - Task 3: add other bug tracking (paths, value change...)
  - Task 4: add invariant track
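As a design-on-paper aid for the RAT/ROB tasks in the list above (free register list, rename RAT, and recycling registers at retirement), here is a minimal behavioral sketch in Python. It is not the SCOORE design: the class and method names, the register counts, and the choice to stall when the free list is empty are all assumptions made for illustration, and branch misprediction recovery is left out.

```python
from collections import deque


class RenameModel:
    """Hypothetical model of a rename table backed by a free register list."""

    def __init__(self, num_arch: int, num_phys: int):
        if num_phys <= num_arch:
            raise ValueError("need more physical than architectural registers")
        # Initially architectural register i maps to physical register i.
        self.rat = list(range(num_arch))
        # Remaining physical registers start out free.
        self.free = deque(range(num_arch, num_phys))

    def rename(self, dst: int, srcs: list):
        """Rename one instruction.

        Returns (phys_srcs, new_phys_dst, old_phys_dst), or None when no
        free physical register is available (the front end would stall).
        """
        if not self.free:
            return None
        # Sources read the current mapping before the destination updates it.
        phys_srcs = [self.rat[s] for s in srcs]
        new_phys = self.free.popleft()
        old_phys = self.rat[dst]
        self.rat[dst] = new_phys
        return phys_srcs, new_phys, old_phys

    def retire(self, old_phys: int) -> None:
        """At retirement, recycle the previous mapping of the destination."""
        self.free.append(old_phys)


if __name__ == "__main__":
    m = RenameModel(num_arch=4, num_phys=6)
    print(m.rename(dst=1, srcs=[1, 2]))
    print(m.rename(dst=1, srcs=[1]))
    print(m.rename(dst=2, srcs=[0]))  # free list is empty at this point
```

Carrying the old physical register along with each instruction is one common way to know what to recycle at retirement; other organizations are possible and may fit the actual pipeline better.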