RISC-V FPGA Keyword Spotting Speedup
ISEF Category: Embedded Systems
Ready to Turn This Idea Into a Real Project?
This guide was put together with the help of AI research tools to give you a solid starting point. But a competitive science fair project lives in the details: refining your research question, fine-tuning your variables, analyzing your data, and presenting your findings like a seasoned scientist.
For next steps tailored to your interests, skill level, and timeline, work one-on-one with a MehtA+ mentor. Learn more about MehtA+ Science & Engineering Research Mentorship →
Subcategory: Other · Difficulty: Advanced · Setup: University Lab · Time: Full Year
The Hook
A tiny chip can listen for a wake word and make a decision before you finish saying the next syllable. That speed matters when power is low and every cycle counts. Your project asks a sharp question: can one custom instruction make a small processor much better at keyword spotting?
What Is It?
This project studies a simple idea with a big payoff. You start with a RISC-V soft-core, which means a processor design built in an FPGA instead of a fixed chip. Then you add one custom instruction for int4 matrix multiply, which multiplies and accumulates 4-bit integers, the core arithmetic behind many quantized machine learning models.
Think of it like adding a specialty tool to a toolbox. A normal processor can do the job with many small steps. A custom instruction bundles that work into one step, so the chip may finish faster, use less energy, or both. In this project, you check whether that extra hardware helps on a speech keyword spotting task, which asks a device to recognize short spoken words.
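To make the idea concrete, here is a minimal Python reference model of what such an instruction might compute. It assumes a design where eight signed 4-bit values are packed into each 32-bit register and the instruction returns their dot product plus an accumulator. The packing order and the names `pack_int4`, `unpack_int4`, and `dot8_int4` are illustrative assumptions, not part of the RISC-V specification or of any particular core.

```python
def pack_int4(values):
    """Pack eight signed 4-bit integers (-8..7) into one 32-bit word.

    Assumed layout: element i occupies bits 4*i .. 4*i+3 (hypothetical).
    """
    assert len(values) == 8
    word = 0
    for i, v in enumerate(values):
        assert -8 <= v <= 7, "value does not fit in a signed 4-bit field"
        word |= (v & 0xF) << (4 * i)
    return word


def unpack_int4(word):
    """Recover the eight signed 4-bit integers from a packed 32-bit word."""
    out = []
    for i in range(8):
        nibble = (word >> (4 * i)) & 0xF
        # Two's complement: nibbles 8..15 represent -8..-1.
        out.append(nibble - 16 if nibble >= 8 else nibble)
    return out


def dot8_int4(a_word, b_word, acc=0):
    """Software model of the hypothetical instruction: acc + sum(a[i] * b[i])."""
    a = unpack_int4(a_word)
    b = unpack_int4(b_word)
    return acc + sum(x * y for x, y in zip(a, b))


a = [1, -2, 3, -4, 5, -6, 7, -8]
b = [7, 6, 5, 4, 3, 2, 1, 0]
result = dot8_int4(pack_int4(a), pack_int4(b))
print(result)
```

A model like this can double as a golden reference: the hardware instruction and the pure software baseline should both match it bit for bit on the same inputs.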
Why This Is a Good Topic
This is a strong science fair topic because you can measure clear outputs, like speed, resource use, and accuracy. You also get a real-world link to edge AI, where tiny devices need to run models without draining a battery. A student can learn how hardware and software trade off, how benchmarks work, and how to compare one design against a baseline with real data.
Research Questions
- How does adding a custom int4 matrix-multiply instruction change inference speed for keyword spotting?
- What is the effect of the custom instruction on FPGA resource use compared with a pure software implementation?
- Does the custom instruction change keyword spotting accuracy when the same model weights are used?
- To what extent does int4 quantization affect speed and accuracy across different keyword sets?
- Which benchmark input sizes show the largest speedup from the custom instruction?
- How does the custom instruction affect energy use per inference compared with the baseline?
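The quantization question above can be explored in software before any hardware exists. The sketch below assumes simple symmetric per-tensor quantization, which is only one of several possible schemes, and uses made-up weights purely for illustration. The function names are hypothetical.

```python
def quantize_symmetric(weights, bits):
    """Map floats to signed integers of the given width using one shared scale."""
    qmax = 2 ** (bits - 1) - 1               # 7 for int4, 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Convert quantized integers back to approximate float values."""
    return [v * scale for v in q]


# Illustrative weights only, not taken from a real model.
weights = [0.70, -0.35, 0.10, 0.02, -0.61]

max_err = {}
for bits in (4, 8):
    q, scale = quantize_symmetric(weights, bits)
    restored = dequantize(q, scale)
    max_err[bits] = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"int{bits}: q={q} max abs error={max_err[bits]:.4f}")
```

The rounding error per weight is larger at 4 bits than at 8 bits, which is why accuracy should be measured for each keyword set rather than assumed.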
Basic Materials
- Tang Nano FPGA board with USB cable.
- Computer with Linux or Windows for HDL synthesis and flashing.
- Open-source RISC-V soft-core source code.
- FPGA design tools that support the Tang Nano board.
- Keyword spotting dataset or recorded voice clips.
- Python installed with data analysis libraries.
- Logic analyzer or serial monitor for timing checks.
- Spreadsheet software for recording benchmark results.
Advanced Materials
- FPGA board with power measurement access or external power meter.
- Oscilloscope or power analyzer for current profiling.
- Open-source RISC-V core and custom instruction toolchain.
- Verilator or another simulator for cycle-level testing.
- Speech keyword spotting model converted to int4 weights.
- Test harness for automated benchmark runs.
- GTKWave or similar waveform viewer for inspecting simulation traces if needed.
- Git for version control and experiment tracking.
Software & Tools
- Python: Runs benchmark scripts, organizes timing data, and plots speed and accuracy comparisons.
- NumPy: Handles arrays for performance calculations and summary statistics.
- Pandas: Stores results from many test runs in a clean table.
- Matplotlib: Makes charts that compare the baseline and the custom instruction.
- Verilator: Simulates the processor design before you flash hardware.
Experiment Steps
- Define the exact performance claim you want to test, such as speed, energy, or resource use.
- Choose one baseline processor build and one modified build so the comparison stays fair.
- Plan a keyword spotting benchmark that uses the same model, the same inputs, and the same output metric for both builds.
- Decide how you will measure runtime, accuracy, and hardware cost, then write down the measurement method before testing.
- Build a control set that checks whether any gain comes from the new instruction, not from unrelated compiler or code changes.
- Map out how you will repeat runs, summarize variation, and compare results with simple statistics.
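The last step, summarizing repeated runs, might be sketched as follows. The cycle counts are placeholder numbers invented for illustration, not measured results, and the variable names are assumptions. The sketch uses only Python's standard `statistics` module; NumPy or Pandas could do the same job on larger result tables.

```python
import statistics

# PLACEHOLDER cycle counts per inference (not real measurements).
baseline_cycles = [120_400, 120_380, 120_410, 120_395, 120_405]
custom_cycles = [48_210, 48_190, 48_205, 48_200, 48_195]


def summarize(cycles):
    """Return the mean and sample standard deviation of repeated runs."""
    return statistics.mean(cycles), statistics.stdev(cycles)


base_mean, base_sd = summarize(baseline_cycles)
cust_mean, cust_sd = summarize(custom_cycles)

# Speedup is the ratio of mean baseline cycles to mean custom-build cycles.
speedup = base_mean / cust_mean

print(f"baseline: mean={base_mean:.0f} cycles, sd={base_sd:.1f}")
print(f"custom:   mean={cust_mean:.0f} cycles, sd={cust_sd:.1f}")
print(f"speedup:  {speedup:.2f}x")
```

Reporting the spread alongside the mean shows whether the difference between builds is larger than the run-to-run variation.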
Common Pitfalls
- Changing the benchmark code between the baseline and the modified core, which makes the speed comparison unfair.
- Using a model that is not truly int4 quantized, which hides whether the custom instruction helps the intended workload.
- Measuring only wall-clock time from the host computer instead of cycle counts from the FPGA, which blurs the processor-level result.
- Ignoring synthesis reports, which leaves you blind to whether the custom instruction costs too many logic resources.
- Testing on one tiny sample set, which makes keyword spotting accuracy look better or worse than it really is.
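The last pitfall can be made concrete with a confidence interval. The sketch below uses the Wilson score interval for a proportion, one common choice among several, to show how uncertain an accuracy figure is at different test-set sizes. The counts are hypothetical examples, not results.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for an accuracy of correct/total."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Hypothetical counts: both sets show 90% accuracy, at very different sizes.
small = wilson_interval(18, 20)
large = wilson_interval(900, 1000)

print(f"20 clips:   {small[0]:.3f} to {small[1]:.3f}")
print(f"1000 clips: {large[0]:.3f} to {large[1]:.3f}")
```

With only 20 clips, the interval is wide enough that a small accuracy difference between two builds would be hard to distinguish from chance.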
What Makes This Competitive
A stronger project shows more than a speedup chart. You would compare several model sizes, report hardware cost, and separate compute gains from memory or bus effects. You would also use repeated trials and a fair baseline so your result stands up to scrutiny. A project gets even stronger if you explain why the custom instruction helps and where it stops helping.
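One standard way to reason about where a custom instruction stops helping is Amdahl's law: if only a fraction of total runtime is spent in the accelerated kernel, the overall speedup is capped no matter how fast that kernel becomes. The sketch below is a minimal illustration with hypothetical numbers; the fraction for a real model would have to come from profiling.

```python
def overall_speedup(accel_fraction, kernel_speedup):
    """Amdahl's law: whole-program speedup when only part of the runtime is accelerated.

    accel_fraction: share of baseline runtime spent in the accelerated kernel (0..1)
    kernel_speedup: how many times faster that kernel runs with the new instruction
    """
    return 1.0 / ((1.0 - accel_fraction) + accel_fraction / kernel_speedup)


# Hypothetical case: 80% of runtime is matrix multiply.
for s in (2, 8, 1000):
    print(f"kernel {s}x faster -> overall {overall_speedup(0.8, s):.2f}x")
```

In this hypothetical case the overall speedup can never reach 5x, because the remaining 20% of the runtime (memory loads, activation functions, control code) is untouched by the new instruction.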
Project Variations
- Test the same custom instruction on a different edge AI task, such as gesture or sensor classification.
- Compare int4, int8, and float implementations to see how quantization changes speed and accuracy.
- Measure energy per inference on the FPGA instead of, or in addition to, raw runtime.
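For the energy variation, a first-order estimate multiplies average power by the time one inference takes, where time is cycle count divided by clock frequency. The sketch below assumes power stays roughly constant during inference, which a real measurement would need to check. All numbers are placeholders, not measurements from any board.

```python
def energy_per_inference_joules(avg_power_watts, cycles, clock_hz):
    """Estimate energy as average power times inference time (cycles / clock)."""
    return avg_power_watts * cycles / clock_hz


# PLACEHOLDER values for illustration only.
clock_hz = 27_000_000          # assumed clock frequency
avg_power_watts = 0.10         # assumed average board power during inference

for label, cycles in (("baseline", 27_000_000), ("custom", 13_500_000)):
    e = energy_per_inference_joules(avg_power_watts, cycles, clock_hz)
    print(f"{label}: {e * 1000:.1f} mJ per inference")
```

If the custom instruction raises average power, the energy saving will be smaller than the cycle-count saving, so both quantities need to be measured for each build.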
Learn More
- The RISC-V Instruction Set Manual: Find the official base and custom instruction rules on the RISC-V International site.
- MIT OpenCourseWare Digital Systems Laboratory: Search MIT OpenCourseWare for FPGA and processor design labs that explain hardware testing.
- IEEE Xplore: Search through your school library for review articles on quantized neural network hardware.
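- NIH PubMed: Search PubMed for review articles on keyword spotting, edge AI, and low-power inference. PubMed mainly indexes biomedical literature, so expect results to focus on health and wearable-device applications.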
- NASA Technical Reports Server: Search NTRS for embedded machine learning hardware reports and benchmarking methods.
- Open-source FPGA documentation for Tang Nano: Find board guides, pinouts, and tool setup notes on the vendor and community documentation pages.
Embedded Systems Category Guide
How to Do Real Embedded Systems Research at Home: A High School Student’s Guide to Free Tools, Affordable Kits, and Public Datasets →
