🧠 GPT-2 Project in ECE408
Here is the report for our ParallelSlackers team project! 📄
(Actually, I combined the two reports into one.)
I learned a lot from this project. I worked on CUDA-based optimization of the GPT-2 model, applying techniques such as Tensor Cores, FlashAttention, and reduction strategies. I also used NVIDIA Nsight Systems and Nsight Compute to profile execution and analyze performance bottlenecks. Our team finished 3rd in the competition.
🐧 Custom Linux Operating System in ECE391
ThreadRipper was the best team!
We built a simple RISC-V operating system with key components such as virtual memory management and a filesystem with caching. We also added VirtIO devices, such as viogpu, so that we could run fun games on our OS.
Personally, I was responsible for the ELF loader, system calls, process management, and the shell interface. Going beyond the course requirements, I also extended the shell with tab completion backed by a trie and with command-history scrolling using the up and down keys.
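
To give a flavor of how trie-based tab completion can work, here is a minimal sketch in C. It is not the code from our OS: the names (TrieNode, trie_insert, trie_complete), the 7-bit ASCII alphabet, and the use of calloc/free from a hosted C library are all assumptions made for illustration. The idea is to walk the trie along the typed prefix, then keep extending while exactly one continuation exists.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ALPHABET 128  /* assumption: command names are 7-bit ASCII */

/* Hypothetical node layout: one child pointer per possible character. */
typedef struct TrieNode {
    struct TrieNode *child[ALPHABET];
    int is_word;      /* 1 if a command name ends at this node */
} TrieNode;

TrieNode *trie_new(void) {
    return calloc(1, sizeof(TrieNode));
}

void trie_free(TrieNode *node) {
    if (!node) return;
    for (int i = 0; i < ALPHABET; i++) trie_free(node->child[i]);
    free(node);
}

/* Insert a command name. Returns 0 on success, -1 on failure. */
int trie_insert(TrieNode *root, const char *word) {
    assert(root != NULL && word != NULL);
    TrieNode *node = root;
    for (; *word; word++) {
        unsigned char c = (unsigned char)*word;
        if (c >= ALPHABET) return -1;
        if (!node->child[c]) {
            node->child[c] = trie_new();
            if (!node->child[c]) return -1;
        }
        node = node->child[c];
    }
    node->is_word = 1;
    return 0;
}

/*
 * Copy prefix into out and append the longest unambiguous extension.
 * Returns the number of characters appended, or -1 if no stored name
 * starts with prefix (or out is too small to hold the prefix).
 */
int trie_complete(const TrieNode *root, const char *prefix,
                  char *out, size_t out_size) {
    assert(root != NULL && prefix != NULL && out != NULL && out_size > 0);
    const TrieNode *node = root;
    size_t len = 0;
    for (; *prefix; prefix++) {
        unsigned char c = (unsigned char)*prefix;
        if (c >= ALPHABET || !node->child[c] || len + 1 >= out_size) return -1;
        out[len++] = (char)c;
        node = node->child[c];
    }
    int added = 0;
    /* Stop at a complete name or at the first branching point. */
    while (!node->is_word && len + 1 < out_size) {
        int only = -1, count = 0;
        for (int i = 0; i < ALPHABET; i++) {
            if (node->child[i]) { only = i; count++; }
        }
        if (count != 1) break;
        out[len++] = (char)only;
        node = node->child[only];
        added++;
    }
    out[len] = '\0';
    return added;
}
```

With "cat", "cd", and "clear" inserted, completing "cl" fills the buffer with "clear", while completing "c" leaves it as "c" because three commands share that prefix. A fixed 128-pointer array per node keeps lookups simple at the cost of memory; a kernel shell with a small command set could afford that, but a compact child list would be a reasonable alternative.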