+886 932-230-966
Remote, during semester (2025/11 - Present)
Focusing on software deployment and performance validation within the AMD ROCm ecosystem.
Deployed a single-node, multi-GPU (8x MI325X) inference environment using Docker Compose (with LMCache and vLLM), improving large language model inference efficiency through prefill-decode (PD) separation.
hipify-perl and CMake, enabling execution on AMD GPUs.
================================================================================
T/V                N    NB     P     Q               Time          Gflops (   per GPU)
--------------------------------------------------------------------------------
WR0          1853440  2048    16    16             369.72       1.148e+07 ( 4.485e+04)
HPL_pdgesv() start time Sat Mar 7 02:41:45 2026
HPL_pdgesv() end time Sat Mar 7 02:47:54 2026
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.000264582858 ...... PASSED
||Ax-b||_oo . . . . . . . . . . . . . . . . . = 0.0000001085912040
||A||_oo . . . . . . . . . . . . . . . . . . . = 464354.0734328660182655
||x||_oo . . . . . . . . . . . . . . . . . . . = 4.2953119208323303
||b||_oo . . . . . . . . . . . . . . . . . . . = 0.9923124922204719
================================================================================
.gitignore templates for HIP projects to enhance developer experience.