[Help] Paper request: HyFiSS: A Hybrid Fidelity Stall-Aware Simulator for GPGPUs

Posted on 2025-6-5 10:05:05
Bounty: 50 assets (resolved)
https://www.computer.org/csdl/pr ... 700a168/22niuJK8Uj6
Abstract

The widespread adoption of GPUs has driven the development of GPU simulators, which, in turn, lead to advancements in both GPU architectures and software optimization. Trace-driven cycle-accurate simulators, which provide detailed microarchitectural models and clock-level precision, come at the cost of extended simulation times and require high computational resources. Their scalability has become a bottleneck. A growing trend is the adoption of cycle-approximate simulators, which introduce mathematical modeling of partial hardware units and utilize sampling to accelerate simulation. However, this approach faces challenges regarding the accuracy of performance predictions. To address these limitations, we introduce HyFiSS, a hybrid fidelity stall-aware GPU simulator. HyFiSS features fine-grained stall event tracking and attribution by constructing a detailed execution pipeline model for various stall events on Streaming Multiprocessors (SMs). It accurately emulates the thread block scheduler behavior using real-time scheduling logs and utilizes sampling based on thread block sets to minimize the precision loss due to fine-grained sampling points on the microarchitectural state. We achieve a balance between reliability, speed, and the level of simulation detail, especially regarding bottlenecks. By evaluating a diverse set of benchmarks, HyFiSS achieves a mean absolute percentage error in predicting active cycles that is comparable to the state-of-the-art cycle-accurate simulator Accel-Sim. Moreover, HyFiSS achieves a substantial 12.8× speedup in simulation efficiency compared to Accel-Sim. HyFiSS also requires at least 3.2× less disk storage than both Accel-Sim and another state-of-the-art cycle-approximate simulator, PPT-GPU, due to its efficient SASS (Streaming Assembler) trace compression. With precise, per-cycle stall event statistics, HyFiSS can provide accurate GPU performance metrics and stall cause reporting. This significantly simplifies performance analysis, bottleneck identification, and performance optimization tasks for researchers, making it easier to enhance GPU performance effectively.
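
For readers unfamiliar with the two headline metrics in the abstract, here is a minimal Python sketch of how a mean absolute percentage error (MAPE) and a simulation speedup are typically computed. It is not taken from the paper or from the HyFiSS code base. The function names `mape` and `speedup` and all the numbers are hypothetical, chosen only for illustration.

```python
def mape(predicted, measured):
    """Mean absolute percentage error between two sequences, in percent."""
    if len(predicted) != len(measured) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, m in zip(predicted, measured):
        if m == 0:
            raise ValueError("measured values must be non-zero")
        total += abs(p - m) / abs(m)
    return 100.0 * total / len(measured)


def speedup(baseline_seconds, new_seconds):
    """Ratio of baseline run time to new run time (larger is faster)."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return baseline_seconds / new_seconds


# Made-up example values, NOT results from the paper:
# active cycles predicted by a simulator vs. measured on hardware.
simulated_cycles = [110.0, 90.0, 200.0]
hardware_cycles = [100.0, 100.0, 200.0]

print(f"MAPE: {mape(simulated_cycles, hardware_cycles):.2f}%")
print(f"Speedup: {speedup(60.0, 5.0):.1f}x")
```

A lower MAPE means the predicted active cycles track the hardware measurements more closely. The speedup is simply the wall-clock time of the baseline simulator divided by that of the faster one.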

Posted on 2025-6-5 10:05:06

HyFiSS_A_Hybrid_Fidelity_Stall-Aware_Simulator_for_GPGPUs.pdf

1.07 MB, downloads: 16, download cost: 2 信元 (assets -2 信元)

Original poster | Posted on 2025-6-5 16:11:26

Posted on 2025-6-5 21:07:23
thanks

Posted on 2025-6-6 07:05:50
Thanks

Posted on 2025-6-6 10:52:03
Thanks for sharing, thanks for sharing, thanks for sharing

Posted on 2025-6-7 09:58:35