Toward standardized near-data processing with unrestricted data placement for GPUs
Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems
Anatomy of GPU Memory System for Multi-Application Execution
Proceedings of the 2015 International Symposium on Memory Systems - MEMSYS ’15
2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA)
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis - SC ’17
Kevin Hsieh
Mike O’Connor
Gwangsun Kim
Stephen W. Keckler
Mahmut T. Kandemir
Nandita Vijaykumar
Data movement aware computation partitioning