About: GPU computing kernels are relatively simple to write when peak performance is not the top priority. Tuning and optimizing a kernel for the highest performance, however, quickly becomes a much more daunting task because of GPUs’ massive parallelism, complex memory hierarchy, fine-grained synchronization, and long memory access latency. Users must therefore carry out the complex tasks of profiling, analyzing, and tuning to reduce performance bottlenecks. Today’s GPUs can generate hundreds of performance events that comprehensively quantify the behavior of a kernel. Instead of relying on experts’ manual analysis, this paper uses machine learning methods to generalize from GPU performance counter data and determine the characteristics of a GPU kernel, since these characteristics reveal possible reasons for low performance. We choose a set of problem-independent counters as inputs, and design and compare three machine learning methods that automatically classify the execution behavior of a kernel. Experimental results on stencil computing kernels and sparse matrix multiplications show that the machine learning models achieve good accuracy, and demonstrate a feasible approach for classifying a kernel’s characteristics and suggesting changes to a skilled user, who can then improve kernel performance with less guesswork.
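
The classification step described in the abstract can be pictured with a small sketch. The abstract does not name the three machine learning methods or the counters it uses, so everything here is an assumption made for illustration: a nearest-centroid classifier stands in for the unnamed methods, and the three features, the class labels, and all numeric values are invented toy data rather than measurements from the paper.

```python
from math import dist

# Hypothetical problem-independent features, each a ratio in [0, 1]:
# (achieved occupancy, DRAM bandwidth utilization, branch divergence fraction).
# Labels and values are invented toy data, not results from the paper.
TRAINING = {
    "memory_bound": [(0.80, 0.92, 0.05), (0.75, 0.88, 0.10)],
    "divergence_bound": [(0.70, 0.30, 0.60), (0.65, 0.25, 0.55)],
    "low_occupancy": [(0.20, 0.35, 0.05), (0.25, 0.40, 0.10)],
}


def centroid(points):
    """Return the component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train(labeled):
    """Build a nearest-centroid model: one mean feature vector per label."""
    return {label: centroid(points) for label, points in labeled.items()}


def classify(model, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: dist(model[label], features))


model = train(TRAINING)

# Classify a new (hypothetical) kernel profile.
print(classify(model, (0.78, 0.90, 0.07)))
```

Because the features are ratios rather than raw event counts, the same model can in principle be applied to kernels of different problem sizes, which is one plausible reading of "problem-independent" in the abstract.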

An Entity of Type: fabio:Abstract, within Data Space: covidontheweb.inria.fr, associated with source document(s)

Subject
  • Artificial intelligence
  • Virtual reality
  • Australian inventions
  • OpenCL compute devices
  • Operating system kernels
  • GPGPU
  • Hardware acceleration
  • Graphics hardware
  • Application-specific integrated circuits
  • Graphics processing units