Article Source
- Title: GPU Accelerated Computing with C and C++
- Source: NVIDIA CUDA Zone
GPU Accelerated Computing with C and C++
With the CUDA Toolkit from NVIDIA, you can accelerate your C or C++ code by moving its computationally intensive portions to an NVIDIA GPU. In addition to drop-in library acceleration, the toolkit lets you tap the massively parallel power of a GPU efficiently with a few new syntactic elements and calls to functions from the CUDA Runtime API.
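As a rough illustration of those syntactic elements and Runtime API calls, here is a minimal sketch of element-wise vector addition. The names `add_kernel` and `add_on_gpu` and the block size of 256 threads are assumptions made for this example, not part of the toolkit. The `__global__` qualifier, the `<<<blocks, threads>>>` launch syntax, and the `cudaMalloc`, `cudaMemcpy`, `cudaGetLastError`, and `cudaFree` calls are the actual CUDA pieces.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// __global__ marks a kernel: a function that runs on the GPU and is
// launched from host code. Each thread handles one element.
__global__ void add_kernel(const float *a, const float *b, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the last, partial block
        out[i] = a[i] + b[i];
    }
}

// Hypothetical host helper (the name is an assumption for this sketch).
// Copies the inputs to the GPU, launches the kernel, and copies the
// result back. Returns true on success.
bool add_on_gpu(const float *a, const float *b, float *out, int n)
{
    if (n <= 0) return false;

    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    std::size_t bytes = static_cast<std::size_t>(n) * sizeof(float);

    // Allocate device memory and copy the inputs over.
    bool ok = cudaMalloc(&d_a, bytes) == cudaSuccess &&
              cudaMalloc(&d_b, bytes) == cudaSuccess &&
              cudaMalloc(&d_out, bytes) == cudaSuccess &&
              cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
              cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        int threads = 256;                          // assumed block size
        int blocks = (n + threads - 1) / threads;   // round up to cover n
        add_kernel<<<blocks, threads>>>(d_a, d_b, d_out, n);

        // The device-to-host copy waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(out, d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree accepts a null pointer, so this is safe on partial failure.
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return ok;
}
```

Saved in a `.cu` file and called from an ordinary `main`, this should build with the `nvcc` compiler that ships with the toolkit. It needs a CUDA-capable GPU to run.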
The CUDA Toolkit from NVIDIA is free and includes:
- Visual and command-line debugger
- Visual and command-line GPU profiler
- Many GPU optimized libraries
- The CUDA C/C++ compiler
- GPU management tools
- Lots of other features
Getting Started:
- Make sure you have an understanding of what CUDA is.
- Read through the Introduction to CUDA C/C++ series on Mark Harris’ Parallel Forall blog.
- Try CUDA by taking a self-paced lab on nvidia.qwiklab.com. These labs require only a supported web browser and a network that allows Web Sockets. To verify that your network and system support Web Sockets, check that all check marks in the “Web Sockets (Port 80)” section of the linked test page are green.
- Download and install the CUDA Toolkit.
- Watch the accompanying video to see how to quickly write your first CUDA C program.
Learning CUDA:
- Take the free, high-quality, and easily digestible Udacity Intro to Parallel Programming course, which uses CUDA as its parallel programming platform.
- Visit docs.nvidia.com for CUDA C/C++ documentation.
- Work through hands-on examples:
- Look through the code samples that come installed with the CUDA Toolkit.
- If you are working in C++, you should definitely check out the Thrust parallel template library.
- Browse and ask questions on stackoverflow.com or NVIDIA’s DevTalk forum.
- Learn more by:
- Reading the CUDA C Programming Guide
- Reading the CUDA C Best Practices Guide
- Watching the many hours of recorded sessions from the gputechconf.com site.
- Participating in training sessions offered at conferences such as Supercomputing, International Supercomputing, the GPU Technology Conference, and many others.
- Browsing here for more learning opportunities.
- Look at the following for more advanced hands-on examples:
- A 1D Stencil example, including shared memory and synchronized threads.
- Optimizing a Jacobi Point Iterative method.
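To give a feel for the shared memory and thread synchronization used in the 1D stencil example, here is a minimal sketch of the commonly taught pattern. It is not the code from the linked example. `RADIUS`, `BLOCK_SIZE`, the function names, and the choice to treat out-of-range neighbors as zero are all assumptions made for illustration. Each block stages its slice of the input, plus a halo of `RADIUS` elements on each side, in `__shared__` memory, and `__syncthreads()` makes sure the staging is complete before any thread reads its neighbors.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

#define RADIUS 3          // assumed stencil radius
#define BLOCK_SIZE 256    // assumed threads per block; the launch must match

// Each output element is the sum of the 2*RADIUS+1 inputs centered on it.
// Inputs outside [0, n) are treated as zero (an assumption of this sketch).
__global__ void stencil_1d(const int *in, int *out, int n)
{
    __shared__ int temp[BLOCK_SIZE + 2 * RADIUS];

    int g = blockIdx.x * blockDim.x + threadIdx.x;  // global index
    int l = threadIdx.x + RADIUS;                   // index into temp

    // Stage this thread's own element in shared memory.
    temp[l] = (g < n) ? in[g] : 0;

    // The first RADIUS threads also load the left and right halos.
    if (threadIdx.x < RADIUS) {
        int left = g - RADIUS;
        int right = g + BLOCK_SIZE;
        temp[l - RADIUS] = (left >= 0) ? in[left] : 0;
        temp[l + BLOCK_SIZE] = (right < n) ? in[right] : 0;
    }

    // Wait until every thread in the block has finished writing temp.
    __syncthreads();

    if (g < n) {
        int sum = 0;
        for (int offset = -RADIUS; offset <= RADIUS; ++offset) {
            sum += temp[l + offset];
        }
        out[g] = sum;
    }
}

// Hypothetical host helper: runs the stencil on n host values.
// Returns true on success.
bool stencil_on_gpu(const int *in, int *out, int n)
{
    if (n <= 0) return false;

    int *d_in = nullptr, *d_out = nullptr;
    std::size_t bytes = static_cast<std::size_t>(n) * sizeof(int);

    bool ok = cudaMalloc(&d_in, bytes) == cudaSuccess &&
              cudaMalloc(&d_out, bytes) == cudaSuccess &&
              cudaMemcpy(d_in, in, bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
        stencil_1d<<<blocks, BLOCK_SIZE>>>(d_in, d_out, n);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(out, d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return ok;
}
```

Without the `__syncthreads()` barrier, a thread could read a neighbor's slot in `temp` before the neighbor has written it. The barrier sits outside the `if` blocks so that every thread in the block reaches it.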
So, now you’re ready to deploy your application? You can register today for free access to NVIDIA Tesla K40 GPUs. Develop your code on the fastest accelerator in the world. Try a Tesla K40 GPU and accelerate your development.
Availability
The CUDA Toolkit is a free download from NVIDIA and is supported on Windows, Mac, and most standard Linux distributions.