1. What is OpenCL
- OpenCL stands for Open Computing Language
- Open Specification
- Proposed by Apple
- Specification developed by a number of companies
- Specification is maintained by the Khronos Group
2. Why OpenCL?
- Gains in computational performance have shifted from higher clock speeds to more cores
- Multiple CPUs and programmable GPUs
- Need a programming interface that allows users to take advantage of all the system resources
- Supports general purpose parallel computations
- OpenCL is device agnostic
- As an open standard, code should be portable across implementations
- No single company controls the specification – vendor neutral
3. OpenCL Devices
- Commonly CPUs and GPUs
- FPGAs
- Embedded processors
- DSPs
4. Uses of OpenCL
- Image, video, and audio processing
- Simulations and scientific calculations
- Medical imaging
- Financial models
- Data parallel algorithms
5. What is not a good fit for OpenCL
- Sequential problems
- Calculations that require a lot of pointer chasing or constant data permutation
- Calculations that require a lot of communication and result updates
- Device-dependent limitations
6. OpenCL Programming Model
In developing an OpenCL project, the first step is to code the host
application. This runs on a user's computer (the host) and dispatches
kernels to connected devices. The host application can be coded in C or
C++, and every host application requires five data structures:
cl_device_id, cl_kernel, cl_program, cl_command_queue, and cl_context.
Data Structures
- Device: An OpenCL device receives kernels from the host and is represented by a cl_device_id.
- Kernel: A host application distributes kernels to devices; each kernel is represented by a cl_kernel.
- Program: The host selects kernels from a program, which is represented by a cl_program.
- Command queue: Each device receives kernels through a command queue, which is represented by a cl_command_queue.
- Context: An OpenCL context allows devices to receive kernels and transfer data, and is represented by a cl_context.
OpenCL Kernels
One of OpenCL's great advantages is that kernels can execute on high-performance computing devices such as GPUs.
- The OpenCL Execution Model: Kernels are executed by one or more work-items. Work-items are collected into work-groups and each work-group executes on a compute unit.
- The OpenCL Memory Model: Kernel data must be specifically placed in one of four address spaces — global memory, constant memory, local memory, or private memory. The location of the data determines how quickly it can be processed.
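To make the work-item and work-group relationship above more concrete, the following plain C sketch emulates a one-dimensional kernel launch sequentially on the host. It is not OpenCL code and no device is involved. The names vec_add_item and emulate_launch are hypothetical, and the sketch assumes a zero global offset and a global size that is a multiple of the work-group size. The gid value mimics what the OpenCL built-in get_global_id(0) would return inside a kernel under those assumptions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel body: one call does the work of one work-item. */
static void vec_add_item(size_t gid, const float *a, const float *b, float *out)
{
    out[gid] = a[gid] + b[gid];
}

/* Emulates a 1-D launch of global_size work-items split into work-groups of
   local_size items each. On a real device the groups would be scheduled onto
   compute units and run in parallel; here they simply run one after another.
   Returns the number of work-groups, or 0 if the sizes do not divide evenly. */
size_t emulate_launch(size_t global_size, size_t local_size,
                      const float *a, const float *b, float *out)
{
    if (local_size == 0 || global_size % local_size != 0)
        return 0;

    size_t num_groups = global_size / local_size;
    for (size_t group = 0; group < num_groups; ++group) {
        for (size_t lid = 0; lid < local_size; ++lid) {
            /* Global ID = group index * group size + local index. */
            size_t gid = group * local_size + lid;
            vec_add_item(gid, a, b, out);
        }
    }
    return num_groups;
}
```

For example, calling emulate_launch with a global size of 8 and a local size of 4 visits two work-groups of four work-items each, with global IDs 0 to 7.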
7. OpenCL Memory Model
The OpenCL memory model identifies four address spaces:
- Global memory: Stores data for the entire device.
- Constant memory: Similar to global memory, but is read-only.
- Local memory: Stores data for the work-items in a work-group.
- Private memory: Stores data for an individual work-item.
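As a small sketch of how the four address spaces above appear in practice, here is an OpenCL C kernel stored in a host-side C string, which is the form a host application would typically hand to clCreateProgramWithSource. The kernel name scale_and_sum, its arguments, and the variable name scale_and_sum_src are hypothetical and chosen only for illustration. The kernel assumes one work-item per input element.

```c
/* OpenCL C kernel source kept as a C string on the host.
   Each argument carries the qualifier of the address space it lives in. */
const char *const scale_and_sum_src =
    "__kernel void scale_and_sum(__global const float *in,\n"   /* global: visible to the whole device */
    "                            __constant float *scale,\n"    /* constant: read-only for the kernel */
    "                            __local float *scratch,\n"     /* local: shared within one work-group */
    "                            __global float *group_sums)\n"
    "{\n"
    "    size_t gid = get_global_id(0);\n"
    "    size_t lid = get_local_id(0);\n"
    "    /* x is an ordinary variable, so it lives in private memory. */\n"
    "    float x = in[gid] * scale[0];\n"
    "    scratch[lid] = x;\n"
    "    /* Wait until every work-item in the group has written its value. */\n"
    "    barrier(CLK_LOCAL_MEM_FENCE);\n"
    "    if (lid == 0) {\n"
    "        float sum = 0.0f;\n"
    "        for (size_t i = 0; i < get_local_size(0); ++i)\n"
    "            sum += scratch[i];\n"
    "        group_sums[get_group_id(0)] = sum;\n"
    "    }\n"
    "}\n";
```

The design choice worth noting is that the partial sums are accumulated through local memory rather than global memory, since data shared only within a work-group does not need to be visible to the whole device.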
FAQ on OpenCL
- What makes OpenCL fast?
- On what types of devices will it work?
- What is the difference between CUDA, OpenMP, and OpenCL?
- How stable is OpenCL?