Thursday, 23 March 2017

OpenCL

1. What is OpenCL

  • OpenCL: Open Computing Language
  • Open specification
  • Originally proposed by Apple
  • Specification developed by a number of companies
  • Specification is maintained by the Khronos Group

2. Why OpenCL?

  • Performance gains have shifted from higher clock speeds to more cores
  • Multiple CPUs and programmable GPUs
  • Need a programming interface that allows users to take advantage of all the system resources
  • Supports general purpose parallel computations
  • OpenCL is device agnostic
  • As an open standard, code should be portable across implementations
  • No single company controls the specification – vendor neutral

3. OpenCL Devices

  • Commonly CPUs and GPUs
  • FPGAs
  • Embedded processors
  • DSPs

4. Uses of OpenCL

  • Image, video, and audio processing
  • Simulations and scientific calculations
  • Medical imaging
  • Financial models
  • Data parallel algorithms

5. What OpenCL Is Not Suited For

  • Sequential problems
  • Calculations that require a lot of pointer chasing or constant data permutation
  • Calculations that require a lot of communication and result updates
  • Device-dependent limitations
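
To make the "sequential problems" point concrete, here is a small plain C sketch (ordinary CPU code, not OpenCL). The function names vector_add and running_sum are made up for illustration. In the first loop every iteration is independent, so each one could become its own work-item. In the second, each iteration needs the result of the previous one, so this naive form cannot simply be split across work-items.

```c
#include <assert.h>
#include <stddef.h>

/* Data-parallel: out[i] depends only on a[i] and b[i], so every
   iteration could run as an independent work-item. */
void vector_add(const float *a, const float *b, float *out, size_t n) {
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

/* Sequential: iteration i needs the accumulated value from iteration
   i - 1, so this naive loop has a dependency chain through acc. */
void running_sum(const float *in, float *out, size_t n) {
    assert(in != NULL && out != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        acc += in[i];
        out[i] = acc;
    }
}
```

Loops shaped like vector_add are the natural candidates for a kernel; loops shaped like running_sum usually need to be restructured first.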

6. OpenCL Programming Model

 In developing an OpenCL project, the first step is to code the host application. This runs on a user's computer (the host) and dispatches kernels to connected devices. The host application can be coded in C or C++, and every host application requires five data structures: cl_device_id, cl_kernel, cl_program, cl_command_queue, and cl_context.

    Data Structures
  1. Device: An OpenCL device receives kernels from the host. Each device is represented by a cl_device_id.
  2. Kernel: A host application distributes kernels to devices. Each kernel is represented by a cl_kernel.
  3. Program: The host selects kernels from a program, which is represented by a cl_program.
  4. Command queue: Each device receives kernels through a command queue, which is represented by a cl_command_queue.
  5. Context: An OpenCL context allows devices to receive kernels and transfer data. It is represented by a cl_context.

    OpenCL Kernels

   One of OpenCL's great advantages is that kernels can execute on high-performance computing devices such as GPUs.
  1. The OpenCL Execution Model: Kernels are executed by one or more work-items. Work-items are collected into work-groups and each work-group executes on a compute unit.
  2. The OpenCL Memory Model: Kernel data must be explicitly placed in one of four address spaces: global memory, constant memory, local memory, or private memory. The address space determines how quickly the data can be accessed.
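
The relationship between work-items and work-groups comes down to simple index arithmetic. The sketch below is plain C, not OpenCL code, and the names work_item_pos, locate_work_item, and num_work_groups are hypothetical helpers for illustration. It assumes a one-dimensional index space with a zero global offset, and computes on the CPU the values a kernel would normally read with the built-in functions get_global_id, get_group_id, and get_local_id.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of the OpenCL API: the position of
   one work-item in a one-dimensional index space. */
typedef struct {
    size_t global_id; /* index across all work-items */
    size_t group_id;  /* which work-group the item belongs to */
    size_t local_id;  /* index inside that work-group */
} work_item_pos;

/* Split a global index into (group, local) parts for a given
   work-group size. Assumes a zero global offset. */
work_item_pos locate_work_item(size_t global_id, size_t local_size) {
    assert(local_size > 0);
    work_item_pos p;
    p.global_id = global_id;
    p.group_id = global_id / local_size;
    p.local_id = global_id % local_size;
    return p;
}

/* Number of work-groups, assuming global_size is a multiple of
   local_size. */
size_t num_work_groups(size_t global_size, size_t local_size) {
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}
```

For example, with a work-group size of 4, the work-item with global index 10 lands in work-group 2 at local index 2, since 2 * 4 + 2 = 10.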

7. OpenCL Memory Model

The OpenCL memory model identifies four address spaces:
  1. Global memory: Stores data for the entire device; all work-items can read and write it.
  2. Constant memory: Similar to global memory, but read-only for kernels.
  3. Local memory: Stores data shared by the work-items in a work-group.
  4. Private memory: Stores data for an individual work-item.
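
One way to picture how these address spaces cooperate is a per-work-group sum. The following is a CPU emulation in plain C, not OpenCL code: group_sums and LOCAL_SIZE are invented for this sketch, ordinary arrays and variables stand in for the address spaces, and constant memory is not shown. On a real device the work-items of a group run concurrently and would need a barrier between the two phases.

```c
#include <assert.h>
#include <stddef.h>

#define LOCAL_SIZE 4 /* assumed work-group size for this sketch */

/* CPU emulation of a per-work-group sum. The loops below run one after
   another; on a device the work-items would run concurrently. */
void group_sums(const float *global_in, /* stands in for global memory */
                float *global_out,      /* one partial sum per work-group */
                size_t global_size) {
    assert(global_size % LOCAL_SIZE == 0);
    size_t groups = global_size / LOCAL_SIZE;
    for (size_t g = 0; g < groups; ++g) {
        float local_buf[LOCAL_SIZE]; /* stands in for local memory */

        /* Phase 1: each work-item copies one element from global memory
           into the group's local buffer. */
        for (size_t l = 0; l < LOCAL_SIZE; ++l) {
            float private_val = global_in[g * LOCAL_SIZE + l]; /* private */
            local_buf[l] = private_val;
        }

        /* Phase 2: one work-item adds up the group's local buffer. */
        float sum = 0.0f;
        for (size_t l = 0; l < LOCAL_SIZE; ++l) {
            sum += local_buf[l];
        }
        global_out[g] = sum; /* result goes back to global memory */
    }
}
```

The point of the pattern is that global memory is touched once per element on the way in and once per group on the way out, while the repeated reads happen in the group's local buffer.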

FAQ on OpenCL
  • What makes OpenCL fast?
  • What types of devices does it work on?
  • What is the difference between CUDA, OpenMP, and OpenCL?
  • How stable is OpenCL?
