OpenCL for Python API

opencl.Platform

class opencl.Platform

opencl.Platform is not constructible.

Use opencl.get_platforms() to get a list of connected platforms.

devices

list of all devices attached to this platform

extensions

platform extensions as a string

get_devices(device_type=opencl.Device.ALL)

return a list of devices by type.

name

platform name

profile

return the platform profile info

vendor

platform vendor

version

return the version string of the platform

opencl.Device

class opencl.Device

A device is a collection of compute units. A command-queue is used to queue commands to a device. Examples of commands include executing kernels, or reading and writing memory objects.

OpenCL devices typically correspond to a GPU, a multi-core CPU, and other processors such as DSPs and the Cell/B.E. processor.

DEFAULT

flag: device type default.

ALL

flag: for all devices

CPU

flag: for all CPU devices

GPU

flag: for all GPU devices

ALL = 4294967295L
CPU = 2L
DEFAULT = 1L
GPU = 4L
address_bits
available
compiler_available
driver_version
extensions
global_mem_size
has_image_support

test if this device supports the opencl.Image class

has_local_mem
has_native_kernel

test if this device supports native python kernels

has_queue_out_of_order_exec_mode

test if this device supports out_of_order_exec_mode for queues

has_queue_profiling

test if this device supports profiling for queues

host_unified_memory
local_mem_size
max_clock_frequency

return the clock frequency.

max_compute_units

The number of parallel compute cores on the OpenCL device. The minimum value is 1.

max_const_buffer_size
max_image2d_shape
max_image3d_shape
max_mem_alloc_size
max_parameter_size
max_read_image_args
max_work_group_size
max_work_item_dimensions

Maximum dimensions that specify the global and local work-item IDs used by the data parallel execution model. (Refer to clEnqueueNDRangeKernel).

The minimum value is 3.

max_work_item_sizes

Maximum number of work-items that can be specified in each dimension to opencl.Queue.enqueue_nd_range_kernel.

Returns: n entries, where n is the value returned by the query for opencl.Device.max_work_item_dimensions
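
The per-dimension limits combine with max_work_group_size when choosing a local work size. The following is a minimal sketch of that check in plain Python, assuming the limit values have already been read from a device. The helper name fits_device_limits and the sample numbers are illustrative and are not part of the opencl module.

```python
from functools import reduce
from operator import mul


def fits_device_limits(local_work_size, max_work_item_sizes, max_work_group_size):
    """Return True if a local work size respects per-dimension and total limits.

    Hypothetical helper: the arguments stand in for values read from
    Device.max_work_item_sizes and Device.max_work_group_size.
    """
    if len(local_work_size) > len(max_work_item_sizes):
        return False  # more dimensions than the device reports limits for
    per_dimension_ok = all(
        size <= limit for size, limit in zip(local_work_size, max_work_item_sizes)
    )
    # The number of work-items in one work-group is the product of all dimensions.
    total = reduce(mul, local_work_size, 1)
    return per_dimension_ok and total <= max_work_group_size


# Sample limits, made up for illustration.
print(fits_device_limits((16, 16), (1024, 1024, 64), 256))  # True
print(fits_device_limits((32, 32), (1024, 1024, 64), 256))  # False
```
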
max_write_image_args
name

the name of this device

platform

return the platform this device is associated with.

profile
profiling_timer_resolution
queue_properties

return queue properties as a bitfield

see also has_queue_out_of_order_exec_mode and has_queue_profiling

type

return device type: one of [Device.DEFAULT, Device.ALL, Device.GPU or Device.CPU]
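
In the OpenCL C API the device type is a bitfield, which is why ALL has every bit set. The sketch below uses only the constant values listed for this class to show how such masks combine. The helper name matches is hypothetical, and how the opencl module itself filters devices is an assumption.

```python
# Values copied from the Device constants listed in this section.
DEFAULT = 1
CPU = 2
GPU = 4
ALL = 4294967295  # every bit set, so it overlaps with any device type


def matches(device_type, requested):
    """Hypothetical helper: does a device of `device_type` satisfy a `requested` mask?"""
    return bool(device_type & requested)


print(matches(GPU, CPU | GPU))  # True
print(matches(CPU, GPU))        # False
print(matches(GPU, ALL))        # True
```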

vendor_id

return the vendor ID

version

opencl.Event

class opencl.Event

An event object can be used to track the execution status of a command. The API calls that enqueue commands to a command-queue create a new event object that is returned in the event argument.

COMPLETE = 0
QUEUED = 3
RUNNING = 1
STATUS_DICT = {0: 'complete', 1: 'running', 2: 'submitted', 3: 'queued'}
SUBMITTED = 2
add_callback(callback)

Registers a user callback function that is called on completion of the event.

Parameters: callback – must have the signature callback(event, status)
status

the current status of the event.

wait()

Blocks the host thread until the command identified by this event completes. A command is considered complete if its execution status is CL_COMPLETE or a negative value.
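
The status values and the completion rule can be illustrated without a device. This is a minimal sketch that reuses the STATUS_DICT values listed for this class. The helper names describe_status and is_finished are hypothetical and are not part of the opencl module.

```python
# Values copied from the Event constants listed in this section.
STATUS_DICT = {0: 'complete', 1: 'running', 2: 'submitted', 3: 'queued'}


def describe_status(status):
    """Hypothetical helper: turn a raw status integer into a readable label."""
    if status < 0:
        # Negative values mean the command ended with an error code.
        return 'error (%d)' % status
    return STATUS_DICT.get(status, 'unknown (%d)' % status)


def is_finished(status):
    """A command counts as complete when its status is COMPLETE (0) or negative."""
    return status <= 0


for value in (3, 2, 1, 0, -5):
    print(value, describe_status(value), is_finished(value))
```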

opencl.UserEvent

class opencl.UserEvent

Creates a user event object. User events allow applications to enqueue commands that wait on a user event to finish before the command is executed by the device.

complete()

Set this event status to complete.

opencl.Program

class opencl.Program(context, source=None, binaries=None, devices=None)

Create an opencl program.

Parameters:
  • context – opencl.Context object.
  • source – program source to compile.
  • binaries – dict of pre-compiled binaries, of the form {device: bytes, ..}
  • devices – list of devices to compile on.

To get a kernel, use program.name or program.kernel('name').

ERROR = -2
IN_PROGRESS = -3
NONE = -1
SUCCESS = 0
binaries

return a dict of {device:bytes} for each device associated with this program

Binaries may be used in a program constructor.

binary_sizes

return a dict of device:binary_size for each device associated with this program

build()

Builds (compiles & links) a program executable from the program source or binary for all the devices or a specific device(s) in the OpenCL context associated with program.

OpenCL allows program executables to be built using the source or the binary.

context

get the context associated with this program

devices

returns a list of devices associated with this program.

kernel()

Return a kernel object.

logs

get the build logs for each device.

return a dict of {device:str} for each device associated with this program.

num_devices

number of devices to build on

source

get the source code used to build this program

status

return a dict of {device:int} for each device associated with this program.

Valid statuses:

  • Program.NONE
  • Program.ERROR
  • Program.SUCCESS
  • Program.IN_PROGRESS
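
Since status and logs are both dictionaries keyed by device, they can be combined into one report. The sketch below assumes that shape and uses plain strings in place of device objects. The helper name summarize_build and the sample data are hypothetical.

```python
# Values copied from the Program constants listed in this section.
STATUS_NAMES = {0: 'SUCCESS', -1: 'NONE', -2: 'ERROR', -3: 'IN_PROGRESS'}
ERROR = -2


def summarize_build(status, logs):
    """Hypothetical helper: map each device to a status name, plus its log on error."""
    report = {}
    for device, code in status.items():
        name = STATUS_NAMES.get(code, 'UNKNOWN')
        if code == ERROR:
            # Attach the build log only where the build failed.
            report[device] = '%s: %s' % (name, logs.get(device, '').strip())
        else:
            report[device] = name
    return report


# Placeholder data: strings stand in for opencl.Device objects and real logs.
status = {'device-a': 0, 'device-b': -2}
logs = {'device-a': '', 'device-b': 'placeholder build log\n'}
print(summarize_build(status, logs))
```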

opencl.MemoryObject

class opencl.MemoryObject

Memory objects are categorized into two types: buffer objects and image objects. A buffer object stores a one-dimensional collection of elements, whereas an image object is used to store a two- or three-dimensional texture, frame-buffer or image.

BUFFER = 4336
IMAGE2D = 4337
IMAGE3D = 4338
add_destructor_callback(callback, *args, **kwargs)

Registers a user callback function with a memory object. Each call to add_destructor_callback registers the specified user callback function on a callback stack associated with memobj.

The registered user callback functions are called in the reverse order in which they were registered. The user callback functions are called and then the memory object’s resources are freed and the memory object is deleted. This provides a mechanism for the application (and libraries) using memobj to be notified when the memory referenced by host_ptr, specified when the memory object is created and used as the storage bits for the memory object, can be reused or freed.

Parameters: callback – function with the signature callback(memobj, *args, **kwargs)
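
The reverse-order rule is that of a stack: the last callback registered is the first one called. The sketch below models only that ordering with a stand-in class. DestructorCallbacks is hypothetical and does not show how the opencl module stores its callbacks.

```python
class DestructorCallbacks:
    """Stand-in for a per-object callback stack; not the opencl implementation."""

    def __init__(self):
        self._stack = []

    def add(self, callback, *args, **kwargs):
        # Remember the callback together with the extra arguments to pass later.
        self._stack.append((callback, args, kwargs))

    def run(self, memobj):
        # Pop from the end so the most recently registered callback runs first.
        while self._stack:
            callback, args, kwargs = self._stack.pop()
            callback(memobj, *args, **kwargs)


calls = []
callbacks = DestructorCallbacks()
callbacks.add(lambda memobj, tag: calls.append(tag), 'first')
callbacks.add(lambda memobj, tag: calls.append(tag), 'second')
callbacks.run(memobj=None)
print(calls)  # ['second', 'first']
```
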
base

Return the original memobject if this is a sub-buffer.

context

Return the context that this object was created with.

get_buffer_id()

Return the pointer to the opencl memory object.

mem_size

Return the size in bytes of this memory object.

type

return the enumerated type of this object. One of MemoryObject.BUFFER, MemoryObject.IMAGE2D or MemoryObject.IMAGE3D

opencl.DeviceMemoryView

class opencl.DeviceMemoryView

A buffer object stores a one-dimensional collection of elements. Elements of a buffer object can be a scalar data type (such as an int or float), a vector data type, or a user-defined structure.

array_info
copy()

Copy this buffer into a new object

ctype
format

ctype format of the elements

static from_host(context, host, copy=True, readable=True, writeable=True)

Create an OpenCL buffer from a Python memoryview object.

is_contiguous

is this array C-contiguous

item()

Return a python object if the size of this array is 1.

itemsize

size of each element

map(queue, blocking=True, readable=True, writeable=True)

enqueues a command to map a region of the buffer object given by buffer into the host address space and returns a pointer to this mapped region.

nbytes

Total number of bytes

ndim

number of dimensions

offset_

offset from base buffer

read(queue, out, wait_on=(), blocking=False)

Read this buffer into a memory view

readonly

Whether this memory is read-only (not writable)

reshape()
shape

Shape of the array

size

Total number of elements

strides

Strides of the array
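
The attribute names in this class (shape, strides, itemsize, nbytes, ndim, format) match those of Python's built-in memoryview. Assuming they carry the same meaning, the relationship between them can be checked on the host side with the standard library alone. The helper c_contiguous_strides is hypothetical.

```python
import array


def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-contiguous array: the last axis varies fastest."""
    strides = []
    step = itemsize
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


# A host-side 3x4 array of C floats, viewed through Python's memoryview.
host = array.array('f', range(12))
view = memoryview(host).cast('B').cast('f', shape=[3, 4])

print(view.shape, view.strides, view.itemsize, view.nbytes, view.ndim)
print(c_contiguous_strides(view.shape, view.itemsize) == view.strides)
```

A view like this is the kind of object that from_host and write are documented to accept. Whether they accept multi-dimensional views is not stated in this section.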

write(queue, buf, wait_on=(), blocking=False)

Write data from a memoryview to the device.

opencl.ImageFormat

class opencl.ImageFormat
CHANNEL_DTYPES = {4304: 'CL_SNORM_INT8', 4305: 'CL_SNORM_INT16', 4306: 'CL_UNORM_INT8', 4307: 'CL_UNORM_INT16', 4308: 'CL_UNORM_SHORT_565', 4309: 'CL_UNORM_SHORT_555', 4310: 'CL_UNORM_INT_101010', 4311: 'CL_SIGNED_INT8', 4312: 'CL_SIGNED_INT16', 4313: 'CL_SIGNED_INT32', 4314: 'CL_UNSIGNED_INT8', 4315: 'CL_UNSIGNED_INT16', 4316: 'CL_UNSIGNED_INT32', 4317: 'CL_HALF_FLOAT', 4318: 'CL_FLOAT'}
CHANNEL_ORDERS = {4272: 'CL_R', 4273: 'CL_A', 4274: 'CL_RG', 4275: 'CL_RA', 4276: 'CL_RGB', 4277: 'CL_RGBA', 4278: 'CL_BGRA', 4279: 'CL_ARGB', 4280: 'CL_INTENSITY', 4281: 'CL_LUMINANCE', 4282: 'CL_Rx', 4283: 'CL_RGx', 4284: 'CL_RGBx'}
channel_data_type
channel_order
ctype
format
static from_ctype()
static supported_formats()

opencl.Image

class opencl.Image
format
image_depth
image_format
image_height
image_width
map()
shape
strides

opencl.ContextProperties

class opencl.ContextProperties

Store of key value pairs that can be used to initialize an opencl.Context object.

as_dict()
platform
properties_dict
property_names_lookup = {4228: 'platform'}
set_property()

opencl.Context

class opencl.Context(devices=(), device_type=cl.Device.DEFAULT, properties=None, callback=None)

opencl.Context(devices=(), device_type=cl.Device.DEFAULT, ContextProperties properties=None, callback=print_context_error)

Creates an OpenCL context. An OpenCL context is created with one or more devices. Contexts are used by the OpenCL runtime for managing objects such as command-queues, memory, program and kernel objects and for executing kernels on one or more devices specified in the context.

Parameters:
  • devices – list of opencl devices
  • device_type – type of device to create context from. used only if devices is empty.
  • properties – cl.ContextProperties object
  • callback – This callback function will be used by the OpenCL implementation to report information on errors that occur in this context. The function signature must be callback(str, bytes)
devices

return a list of devices associated with this context

num_devices

return the number of devices

properties

return a ContextProperties object

ref_count

return the OpenCL internal reference count of this object

opencl.OpenCLException

class opencl.OpenCLException

Base opencl exception object.

opencl.contextual_memory

class opencl.contextual_memory(ctype=None, shape=None, flat=False)

Memory ‘type’ descriptor.

array_info
ctype_string()
derefrence()

Return the type that this object is a pointer to.

from_param()

Return a ctypes.c_void_p from arg.

Parameters: arg – must be a MemoryObject.
is_const = False
nbytes
offset
qualifier = None
shape
size
strides

opencl.global_memory

class opencl.global_memory(ctype=None, shape=None, flat=False)
qualifier = '__global'

opencl.Kernel

class opencl.Kernel

OpenCL kernel object.

A kernel object encapsulates a specific __kernel function declared in a program and the argument values to be used when executing this __kernel function.

argnames

Get or set the argument names. len(argnames) must equal kernel.nargs

argtypes

Assign a tuple of ctypes types to specify the argument types that the function accepts

len(argtypes) must equal kernel.nargs.

Items in argtypes do not have to be ctypes types, but each item must have a from_param() method that returns a value usable as an argument (integer, string, ctypes instance). This makes it possible to define adapters that convert custom objects into function parameters.
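
The from_param() protocol is the same one the standard ctypes module uses for foreign function argtypes. Below is a minimal sketch of an adapter under that protocol. The Celsius and CelsiusAdapter names are hypothetical, and assigning the adapter to kernel.argtypes is shown only as a comment because it needs a real kernel.

```python
import ctypes


class Celsius:
    """Hypothetical application type that is not a ctypes type."""

    def __init__(self, degrees):
        self.degrees = degrees


class CelsiusAdapter:
    """Adapter usable as an argtypes entry: it only needs a from_param() method."""

    @classmethod
    def from_param(cls, obj):
        if not isinstance(obj, Celsius):
            raise TypeError('expected a Celsius instance')
        # Return a ctypes instance, which is usable as a call argument.
        return ctypes.c_float(obj.degrees)


converted = CelsiusAdapter.from_param(Celsius(21.5))
print(converted.value)  # 21.5

# Assumed usage with a one-argument kernel (not run here):
# kernel.argtypes = (CelsiusAdapter,)
```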

compile_work_group_size(device)

Returns the work-group size specified by the __attribute__((reqd_work_group_size(X, Y, Z))) qualifier.

context

The context this kernel was created with.

global_work_offset
global_work_size
local_mem_size(device)

Returns the amount of local memory in bytes being used by a kernel. This includes local memory that may be needed by an implementation to execute the kernel, variables declared inside the kernel with the __local address qualifier and local memory to be allocated for arguments to the kernel declared as pointers with the __local address qualifier and whose size is specified with clSetKernelArg

local_work_size
name

The name of this kernel

nargs

Number of arguments that this kernel takes

preferred_work_group_size_multiple(device)

Returns the preferred multiple of workgroup size for launch. This is a performance hint. Specifying a workgroup size that is not a multiple of the value returned by this query as the value of the local work size argument to clEnqueueNDRangeKernel will not fail to enqueue the kernel for execution unless the work-group size specified is larger than the device maximum.
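
One way to act on this hint is to round a desired work-group size down to the nearest preferred multiple without exceeding the kernel's maximum. The sketch below assumes the two limits were obtained from preferred_work_group_size_multiple(device) and work_group_size(device). The helper name choose_local_size and the sample numbers are hypothetical.

```python
def choose_local_size(wanted, preferred_multiple, max_work_group_size):
    """Hypothetical helper: largest preferred multiple not above the limits."""
    limit = min(wanted, max_work_group_size)
    if limit < preferred_multiple:
        # Too small to reach one full multiple; still valid, just not the hint.
        return limit
    return (limit // preferred_multiple) * preferred_multiple


print(choose_local_size(200, 32, 256))   # 192
print(choose_local_size(1000, 32, 256))  # 256
print(choose_local_size(20, 32, 256))    # 20
```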

private_mem_size(device)

Returns the minimum amount of private memory, in bytes, used by each work-item in the kernel. This value may include any private memory needed by an implementation to execute the kernel, including that used by the language built-ins and variables declared inside the kernel with the __private qualifier.

set_args(self, *args, **kwargs)

Set the arguments for this kernel

work_group_size(device)

This provides a mechanism for the application to query the maximum work-group size that can be used to execute a kernel on a specific device given by device. The OpenCL implementation uses the resource requirements of the kernel (register usage etc.) to determine what this workgroup size should be.

opencl.Queue

class opencl.Queue

opencl.Queue(context, device=None, out_of_order_exec_mode=False, profiling=False)

OpenCL objects such as memory, program and kernel objects are created using a context. Operations on these objects are performed using a command-queue. The command-queue can be used to queue a set of operations (referred to as commands) in order. Having multiple command-queues allows applications to queue multiple independent commands without requiring synchronization. Note that this should work as long as these objects are not being shared. Sharing of objects across multiple command-queues will require the application to perform appropriate synchronization

Parameters:
  • context – An opencl.Context object
  • device – if None use the first device in the context [default None]
  • out_of_order_exec_mode – enable out_of_order_exec_mode [default False]
  • profiling – enable profiling [default False]
barrier()

Enqueues a barrier operation. The queue.barrier command ensures that all queued commands in command_queue have finished execution before the next batch of commands can begin execution. The queue.barrier command is a synchronization point

context

Return the context that this queue was created with

copy()
device

Return the device associated with this queue

enqueue_copy_buffer(source, dest, src_offset=0, dst_offset=0, size=0, wait_on=())

Enqueues a command to copy a buffer object identified by source to another buffer object identified by dest.

Parameters:
  • source – memory object
  • dest – memory object
  • src_offset – the offset at which to begin copying data from source
  • dst_offset – the offset at which to begin copying data into dest
  • size – number of bytes to copy
  • wait_on – a sequence of events to wait for before submitting this command.
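
Before enqueueing a copy, it can help to check that the requested range fits inside both buffers. This is a minimal sketch that assumes offsets and size are all expressed in bytes, as in the OpenCL C API, and that the buffer sizes come from mem_size. The helper name copy_range_is_valid is hypothetical.

```python
def copy_range_is_valid(src_nbytes, dst_nbytes, src_offset, dst_offset, size):
    """Hypothetical helper: True if [offset, offset + size) fits in both buffers."""
    if src_offset < 0 or dst_offset < 0 or size < 0:
        return False
    fits_source = src_offset + size <= src_nbytes
    fits_dest = dst_offset + size <= dst_nbytes
    return fits_source and fits_dest


# Copy 256 bytes out of a 1024-byte source into a 512-byte destination.
print(copy_range_is_valid(1024, 512, src_offset=768, dst_offset=256, size=256))  # True
print(copy_range_is_valid(1024, 512, src_offset=768, dst_offset=300, size=256))  # False
```
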
enqueue_copy_buffer_rect()
queue.enqueue_copy_buffer_rect(source, dest, region, src_origin=(0, 0, 0), dst_origin=(0, 0, 0),
src_row_pitch=0, src_slice_pitch=0, dst_row_pitch=0, dst_slice_pitch=0, wait_on=())

TODO: document this.

enqueue_native_kernel(function[, arg, ..., kwarg=, ...])

Enqueues a command to execute a python function.

Parameters:
  • function – A callable python object
  • args – Arguments for function
  • kwargs – Keywords for function
enqueue_nd_range_kernel(kernel, work_dim, global_work_size, global_work_offset=None, local_work_size=None, wait_on=())

Enqueues a command to execute a kernel on a device

Parameters:
  • kernel – an opencl.Kernel object
  • work_dim – is the number of dimensions used to specify the global work-items and work-items in the work-group.
  • global_work_size – A list of length work_dim that describes the number of global work-items in work_dim dimensions that will execute the kernel function.
  • global_work_offset – Can be used to specify an array of work_dim unsigned values that describe the offset used to calculate the global ID of a work-item.
  • local_work_size – A list of work_dim unsigned values that describe the number of work-items that make up a work-group. If None, the OpenCL implementation will determine how to break the global work-items into appropriate work-group instances.
  • wait_on – A list of events
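
The relationship between work_dim, global_work_size and local_work_size can be checked on the host before enqueueing. The sketch below assumes the OpenCL 1.x rule that each global size must be evenly divisible by the matching local size. The helper names work_group_counts and pad_global_size are hypothetical and are not part of the opencl module.

```python
def work_group_counts(work_dim, global_work_size, local_work_size):
    """Hypothetical helper: number of work-groups per dimension, with validation."""
    if len(global_work_size) != work_dim or len(local_work_size) != work_dim:
        raise ValueError('both size lists must have work_dim entries')
    counts = []
    for global_size, local_size in zip(global_work_size, local_work_size):
        if global_size % local_size != 0:
            raise ValueError('global size must be a multiple of local size')
        counts.append(global_size // local_size)
    return tuple(counts)


def pad_global_size(global_work_size, local_work_size):
    """Round each global size up to the next multiple of the local size."""
    return tuple(
        -(-global_size // local_size) * local_size
        for global_size, local_size in zip(global_work_size, local_work_size)
    )


print(work_group_counts(2, (1024, 768), (16, 16)))  # (64, 48)
print(pad_global_size((1000, 700), (16, 16)))       # (1008, 704)
```

When the global size is padded this way, the kernel itself has to skip work-items whose global ID falls outside the real data range.
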
enqueue_read_buffer(source, dest, offset=0, size=0, wait_on=(), blocking_read=False)

Read a buffer object to host memory.

enqueue_task(kernel, wait_on=())

Enqueues a command to execute a kernel on a device. The kernel is executed using a single work-item.

Parameters:
  • kernel – an opencl kernel.
  • wait_on – a list of events
enqueue_wait_for_events(self, event, event2, ...)

queue.enqueue_wait_for_events(self, eventlist)

Enqueues a wait for a specific event or a list of events to complete before any future commands queued in the command-queue are executed. num_events specifies the number of events given by event_list.

enqueue_write_buffer()

queue.enqueue_read_buffer(source, dest, offset=0, size=0, wait_on=(), blocking_read=False)

Write host memory into a buffer object.

finish()

Blocks until all previously queued OpenCL commands in command_queue are issued to the associated device and have completed. clFinish does not return until all queued commands in command_queue have been processed and completed. clFinish is also a synchronization point

flush()

Issues all previously queued OpenCL commands in command_queue to the device associated with command_queue. clFlush only guarantees that all queued commands to command_queue will eventually be submitted to the appropriate device. There is no guarantee that they will be complete after clFlush returns.

marker()

Enqueues a marker command to command_queue. The marker command is not completed until all commands enqueued before it have completed. The marker command returns an event that can be waited on to ensure that all commands queued before the marker have completed.

Functions