Version 4.0.0

@coldav coldav released this 30 Jul 13:07
· 184 commits to main since this release
538093a

Feature Additions:

  • RISC-V support for host target (either cross compiled or native)
  • Features can be enabled for host targets using the CMake
    option CA_HOST_TARGET_<ARCH>_FEATURES or the environment variable
    CA_HOST_TARGET_FEATURES. This uses a format similar to -mattr on tools like
    opt, e.g. "+v;-zfencei".
  • The compiler-passes directory has been created so that important compiler
    passes, such as the vectorizer vecz, can be made available externally.
  • Support for SPIR 1.2 programs (cl_khr_spir) has been dropped.
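
The feature-string format mentioned above can be illustrated with a small sketch. This is not the construction kit's own parser: the function name parse_feature_string is hypothetical, and the only assumption is the format stated in the notes, i.e. semicolon-separated entries, each prefixed with + (enable) or - (disable).

```python
def parse_feature_string(features):
    """Split a '+v;-zfencei' style string into {feature_name: enabled}.

    Hypothetical helper for illustration only; it assumes entries are
    separated by ';' and each starts with '+' or '-'.
    """
    result = {}
    for entry in features.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty entries such as a trailing ';'
        sign, name = entry[0], entry[1:]
        if sign not in ("+", "-") or not name:
            raise ValueError(f"malformed feature entry: {entry!r}")
        result[name] = (sign == "+")
    return result


if __name__ == "__main__":
    print(parse_feature_string("+v;-zfencei"))
```

Note that ; is a command separator in most shells, so a value such as "+v;-zfencei" needs quoting when passed on a command line or exported as an environment variable.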

Upgrade guidance:

  • Only LLVM 17 and 18 are supported in this release.
  • Added support for a Remote HAL over TCP/IP, primarily for demoing new RISC-V
    targets quickly.
  • The CMake variable CA_HOST_TARGET_CPU has been split into per-architecture
    variants of the form CA_HOST_TARGET_<ARCH>_CPU, with the architecture name
    capitalized, e.g. CA_HOST_TARGET_X86_64_CPU, CA_HOST_TARGET_AARCH64_CPU and
    CA_HOST_TARGET_RISCV64_CPU. The environment variable CA_HOST_TARGET_CPU
    keeps its name. Note that CA_HOST_TARGET_CPU_NATIVE is no longer
    supported; the same effect can be achieved by setting the relevant variant
    to the value native.
  • The mux spec has been bumped:
    • 0.77.0: to loosen the requirements on the mux event type used by
      DMA builtins.
    • 0.78.0: to introduce mux builtins for sub-group, work-group, and
      vector-group operations.
    • 0.79.0: to introduce mux builtins for sub-group shuffle operations.
    • 0.80.0: to introduce support for 64-bit atomic operations.
  • The compiler::ImageArgumentSubstitutionPass now replaces sampler-typed
    parameters in kernel functions with i32 parameters via a wrapper function.
  • As a consequence, the host target now passes samplers to kernels as 32-bit
    integer arguments, not as integer arguments disguised as pointer values.
  • The compiler::utils::ReplaceBarriersPass has been replaced with the
    compiler::utils::LowerToMuxBuiltinsPass.
  • The compiler::utils::HandleBarriersPass has been renamed to the
    compiler::utils::WorkItemLoopsPass.
  • The compiler::utils::createLoop API has moved its list of IVs parameter
    into its compiler::utils::CreateLoopOpts parameter. It can now also set the
    IV names via a second CreateLoopOpts field.
  • Support for LLVM versions is now limited to LLVM 17 and LLVM 18. Support for
    earlier LLVM versions has been removed.
  • Support for FMA (fused multiply-add) is now required of the device. For the
    host device on x86-64, this means only x86-64-v3 and newer are supported.
    This roughly translates to CPUs from 2015 or newer, for both Intel and AMD.
  • Although hardware support for FMA is available on all platforms we currently
    test, if you are using OCK on a platform we do not test and encounter
    issues, please let us know by opening an issue!
  • The compiler-utils library has been split into compiler-pipeline and
    compiler-binary-metadata to allow use of compiler pipeline utilities without
    the binary metadata requirements. Both will be needed for mux targets.
  • The utility function addParamToAllFunctions has been moved to
    ReplaceLocalModuleScopeVariablesPass and renamed, as it is only used there.
  • OpenCL-Headers is now fetched from the GitHub repo at tag v2024.05.08. This
    can be overridden using -DFETCHCONTENT_SOURCE_DIR_OPENCL_HEADERS to point to
    a different repo.
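
For build scripts that still set the old CA_HOST_TARGET_CPU CMake variable, the per-architecture naming described above can be sketched as follows. The helper host_target_cmake_args is hypothetical and not part of the construction kit. It assumes that the <ARCH> part is the architecture name upper-cased with - replaced by _, which matches the three examples given in the notes but is not confirmed for other architectures.

```python
def host_target_cmake_args(arch, cpu=None, features=None):
    """Build -D arguments for the per-architecture host target variables.

    Hypothetical helper: the <ARCH> spelling is assumed to be the
    architecture name upper-cased with '-' replaced by '_'
    (e.g. 'x86-64' -> 'X86_64', 'riscv64' -> 'RISCV64').
    """
    arch_key = arch.upper().replace("-", "_")
    args = []
    if cpu is not None:
        args.append(f"-DCA_HOST_TARGET_{arch_key}_CPU={cpu}")
    if features is not None:
        args.append(f"-DCA_HOST_TARGET_{arch_key}_FEATURES={features}")
    return args


if __name__ == "__main__":
    # 'native' replaces the removed CA_HOST_TARGET_CPU_NATIVE option.
    for arg in host_target_cmake_args("riscv64", cpu="native",
                                      features="+v;-zfencei"):
        print(arg)
```

Passing the resulting list to a process launcher as separate arguments avoids shell interpretation of the ; in the feature string.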