Intel OneAPI 2024.2.0 | 7.1 Gb
The oneAPI Toolkits development team is pleased to announce the availability of Intel oneAPI Base & HPC Toolkit 2024.2.0 is a comprehensive suite of development tools that make it fast and easy to build modern code that gets every last ounce of performance out of the newest Intel processors in high-performance computing (HPC) platforms.
What's new in this release - Date: 6/14/2024
Toolkit Level Updates
- Take your application's efficiency to the next level with the Intel® oneAPI DPC++/C++ Compiler's enhanced SYCL* Graph capabilities, now featuring pause/resume support for better control and graph profiling to tune for more performance. Additionally the Intel® oneAPI DPC++/C++ Compiler delivers more SYCL performance on Windows with default context enabled. With the latest release the kernel compiler introduces SPIR-V support, and OpenCL* query support, allowing for greater flexibility and optimization in your compute kernels.
- Our latest OpenMP* enhancements include support for omp_target_memset() and omp_target_memset_async(), enabling developers to efficiently initialize large data on target devices, reducing overhead and accelerating parallel computing tasks. Additionally the compiler emits detailed remarks about OpenMP loop collapsing under the -qopt-report option. Gain valuable insights into your loop transformations and make informed decisions to fine-tune your application's performance.
- Enhance your debugging experience in Microsoft* Visual Studio* and VS Code* with the Intel Distribution for GDB*'s new Lane Variable Watch Window, allowing you to monitor and analyze variables more efficiently, leading to quicker problem resolution and enhanced application stability.
- Strengthen the security of your applications with expanded Control-flow Enforcement Technology (CET) in Intel Distribution for GDB* which now includes Shadow Stack capabilities to efficiently debug applications and enhance the reliability of your software.
- Use Intel VTune Profiler to gain insights into sub-optimal Intel oneAPI Collective Communications Library (oneCCL) communication in your applications by finding out the time spent in oneCCL calls and identifying most active oneCCL communication tasks in your application.
- Intel oneAPI DPC++ Library (oneDPL) adds new C++ Standard Template Library (STL) copy_if and inclusive_scan algorithm extensions for developers to write parallel programs for multiarchitecture devices. The performance of many existing algorithms* are also improved on Intel and other vendors' GPUs.
- Apps run faster on 5th Gen Intel Xeon Processors with Intel oneAPI Threading Building Blocks (oneTBB) optimized thread synchronization to reduce startup latency
- Apps run faster using oneTBB parallel_reduce improved data movement to avoid extra copying
- Intel oneAPI Math Kernel Library (oneMKL) 2024.2 introduces enhanced performance of 2D and 3D real and complex FFT targeted for Intel Data Center GPU Max Series.
- To extend sparsity functions across Intel oneAPI Data Analytics Library (oneDAL) algorithms, this release adds DPC++ sparse gemm and gemv primitives and includes sparsity support for the logloss function primitive.
- Intel oneAPI Collective Communications Library (oneCCL) introduces multiple enhancements that improve the utilization of system resources such as memory and I/O, unlocking even better performance.
- Intel Distribution for Python* added the following features:
. The Data Parallel Control Library offers improved productivity with new sorting and summing functions along with updated documentation and bug fixes.
. The Data Parallel Extension for NumPy increases productivity with the addition of a new family of cumulative functions and improved linear algebra functions.
- Intel oneAPI Deep Neural Network Library (oneDNN) 2024.2 introduces:
. Enhanced Performance for next generation client platforms: Experience faster and more efficient processing with broad production quality optimizations, maximizing the performance potential of upcoming AI enhanced Intel client processors.
. Optimized Performance for next generation server platforms: Future-proof your systems with enhanced production quality optimizations, ensuring top-tier performance for upcoming Intel Xeon Scalable processors.
. Improved Large Language Model Performance: Boost the efficiency of your AI workloads with support for int8 and int4 weight decompression in matmul, accelerating large language models with compressed weights for faster insights and results.
- Intel Integrated Performance Primitives added the following features:
. Improved compression ratio and throughput with new optimization patch for zlib 1.3.1
. Accelerated image processing capabilities on select color conversion functions using Intel AVX-512
- Intel Integrated Performance Primitives Cryptography added the following features:
. Enhanced data protection in post-quantum era, with new Intel-optimized LMS post-quantum crypto algorithm
. Advanced AES-GCM performance on 5th Gen Intel Xeon Scalable Processors and Intel Core Ultra processors, with simplified integration with new code sample-
- Visual AI and imaging apps using bindless textures can accelerate on multi-vendor GPUs, with Intel DPC++ Compatibility Tool option enabled migration to SYCL* image API extension
- Save time validating migrated SYCL is equivalent to original code using Intel DPC++ Compatibility Tool to auto compare kernel run logs and report differences
- Easily migrate to SYCL with Intel DPC++ Compatibility Tool migrating 126 more commonly used CUDA APIs
Intel oneAPI DPC++ Compiler 2024.2.0
- The Intel oneAPI DPC++/C++ Compiler added enhanced SYCL Graph capabilities, now featuring pause/resume support for better control and graph profiling to tune for more performance.
- The Intel oneAPI DPC++/C++ Compiler delivers more SYCL performance on Windows with default context enabled. With the latest release the kernel compiler introduces SPIR-V support, and OpenCL* query support, allowing for greater flexibility and optimization in your compute kernels.
- Our latest OpenMP enhancements include support for omp_target_memset() and omp_target_memset_async(), enabling developers to efficiently initialize large data on target devices, reducing overhead and accelerating parallel computing tasks. Additionally the compiler emits detailed remarks about OpenMP loop collapsing under the -qopt-report option. Gain valuable insights into your loop transformations and make informed decisions to fine-tune your application's performance.
- Ensure greater reliability, stability, and security in C++ applications that offload computational tasks to the GPU with the newly added device-side LLVM Address Sanitizer support in Intel oneAPI DPC++/C++ Compiler to swiftly detect and diagnose memory-related bugs.
Intel oneAPI DPC++ Library 2022.6.0
- Intel oneAPI DPC++ Library adds new C++ Standard Template Library (STL) inclusive_scan algorithm extension for developers to write parallel programs for multiarchitecture devices.
- The performance of many existing algorithms like reduce, min_element, max_element, minmax_elelment, is_partitioned, lexicograpical_compare, binary_search, lower_bound and upper_bound are also improved on Intel and other vendors' GPUs.
Intel DPC++ Compatibility Tool 2024.2.0
- Visual AI and imaging apps using bindless textures can accelerate on multi-vendor GPUs, with Intel DPC++ Compatibility Tool option enabled migration to SYCL image API extension
- Save time validating migrated SYCL is equivalent to original code using Intel DPC++ Compatibility Tool to auto compare kernel run logs and report differences
- Easily migrate to SYCL with Intel DPC++ Compatibility Tool migrates 126 more commonly used CUDA APIs
Intel oneAPI Math Kernel Library 2024.2.0
- InteloneAPI Math Kernel Library (oneMKL) 2024.2 introduces enhanced performance of 2D and 3D real and complex FFT targeted for Intel® Data Center GPU Max Series.
- Several other optimizations for various domains.
- Various bug fixes
Intel Distribution for GDB* 2024.2.0
- Intel Distribution for GDB* now supports Intel Core Ultra processors on Windows*.
- Intel Distribution for GDB* has enhanced debugging experience in Microsoft* Visual Studio* and VS Code* with the new Lane Variable Watch Window, allowing you to monitor and analyze variables more efficiently, leading to quicker problem resolution and enhanced application stability.
- Strengthen the security of your applications with expanded Control-flow Enforcement Technology (CET) support, now including Shadow Stack capabilities to efficiently debug applications and enhancing the reliability of your software.
- Various improvements and bug fixes in GPU handling to efficiently debug applications that utilize GPU offload.
Intel VTune Profiler 2024.2.0
- Get insights into sub-optimal oneCCL communication in your applications by finding out the time spent in oneCCL calls and identifying most active oneCCL communication tasks in your application.
- Added support for .NET 8 and new Intel architectures code named Sierra Forest and Grand Ridge.
- Technical preview feature: Get a high-level view of potential bottlenecks in software performance analysis before exploring top-down microarchitecture metrics for deeper analysis. Currently supports 5th Gen Intel® Xeon® processors (code-named Emerald Rapids), 4th Gen Intel® Xeon® Scalable processors (code-named Sapphire Rapids), and the Intel® Xeon® CPU Max Series (code-named Sapphire Rapids HBM).
- Faster performance profiling of GPU workloads running on a specific tile.
Intel Advisor 2024.2.0
- Intel Advisor added AMX profiling support on Sapphire Rapids.
- Added significant improvements in understanding compute kernels and multi-GPU setups.
- Added HTML improvements and fixed all legacy Coverity issues.
Intel oneAPI Threading Building Blocks 2021.13.0
- Apps run faster on 5th Gen Xeon with Intel oneAPI Threading Building Blocks (oneTBB) optimized thread synchronization to reduce startup latency
- OpenVINO run up to 4X faster on ARM CPU (including Apple Mac) using oneTBB improved multi-thread synchronization strategies
- Apps run faster using oneTBB parallel_reduce improved data movement to avoid extra copying
Intel Integrated Performance Primitives 2021.12.0
What’s new in Intel IPP:
- Experience better compression ratio and throughput in your data compression tasks with new optimization patch for zlib 1.3.1
- Accelerated image processing capabilities on select color conversion functions using Intel AVX-512 VNNI
- Enhanced stability and several bug fixes
What’s new in Intel IPP Cryptography:
- Enhanced data protection in post-quantum era, using Intel-optimized LMS post-quantum crypto algorithm for single buffer implementation.
- Optimized and advanced AES-GCM performance on 5th Gen Intel Xeon Scalable Processors and Intel Core Ultra processors, with simplified implementation with new code sample.
- Maximize adaptability and streamline development with Clang 16.0 compiler support for Linux
Intel oneAPI Collective Communications Library 2021.13.0
- In this release, oneCCL introduces multiple enhancements that improves the utilization of system resources such as memory and I/O, unlocking even better performance.
Intel oneAPI Data Analytics Library 2024.3.0
- To extend sparsity functions across oneDAL algorithms, this release adds DPC++ sparse gemm and gemv primitives and includes sparsity support for the logloss function primitive.
Intel oneAPI Deep Neural Networks Library 2024.2.0
Intel oneAPI Deep Neural Network Library (oneDNN) 2024.2 introduces:
- Enhanced Performance for next generation client platforms: Experience faster and more efficient processing with broad production quality optimizations, maximizing the performance potential of upcoming AI enhanced Intel client processors.
- Optimized Performance for next generation server platforms: Future-proof your systems with enhanced production quality optimizations, ensuring top-tier performance for upcoming Intel Xeon Scalable processors.
- Improved Large Language Model Performance: Boost the efficiency of your AI workloads with support for int8 and int4 weight decompression in matmul, accelerating large language models with compressed weights for faster insights and results.
Deprecation Notices
- SLES 15SPR3 and Ubuntu 20.04 support on CPU are deprecated with 2024.2 release and will be removed in a future release.
- Take your application's efficiency to the next level with the Intel® oneAPI DPC++/C++ Compiler's enhanced SYCL* Graph capabilities, now featuring pause/resume support for better control and graph profiling to tune for more performance. Additionally the Intel® oneAPI DPC++/C++ Compiler delivers more SYCL performance on Windows with default context enabled. With the latest release the kernel compiler introduces SPIR-V support, and OpenCL* query support, allowing for greater flexibility and optimization in your compute kernels.
- Our latest OpenMP* enhancements include support for omp_target_memset() and omp_target_memset_async(), enabling developers to efficiently initialize large data on target devices, reducing overhead and accelerating parallel computing tasks. Additionally the compiler emits detailed remarks about OpenMP loop collapsing under the -qopt-report option. Gain valuable insights into your loop transformations and make informed decisions to fine-tune your application's performance.
- Enhance your debugging experience in Microsoft* Visual Studio* and VS Code* with the Intel Distribution for GDB*'s new Lane Variable Watch Window, allowing you to monitor and analyze variables more efficiently, leading to quicker problem resolution and enhanced application stability.
- Strengthen the security of your applications with expanded Control-flow Enforcement Technology (CET) in Intel Distribution for GDB* which now includes Shadow Stack capabilities to efficiently debug applications and enhance the reliability of your software.
- Use Intel VTune Profiler to gain insights into sub-optimal Intel oneAPI Collective Communications Library (oneCCL) communication in your applications by finding out the time spent in oneCCL calls and identifying most active oneCCL communication tasks in your application.
- Intel oneAPI DPC++ Library (oneDPL) adds new C++ Standard Template Library (STL) copy_if and inclusive_scan algorithm extensions for developers to write parallel programs for multiarchitecture devices. The performance of many existing algorithms* are also improved on Intel and other vendors' GPUs.
- Apps run faster on 5th Gen Intel Xeon Processors with Intel oneAPI Threading Building Blocks (oneTBB) optimized thread synchronization to reduce startup latency
- Apps run faster using oneTBB parallel_reduce improved data movement to avoid extra copying
- Intel oneAPI Math Kernel Library (oneMKL) 2024.2 introduces enhanced performance of 2D and 3D real and complex FFT targeted for Intel Data Center GPU Max Series.
- To extend sparsity functions across Intel oneAPI Data Analytics Library (oneDAL) algorithms, this release adds DPC++ sparse gemm and gemv primitives and includes sparsity support for the logloss function primitive.
- Intel oneAPI Collective Communications Library (oneCCL) introduces multiple enhancements that improve the utilization of system resources such as memory and I/O, unlocking even better performance.
- Intel Distribution for Python* added the following features:
. The Data Parallel Control Library offers improved productivity with new sorting and summing functions along with updated documentation and bug fixes.
. The Data Parallel Extension for NumPy increases productivity with the addition of a new family of cumulative functions and improved linear algebra functions.
- Intel oneAPI Deep Neural Network Library (oneDNN) 2024.2 introduces:
. Enhanced Performance for next generation client platforms: Experience faster and more efficient processing with broad production quality optimizations, maximizing the performance potential of upcoming AI enhanced Intel client processors.
. Optimized Performance for next generation server platforms: Future-proof your systems with enhanced production quality optimizations, ensuring top-tier performance for upcoming Intel Xeon Scalable processors.
. Improved Large Language Model Performance: Boost the efficiency of your AI workloads with support for int8 and int4 weight decompression in matmul, accelerating large language models with compressed weights for faster insights and results.
- Intel Integrated Performance Primitives added the following features:
. Improved compression ratio and throughput with new optimization patch for zlib 1.3.1
. Accelerated image processing capabilities on select color conversion functions using Intel AVX-512
- Intel Integrated Performance Primitives Cryptography added the following features:
. Enhanced data protection in post-quantum era, with new Intel-optimized LMS post-quantum crypto algorithm
. Advanced AES-GCM performance on 5th Gen Intel Xeon Scalable Processors and Intel Core Ultra processors, with simplified integration with new code sample-
- Visual AI and imaging apps using bindless textures can accelerate on multi-vendor GPUs, with Intel DPC++ Compatibility Tool option enabled migration to SYCL* image API extension
- Save time validating migrated SYCL is equivalent to original code using Intel DPC++ Compatibility Tool to auto compare kernel run logs and report differences
- Easily migrate to SYCL with Intel DPC++ Compatibility Tool migrating 126 more commonly used CUDA APIs
Intel oneAPI DPC++ Compiler 2024.2.0
- The Intel oneAPI DPC++/C++ Compiler added enhanced SYCL Graph capabilities, now featuring pause/resume support for better control and graph profiling to tune for more performance.
- The Intel oneAPI DPC++/C++ Compiler delivers more SYCL performance on Windows with default context enabled. With the latest release the kernel compiler introduces SPIR-V support, and OpenCL* query support, allowing for greater flexibility and optimization in your compute kernels.
- Our latest OpenMP enhancements include support for omp_target_memset() and omp_target_memset_async(), enabling developers to efficiently initialize large data on target devices, reducing overhead and accelerating parallel computing tasks. Additionally the compiler emits detailed remarks about OpenMP loop collapsing under the -qopt-report option. Gain valuable insights into your loop transformations and make informed decisions to fine-tune your application's performance.
- Ensure greater reliability, stability, and security in C++ applications that offload computational tasks to the GPU with the newly added device-side LLVM Address Sanitizer support in Intel oneAPI DPC++/C++ Compiler to swiftly detect and diagnose memory-related bugs.
Intel oneAPI DPC++ Library 2022.6.0
- Intel oneAPI DPC++ Library adds new C++ Standard Template Library (STL) inclusive_scan algorithm extension for developers to write parallel programs for multiarchitecture devices.
- The performance of many existing algorithms like reduce, min_element, max_element, minmax_elelment, is_partitioned, lexicograpical_compare, binary_search, lower_bound and upper_bound are also improved on Intel and other vendors' GPUs.
Intel DPC++ Compatibility Tool 2024.2.0
- Visual AI and imaging apps using bindless textures can accelerate on multi-vendor GPUs, with Intel DPC++ Compatibility Tool option enabled migration to SYCL image API extension
- Save time validating migrated SYCL is equivalent to original code using Intel DPC++ Compatibility Tool to auto compare kernel run logs and report differences
- Easily migrate to SYCL with Intel DPC++ Compatibility Tool migrates 126 more commonly used CUDA APIs
Intel oneAPI Math Kernel Library 2024.2.0
- InteloneAPI Math Kernel Library (oneMKL) 2024.2 introduces enhanced performance of 2D and 3D real and complex FFT targeted for Intel® Data Center GPU Max Series.
- Several other optimizations for various domains.
- Various bug fixes
Intel Distribution for GDB* 2024.2.0
- Intel Distribution for GDB* now supports Intel Core Ultra processors on Windows*.
- Intel Distribution for GDB* has enhanced debugging experience in Microsoft* Visual Studio* and VS Code* with the new Lane Variable Watch Window, allowing you to monitor and analyze variables more efficiently, leading to quicker problem resolution and enhanced application stability.
- Strengthen the security of your applications with expanded Control-flow Enforcement Technology (CET) support, now including Shadow Stack capabilities to efficiently debug applications and enhancing the reliability of your software.
- Various improvements and bug fixes in GPU handling to efficiently debug applications that utilize GPU offload.
Intel VTune Profiler 2024.2.0
- Get insights into sub-optimal oneCCL communication in your applications by finding out the time spent in oneCCL calls and identifying most active oneCCL communication tasks in your application.
- Added support for .NET 8 and new Intel architectures code named Sierra Forest and Grand Ridge.
- Technical preview feature: Get a high-level view of potential bottlenecks in software performance analysis before exploring top-down microarchitecture metrics for deeper analysis. Currently supports 5th Gen Intel® Xeon® processors (code-named Emerald Rapids), 4th Gen Intel® Xeon® Scalable processors (code-named Sapphire Rapids), and the Intel® Xeon® CPU Max Series (code-named Sapphire Rapids HBM).
- Faster performance profiling of GPU workloads running on a specific tile.
Intel Advisor 2024.2.0
- Intel Advisor added AMX profiling support on Sapphire Rapids.
- Added significant improvements in understanding compute kernels and multi-GPU setups.
- Added HTML improvements and fixed all legacy Coverity issues.
Intel oneAPI Threading Building Blocks 2021.13.0
- Apps run faster on 5th Gen Xeon with Intel oneAPI Threading Building Blocks (oneTBB) optimized thread synchronization to reduce startup latency
- OpenVINO run up to 4X faster on ARM CPU (including Apple Mac) using oneTBB improved multi-thread synchronization strategies
- Apps run faster using oneTBB parallel_reduce improved data movement to avoid extra copying
Intel Integrated Performance Primitives 2021.12.0
What’s new in Intel IPP:
- Experience better compression ratio and throughput in your data compression tasks with new optimization patch for zlib 1.3.1
- Accelerated image processing capabilities on select color conversion functions using Intel AVX-512 VNNI
- Enhanced stability and several bug fixes
What’s new in Intel IPP Cryptography:
- Enhanced data protection in post-quantum era, using Intel-optimized LMS post-quantum crypto algorithm for single buffer implementation.
- Optimized and advanced AES-GCM performance on 5th Gen Intel Xeon Scalable Processors and Intel Core Ultra processors, with simplified implementation with new code sample.
- Maximize adaptability and streamline development with Clang 16.0 compiler support for Linux
Intel oneAPI Collective Communications Library 2021.13.0
- In this release, oneCCL introduces multiple enhancements that improves the utilization of system resources such as memory and I/O, unlocking even better performance.
Intel oneAPI Data Analytics Library 2024.3.0
- To extend sparsity functions across oneDAL algorithms, this release adds DPC++ sparse gemm and gemv primitives and includes sparsity support for the logloss function primitive.
Intel oneAPI Deep Neural Networks Library 2024.2.0
Intel oneAPI Deep Neural Network Library (oneDNN) 2024.2 introduces:
- Enhanced Performance for next generation client platforms: Experience faster and more efficient processing with broad production quality optimizations, maximizing the performance potential of upcoming AI enhanced Intel client processors.
- Optimized Performance for next generation server platforms: Future-proof your systems with enhanced production quality optimizations, ensuring top-tier performance for upcoming Intel Xeon Scalable processors.
- Improved Large Language Model Performance: Boost the efficiency of your AI workloads with support for int8 and int4 weight decompression in matmul, accelerating large language models with compressed weights for faster insights and results.
Deprecation Notices
- SLES 15SPR3 and Ubuntu 20.04 support on CPU are deprecated with 2024.2 release and will be removed in a future release.
Intel oneAPI DPC++/C++ Compiler 2024.2.0
- The Intel oneAPI DPC++/C++ Compiler added enhanced SYCL Graph capabilities, now featuring pause/resume support for better control and graph profiling to tune for more performance.
- The Intel oneAPI DPC++/C++ Compiler delivers more SYCL performance on Windows with default context enabled. With the latest release the kernel compiler introduces SPIR-V support, and OpenCL* query support, allowing for greater flexibility and optimization in your compute kernels.
- Our latest OpenMP enhancements include support for omp_target_memset() and omp_target_memset_async(), enabling developers to efficiently initialize large data on target devices, reducing overhead and accelerating parallel computing tasks. Additionally the compiler emits detailed remarks about OpenMP loop collapsing under the -qopt-report option. Gain valuable insights into your loop transformations and make informed decisions to fine-tune your application's performance.
- Ensure greater reliability, stability, and security in C++ applications that offload computational tasks to the GPU with the newly added device-side LLVM Address Sanitizer support in Intel oneAPI DPC++/C++ Compiler to swiftly detect and diagnose memory-related bugs.
Intel Fortran Compiler 2024.2.0
- The Intel Fortran Compiler adds the -fstrict-overflow and Qstrict-overflow[-] options to instruct the Fortran compiler to optimize under the assumption that integer operations won't overflow. For applications that rely on integer overflow behavior, the -fnostrict-overflow option ensures correct functionality.
- Stay at the forefront of parallel programming with our ongoing conformance enhancements for the latest OpenMP standards, including 5.x and the forthcoming 6.0. With this compiler release you can now specify OpenMP 5.1 THREAD_LIMIT on TEAMS and TARGET constructs to better manage thread usage, and OpenMP 5.2 enhancements like COPYPRIVATE and NOWAIT on construct beginnings, as well as an updated LINEAR clause syntax for more precise control. With OpenMP TR12, the LOOP directive is now applicable to DO CONCURRENT loops, paving the way for more powerful loop optimizations.
- The added OpenMP runtime library extensions* provide a robust set of memory management extensions, including functions for precise host pointer registration, targeted memory allocations, and device-specific optimizations. With these powerful extensions, you can push the boundaries of performance and efficiency in your high-performance computing applications.
* added extensions: ompx_target_register_host_pointer, ompx_target_unregister_host_pointer, ompx_target_aligned_alloc, ompx-target_aligned_alloc_device, ompx_target_alloc_host, ompx_target_aligned_alloc_shared, ompx_target_aligned_alloc_shared_with_hint, ompx_target_realloc, ompx_target_realloc_device, ompx_target_realloc_host, ompx_target_realloc_shared, ompx_get_device_from_ptr, and ompx_get_num_subdevices.
Intel Fortran Compiler Classic 2021.13.0
- The Intel Fortran Compiler Classic has been updated to include recent versions of 3rd party components, which include functional and security updates.
Intel MPI Library 2021.13.0
- On machines with multiple Network Interface Cards (NICs), developers using the Intel® MPI Library now have the option of increased application performance control by pinning specific threads to individual NICs.
- Developers can now realize faster application performance through optimizations for GPU-aware broadcasts, RMA peer to peer device-initiated communications, intranode thread-splits, and Infiniband tuning for 5th Gen Intel® Xeon® Scalable Processors.
- The Intel oneAPI DPC++/C++ Compiler added enhanced SYCL Graph capabilities, now featuring pause/resume support for better control and graph profiling to tune for more performance.
- The Intel oneAPI DPC++/C++ Compiler delivers more SYCL performance on Windows with default context enabled. With the latest release the kernel compiler introduces SPIR-V support, and OpenCL* query support, allowing for greater flexibility and optimization in your compute kernels.
- Our latest OpenMP enhancements include support for omp_target_memset() and omp_target_memset_async(), enabling developers to efficiently initialize large data on target devices, reducing overhead and accelerating parallel computing tasks. Additionally the compiler emits detailed remarks about OpenMP loop collapsing under the -qopt-report option. Gain valuable insights into your loop transformations and make informed decisions to fine-tune your application's performance.
- Ensure greater reliability, stability, and security in C++ applications that offload computational tasks to the GPU with the newly added device-side LLVM Address Sanitizer support in Intel oneAPI DPC++/C++ Compiler to swiftly detect and diagnose memory-related bugs.
Intel Fortran Compiler 2024.2.0
- The Intel Fortran Compiler adds the -fstrict-overflow and Qstrict-overflow[-] options to instruct the Fortran compiler to optimize under the assumption that integer operations won't overflow. For applications that rely on integer overflow behavior, the -fnostrict-overflow option ensures correct functionality.
- Stay at the forefront of parallel programming with our ongoing conformance enhancements for the latest OpenMP standards, including 5.x and the forthcoming 6.0. With this compiler release you can now specify OpenMP 5.1 THREAD_LIMIT on TEAMS and TARGET constructs to better manage thread usage, and OpenMP 5.2 enhancements like COPYPRIVATE and NOWAIT on construct beginnings, as well as an updated LINEAR clause syntax for more precise control. With OpenMP TR12, the LOOP directive is now applicable to DO CONCURRENT loops, paving the way for more powerful loop optimizations.
- The added OpenMP runtime library extensions* provide a robust set of memory management extensions, including functions for precise host pointer registration, targeted memory allocations, and device-specific optimizations. With these powerful extensions, you can push the boundaries of performance and efficiency in your high-performance computing applications.
* added extensions: ompx_target_register_host_pointer, ompx_target_unregister_host_pointer, ompx_target_aligned_alloc, ompx-target_aligned_alloc_device, ompx_target_alloc_host, ompx_target_aligned_alloc_shared, ompx_target_aligned_alloc_shared_with_hint, ompx_target_realloc, ompx_target_realloc_device, ompx_target_realloc_host, ompx_target_realloc_shared, ompx_get_device_from_ptr, and ompx_get_num_subdevices.
Intel Fortran Compiler Classic 2021.13.0
- The Intel Fortran Compiler Classic has been updated to include recent versions of 3rd party components, which include functional and security updates.
Intel MPI Library 2021.13.0
- On machines with multiple Network Interface Cards (NICs), developers using the Intel® MPI Library now have the option of increased application performance control by pinning specific threads to individual NICs.
- Developers can now realize faster application performance through optimizations for GPU-aware broadcasts, RMA peer to peer device-initiated communications, intranode thread-splits, and Infiniband tuning for 5th Gen Intel® Xeon® Scalable Processors.
The Intel oneAPI Base Toolkit is a core set of tools and libraries for developing high-performance, data-centric applications across diverse architectures. It features an industry-leading C++ compiler and the Data Parallel C++ (DPC++) language, an evolution of C++ for heterogeneous computing. Domain-specific libraries and the Intel Distribution for Python provide drop-in acceleration across relevant architectures. Enhanced profiling, design assistance, and debug tools complete the kit. High-performance computing (HPC) is at the core of artificial intelligence, machine learning, and deep learning applications. The Intel oneAPI HPC Toolkit delivers what developers need to build, analyze, optimize, and scale HPC applications with the latest techniques in vectorization, multithreading, multi-node parallelization, and memory optimization. Intel oneAPI HPC Toolkit is an add-on to the Intel oneAPI Base Toolkit, which is required for full functionality. It also includes access to the Intel Distribution for Python, the Intel oneAPI DPC++/C++ Compiler, powerful data-centric libraries, and advanced analysis tools.
Intel oneAPI Base Toolkit
What is the Intel oneAPI HPC Toolkit?
Intel is a world leader in computing innovation. The company designs and builds the essential technologies that serve as the foundation for the world's computing devices. As a leader in corporate responsibility and sustainability, Intel also manufactures the world's first commercially available "conflict-free" microprocessors.
Owner: Intel
Product Name: oneAPI Base & HPC Toolkit
Version: 2024.2.0
Supported Architectures: x64
Website Home Page : https://software.intel.com/
Languages Supported: english
System Requirements: Windows & Linux *
Size: 7.1 Gb
Please visit my blog
Added by 3% of the overall size of the archive of information for the restoration
No mirrors please
Added by 3% of the overall size of the archive of information for the restoration
No mirrors please