GitHub - beehive-lab/TornadoVM: TornadoVM: A practical and efficient heterogeneo...

source link: https://github.com/beehive-lab/TornadoVM

TornadoVM

TornadoVM is a plug-in to OpenJDK and GraalVM that allows programmers to automatically run Java programs on heterogeneous hardware. TornadoVM currently targets OpenCL-compatible devices and it runs on multi-core CPUs, dedicated GPUs (NVIDIA, AMD), integrated GPUs (Intel HD Graphics and ARM Mali), and FPGAs (Intel and Xilinx).

Website: tornadovm.org

For a quick introduction please read the following FAQ.

Current Release: TornadoVM 0.12 - 17/11/2021 : See CHANGELOG

Previous Releases can be found here

1. Installation

On Linux and macOS, TornadoVM can be installed automatically with the installation script. For example:

./scripts/tornadovmInstaller.sh 
TornadoVM installer for Linux and OSx
Usage:
       --jdk8           : Install TornadoVM with OpenJDK 8
       --jdk11          : Install TornadoVM with OpenJDK 11
       --jdk17          : Install TornadoVM with OpenJDK 17
       --graal-jdk-11   : Install TornadoVM with GraalVM and JDK 11 (GraalVM 21.3.0)
       --graal-jdk-17   : Install TornadoVM with GraalVM and JDK 17 (GraalVM 21.3.0)
       --corretto-11    : Install TornadoVM with Corretto JDK 11
       --corretto-17    : Install TornadoVM with Corretto JDK 17
       --mandrel-11     : Install TornadoVM with Mandrel 21.3.0 (JDK 11)
       --mandrel-17     : Install TornadoVM with Mandrel 21.3.0 (JDK 17)
       --windows-jdk-11 : Install TornadoVM with Windows JDK 11
       --windows-jdk-17 : Install TornadoVM with Windows JDK 17
       --opencl         : Install TornadoVM and build the OpenCL backend
       --ptx            : Install TornadoVM and build the PTX backend
       --spirv          : Install TornadoVM and build the SPIR-V backend
       --help           : Print this help

NOTE: Select the desired backend:

  • --opencl: Enables the OpenCL backend (requires OpenCL drivers)
  • --ptx: Enables the PTX backend (requires NVIDIA CUDA drivers)
  • --spirv: Enables the SPIR-V backend (requires Intel Level Zero drivers)

Alternatively, TornadoVM can be installed either manually from source or by using Docker.

You can also run TornadoVM on Amazon AWS CPUs, GPUs, and FPGAs following the instructions here.

2. Usage Instructions

TornadoVM is currently being used to accelerate machine learning and deep learning applications, computer vision, physics simulations, financial applications, computational photography, and signal processing.

We have a use-case, kfusion-tornadovm, for accelerating a computer-vision application implemented in Java using the Tornado-API to run on GPUs.

We also have a set of examples that includes NBody, DFT, KMeans computation and matrix computations.

Additional Information

  • Documentation
  • Benchmarks
  • Reductions
  • Execution Flags
  • FPGA execution
  • Profiler Usage

3. Programming Model

TornadoVM exposes task-level, data-level and pipeline-level parallelism to the programmer via a lightweight Application Programming Interface (API). In addition, TornadoVM is single-source: the code to be accelerated and the host code live in the same Java program.

Compute-kernels in TornadoVM can be programmed using two different approaches:

a) Loop-parallelism

Compute kernels are written in a sequential form (tasks programmed for single-thread execution). To express parallelism, TornadoVM exposes two annotations that can be used on loops and parameters: @Parallel for annotating parallel loops, and @Reduce for annotating parameters used in reductions.
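As a rough illustration of how the two annotations sit in otherwise sequential code, here is a minimal reduction sketch. The nested Parallel and Reduce annotations are stand-ins declared locally (an assumption) so that the sketch compiles and runs on a plain JVM; in a TornadoVM project the annotations would come from the TornadoVM API instead. The class name ReductionSketch and the method name sum are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

class ReductionSketch {
    // Stand-in annotations (assumption): declared here only so the sketch
    // compiles without TornadoVM on the classpath.
    @Target(ElementType.LOCAL_VARIABLE)
    @interface Parallel {}

    @Target(ElementType.PARAMETER)
    @interface Reduce {}

    // Ordinary sequential Java: the loop index is marked as parallel and
    // the output parameter is marked as the reduction target.
    static void sum(float[] input, @Reduce float[] result) {
        result[0] = 0.0f;
        for (@Parallel int i = 0; i < input.length; i++) {
            result[0] += input[i];
        }
    }

    public static void main(String[] args) {
        float[] input = {1.0f, 2.0f, 3.0f, 4.0f};
        float[] result = new float[1];
        sum(input, result);
        System.out.println("sum = " + result[0]);
    }
}
```

On a plain JVM the annotations have no effect and the method simply runs sequentially, which is a convenient way to check the logic before offloading it.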

The following code snippet shows a full example that accelerates matrix multiplication using TornadoVM and the loop-parallel API:

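// Imports for the TornadoVM 0.12 API (package names may differ in other releases)
import uk.ac.manchester.tornado.api.TaskSchedule;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.collections.types.Matrix2DFloat;

public class Compute {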
    private static void mxmLoop(Matrix2DFloat A, Matrix2DFloat B, Matrix2DFloat C, final int size) {
        for (@Parallel int i = 0; i < size; i++) {
            for (@Parallel int j = 0; j < size; j++) {
                float sum = 0.0f;
                for (int k = 0; k < size; k++) {
                    sum += A.get(i, k) * B.get(k, j);
                }
                C.set(i, j, sum);
            }
        }
    }

    public void run(Matrix2DFloat A, Matrix2DFloat B, Matrix2DFloat C, final int size) {
        TaskSchedule ts = new TaskSchedule("s0")
                .streamIn(A, B)                               // Stream data from host to device
                .task("t0", Compute::mxmLoop, A, B, C, size)  // Each task points to an existing Java method
                .streamOut(C);                                // sync arrays with the host side
        ts.execute();   // It will execute the code on the default device (e.g. a GPU)
    }
}
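Because the kernel is plain sequential Java, its output can be checked against a host-only reference. The sketch below is one way to do that, assuming the matrices are copied into flat row-major float arrays (a simplification to keep the example free of dependencies; the example above uses Matrix2DFloat). The names MxmReference, multiply and maxAbsDiff are hypothetical.

```java
import java.util.Arrays;

class MxmReference {
    // Sequential reference: C = A * B for size x size matrices stored
    // row-major in flat arrays.
    static float[] multiply(float[] a, float[] b, int size) {
        float[] c = new float[size * size];
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                float sum = 0.0f;
                for (int k = 0; k < size; k++) {
                    sum += a[i * size + k] * b[k * size + j];
                }
                c[i * size + j] = sum;
            }
        }
        return c;
    }

    // Largest element-wise absolute difference between two results.
    static float maxAbsDiff(float[] expected, float[] actual) {
        float max = 0.0f;
        for (int i = 0; i < expected.length; i++) {
            max = Math.max(max, Math.abs(expected[i] - actual[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4};
        float[] b = {5, 6, 7, 8};
        float[] c = multiply(a, b, 2);
        System.out.println(Arrays.toString(c)); // [19.0, 22.0, 43.0, 50.0]
    }
}
```

Comparing with a small tolerance on maxAbsDiff, rather than exact equality, is usually the safer choice, since floating-point results computed on an accelerator may differ slightly from the host.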

b) Kernel Parallelism

Another way to express compute kernels in TornadoVM is via the kernel-parallel API. To do so, TornadoVM exposes a KernelContext with which the application can directly access the thread-id, allocate memory in local memory (shared memory on NVIDIA devices), and insert barriers. This model is similar to programming compute kernels in OpenCL and CUDA. This API is therefore better suited to expert GPU/FPGA programmers who want more control or who want to port existing CUDA/OpenCL compute kernels to TornadoVM.

The following code-snippet shows the Matrix Multiplication example using the kernel-parallel API:

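// Imports for the TornadoVM 0.12 API (package names may differ in other releases)
import uk.ac.manchester.tornado.api.GridScheduler;
import uk.ac.manchester.tornado.api.KernelContext;
import uk.ac.manchester.tornado.api.TaskSchedule;
import uk.ac.manchester.tornado.api.WorkerGrid;
import uk.ac.manchester.tornado.api.WorkerGrid2D;
import uk.ac.manchester.tornado.api.collections.types.Matrix2DFloat;

public class Compute {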
    private static void mxmKernel(KernelContext context, Matrix2DFloat A, Matrix2DFloat B, Matrix2DFloat C, final int size) {
        int idx = context.threadIdx;
        int jdx = context.threadIdy;
        float sum = 0;
        for (int k = 0; k < size; k++) {
            sum += A.get(idx, k) * B.get(k, jdx);
        }
        C.set(idx, jdx, sum);
    }

    public void run(Matrix2DFloat A, Matrix2DFloat B, Matrix2DFloat C, final int size) {
        // When using the kernel-parallel API, we need to create a Grid and a Worker

        WorkerGrid workerGrid = new WorkerGrid2D(size, size);    // Create a 2D Worker
        GridScheduler gridScheduler = new GridScheduler("s0.t0", workerGrid);  // Attach the worker to the Grid
        KernelContext context = new KernelContext();             // Create a context
        workerGrid.setLocalWork(32, 32, 1);                      // Set the local-group size

        TaskSchedule ts = new TaskSchedule("s0")
                .streamIn(A, B)                                 // Stream data from host to device
                .task("t0", Compute::mxmKernel, context, A, B, C, size)  // Each task points to an existing Java method
                .streamOut(C);                                  // sync arrays with the host side
        ts.execute(gridScheduler);   // Execute with a GridScheduler
    }
}

Additionally, the two modes of expressing parallelism (kernel and loop parallelization) can be combined in the same task-schedule object.
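To make the thread-indexed model more concrete, the following sketch emulates it on the host with a parallel stream: every point of a 2D grid runs the same kernel body, identified only by its (idx, jdx) position. This is an illustration of the execution model under stated assumptions, not how TornadoVM launches kernels. It uses flat row-major arrays instead of Matrix2DFloat, and the names KernelModelSketch, mxmElement and launch2D are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class KernelModelSketch {
    // Kernel body: computes exactly one output element, selected by the
    // (idx, jdx) position of the "thread" in the 2D grid.
    static void mxmElement(int idx, int jdx, float[] a, float[] b, float[] c, int size) {
        float sum = 0.0f;
        for (int k = 0; k < size; k++) {
            sum += a[idx * size + k] * b[k * size + jdx];
        }
        c[idx * size + jdx] = sum;
    }

    // Host-side emulation (assumption): one invocation per grid point.
    // Each invocation writes a distinct element of c, so the invocations
    // are independent and can safely run concurrently.
    static void launch2D(float[] a, float[] b, float[] c, int size) {
        IntStream.range(0, size * size)
                 .parallel()
                 .forEach(gid -> mxmElement(gid / size, gid % size, a, b, c, size));
    }

    public static void main(String[] args) {
        int size = 2;
        float[] a = {1, 2, 3, 4};
        float[] b = {5, 6, 7, 8};
        float[] c = new float[size * size];
        launch2D(a, b, c, size);
        System.out.println(Arrays.toString(c));
    }
}
```

The key difference from the loop-parallel style is that the kernel body contains no outer loops: the grid dimensions, rather than loop bounds, determine how many times it runs.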

4. Dynamic Reconfiguration

Dynamic reconfiguration is the ability of TornadoVM to perform live task migration between devices: TornadoVM decides where to execute the code to increase performance, if possible. In other words, it switches devices when it detects that one device can yield better performance than another. With task migration, TornadoVM only switches device if it detects that the application can run faster there than on the CPU using the code compiled by C2 or Graal-JIT; otherwise it stays on the CPU. TornadoVM can therefore be seen as a complement to C2 and Graal.

The motivation is that no single piece of hardware executes all workloads efficiently. GPUs are very good at exploiting SIMD applications, and FPGAs are very good at exploiting pipeline applications. If your applications follow those models, TornadoVM will likely select heterogeneous hardware. Otherwise, it will stay on the CPU using the default compilers (C2 or Graal).

To use the dynamic reconfiguration, you can execute using TornadoVM policies. For example:

// TornadoVM will execute the code on the best accelerator.
ts.execute(Policy.PERFORMANCE);

Further details and instructions on how to enable this feature can be found here.
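The decision rule described in this section (move to an accelerator only if it beats the CPU, otherwise stay on the CPU) can be sketched in a few lines of plain Java. This is a conceptual illustration only, not TornadoVM's implementation: the class DeviceSelectionSketch, the method select, the device names and the timings are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DeviceSelectionSketch {
    // Returns the accelerator with the lowest measured time if it is
    // strictly faster than the CPU baseline; otherwise returns the CPU.
    static String select(String cpuName, long cpuNanos, Map<String, Long> acceleratorNanos) {
        String best = cpuName;
        long bestNanos = cpuNanos;
        for (Map.Entry<String, Long> entry : acceleratorNanos.entrySet()) {
            if (entry.getValue() < bestNanos) {
                best = entry.getKey();
                bestNanos = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up timings in nanoseconds, for illustration only.
        Map<String, Long> timings = new LinkedHashMap<>();
        timings.put("gpu", 400_000L);
        timings.put("fpga", 900_000L);

        System.out.println(select("cpu", 1_000_000L, timings)); // gpu
        System.out.println(select("cpu", 300_000L, timings));   // cpu
    }
}
```

Starting from the CPU as the baseline means that a tie, or the absence of any accelerator, leaves execution on the CPU, which matches the fallback behaviour described above.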

5. How to Use it in your Projects?

You can import the API and start using TornadoVM. Add the following to your pom.xml file:

<repositories>
    <repository>
        <id>universityOfManchester-graal</id>
        <url>https://raw.githubusercontent.com/beehive-lab/tornado/maven-tornadovm</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>tornado</groupId>
        <artifactId>tornado-api</artifactId>
        <version>0.12</version>
    </dependency>
    <dependency>
        <groupId>tornado</groupId>
        <artifactId>tornado-matrices</artifactId>
        <version>0.12</version>
    </dependency>
</dependencies>

To run TornadoVM, you need to either install the TornadoVM extension for GraalVM/OpenJDK, or run with our Docker images.

6. Additional Resources

Here you can find videos, presentations, articles, and artefacts describing TornadoVM and how to use it.

7. Academic Publications

Selected publications and citations can be found here.

8. Acknowledgments

This work is partially funded by Intel corporation and the EU Horizon 2020 ELEGANT 957286 grant. In addition, it has been supported by the EU Horizon 2020 E2Data 780245, EU Horizon 2020 ACTiCLOUD 732366, EPSRC PAMELA EP/K008730/1, and AnyScale Apps EP/L000725/1 grants.

9. Contributions and Collaborations

We welcome collaborations! Please see how to contribute to the project on the CONTRIBUTING page.

Write your questions and proposals:

You can open new proposals on the GitHub discussions page: https://github.com/beehive-lab/TornadoVM/discussions

Mailing List:

A mailing list is also available to discuss TornadoVM related issues: [email protected]

Collaborations:

For Academic & Industry collaborations, please contact here.

10. TornadoVM Team

Visit our website to meet the team.

11. Licenses

To use TornadoVM, you can link your application against the TornadoVM API, which is released under GPLv2.0 with the CLASSPATH Exception.

Each TornadoVM module is licensed as follows:

Module License

Tornado-API + CLASSPATH Exception

Tornado-Runtime

Tornado-Assembly

Tornado-Drivers

Tornado-Drivers-OpenCL-Headers

Tornado-scripts

Tornado-Annotation

Tornado-Unittests

Tornado-Benchmarks

Tornado-Examples

Tornado-Matrices

JNI Libraries (OpenCL, PTX and LevelZero)
