HIP-Fortran

Cross-Platform GPU Acceleration in Fortran

AMD's HIP provides an API for accelerating C/C++ applications with both Nvidia and AMD GPUs. Fluid Numerics provides HIP-Fortran to give Fortran developers a way to call HIP routines directly from Fortran applications.

Download and contribute to the open-source HIP-Fortran repository from Fluid Numerics.

Learn how to install and work with hip-fortran through our hands-on codelabs.

Community

HIP-Fortran is in its infancy and Fluid Numerics is ready to help users get started. We are currently focusing on building tests to support Continuous Integration and creating live training modules, documentation, and codelabs. If you have requests for more HIP API exposure, would like to report bugs, or need guidance on implementation not covered in the documentation or tests, open a new issue on the GitHub page.

Support HIP-Fortran

Fluid Numerics offers consulting and hands-on coding to assist you and your team with code refactoring, GPU porting, GPU optimization, and multi-GPU acceleration. When you engage us for support, a significant portion of our service fee is reinvested into R&D accounts at Fluid Numerics to allow us to continue to support the open-source HIP-Fortran.

As you build expertise in GPU programming and in working with HIP-Fortran, we want to hear from you! Fluid Numerics is ready to give you a virtual stage to share your experience and lessons learned with the rest of the HPC community.

HIP-Fortran is meant to be a layer that allows Fortran developers to accelerate applications with both AMD and Nvidia GPUs. HIP-Fortran is regularly tested on Nvidia GPU platforms, but consistent testing on AMD platforms is still lacking. Support this project by providing access to AMD GPU systems or donating AMD GPU cards to Fluid Numerics to set up a hip-fortran CI server.

Fluid Numerics incurs expenses for time & labor and cloud computing to maintain and support the HIP-Fortran repository, codelabs, tutorials, demos, CI infrastructure, and mini-app testing. Consider donating funds to Fluid Numerics that will be reserved for these expenses.

Usage Example

The simple example program below demonstrates how to:

  • Create device (GPU) pointers
  • Copy data from host to device and device to host
  • Launch a simple HIP kernel from Fortran
  • Build a hip-fortran application

In this example, we have three files:

  • main.f03 : The main Fortran program
  • my_module.f03 : A Fortran module that defines the kernel interface
  • my_module_hip.cpp : The C++ code that defines the HIP Kernel and the kernel wrapper to launch the kernel

Assuming you

  • Have installed hip-fortran under /opt/hip-fortran,
  • Are using the gfortran compiler,
  • Are using the included modulefile, and
  • Have the hipcc compiler and all necessary dependencies,

you can build this application with

gfortran ${HIPFORTRAN_INCLUDE} -c my_module.f03
gfortran ${HIPFORTRAN_INCLUDE} -c main.f03
hipcc -c my_module_hip.cpp
hipcc main.o my_module.o my_module_hip.o ${HIPFORTRAN_INCLUDE} ${HIPFORTRAN_LIB} -lgfortran -o hip_test

main.f03

PROGRAM main

USE iso_c_binding
USE hip_fortran
USE my_module

IMPLICIT NONE

  INTEGER(c_int), PARAMETER :: N = 10
  REAL(8), ALLOCATABLE, TARGET :: a(:,:)
  REAL(8), ALLOCATABLE, TARGET :: b(:,:)
  TYPE(c_ptr) :: a_dev = c_null_ptr
  TYPE(c_ptr) :: b_dev = c_null_ptr

    ! Allocate and initialize host arrays
    ALLOCATE(a(0:N,0:N), b(0:N,0:N))
    a = 10.0D0
    b = 0.0D0

    ! Allocate device arrays (SIZEOF is a GNU extension that returns a size in bytes)
    CALL hfMalloc(a_dev, SIZEOF(a))
    CALL hfMalloc(b_dev, SIZEOF(b))

    ! Copy host memory to device memory
    CALL hfMemcpy(a_dev, c_loc(a), SIZEOF(a), hipMemcpyHostToDevice)

    ! Launch the kernel through its C wrapper: b = 2*a on the device
    CALL myRoutine(a_dev,b_dev,N)

    ! Copy the result from device memory back to host memory
    CALL hfMemcpy(c_loc(b), b_dev, SIZEOF(b), hipMemcpyDeviceToHost)

    ! Release device and host memory
    CALL hfFree(a_dev)
    CALL hfFree(b_dev)
    DEALLOCATE(a, b)

END PROGRAM main
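
To make the sizes in main.f03 concrete: each host array has bounds 0:10 in both dimensions, so it holds 11 x 11 = 121 double-precision values, or 968 bytes, and Fortran stores it in column-major order. The host-only C++ sketch below works through that arithmetic. The helper names (element_count, byte_count, flat_index) are hypothetical and are not part of hip-fortran; the sketch also assumes an 8-byte double.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers, for illustration only (not part of hip-fortran).

// Number of elements in a Fortran array dimensioned (0:n, 0:n).
constexpr std::size_t element_count(int n) {
  return static_cast<std::size_t>(n + 1) * static_cast<std::size_t>(n + 1);
}

// Storage in bytes for that array when each element is a double.
constexpr std::size_t byte_count(int n) {
  return element_count(n) * sizeof(double);
}

// Column-major offset of element (i, j) with zero-based bounds 0:n:
// the first index varies fastest, as in Fortran.
constexpr std::size_t flat_index(int i, int j, int n) {
  return static_cast<std::size_t>(i) +
         static_cast<std::size_t>(n + 1) * static_cast<std::size_t>(j);
}
```

With n = 10, flat_index(0, 1, 10) is 11, so stepping the second index by one skips a full column of 11 values. This is the linear index that a one-dimensional kernel sees.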

my_module.f03

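MODULE my_module

IMPLICIT NONE

  INTERFACE
    ! Binds to the extern "C" wrapper myroutine in my_module_hip.cpp
    SUBROUTINE myRoutine(a,b,N) BIND(c, name="myroutine")
      USE iso_c_binding
      IMPLICIT NONE
      TYPE(c_ptr) :: a, b          ! passed by reference (double** in C)
      INTEGER(c_int), VALUE :: N   ! passed by value (int in C)
    END SUBROUTINE myRoutine
  END INTERFACE

END MODULE my_module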
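
The interface declares a and b as TYPE(c_ptr) without the VALUE attribute, so Fortran passes the address of each pointer, while N is passed by value. That is why the C++ wrapper takes double ** for the arrays and int for the size. The sketch below is a host-only stand-in that uses ordinary host memory instead of device memory. The function name scale_by_two_host is hypothetical and is not part of hip-fortran.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-only stand-in for a wrapper called from Fortran.
// a and b arrive as pointer-to-pointer because a TYPE(c_ptr) dummy argument
// without VALUE is passed by reference; n arrives by value.
extern "C" void scale_by_two_host(double **a, double **b, int n) {
  const double *in = *a;   // dereference once to reach the data pointer
  double *out = *b;
  const int total = (n + 1) * (n + 1);  // array bounds are 0:n in each dimension
  for (int i = 0; i < total; ++i) {
    out[i] = 2.0 * in[i];
  }
}
```

A caller on the C++ side would pass the address of each data pointer, for example scale_by_two_host(&pa, &pb, 10), which mirrors what the Fortran call does implicitly.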

my_module_hip.cpp

#include <hip/hip_runtime.h>

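// Each thread doubles one element of a and stores the result in b.
__global__ void myroutine_hipkernel(double *a, double *b, int n){

  size_t i  = blockIdx.x*blockDim.x + threadIdx.x;
  size_t total = (size_t)(n+1)*(n+1);
  if ( i < total ) {
    b[i] = 2.0*a[i];
  }

}

extern "C"
{
  // a and b are passed by reference from Fortran, hence double**.
  void myroutine(double **a, double **b, int n)
  {
    int threadPerBlock = 256;
    // Round up so that a partially filled last block is still launched.
    int blockCount = ((n+1)*(n+1) + threadPerBlock - 1)/threadPerBlock;

    hipLaunchKernelGGL((myroutine_hipkernel), dim3(blockCount), dim3(threadPerBlock), 0, 0, *a, *b, n);
  }
}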
}
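
The 11 x 11 array in this example has 121 elements, fewer than one 256-thread block. The block count therefore has to be rounded up: truncating integer division (121/256) yields zero blocks and no work is done. The host-only C++ sketch below shows the rounding and emulates the launch on the CPU to confirm that every element is visited. The names blocks_for and emulate_launch are hypothetical, and the loop nest only imitates the block and thread indexing; it is not how HIP schedules work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Ceiling division: the smallest block count with blocks * threads >= total.
constexpr int blocks_for(int total, int threads_per_block) {
  return (total + threads_per_block - 1) / threads_per_block;
}

// CPU stand-in for the kernel launch: visit every (block, thread) pair and
// apply the same bounds guard and update as the kernel body.
inline std::vector<double> emulate_launch(const std::vector<double> &a, int n,
                                          int threads_per_block) {
  const std::size_t total = static_cast<std::size_t>(n + 1) * (n + 1);
  std::vector<double> b(a.size(), 0.0);
  const int blocks = blocks_for(static_cast<int>(total), threads_per_block);
  for (int block = 0; block < blocks; ++block) {
    for (int thread = 0; thread < threads_per_block; ++thread) {
      const std::size_t i =
          static_cast<std::size_t>(block) * threads_per_block + thread;
      if (i < total) {  // guard against the unused threads of the last block
        b[i] = 2.0 * a[i];
      }
    }
  }
  return b;
}
```

The bounds guard inside the loop matters for the same reason as in the kernel: rounding up launches 256 threads for 121 elements, so 135 of them must do nothing.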