跳转到内容

并行谱数值方法/使用ParaView协同处理进行可视化

来自维基教科书,开放世界中的开放书籍

ParaView协同处理插件允许应用程序代码被检测以连接到ParaView服务器,以执行可视化管道。管道可以生成各种格式的图像或VTK XML并行文件格式数据集。

用户使用一个示例(可能分辨率较低)数据集在ParaView客户端中创建一个管道。然后,该管道使用协同处理插件导出为Python脚本,该脚本将由应用程序加载。

应用程序开发人员还需要编写一些Fortran/C++代码,将应用程序的数据结构转换为ParaView可以理解的格式。对于完全规则的网格,这是很简单的。

在单个计算节点上使用ParaView协同处理

[编辑 | 编辑源代码]

ParaView客户端设置

[编辑 | 编辑源代码]

第一个要求是拥有一个可以导出Python脚本的ParaView客户端。截至ParaView 3.14.1,这需要从源代码构建。 标准说明适用于一些关于构建 脚本生成插件的额外说明。构建完成后,第一次启动ParaView时,转到工具,管理插件,并将协同处理插件设置为在启动时加载。加载插件后,应该有两个新的菜单项:写入器和协同处理

客户端构建说明

[编辑 | 编辑源代码]
  • 可能需要安装CMake
  • 可能需要构建Qt
  • 可能需要关闭构建Manta插件

现有代码修改

[编辑 | 编辑源代码]

以下说明适用于2012年夏季在NCSA的Forge上测试的Navier-Stokes CUDA Fortran。

模拟代码

[编辑 | 编辑源代码]

 

 

 

 

( A)

navierstokes.cuf添加了4行并删除了.datbin写入。 代码下载

!--------------------------------------------------------------------
!
! PURPOSE
!
! This program numerically solves the 2D incompressible Navier-Stokes
! on a Square Domain [0,1]x[0,1] using pseudo-spectral methods and
! Crank-Nicolson timestepping. The numerical solution is compared to
! the exact Taylor-Green Vortex Solution. 
!
! Periodic free-slip boundary conditions and Initial conditions:
!       u(x,y,0)=sin(2*pi*x)cos(2*pi*y)
!       v(x,y,0)=-cos(2*pi*x)sin(2*pi*y)
! Analytical Solution (subscript denote derivatives):
!       u(x,y,t)=sin(2*pi*x)cos(2*pi*y)exp(-8*pi^2*t/Re)
!       v(x,y,t)=-cos(2*pi*x)sin(2*pi*y)exp(-8*pi^2*t/Re)
!   u_y(x,y,t)=-2*pi*sin(2*pi*x)sin(2*pi*y)exp(-8*pi^2*t/Re)
!       v_x(x,y,t)=2*pi*sin(2*pi*x)sin(2*pi*y)exp(-8*pi^2*t/Re)
!       omega=v_x-u_y
!
! .. Parameters ..
!  Nx                           = number of modes in x - power of 2 for FFT
!  Ny                           = number of modes in y - power of 2 for FFT
!  nplots                       = number of plots produced
!  plotgap                      = number of timesteps inbetween plots
!  Re                           = dimensionless Renold's number
!  ReInv                        = 1/Re for optimization
!  dt                           = timestep size 
!  dtInv                        = 1/dt for optimization
!  tol                          = determines when convergences is reached
!  scalemodes           = 1/(Nx*Ny), scaling after preforming FFTs
!  numthreads           = number of CPUs used in calculation
! .. Scalars ..
!  i                            = loop counter in x direction
!  j                            = loop counter in y direction
!  n                            = loop counter for timesteps direction  
!  allocatestatus       = error indicator during allocation
!  time                         = times at which data is saved
!  chg                          = error at each iteration
!  temp                         = used for ordering saved omega 
! .. Arrays (gpu) ..
!  u_d                          = velocity in x direction
!  uold_d                               = velocity in x direction at previous timestep
!  v_d                          = velocity in y direction
!  vold_d                               = velocity in y direction at previous timestep
!  omeg_d                               = vorticity     in real space
!  omeghat_d                    = 2D Fourier transform of vorticity
!                                               at next iterate
!  omegoldhat_d         = 2D Fourier transform of vorticity at previous
!                                               iterate
!  nloldhat_d                   = nonlinear term in Fourier space
!                                               at previous iterate
!  omegexact_d          = taylor-green vorticity at
!                                               at final step
!  psihat_d                     = 2D Fourier transform of streamfunction
!                                               at next iteration
!  omegcheck_d          = store of vorticity at previous iterate
!  temp1_d/temp2_d              = reusable complex/real space used for
!                                               calculations. This reduces number of
!                                               arrays stored.
! .. Vectors (gpu) ..
!  kx_d                         = fourier frequencies in x direction
!  ky_d                         = fourier frequencies in y direction
!  x_d                          = x locations
!  y_d                          = y locations
!  name_config          = array to store filename for data to be saved                  
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!               
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! This program has not been fully optimized.
!--------------------------------------------------------------------   
module precision
  ! Precision control
  
  integer, parameter, public :: Single = kind(0.0) ! Single precision
  integer, parameter, public :: Double = kind(0.0d0) ! Double precision
  
  integer, parameter, public :: fp_kind = Double
  !integer, parameter, public :: fp_kind = Single
  
end module precision

module cufft
  
  integer, public :: CUFFT_FORWARD = -1
  integer, public :: CUFFT_INVERSE = 1
  integer, public :: CUFFT_R2C = Z'2a' ! Real to Complex (interleaved)
  integer, public :: CUFFT_C2R = Z'2c' ! Complex (interleaved) to Real
  integer, public :: CUFFT_C2C = Z'29' ! Complex to Complex, interleaved
  integer, public :: CUFFT_D2Z = Z'6a' ! Double to Double-Complex
  integer, public :: CUFFT_Z2D = Z'6c' ! Double-Complex to Double
  integer, public :: CUFFT_Z2Z = Z'69' ! Double-Complex to Double-Complex
  
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  !
  ! cufftPlan2d(cufftHandle *plan, int nx,int ny, cufftType type)
  !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  
  interface cufftPlan2d
     subroutine cufftPlan2d(plan, nx, ny, type) bind(C,name='cufftPlan2d')
       use iso_c_binding
       integer(c_int):: plan
       integer(c_int),value:: nx, ny, type
     end subroutine cufftPlan2d
  end interface
  
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  !
  ! cufftDestroy(cufftHandle plan)
  !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  
  interface cufftDestroy
     subroutine cufftDestroy(plan) bind(C,name='cufftDestroy')
       use iso_c_binding
       integer(c_int),value:: plan
     end subroutine cufftDestroy
  end interface
  
  interface cufftExecD2Z
     subroutine cufftExecD2Z(plan, idata, odata) &
          & bind(C,name='cufftExecD2Z')
       use iso_c_binding
       use precision
       integer(c_int),  value  :: plan
       real(fp_kind),   device :: idata(1:nx,1:ny)
       complex(fp_kind),device :: odata(1:nx,1:ny)
     end subroutine cufftExecD2Z
  end interface
  
  interface cufftExecZ2D
     subroutine cufftExecZ2D(plan, idata, odata) &
          & bind(C,name='cufftExecZ2D')
       use iso_c_binding
       use precision
       integer(c_int),value:: plan
       complex(fp_kind),device:: idata(1:nx,1:ny)
       real(fp_kind),device :: odata(1:nx,1:ny)
     end subroutine cufftExecZ2D
  end interface
  
end module cufft

PROGRAM main
  use precision
  use cufft
  
  ! coprocessing:
  use ns2dcnadaptor

  ! declare variables
  IMPLICIT NONE
  INTEGER(kind=4), PARAMETER            ::  Nx=1024
  INTEGER(kind=4), PARAMETER            ::  Ny=1024     
  !INTEGER(kind=8)                                              ::  
  REAL(fp_kind), PARAMETER              ::  dt=0.00025d0
  REAL(fp_kind), PARAMETER              ::  dtInv=1.0d0/REAL(dt,kind(0d0))   
  REAL(fp_kind), PARAMETER      &
       ::  pi=3.14159265358979323846264338327950288419716939937510d0
  REAL(fp_kind), PARAMETER              ::  Re=5000.0d0                 
  REAL(fp_kind), PARAMETER              ::      ReInv=1.0d0/REAL(Re,kind(0d0))
  REAL(fp_kind), PARAMETER              ::      tol=0.1d0**10
  
  !coprocessing: temp is used as an int, runtime crashes on 
  ! floating temp used in write format string on line 376
  !REAL(fp_kind)                                        ::  scalemodes,chg,temp=10000000
  real(fp_kind) :: scalemodes, chg
  integer(kind=4) :: temp=100000000

  INTEGER(kind=4), PARAMETER            ::      nplots=400, plotgap=200                         
  REAL(fp_kind),DIMENSION(:), ALLOCATABLE               ::  x,y
  REAL(fp_kind),DIMENSION(:,:), ALLOCATABLE     ::      omeg,omegexact
  INTEGER(kind=4)                               ::  i,j,n,t, AllocateStatus
  INTEGER(kind=4)                               ::  planz2d,pland2z, kersize
  !variables used for saving data and timing            
  INTEGER(kind=4)                               ::      start, finish, count_rate,count, iol    
  CHARACTER*100                                 ::  name_config
  ! Declare variables for GPU   
  REAL(fp_kind), DEVICE, DIMENSION(:,:), ALLOCATABLE            ::      u_d,v_d,&
       omegcheck_d,omeg_d,&
       nl_d, temp2_d, omegexact_d
  COMPLEX(fp_kind), DEVICE, DIMENSION(:,:), ALLOCATABLE :: omegoldhat_d, nloldhat_d,&
       omeghat_d, nlhat_d, psihat_d,&
       temp1_d                                                          
  COMPLEX(fp_kind), DEVICE, DIMENSION(:), ALLOCATABLE           ::  kx_d,ky_d                                           
  REAL(kind=8),DEVICE, DIMENSION(:), ALLOCATABLE                ::  x_d,y_d
  
  kersize=min(Nx,256)
  PRINT *,'Program starting'
  PRINT *,'Grid:',Nx,'X',Ny
  PRINT *,'dt:',dt      
  ALLOCATE(x(1:Nx),y(1:Ny),omeg(1:Nx,1:Ny),omegexact(1:Nx,1:Ny),&
       stat=AllocateStatus)
  IF (AllocateStatus .ne. 0) STOP               
  PRINT *,'Allocated CPU arrays'
  ALLOCATE(kx_d(1:Nx/2+1),ky_d(1:Ny),x_d(1:Nx),y_d(1:Ny),u_d(1:Nx,1:Ny),&
       v_d(1:Nx,1:Ny),omeg_d(1:Nx,1:Ny),&
       omegexact_d(1:Nx,1:Ny),temp2_d(1:Nx,1:Ny),&
       omegoldhat_d(1:Nx/2+1,1:Ny),nloldhat_d(1:Nx/2+1,1:Ny),&
       omegcheck_d(1:Nx,1:Ny),omeghat_d(1:Nx/2+1,1:Ny),nl_d(1:Nx,1:Ny),&
       nlhat_d(1:Nx/2+1,1:Ny), psihat_d(1:Nx/2+1,1:Ny),temp1_d(1:Nx/2+1,1:Ny),&
       stat=AllocateStatus)
  IF (AllocateStatus .ne. 0) STOP 
  PRINT *,'Allocated GPU arrays'
  
  
  CALL cufftPlan2D(pland2z,nx,ny,CUFFT_D2Z)
  CALL cufftPlan2D(planz2d,nx,ny,CUFFT_Z2D)
  PRINT *,'Setup FFTs'
  
  ! setup fourier frequencies
  !$cuf kernel do <<< *,* >>>
  DO i=1,Nx/2+1
     kx_d(i)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(i-1,kind=fp_kind)  
  END DO
  kx_d(1+Nx/2)=0.0d0
  !$cuf kernel do <<< *,* >>>
  DO i=1,Nx
     x_d(i)=REAL(i-1,kind(0d0))/REAL(Nx,kind=fp_kind) 
  END DO
  !$cuf kernel do <<< *,* >>>
  DO j=1,Ny/2+1
     ky_d(j)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(j-1,kind=fp_kind) 
  END DO
  ky_d(1+Ny/2)=0.0d0
  !$cuf kernel do <<< *,* >>>
  DO j = 1,Ny/2 -1
     ky_d(j+1+Ny/2)=-ky_d(1-j+Ny/2)
  END DO
  !$cuf kernel do <<< *, * >>>
  DO j=1,Ny
     y_d(j)=REAL(j-1,kind(0d0))/REAL(Ny,kind=fp_kind) 
  END DO
  scalemodes=1.0d0/REAL(Nx*Ny,kind=fp_kind)
  PRINT *,'Setup grid and fourier frequencies'
  
  !initial data
  !$cuf kernel do <<<  *,*  >>>
  DO j=1,Ny
     DO i=1,Nx
        u_d(i,j)=sin(2.0d0*pi*x_d(i))*cos(2.0d0*pi*y_d(j))
     END DO
  END DO
  !$cuf kernel do <<<  *,*  >>>
  DO j=1,Ny
     DO i=1,Nx
        v_d(i,j)=-cos(2.0d0*pi*x_d(i))*sin(2.0d0*pi*y_d(j))
     END DO
  END DO
  !$cuf kernel do <<<  *,*  >>>
  DO j=1,Ny
     DO i=1,Nx
        omeg_d(i,j)=4.0d0*pi*sin(2.0d0*pi*x_d(i))*sin(2.0d0*pi*y_d(j))+0.01d0*cos(2.0d0*pi*y_d(j))
     END DO
  END DO
  
  CALL cufftExecD2Z(pland2z,omeg_d,omeghat_d) 
  
  !$cuf kernel do <<<  *,*  >>>
  DO j=1,Ny
     DO i=1,Nx/2+1
        temp1_d(i,j)=omeghat_d(i,j)*kx_d(i)
     END DO
  END DO
  
  CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
  !first part nonlinear term in real space
  !$cuf kernel do <<<  *,*  >>>
  DO j=1,Ny
     DO i=1,Nx
        nl_d(i,j)=u_d(i,j)*temp2_d(i,j)
     END DO
  END DO
  
  !$cuf kernel do <<<  *,*  >>>
  DO j=1,Ny
     DO i=1,Nx/2+1
        temp1_d(i,j)=omeghat_d(i,j)*ky_d(j)
     END DO
  END DO
  
  CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
  
  !$cuf kernel do <<<  *,*  >>>
  DO j=1,Ny
     DO i=1,Nx
        nl_d(i,j)=(nl_d(i,j)+v_d(i,j)*temp2_d(i,j))*scalemodes
     END DO
  END DO
  
  CALL cufftExecD2Z(pland2z,nl_d,nlhat_d) 
  
  omegcheck_d=omeg_d
  PRINT *,'Got initial data, starting timestepping'     
  CALL system_clock(start,count_rate)

  ! coprocessing: simulation loop
  ! coprocessing: before entering, initialze the ParaView coprocessor
  ! coprocessing: this is from FortranAdaptorAPI.cxx
  ! coprocessing: it takes the name of the pipeline script you created
  ! coprocessing: in the ParaView client and the name's length in chars
  call coprocessorinitialize("pipeline.py", 11 )
  DO t=1,nplots
     DO n=1,plotgap
        chg=1.0d0
        nloldhat_d=nlhat_d
        omegoldhat_d=omeghat_d
        DO WHILE (chg>tol)
           !$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
           DO j=1,Ny
              DO i=1,Nx/2+1
                 omeghat_d(i,j)=((dtInv+0.5d0*ReInv*(kx_d(i)*kx_d(i)+ky_d(j)*ky_d(j)))&
                      *omegoldhat_d(i,j) - 0.5d0*(nloldhat_d(i,j)+nlhat_d(i,j)))&
                      /(dtInv-0.5d0*ReInv*(kx_d(i)*kx_d(i)+ky_d(j)*ky_d(j)))   
              END DO
           END DO
           !$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
           DO j=1,Ny
              DO i=1,Nx/2+1
                 psihat_d(i,j)=-omeghat_d(i,j)/(kx_d(i)*kx_d(i)+ky_d(j)*ky_d(j)+0.10d0**14)
              END DO
           END DO
           CALL cufftExecZ2D(planz2d,omeghat_d,omeg_d)
           !$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
           DO j=1,Ny
              DO i=1,Nx/2+1
                 temp1_d(i,j)=-psihat_d(i,j)*kx_d(i)*scalemodes
              END DO
           END DO
           CALL cufftExecZ2D(planz2d,temp1_d,v_d) 
           !$cuf kernel do(2) <<< (2,*), (kersize,1) >>>                                
           DO j=1,Ny
              DO i=1,Nx/2+1
                 temp1_d(i,j)=psihat_d(i,j)*ky_d(j)*scalemodes
              END DO
           END DO
           CALL cufftExecZ2D(planz2d,temp1_d,u_d)
           
           !$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
           DO j=1,Ny
              DO i=1,Nx/2+1
                 temp1_d(i,j)=omeghat_d(i,j)*kx_d(i)
              END DO
           END DO
           
           CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
           
           !$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
           DO j=1,Ny
              DO i=1,Nx
                 nl_d(i,j)=u_d(i,j)*temp2_d(i,j)
              END DO
           END DO
           
           !$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
           DO j=1,Ny
              DO i=1,Nx/2+1
                 temp1_d(i,j)=omeghat_d(i,j)*ky_d(j)
              END DO
           END DO
           
           CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
           
           !$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
           DO j=1,Ny
              DO i=1,Nx
                 nl_d(i,j)=(nl_d(i,j)+v_d(i,j)*temp2_d(i,j))*scalemodes
              END DO
           END DO
           
           CALL cufftExecD2Z(pland2z,nl_d,nlhat_d) 
           
           chg=0.0d0
           !$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
           DO j=1,Ny
              DO i=1,Nx
                 chg=chg+(omeg_d(i,j)-omegcheck_d(i,j))*(omeg_d(i,j)-omegcheck_d(i,j))&
                      *scalemodes*scalemodes
              END DO
           END DO
           omegcheck_d=omeg_d
        END DO
     END DO
     PRINT *, t*plotgap*dt
     omeg=omeg_d
     !temp=temp+1
     !write(name_config,'(a,i0,a)') 'omega',temp,'.datbin'
     !INQUIRE(iolength=iol) omeg(1,1)
     !OPEN(unit=11,FILE=name_config,form="unformatted", access="direct",recl=iol) 
     !count = 1 !coprocessing - there's an intrinsic array function this shadows
     !DO j=1,Ny
     !   DO i=1,Nx
     !      WRITE(11,rec=count) omeg(i,j)*scalemodes
     !      count=count+1
     !   END DO
     !END DO
     !CLOSE(11)

     ! coprocessing: do the coprocessing at the end of the sim loop
     ! grid dims, time step, time, scalar array
     call navierStokesCoprocessor(Nx, Ny, 1, t, t*dt  ,omeg)  
 
  END DO
 
  ! coprocessing: clean up
  call coprocessorfinalize()

 CALL system_clock(finish,count_rate)
  PRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),&
       'for Time stepping'
  
  
  ! Copy results back to host
  x=x_d
  y=y_d         
  
!!!!!!!!!!!!!!!!!!!!!!!!
  !copy over data to disk!
!!!!!!!!!!!!!!!!!!!!!!!!
  
  name_config = 'xcoord.dat' 
  OPEN(unit=11,FILE=name_config,status="UNKNOWN")       
  REWIND(11)
  DO i=1,Nx
     WRITE(11,*) x(i)
  END DO
  CLOSE(11)
  
  name_config = 'ycoord.dat' 
  OPEN(unit=11,FILE=name_config,status="UNKNOWN")       
  REWIND(11)
  DO j=1,Ny
     WRITE(11,*) y(j)
  END DO
  CLOSE(11)
!!!!!!!!!!!!!!!!!!!!!!!!
  
  CALL cufftDestroy(planz2d)
  CALL cufftDestroy(pland2z)
  PRINT *,'Destroyed fft plan'
  
  DEALLOCATE(x,y,omeg,omegexact,stat=AllocateStatus)    
  IF (AllocateStatus .ne. 0) STOP
  PRINT *,'Deallocated CPU memory'
  
  DEALLOCATE(kx_d,ky_d,x_d,y_d,u_d,&
       v_d, omegcheck_d, omeg_d,omegoldhat_d,&
       omegexact_d, nloldhat_d,omeghat_d,nl_d, nlhat_d,&
       temp1_d,temp2_d, psihat_d,stat=AllocateStatus)   
  IF (AllocateStatus .ne. 0) STOP
  PRINT *,'Deallocated GPU memory'
  PRINT *,'Program execution complete'
END PROGRAM main

开发的文件

[编辑 | 编辑源代码]

ns2dcnadaptor.f90

[编辑 | 编辑源代码]

 

 

 

 

(B)
定义navierstokescoprocessor子例程。这会调用来自ParaView提供的FortranAdaptorAPI.h的一些函数,以及ns2dcnVTKDataSet.cxx中用户定义的函数。

代码下载

! fortran module for interacting with the ParaView CoProcessor
! loosely based on: 
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaadaptor.f
! -- Changed for SESE

! subroutine determines if coprocessing needed during the
! current simulation step or not.
      
module ns2dcnadaptor
  implicit none
  public
contains
  subroutine navierstokescoprocessor(nx, ny, nz, step, time, omeg)
    ! nx, ny, nz -- grid dimensions, nz will be 1 since this is a 2D problem.
    ! step       -- current simulation time step
    ! time       -- current simulation time
    ! omega      -- scalar array for the current time step
    integer, intent(in) :: nx, ny, nz, step
    real(kind=8), intent(in) :: time
    real(kind=8), dimension(:,:), intent (in) :: omeg
    integer :: flag

    ! check if processing this time step
    ! defined in FortranAdaptorAPI.h
    call requestdatadescription(step, time, flag)
        
    if (flag .ne. 0) then
       ! processing requested
       ! check if need to create grid
       ! defined in FortranAdaptorAPI.h
       call needtocreategrid(flag)
       
       if (flag .ne. 0) then
          ! grid needed
          ! defined in ns2dcnVTKDataSet.cxx
          call createcpimagedata(nx, ny, nz)
       end if
          
       ! defined in ns2dcnVTKDataSet.cxx 
       call addfield(omeg)
       
       ! defined in FortranAdaptorAPI.h
       call coprocess()
    end if
    
    return
  end subroutine navierstokescoprocessor
end module ns2dcnadaptor

ns2dcnVTKDataSet.cxx

[编辑 | 编辑源代码]

 

 

 

 

( C)

定义将模拟数据放入ParaView/VTK库可以理解的格式的函数。对于完全规则的网格来说非常简单。扩展到3D很简单。非结构化网格将需要更多努力。 代码下载

// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx


// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h" 

// CoProcessor specific headers
// ParaView-3.14.1-Source/CoProcessing/CoProcessor/
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"

#include "vtkDoubleArray.h"
#include "vtkPointData.h"

// ParaView-3.14.1-Source/VTK/Filtering/vtkImageData.h
#include "vtkImageData.h"

// These will be called from the Fortran "glue" code"
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.

// Creates the data container for the CoProcessor.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz)
{
  if (!ParaViewCoProcessing::GetCoProcessorData()) {
    vtkGenericWarningMacro("Unable to access CoProcessorData.");
    return;
  }

  // The simulation grid is a 2-dimensional topologically and geometrically 
  // regular grid. In VTK/ParaView, this is considered an image data set.
  vtkImageData* Grid = vtkImageData::New();
  
  // assuming dimZ == 1 for now
  Grid->SetDimensions(*nx, *ny, *nz);
  
  // Setting the Origin and Spacing are also options.

  // Name should be consistent between here, Fortran and Python client script.
  ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(Grid);
}

// Add field(s) to the data container.
// Separate from above because this will be dynamic, grid is static.
// Might be an issue, VTK probably assumes row major, but
// omeg probably passed column major...
// by hand name mangling for fortran
extern "C" void addfield_(double* scalars)
{
  vtkCPInputDataDescription *idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");

  vtkImageData* Image = vtkImageData::SafeDownCast(idd->GetGrid());

  if (!Image) {
    vtkGenericWarningMacro("No adaptor grid to attach field data to.");
    return;
  }


  // field name must match that in the fortran code.
  if (idd->IsFieldNeeded("omeg")) {
    vtkDoubleArray* omega = vtkDoubleArray::New();
    omega->SetName("omeg");
    omega->SetArray(scalars, Image->GetNumberOfPoints(), 1); 
    Image->GetPointData()->AddArray(omega);
    omega->Delete();
  }
}

在ParaView客户端中创建可视化管道

[编辑 | 编辑源代码]

脚本管道是通过使用协同处理脚本生成器插件在ParaView客户端中生成的。

  • 文件 > 打开 > 浏览到数据集,该文件现在应该在管道浏览器窗口中可见。
  • 在属性窗口中,单击应用,数据现在应该出现在布局窗口中。
  • 要更改颜色映射,请单击工具栏上带有绿色圆圈的彩虹按钮。
  • 要在布局中显示颜色映射,请单击工具栏上垂直彩虹的按钮。
  • 在工具 > 锁定视图大小自定义中设置视窗的大小。这将帮助你为最终图像排列项目。记下尺寸,稍后你需要它们来解决一个错误。
  • 如果要写出VTK数据集,请转到写入器 > 并行图像数据写入器,这应该会在管道中添加一个写入器。
  • 要导出python管道,请转到协同处理 > 导出状态,这应该会启动一个向导。
    • 单击下一步。
    • 选择你想要连接到你的模拟的数据集,添加它,单击下一步。
    • 默认的模拟名称为“输入”。如果你更改它,请务必更新ns2dcnVTKDataSet.cxx,单击下一步。
    • 检查是否要输出图像。单击完成。
    • 保存文件。文件名将在navierstokes.cuf中引用,因此请保存为类似pipeline-test-??.py的名称,然后将其复制到pipeline.py,以避免为新的管道重新编译。

Python 脚本问题/错误

[编辑 | 编辑源代码]

如果管道设置为输出图像,那么脚本中可能存在两个问题,可以视为错误。

  1. 无论在客户端中设置了什么视口大小,脚本都不会使用它。
  2. 颜色映射标签将以一个包含它们的框绘制。

第一个问题的解决方法是在客户端中,向ViewSize方法调用添加与视口窗口尺寸相同的参数。第二个问题只需要简单的编辑。这两个问题都在示例pipeline.py中用#mvm注释标记了。

设置环境

[编辑 | 编辑源代码]

Forge 要求加载 ParaView OSMesa 模块。$ module load paraview/3.14.1-OSMesa。在 Nautilus 上,该模块为 paraview/3.14.1。

运行模拟

[编辑 | 编辑源代码]

按照运行未经检测版本的程序的方式运行。

将图像转换为动画

[编辑 | 编辑源代码]

有许多方法可以做到这一点,如果 ffmpeg 可用,则可以使用 ffmpeg -sameq -i path/to/images/image%04d.png myanimation.mpg。

或者,如果将 VTK 数据集写入文件,则可以直接在 ParaView 中使用它们。

dt 及其对动画时间的影响。

[编辑 | 编辑源代码]

流畅视频约为 30 fps 或 dt = 0.033… 如果将较大 dt 写入 VTK 数据集,则可以在 ParaView 中进行线性时间插值。如果将较大 dt 写入图像并将其拼接成 mpg,则可用选项会大大降低视频质量。

ParaView 服务器设置

[编辑 | 编辑源代码]
  • Forge 没有 X 支持,因此需要构建离屏 Mesa (OSMesa)。
  • Nautilus 支持 X,因此可以使用提供的 OpenGL 库进行渲染。

新 API 的更改摘要

[编辑 | 编辑源代码]

自 ParaView 3.14.1 以来,协同处理 API 有一些细微的更改。以下内容适用于 ParaView 3.98+ 和 Catalyst 1.0 alpha+。这些更改已于 2013 年 8 月 20 日在 NICS 的 Beacon 上通过 offload 模式进行了测试。

代码更改如下:

  • 在模拟代码中,coprocessorinitialize 现在为 coprocessorinitializewithpython,参数相同。
  • 在 C++ 数据集辅助代码中,包含 vtkCPPythonAdaptorAPI.h,而不是 FortranAdaptorAPI.h。
  • 在 C++ 数据集辅助代码中,从 ParaViewCoprocessing 命名空间更改为 vtkCPPythonAdaptorAPI 命名空间。

要链接的库也已更改,对于 ParaView 4.0.1 版本,它们为:

  • -lvtkPVCatalystPython26D-pv4.0
  • -lvtkPVCatalyst-pv4.0
  • -lvtkCommonCore-pv4.0
  • -lvtkCommonDataModel-pv4.0
  • -lvtkPVPythonCatalyst-pv4.0

使用新 API 的示例

[编辑 | 编辑源代码]

 

 

 

 

( A)
模拟:navierstokes.f90 -- 唯一的更改是初始化函数名称。

代码下载

   
!--------------------------------------------------------------------
!
!
! PURPOSE
!
! This program numerically solves the 2D incompressible Navier-Stokes
! on a Square Domain [0,1]x[0,1] using pseudo-spectral methods and
! Crank-Nicolson timestepping. The numerical solution is compared to
! the exact Taylor-Green Vortex Solution. 
!
! Periodic free-slip boundary conditions and Initial conditions:
! u(x,y,0)=sin(2*pi*x)cos(2*pi*y)
! v(x,y,0)=-cos(2*pi*x)sin(2*pi*y)
! Analytical Solution:
! u(x,y,t)=sin(2*pi*x)cos(2*pi*y)exp(-8*pi^2*nu*t)
! v(x,y,t)=-cos(2*pi*x)sin(2*pi*y)exp(-8*pi^2*nu*t)
!
! .. Parameters ..
!  Nx    = number of modes in x - power of 2 for FFT
!  Ny    = number of modes in y - power of 2 for FFT
!  Nt    = number of timesteps to take
!  Tmax    = maximum simulation time
!  FFTW_IN_PLACE  = value for FFTW input 
!  FFTW_MEASURE  = value for FFTW input
!  FFTW_EXHAUSTIVE  = value for FFTW input
!  FFTW_PATIENT  = value for FFTW input    
!  FFTW_ESTIMATE  = value for FFTW input
!  FFTW_FORWARD     = value for FFTW input
!  FFTW_BACKWARD = value for FFTW input 
!  pi = 3.14159265358979323846264338327950288419716939937510d0
!  mu    = viscosity
!  rho    = density
! .. Scalars ..
!  i    = loop counter in x direction
!  j    = loop counter in y direction
!  n    = loop counter for timesteps direction 
!  allocatestatus = error indicator during allocation
!  count   = keep track of information written to disk
!  iol    = size of array to write to disk
!  start   = variable to record start time of program
!  finish   = variable to record end time of program
!  count_rate  = variable for clock count rate
!  planfx   = Forward 1d fft plan in x
!  planbx   = Backward 1d fft plan in x
!  planfy   = Forward 1d fft plan in y
!  planby   = Backward 1d fft plan in y
!  dt    = timestep
! .. Arrays ..
!  u    = velocity in x direction
!  uold    = velocity in x direction at previous timestep
!  v    = velocity in y direction
!  vold    = velocity in y direction at previous timestep
!  u_y    = y derivative of velocity in x direction
!  v_x    = x derivative of velocity in y direction
!  omeg    = vorticity in real space
!  omegold   = vorticity in real space at previous
!      iterate
!  omegcheck  = store of vorticity at previous iterate
!  omegoldhat  = 2D Fourier transform of vorticity at previous
!      iterate
!  omegoldhat_x  = x-derivative of vorticity in Fourier space
!      at previous iterate
!  omegold_x  = x-derivative of vorticity in real space 
!      at previous iterate
!  omegoldhat_y  = y-derivative of vorticity in Fourier space 
!      at previous iterate
!  omegold_y  = y-derivative of vorticity in real space
!       at previous iterate
!  nlold   = nonlinear term in real space
!      at previous iterate
!  nloldhat   = nonlinear term in Fourier space
!      at previous iterate
!  omeghat   = 2D Fourier transform of vorticity
!      at next iterate
!  omeghat_x  = x-derivative of vorticity in Fourier space
!      at next timestep
!  omeghat_y  = y-derivative of vorticity in Fourier space
!      at next timestep
!  omeg_x   = x-derivative of vorticity in real space
!      at next timestep
!  omeg_y   = y-derivative of vorticity in real space
!      at next timestep
! .. Vectors ..
!  kx    = fourier frequencies in x direction
!  ky    = fourier frequencies in y direction
!  kxx    = square of fourier frequencies in x direction
!  kyy    = square of fourier frequencies in y direction
!  x    = x locations
!  y    = y locations
!  time    = times at which save data
!  name_config  = array to store filename for data to be saved      
!  fftfx   = array to setup x Fourier transform
!  fftbx   = array to setup y Fourier transform
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!  
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! This program has not been optimized to use the least amount of memory
! but is intended as an example only for which all states can be saved
!--------------------------------------------------------------------
! External routines required
! 
! External libraries required
! FFTW3  -- Fast Fourier Transform in the West Library
!   (http://www.fftw.org/)         
! declare variables
PROGRAM main
    use nsadaptor_glue
    IMPLICIT NONE  
    INTEGER(kind=4), PARAMETER  ::  Nx=256   
    INTEGER(kind=4), PARAMETER  ::  Ny=256   
    REAL(kind=8), PARAMETER  ::  dt=0.00125     
    REAL(kind=8), PARAMETER &
        ::  pi=3.14159265358979323846264338327950288419716939937510
    REAL(kind=8), PARAMETER  ::  rho=1.0d0   
    REAL(kind=8), PARAMETER  ::  mu=1.0d0  
    REAL(kind=8), PARAMETER  :: tol=0.1d0**10  
    REAL(kind=8)    :: chg  
    INTEGER(kind=4), PARAMETER :: nplots=50
    REAL(kind=8), DIMENSION(:), ALLOCATABLE   :: time
    COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE  ::  kx,kxx    
    COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE  ::  ky,kyy    
    REAL(kind=8), DIMENSION(:), ALLOCATABLE   ::  x     
    REAL(kind=8), DIMENSION(:), ALLOCATABLE   ::  y      
    COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE :: & 
        u,uold,v,vold,u_y,v_x,omegold, omegcheck, omeg,&
        omegoldhat, omegoldhat_x, omegold_x,& 
        omegoldhat_y, omegold_y, nlold, nloldhat,&
        omeghat, omeghat_x, omeghat_y, omeg_x, omeg_y,&
        nl, nlhat, psihat, psihat_x, psi_x, psihat_y, psi_y

    ! added for coprocessing
    real(kind=8), dimension(:,:), allocatable :: dump

    REAL(kind=8),DIMENSION(:,:), ALLOCATABLE :: uexact_y,vexact_x,omegexact
    INTEGER(kind=4)        ::  i,j,k,n, allocatestatus, count, iol
    INTEGER(kind=4)         :: start, finish, count_rate
    INTEGER(kind=4), PARAMETER         ::  FFTW_IN_PLACE = 8, FFTW_MEASURE = 0, &
                FFTW_EXHAUSTIVE = 8, FFTW_PATIENT = 32,    &
                                FFTW_ESTIMATE = 64
    INTEGER(kind=4),PARAMETER          ::  FFTW_FORWARD = -1, FFTW_BACKWARD=1 
    COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE ::  fftfx,fftbx
    INTEGER(kind=8)        ::  planfxy,planbxy
    CHARACTER*100         ::  name_config
       
    CALL system_clock(start,count_rate)  
    ALLOCATE(time(1:nplots),kx(1:Nx),kxx(1:Nx),ky(1:Ny),kyy(1:Ny),x(1:Nx),y(1:Ny),&
        u(1:Nx,1:Ny),uold(1:Nx,1:Ny),v(1:Nx,1:Ny),vold(1:Nx,1:Ny),u_y(1:Nx,1:Ny),&
        v_x(1:Nx,1:Ny),omegold(1:Nx,1:Ny),omegcheck(1:Nx,1:Ny), omeg(1:Nx,1:Ny),&
        omegoldhat(1:Nx,1:Ny),omegoldhat_x(1:Nx,1:Ny), omegold_x(1:Nx,1:Ny), &
        omegoldhat_y(1:Nx,1:Ny),omegold_y(1:Nx,1:Ny), nlold(1:Nx,1:Ny), nloldhat(1:Nx,1:Ny),&
        omeghat(1:Nx,1:Ny), omeghat_x(1:Nx,1:Ny), omeghat_y(1:Nx,1:Ny), omeg_x(1:Nx,1:Ny),&
        omeg_y(1:Nx,1:Ny), nl(1:Nx,1:Ny), nlhat(1:Nx,1:Ny), psihat(1:Nx,1:Ny), &
        psihat_x(1:Nx,1:Ny), psi_x(1:Nx,1:Ny), psihat_y(1:Nx,1:Ny), psi_y(1:Nx,1:Ny),&
        uexact_y(1:Nx,1:Ny), vexact_x(1:Nx,1:Ny), omegexact(1:Nx,1:Ny),fftfx(1:Nx,1:Ny),&
        fftbx(1:Nx,1:Ny),dump(1:Nx,1:Ny),stat=AllocateStatus) 
    IF (AllocateStatus .ne. 0) STOP 
    PRINT *,'allocated space'
  
    ! set up ffts
    CALL dfftw_plan_dft_2d_(planfxy,Nx,Ny,fftfx(1:Nx,1:Ny),fftbx(1:Nx,1:Ny),&
        FFTW_FORWARD,FFTW_EXHAUSTIVE)
    CALL dfftw_plan_dft_2d_(planbxy,Nx,Ny,fftbx(1:Nx,1:Ny),fftfx(1:Nx,1:Ny),&
    FFTW_BACKWARD,FFTW_EXHAUSTIVE)
  
     ! setup fourier frequencies in x-direction
    DO i=1,1+Nx/2
        kx(i)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))     
    END DO
    kx(1+Nx/2)=0.0d0
    DO i = 1,Nx/2 -1
        kx(i+1+Nx/2)=-kx(1-i+Nx/2)
    END DO
    DO i=1,Nx
        kxx(i)=kx(i)*kx(i)
    END DO
    DO i=1,Nx
        x(i)=REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0)) 
    END DO
  
    ! setup fourier frequencies in y-direction
    DO j=1,1+Ny/2
        ky(j)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))     
    END DO
    ky(1+Ny/2)=0.0d0
    DO j = 1,Ny/2 -1
        ky(j+1+Ny/2)=-ky(1-j+Ny/2)
    END DO
    DO j=1,Ny
        kyy(j)=ky(j)*ky(j)
    END DO
    DO j=1,Ny
        y(j)=REAL(j-1,kind(0d0))/REAL(Ny,kind(0d0)) 
    END DO
    PRINT *,'Setup grid and fourier frequencies'
 
  
    DO j=1,Ny
        DO i=1,Nx
            u(i,j)=sin(2.0d0*pi*x(i))*cos(2.0d0*pi*y(j))
            v(i,j)=-cos(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))
            u_y(i,j)=-2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))
            v_x(i,j)=2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))
            omeg(i,j)=v_x(i,j)-u_y(i,j)
        END DO
    END DO
 
    ! Vorticity to Fourier Space
    CALL dfftw_execute_dft_(planfxy,omeg(1:Nx,1:Ny),omeghat(1:Nx,1:Ny)) 
 
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!Initial nonlinear term !!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
    ! obtain \hat{\omega}_x^{n,k}
    DO j=1,Ny
        omeghat_x(1:Nx,j)=omeghat(1:Nx,j)*kx(1:Nx)
    END DO
    ! obtain \hat{\omega}_y^{n,k}
    DO i=1,Nx
        omeghat_y(i,1:Ny)=omeghat(i,1:Ny)*ky(1:Ny)
    END DO
    ! convert to real space 
    CALL dfftw_execute_dft_(planbxy,omeghat_x(1:Nx,1:Ny),omeg_x(1:Nx,1:Ny))
    CALL dfftw_execute_dft_(planbxy,omeghat_y(1:Nx,1:Ny),omeg_y(1:Nx,1:Ny))
    ! compute nonlinear term in real space
    DO j=1,Ny
        nl(1:Nx,j)=u(1:Nx,j)*omeg_x(1:Nx,j)/REAL(Nx*Ny,kind(0d0))+&
        v(1:Nx,j)*omeg_y(1:Nx,j)/REAL(Nx*Ny,kind(0d0)) 
    END DO
    CALL dfftw_execute_dft_(planfxy,nl(1:Nx,1:Ny),nlhat(1:Nx,1:Ny)) 
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
    time(1)=0.0d0
    PRINT *,'Got initial data, starting timestepping'

    ! New Coprocessor API:
    call coprocessorinitializewithpython("pipeline.py", 11)
    
    DO n=1,nplots
        chg=1
        ! save old values
        uold(1:Nx,1:Ny)=u(1:Nx,1:Ny)
        vold(1:Nx,1:Ny)=v(1:Nx,1:Ny)
        omegold(1:Nx,1:Ny)=omeg(1:Nx,1:Ny)
        omegcheck(1:Nx,1:Ny)=omeg(1:Nx,1:Ny)  
        omegoldhat(1:Nx,1:Ny)=omeghat(1:Nx,1:Ny)
        nloldhat(1:Nx,1:Ny)=nlhat(1:Nx,1:Ny)
        DO WHILE (chg>tol) 
            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
            !!!!!!!!!!!!!!nonlinear fixed (n,k+1)!!!!!!!!!!!!!!!!!!!!!!!!!
            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
            ! obtain \hat{\omega}_x^{n+1,k}
            DO j=1,Ny
                omeghat_x(1:Nx,j)=omeghat(1:Nx,j)*kx(1:Nx) 
            END DO
            ! obtain \hat{\omega}_y^{n+1,k}
            DO i=1,Nx
                omeghat_y(i,1:Ny)=omeghat(i,1:Ny)*ky(1:Ny)
            END DO
            ! convert back to real space 
            CALL dfftw_execute_dft_(planbxy,omeghat_x(1:Nx,1:Ny),omeg_x(1:Nx,1:Ny)) 
            CALL dfftw_execute_dft_(planbxy,omeghat_y(1:Nx,1:Ny),omeg_y(1:Nx,1:Ny))
            ! calculate nonlinear term in real space
            DO j=1,Ny
                nl(1:Nx,j)=u(1:Nx,j)*omeg_x(1:Nx,j)/REAL(Nx*Ny,kind(0d0))+&
                v(1:Nx,j)*omeg_y(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
            END DO
            ! convert back to fourier
            CALL dfftw_execute_dft_(planfxy,nl(1:Nx,1:Ny),nlhat(1:Nx,1:Ny)) 
            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
   
            ! obtain \hat{\omega}^{n+1,k+1} with Crank Nicolson timestepping
            DO j=1,Ny
                omeghat(1:Nx,j)=( (1.0d0/dt+0.5d0*(mu/rho)*(kxx(1:Nx)+kyy(j)))&
                *omegoldhat(1:Nx,j) - 0.5d0*(nloldhat(1:Nx,j)+nlhat(1:Nx,j)))/&
                (1.0d0/dt-0.5d0*(mu/rho)*(kxx(1:Nx)+kyy(j)))   
            END DO
   
            ! calculate \hat{\psi}^{n+1,k+1}
            DO j=1,Ny
                psihat(1:Nx,j)=-omeghat(1:Nx,j)/(kxx(1:Nx)+kyy(j))
            END DO
            psihat(1,1)=0.0d0
            psihat(Nx/2+1,Ny/2+1)=0.0d0
            psihat(Nx/2+1,1)=0.0d0
            psihat(1,Ny/2+1)=0.0d0
   
            ! obtain \psi_x^{n+1,k+1} and \psi_y^{n+1,k+1}
            DO j=1,Ny
                psihat_x(1:Nx,j)=psihat(1:Nx,j)*kx(1:Nx)
            END DO
            CALL dfftw_execute_dft_(planbxy,psihat_x(1:Nx,1:Ny),psi_x(1:Nx,1:Ny)) 
            DO i=1,Nx
                psihat_y(i,1:Ny)=psihat(i,1:Ny)*ky(1:Ny)
            END DO
            CALL dfftw_execute_dft_(planbxy,psihat_y(1:Ny,1:Ny),psi_y(1:Ny,1:Ny)) 
            DO j=1,Ny
                psi_x(1:Nx,j)=psi_x(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
                psi_y(1:Nx,j)=psi_y(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
            END DO 
   
            ! obtain \omega^{n+1,k+1}
            CALL dfftw_execute_dft_(planbxy,omeghat(1:Nx,1:Ny),omeg(1:Nx,1:Ny)) 
            DO j=1,Ny
                omeg(1:Nx,j)=omeg(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
            END DO
   
            ! obtain u^{n+1,k+1} and v^{n+1,k+1} using stream function (\psi) in real space
            DO j=1,Ny
                u(1:Nx,j)=psi_y(1:Nx,j)
                v(1:Nx,j)=-psi_x(1:Nx,j)
            END DO 
   
            ! check for convergence 
            chg=maxval(abs(omeg-omegcheck)) 
            ! saves {n+1,k+1} to {n,k} for next iteration
            omegcheck=omeg  
        END DO
        time(n+1)=time(n)+dt
        PRINT *,'TIME ',time(n+1)
        ! call adaptor here
        ! not sure if there's a way to cleanly pass complex to C++ code
        do j = 1, Ny
            do i = 1, Nx
                dump(i,j) = real(omeg(i,j), kind(0d0))
            end do
        end do
        call nsadaptor(Nx, Ny, 1, n, time(n), dump)
    END DO
  
    call coprocessorfinalize()

    DO j=1,Ny
        DO i=1,Nx
            uexact_y(i,j)=-2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))*&
            exp(-8.0d0*mu*(pi**2)*nplots*dt)
            vexact_x(i,j)=2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))*&
            exp(-8.0d0*mu*(pi**2)*nplots*dt)
            omegexact(i,j)=vexact_x(i,j)-uexact_y(i,j)
        END DO
    END DO
  
    name_config = 'omegafinal.datbin' 
    INQUIRE(iolength=iol) omegexact(1,1)
    OPEN(unit=11,FILE=name_config,form="unformatted", access="direct",recl=iol) 
    count = 1 
    DO j=1,Ny
        DO i=1,Nx
            WRITE(11,rec=count) REAL(omeg(i,j),KIND(0d0))
            count=count+1
        END DO
    END DO
    CLOSE(11)
  
    name_config = 'omegaexactfinal.datbin' 
    OPEN(unit=11,FILE=name_config,form="unformatted", access="direct",recl=iol) 
    count = 1 
    DO j=1,Ny
        DO i=1,Nx
            WRITE(11,rec=count) omegexact(i,j)
            count=count+1
        END DO
    END DO
    CLOSE(11)

    name_config = 'xcoord.dat' 
    OPEN(unit=11,FILE=name_config,status="UNKNOWN")  
    REWIND(11)
    DO i=1,Nx
        WRITE(11,*) x(i)
    END DO
    CLOSE(11)

    name_config = 'ycoord.dat' 
    OPEN(unit=11,FILE=name_config,status="UNKNOWN")  
    REWIND(11)
    DO j=1,Ny
        WRITE(11,*) y(j)
    END DO
    CLOSE(11)
 
    CALL dfftw_destroy_plan_(planfxy)
    CALL dfftw_destroy_plan_(planbxy)
    CALL dfftw_cleanup_()
  
    DEALLOCATE(time,kx,kxx,ky,kyy,x,y,&
        u,uold,v,vold,u_y,v_x,omegold, omegcheck, omeg, &
        omegoldhat, omegoldhat_x, omegold_x,& 
        omegoldhat_y, omegold_y, nlold, nloldhat,&
        omeghat, omeghat_x, omeghat_y, omeg_x, omeg_y,&
        nl, nlhat, psihat, psihat_x, psi_x, psihat_y, psi_y,&
        uexact_y,vexact_x,omegexact, &
        fftfx,fftbx,stat=AllocateStatus) 
    IF (AllocateStatus .ne. 0) STOP 
    PRINT *,'Program execution complete'
END PROGRAM main

 

 

 

 

( A)

Fortran 适配器:NSadaptor.f90 -- 没有 API 更改,出于完整性考虑,将其包含在内。 代码下载

! Fortran module for interacting with the ParaView/Catalyst coprocessor
module nsadaptor_glue
    implicit none
    public
contains
    subroutine nsadaptor(nx, ny, nz, step, time, scalar)
        ! nx, ny, nz -- grid dimensions
        ! step       -- current time step
        ! time       -- current time
        ! scalar     -- the data
        integer, intent(in) :: nx, ny, nz, step
        real(kind=8), intent(in) :: time
        real(kind=8), dimension(:,:), intent(in) :: scalar
        integer :: flag
        
        ! check if coprocessing this step
        ! defined in adaptor header in ParaView src
        call requestdatadescription(step, time, flag)
        if (flag .ne. 0) then
            ! coprocessing requested
            ! check if grid exists
            ! also from ParaView adaptor header
            call needtocreategrid(flag)

            if (flag .ne. 0) then
                ! grid needed
                ! defined by application developer
                call createcpimagedata(nx, ny, nz)
            end if

            ! also defined by application dev
            call addfield(scalar, "test")
    
            ! from ParaView adaptor header
            call coprocess()
        end if

        return
    end subroutine nsadaptor
end module nsadaptor_glue

 

 

 

 

( A)

C++ VTK 数据辅助程序:PointBasedDataSet.cxx -- 更改包括新的标头和新的命名空间。还要注意 vtkSmartPointer 的使用,而不是原始指针处理。 代码下载

// VTK Data set creator for ParaView/Catalyst Coprocessing.
// Interfaces with a Fortran adaptor to get data from a Fortran
// simulation.

// Fortran specific header for API called from Fortran glue module
#include "vtkCPPythonAdaptorAPI.h"

// Adaptor specific headers, will need to change if simulation data structure
// changes. This should available from the ParaView install include directory
// if you chose to install development files during configuration.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"
#include "vtkCellData.h"

#include <string>
#include <iostream>
// Names are mangled with trailing underscore for linker visibility, along with
// using extern "C"

// Creates the vtkDataSet container for the CoProcessor. The simulation data
// must be compatible.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz)
{
  if (!vtkCPPythonAdaptorAPI::GetCoProcessorData()) {
    vtkGenericWarningMacro("Unable to access coprocessor data.");
    return;
  }

  vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
	
  // Note: expects Fortran indexing
  img->SetExtent(0, *nx - 1, 0, *ny - 1, 0, *nz - 1);

  vtkCPPythonAdaptorAPI::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
}

// Add data to the container.
extern "C" void addfield_(double* simdata, char* name)
{
  vtkSmartPointer<vtkCPInputDataDescription> idd = 
    vtkCPPythonAdaptorAPI::GetCoProcessorData()->GetInputDescriptionByName("input");
  
  vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());

  if (!img) {
    vtkGenericWarningMacro("No adaptor grid to attach field data to.");
    return;
  }

  std::string fieldPythonName = name;
	
  if (idd->IsFieldNeeded(name)) {
    vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
    field->SetName(name);
    field->SetArray(simdata, img->GetNumberOfPoints(), 1);
    img->GetPointData()->AddArray(field);
  }
}

将 ParaView 协同处理与 MPI 和域分解结合使用

[编辑 | 编辑源代码]

以下说明是在 NICS 的 Nautilus 上使用 NS3D-MPI、NavierStokes3DfftIMR.f90、KgSemiImp3D.f90 和 NSLsplitting.f90 时得出的。

节点解释

[编辑 | 编辑源代码]

在使用单个 CPU 节点(非分区数据)时,可以将模拟数组直接传递给 ParaView,并使用 vtkDataSet::GetPointData()->AddArray() 方法将其添加到管道中。然后,vtkImageData 的点将与模拟的网格节点完全对应。该 vtkImageData 将具有隐式细胞结构,每个细胞角点都是模拟节点之一。

如果没有针对 MPI 的特殊更改,此代码将运行,但是 ParaView 将无法创建角点位于不同秩上的细胞。加载 .pvti 文件时会显示错误,指出范围不正确。可以加载各个 .vtis 文件,查看实际情况确实是这样的,以及细胞之间是否存在间隙。

例如,假设数组是 dimension(20,10,10),并且已分解为 2 个铅笔 (1:10,:,:) 和 (11:20,:,:)。ParaView 认为 (10:11,:,:) 的数据丢失,因此会出现间隙。

使用点光晕

[编辑 | 编辑源代码]

光晕提供了一种填补间隙的方法。2decomp&fft 光晕 api 创建包含铅笔数据和周围光晕元素的数组。例如,假设铅笔为 (10:20,10:20,:), 则光晕数组为 (9:21,9:21,:). 使用光晕会带来一些问题:

  • 光晕元素可能会超出计算域,这些元素将包含垃圾数据。
  • 并非每个铅笔都需要将相同的光晕元素发送给协同处理器。这需要在程序中添加额外的决策逻辑。
  • 从 C++ 的角度来看,光晕数组部分不会紧密打包,因此会包含垃圾数据。

模拟代码中的代码更改

[编辑 | 编辑源代码]

 

 

 

 

( D)
  • 添加了光晕数组。
  • 添加了用于将打包数组传递给 C++ 的一维数组。
  • 根据铅笔范围添加了用于判断要传递哪些光晕元素的分支。
    1. 如果铅笔距离索引原点最远,则仅发送铅笔,不发送光晕元素。
    2. 否则,如果铅笔位于最远范围的任一侧,但不位于两侧,则发送其他范围中最远的光晕元素。
    3. 否则,发送两侧最远的光晕元素。
  • 添加了对光晕区域的重塑。数组通过引用从 F90 传递到 C++。C++ 不理解 F90 数组区域,并且只会从引用中递增读取(这是任何数组区域问题,不仅仅是光晕)。解决此问题的一种方法是将区域重塑为一维数组,这将使所需数据在内存中连续。

代码下载

PROGRAM main    
  !-----------------------------------------------------------------------------------
  !
  !
  ! PURPOSE
  !
  ! This program numerically solves the 3D incompressible Navier-Stokes
  ! on a Cubic Domain [0,2pi]x[0,2pi]x[0,2pi] using pseudo-spectral methods and
  ! Implicit Midpoint rule timestepping. The numerical solution is compared to
  ! an exact solution reported by Shapiro 
  !
  ! Analytical Solution:
  !       u(x,y,z,t)=-0.25*(cos(x)sin(y)sin(z)+sin(x)cos(y)cos(z))exp(-t/Re)
  !       v(x,y,z,t)= 0.25*(sin(x)cos(y)sin(z)-cos(x)sin(y)cos(z))exp(-t/Re)
  !       w(x,y,z,t)= 0.5*cos(x)cos(y)sin(z)exp(-t/Re)
  !
  ! .. Parameters ..
  !  Nx                           = number of modes in x - power of 2 for FFT
  !  Ny                           = number of modes in y - power of 2 for FFT
  !  Nz                           = number of modes in z - power of 2 for FFT
  !  Nt                           = number of timesteps to take
  !  Tmax                         = maximum simulation time
  !  FFTW_IN_PLACE        = value for FFTW input 
  !  FFTW_MEASURE         = value for FFTW input
  !  FFTW_EXHAUSTIVE      = value for FFTW input
  !  FFTW_PATIENT         = value for FFTW input    
  !  FFTW_ESTIMATE        = value for FFTW input
  !  FFTW_FORWARD     = value for FFTW input
  !  FFTW_BACKWARD        = value for FFTW input  
  !  pi = 3.14159265358979323846264338327950288419716939937510d0
  !  Re                           = Reynolds number
  ! .. Scalars ..
  !  i                            = loop counter in x direction
  !  j                            = loop counter in y direction
  !  k                            = loop counter in z direction
  !  n                            = loop counter for timesteps direction  
  !  allocatestatus       = error indicator during allocation
  !  count                        = keep track of information written to disk
  !  iol                          = size of array to write to disk
  !  start                        = variable to record start time of program
  !  finish                       = variable to record end time of program
  !  count_rate           = variable for clock count rate
  !  planfxyz                     = Forward 3d fft plan 
  !  planbxyz                     = Backward 3d fft plan
  !  dt                           = timestep
  ! .. Arrays ..
  !  u                            = velocity in x direction
  !  v                            = velocity in y direction
  !  w                            = velocity in z direction
  !  uold                         = velocity in x direction at previous timestep
  !  vold                         = velocity in y direction at previous timestep
  !  wold                         = velocity in z direction at previous timestep
  !  ux                           = x derivative of velocity in x direction
  !  uy                           = y derivative of velocity in x direction
  !  uz                           = z derivative of velocity in x direction
  !  vx                           = x derivative of velocity in y direction
  !  vy                           = y derivative of velocity in y direction
  !  vz                           = z derivative of velocity in y direction
  !  wx                           = x derivative of velocity in z direction
  !  wy                           = y derivative of velocity in z direction
  !  wz                           = z derivative of velocity in z direction
  !  uxold                        = x derivative of velocity in x direction
  !  uyold                        = y derivative of velocity in x direction
  !  uzold                        = z derivative of velocity in x direction
  !  vxold                        = x derivative of velocity in y direction
  !  vyold                        = y derivative of velocity in y direction
  !  vzold                        = z derivative of velocity in y direction
  !  wxold                        = x derivative of velocity in z direction
  !  wyold                        = y derivative of velocity in z direction
  !  wzold                        = z derivative of velocity in z direction
  !  utemp                        = temporary storage of u to check convergence
  !  vtemp                        = temporary storage of u to check convergence
  !  wtemp                        = temporary storage of u to check convergence
  !  temp_r                       = temporary storage for untransformed variables
  !  uhat                         = Fourier transform of u
  !  vhat                         = Fourier transform of v
  !  what                         = Fourier transform of w
  !  rhsuhatfix           = Fourier transform of righthand side for u for timestepping
  !  rhsvhatfix           = Fourier transform of righthand side for v for timestepping
  !  rhswhatfix           = Fourier transform of righthand side for w for timestepping
  !  nonlinuhat           = Fourier transform of nonlinear term for u
  !  nonlinvhat           = Fourier transform of nonlinear term for u
  !  nonlinwhat           = Fourier transform of nonlinear term for u
  !  phat                         = Fourier transform of nonlinear term for pressure, p
  !  temp_c                       = temporary storage for Fourier transforms
  !  realtemp                     = Real storage
  !
  ! .. Vectors ..
  !  kx                           = fourier frequencies in x direction
  !  ky                           = fourier frequencies in y direction
  !  kz                           = fourier frequencies in z direction
  !  x                            = x locations
  !  y                            = y locations
  !  z                            = y locations
  !  time                         = times at which save data
  !  name_config          = array to store filename for data to be saved
  !               
  ! REFERENCES
  !
  ! A. Shapiro " The use of an exact solution of the Navier-Stokes equations 
  ! in a validation test of a three-dimensional nonhydrostatic numerical model"
  ! Monthly Weather Review vol. 121, 2420-2425, (1993).
  !
  ! ACKNOWLEDGEMENTS
  !
  ! ACCURACY
  !               
  ! ERROR INDICATORS AND WARNINGS
  !
  ! FURTHER COMMENTS
  !
  ! This program has not been optimized to use the least amount of memory
  ! but is intended as an example only for which all states can be saved
  !
  !--------------------------------------------------------------------------------
  ! External routines required
  ! 
  ! External libraries required
  ! 2DECOMP&FFT -- Fast Fourier Transform in the West Library
  !                       (http://2decomp.org/)


  !---------------------------------------------------------------------------------

  USE decomp_2d
  USE decomp_2d_fft
  USE decomp_2d_io
  USE MPI
  !coprocessing:
  use NSadaptor_module 
  IMPLICIT NONE   
  ! declare variables
  INTEGER(kind=4), PARAMETER              :: Nx=256
  INTEGER(kind=4), PARAMETER              :: Ny=256
  INTEGER(kind=4), PARAMETER              :: Nz=256
  INTEGER(kind=4), PARAMETER              :: Lx=1
  INTEGER(kind=4), PARAMETER              :: Ly=1
  INTEGER(kind=4), PARAMETER              :: Lz=1
  INTEGER(kind=4), PARAMETER              :: Nt=20
  REAL(kind=8), PARAMETER                 :: dt=0.05d0/Nt
  REAL(kind=8), PARAMETER                 :: Re=1.0d0     
  REAL(kind=8), PARAMETER                 :: tol=0.1d0**10
  REAL(kind=8), PARAMETER                 :: theta=0.0d0

  REAL(kind=8), PARAMETER &
       ::  pi=3.14159265358979323846264338327950288419716939937510d0
  REAL(kind=8), PARAMETER         ::      ReInv=1.0d0/REAL(Re,kind(0d0))
  REAL(kind=8), PARAMETER         ::  dtInv=1.0d0/REAL(dt,kind(0d0)) 
  REAL(kind=8)                            :: scalemodes,chg,factor
  REAL(kind=8), DIMENSION(:), ALLOCATABLE                 :: x, y, z, time,mychg,allchg
  COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE  :: u, v, w,&
       ux, uy, uz,&
       vx, vy, vz,&
       wx, wy, wz,&
       uold, uxold, uyold, uzold,&
       vold, vxold, vyold, vzold,&
       wold, wxold, wyold, wzold,&
       utemp, vtemp, wtemp, temp_r

  COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE      ::      kx, ky, kz                                              
  COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE  ::      uhat, vhat, what,&
       rhsuhatfix, rhsvhatfix,&
       rhswhatfix, nonlinuhat,&
       nonlinvhat, nonlinwhat,&
       phat,temp_c
  !coprocessing: added x,y,z component arrays also halos
  !coprocessing: halos get allocated inside call to update_halo
  REAL(kind=8), DIMENSION(:,:,:), ALLOCATABLE     ::  realtemp, realtempx, &
       realtempy, realtempz, tempxhalo, tempyhalo, tempzhalo
  !coprocessing: added 1D array for passing to C++
  real(kind=8), dimension(:), allocatable :: xpass2c, ypass2c, zpass2c
  ! MPI and 2DECOMP variables
  TYPE(DECOMP_INFO)                               ::  decomp
  INTEGER(kind=MPI_OFFSET_KIND)                   ::  filesize, disp
  INTEGER(kind=4)                                 ::  p_row=0, p_col=0, numprocs, myid, ierr      

  ! variables used for saving data and timing
  INTEGER(kind=4)                                 :: count, iol 
  INTEGER(kind=4)                                 :: i,j,k,n,t,allocatestatus
  INTEGER(kind=4)                                 :: ind, numberfile
  CHARACTER*100                                   :: name_config
  INTEGER(kind=4)                                 :: start, finish, count_rate 

  ! initialisation of 2DECOMP&FFT
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr) 
  ! do automatic domain decomposition
  CALL decomp_2d_init(Nx,Ny,Nz,p_row,p_col)
  ! get information about domain decomposition choosen
  CALL decomp_info_init(Nx,Ny,Nz,decomp)
  ! initialise FFT library
  CALL decomp_2d_fft_init
  IF (myid.eq.0) THEN
     PRINT *,'Grid:',Nx,'X',Ny,'Y',Nz,'Z'
     PRINT *,'dt:',dt
  END IF

  ALLOCATE(x(1:Nx),y(1:Ny),z(1:Nz),time(1:Nt+1),mychg(1:3),allchg(1:3),&
       u(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),& 
       v(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       w(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       ux(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uy(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uz(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vx(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vy(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vz(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wx(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wy(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wz(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uxold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uyold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uzold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vxold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vyold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vzold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wxold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wyold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wzold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       utemp(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vtemp(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wtemp(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       temp_r(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       kx(1:Nx),ky(1:Ny),kz(1:Nz),&
       uhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       vhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       what(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       rhsuhatfix(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       rhsvhatfix(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       rhswhatfix(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       nonlinuhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       nonlinvhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       nonlinwhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       phat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       temp_c(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       realtemp(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),& 
       realtempx(decomp%xst(1):decomp%xen(1), &
       decomp%xst(2):decomp%xen(2), &
       decomp%xst(3):decomp%xen(3)),&
       realtempy(decomp%xst(1):decomp%xen(1), &
       decomp%xst(2):decomp%xen(2), &
       decomp%xst(3):decomp%xen(3)),&
       realtempz(decomp%xst(1):decomp%xen(1), &
       decomp%xst(2):decomp%xen(2), &
       decomp%xst(3):decomp%xen(3)),&
       stat=AllocateStatus)      
  IF (AllocateStatus .ne. 0) STOP
  IF (myid.eq.0) THEN
     PRINT *,'allocated space'
  END IF

  ! setup fourier frequencies in x-direction
  DO i=1,Nx/2+1
     kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/Lx                        
  END DO
  kx(1+Nx/2)=0.0d0
  DO i = 1,Nx/2 -1
     kx(i+1+Nx/2)=-kx(1-i+Nx/2)
  END DO
  ind=1
  DO i=-Nx/2,Nx/2-1
     x(ind)=2.0d0*pi*REAL(i,kind(0d0))*Lx/REAL(Nx,kind(0d0))
     ind=ind+1
  END DO
  ! setup fourier frequencies in y-direction
  DO j=1,Ny/2+1
     ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly                        
  END DO
  ky(1+Ny/2)=0.0d0
  DO j = 1,Ny/2 -1
     ky(j+1+Ny/2)=-ky(1-j+Ny/2)
  END DO
  ind=1
  DO j=-Ny/2,Ny/2-1
     y(ind)=2.0d0*pi*REAL(j,kind(0d0))*Ly/REAL(Ny,kind(0d0))
     ind=ind+1
  END DO
  ! setup fourier frequencies in z-direction
  DO k=1,Nz/2+1
     kz(k)= cmplx(0.0d0,1.0d0)*REAL(k-1,kind(0d0))/Lz                        
  END DO
  kz(1+Nz/2)=0.0d0
  DO k = 1,Nz/2 -1
     kz(k+1+Nz/2)=-kz(1-k+Nz/2)
  END DO
  ind=1
  DO k=-Nz/2,Nz/2-1
     z(ind)=2.0d0*pi*REAL(k,kind(0d0))*Lz/REAL(Nz,kind(0d0))
     ind=ind+1
  END DO
  scalemodes=1.0d0/REAL(Nx*Ny*Nz,kind(0d0))
  IF (myid.eq.0) THEN
     PRINT *,'Setup grid and fourier frequencies'
  END IF

  !initial conditions for Taylor-Green vortex
  !       factor=2.0d0/sqrt(3.0d0)
  !       DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
  !               u(i,j,k)=factor*sin(theta+2.0d0*pi/3.0d0)*sin(x(i))*cos(y(j))*cos(z(k))
  !       END DO; END DO; END DO
  !       DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
  !               v(i,j,k)=factor*sin(theta-2.0d0*pi/3.0d0)*cos(x(i))*sin(y(j))*cos(z(k))
  !       END DO ; END DO ; END DO
  !       DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
  !               w(i,j,k)=factor*sin(theta)*cos(x(i))*cos(y(j))*sin(z(k))
  !       END DO ; END DO ; END DO

  time(1)=0.0d0
  factor=sqrt(3.0d0)
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     u(i,j,k)=-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
          +sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
  END DO; END DO; END DO
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     v(i,j,k)=0.5*(  factor*sin(x(i))*cos(y(j))*sin(z(k))&
          -cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
  END DO; END DO ; END DO
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     w(i,j,k)=cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(1)/Re)
  END DO; END DO ; END DO

  CALL decomp_2d_fft_3d(u,uhat,DECOMP_2D_FFT_FORWARD)
  CALL decomp_2d_fft_3d(v,vhat,DECOMP_2D_FFT_FORWARD)
  CALL decomp_2d_fft_3d(w,what,DECOMP_2D_FFT_FORWARD)


  ! derivative of u with respect to x, y, and z 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD) 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD) 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD) 

  ! derivative of v with respect to x, y, and z 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)         
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD) 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)         

  ! derivative of w with respect to x, y, and z 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)         
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)         
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)         

  ! save initial data
  n = 0
  !coprocessing: removed savedata calls from coprocessing version
  !coprocessing: could have this after the coprocessor initialize and
  !coprocessing: followed by a call to the coprocessor to write out the initial data
  !coprocessing: image. Otherwise, these are just wasted calls at this spot.
  call coprocessorinitialize("pipeline.py", 11)

  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
  END DO; END DO ; END DO
  call update_halo(realtempx, tempxhalo, level=1)

  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
  END DO; END DO ; END DO
  call update_halo(realtempy, tempyhalo, level=1)

  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
  END DO; END DO ; END DO
  call update_halo(realtempz, tempzhalo, level=1)

  !coprocessing: Four cases of sending halo data to the coprocessor.
  if ((ny_global == decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
     !coprocessing: case 1: pencil furthest from index origin, send just the pencil
     allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
     allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
     allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
     xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(xpass2c)/))
     ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(ypass2c)/))     
     zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(zpass2c)/))
     call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
          decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
          xpass2c, ypass2c, zpass2c)
     deallocate(xpass2c, ypass2c, zpass2c)
  else if ((ny_global > decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
     !coprocessing: case 2: pencil furthest in z, send just upper y halo
     allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
     allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
     allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
     xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(xpass2c)/))
     ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(ypass2c)/))
     zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(zpass2c)/))
     call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
          decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3), n, n*dt, &
          xpass2c, ypass2c, zpass2c)
     deallocate(xpass2c, ypass2c, zpass2c)
  else if ((ny_global == decomp%xen(2)) .and. (nz_global > decomp%xen(3))) then
     !coprocessing: case 3: pencil furthest in y, send just upper z halo
     allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
     allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
     allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
     xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(xpass2c)/))
     ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(ypass2c)/))
     zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(zpass2c)/))
     call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
          decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3)+1, n, n*dt, &
          xpass2c, ypass2c, zpass2c)
     deallocate(xpass2c, ypass2c, zpass2c)
  else
     !coprocessing: case 4: send both upper y and upper z halo
     allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
     allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
     allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
     xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(xpass2c)/))
     ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(ypass2c)/))
     zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(zpass2c)/))
     call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
          decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3)+1,n,n*dt, &
          xpass2c, ypass2c, zpass2c)
     deallocate(xpass2c,ypass2c,zpass2c)
  end if

  !start timer
  CALL system_clock(start,count_rate)
  ! coprocessing: Simulation loop starts here
  DO n=1,Nt
     !fixed point
     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        uold(i,j,k)=u(i,j,k)
        uxold(i,j,k)=ux(i,j,k)
        uyold(i,j,k)=uy(i,j,k)
        uzold(i,j,k)=uz(i,j,k)
     END DO; END DO ; END DO
     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        vold(i,j,k)=v(i,j,k)
        vxold(i,j,k)=vx(i,j,k)
        vyold(i,j,k)=vy(i,j,k)
        vzold(i,j,k)=vz(i,j,k)
     END DO; END DO ; END DO
     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        wold(i,j,k)=w(i,j,k)
        wxold(i,j,k)=wx(i,j,k)
        wyold(i,j,k)=wy(i,j,k)
        wzold(i,j,k)=wz(i,j,k)
     END DO; END DO ; END DO
     DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
        rhsuhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*uhat(i,j,k) 
     END DO; END DO ; END DO
     DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
        rhsvhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*vhat(i,j,k) 
     END DO; END DO ; END DO
     DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
        rhswhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*what(i,j,k) 
     END DO; END DO ; END DO

     chg=1
     DO WHILE (chg .gt. tol)
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(ux(i,j,k)+uxold(i,j,k))&
                +(v(i,j,k)+vold(i,j,k))*(uy(i,j,k)+uyold(i,j,k))&
                +(w(i,j,k)+wold(i,j,k))*(uz(i,j,k)+uzold(i,j,k)))
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_r,nonlinuhat,DECOMP_2D_FFT_FORWARD)
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(vx(i,j,k)+vxold(i,j,k))&
                +(v(i,j,k)+vold(i,j,k))*(vy(i,j,k)+vyold(i,j,k))&
                +(w(i,j,k)+wold(i,j,k))*(vz(i,j,k)+vzold(i,j,k)))
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_r,nonlinvhat,DECOMP_2D_FFT_FORWARD)
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(wx(i,j,k)+wxold(i,j,k))&
                +(v(i,j,k)+vold(i,j,k))*(wy(i,j,k)+wyold(i,j,k))&
                +(w(i,j,k)+wold(i,j,k))*(wz(i,j,k)+wzold(i,j,k)))
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_r,nonlinwhat,DECOMP_2D_FFT_FORWARD)
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           phat(i,j,k)=-1.0d0*( kx(i)*nonlinuhat(i,j,k)&
                +ky(j)*nonlinvhat(i,j,k)&
                +kz(k)*nonlinwhat(i,j,k))&
                /(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)+0.1d0**13)
        END DO; END DO ; END DO

        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           uhat(i,j,k)=(rhsuhatfix(i,j,k)-nonlinuhat(i,j,k)-kx(i)*phat(i,j,k))/&
                (dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
        END DO; END DO ; END DO
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           vhat(i,j,k)=(rhsvhatfix(i,j,k)-nonlinvhat(i,j,k)-ky(j)*phat(i,j,k))/&
                (dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
        END DO; END DO ; END DO
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           what(i,j,k)=(rhswhatfix(i,j,k)-nonlinwhat(i,j,k)-kz(k)*phat(i,j,k))/&
                (dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
        END DO; END DO ; END DO

        ! derivative of u with respect to x, y, and z 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD) 

        ! derivative of v with respect to x, y, and z 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD) 

        ! derivative of w with respect to x, y, and z 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD) 

        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           utemp(i,j,k)=u(i,j,k)
        END DO; END DO ; END DO
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           vtemp(i,j,k)=v(i,j,k)
        END DO; END DO ; END DO
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           wtemp(i,j,k)=w(i,j,k)
        END DO; END DO ; END DO

        CALL decomp_2d_fft_3d(uhat,u,DECOMP_2D_FFT_BACKWARD)    
        CALL decomp_2d_fft_3d(vhat,v,DECOMP_2D_FFT_BACKWARD)    
        CALL decomp_2d_fft_3d(what,w,DECOMP_2D_FFT_BACKWARD)    

        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           u(i,j,k)=u(i,j,k)*scalemodes
        END DO; END DO ; END DO
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           v(i,j,k)=v(i,j,k)*scalemodes
        END DO; END DO ; END DO
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           w(i,j,k)=w(i,j,k)*scalemodes
        END DO; END DO ; END DO

        mychg(1) =maxval(abs(utemp-u))
        mychg(2) =maxval(abs(vtemp-v))
        mychg(3) =maxval(abs(wtemp-w))
        CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
        chg=allchg(1)+allchg(2)+allchg(3)
        IF (myid.eq.0) THEN
           PRINT *,'chg:',chg
        END IF
     END DO
     time(n+1)=n*dt

     !goto 5100
     IF (myid.eq.0) THEN     
        PRINT *,'time',n*dt
     END IF

     !coprocessing: populating the arrays sent to the coprocessor
     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
     END DO; END DO ; END DO
     call update_halo(realtempx, tempxhalo, level=1)

     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
     END DO; END DO ; END DO
     call update_halo(realtempy, tempyhalo, level=1)

     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
     END DO; END DO ; END DO
     call update_halo(realtempz, tempzhalo, level=1)

     !coprocessing: same cases as for initial conditions. 
     if ((ny_global == decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
        ! case 1: send just the pencil
        allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
        allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
        allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
        xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(xpass2c)/))
        ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(ypass2c)/))     
        zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(zpass2c)/))
        call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
             decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
             xpass2c, ypass2c, zpass2c)
        deallocate(xpass2c, ypass2c, zpass2c)
     else if ((ny_global > decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
        ! case 2: send just upper y halo
        allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
        allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
        allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
        xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(xpass2c)/))
        ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(ypass2c)/))
        zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(zpass2c)/))
        call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
             decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3), n, n*dt, &
             xpass2c, ypass2c, zpass2c)
        deallocate(xpass2c, ypass2c, zpass2c)
     else if ((ny_global == decomp%xen(2)) .and. (nz_global > decomp%xen(3))) then
        ! case 3: send just upper z halo
        allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
        allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
        allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
        xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(xpass2c)/))
        ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(ypass2c)/))
        zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(zpass2c)/))
        call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
             decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3)+1, n, n*dt, &
             xpass2c, ypass2c, zpass2c)
        deallocate(xpass2c, ypass2c, zpass2c)
     else
        ! send both upper y and upper z halo
        allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
        allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
        allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
        xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(xpass2c)/))
        ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(ypass2c)/))
        zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(zpass2c)/))
        call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
             decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3)+1,n,n*dt, &
             xpass2c, ypass2c, zpass2c)
        deallocate(xpass2c,ypass2c,zpass2c)
     end if

  END DO
  !coprocessing:
  call coprocessorfinalize()

  CALL system_clock(finish,count_rate)

  IF (myid.eq.0) then
     PRINT *, 'Program took', REAL(finish-start)/REAL(count_rate), 'for main timestepping loop'
  END IF

  IF (myid.eq.0) THEN
     name_config = './data/tdata.dat' 
     OPEN(unit=11,FILE=name_config,status="UNKNOWN")         
     REWIND(11)
     DO n=1,1+Nt
        WRITE(11,*) time(n)
     END DO 
     CLOSE(11)

     name_config = './data/xcoord.dat' 
     OPEN(unit=11,FILE=name_config,status="UNKNOWN")         
     REWIND(11)
     DO i=1,Nx
        WRITE(11,*) x(i)
     END DO
     CLOSE(11)       

     name_config = './data/ycoord.dat' 
     OPEN(unit=11,FILE=name_config,status="UNKNOWN")         
     REWIND(11)
     DO j=1,Ny
        WRITE(11,*) y(j)
     END DO
     CLOSE(11)

     name_config = './data/zcoord.dat' 
     OPEN(unit=11,FILE=name_config,status="UNKNOWN")         
     REWIND(11)
     DO k=1,Nz
        WRITE(11,*) z(k)
     END DO
     CLOSE(11)
     PRINT *,'Saved data'
  END IF

  ! Calculate error in final numerical solution
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     utemp(i,j,k)=u(i,j,k) -&
          (-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
          +sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
  END DO; END DO; END DO
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     vtemp(i,j,k)=v(i,j,k) -&
          (0.5*(  factor*sin(x(i))*cos(y(j))*sin(z(k))&
          -cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
  END DO; END DO ; END DO
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     wtemp(i,j,k)=w(i,j,k)-&
          (cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(Nt+1)/Re))
  END DO; END DO ; END DO
  mychg(1) = maxval(abs(utemp))
  mychg(2) = maxval(abs(vtemp))
  mychg(3) = maxval(abs(wtemp))
  CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
  chg=allchg(1)+allchg(2)+allchg(3)
  IF (myid.eq.0) THEN
     PRINT*,'The error at the final timestep is',chg
  END IF

  ! clean up 
  CALL decomp_2d_fft_finalize
  CALL decomp_2d_finalize

  DEALLOCATE(x,y,z,time,mychg,allchg,u,v,w,ux,uy,uz,vx,vy,vz,wx,wy,wz,uold,uxold,uyold,uzold,&
       vold,vxold,vyold,vzold,wold,wxold,wyold,wzold,utemp,vtemp,wtemp,&
       temp_r,kx,ky,kz,uhat,vhat,what,rhsuhatfix,rhsvhatfix,&
       rhswhatfix,phat,nonlinuhat,nonlinvhat,nonlinwhat,temp_c,&
       realtemp,stat=AllocateStatus)         
  !coprocessing: deallocate Coprocessing specific arrays
  deallocate(realtempx, realtempy, realtempz, tempxhalo, tempyhalo, tempzhalo)
  IF (AllocateStatus .ne. 0) STOP
  IF (myid.eq.0) THEN
     PRINT *,'Program execution complete'
  END IF
  CALL MPI_FINALIZE(ierr)         

END PROGRAM main

f90 适配器粘合代码中的代码更改

[编辑 | 编辑源代码]

 

 

 

 

( E)
  • 更改适配器以接受当前铅笔的子范围。

代码下载

! fortran module for interacting with the ParaView CoProcessor
! loosely based on: 
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaNSadaptor.f
! -- Changed for SESE

! subroutine determines if coprocessing needed during the
! current simulation step or not, and if so, calls the coprocessor.
! some of the subroutines are supplied by ParaView's FortranAdaptorAPI,
! others have to be supplied by the programmer.      
module NSadaptor_module 
  use iso_c_binding
  implicit none
  public
  interface NSadaptor
    ! Originally also had a version that accepted 3D arrays, but in all other
    ! respects was identical. Decided this could lead to too much confusion.
    module procedure NSadaptor1D 
  end interface NSadaptor
contains
  subroutine NSadaptor1D(nx, ny, nz, xst, xen, yst, yen, zst, zen, &
                                     step, time, omegax, omegay, omegaz)
    ! nx, ny, nz     -- grid dimensions or entire mesh
    !                   used for setting whole extent
    ! xst, xen, etc. -- extents of current subdomain pencil
    ! step           -- current simulation time step
    ! time           -- current simulation time
    ! omega*          -- scalar array for the current time step
    ! flag           -- receives status from API calls
    integer, intent(in) :: nx, ny, nz, xst, xen, yst, yen, zst, zen, step
    real(kind=8), intent(in) :: time
    real(kind=8), dimension(:), intent (in) :: omegax, omegay, omegaz
    integer :: flag

    ! check if processing this time step
    ! defined in FortranAdaptorAPI.h
    call requestdatadescription(step, time, flag)
        
    if (flag /= 0) then
       ! processing requested
       ! check if need to create grid
       ! defined in FortranAdaptorAPI.h
       call needtocreategrid(flag)
       
       if (flag /= 0) then
          ! grid needed
          ! defined in VTKPointBasedDataSet.cxx
          ! takes the size of the entire grid, and the extents of the
          ! sub grid.
          call createcpimagedata(nx, ny, nz, xst, xen, yst, yen, zst, zen)
       end if
          
       ! defined in VTKPointBasedDataSet.cxx
       ! remember to null-terminate strings for C/C++
       call addfield(omegax, "realtempx"//char(0));
       call addfield(omegay, "realtempy"//char(0));
       call addfield(omegaz, "realtempz"//char(0));       

       ! defined in FortranAdaptorAPI.h
       call coprocess()
    end if
    
    return
  end subroutine NSadaptor1D 
end module NSadaptor_module

C++ 包装代码中的代码更改

[编辑 | 编辑源代码]

 

 

 

 

( F)
  • 指定了相对于整个数据的已分区数据范围。
  • 指定了单元格的相对间距。

代码下载

// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx


// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h" 

// CoProcessor specific headers
// Routines that take the place of VTK dataset object creation.
// Called from Fortran code which also calls the Fortran Adaptor API
// supplied with ParaView source.
// Note: names mangled with trailing underscores for linker visibility.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"

#include <string>

// These will be called from the Fortran "glue" code"
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.

// Creates the data container for the CoProcessor.
// Takes the extents for both the global dataset and the particular subsection
// visible to the current MPI rank.
// Note: expects to receive Fortran base-1 indices.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen,
	int* yst, int* yen, int* zst, int* zen)
{
  if (!ParaViewCoProcessing::GetCoProcessorData()) {
    vtkGenericWarningMacro("Unable to access CoProcessorData.");
    return;
  }

  // The simulation grid is a 3-dimensional topologically and geometrically 
  // regular grid. In VTK/ParaView, this is considered an image data set.
  vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
  img->SetExtent(*xst - 1, *xen - 1, *yst - 1, *yen - 1, *zst - 1, *zen - 1); 
  img->SetSpacing( 1.0 / *nx, 1.0 / *ny, 1.0 / *nz); 
  ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
  ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx - 1, 0, *ny - 1, 0, *nz - 1);
}

// Add field(s) to the data container.
// Separate from above because this will be dynamic, grid is static.
// Might be an issue, VTK assumes row major C storage.
// Underscore is by-hand name mangling for fortran linking.
// Note: Expects the address of the data, has no way of determining
// if the array is densely packed or not.
// Note 2: Expects "name" to point to null-terminated character array, 
// be sure to null-terminate in caller.
extern "C" void addfield_(double* a, char* name)
{

  vtkSmartPointer<vtkCPInputDataDescription> idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
  vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());

  if (!img) {
    vtkGenericWarningMacro("No adaptor grid to attach field data to.");
    return;
  }


  if (idd->IsFieldNeeded(name)) {
    vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
    field->SetName(name);
    field->SetArray(a, img->GetNumberOfPoints(), 1); 
    img->GetPointData()->AddArray(field);
  }
}
  • 某些 VTK/ParaView 过滤器可以使用幽灵(光晕)单元格,代码可以扩展以处理这些单元格。
  • 在模拟代码本身中添加了很多代码。
  • 光晕和打包数组需要额外的内存。
  • 为 C++ 重塑数组的额外开销。
  • 不同的分解库可能需要更改光晕逻辑。
  • 不能与 3D 分解库一起使用。

将模拟节点重新解释为可视化单元格

[编辑 | 编辑源代码]

查看数据的另一种方式是,模拟的每个节点都是可视化域中的一个单元格。然后,当铅笔发送到 ParaView 时,就不会有缺失单元格的间隙。

模拟代码中的代码更改

[编辑 | 编辑源代码]

 

 

 

 

( G)
  • 除了常规的 CoProcessing 调用之外,没有其他更改。

代码下载

PROGRAM main    
  !-----------------------------------------------------------------------------------
  !
  !
  ! PURPOSE
  !
  ! This program numerically solves the 3D incompressible Navier-Stokes
  ! on a Cubic Domain [0,2pi]x[0,2pi]x[0,2pi] using pseudo-spectral methods and
  ! Implicit Midpoint rule timestepping. The numerical solution is compared to
  ! an exact solution reported by Shapiro 
  !
  ! Analytical Solution:
  !       u(x,y,z,t)=-0.25*(cos(x)sin(y)sin(z)+sin(x)cos(y)cos(z))exp(-t/Re)
  !       v(x,y,z,t)= 0.25*(sin(x)cos(y)sin(z)-cos(x)sin(y)cos(z))exp(-t/Re)
  !       w(x,y,z,t)= 0.5*cos(x)cos(y)sin(z)exp(-t/Re)
  !
  ! .. Parameters ..
  !  Nx                           = number of modes in x - power of 2 for FFT
  !  Ny                           = number of modes in y - power of 2 for FFT
  !  Nz                           = number of modes in z - power of 2 for FFT
  !  Nt                           = number of timesteps to take
  !  Tmax                         = maximum simulation time
  !  FFTW_IN_PLACE        = value for FFTW input 
  !  FFTW_MEASURE         = value for FFTW input
  !  FFTW_EXHAUSTIVE      = value for FFTW input
  !  FFTW_PATIENT         = value for FFTW input    
  !  FFTW_ESTIMATE        = value for FFTW input
  !  FFTW_FORWARD     = value for FFTW input
  !  FFTW_BACKWARD        = value for FFTW input  
  !  pi = 3.14159265358979323846264338327950288419716939937510d0
  !  Re                           = Reynolds number
  ! .. Scalars ..
  !  i                            = loop counter in x direction
  !  j                            = loop counter in y direction
  !  k                            = loop counter in z direction
  !  n                            = loop counter for timesteps direction  
  !  allocatestatus       = error indicator during allocation
  !  count                        = keep track of information written to disk
  !  iol                          = size of array to write to disk
  !  start                        = variable to record start time of program
  !  finish                       = variable to record end time of program
  !  count_rate           = variable for clock count rate
  !  planfxyz                     = Forward 3d fft plan 
  !  planbxyz                     = Backward 3d fft plan
  !  dt                           = timestep
  ! .. Arrays ..
  !  u                            = velocity in x direction
  !  v                            = velocity in y direction
  !  w                            = velocity in z direction
  !  uold                         = velocity in x direction at previous timestep
  !  vold                         = velocity in y direction at previous timestep
  !  wold                         = velocity in z direction at previous timestep
  !  ux                           = x derivative of velocity in x direction
  !  uy                           = y derivative of velocity in x direction
  !  uz                           = z derivative of velocity in x direction
  !  vx                           = x derivative of velocity in y direction
  !  vy                           = y derivative of velocity in y direction
  !  vz                           = z derivative of velocity in y direction
  !  wx                           = x derivative of velocity in z direction
  !  wy                           = y derivative of velocity in z direction
  !  wz                           = z derivative of velocity in z direction
  !  uxold                        = x derivative of velocity in x direction
  !  uyold                        = y derivative of velocity in x direction
  !  uzold                        = z derivative of velocity in x direction
  !  vxold                        = x derivative of velocity in y direction
  !  vyold                        = y derivative of velocity in y direction
  !  vzold                        = z derivative of velocity in y direction
  !  wxold                        = x derivative of velocity in z direction
  !  wyold                        = y derivative of velocity in z direction
  !  wzold                        = z derivative of velocity in z direction
  !  utemp                        = temporary storage of u to check convergence
  !  vtemp                        = temporary storage of u to check convergence
  !  wtemp                        = temporary storage of u to check convergence
  !  temp_r                       = temporary storage for untransformed variables
  !  uhat                         = Fourier transform of u
  !  vhat                         = Fourier transform of v
  !  what                         = Fourier transform of w
  !  rhsuhatfix           = Fourier transform of righthand side for u for timestepping
  !  rhsvhatfix           = Fourier transform of righthand side for v for timestepping
  !  rhswhatfix           = Fourier transform of righthand side for w for timestepping
  !  nonlinuhat           = Fourier transform of nonlinear term for u
  !  nonlinvhat           = Fourier transform of nonlinear term for u
  !  nonlinwhat           = Fourier transform of nonlinear term for u
  !  phat                         = Fourier transform of nonlinear term for pressure, p
  !  temp_c                       = temporary storage for Fourier transforms
  !  realtemp                     = Real storage
  !
  ! .. Vectors ..
  !  kx                           = fourier frequencies in x direction
  !  ky                           = fourier frequencies in y direction
  !  kz                           = fourier frequencies in z direction
  !  x                            = x locations
  !  y                            = y locations
  !  z                            = y locations
  !  time                         = times at which save data
  !  name_config          = array to store filename for data to be saved
  !               
  ! REFERENCES
  !
  ! A. Shapiro " The use of an exact solution of the Navier-Stokes equations 
  ! in a validation test of a three-dimensional nonhydrostatic numerical model"
  ! Monthly Weather Review vol. 121, 2420-2425, (1993).
  !
  ! ACKNOWLEDGEMENTS
  !
  ! ACCURACY
  !               
  ! ERROR INDICATORS AND WARNINGS
  !
  ! FURTHER COMMENTS
  !
  ! This program has not been optimized to use the least amount of memory
  ! but is intended as an example only for which all states can be saved
  !
  !--------------------------------------------------------------------------------
  ! External routines required
  ! 
  ! External libraries required
  ! 2DECOMP&FFT -- Fast Fourier Transform in the West Library
  !                       (http://2decomp.org/)


  !---------------------------------------------------------------------------------

  USE decomp_2d
  USE decomp_2d_fft
  USE decomp_2d_io
  USE MPI
  !coprocessing:
  use NSadaptor_module 
  IMPLICIT NONE   
  ! declare variables
  INTEGER(kind=4), PARAMETER              :: Nx=256
  INTEGER(kind=4), PARAMETER              :: Ny=256
  INTEGER(kind=4), PARAMETER              :: Nz=256
  INTEGER(kind=4), PARAMETER              :: Lx=1
  INTEGER(kind=4), PARAMETER              :: Ly=1
  INTEGER(kind=4), PARAMETER              :: Lz=1
  INTEGER(kind=4), PARAMETER              :: Nt=20
  REAL(kind=8), PARAMETER                 :: dt=0.05d0/Nt
  REAL(kind=8), PARAMETER                 :: Re=1.0d0     
  REAL(kind=8), PARAMETER                 :: tol=0.1d0**10
  REAL(kind=8), PARAMETER                 :: theta=0.0d0

  REAL(kind=8), PARAMETER &
       ::  pi=3.14159265358979323846264338327950288419716939937510d0
  REAL(kind=8), PARAMETER         ::      ReInv=1.0d0/REAL(Re,kind(0d0))
  REAL(kind=8), PARAMETER         ::  dtInv=1.0d0/REAL(dt,kind(0d0)) 
  REAL(kind=8)                            :: scalemodes,chg,factor
  REAL(kind=8), DIMENSION(:), ALLOCATABLE                 :: x, y, z, time,mychg,allchg
  COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE  :: u, v, w,&
       ux, uy, uz,&
       vx, vy, vz,&
       wx, wy, wz,&
       uold, uxold, uyold, uzold,&
       vold, vxold, vyold, vzold,&
       wold, wxold, wyold, wzold,&
       utemp, vtemp, wtemp, temp_r

  COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE      ::      kx, ky, kz                                              
  COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE  ::      uhat, vhat, what,&
       rhsuhatfix, rhsvhatfix,&
       rhswhatfix, nonlinuhat,&
       nonlinvhat, nonlinwhat,&
       phat,temp_c
  !coprocessing: added x,y,z component arrays 
  REAL(kind=8), DIMENSION(:,:,:), ALLOCATABLE     ::  realtemp, realtempx, &
       realtempy, realtempz
  ! MPI and 2DECOMP variables
  TYPE(DECOMP_INFO)                               ::  decomp
  INTEGER(kind=MPI_OFFSET_KIND)                   ::  filesize, disp
  INTEGER(kind=4)                                 ::  p_row=0, p_col=0, numprocs, myid, ierr      

  ! variables used for saving data and timing
  INTEGER(kind=4)                                 :: count, iol 
  INTEGER(kind=4)                                 :: i,j,k,n,t,allocatestatus
  INTEGER(kind=4)                                 :: ind, numberfile
  CHARACTER*100                                   :: name_config
  INTEGER(kind=4)                                 :: start, finish, count_rate 

  ! initialisation of 2DECOMP&FFT
  CALL MPI_INIT(ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr) 
  ! do automatic domain decomposition
  CALL decomp_2d_init(Nx,Ny,Nz,p_row,p_col)
  ! get information about domain decomposition choosen
  CALL decomp_info_init(Nx,Ny,Nz,decomp)
  ! initialise FFT library
  CALL decomp_2d_fft_init
  IF (myid.eq.0) THEN
     PRINT *,'Grid:',Nx,'X',Ny,'Y',Nz,'Z'
     PRINT *,'dt:',dt
  END IF

  ALLOCATE(x(1:Nx),y(1:Ny),z(1:Nz),time(1:Nt+1),mychg(1:3),allchg(1:3),&
       u(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),& 
       v(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       w(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       ux(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uy(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uz(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vx(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vy(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vz(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wx(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wy(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wz(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uxold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uyold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       uzold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vxold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vyold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vzold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wxold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wyold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wzold(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       utemp(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       vtemp(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       wtemp(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       temp_r(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),&
       kx(1:Nx),ky(1:Ny),kz(1:Nz),&
       uhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       vhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       what(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       rhsuhatfix(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       rhsvhatfix(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       rhswhatfix(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       nonlinuhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       nonlinvhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       nonlinwhat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       phat(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       temp_c(decomp%zst(1):decomp%zen(1),&
       decomp%zst(2):decomp%zen(2),&
       decomp%zst(3):decomp%zen(3)),&
       realtemp(decomp%xst(1):decomp%xen(1),&
       decomp%xst(2):decomp%xen(2),&
       decomp%xst(3):decomp%xen(3)),& 
       realtempx(decomp%xst(1):decomp%xen(1), &
       decomp%xst(2):decomp%xen(2), &
       decomp%xst(3):decomp%xen(3)),&
       realtempy(decomp%xst(1):decomp%xen(1), &
       decomp%xst(2):decomp%xen(2), &
       decomp%xst(3):decomp%xen(3)),&
       realtempz(decomp%xst(1):decomp%xen(1), &
       decomp%xst(2):decomp%xen(2), &
       decomp%xst(3):decomp%xen(3)),&
       stat=AllocateStatus)      
  IF (AllocateStatus .ne. 0) STOP
  IF (myid.eq.0) THEN
     PRINT *,'allocated space'
  END IF

  ! setup fourier frequencies in x-direction
  DO i=1,Nx/2+1
     kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/Lx                        
  END DO
  kx(1+Nx/2)=0.0d0
  DO i = 1,Nx/2 -1
     kx(i+1+Nx/2)=-kx(1-i+Nx/2)
  END DO
  ind=1
  DO i=-Nx/2,Nx/2-1
     x(ind)=2.0d0*pi*REAL(i,kind(0d0))*Lx/REAL(Nx,kind(0d0))
     ind=ind+1
  END DO
  ! setup fourier frequencies in y-direction
  DO j=1,Ny/2+1
     ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly                        
  END DO
  ky(1+Ny/2)=0.0d0
  DO j = 1,Ny/2 -1
     ky(j+1+Ny/2)=-ky(1-j+Ny/2)
  END DO
  ind=1
  DO j=-Ny/2,Ny/2-1
     y(ind)=2.0d0*pi*REAL(j,kind(0d0))*Ly/REAL(Ny,kind(0d0))
     ind=ind+1
  END DO
  ! setup fourier frequencies in z-direction
  DO k=1,Nz/2+1
     kz(k)= cmplx(0.0d0,1.0d0)*REAL(k-1,kind(0d0))/Lz                        
  END DO
  kz(1+Nz/2)=0.0d0
  DO k = 1,Nz/2 -1
     kz(k+1+Nz/2)=-kz(1-k+Nz/2)
  END DO
  ind=1
  DO k=-Nz/2,Nz/2-1
     z(ind)=2.0d0*pi*REAL(k,kind(0d0))*Lz/REAL(Nz,kind(0d0))
     ind=ind+1
  END DO
  scalemodes=1.0d0/REAL(Nx*Ny*Nz,kind(0d0))
  IF (myid.eq.0) THEN
     PRINT *,'Setup grid and fourier frequencies'
  END IF

  !initial conditions for Taylor-Green vortex
  !       factor=2.0d0/sqrt(3.0d0)
  !       DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
  !               u(i,j,k)=factor*sin(theta+2.0d0*pi/3.0d0)*sin(x(i))*cos(y(j))*cos(z(k))
  !       END DO; END DO; END DO
  !       DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
  !               v(i,j,k)=factor*sin(theta-2.0d0*pi/3.0d0)*cos(x(i))*sin(y(j))*cos(z(k))
  !       END DO ; END DO ; END DO
  !       DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
  !               w(i,j,k)=factor*sin(theta)*cos(x(i))*cos(y(j))*sin(z(k))
  !       END DO ; END DO ; END DO

  time(1)=0.0d0
  factor=sqrt(3.0d0)
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     u(i,j,k)=-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
          +sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
  END DO; END DO; END DO
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     v(i,j,k)=0.5*(  factor*sin(x(i))*cos(y(j))*sin(z(k))&
          -cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
  END DO; END DO ; END DO
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     w(i,j,k)=cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(1)/Re)
  END DO; END DO ; END DO

  CALL decomp_2d_fft_3d(u,uhat,DECOMP_2D_FFT_FORWARD)
  CALL decomp_2d_fft_3d(v,vhat,DECOMP_2D_FFT_FORWARD)
  CALL decomp_2d_fft_3d(w,what,DECOMP_2D_FFT_FORWARD)


  ! derivative of u with respect to x, y, and z 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD) 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD) 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD) 

  ! derivative of v with respect to x, y, and z 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)         
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD) 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)         

  ! derivative of w with respect to x, y, and z 
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)         
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)         
  DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
     temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
  END DO; END DO ; END DO
  CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)         

  ! save initial data
  n = 0
  !coprocessing: removed savedata calls from coprocessing version
  !coprocessing: could have this after the coprocessor initialize and
  !coprocessing: followed by a call to the coprocessor to write out the initial data
  !coprocessing: image. Otherwise, these are just wasted calls at this spot.
  call coprocessorinitialize("pipeline.py", 11)

  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
  END DO; END DO ; END DO

  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
  END DO; END DO ; END DO

  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
  END DO; END DO ; END DO

  call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
          decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
          realtempx, realtempy, realtempz)

  !start timer
  CALL system_clock(start,count_rate)
  ! coprocessing: Simulation loop starts here
  DO n=1,Nt
     !fixed point
     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        uold(i,j,k)=u(i,j,k)
        uxold(i,j,k)=ux(i,j,k)
        uyold(i,j,k)=uy(i,j,k)
        uzold(i,j,k)=uz(i,j,k)
     END DO; END DO ; END DO
     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        vold(i,j,k)=v(i,j,k)
        vxold(i,j,k)=vx(i,j,k)
        vyold(i,j,k)=vy(i,j,k)
        vzold(i,j,k)=vz(i,j,k)
     END DO; END DO ; END DO
     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        wold(i,j,k)=w(i,j,k)
        wxold(i,j,k)=wx(i,j,k)
        wyold(i,j,k)=wy(i,j,k)
        wzold(i,j,k)=wz(i,j,k)
     END DO; END DO ; END DO
     DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
        rhsuhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*uhat(i,j,k) 
     END DO; END DO ; END DO
     DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
        rhsvhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*vhat(i,j,k) 
     END DO; END DO ; END DO
     DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
        rhswhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*what(i,j,k) 
     END DO; END DO ; END DO

     chg=1
     DO WHILE (chg .gt. tol)
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(ux(i,j,k)+uxold(i,j,k))&
                +(v(i,j,k)+vold(i,j,k))*(uy(i,j,k)+uyold(i,j,k))&
                +(w(i,j,k)+wold(i,j,k))*(uz(i,j,k)+uzold(i,j,k)))
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_r,nonlinuhat,DECOMP_2D_FFT_FORWARD)
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(vx(i,j,k)+vxold(i,j,k))&
                +(v(i,j,k)+vold(i,j,k))*(vy(i,j,k)+vyold(i,j,k))&
                +(w(i,j,k)+wold(i,j,k))*(vz(i,j,k)+vzold(i,j,k)))
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_r,nonlinvhat,DECOMP_2D_FFT_FORWARD)
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(wx(i,j,k)+wxold(i,j,k))&
                +(v(i,j,k)+vold(i,j,k))*(wy(i,j,k)+wyold(i,j,k))&
                +(w(i,j,k)+wold(i,j,k))*(wz(i,j,k)+wzold(i,j,k)))
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_r,nonlinwhat,DECOMP_2D_FFT_FORWARD)
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           phat(i,j,k)=-1.0d0*( kx(i)*nonlinuhat(i,j,k)&
                +ky(j)*nonlinvhat(i,j,k)&
                +kz(k)*nonlinwhat(i,j,k))&
                /(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)+0.1d0**13)
        END DO; END DO ; END DO

        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           uhat(i,j,k)=(rhsuhatfix(i,j,k)-nonlinuhat(i,j,k)-kx(i)*phat(i,j,k))/&
                (dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
        END DO; END DO ; END DO
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           vhat(i,j,k)=(rhsvhatfix(i,j,k)-nonlinvhat(i,j,k)-ky(j)*phat(i,j,k))/&
                (dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
        END DO; END DO ; END DO
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           what(i,j,k)=(rhswhatfix(i,j,k)-nonlinwhat(i,j,k)-kz(k)*phat(i,j,k))/&
                (dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
        END DO; END DO ; END DO

        ! derivative of u with respect to x, y, and z 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD) 

        ! derivative of v with respect to x, y, and z 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD) 

        ! derivative of w with respect to x, y, and z 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD) 
        DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
           temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
        END DO; END DO ; END DO
        CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD) 

        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           utemp(i,j,k)=u(i,j,k)
        END DO; END DO ; END DO
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           vtemp(i,j,k)=v(i,j,k)
        END DO; END DO ; END DO
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           wtemp(i,j,k)=w(i,j,k)
        END DO; END DO ; END DO

        CALL decomp_2d_fft_3d(uhat,u,DECOMP_2D_FFT_BACKWARD)    
        CALL decomp_2d_fft_3d(vhat,v,DECOMP_2D_FFT_BACKWARD)    
        CALL decomp_2d_fft_3d(what,w,DECOMP_2D_FFT_BACKWARD)    

        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           u(i,j,k)=u(i,j,k)*scalemodes
        END DO; END DO ; END DO
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           v(i,j,k)=v(i,j,k)*scalemodes
        END DO; END DO ; END DO
        DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
           w(i,j,k)=w(i,j,k)*scalemodes
        END DO; END DO ; END DO

        mychg(1) =maxval(abs(utemp-u))
        mychg(2) =maxval(abs(vtemp-v))
        mychg(3) =maxval(abs(wtemp-w))
        CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
        chg=allchg(1)+allchg(2)+allchg(3)
        IF (myid.eq.0) THEN
           PRINT *,'chg:',chg
        END IF
     END DO
     time(n+1)=n*dt

     !goto 5100
     IF (myid.eq.0) THEN     
        PRINT *,'time',n*dt
     END IF

     !coprocessing: populating the arrays sent to the coprocessor
     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
     END DO; END DO ; END DO

     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
     END DO; END DO ; END DO

     DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
        realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
     END DO; END DO ; END DO

     call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
             decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
             realtempx, realtempy, realtempz) 
  END DO
  !coprocessing:
  call coprocessorfinalize()

  CALL system_clock(finish,count_rate)

  IF (myid.eq.0) then
     PRINT *, 'Program took', REAL(finish-start)/REAL(count_rate), 'for main timestepping loop'
  END IF

  IF (myid.eq.0) THEN
     name_config = './data/tdata.dat' 
     OPEN(unit=11,FILE=name_config,status="UNKNOWN")         
     REWIND(11)
     DO n=1,1+Nt
        WRITE(11,*) time(n)
     END DO 
     CLOSE(11)

     name_config = './data/xcoord.dat' 
     OPEN(unit=11,FILE=name_config,status="UNKNOWN")         
     REWIND(11)
     DO i=1,Nx
        WRITE(11,*) x(i)
     END DO
     CLOSE(11)       

     name_config = './data/ycoord.dat' 
     OPEN(unit=11,FILE=name_config,status="UNKNOWN")         
     REWIND(11)
     DO j=1,Ny
        WRITE(11,*) y(j)
     END DO
     CLOSE(11)

     name_config = './data/zcoord.dat' 
     OPEN(unit=11,FILE=name_config,status="UNKNOWN")         
     REWIND(11)
     DO k=1,Nz
        WRITE(11,*) z(k)
     END DO
     CLOSE(11)
     PRINT *,'Saved data'
  END IF

  ! Calculate error in final numerical solution
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     utemp(i,j,k)=u(i,j,k) -&
          (-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
          +sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
  END DO; END DO; END DO
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     vtemp(i,j,k)=v(i,j,k) -&
          (0.5*(  factor*sin(x(i))*cos(y(j))*sin(z(k))&
          -cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
  END DO; END DO ; END DO
  DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
     wtemp(i,j,k)=w(i,j,k)-&
          (cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(Nt+1)/Re))
  END DO; END DO ; END DO
  mychg(1) = maxval(abs(utemp))
  mychg(2) = maxval(abs(vtemp))
  mychg(3) = maxval(abs(wtemp))
  CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
  chg=allchg(1)+allchg(2)+allchg(3)
  IF (myid.eq.0) THEN
     PRINT*,'The error at the final timestep is',chg
  END IF

  ! clean up 
  CALL decomp_2d_fft_finalize
  CALL decomp_2d_finalize

  DEALLOCATE(x,y,z,time,mychg,allchg,u,v,w,ux,uy,uz,vx,vy,vz,wx,wy,wz,uold,uxold,uyold,uzold,&
       vold,vxold,vyold,vzold,wold,wxold,wyold,wzold,utemp,vtemp,wtemp,&
       temp_r,kx,ky,kz,uhat,vhat,what,rhsuhatfix,rhsvhatfix,&
       rhswhatfix,phat,nonlinuhat,nonlinvhat,nonlinwhat,temp_c,&
       realtemp,stat=AllocateStatus)         
  !coprocessing: deallocate Coprocessing specific arrays
  deallocate(realtempx, realtempy, realtempz)
  IF (AllocateStatus .ne. 0) STOP
  IF (myid.eq.0) THEN
     PRINT *,'Program execution complete'
  END IF
  CALL MPI_FINALIZE(ierr)         

END PROGRAM main

f90 适配器粘合代码中的代码更改

[编辑 | 编辑源代码]

 

 

 

 

( H)
  • 接受当前铅笔的子范围。

代码下载

! Fortran module for interacting with the ParaView CoProcessor
! loosely based on: 
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaadaptor.f
! -- Changed for SESE

! Subroutine determines if coprocessing needed during the
! current simulation step or not, and if so, calls the coprocessor.
! Some of the subroutines are supplied by ParaView's FortranAdaptorAPI,
! others have to be supplied by the programmer.      
module NSadaptor_module 
  use iso_c_binding
  implicit none
  public
  interface NSadaptor 
    module procedure NSadaptor3D 
  end interface NSadaptor 
contains
  subroutine NSadaptor3D(nx, ny, nz, xst, xen, yst, yen, zst, zen, &
                                     step, time, omegax, omegay, omegaz)
    ! nx, ny, nz     -- grid dimensions of entire mesh
    !                   used for setting whole extent
    ! xst, xen, etc. -- extents of current subdomain pencil
    ! step           -- current simulation time step
    ! time           -- current simulation time
    ! omega*         -- scalar arrays for the current time step
    integer, intent(in) :: nx, ny, nz, xst, xen, yst, yen, zst, zen, step
    real(kind=8), intent(in) :: time
    real(kind=8), dimension(:,:,:), intent (in) :: omegax, omegay, omegaz
    integer :: flag

    ! check if processing this time step
    ! defined in FortranAdaptorAPI.h
    call requestdatadescription(step, time, flag)
        
    if (flag /= 0) then
       ! processing requested
       ! check if need to create grid
       ! defined in FortranAdaptorAPI.h
       call needtocreategrid(flag)
       
       if (flag /= 0) then
          ! grid needed
          ! defined in VTKCellBasedDataSet.cxx
          ! takes the size of the entire grid, and the extents of the
          ! sub grid.
          call createcpimagedata(nx, ny, nz, xst, xen, yst, yen, zst, zen)
       end if
          
       ! defined in VTKCellBasedDataSet.cxx 
       ! call for each field of interest. Be sure to null-terminate the
       ! string for C++!
       call addfield(omegax, "realtempx"//char(0))
       call addfield(omegay, "realtempy"//char(0))
       call addfield(omegaz, "realtempz"//char(0))       


       ! defined in FortranAdaptorAPI.h
       call coprocess()
    end if
    
    return
  end subroutine NSadaptor3D 
end module NSadaptor_module

C++ 包装代码中的代码更改

[编辑 | 编辑源代码]

 

 

 

 

( I)
  • 更改了范围的计算方式。
  • 将点数据调用更改为单元格数据调用。

代码下载

// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx


// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h" 

// CoProcessor specific headers
// Routines that take the place of VTK dataset object creation.
// Called from Fortran code which also calls the Fortran Adaptor API
// supplied with ParaView source.
// Note: names mangled with trailing underscores for linker visibility.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"
#include "vtkCellData.h"
#include <string>

// These will be called from the Fortran adaptor "glue" code.
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.

// Creates the data container for the CoProcessor.
// Takes the extents for both the global dataset and the particular subsection
// visible to the current MPI rank.
// Note: expects to receive Fortran base-1 indices.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen,
	int* yst, int* yen, int* zst, int* zen)
{
  if (!ParaViewCoProcessing::GetCoProcessorData()) {
    vtkGenericWarningMacro("Unable to access CoProcessorData.");
    return;
  }

  // The simulation grid is a 3-dimensional topologically and geometrically 
  // regular grid. In VTK/ParaView, this is considered an image data set.
  vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
  // Indexing based on change from F90 to C++, and also from nodal to cellular.
  img->SetExtent(*xst - 1, *xen, *yst - 1, *yen, *zst - 1, *zen); 
  img->SetSpacing( 1.0 / *nx, 1.0 / *ny, 1.0 / *nz);
 
  ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
  ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx, 0, *ny, 0, *nz);
}

// Add field to the data container.
// Separate from above because this could be dynamic, grid is static.
// Might be an issue, VTK assumes row major C storage.
// Underscore is by-hand name mangling for fortran linking.
// Note: Expects the address of the data, has no way of determining
// if the array is densely packed or not.
// Note 2: Assumes "name" points to null-terminated array of chars.
// Easiest way to do that is to concatenate a terminal NULL in caller. 
extern "C" void addfield_(double* a, char* name)
{

  vtkSmartPointer<vtkCPInputDataDescription> idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
  vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());

  if (!img) {
    vtkGenericWarningMacro("No adaptor grid to attach field data to.");
    return;
  }

  if (idd->IsFieldNeeded(name)) {
    vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
    field->SetName(name);
    field->SetArray(a, img->GetNumberOfCells(), 1); 
    img->GetCellData()->AddArray(field);
  }
}
  • 最小的代码更改,模拟代码中没有更改(除了协处理调用)。
  • 不需要光晕或打包数组。
  • 应该可以在无需更改的情况下使用不同的分解库。
  • 需要在 ParaView 客户端中进行额外处理,以处理单元格渲染和点渲染之间的差异。
  • 不能与需要幽灵单元格的 VTK/ParaView 过滤器一起使用。

在 ParaView 客户端中创建管道的其他注意事项

[编辑 | 编辑源代码]

使用 Fortran 复数类型

[编辑 | 编辑源代码]

C++ 没有本机复数数据类型。这需要在 ParaView 客户端中进行一些额外的处理

  1. 以减少的网格大小运行原始的非协处理模拟。
  2. 使用 RAW 二进制阅读器在 ParaView 客户端中打开此数据。
  3. 将标量分量数设置为 2
  4. 创建管道,实部和虚部的默认名称分别是 ImageFile_X 和 ImageFile_Y。

 

 

 

 

( J)

有关将复数传递给协处理器的详细信息,请参见 NLSadaptor.f90 的代码。 代码下载

! Fortran module for interacting with the ParaView CoProcessor
! loosely based on: 
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaadaptor.f
! -- Changed for SESE

! Subroutine determines if coprocessing needed during the
! current simulation step or not, and if so, calls the coprocessor.
! Some of the subroutines are supplied by ParaView's FortranAdaptorAPI,
! others have to be supplied by the programmer.      
module NLSadaptor_module 
  use iso_c_binding
  implicit none
  public
  ! using an interface incase need arises to overload 1D, 2D versions
  interface NLSadaptor 
    module procedure NLSadaptor3D 
  end interface NLSadaptor 
contains
  subroutine NLSadaptor3D(nx, ny, nz, xst, xen, yst, yen, zst, zen, &
                                     step, time, a)
    ! nx, ny, nz     -- grid dimensions of entire mesh
    !                   used for setting whole extent
    ! xst, xen, etc. -- extents of current subdomain pencil
    ! step           -- current simulation time step
    ! time           -- current simulation time
    ! a              -- scalar array for the current time step
    ! flag           -- receives status from API calls
    integer, intent(in) :: nx, ny, nz, xst, xen, yst, yen, zst, zen, step
    real(kind=8), intent(in) :: time
    complex(kind=8), dimension(:,:,:), intent (in) :: a 
    integer :: flag

    flag = 0
    ! check if processing this time step
    ! defined in FortranAdaptorAPI.h
    call requestdatadescription(step, time, flag)
        
    if (flag /= 0) then
       ! processing requested
       ! check if need to create grid
       ! defined in FortranAdaptorAPI.h
       call needtocreategrid(flag)
       
       if (flag /= 0) then
          ! grid needed
          ! defined in adaptor.cxx
          ! takes the size of the entire grid, and the extents of the
          ! sub grid.
          call createcpimagedata(nx, ny, nz, xst, xen, yst, yen, zst, zen)
       end if
          
       ! defined in adaptor.cxx
       ! Be sure to null-terminate the
       ! string for C++!
       call addfield(a, "u_complex"//char(0))

       ! defined in FortranAdaptorAPI.h
       call coprocess()
    end if
    
    return
  end subroutine NLSadaptor3D 
end module NLSadaptor_module

 

 

 

 

 

( K )

这也需要在附带的 C++ 文件中进行一些小的代码更改。 代码下载

// 适配器,用于将 fortran 模拟代码输入 ParaView 协处理。 // 基于 ParaView 发行版中的 PhastaAdaptor 示例。 // ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx


// Fortran 特定标头 // ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/

  1. include "FortranAdaptorAPI.h"

// 协处理特定标头 // 替代 VTK 数据集对象创建的例程。 // 从 Fortran 代码调用,该代码还调用与 ParaView 源代码一起提供的 Fortran Adaptor API // 注意:名称与尾部下划线混合,用于链接器可见性。

  1. include "vtkCPDataDescription.h"
  2. include "vtkCPInputDataDescription.h"
  3. include "vtkCPProcessor.h"
  4. include "vtkDoubleArray.h"
  5. include "vtkPointData.h"
  6. include "vtkSmartPointer.h"
  7. include "vtkImageData.h"
  8. include "vtkCellData.h"
  9. include <string>

// 这些将从 Fortran “粘合” 代码调用 // 完全依赖于数据布局,结构化与非结构化,等等 // 因为 VTK/ParaView 对每个布局使用不同的内部布局。

// 为协处理器创建数据容器。 // 获取全局数据集和当前 MPI 秩可见的特定子部分的范围 // 注意:预期接收 Fortran 基于 1 的索引。 extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen, int* yst, int* yen, int* zst, int* zen) {

 if (!ParaViewCoProcessing::GetCoProcessorData()) {
   vtkGenericWarningMacro("Unable to access CoProcessorData.");
   return;
 }
 // The simulation grid is a 3-dimensional topologically and geometrically 
 // regular grid. In VTK/ParaView, this is considered an image data set.
 vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
 // For Fortram complex, need to set that there are 2 scalar components
 img->SetNumberOfScalarComponents(2);
 // Indexing based on change from F90 to C++, and also from nodal to cellular.
 // Extents are given in terms of nodes, not cells.
 img->SetExtent(*xst - 1, *xen, *yst - 1, *yen, *zst - 1, *zen); 
 
 // Setting spacing is important so that the camera position in the pipeline makes
 // sense if using different sized meshes between setting up the pipeline and running
 // the simulation. Origin can often be ignored.
 img->SetSpacing( 1.0 / *nx, 1.0 / *ny, 1.0 / *nz); // considering passing in as args. 
 ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
 ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx, 0, *ny, 0, *nz);

}

// 向数据容器添加字段。 // 与上述分开,因为这可能是动态的,网格是静态的。 // 可能会出现问题,VTK 假设行主序 C 存储。 // 下划线是用于 fortran 链接的手动名称混合。 // 注意:预期接收数据的地址,无法确定 // 数组是否密集打包。 // 注意 2:假设 “name” 指向以 null 结尾的字符数组。 // 最简单的实现方式是在调用者中进行连接。 extern "C" void addfield_(double* a, char* name) {

 vtkSmartPointer<vtkCPInputDataDescription> idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
 vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());
 if (!img) {
   vtkGenericWarningMacro("No adaptor grid to attach field data to.");
   return;
 }
 if (idd->IsFieldNeeded(name)) {
   vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
   field->SetNumberOfComponents(2);
   field->SetName(name);
   field->SetArray(a, 2* img->GetNumberOfCells(), 1); 
   img->GetCellData()->AddArray(field);
 }

}

使用单元格数据

[编辑 | 编辑源代码]
  • 某些 ParaView 过滤器和表示形式仅适用于点数据,例如体积渲染。使用单元格数据到点数据过滤器进行转换。请务必选中选项,以将单元格数据通过过滤器传递。如果模拟运行没有错误并产生没有输出,则可能的罪魁祸首是在创建管道时未能选中该选项。

vtkKDTreeGenerator 警告

[编辑 | 编辑源代码]
  • 使用体积渲染的管道将生成有关区域 ID 为 0 的警告。通常可以忽略这些警告。

ParaView 协处理资源

[编辑 | 编辑源代码]
华夏公益教科书