并行谱数值方法/使用ParaView协同处理进行可视化
ParaView协同处理插件允许应用程序代码被检测以连接到ParaView服务器,以执行可视化管道。管道可以生成各种格式的图像或VTK XML并行文件格式数据集。
用户使用一个示例(可能分辨率较低)数据集在ParaView客户端中创建一个管道。然后,该管道使用协同处理插件导出为Python脚本,该脚本将由应用程序加载。
应用程序开发人员还需要编写一些Fortran/C++代码,将应用程序的数据结构转换为ParaView可以理解的格式。对于完全规则的网格,这是很简单的。
第一个要求是拥有一个可以导出Python脚本的ParaView客户端。截至ParaView 3.14.1,这需要从源代码构建。 标准说明适用于一些关于构建 脚本生成插件的额外说明。构建完成后,第一次启动ParaView时,转到工具,管理插件,并将协同处理插件设置为在启动时加载。加载插件后,应该有两个新的菜单项:写入器和协同处理
- 可能需要安装CMake
- 可能需要构建Qt
- 可能需要关闭构建Manta插件
以下说明适用于2012年夏季在NCSA的Forge上测试的Navier-Stokes CUDA Fortran。
|
(
) |
navierstokes.cuf添加了4行并删除了.datbin写入。 代码下载
!--------------------------------------------------------------------
!
! PURPOSE
!
! This program numerically solves the 2D incompressible Navier-Stokes
! on a Square Domain [0,1]x[0,1] using pseudo-spectral methods and
! Crank-Nicolson timestepping. The numerical solution is compared to
! the exact Taylor-Green Vortex Solution.
!
! Periodic free-slip boundary conditions and Initial conditions:
! u(x,y,0)=sin(2*pi*x)cos(2*pi*y)
! v(x,y,0)=-cos(2*pi*x)sin(2*pi*y)
! Analytical Solution (subscript denote derivatives):
! u(x,y,t)=sin(2*pi*x)cos(2*pi*y)exp(-8*pi^2*t/Re)
! v(x,y,t)=-cos(2*pi*x)sin(2*pi*y)exp(-8*pi^2*t/Re)
! u_y(x,y,t)=-2*pi*sin(2*pi*x)sin(2*pi*y)exp(-8*pi^2*t/Re)
! v_x(x,y,t)=2*pi*sin(2*pi*x)sin(2*pi*y)exp(-8*pi^2*t/Re)
! omega=v_x-u_y
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! nplots = number of plots produced
! plotgap = number of timesteps inbetween plots
! Re = dimensionless Renold's number
! ReInv = 1/Re for optimization
! dt = timestep size
! dtInv = 1/dt for optimization
! tol = determines when convergences is reached
! scalemodes = 1/(Nx*Ny), scaling after preforming FFTs
! numthreads = number of CPUs used in calculation
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! time = times at which data is saved
! chg = error at each iteration
! temp = used for ordering saved omega
! .. Arrays (gpu) ..
! u_d = velocity in x direction
! uold_d = velocity in x direction at previous timestep
! v_d = velocity in y direction
! vold_d = velocity in y direction at previous timestep
! omeg_d = vorticity in real space
! omeghat_d = 2D Fourier transform of vorticity
! at next iterate
! omegoldhat_d = 2D Fourier transform of vorticity at previous
! iterate
! nloldhat_d = nonlinear term in Fourier space
! at previous iterate
! omegexact_d = taylor-green vorticity at
! at final step
! psihat_d = 2D Fourier transform of streamfunction
! at next iteration
! omegcheck_d = store of vorticity at previous iterate
! temp1_d/temp2_d = reusable complex/real space used for
! calculations. This reduces number of
! arrays stored.
! .. Vectors (gpu) ..
! kx_d = fourier frequencies in x direction
! ky_d = fourier frequencies in y direction
! x_d = x locations
! y_d = y locations
! name_config = array to store filename for data to be saved
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! This program has not been fully optimized.
!--------------------------------------------------------------------
module precision
! Precision control
integer, parameter, public :: Single = kind(0.0) ! Single precision
integer, parameter, public :: Double = kind(0.0d0) ! Double precision
integer, parameter, public :: fp_kind = Double
!integer, parameter, public :: fp_kind = Single
end module precision
module cufft
integer, public :: CUFFT_FORWARD = -1
integer, public :: CUFFT_INVERSE = 1
integer, public :: CUFFT_R2C = Z'2a' ! Real to Complex (interleaved)
integer, public :: CUFFT_C2R = Z'2c' ! Complex (interleaved) to Real
integer, public :: CUFFT_C2C = Z'29' ! Complex to Complex, interleaved
integer, public :: CUFFT_D2Z = Z'6a' ! Double to Double-Complex
integer, public :: CUFFT_Z2D = Z'6c' ! Double-Complex to Double
integer, public :: CUFFT_Z2Z = Z'69' ! Double-Complex to Double-Complex
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
! cufftPlan2d(cufftHandle *plan, int nx,int ny, cufftType type)
!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
interface cufftPlan2d
subroutine cufftPlan2d(plan, nx, ny, type) bind(C,name='cufftPlan2d')
use iso_c_binding
integer(c_int):: plan
integer(c_int),value:: nx, ny, type
end subroutine cufftPlan2d
end interface
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
! cufftDestroy(cufftHandle plan)
!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
interface cufftDestroy
subroutine cufftDestroy(plan) bind(C,name='cufftDestroy')
use iso_c_binding
integer(c_int),value:: plan
end subroutine cufftDestroy
end interface
interface cufftExecD2Z
subroutine cufftExecD2Z(plan, idata, odata) &
& bind(C,name='cufftExecD2Z')
use iso_c_binding
use precision
integer(c_int), value :: plan
real(fp_kind), device :: idata(1:nx,1:ny)
complex(fp_kind),device :: odata(1:nx,1:ny)
end subroutine cufftExecD2Z
end interface
interface cufftExecZ2D
subroutine cufftExecZ2D(plan, idata, odata) &
& bind(C,name='cufftExecZ2D')
use iso_c_binding
use precision
integer(c_int),value:: plan
complex(fp_kind),device:: idata(1:nx,1:ny)
real(fp_kind),device :: odata(1:nx,1:ny)
end subroutine cufftExecZ2D
end interface
end module cufft
PROGRAM main
use precision
use cufft
! coprocessing:
use ns2dcnadaptor
! declare variables
IMPLICIT NONE
INTEGER(kind=4), PARAMETER :: Nx=1024
INTEGER(kind=4), PARAMETER :: Ny=1024
!INTEGER(kind=8) ::
REAL(fp_kind), PARAMETER :: dt=0.00025d0
REAL(fp_kind), PARAMETER :: dtInv=1.0d0/REAL(dt,kind(0d0))
REAL(fp_kind), PARAMETER &
:: pi=3.14159265358979323846264338327950288419716939937510d0
REAL(fp_kind), PARAMETER :: Re=5000.0d0
REAL(fp_kind), PARAMETER :: ReInv=1.0d0/REAL(Re,kind(0d0))
REAL(fp_kind), PARAMETER :: tol=0.1d0**10
!coprocessing: temp is used as an int, runtime crashes on
! floating temp used in write format string on line 376
!REAL(fp_kind) :: scalemodes,chg,temp=10000000
real(fp_kind) :: scalemodes, chg
integer(kind=4) :: temp=100000000
INTEGER(kind=4), PARAMETER :: nplots=400, plotgap=200
REAL(fp_kind),DIMENSION(:), ALLOCATABLE :: x,y
REAL(fp_kind),DIMENSION(:,:), ALLOCATABLE :: omeg,omegexact
INTEGER(kind=4) :: i,j,n,t, AllocateStatus
INTEGER(kind=4) :: planz2d,pland2z, kersize
!variables used for saving data and timing
INTEGER(kind=4) :: start, finish, count_rate,count, iol
CHARACTER*100 :: name_config
! Declare variables for GPU
REAL(fp_kind), DEVICE, DIMENSION(:,:), ALLOCATABLE :: u_d,v_d,&
omegcheck_d,omeg_d,&
nl_d, temp2_d, omegexact_d
COMPLEX(fp_kind), DEVICE, DIMENSION(:,:), ALLOCATABLE :: omegoldhat_d, nloldhat_d,&
omeghat_d, nlhat_d, psihat_d,&
temp1_d
COMPLEX(fp_kind), DEVICE, DIMENSION(:), ALLOCATABLE :: kx_d,ky_d
REAL(kind=8),DEVICE, DIMENSION(:), ALLOCATABLE :: x_d,y_d
kersize=min(Nx,256)
PRINT *,'Program starting'
PRINT *,'Grid:',Nx,'X',Ny
PRINT *,'dt:',dt
ALLOCATE(x(1:Nx),y(1:Ny),omeg(1:Nx,1:Ny),omegexact(1:Nx,1:Ny),&
stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Allocated CPU arrays'
ALLOCATE(kx_d(1:Nx/2+1),ky_d(1:Ny),x_d(1:Nx),y_d(1:Ny),u_d(1:Nx,1:Ny),&
v_d(1:Nx,1:Ny),omeg_d(1:Nx,1:Ny),&
omegexact_d(1:Nx,1:Ny),temp2_d(1:Nx,1:Ny),&
omegoldhat_d(1:Nx/2+1,1:Ny),nloldhat_d(1:Nx/2+1,1:Ny),&
omegcheck_d(1:Nx,1:Ny),omeghat_d(1:Nx/2+1,1:Ny),nl_d(1:Nx,1:Ny),&
nlhat_d(1:Nx/2+1,1:Ny), psihat_d(1:Nx/2+1,1:Ny),temp1_d(1:Nx/2+1,1:Ny),&
stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Allocated GPU arrays'
CALL cufftPlan2D(pland2z,nx,ny,CUFFT_D2Z)
CALL cufftPlan2D(planz2d,nx,ny,CUFFT_Z2D)
PRINT *,'Setup FFTs'
! setup fourier frequencies
!$cuf kernel do <<< *,* >>>
DO i=1,Nx/2+1
kx_d(i)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(i-1,kind=fp_kind)
END DO
kx_d(1+Nx/2)=0.0d0
!$cuf kernel do <<< *,* >>>
DO i=1,Nx
x_d(i)=REAL(i-1,kind(0d0))/REAL(Nx,kind=fp_kind)
END DO
!$cuf kernel do <<< *,* >>>
DO j=1,Ny/2+1
ky_d(j)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(j-1,kind=fp_kind)
END DO
ky_d(1+Ny/2)=0.0d0
!$cuf kernel do <<< *,* >>>
DO j = 1,Ny/2 -1
ky_d(j+1+Ny/2)=-ky_d(1-j+Ny/2)
END DO
!$cuf kernel do <<< *, * >>>
DO j=1,Ny
y_d(j)=REAL(j-1,kind(0d0))/REAL(Ny,kind=fp_kind)
END DO
scalemodes=1.0d0/REAL(Nx*Ny,kind=fp_kind)
PRINT *,'Setup grid and fourier frequencies'
!initial data
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
u_d(i,j)=sin(2.0d0*pi*x_d(i))*cos(2.0d0*pi*y_d(j))
END DO
END DO
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
v_d(i,j)=-cos(2.0d0*pi*x_d(i))*sin(2.0d0*pi*y_d(j))
END DO
END DO
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
omeg_d(i,j)=4.0d0*pi*sin(2.0d0*pi*x_d(i))*sin(2.0d0*pi*y_d(j))+0.01d0*cos(2.0d0*pi*y_d(j))
END DO
END DO
CALL cufftExecD2Z(pland2z,omeg_d,omeghat_d)
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=omeghat_d(i,j)*kx_d(i)
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
!first part nonlinear term in real space
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
nl_d(i,j)=u_d(i,j)*temp2_d(i,j)
END DO
END DO
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=omeghat_d(i,j)*ky_d(j)
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
nl_d(i,j)=(nl_d(i,j)+v_d(i,j)*temp2_d(i,j))*scalemodes
END DO
END DO
CALL cufftExecD2Z(pland2z,nl_d,nlhat_d)
omegcheck_d=omeg_d
PRINT *,'Got initial data, starting timestepping'
CALL system_clock(start,count_rate)
! coprocessing: simulation loop
! coprocessing: before entering, initialze the ParaView coprocessor
! coprocessing: this is from FortranAdaptorAPI.cxx
! coprocessing: it takes the name of the pipeline script you created
! coprocessing: in the ParaView client and the name's length in chars
call coprocessorinitialize("pipeline.py", 11 )
DO t=1,nplots
DO n=1,plotgap
chg=1.0d0
nloldhat_d=nlhat_d
omegoldhat_d=omeghat_d
DO WHILE (chg>tol)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
omeghat_d(i,j)=((dtInv+0.5d0*ReInv*(kx_d(i)*kx_d(i)+ky_d(j)*ky_d(j)))&
*omegoldhat_d(i,j) - 0.5d0*(nloldhat_d(i,j)+nlhat_d(i,j)))&
/(dtInv-0.5d0*ReInv*(kx_d(i)*kx_d(i)+ky_d(j)*ky_d(j)))
END DO
END DO
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
psihat_d(i,j)=-omeghat_d(i,j)/(kx_d(i)*kx_d(i)+ky_d(j)*ky_d(j)+0.10d0**14)
END DO
END DO
CALL cufftExecZ2D(planz2d,omeghat_d,omeg_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=-psihat_d(i,j)*kx_d(i)*scalemodes
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,v_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=psihat_d(i,j)*ky_d(j)*scalemodes
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,u_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=omeghat_d(i,j)*kx_d(i)
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx
nl_d(i,j)=u_d(i,j)*temp2_d(i,j)
END DO
END DO
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=omeghat_d(i,j)*ky_d(j)
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx
nl_d(i,j)=(nl_d(i,j)+v_d(i,j)*temp2_d(i,j))*scalemodes
END DO
END DO
CALL cufftExecD2Z(pland2z,nl_d,nlhat_d)
chg=0.0d0
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx
chg=chg+(omeg_d(i,j)-omegcheck_d(i,j))*(omeg_d(i,j)-omegcheck_d(i,j))&
*scalemodes*scalemodes
END DO
END DO
omegcheck_d=omeg_d
END DO
END DO
PRINT *, t*plotgap*dt
omeg=omeg_d
!temp=temp+1
!write(name_config,'(a,i0,a)') 'omega',temp,'.datbin'
!INQUIRE(iolength=iol) omeg(1,1)
!OPEN(unit=11,FILE=name_config,form="unformatted", access="direct",recl=iol)
!count = 1 !coprocessing - there's an intrinsic array function this shadows
!DO j=1,Ny
! DO i=1,Nx
! WRITE(11,rec=count) omeg(i,j)*scalemodes
! count=count+1
! END DO
!END DO
!CLOSE(11)
! coprocessing: do the coprocessing at the end of the sim loop
! grid dims, time step, time, scalar array
call navierStokesCoprocessor(Nx, Ny, 1, t, t*dt ,omeg)
END DO
! coprocessing: clean up
call coprocessorfinalize()
CALL system_clock(finish,count_rate)
PRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),&
'for Time stepping'
! Copy results back to host
x=x_d
y=y_d
!!!!!!!!!!!!!!!!!!!!!!!!
!copy over data to disk!
!!!!!!!!!!!!!!!!!!!!!!!!
name_config = 'xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)
name_config = 'ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
!!!!!!!!!!!!!!!!!!!!!!!!
CALL cufftDestroy(planz2d)
CALL cufftDestroy(pland2z)
PRINT *,'Destroyed fft plan'
DEALLOCATE(x,y,omeg,omegexact,stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Deallocated CPU memory'
DEALLOCATE(kx_d,ky_d,x_d,y_d,u_d,&
v_d, omegcheck_d, omeg_d,omegoldhat_d,&
omegexact_d, nloldhat_d,omeghat_d,nl_d, nlhat_d,&
temp1_d,temp2_d, psihat_d,stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Deallocated GPU memory'
PRINT *,'Program execution complete'
END PROGRAM main
|
(
) |
! fortran module for interacting with the ParaView CoProcessor
! loosely based on:
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaadaptor.f
! -- Changed for SESE
! subroutine determines if coprocessing needed during the
! current simulation step or not.
module ns2dcnadaptor
implicit none
public
contains
subroutine navierstokescoprocessor(nx, ny, nz, step, time, omeg)
! nx, ny, nz -- grid dimensions, nz will be 1 since this is a 2D problem.
! step -- current simulation time step
! time -- current simulation time
! omega -- scalar array for the current time step
integer, intent(in) :: nx, ny, nz, step
real(kind=8), intent(in) :: time
real(kind=8), dimension(:,:), intent (in) :: omeg
integer :: flag
! check if processing this time step
! defined in FortranAdaptorAPI.h
call requestdatadescription(step, time, flag)
if (flag .ne. 0) then
! processing requested
! check if need to create grid
! defined in FortranAdaptorAPI.h
call needtocreategrid(flag)
if (flag .ne. 0) then
! grid needed
! defined in ns2dcnVTKDataSet.cxx
call createcpimagedata(nx, ny, nz)
end if
! defined in ns2dcnVTKDataSet.cxx
call addfield(omeg)
! defined in FortranAdaptorAPI.h
call coprocess()
end if
return
end subroutine navierstokescoprocessor
end module ns2dcnadaptor
|
(
) |
定义将模拟数据放入ParaView/VTK库可以理解的格式的函数。对于完全规则的网格来说非常简单。扩展到3D很简单。非结构化网格将需要更多努力。 代码下载
// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx
// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h"
// CoProcessor specific headers
// ParaView-3.14.1-Source/CoProcessing/CoProcessor/
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
// ParaView-3.14.1-Source/VTK/Filtering/vtkImageData.h
#include "vtkImageData.h"
// These will be called from the Fortran "glue" code"
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.
// Creates the data container for the CoProcessor.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz)
{
if (!ParaViewCoProcessing::GetCoProcessorData()) {
vtkGenericWarningMacro("Unable to access CoProcessorData.");
return;
}
// The simulation grid is a 2-dimensional topologically and geometrically
// regular grid. In VTK/ParaView, this is considered an image data set.
vtkImageData* Grid = vtkImageData::New();
// assuming dimZ == 1 for now
Grid->SetDimensions(*nx, *ny, *nz);
// Setting the Origin and Spacing are also options.
// Name should be consistent between here, Fortran and Python client script.
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(Grid);
}
// Add field(s) to the data container.
// Separate from above because this will be dynamic, grid is static.
// Might be an issue, VTK probably assumes row major, but
// omeg probably passed column major...
// by hand name mangling for fortran
extern "C" void addfield_(double* scalars)
{
vtkCPInputDataDescription *idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
vtkImageData* Image = vtkImageData::SafeDownCast(idd->GetGrid());
if (!Image) {
vtkGenericWarningMacro("No adaptor grid to attach field data to.");
return;
}
// field name must match that in the fortran code.
if (idd->IsFieldNeeded("omeg")) {
vtkDoubleArray* omega = vtkDoubleArray::New();
omega->SetName("omeg");
omega->SetArray(scalars, Image->GetNumberOfPoints(), 1);
Image->GetPointData()->AddArray(omega);
omega->Delete();
}
}
脚本管道是通过使用协同处理脚本生成器插件在ParaView客户端中生成的。
- 文件 > 打开 > 浏览到数据集,该文件现在应该在管道浏览器窗口中可见。
- 在属性窗口中,单击应用,数据现在应该出现在布局窗口中。
- 要更改颜色映射,请单击工具栏上带有绿色圆圈的彩虹按钮。
- 要在布局中显示颜色映射,请单击工具栏上垂直彩虹的按钮。
- 在工具 > 锁定视图大小自定义中设置视窗的大小。这将帮助你为最终图像排列项目。记下尺寸,稍后你需要它们来解决一个错误。
- 如果要写出VTK数据集,请转到写入器 > 并行图像数据写入器,这应该会在管道中添加一个写入器。
- 要导出python管道,请转到协同处理 > 导出状态,这应该会启动一个向导。
- 单击下一步。
- 选择你想要连接到你的模拟的数据集,添加它,单击下一步。
- 默认的模拟名称为“输入”。如果你更改它,请务必更新ns2dcnVTKDataSet.cxx,单击下一步。
- 检查是否要输出图像。单击完成。
- 保存文件。文件名将在navierstokes.cuf中引用,因此请保存为类似pipeline-test-??.py的名称,然后将其复制到pipeline.py,以避免为新的管道重新编译。
如果管道设置为输出图像,那么脚本中可能存在两个问题,可以视为错误。
- 无论在客户端中设置了什么视口大小,脚本都不会使用它。
- 颜色映射标签将以一个包含它们的框绘制。
第一个问题的解决方法是在客户端中,向ViewSize方法调用添加与视口窗口尺寸相同的参数。第二个问题只需要简单的编辑。这两个问题都在示例pipeline.py中用#mvm注释标记了。
Forge 要求加载 ParaView OSMesa 模块。$ module load paraview/3.14.1-OSMesa。在 Nautilus 上,该模块为 paraview/3.14.1。
按照运行未经检测版本的程序的方式运行。
有许多方法可以做到这一点,如果 ffmpeg 可用,则可以使用 ffmpeg -sameq -i path/to/images/image%04d.png myanimation.mpg。
或者,如果将 VTK 数据集写入文件,则可以直接在 ParaView 中使用它们。
流畅视频约为 30 fps 或 dt = 0.033… 如果将较大 dt 写入 VTK 数据集,则可以在 ParaView 中进行线性时间插值。如果将较大 dt 写入图像并将其拼接成 mpg,则可用选项会大大降低视频质量。
- Forge 没有 X 支持,因此需要构建离屏 Mesa (OSMesa)。
- Nautilus 支持 X,因此可以使用提供的 OpenGL 库进行渲染。
自 ParaView 3.14.1 以来,协同处理 API 有一些细微的更改。以下内容适用于 ParaView 3.98+ 和 Catalyst 1.0 alpha+。这些更改已于 2013 年 8 月 20 日在 NICS 的 Beacon 上通过 offload 模式进行了测试。
代码更改如下:
- 在模拟代码中,coprocessorinitialize 现在为 coprocessorinitializewithpython,参数相同。
- 在 C++ 数据集辅助代码中,包含 vtkCPPythonAdaptorAPI.h,而不是 FortranAdaptorAPI.h。
- 在 C++ 数据集辅助代码中,从 ParaViewCoprocessing 命名空间更改为 vtkCPPythonAdaptorAPI 命名空间。
要链接的库也已更改,对于 ParaView 4.0.1 版本,它们为:
- -lvtkPVCatalystPython26D-pv4.0
- -lvtkPVCatalyst-pv4.0
- -lvtkCommonCore-pv4.0
- -lvtkCommonDataModel-pv4.0
- -lvtkPVPythonCatalyst-pv4.0
|
(
) |
!--------------------------------------------------------------------
!
!
! PURPOSE
!
! This program numerically solves the 2D incompressible Navier-Stokes
! on a Square Domain [0,1]x[0,1] using pseudo-spectral methods and
! Crank-Nicolson timestepping. The numerical solution is compared to
! the exact Taylor-Green Vortex Solution.
!
! Periodic free-slip boundary conditions and Initial conditions:
! u(x,y,0)=sin(2*pi*x)cos(2*pi*y)
! v(x,y,0)=-cos(2*pi*x)sin(2*pi*y)
! Analytical Solution:
! u(x,y,t)=sin(2*pi*x)cos(2*pi*y)exp(-8*pi^2*nu*t)
! v(x,y,t)=-cos(2*pi*x)sin(2*pi*y)exp(-8*pi^2*nu*t)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! mu = viscosity
! rho = density
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! count = keep track of information written to disk
! iol = size of array to write to disk
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfx = Forward 1d fft plan in x
! planbx = Backward 1d fft plan in x
! planfy = Forward 1d fft plan in y
! planby = Backward 1d fft plan in y
! dt = timestep
! .. Arrays ..
! u = velocity in x direction
! uold = velocity in x direction at previous timestep
! v = velocity in y direction
! vold = velocity in y direction at previous timestep
! u_y = y derivative of velocity in x direction
! v_x = x derivative of velocity in y direction
! omeg = vorticity in real space
! omegold = vorticity in real space at previous
! iterate
! omegcheck = store of vorticity at previous iterate
! omegoldhat = 2D Fourier transform of vorticity at previous
! iterate
! omegoldhat_x = x-derivative of vorticity in Fourier space
! at previous iterate
! omegold_x = x-derivative of vorticity in real space
! at previous iterate
! omegoldhat_y = y-derivative of vorticity in Fourier space
! at previous iterate
! omegold_y = y-derivative of vorticity in real space
! at previous iterate
! nlold = nonlinear term in real space
! at previous iterate
! nloldhat = nonlinear term in Fourier space
! at previous iterate
! omeghat = 2D Fourier transform of vorticity
! at next iterate
! omeghat_x = x-derivative of vorticity in Fourier space
! at next timestep
! omeghat_y = y-derivative of vorticity in Fourier space
! at next timestep
! omeg_x = x-derivative of vorticity in real space
! at next timestep
! omeg_y = y-derivative of vorticity in real space
! at next timestep
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! kxx = square of fourier frequencies in x direction
! kyy = square of fourier frequencies in y direction
! x = x locations
! y = y locations
! time = times at which save data
! name_config = array to store filename for data to be saved
! fftfx = array to setup x Fourier transform
! fftbx = array to setup y Fourier transform
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! This program has not been optimized to use the least amount of memory
! but is intended as an example only for which all states can be saved
!--------------------------------------------------------------------
! External routines required
!
! External libraries required
! FFTW3 -- Fast Fourier Transform in the West Library
! (http://www.fftw.org/)
! declare variables
PROGRAM main
use nsadaptor_glue
IMPLICIT NONE
INTEGER(kind=4), PARAMETER :: Nx=256
INTEGER(kind=4), PARAMETER :: Ny=256
REAL(kind=8), PARAMETER :: dt=0.00125
REAL(kind=8), PARAMETER &
:: pi=3.14159265358979323846264338327950288419716939937510
REAL(kind=8), PARAMETER :: rho=1.0d0
REAL(kind=8), PARAMETER :: mu=1.0d0
REAL(kind=8), PARAMETER :: tol=0.1d0**10
REAL(kind=8) :: chg
INTEGER(kind=4), PARAMETER :: nplots=50
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: time
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE :: kx,kxx
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE :: ky,kyy
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: x
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: y
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE :: &
u,uold,v,vold,u_y,v_x,omegold, omegcheck, omeg,&
omegoldhat, omegoldhat_x, omegold_x,&
omegoldhat_y, omegold_y, nlold, nloldhat,&
omeghat, omeghat_x, omeghat_y, omeg_x, omeg_y,&
nl, nlhat, psihat, psihat_x, psi_x, psihat_y, psi_y
! added for coprocessing
real(kind=8), dimension(:,:), allocatable :: dump
REAL(kind=8),DIMENSION(:,:), ALLOCATABLE :: uexact_y,vexact_x,omegexact
INTEGER(kind=4) :: i,j,k,n, allocatestatus, count, iol
INTEGER(kind=4) :: start, finish, count_rate
INTEGER(kind=4), PARAMETER :: FFTW_IN_PLACE = 8, FFTW_MEASURE = 0, &
FFTW_EXHAUSTIVE = 8, FFTW_PATIENT = 32, &
FFTW_ESTIMATE = 64
INTEGER(kind=4),PARAMETER :: FFTW_FORWARD = -1, FFTW_BACKWARD=1
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE :: fftfx,fftbx
INTEGER(kind=8) :: planfxy,planbxy
CHARACTER*100 :: name_config
CALL system_clock(start,count_rate)
ALLOCATE(time(1:nplots),kx(1:Nx),kxx(1:Nx),ky(1:Ny),kyy(1:Ny),x(1:Nx),y(1:Ny),&
u(1:Nx,1:Ny),uold(1:Nx,1:Ny),v(1:Nx,1:Ny),vold(1:Nx,1:Ny),u_y(1:Nx,1:Ny),&
v_x(1:Nx,1:Ny),omegold(1:Nx,1:Ny),omegcheck(1:Nx,1:Ny), omeg(1:Nx,1:Ny),&
omegoldhat(1:Nx,1:Ny),omegoldhat_x(1:Nx,1:Ny), omegold_x(1:Nx,1:Ny), &
omegoldhat_y(1:Nx,1:Ny),omegold_y(1:Nx,1:Ny), nlold(1:Nx,1:Ny), nloldhat(1:Nx,1:Ny),&
omeghat(1:Nx,1:Ny), omeghat_x(1:Nx,1:Ny), omeghat_y(1:Nx,1:Ny), omeg_x(1:Nx,1:Ny),&
omeg_y(1:Nx,1:Ny), nl(1:Nx,1:Ny), nlhat(1:Nx,1:Ny), psihat(1:Nx,1:Ny), &
psihat_x(1:Nx,1:Ny), psi_x(1:Nx,1:Ny), psihat_y(1:Nx,1:Ny), psi_y(1:Nx,1:Ny),&
uexact_y(1:Nx,1:Ny), vexact_x(1:Nx,1:Ny), omegexact(1:Nx,1:Ny),fftfx(1:Nx,1:Ny),&
fftbx(1:Nx,1:Ny),dump(1:Nx,1:Ny),stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'allocated space'
! set up ffts
CALL dfftw_plan_dft_2d_(planfxy,Nx,Ny,fftfx(1:Nx,1:Ny),fftbx(1:Nx,1:Ny),&
FFTW_FORWARD,FFTW_EXHAUSTIVE)
CALL dfftw_plan_dft_2d_(planbxy,Nx,Ny,fftbx(1:Nx,1:Ny),fftfx(1:Nx,1:Ny),&
FFTW_BACKWARD,FFTW_EXHAUSTIVE)
! setup fourier frequencies in x-direction
DO i=1,1+Nx/2
kx(i)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))
END DO
kx(1+Nx/2)=0.0d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
DO i=1,Nx
kxx(i)=kx(i)*kx(i)
END DO
DO i=1,Nx
x(i)=REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0))
END DO
! setup fourier frequencies in y-direction
DO j=1,1+Ny/2
ky(j)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))
END DO
ky(1+Ny/2)=0.0d0
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
DO j=1,Ny
kyy(j)=ky(j)*ky(j)
END DO
DO j=1,Ny
y(j)=REAL(j-1,kind(0d0))/REAL(Ny,kind(0d0))
END DO
PRINT *,'Setup grid and fourier frequencies'
DO j=1,Ny
DO i=1,Nx
u(i,j)=sin(2.0d0*pi*x(i))*cos(2.0d0*pi*y(j))
v(i,j)=-cos(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))
u_y(i,j)=-2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))
v_x(i,j)=2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))
omeg(i,j)=v_x(i,j)-u_y(i,j)
END DO
END DO
! Vorticity to Fourier Space
CALL dfftw_execute_dft_(planfxy,omeg(1:Nx,1:Ny),omeghat(1:Nx,1:Ny))
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!Initial nonlinear term !!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! obtain \hat{\omega}_x^{n,k}
DO j=1,Ny
omeghat_x(1:Nx,j)=omeghat(1:Nx,j)*kx(1:Nx)
END DO
! obtain \hat{\omega}_y^{n,k}
DO i=1,Nx
omeghat_y(i,1:Ny)=omeghat(i,1:Ny)*ky(1:Ny)
END DO
! convert to real space
CALL dfftw_execute_dft_(planbxy,omeghat_x(1:Nx,1:Ny),omeg_x(1:Nx,1:Ny))
CALL dfftw_execute_dft_(planbxy,omeghat_y(1:Nx,1:Ny),omeg_y(1:Nx,1:Ny))
! compute nonlinear term in real space
DO j=1,Ny
nl(1:Nx,j)=u(1:Nx,j)*omeg_x(1:Nx,j)/REAL(Nx*Ny,kind(0d0))+&
v(1:Nx,j)*omeg_y(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
END DO
CALL dfftw_execute_dft_(planfxy,nl(1:Nx,1:Ny),nlhat(1:Nx,1:Ny))
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
time(1)=0.0d0
PRINT *,'Got initial data, starting timestepping'
! New Coprocessor API:
call coprocessorinitializewithpython("pipeline.py", 11)
DO n=1,nplots
chg=1
! save old values
uold(1:Nx,1:Ny)=u(1:Nx,1:Ny)
vold(1:Nx,1:Ny)=v(1:Nx,1:Ny)
omegold(1:Nx,1:Ny)=omeg(1:Nx,1:Ny)
omegcheck(1:Nx,1:Ny)=omeg(1:Nx,1:Ny)
omegoldhat(1:Nx,1:Ny)=omeghat(1:Nx,1:Ny)
nloldhat(1:Nx,1:Ny)=nlhat(1:Nx,1:Ny)
DO WHILE (chg>tol)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!nonlinear fixed (n,k+1)!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! obtain \hat{\omega}_x^{n+1,k}
DO j=1,Ny
omeghat_x(1:Nx,j)=omeghat(1:Nx,j)*kx(1:Nx)
END DO
! obtain \hat{\omega}_y^{n+1,k}
DO i=1,Nx
omeghat_y(i,1:Ny)=omeghat(i,1:Ny)*ky(1:Ny)
END DO
! convert back to real space
CALL dfftw_execute_dft_(planbxy,omeghat_x(1:Nx,1:Ny),omeg_x(1:Nx,1:Ny))
CALL dfftw_execute_dft_(planbxy,omeghat_y(1:Nx,1:Ny),omeg_y(1:Nx,1:Ny))
! calculate nonlinear term in real space
DO j=1,Ny
nl(1:Nx,j)=u(1:Nx,j)*omeg_x(1:Nx,j)/REAL(Nx*Ny,kind(0d0))+&
v(1:Nx,j)*omeg_y(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
END DO
! convert back to fourier
CALL dfftw_execute_dft_(planfxy,nl(1:Nx,1:Ny),nlhat(1:Nx,1:Ny))
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! obtain \hat{\omega}^{n+1,k+1} with Crank Nicolson timestepping
DO j=1,Ny
omeghat(1:Nx,j)=( (1.0d0/dt+0.5d0*(mu/rho)*(kxx(1:Nx)+kyy(j)))&
*omegoldhat(1:Nx,j) - 0.5d0*(nloldhat(1:Nx,j)+nlhat(1:Nx,j)))/&
(1.0d0/dt-0.5d0*(mu/rho)*(kxx(1:Nx)+kyy(j)))
END DO
! calculate \hat{\psi}^{n+1,k+1}
DO j=1,Ny
psihat(1:Nx,j)=-omeghat(1:Nx,j)/(kxx(1:Nx)+kyy(j))
END DO
psihat(1,1)=0.0d0
psihat(Nx/2+1,Ny/2+1)=0.0d0
psihat(Nx/2+1,1)=0.0d0
psihat(1,Ny/2+1)=0.0d0
! obtain \psi_x^{n+1,k+1} and \psi_y^{n+1,k+1}
DO j=1,Ny
psihat_x(1:Nx,j)=psihat(1:Nx,j)*kx(1:Nx)
END DO
CALL dfftw_execute_dft_(planbxy,psihat_x(1:Nx,1:Ny),psi_x(1:Nx,1:Ny))
DO i=1,Nx
psihat_y(i,1:Ny)=psihat(i,1:Ny)*ky(1:Ny)
END DO
CALL dfftw_execute_dft_(planbxy,psihat_y(1:Ny,1:Ny),psi_y(1:Ny,1:Ny))
DO j=1,Ny
psi_x(1:Nx,j)=psi_x(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
psi_y(1:Nx,j)=psi_y(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
END DO
! obtain \omega^{n+1,k+1}
CALL dfftw_execute_dft_(planbxy,omeghat(1:Nx,1:Ny),omeg(1:Nx,1:Ny))
DO j=1,Ny
omeg(1:Nx,j)=omeg(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
END DO
! obtain u^{n+1,k+1} and v^{n+1,k+1} using stream function (\psi) in real space
DO j=1,Ny
u(1:Nx,j)=psi_y(1:Nx,j)
v(1:Nx,j)=-psi_x(1:Nx,j)
END DO
! check for convergence
chg=maxval(abs(omeg-omegcheck))
! saves {n+1,k+1} to {n,k} for next iteration
omegcheck=omeg
END DO
time(n+1)=time(n)+dt
PRINT *,'TIME ',time(n+1)
! call adaptor here
! not sure if there's a way to cleanly pass complex to C++ code
do j = 1, Ny
do i = 1, Nx
dump(i,j) = real(omeg(i,j), kind(0d0))
end do
end do
call nsadaptor(Nx, Ny, 1, n, time(n), dump)
END DO
call coprocessorfinalize()
DO j=1,Ny
DO i=1,Nx
uexact_y(i,j)=-2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))*&
exp(-8.0d0*mu*(pi**2)*nplots*dt)
vexact_x(i,j)=2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))*&
exp(-8.0d0*mu*(pi**2)*nplots*dt)
omegexact(i,j)=vexact_x(i,j)-uexact_y(i,j)
END DO
END DO
name_config = 'omegafinal.datbin'
INQUIRE(iolength=iol) omegexact(1,1)
OPEN(unit=11,FILE=name_config,form="unformatted", access="direct",recl=iol)
count = 1
DO j=1,Ny
DO i=1,Nx
WRITE(11,rec=count) REAL(omeg(i,j),KIND(0d0))
count=count+1
END DO
END DO
CLOSE(11)
name_config = 'omegaexactfinal.datbin'
OPEN(unit=11,FILE=name_config,form="unformatted", access="direct",recl=iol)
count = 1
DO j=1,Ny
DO i=1,Nx
WRITE(11,rec=count) omegexact(i,j)
count=count+1
END DO
END DO
CLOSE(11)
name_config = 'xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)
name_config = 'ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
CALL dfftw_destroy_plan_(planfxy)
CALL dfftw_destroy_plan_(planbxy)
CALL dfftw_cleanup_()
DEALLOCATE(time,kx,kxx,ky,kyy,x,y,&
u,uold,v,vold,u_y,v_x,omegold, omegcheck, omeg, &
omegoldhat, omegoldhat_x, omegold_x,&
omegoldhat_y, omegold_y, nlold, nloldhat,&
omeghat, omeghat_x, omeghat_y, omeg_x, omeg_y,&
nl, nlhat, psihat, psihat_x, psi_x, psihat_y, psi_y,&
uexact_y,vexact_x,omegexact, &
fftfx,fftbx,stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Program execution complete'
END PROGRAM main
|
(
) |
Fortran 适配器:NSadaptor.f90 -- 没有 API 更改,出于完整性考虑,将其包含在内。 代码下载
! Fortran module for interacting with the ParaView/Catalyst coprocessor
module nsadaptor_glue
implicit none
public
contains
subroutine nsadaptor(nx, ny, nz, step, time, scalar)
! nx, ny, nz -- grid dimensions
! step -- current time step
! time -- current time
! scalar -- the data
integer, intent(in) :: nx, ny, nz, step
real(kind=8), intent(in) :: time
real(kind=8), dimension(:,:), intent(in) :: scalar
integer :: flag
! check if coprocessing this step
! defined in adaptor header in ParaView src
call requestdatadescription(step, time, flag)
if (flag .ne. 0) then
! coprocessing requested
! check if grid exists
! also from ParaView adaptor header
call needtocreategrid(flag)
if (flag .ne. 0) then
! grid needed
! defined by application developer
call createcpimagedata(nx, ny, nz)
end if
! also defined by application dev
call addfield(scalar, "test")
! from ParaView adaptor header
call coprocess()
end if
return
end subroutine nsadaptor
end module nsadaptor_glue
|
(
) |
C++ VTK 数据辅助程序:PointBasedDataSet.cxx -- 更改包括新的标头和新的命名空间。还要注意 vtkSmartPointer 的使用,而不是原始指针处理。 代码下载
// VTK Data set creator for ParaView/Catalyst Coprocessing.
// Interfaces with a Fortran adaptor to get data from a Fortran
// simulation.
// Fortran specific header for API called from Fortran glue module
#include "vtkCPPythonAdaptorAPI.h"
// Adaptor specific headers, will need to change if simulation data structure
// changes. This should available from the ParaView install include directory
// if you chose to install development files during configuration.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"
#include "vtkCellData.h"
#include <string>
#include <iostream>
// Names are mangled with trailing underscore for linker visibility, along with
// using extern "C"
// Creates the vtkDataSet container for the CoProcessor. The simulation data
// must be compatible.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz)
{
if (!vtkCPPythonAdaptorAPI::GetCoProcessorData()) {
vtkGenericWarningMacro("Unable to access coprocessor data.");
return;
}
vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
// Note: expects Fortran indexing
img->SetExtent(0, *nx - 1, 0, *ny - 1, 0, *nz - 1);
vtkCPPythonAdaptorAPI::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
}
// Add data to the container.
extern "C" void addfield_(double* simdata, char* name)
{
vtkSmartPointer<vtkCPInputDataDescription> idd =
vtkCPPythonAdaptorAPI::GetCoProcessorData()->GetInputDescriptionByName("input");
vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());
if (!img) {
vtkGenericWarningMacro("No adaptor grid to attach field data to.");
return;
}
std::string fieldPythonName = name;
if (idd->IsFieldNeeded(name)) {
vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
field->SetName(name);
field->SetArray(simdata, img->GetNumberOfPoints(), 1);
img->GetPointData()->AddArray(field);
}
}
以下说明是在 NICS 的 Nautilus 上使用 NS3D-MPI、NavierStokes3DfftIMR.f90、KgSemiImp3D.f90 和 NSLsplitting.f90 时得出的。
在使用单个 CPU 节点(非分区数据)时,可以将模拟数组直接传递给 ParaView,并使用 vtkDataSet::GetPointData()->AddArray() 方法将其添加到管道中。然后,vtkImageData 的点将与模拟的网格节点完全对应。该 vtkImageData 将具有隐式细胞结构,每个细胞角点都是模拟节点之一。
如果没有针对 MPI 的特殊更改,此代码将运行,但是 ParaView 将无法创建角点位于不同秩上的细胞。加载 .pvti 文件时会显示错误,指出范围不正确。可以加载各个 .vtis 文件,查看实际情况确实是这样的,以及细胞之间是否存在间隙。
例如,假设数组是 dimension(20,10,10),并且已分解为 2 个铅笔 (1:10,:,:) 和 (11:20,:,:)。ParaView 认为 (10:11,:,:) 的数据丢失,因此会出现间隙。
光晕提供了一种填补间隙的方法。2decomp&fft 光晕 api 创建包含铅笔数据和周围光晕元素的数组。例如,假设铅笔为 (10:20,10:20,:), 则光晕数组为 (9:21,9:21,:). 使用光晕会带来一些问题:
- 光晕元素可能会超出计算域,这些元素将包含垃圾数据。
- 并非每个铅笔都需要将相同的光晕元素发送给协同处理器。这需要在程序中添加额外的决策逻辑。
- 从 C++ 的角度来看,光晕数组部分不会紧密打包,因此会包含垃圾数据。
|
(
) |
- 添加了光晕数组。
- 添加了用于将打包数组传递给 C++ 的一维数组。
- 根据铅笔范围添加了用于判断要传递哪些光晕元素的分支。
- 如果铅笔距离索引原点最远,则仅发送铅笔,不发送光晕元素。
- 否则,如果铅笔位于最远范围的任一侧,但不位于两侧,则发送其他范围中最远的光晕元素。
- 否则,发送两侧最远的光晕元素。
- 添加了对光晕区域的重塑。数组通过引用从 F90 传递到 C++。C++ 不理解 F90 数组区域,并且只会从引用中递增读取(这是任何数组区域问题,不仅仅是光晕)。解决此问题的一种方法是将区域重塑为一维数组,这将使所需数据在内存中连续。
PROGRAM main
!-----------------------------------------------------------------------------------
!
!
! PURPOSE
!
! This program numerically solves the 3D incompressible Navier-Stokes
! on a Cubic Domain [0,2pi]x[0,2pi]x[0,2pi] using pseudo-spectral methods and
! Implicit Midpoint rule timestepping. The numerical solution is compared to
! an exact solution reported by Shapiro
!
! Analytical Solution:
! u(x,y,z,t)=-0.25*(cos(x)sin(y)sin(z)+sin(x)cos(y)cos(z))exp(-t/Re)
! v(x,y,z,t)= 0.25*(sin(x)cos(y)sin(z)-cos(x)sin(y)cos(z))exp(-t/Re)
! w(x,y,z,t)= 0.5*cos(x)cos(y)sin(z)exp(-t/Re)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nz = number of modes in z - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! Re = Reynolds number
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! k = loop counter in z direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! count = keep track of information written to disk
! iol = size of array to write to disk
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfxyz = Forward 3d fft plan
! planbxyz = Backward 3d fft plan
! dt = timestep
! .. Arrays ..
! u = velocity in x direction
! v = velocity in y direction
! w = velocity in z direction
! uold = velocity in x direction at previous timestep
! vold = velocity in y direction at previous timestep
! wold = velocity in z direction at previous timestep
! ux = x derivative of velocity in x direction
! uy = y derivative of velocity in x direction
! uz = z derivative of velocity in x direction
! vx = x derivative of velocity in y direction
! vy = y derivative of velocity in y direction
! vz = z derivative of velocity in y direction
! wx = x derivative of velocity in z direction
! wy = y derivative of velocity in z direction
! wz = z derivative of velocity in z direction
! uxold = x derivative of velocity in x direction
! uyold = y derivative of velocity in x direction
! uzold = z derivative of velocity in x direction
! vxold = x derivative of velocity in y direction
! vyold = y derivative of velocity in y direction
! vzold = z derivative of velocity in y direction
! wxold = x derivative of velocity in z direction
! wyold = y derivative of velocity in z direction
! wzold = z derivative of velocity in z direction
! utemp = temporary storage of u to check convergence
! vtemp = temporary storage of u to check convergence
! wtemp = temporary storage of u to check convergence
! temp_r = temporary storage for untransformed variables
! uhat = Fourier transform of u
! vhat = Fourier transform of v
! what = Fourier transform of w
! rhsuhatfix = Fourier transform of righthand side for u for timestepping
! rhsvhatfix = Fourier transform of righthand side for v for timestepping
! rhswhatfix = Fourier transform of righthand side for w for timestepping
! nonlinuhat = Fourier transform of nonlinear term for u
! nonlinvhat = Fourier transform of nonlinear term for u
! nonlinwhat = Fourier transform of nonlinear term for u
! phat = Fourier transform of nonlinear term for pressure, p
! temp_c = temporary storage for Fourier transforms
! realtemp = Real storage
!
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! kz = fourier frequencies in z direction
! x = x locations
! y = y locations
! z = y locations
! time = times at which save data
! name_config = array to store filename for data to be saved
!
! REFERENCES
!
! A. Shapiro " The use of an exact solution of the Navier-Stokes equations
! in a validation test of a three-dimensional nonhydrostatic numerical model"
! Monthly Weather Review vol. 121, 2420-2425, (1993).
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
!
! This program has not been optimized to use the least amount of memory
! but is intended as an example only for which all states can be saved
!
!--------------------------------------------------------------------------------
! External routines required
!
! External libraries required
! 2DECOMP&FFT -- Fast Fourier Transform in the West Library
! (http://2decomp.org/)
!---------------------------------------------------------------------------------
USE decomp_2d
USE decomp_2d_fft
USE decomp_2d_io
USE MPI
!coprocessing:
use NSadaptor_module
IMPLICIT NONE
! declare variables
INTEGER(kind=4), PARAMETER :: Nx=256
INTEGER(kind=4), PARAMETER :: Ny=256
INTEGER(kind=4), PARAMETER :: Nz=256
INTEGER(kind=4), PARAMETER :: Lx=1
INTEGER(kind=4), PARAMETER :: Ly=1
INTEGER(kind=4), PARAMETER :: Lz=1
INTEGER(kind=4), PARAMETER :: Nt=20
REAL(kind=8), PARAMETER :: dt=0.05d0/Nt
REAL(kind=8), PARAMETER :: Re=1.0d0
REAL(kind=8), PARAMETER :: tol=0.1d0**10
REAL(kind=8), PARAMETER :: theta=0.0d0
REAL(kind=8), PARAMETER &
:: pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8), PARAMETER :: ReInv=1.0d0/REAL(Re,kind(0d0))
REAL(kind=8), PARAMETER :: dtInv=1.0d0/REAL(dt,kind(0d0))
REAL(kind=8) :: scalemodes,chg,factor
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: x, y, z, time,mychg,allchg
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: u, v, w,&
ux, uy, uz,&
vx, vy, vz,&
wx, wy, wz,&
uold, uxold, uyold, uzold,&
vold, vxold, vyold, vzold,&
wold, wxold, wyold, wzold,&
utemp, vtemp, wtemp, temp_r
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE :: kx, ky, kz
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: uhat, vhat, what,&
rhsuhatfix, rhsvhatfix,&
rhswhatfix, nonlinuhat,&
nonlinvhat, nonlinwhat,&
phat,temp_c
!coprocessing: added x,y,z component arrays also halos
!coprocessing: halos get allocated inside call to update_halo
REAL(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: realtemp, realtempx, &
realtempy, realtempz, tempxhalo, tempyhalo, tempzhalo
!coprocessing: added 1D array for passing to C++
real(kind=8), dimension(:), allocatable :: xpass2c, ypass2c, zpass2c
! MPI and 2DECOMP variables
TYPE(DECOMP_INFO) :: decomp
INTEGER(kind=MPI_OFFSET_KIND) :: filesize, disp
INTEGER(kind=4) :: p_row=0, p_col=0, numprocs, myid, ierr
! variables used for saving data and timing
INTEGER(kind=4) :: count, iol
INTEGER(kind=4) :: i,j,k,n,t,allocatestatus
INTEGER(kind=4) :: ind, numberfile
CHARACTER*100 :: name_config
INTEGER(kind=4) :: start, finish, count_rate
! initialisation of 2DECOMP&FFT
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
! do automatic domain decomposition
CALL decomp_2d_init(Nx,Ny,Nz,p_row,p_col)
! get information about domain decomposition choosen
CALL decomp_info_init(Nx,Ny,Nz,decomp)
! initialise FFT library
CALL decomp_2d_fft_init
IF (myid.eq.0) THEN
PRINT *,'Grid:',Nx,'X',Ny,'Y',Nz,'Z'
PRINT *,'dt:',dt
END IF
ALLOCATE(x(1:Nx),y(1:Ny),z(1:Nz),time(1:Nt+1),mychg(1:3),allchg(1:3),&
u(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
v(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
w(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
ux(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vx(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wx(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
utemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
temp_r(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
kx(1:Nx),ky(1:Ny),kz(1:Nz),&
uhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
vhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
what(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhsuhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhsvhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhswhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinuhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinvhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinwhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
phat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
temp_c(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
realtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
realtempx(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
realtempy(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
realtempz(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
IF (myid.eq.0) THEN
PRINT *,'allocated space'
END IF
! setup fourier frequencies in x-direction
DO i=1,Nx/2+1
kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/Lx
END DO
kx(1+Nx/2)=0.0d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
ind=1
DO i=-Nx/2,Nx/2-1
x(ind)=2.0d0*pi*REAL(i,kind(0d0))*Lx/REAL(Nx,kind(0d0))
ind=ind+1
END DO
! setup fourier frequencies in y-direction
DO j=1,Ny/2+1
ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly
END DO
ky(1+Ny/2)=0.0d0
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
ind=1
DO j=-Ny/2,Ny/2-1
y(ind)=2.0d0*pi*REAL(j,kind(0d0))*Ly/REAL(Ny,kind(0d0))
ind=ind+1
END DO
! setup fourier frequencies in z-direction
DO k=1,Nz/2+1
kz(k)= cmplx(0.0d0,1.0d0)*REAL(k-1,kind(0d0))/Lz
END DO
kz(1+Nz/2)=0.0d0
DO k = 1,Nz/2 -1
kz(k+1+Nz/2)=-kz(1-k+Nz/2)
END DO
ind=1
DO k=-Nz/2,Nz/2-1
z(ind)=2.0d0*pi*REAL(k,kind(0d0))*Lz/REAL(Nz,kind(0d0))
ind=ind+1
END DO
scalemodes=1.0d0/REAL(Nx*Ny*Nz,kind(0d0))
IF (myid.eq.0) THEN
PRINT *,'Setup grid and fourier frequencies'
END IF
!initial conditions for Taylor-Green vortex
! factor=2.0d0/sqrt(3.0d0)
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! u(i,j,k)=factor*sin(theta+2.0d0*pi/3.0d0)*sin(x(i))*cos(y(j))*cos(z(k))
! END DO; END DO; END DO
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! v(i,j,k)=factor*sin(theta-2.0d0*pi/3.0d0)*cos(x(i))*sin(y(j))*cos(z(k))
! END DO ; END DO ; END DO
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! w(i,j,k)=factor*sin(theta)*cos(x(i))*cos(y(j))*sin(z(k))
! END DO ; END DO ; END DO
time(1)=0.0d0
factor=sqrt(3.0d0)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
+sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
END DO; END DO; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
v(i,j,k)=0.5*( factor*sin(x(i))*cos(y(j))*sin(z(k))&
-cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
w(i,j,k)=cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(1)/Re)
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(u,uhat,DECOMP_2D_FFT_FORWARD)
CALL decomp_2d_fft_3d(v,vhat,DECOMP_2D_FFT_FORWARD)
CALL decomp_2d_fft_3d(w,what,DECOMP_2D_FFT_FORWARD)
! derivative of u with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD)
! derivative of v with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)
! derivative of w with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)
! save initial data
n = 0
!coprocessing: removed savedata calls from coprocessing version
!coprocessing: could have this after the coprocessor initialize and
!coprocessing: followed by a call to the coprocessor to write out the initial data
!coprocessing: image. Otherwise, these are just wasted calls at this spot.
call coprocessorinitialize("pipeline.py", 11)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempx, tempxhalo, level=1)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempy, tempyhalo, level=1)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempz, tempzhalo, level=1)
!coprocessing: Four cases of sending halo data to the coprocessor.
if ((ny_global == decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
!coprocessing: case 1: pencil furthest from index origin, send just the pencil
allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else if ((ny_global > decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
!coprocessing: case 2: pencil furthest in z, send just upper y halo
allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3), n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else if ((ny_global == decomp%xen(2)) .and. (nz_global > decomp%xen(3))) then
!coprocessing: case 3: pencil furthest in y, send just upper z halo
allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3)+1, n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else
!coprocessing: case 4: send both upper y and upper z halo
allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3)+1,n,n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c,ypass2c,zpass2c)
end if
!start timer
CALL system_clock(start,count_rate)
! coprocessing: Simulation loop starts here
DO n=1,Nt
!fixed point
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
uold(i,j,k)=u(i,j,k)
uxold(i,j,k)=ux(i,j,k)
uyold(i,j,k)=uy(i,j,k)
uzold(i,j,k)=uz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vold(i,j,k)=v(i,j,k)
vxold(i,j,k)=vx(i,j,k)
vyold(i,j,k)=vy(i,j,k)
vzold(i,j,k)=vz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wold(i,j,k)=w(i,j,k)
wxold(i,j,k)=wx(i,j,k)
wyold(i,j,k)=wy(i,j,k)
wzold(i,j,k)=wz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhsuhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*uhat(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhsvhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*vhat(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhswhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*what(i,j,k)
END DO; END DO ; END DO
chg=1
DO WHILE (chg .gt. tol)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(ux(i,j,k)+uxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(uy(i,j,k)+uyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(uz(i,j,k)+uzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinuhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(vx(i,j,k)+vxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(vy(i,j,k)+vyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(vz(i,j,k)+vzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinvhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(wx(i,j,k)+wxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(wy(i,j,k)+wyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(wz(i,j,k)+wzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinwhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
phat(i,j,k)=-1.0d0*( kx(i)*nonlinuhat(i,j,k)&
+ky(j)*nonlinvhat(i,j,k)&
+kz(k)*nonlinwhat(i,j,k))&
/(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)+0.1d0**13)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
uhat(i,j,k)=(rhsuhatfix(i,j,k)-nonlinuhat(i,j,k)-kx(i)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
vhat(i,j,k)=(rhsvhatfix(i,j,k)-nonlinvhat(i,j,k)-ky(j)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
what(i,j,k)=(rhswhatfix(i,j,k)-nonlinwhat(i,j,k)-kz(k)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
! derivative of u with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD)
! derivative of v with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)
! derivative of w with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
utemp(i,j,k)=u(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vtemp(i,j,k)=v(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wtemp(i,j,k)=w(i,j,k)
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(uhat,u,DECOMP_2D_FFT_BACKWARD)
CALL decomp_2d_fft_3d(vhat,v,DECOMP_2D_FFT_BACKWARD)
CALL decomp_2d_fft_3d(what,w,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=u(i,j,k)*scalemodes
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
v(i,j,k)=v(i,j,k)*scalemodes
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
w(i,j,k)=w(i,j,k)*scalemodes
END DO; END DO ; END DO
mychg(1) =maxval(abs(utemp-u))
mychg(2) =maxval(abs(vtemp-v))
mychg(3) =maxval(abs(wtemp-w))
CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
chg=allchg(1)+allchg(2)+allchg(3)
IF (myid.eq.0) THEN
PRINT *,'chg:',chg
END IF
END DO
time(n+1)=n*dt
!goto 5100
IF (myid.eq.0) THEN
PRINT *,'time',n*dt
END IF
!coprocessing: populating the arrays sent to the coprocessor
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempx, tempxhalo, level=1)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempy, tempyhalo, level=1)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempz, tempzhalo, level=1)
!coprocessing: same cases as for initial conditions.
if ((ny_global == decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
! case 1: send just the pencil
allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else if ((ny_global > decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
! case 2: send just upper y halo
allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3), n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else if ((ny_global == decomp%xen(2)) .and. (nz_global > decomp%xen(3))) then
! case 3: send just upper z halo
allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3)+1, n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else
! send both upper y and upper z halo
allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3)+1,n,n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c,ypass2c,zpass2c)
end if
END DO
!coprocessing:
call coprocessorfinalize()
CALL system_clock(finish,count_rate)
IF (myid.eq.0) then
PRINT *, 'Program took', REAL(finish-start)/REAL(count_rate), 'for main timestepping loop'
END IF
IF (myid.eq.0) THEN
name_config = './data/tdata.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO n=1,1+Nt
WRITE(11,*) time(n)
END DO
CLOSE(11)
name_config = './data/xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)
name_config = './data/ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
name_config = './data/zcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO k=1,Nz
WRITE(11,*) z(k)
END DO
CLOSE(11)
PRINT *,'Saved data'
END IF
! Calculate error in final numerical solution
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
utemp(i,j,k)=u(i,j,k) -&
(-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
+sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vtemp(i,j,k)=v(i,j,k) -&
(0.5*( factor*sin(x(i))*cos(y(j))*sin(z(k))&
-cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wtemp(i,j,k)=w(i,j,k)-&
(cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO ; END DO
mychg(1) = maxval(abs(utemp))
mychg(2) = maxval(abs(vtemp))
mychg(3) = maxval(abs(wtemp))
CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
chg=allchg(1)+allchg(2)+allchg(3)
IF (myid.eq.0) THEN
PRINT*,'The error at the final timestep is',chg
END IF
! clean up
CALL decomp_2d_fft_finalize
CALL decomp_2d_finalize
DEALLOCATE(x,y,z,time,mychg,allchg,u,v,w,ux,uy,uz,vx,vy,vz,wx,wy,wz,uold,uxold,uyold,uzold,&
vold,vxold,vyold,vzold,wold,wxold,wyold,wzold,utemp,vtemp,wtemp,&
temp_r,kx,ky,kz,uhat,vhat,what,rhsuhatfix,rhsvhatfix,&
rhswhatfix,phat,nonlinuhat,nonlinvhat,nonlinwhat,temp_c,&
realtemp,stat=AllocateStatus)
!coprocessing: deallocate Coprocessing specific arrays
deallocate(realtempx, realtempy, realtempz, tempxhalo, tempyhalo, tempzhalo)
IF (AllocateStatus .ne. 0) STOP
IF (myid.eq.0) THEN
PRINT *,'Program execution complete'
END IF
CALL MPI_FINALIZE(ierr)
END PROGRAM main
|
(
) |
- 更改适配器以接受当前铅笔的子范围。
! fortran module for interacting with the ParaView CoProcessor
! loosely based on:
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaNSadaptor.f
! -- Changed for SESE
! subroutine determines if coprocessing needed during the
! current simulation step or not, and if so, calls the coprocessor.
! some of the subroutines are supplied by ParaView's FortranAdaptorAPI,
! others have to be supplied by the programmer.
module NSadaptor_module
use iso_c_binding
implicit none
public
interface NSadaptor
! Originally also had a version that accepted 3D arrays, but in all other
! respects was identical. Decided this could lead to too much confusion.
module procedure NSadaptor1D
end interface NSadaptor
contains
subroutine NSadaptor1D(nx, ny, nz, xst, xen, yst, yen, zst, zen, &
step, time, omegax, omegay, omegaz)
! nx, ny, nz -- grid dimensions or entire mesh
! used for setting whole extent
! xst, xen, etc. -- extents of current subdomain pencil
! step -- current simulation time step
! time -- current simulation time
! omega* -- scalar array for the current time step
! flag -- receives status from API calls
integer, intent(in) :: nx, ny, nz, xst, xen, yst, yen, zst, zen, step
real(kind=8), intent(in) :: time
real(kind=8), dimension(:), intent (in) :: omegax, omegay, omegaz
integer :: flag
! check if processing this time step
! defined in FortranAdaptorAPI.h
call requestdatadescription(step, time, flag)
if (flag /= 0) then
! processing requested
! check if need to create grid
! defined in FortranAdaptorAPI.h
call needtocreategrid(flag)
if (flag /= 0) then
! grid needed
! defined in VTKPointBasedDataSet.cxx
! takes the size of the entire grid, and the extents of the
! sub grid.
call createcpimagedata(nx, ny, nz, xst, xen, yst, yen, zst, zen)
end if
! defined in VTKPointBasedDataSet.cxx
! remember to null-terminate strings for C/C++
call addfield(omegax, "realtempx"//char(0));
call addfield(omegay, "realtempy"//char(0));
call addfield(omegaz, "realtempz"//char(0));
! defined in FortranAdaptorAPI.h
call coprocess()
end if
return
end subroutine NSadaptor1D
end module NSadaptor_module
|
(
) |
- 指定了相对于整个数据的已分区数据范围。
- 指定了单元格的相对间距。
// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx
// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h"
// CoProcessor specific headers
// Routines that take the place of VTK dataset object creation.
// Called from Fortran code which also calls the Fortran Adaptor API
// supplied with ParaView source.
// Note: names mangled with trailing underscores for linker visibility.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"
#include <string>
// These will be called from the Fortran "glue" code"
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.
// Creates the data container for the CoProcessor.
// Takes the extents for both the global dataset and the particular subsection
// visible to the current MPI rank.
// Note: expects to receive Fortran base-1 indices.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen,
int* yst, int* yen, int* zst, int* zen)
{
if (!ParaViewCoProcessing::GetCoProcessorData()) {
vtkGenericWarningMacro("Unable to access CoProcessorData.");
return;
}
// The simulation grid is a 3-dimensional topologically and geometrically
// regular grid. In VTK/ParaView, this is considered an image data set.
vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
img->SetExtent(*xst - 1, *xen - 1, *yst - 1, *yen - 1, *zst - 1, *zen - 1);
img->SetSpacing( 1.0 / *nx, 1.0 / *ny, 1.0 / *nz);
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx - 1, 0, *ny - 1, 0, *nz - 1);
}
// Add field(s) to the data container.
// Separate from above because this will be dynamic, grid is static.
// Might be an issue, VTK assumes row major C storage.
// Underscore is by-hand name mangling for fortran linking.
// Note: Expects the address of the data, has no way of determining
// if the array is densely packed or not.
// Note 2: Expects "name" to point to null-terminated character array,
// be sure to null-terminate in caller.
extern "C" void addfield_(double* a, char* name)
{
vtkSmartPointer<vtkCPInputDataDescription> idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());
if (!img) {
vtkGenericWarningMacro("No adaptor grid to attach field data to.");
return;
}
if (idd->IsFieldNeeded(name)) {
vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
field->SetName(name);
field->SetArray(a, img->GetNumberOfPoints(), 1);
img->GetPointData()->AddArray(field);
}
}
- 某些 VTK/ParaView 过滤器可以使用幽灵(光晕)单元格,代码可以扩展以处理这些单元格。
- 在模拟代码本身中添加了很多代码。
- 光晕和打包数组需要额外的内存。
- 为 C++ 重塑数组的额外开销。
- 不同的分解库可能需要更改光晕逻辑。
- 不能与 3D 分解库一起使用。
查看数据的另一种方式是,模拟的每个节点都是可视化域中的一个单元格。然后,当铅笔发送到 ParaView 时,就不会有缺失单元格的间隙。
|
(
) |
- 除了常规的 CoProcessing 调用之外,没有其他更改。
PROGRAM main
!-----------------------------------------------------------------------------------
!
!
! PURPOSE
!
! This program numerically solves the 3D incompressible Navier-Stokes
! on a Cubic Domain [0,2pi]x[0,2pi]x[0,2pi] using pseudo-spectral methods and
! Implicit Midpoint rule timestepping. The numerical solution is compared to
! an exact solution reported by Shapiro
!
! Analytical Solution:
! u(x,y,z,t)=-0.25*(cos(x)sin(y)sin(z)+sin(x)cos(y)cos(z))exp(-t/Re)
! v(x,y,z,t)= 0.25*(sin(x)cos(y)sin(z)-cos(x)sin(y)cos(z))exp(-t/Re)
! w(x,y,z,t)= 0.5*cos(x)cos(y)sin(z)exp(-t/Re)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nz = number of modes in z - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! Re = Reynolds number
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! k = loop counter in z direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! count = keep track of information written to disk
! iol = size of array to write to disk
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfxyz = Forward 3d fft plan
! planbxyz = Backward 3d fft plan
! dt = timestep
! .. Arrays ..
! u = velocity in x direction
! v = velocity in y direction
! w = velocity in z direction
! uold = velocity in x direction at previous timestep
! vold = velocity in y direction at previous timestep
! wold = velocity in z direction at previous timestep
! ux = x derivative of velocity in x direction
! uy = y derivative of velocity in x direction
! uz = z derivative of velocity in x direction
! vx = x derivative of velocity in y direction
! vy = y derivative of velocity in y direction
! vz = z derivative of velocity in y direction
! wx = x derivative of velocity in z direction
! wy = y derivative of velocity in z direction
! wz = z derivative of velocity in z direction
! uxold = x derivative of velocity in x direction
! uyold = y derivative of velocity in x direction
! uzold = z derivative of velocity in x direction
! vxold = x derivative of velocity in y direction
! vyold = y derivative of velocity in y direction
! vzold = z derivative of velocity in y direction
! wxold = x derivative of velocity in z direction
! wyold = y derivative of velocity in z direction
! wzold = z derivative of velocity in z direction
! utemp = temporary storage of u to check convergence
! vtemp = temporary storage of u to check convergence
! wtemp = temporary storage of u to check convergence
! temp_r = temporary storage for untransformed variables
! uhat = Fourier transform of u
! vhat = Fourier transform of v
! what = Fourier transform of w
! rhsuhatfix = Fourier transform of righthand side for u for timestepping
! rhsvhatfix = Fourier transform of righthand side for v for timestepping
! rhswhatfix = Fourier transform of righthand side for w for timestepping
! nonlinuhat = Fourier transform of nonlinear term for u
! nonlinvhat = Fourier transform of nonlinear term for u
! nonlinwhat = Fourier transform of nonlinear term for u
! phat = Fourier transform of nonlinear term for pressure, p
! temp_c = temporary storage for Fourier transforms
! realtemp = Real storage
!
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! kz = fourier frequencies in z direction
! x = x locations
! y = y locations
! z = y locations
! time = times at which save data
! name_config = array to store filename for data to be saved
!
! REFERENCES
!
! A. Shapiro " The use of an exact solution of the Navier-Stokes equations
! in a validation test of a three-dimensional nonhydrostatic numerical model"
! Monthly Weather Review vol. 121, 2420-2425, (1993).
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
!
! This program has not been optimized to use the least amount of memory
! but is intended as an example only for which all states can be saved
!
!--------------------------------------------------------------------------------
! External routines required
!
! External libraries required
! 2DECOMP&FFT -- Fast Fourier Transform in the West Library
! (http://2decomp.org/)
!---------------------------------------------------------------------------------
USE decomp_2d
USE decomp_2d_fft
USE decomp_2d_io
USE MPI
!coprocessing:
use NSadaptor_module
IMPLICIT NONE
! declare variables
INTEGER(kind=4), PARAMETER :: Nx=256
INTEGER(kind=4), PARAMETER :: Ny=256
INTEGER(kind=4), PARAMETER :: Nz=256
INTEGER(kind=4), PARAMETER :: Lx=1
INTEGER(kind=4), PARAMETER :: Ly=1
INTEGER(kind=4), PARAMETER :: Lz=1
INTEGER(kind=4), PARAMETER :: Nt=20
REAL(kind=8), PARAMETER :: dt=0.05d0/Nt
REAL(kind=8), PARAMETER :: Re=1.0d0
REAL(kind=8), PARAMETER :: tol=0.1d0**10
REAL(kind=8), PARAMETER :: theta=0.0d0
REAL(kind=8), PARAMETER &
:: pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8), PARAMETER :: ReInv=1.0d0/REAL(Re,kind(0d0))
REAL(kind=8), PARAMETER :: dtInv=1.0d0/REAL(dt,kind(0d0))
REAL(kind=8) :: scalemodes,chg,factor
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: x, y, z, time,mychg,allchg
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: u, v, w,&
ux, uy, uz,&
vx, vy, vz,&
wx, wy, wz,&
uold, uxold, uyold, uzold,&
vold, vxold, vyold, vzold,&
wold, wxold, wyold, wzold,&
utemp, vtemp, wtemp, temp_r
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE :: kx, ky, kz
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: uhat, vhat, what,&
rhsuhatfix, rhsvhatfix,&
rhswhatfix, nonlinuhat,&
nonlinvhat, nonlinwhat,&
phat,temp_c
!coprocessing: added x,y,z component arrays
REAL(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: realtemp, realtempx, &
realtempy, realtempz
! MPI and 2DECOMP variables
TYPE(DECOMP_INFO) :: decomp
INTEGER(kind=MPI_OFFSET_KIND) :: filesize, disp
INTEGER(kind=4) :: p_row=0, p_col=0, numprocs, myid, ierr
! variables used for saving data and timing
INTEGER(kind=4) :: count, iol
INTEGER(kind=4) :: i,j,k,n,t,allocatestatus
INTEGER(kind=4) :: ind, numberfile
CHARACTER*100 :: name_config
INTEGER(kind=4) :: start, finish, count_rate
! initialisation of 2DECOMP&FFT
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
! do automatic domain decomposition
CALL decomp_2d_init(Nx,Ny,Nz,p_row,p_col)
! get information about domain decomposition choosen
CALL decomp_info_init(Nx,Ny,Nz,decomp)
! initialise FFT library
CALL decomp_2d_fft_init
IF (myid.eq.0) THEN
PRINT *,'Grid:',Nx,'X',Ny,'Y',Nz,'Z'
PRINT *,'dt:',dt
END IF
ALLOCATE(x(1:Nx),y(1:Ny),z(1:Nz),time(1:Nt+1),mychg(1:3),allchg(1:3),&
u(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
v(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
w(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
ux(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vx(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wx(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
utemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
temp_r(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
kx(1:Nx),ky(1:Ny),kz(1:Nz),&
uhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
vhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
what(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhsuhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhsvhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhswhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinuhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinvhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinwhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
phat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
temp_c(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
realtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
realtempx(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
realtempy(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
realtempz(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
IF (myid.eq.0) THEN
PRINT *,'allocated space'
END IF
! setup fourier frequencies in x-direction
DO i=1,Nx/2+1
kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/Lx
END DO
kx(1+Nx/2)=0.0d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
ind=1
DO i=-Nx/2,Nx/2-1
x(ind)=2.0d0*pi*REAL(i,kind(0d0))*Lx/REAL(Nx,kind(0d0))
ind=ind+1
END DO
! setup fourier frequencies in y-direction
DO j=1,Ny/2+1
ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly
END DO
ky(1+Ny/2)=0.0d0
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
ind=1
DO j=-Ny/2,Ny/2-1
y(ind)=2.0d0*pi*REAL(j,kind(0d0))*Ly/REAL(Ny,kind(0d0))
ind=ind+1
END DO
! setup fourier frequencies in z-direction
DO k=1,Nz/2+1
kz(k)= cmplx(0.0d0,1.0d0)*REAL(k-1,kind(0d0))/Lz
END DO
kz(1+Nz/2)=0.0d0
DO k = 1,Nz/2 -1
kz(k+1+Nz/2)=-kz(1-k+Nz/2)
END DO
ind=1
DO k=-Nz/2,Nz/2-1
z(ind)=2.0d0*pi*REAL(k,kind(0d0))*Lz/REAL(Nz,kind(0d0))
ind=ind+1
END DO
scalemodes=1.0d0/REAL(Nx*Ny*Nz,kind(0d0))
IF (myid.eq.0) THEN
PRINT *,'Setup grid and fourier frequencies'
END IF
!initial conditions for Taylor-Green vortex
! factor=2.0d0/sqrt(3.0d0)
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! u(i,j,k)=factor*sin(theta+2.0d0*pi/3.0d0)*sin(x(i))*cos(y(j))*cos(z(k))
! END DO; END DO; END DO
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! v(i,j,k)=factor*sin(theta-2.0d0*pi/3.0d0)*cos(x(i))*sin(y(j))*cos(z(k))
! END DO ; END DO ; END DO
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! w(i,j,k)=factor*sin(theta)*cos(x(i))*cos(y(j))*sin(z(k))
! END DO ; END DO ; END DO
time(1)=0.0d0
factor=sqrt(3.0d0)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
+sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
END DO; END DO; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
v(i,j,k)=0.5*( factor*sin(x(i))*cos(y(j))*sin(z(k))&
-cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
w(i,j,k)=cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(1)/Re)
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(u,uhat,DECOMP_2D_FFT_FORWARD)
CALL decomp_2d_fft_3d(v,vhat,DECOMP_2D_FFT_FORWARD)
CALL decomp_2d_fft_3d(w,what,DECOMP_2D_FFT_FORWARD)
! derivative of u with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD)
! derivative of v with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)
! derivative of w with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)
! save initial data
n = 0
!coprocessing: removed savedata calls from coprocessing version
!coprocessing: could have this after the coprocessor initialize and
!coprocessing: followed by a call to the coprocessor to write out the initial data
!coprocessing: image. Otherwise, these are just wasted calls at this spot.
call coprocessorinitialize("pipeline.py", 11)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
END DO; END DO ; END DO
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
realtempx, realtempy, realtempz)
!start timer
CALL system_clock(start,count_rate)
! coprocessing: Simulation loop starts here
DO n=1,Nt
!fixed point
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
uold(i,j,k)=u(i,j,k)
uxold(i,j,k)=ux(i,j,k)
uyold(i,j,k)=uy(i,j,k)
uzold(i,j,k)=uz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vold(i,j,k)=v(i,j,k)
vxold(i,j,k)=vx(i,j,k)
vyold(i,j,k)=vy(i,j,k)
vzold(i,j,k)=vz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wold(i,j,k)=w(i,j,k)
wxold(i,j,k)=wx(i,j,k)
wyold(i,j,k)=wy(i,j,k)
wzold(i,j,k)=wz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhsuhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*uhat(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhsvhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*vhat(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhswhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*what(i,j,k)
END DO; END DO ; END DO
chg=1
DO WHILE (chg .gt. tol)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(ux(i,j,k)+uxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(uy(i,j,k)+uyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(uz(i,j,k)+uzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinuhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(vx(i,j,k)+vxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(vy(i,j,k)+vyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(vz(i,j,k)+vzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinvhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(wx(i,j,k)+wxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(wy(i,j,k)+wyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(wz(i,j,k)+wzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinwhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
phat(i,j,k)=-1.0d0*( kx(i)*nonlinuhat(i,j,k)&
+ky(j)*nonlinvhat(i,j,k)&
+kz(k)*nonlinwhat(i,j,k))&
/(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)+0.1d0**13)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
uhat(i,j,k)=(rhsuhatfix(i,j,k)-nonlinuhat(i,j,k)-kx(i)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
vhat(i,j,k)=(rhsvhatfix(i,j,k)-nonlinvhat(i,j,k)-ky(j)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
what(i,j,k)=(rhswhatfix(i,j,k)-nonlinwhat(i,j,k)-kz(k)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
! derivative of u with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD)
! derivative of v with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)
! derivative of w with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
utemp(i,j,k)=u(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vtemp(i,j,k)=v(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wtemp(i,j,k)=w(i,j,k)
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(uhat,u,DECOMP_2D_FFT_BACKWARD)
CALL decomp_2d_fft_3d(vhat,v,DECOMP_2D_FFT_BACKWARD)
CALL decomp_2d_fft_3d(what,w,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=u(i,j,k)*scalemodes
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
v(i,j,k)=v(i,j,k)*scalemodes
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
w(i,j,k)=w(i,j,k)*scalemodes
END DO; END DO ; END DO
mychg(1) =maxval(abs(utemp-u))
mychg(2) =maxval(abs(vtemp-v))
mychg(3) =maxval(abs(wtemp-w))
CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
chg=allchg(1)+allchg(2)+allchg(3)
IF (myid.eq.0) THEN
PRINT *,'chg:',chg
END IF
END DO
time(n+1)=n*dt
!goto 5100
IF (myid.eq.0) THEN
PRINT *,'time',n*dt
END IF
!coprocessing: populating the arrays sent to the coprocessor
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
END DO; END DO ; END DO
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
realtempx, realtempy, realtempz)
END DO
!coprocessing:
call coprocessorfinalize()
CALL system_clock(finish,count_rate)
IF (myid.eq.0) then
PRINT *, 'Program took', REAL(finish-start)/REAL(count_rate), 'for main timestepping loop'
END IF
IF (myid.eq.0) THEN
name_config = './data/tdata.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO n=1,1+Nt
WRITE(11,*) time(n)
END DO
CLOSE(11)
name_config = './data/xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)
name_config = './data/ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
name_config = './data/zcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO k=1,Nz
WRITE(11,*) z(k)
END DO
CLOSE(11)
PRINT *,'Saved data'
END IF
! Calculate error in final numerical solution
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
utemp(i,j,k)=u(i,j,k) -&
(-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
+sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vtemp(i,j,k)=v(i,j,k) -&
(0.5*( factor*sin(x(i))*cos(y(j))*sin(z(k))&
-cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wtemp(i,j,k)=w(i,j,k)-&
(cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO ; END DO
mychg(1) = maxval(abs(utemp))
mychg(2) = maxval(abs(vtemp))
mychg(3) = maxval(abs(wtemp))
CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
chg=allchg(1)+allchg(2)+allchg(3)
IF (myid.eq.0) THEN
PRINT*,'The error at the final timestep is',chg
END IF
! clean up
CALL decomp_2d_fft_finalize
CALL decomp_2d_finalize
DEALLOCATE(x,y,z,time,mychg,allchg,u,v,w,ux,uy,uz,vx,vy,vz,wx,wy,wz,uold,uxold,uyold,uzold,&
vold,vxold,vyold,vzold,wold,wxold,wyold,wzold,utemp,vtemp,wtemp,&
temp_r,kx,ky,kz,uhat,vhat,what,rhsuhatfix,rhsvhatfix,&
rhswhatfix,phat,nonlinuhat,nonlinvhat,nonlinwhat,temp_c,&
realtemp,stat=AllocateStatus)
!coprocessing: deallocate Coprocessing specific arrays
deallocate(realtempx, realtempy, realtempz)
IF (AllocateStatus .ne. 0) STOP
IF (myid.eq.0) THEN
PRINT *,'Program execution complete'
END IF
CALL MPI_FINALIZE(ierr)
END PROGRAM main
|
(
) |
- 接受当前铅笔的子范围。
! Fortran module for interacting with the ParaView CoProcessor
! loosely based on:
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaadaptor.f
! -- Changed for SESE
! Subroutine determines if coprocessing needed during the
! current simulation step or not, and if so, calls the coprocessor.
! Some of the subroutines are supplied by ParaView's FortranAdaptorAPI,
! others have to be supplied by the programmer.
module NSadaptor_module
use iso_c_binding
implicit none
public
interface NSadaptor
module procedure NSadaptor3D
end interface NSadaptor
contains
subroutine NSadaptor3D(nx, ny, nz, xst, xen, yst, yen, zst, zen, &
step, time, omegax, omegay, omegaz)
! nx, ny, nz -- grid dimensions of entire mesh
! used for setting whole extent
! xst, xen, etc. -- extents of current subdomain pencil
! step -- current simulation time step
! time -- current simulation time
! omega* -- scalar arrays for the current time step
integer, intent(in) :: nx, ny, nz, xst, xen, yst, yen, zst, zen, step
real(kind=8), intent(in) :: time
real(kind=8), dimension(:,:,:), intent (in) :: omegax, omegay, omegaz
integer :: flag
! check if processing this time step
! defined in FortranAdaptorAPI.h
call requestdatadescription(step, time, flag)
if (flag /= 0) then
! processing requested
! check if need to create grid
! defined in FortranAdaptorAPI.h
call needtocreategrid(flag)
if (flag /= 0) then
! grid needed
! defined in VTKCellBasedDataSet.cxx
! takes the size of the entire grid, and the extents of the
! sub grid.
call createcpimagedata(nx, ny, nz, xst, xen, yst, yen, zst, zen)
end if
! defined in VTKCellBasedDataSet.cxx
! call for each field of interest. Be sure to null-terminate the
! string for C++!
call addfield(omegax, "realtempx"//char(0))
call addfield(omegay, "realtempy"//char(0))
call addfield(omegaz, "realtempz"//char(0))
! defined in FortranAdaptorAPI.h
call coprocess()
end if
return
end subroutine NSadaptor3D
end module NSadaptor_module
|
(
) |
- 更改了范围的计算方式。
- 将点数据调用更改为单元格数据调用。
// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx
// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h"
// CoProcessor specific headers
// Routines that take the place of VTK dataset object creation.
// Called from Fortran code which also calls the Fortran Adaptor API
// supplied with ParaView source.
// Note: names mangled with trailing underscores for linker visibility.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"
#include "vtkCellData.h"
#include <string>
// These will be called from the Fortran adaptor "glue" code.
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.
// Creates the data container for the CoProcessor.
// Takes the extents for both the global dataset and the particular subsection
// visible to the current MPI rank.
// Note: expects to receive Fortran base-1 indices.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen,
int* yst, int* yen, int* zst, int* zen)
{
if (!ParaViewCoProcessing::GetCoProcessorData()) {
vtkGenericWarningMacro("Unable to access CoProcessorData.");
return;
}
// The simulation grid is a 3-dimensional topologically and geometrically
// regular grid. In VTK/ParaView, this is considered an image data set.
vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
// Indexing based on change from F90 to C++, and also from nodal to cellular.
img->SetExtent(*xst - 1, *xen, *yst - 1, *yen, *zst - 1, *zen);
img->SetSpacing( 1.0 / *nx, 1.0 / *ny, 1.0 / *nz);
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx, 0, *ny, 0, *nz);
}
// Add field to the data container.
// Separate from above because this could be dynamic, grid is static.
// Might be an issue, VTK assumes row major C storage.
// Underscore is by-hand name mangling for fortran linking.
// Note: Expects the address of the data, has no way of determining
// if the array is densely packed or not.
// Note 2: Assumes "name" points to null-terminated array of chars.
// Easiest way to do that is to concatenate a terminal NULL in caller.
extern "C" void addfield_(double* a, char* name)
{
vtkSmartPointer<vtkCPInputDataDescription> idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());
if (!img) {
vtkGenericWarningMacro("No adaptor grid to attach field data to.");
return;
}
if (idd->IsFieldNeeded(name)) {
vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
field->SetName(name);
field->SetArray(a, img->GetNumberOfCells(), 1);
img->GetCellData()->AddArray(field);
}
}
- 最小的代码更改,模拟代码中没有更改(除了协处理调用)。
- 不需要光晕或打包数组。
- 应该可以在无需更改的情况下使用不同的分解库。
- 需要在 ParaView 客户端中进行额外处理,以处理单元格渲染和点渲染之间的差异。
- 不能与需要幽灵单元格的 VTK/ParaView 过滤器一起使用。
C++ 没有本机复数数据类型。这需要在 ParaView 客户端中进行一些额外的处理
- 以减少的网格大小运行原始的非协处理模拟。
- 使用 RAW 二进制阅读器在 ParaView 客户端中打开此数据。
- 将标量分量数设置为 2
- 创建管道,实部和虚部的默认名称分别是 ImageFile_X 和 ImageFile_Y。
|
(
) |
有关将复数传递给协处理器的详细信息,请参见 NLSadaptor.f90 的代码。 代码下载
! Fortran module for interacting with the ParaView CoProcessor
! loosely based on:
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaadaptor.f
! -- Changed for SESE
! Subroutine determines if coprocessing needed during the
! current simulation step or not, and if so, calls the coprocessor.
! Some of the subroutines are supplied by ParaView's FortranAdaptorAPI,
! others have to be supplied by the programmer.
module NLSadaptor_module
use iso_c_binding
implicit none
public
! using an interface incase need arises to overload 1D, 2D versions
interface NLSadaptor
module procedure NLSadaptor3D
end interface NLSadaptor
contains
subroutine NLSadaptor3D(nx, ny, nz, xst, xen, yst, yen, zst, zen, &
step, time, a)
! nx, ny, nz -- grid dimensions of entire mesh
! used for setting whole extent
! xst, xen, etc. -- extents of current subdomain pencil
! step -- current simulation time step
! time -- current simulation time
! a -- scalar array for the current time step
! flag -- receives status from API calls
integer, intent(in) :: nx, ny, nz, xst, xen, yst, yen, zst, zen, step
real(kind=8), intent(in) :: time
complex(kind=8), dimension(:,:,:), intent (in) :: a
integer :: flag
flag = 0
! check if processing this time step
! defined in FortranAdaptorAPI.h
call requestdatadescription(step, time, flag)
if (flag /= 0) then
! processing requested
! check if need to create grid
! defined in FortranAdaptorAPI.h
call needtocreategrid(flag)
if (flag /= 0) then
! grid needed
! defined in adaptor.cxx
! takes the size of the entire grid, and the extents of the
! sub grid.
call createcpimagedata(nx, ny, nz, xst, xen, yst, yen, zst, zen)
end if
! defined in adaptor.cxx
! Be sure to null-terminate the
! string for C++!
call addfield(a, "u_complex"//char(0))
! defined in FortranAdaptorAPI.h
call coprocess()
end if
return
end subroutine NLSadaptor3D
end module NLSadaptor_module
|
(
) |
这也需要在附带的 C++ 文件中进行一些小的代码更改。 代码下载
// 适配器,用于将 fortran 模拟代码输入 ParaView 协处理。 // 基于 ParaView 发行版中的 PhastaAdaptor 示例。 // ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx
// Fortran 特定标头 // ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
- include "FortranAdaptorAPI.h"
// 协处理特定标头 // 替代 VTK 数据集对象创建的例程。 // 从 Fortran 代码调用,该代码还调用与 ParaView 源代码一起提供的 Fortran Adaptor API // 注意:名称与尾部下划线混合,用于链接器可见性。
- include "vtkCPDataDescription.h"
- include "vtkCPInputDataDescription.h"
- include "vtkCPProcessor.h"
- include "vtkDoubleArray.h"
- include "vtkPointData.h"
- include "vtkSmartPointer.h"
- include "vtkImageData.h"
- include "vtkCellData.h"
- include <string>
// 这些将从 Fortran “粘合” 代码调用 // 完全依赖于数据布局,结构化与非结构化,等等 // 因为 VTK/ParaView 对每个布局使用不同的内部布局。
// 为协处理器创建数据容器。 // 获取全局数据集和当前 MPI 秩可见的特定子部分的范围 // 注意:预期接收 Fortran 基于 1 的索引。 extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen, int* yst, int* yen, int* zst, int* zen) {
if (!ParaViewCoProcessing::GetCoProcessorData()) { vtkGenericWarningMacro("Unable to access CoProcessorData."); return; }
// The simulation grid is a 3-dimensional topologically and geometrically // regular grid. In VTK/ParaView, this is considered an image data set. vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
// For Fortram complex, need to set that there are 2 scalar components img->SetNumberOfScalarComponents(2);
// Indexing based on change from F90 to C++, and also from nodal to cellular. // Extents are given in terms of nodes, not cells. img->SetExtent(*xst - 1, *xen, *yst - 1, *yen, *zst - 1, *zen); // Setting spacing is important so that the camera position in the pipeline makes // sense if using different sized meshes between setting up the pipeline and running // the simulation. Origin can often be ignored. img->SetSpacing( 1.0 / *nx, 1.0 / *ny, 1.0 / *nz); // considering passing in as args. ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img); ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx, 0, *ny, 0, *nz);
}
// 向数据容器添加字段。 // 与上述分开,因为这可能是动态的,网格是静态的。 // 可能会出现问题,VTK 假设行主序 C 存储。 // 下划线是用于 fortran 链接的手动名称混合。 // 注意:预期接收数据的地址,无法确定 // 数组是否密集打包。 // 注意 2:假设 “name” 指向以 null 结尾的字符数组。 // 最简单的实现方式是在调用者中进行连接。 extern "C" void addfield_(double* a, char* name) {
vtkSmartPointer<vtkCPInputDataDescription> idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input"); vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());
if (!img) { vtkGenericWarningMacro("No adaptor grid to attach field data to."); return; }
if (idd->IsFieldNeeded(name)) { vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New(); field->SetNumberOfComponents(2); field->SetName(name); field->SetArray(a, 2* img->GetNumberOfCells(), 1); img->GetCellData()->AddArray(field); }
}
- 某些 ParaView 过滤器和表示形式仅适用于点数据,例如体积渲染。使用单元格数据到点数据过滤器进行转换。请务必选中选项,以将单元格数据通过过滤器传递。如果模拟运行没有错误并产生没有输出,则可能的罪魁祸首是在创建管道时未能选中该选项。
- 使用体积渲染的管道将生成有关区域 ID 为 0 的警告。通常可以忽略这些警告。
- ParaView 维基条目
- SC10 协处理教程
- ParaView 源代码,特别是 PHASTA Fortran 示例。
- paraview-users 邮件列表