Fortran 程序和在 Windows 上的入门

示例程序

要使用 OpenMP 或 MPI（消息传递接口）进行并行编程，我们通常需要使用比 Matlab 更低级的语言，例如 Fortran。另一种可能的语言选择是 C，但 Fortran 具有比 C 更强大的数组处理能力，并且其语法与 Matlab 类似，因此通常更易于用于对正则数组大量使用的科学计算。因此，在开始研究如何创建并行程序之前，介绍一些简单的 Fortran 程序是有用的。关于 Fortran 的一个很好的最新参考资料是 Metcalf、Reid 和 Cohen^[1]。我们认识到大多数人会不熟悉 Fortran，并且可能更熟悉 Matlab^[2]、C 或 C++，但我们希望示例代码将使任何具有入门编程背景的人更容易理解。最近的一份指南描述了如何编写高效的并行 Fortran 代码，即 Levesque 和 Wagenbreth^[3]。我们的程序被编写为在密歇根大学的 Flux 集群上运行。有关此集群的更多信息，请访问 http://cac.engin.umich.edu/resources/systems/flux/ 和 http://cac.engin.umich.edu/started/index.html。下面是运行此程序所需的四个文件。

一个 makefile，用于在列表 A 中的 Flux 上编译 Fortran 代码。这应该保存为 makefile。在使用 makefile 编译代码之前，您需要在登录 Flux 后在命令行提示符处键入
module load fftw/3.2.1-intel
然后将 makefile 和 heat.f90 放在同一个目录中，下面的示例文件假设该目录是
{${HOME}/ParallelMethods/Heat}
并键入
make
以编译文件。文件编译完成后，键入
qsub fluxsubscript
以让集群运行您的程序，然后输出结果。以下程序使用 FFTW 库来执行快速傅里叶变换。有关此库的更多信息，请访问 http://www.fftw.org/。
( A)

一个用于编译傅里叶谱 Fortran 热方程程序的示例 makefile 代码下载
```
#define the complier
COMPILER = mpif90
# compilation settings, optimization, precision, parallelization
FLAGS = -O0

# libraries
LIBS = -L${FFTW_LINK} -lfftw3 -lm
# source list for main program
SOURCES = heat.f90

test: $(SOURCES)
${COMPILER} -o heat $(FLAGS) $(SOURCES) $(LIBS)

clean:
rm *.o
```

列表 B 中的 Fortran 程序 - 这应该保存为 heat.f90

( B)

一个 Fortran 傅里叶谱程序，使用反向欧拉时间步长来求解热方程代码下载

!--------------------------------------------------------------------
!
!
! PURPOSE
!
! This program solves heat equation in 1 dimension
! u_t=\alpha*u_{xx}
! using a the backward Euler method for x\in[0,2\pi]
!
! The boundary conditions are u(0)=u(2\pi)
! The initial condition is u=sin(x)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! plotgap = number of timesteps between plots
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! L = width of box
! alpha = heat conductivity
! .. Scalars ..
! i = loop counter in x direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfx = Forward 1d fft plan in x
! planbx = Backward 1d fft plan in x
! dt = timestep
! .. Arrays ..
! u = approximate REAL solution
! v = Fourier transform of approximate solution
! vna = temporary field
! .. Vectors ..
! kx = fourier frequencies in x direction
! x = x locations
! time = times at which save data
! name_config = array to store filename for data to be saved
!
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! Check that the initial iterate is consistent with the
! boundary conditions for the domain specified
!--------------------------------------------------------------------
! External routines required
!
! External libraries required
! FFTW3 -- Fast Fourier Transform in the West Library
! (http://www.fftw.org/)

PROGRAM main

! Declare variables
IMPLICIT NONE
INTEGER(kind=4),	PARAMETER :: Nx=64
INTEGER(kind=4),	PARAMETER	:: Nt=20
REAL(kind=8),	PARAMETER	&
:: pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8),	PARAMETER	:: L=5.0d0	
REAL(kind=8),	PARAMETER	:: alpha=0.50d0	
REAL(kind=8)	:: dt=0.2d0/REAL(Nt,kind(0d0))	
COMPLEX(KIND=8), DIMENSION(:),ALLOCATABLE	:: kx	
REAL(kind=8), DIMENSION(:),ALLOCATABLE	:: x	
COMPLEX(KIND=8), DIMENSION(:,:),ALLOCATABLE	:: u,v	
REAL(kind=8), DIMENSION(:),ALLOCATABLE	:: time
COMPLEX(KIND=8), DIMENSION(:),ALLOCATABLE	:: vna	
INTEGER(kind=4)	:: i,j,k,n
INTEGER(kind=4)	:: start, finish, count_rate, AllocateStatus
INTEGER(kind=4), PARAMETER	:: FFTW_IN_PLACE = 8, FFTW_MEASURE = 0, &
FFTW_EXHAUSTIVE = 8, FFTW_PATIENT = 32, FFTW_ESTIMATE = 64
INTEGER(kind=4), PARAMETER :: FFTW_FORWARD = -1, FFTW_BACKWARD=1	
COMPLEX(KIND=8), DIMENSION(:),ALLOCATABLE :: fftfx,fftbx
INTEGER(kind=8)	:: planfx,planbx
CHARACTER*100	:: name_config

CALL system_clock(start,count_rate)	
ALLOCATE(kx(1:Nx),x(1:Nx),u(1:Nx,1:1+Nt),v(1:Nx,1:1+Nt),&
    time(1:1+Nt),vna(1:Nx),fftfx(1:Nx),fftbx(1:Nx),&
    stat=AllocateStatus)
    IF (AllocateStatus .ne. 0) STOP

! set up ffts
CALL dfftw_plan_dft_1d(planfx,Nx,fftfx(1:Nx),fftbx(1:Nx),&
FFTW_FORWARD,FFTW_ESTIMATE)
CALL dfftw_plan_dft_1d(planbx,Nx,fftbx(1:Nx),fftfx(1:Nx),&
FFTW_BACKWARD,FFTW_ESTIMATE)

PRINT *,'Setup FFTs'

! setup fourier frequencies
DO i=1,1+Nx/2
kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/L
END DO
kx(1+Nx/2)=0.00d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
DO i=1,Nx
x(i)=(-1.00d0 + 2.00d0*REAL(i-1,kind(0d0))/REAL(Nx,KIND(0d0)))*pi*L
END DO
PRINT *,'Setup grid and fourier frequencies and splitting coefficients'

u(1:Nx,1)=sin(x(1:Nx))
! transform initial data
CALL dfftw_execute_dft_(planfx,u(1:Nx,1),v(1:Nx,1))
PRINT *,'Got initial data, starting timestepping'
time(1)=0.0d0

vna(1:Nx)=v(1:Nx,1)
PRINT *,'Starting timestepping'
DO n=1,Nt
DO i=1,Nx
vna(i)=vna(i)/(1-dt*kx(i)*kx(i))
END DO
PRINT *,'storing plot data ',n
time(n+1)=time(n)+dt
v(1:Nx,n+1)=vna(1:Nx)
CALL dfftw_execute_dft_(planbx,v(1:Nx,n+1),u(1:Nx,n+1))
u(1:Nx,n+1)=u(1:Nx,n+1)/REAL(Nx,KIND(0d0))	! normalize
END DO
PRINT *,'Finished time stepping'
CALL system_clock(finish,count_rate)
PRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),'for execution'

! Write data out to disk

name_config = 'u.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,1+Nt
DO i=1,Nx
WRITE(11,*) REAL(u(i,j))
END DO
END DO
CLOSE(11)

name_config = 'tdata.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,1+Nt
WRITE(11,*) time(j)
END DO
CLOSE(11)

name_config = 'xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)

PRINT *,'Saved data'
DEALLOCATE(kx,x,u,v,&
    time,vna,fftfx,fftbx,&
    stat=AllocateStatus)
    IF (AllocateStatus .ne. 0) STOP
CALL dfftw_destroy_plan(planbx)
CALL dfftw_destroy_plan(planfx)
CALL dfftw_cleanup()
PRINT *,'Program execution complete'
END PROGRAM main

一个在集群上使用的示例提交脚本，如列表 C 中所示 - 这应该保存为 fluxsubscript。更多示例可在 http://cac.engin.umich.edu/resources/software/pbs.html 中找到。要使用它，请将电子邮件地址从 your_uniqname@umich.edu 更改为您可以接收有关作业何时开始和结束的通知的电子邮件地址。

( C)

一个在 Flux 上使用的示例提交脚本代码下载

#!/bin/bash
#PBS -N heatequation
#PBS -l nodes=1,walltime=00:10:00
#PBS -l qos=math471f11_flux
#PBS -A math471f11_flux
#PBS -q flux
#PBS -M your_uniqname@umich.edu
#PBS -m abe
#PBS -V
# Create a local directory to run and copy your files to local.
# Let PBS handle your output
mkdir /tmp/${PBS_JOBID}
cp ${HOME}/ParallelMethods/Heat/heatequation /tmp/${PBS_JOBID}/heatequation
cd /tmp/${PBS_JOBID}
./heatequation
#Clean up your files
cd
cd ParallelMethods/Heat
# Retrieve your output
cp /tmp/${PBS_JOBID}/u.dat ${HOME}/ParallelMethods/Heat/u.dat
cp /tmp/${PBS_JOBID}/xcoord.dat ${HOME}/ParallelMethods/Heat/xcoord.dat
cp /tmp/${PBS_JOBID}/tdata.dat ${HOME}/ParallelMethods/Heat/tdata.dat

/bin/rm -rf /tmp/${PBS_JOBID}

一个 Matlab 绘图脚本^[4]，用于生成图 i ，如列表 D 中所示。

( D)

一个用于绘制计算结果的 Matlab 程序代码下载

% A Matlab program to plot the computed results

clear all; format compact, format short,
set(0,'defaultaxesfontsize',18,'defaultaxeslinewidth',.9,...
    'defaultlinelinewidth',3.5,'defaultpatchlinewidth',5.5);

% Load data
load('./u.dat');
load('./tdata.dat');
load('./xcoord.dat');
Tsteps = length(tdata);

Nx = length(xcoord); Nt = length(tdata);
 
u = reshape(u,Nx,Nt);

% Plot data
figure(3); clf; mesh(tdata,xcoord,u); xlabel t; ylabel x; zlabel('u');

( i)

由 B 中的 Fortran 程序计算并由 D 中的 Matlab 程序后处理的热方程解。

在 Windows 上的入门

要在 Windows 上运行这些程序，您需要安装 Cygwin。Cygwin 在 Windows 上模拟 bash shell。您还需要以下 Cygwin 包：make、mpi、fftw3（列表不完整）。以下是修改后的 makefile，其他文件保持不变。

一个用于在列表 E 中编译 Fortran 代码的 makefile。这应该保存为 makefile 而不是 ( A )。然后将 makefile 和 heat.f90 ( B ) 放在同一个目录中。键入

make

to compile the file. Once the file is compiled type

./heat.exe

to get your computer to run the program.

使用上面的 Matlab 绘图脚本获取图片。

( E)

一个用于编译傅里叶谱 Fortran 热方程程序的示例 makefile 代码下载

#define the complier
COMPILER = mpif90
# compilation settings, optimization, precision, parallelization
FLAGS = -O0

# libraries
LIBS = -lfftw3 -lm
# source list for main program
SOURCES = heat.f90

test: $(SOURCES)
                ${COMPILER} -o heat $(FLAGS) $(SOURCES) $(LIBS)

clean:
       rm *.o 
       rm heat

练习

请阅读网页 http://cac.engin.umich.edu/started/index.html 上的资源，了解如何使用 Flux 集群。
修改一维热方程的 Fortran 程序，以使用您选择的时间步长方案求解 Allen-Cahn 方程。创建您运行输出的绘图。在您的解决方案中包含源代码和绘图。
修改一维热方程的 Fortran 程序，以使用您选择的时间步长方案求解二维热方程。您的程序应该在每个时间步保存字段，而不是将所有字段放在一个大的数组中。创建您运行的初始状态和最终状态的绘图。在您的解决方案中包含源代码和绘图。

注释

↑ Metcalf、Reid 和 Cohen (2009)
↑ 虽然 Matlab 是用 C 编写的，但它最初是用 Fortran 编写的，因此其风格与 Fortran 类似。
↑ Levesque 和 Wagenbreth (2009)
↑
对于许多计算问题，我们可以使用比生成结果所需的计算能力少 10-100 倍的计算能力来可视化结果，因此对于不太大的问题，使用 Matlab 等高级语言来后处理数据要容易得多。

参考文献

Levesque, J.; Wagenbreth, G. (2011). 高性能计算：编程与应用. CRC 出版社.

Metcalf, M.; Reid, J.; Cohen, M. (2011). 现代 Fortran 解释. 牛津大学出版社.

[1] Metcalf、Reid 和 Cohen (2009)

[2] 虽然 Matlab 是用 C 编写的，但它最初是用 Fortran 编写的，因此其风格与 Fortran 类似。

[3] Levesque 和 Wagenbreth (2009)

[4] 
对于许多计算问题，我们可以使用比生成结果所需的计算能力少 10-100 倍的计算能力来可视化结果，因此对于不太大的问题，使用 Matlab 等高级语言来后处理数据要容易得多。

[1]

[2]

[3]

[4]