GLPK/Unix batch execution
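These scripts demonstrate running batch jobs from the Unix command line. For generality, the demonstration is divided into two parts. The first part creates a set of problem instances, which the second part then runs in a set of batch queues.
The first part consists of four scripts: tst1, tst2a, tst2b and tst3. tst1 is an awk script that generates a set of points on a line with errors in both the X and Y values. tst2a and tst2b wrap that output in MathProg data statements; tst2b also randomly reorders the points. tst3 invokes tst1 and either tst2a or tst2b multiple times, thereby creating a set of data files in MathProg format.
These examples come from a Solaris system, where the awk(1) described in The AWK Programming Language by Aho, Weinberger and Kernighan is called nawk. On other systems it is simply awk, and the earlier implementation is called oawk (old awk). Many Linux systems use the GNU implementation, gawk. Some minor editing of the references to nawk may therefore be required.
With suitable minor tweaking, all of this will work on Solaris, Linux, FreeBSD, OpenBSD, MacOS X, and on Windows with the addition of a Unix environment such as Cygwin.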
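Because the interpreter name differs between systems, one way to avoid editing every nawk reference by hand is to look up the first available implementation at run time. The following is a minimal sketch and not part of the original scripts; the variable name AWK and the search order are assumptions.

```shell
#!/bin/sh
# Pick the first awk implementation found on the PATH.
# AWK is a hypothetical variable name; the search order is an assumption.
AWK=""
for CANDIDATE in nawk gawk mawk awk
do
    if command -v "${CANDIDATE}" >/dev/null 2>&1
    then
        AWK="${CANDIDATE}"
        break
    fi
done
if [ -z "${AWK}" ]
then
    echo "no awk implementation found" >&2
    exit 1
fi
echo "using ${AWK}"
```

A script that has set AWK this way could then call "${AWK}" wherever the examples call nawk. The interpreter line of tst1 cannot use a variable, so tst1 would either need its first line edited or could be run as "${AWK}" -f tst1 20.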
#!/bin/nawk -f
#-----------------------------------------------------------------------
# tst1
#
# generate the specified number of points on a line with random errors
# in the X & Y values
#-----------------------------------------------------------------------
BEGIN{
    srand();
    n = ARGV[1];
    ARGV[1] = "";
    for( i=0; i<n; i++ ){
        printf( "%d " ,i+1 );
        printf( "%f " ,i/3.0+rand() );
        printf( "%f " ,i/7.0+rand() );
        printf( "\n" );
    }
}
#!/bin/sh
#--------------------------------------------------------------
# tst2a
#
# This script uses Bourne shell here files to add the MathProg
# statements.
#--------------------------------------------------------------
# create the first part of the MathProg data file statements
cat <<EOF
data;
param : I : x y :=
EOF
# pass stdin through to stdout
cat
# add the last part of the MathProg data file
cat <<EOF
;
end;
EOF
#!/bin/sh
#--------------------------------------------------------------
# tst2b
#
# This script uses Bourne shell here files to add the MathProg
# statements. It then uses the Unix utilities, awk and sort to
# randomly reorder the points.
#--------------------------------------------------------------
# write out the first part of the MathProg data file
cat <<EOF
data;
param : I : x y :=
EOF
# randomly reorder the input
nawk 'BEGIN{srand();}{ print $0 ,rand()}' ${1} \
| sort -k 4n | nawk '{print NR ,$2, $3}'
# write out the last part of the MathProg data file
cat <<EOF
;
end;
EOF
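The reordering step in tst2b is a decorate, sort, undecorate pipeline: a random key is appended as a fourth field, the lines are sorted numerically on that key, and the key is then dropped while the points are renumbered. The sketch below runs the same pipeline on three hand-written points, which are made up for illustration, and calls awk rather than nawk, which is an assumption about the local system.

```shell
#!/bin/sh
# three sample points in the format produced by tst1: index, x, y
POINTS='1 0.10 0.20
2 0.45 0.30
3 0.80 0.55'

# decorate each line with a random key, sort on the key (field 4),
# then print a fresh index followed by the original x and y
SHUFFLED=$(echo "${POINTS}" \
    | awk 'BEGIN{srand();}{ print $0 ,rand() }' \
    | sort -k 4n \
    | awk '{ print NR ,$2, $3 }')

echo "${SHUFFLED}"
```

The output order changes from run to run, but the set of (x, y) pairs is unchanged and the first column always counts from 1, which is what the MathProg data section needs.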
#!/bin/sh
#-----------------------------------------------------------------------
# tst3
#
# create a set of data files in MathProg format using tst1 to generate
# the data and either tst2a or tst2b to add the required MathProg
# statements.
#
# ./tst1 generates points on a line with random errors in the X & Y values
# ./tst2a just sets up the MathProg statements
# ./tst2b also randomly reorders the data using Unix command line
# utilities.
#
# tst3 takes two arguments specifying the number of points on the line
# and the number of data files to create. The sleep following tst2b
# is to ensure that a new seed is used for each instance.
#-----------------------------------------------------------------------
if [ ${#} -ne 2 ]
then
    echo "usage:"
    echo "./tst3 <npoints> <ninstances>"
    exit 1
fi
J=1
while [ ${J} -le ${2} ]
do
    #./tst1 ${1} | ./tst2a >${J}.dat
    ./tst1 ${1} | ./tst2b >${J}.dat; sleep 1;
    J=`expr ${J} + 1`
done
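The sleep is needed because srand() with no argument seeds the generator from the time of day, which in typical awk implementations changes only once per second. One possible alternative, sketched below on the assumption that reproducible data sets are acceptable, is to pass an explicit seed to awk with -v. The helper name gen_points, the variable names, and the use of awk rather than nawk are illustrative choices, not part of the original scripts.

```shell
#!/bin/sh
# gen_points is a hypothetical helper: a seeded variant of tst1.
# usage: gen_points <npoints> <seed>
gen_points() {
    awk -v n="${1}" -v seed="${2}" 'BEGIN{
        srand(seed);
        for( i=0; i<n; i++ ){
            printf( "%d %f %f\n" ,i+1 ,i/3.0+rand() ,i/7.0+rand() );
        }
    }'
}

# the same seed reproduces the same points; a different seed does not
A=$(gen_points 5 1)
B=$(gen_points 5 1)
C=$(gen_points 5 2)
echo "${A}"
```

In a loop like the one in tst3, the counter J could serve as the seed, for example gen_points ${1} ${J} | ./tst2a >${J}.dat. Because tst2b also calls srand() without an argument, it would need the same treatment before the sleep could be dropped there.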
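tst4 builds a list of jobs from the files with the extension .dat, splits the list into 4 groups, and runs each group in a background queue. There are many options for implementing batch queues on Unix. This is a minimal example suited to the needs of a single researcher with a single multicore workstation. If you need to run large numbers of long-running jobs on a cluster, you should consider a more comprehensive queuing system.
All console output from glpsol is directed to one log file per queue, named Q0.log, Q1.log, and so on.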
#!/bin/sh
#-----------------------------------------------------------------------
# tst4
#
# This script runs a collection of jobs in a set of 4 parallel queues.
#
# This can be extended to as many cores in a multicore processor as
# you wish. If you plan to run a very large number of jobs that will
# require significant time to complete it is suggested that you use
# N-1 queues where N is the number of cores in your system. This
# will ensure that you have a core free for interactive use.
#
# The jobs are identified by the extension .dat, however, any naming
# will work.
#
# White space matters. In particular, the "\" must be followed by
# a newline (aka linefeed).
#-----------------------------------------------------------------------
# get a list of all jobs in a temporary file
/bin/ls *.dat >/tmp/tmp.$$
# break the list into 4 sublists
Q0=`nawk 'NR%4==0' /tmp/tmp.$$`
Q1=`nawk 'NR%4==1' /tmp/tmp.$$`
Q2=`nawk 'NR%4==2' /tmp/tmp.$$`
Q3=`nawk 'NR%4==3' /tmp/tmp.$$`
# remove the temporary file
rm /tmp/tmp.$$
# fire off the queues by putting shell loops into the background
# stdout is sent to the log first, then stderr is pointed at the same file
(for I in ${Q0}; \
 do \
     glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
 done ) >Q0.log 2>&1 &
(for I in ${Q1}; \
 do \
     glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
 done ) >Q1.log 2>&1 &
(for I in ${Q2}; \
 do \
     glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
 done ) >Q2.log 2>&1 &
(for I in ${Q3}; \
 do \
     glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
 done ) >Q3.log 2>&1 &
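The four near-identical loops can also be generated by a loop over the queue number, followed by a wait so that the script returns only after every queue has finished. This is a sketch rather than the original author's script: NQUEUES, run_job, LIST and the demo file names are hypothetical, and run_job is a stand-in that only echoes the job name so that the example runs without glpsol installed.

```shell
#!/bin/sh
NQUEUES=3                      # hypothetical; e.g. number of cores minus one

# stand-in for: glpsol -m tst.mod -d ${1} -o ${1}_log -y ${1}_out
run_job() {
    echo "solved ${1}"
}

# work in a scratch directory holding seven empty demo job files
# (remove ${WORK} by hand when finished)
WORK=$(mktemp -d)
cd "${WORK}" || exit 1
for N in 1 2 3 4 5 6 7
do
    : >"${N}.dat"
done

# start one background loop per queue; line K of the listing goes to
# queue K mod NQUEUES, as in tst4
Q=0
while [ ${Q} -lt ${NQUEUES} ]
do
    LIST=$(ls *.dat | awk -v n="${NQUEUES}" -v q="${Q}" 'NR%n==q')
    (for I in ${LIST}
     do
         run_job "${I}"
     done) >"Q${Q}.log" 2>&1 &
    Q=$(expr ${Q} + 1)
done

# block until every background queue has finished
wait
cat Q*.log
```

Replacing the body of run_job with the glpsol command line from tst4 would turn the sketch into a variant of that script with a configurable number of queues.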
The MathProg model is a minor modification of cf12a.mod from the examples directory of the GLPK distribution.
# set of points
set I;
# independent variable
param x {i in I};
# dependent variable
param y {i in I};
# define equation variables
var a;
var b;
var u {i in I}, >= 0;
var v {i in I}, >= 0;
# define objective function
minimize error: sum {i in I} u[i] + sum {i in I} v[i];
# define equation constraint
s.t. equation {i in I} : b * x[i] + a + u[i] - v[i] = y[i];
solve;
printf "y = %.4fx + %.4f\n", b, a;
end;
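Written out, the model is a linear program that fits a line by least absolute deviations. The notation below follows the model's own names; the residual symbol r_i is introduced here only for the explanation.

```latex
\begin{aligned}
\min_{a,\,b,\,u,\,v}\quad & \sum_{i \in I} u_i + \sum_{i \in I} v_i \\
\text{s.t.}\quad & b\,x_i + a + u_i - v_i = y_i, \qquad i \in I, \\
& u_i \ge 0,\; v_i \ge 0, \qquad i \in I.
\end{aligned}
```

With r_i = y_i - (b x_i + a), each constraint reads u_i - v_i = r_i. At an optimum at most one of u_i and v_i is nonzero, so u_i + v_i = |r_i| and the objective equals the sum of the absolute residuals of the fitted line y = b x + a.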