GLPK/Unix batch execution

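These scripts demonstrate how to run batch jobs from the Unix command line. For generality, the demonstration is split into two parts. The first part creates a set of problem instances; the second part runs them in a set of batch queues.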

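The first part consists of four scripts: tst1, tst2a, tst2b, and tst3. tst1 is an awk script that generates a set of points along a line, with errors in both the X and Y values. tst2a and tst2b wrap those points in the required MathProg data statements; tst2b also reorders the points randomly. tst3 calls tst1 and either tst2a or tst2b repeatedly, creating a set of data files in MathProg format.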

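These examples come from a Solaris system, where the awk(1) described in The AWK Programming Language by Aho, Weinberger, and Kernighan is called nawk. On other systems it is simply awk, and the earlier implementation of awk is called oawk (old awk). Many Linux systems use the GNU implementation, gawk. Some minor editing of the references to nawk may therefore be needed.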

With suitable minor adjustments, all of this will work on Solaris, Linux, FreeBSD, OpenBSD, and Mac OS X, and on Windows once a Unix environment such as Cygwin has been added.
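Because the awk binary goes by different names on different systems, a script can look up whichever implementation is installed instead of hard-coding nawk. The following is a minimal sketch, assuming a POSIX shell that provides `command -v`; the function name first_available and the variable AWK are made up for this illustration and do not appear in the tst scripts.

```shell
# first_available NAME...
# Print the first NAME that resolves to a command on this system.
# Returns non-zero (and prints nothing) if none of them is found.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Prefer nawk (Solaris), then gawk (many Linux systems), then plain awk.
AWK=$(first_available nawk gawk awk) || echo "no awk found" >&2
```

A script would then call "$AWK" wherever the examples below call nawk.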

#!/bin/nawk -f

#-----------------------------------------------------------------------
#  tst1
#
# generate the specified number of points on a line with random errors
# in the X & Y values
#-----------------------------------------------------------------------

BEGIN{

   srand();

   n = ARGV[1];
   ARGV[1] = "";

   for( i=0; i<n; i++ ){

      printf( "%d " ,i+1  );
      
      printf( "%f " ,i/3.0+rand() );
      printf( "%f " ,i/7.0+rand() );

      printf( "\n" );

   }
}

#!/bin/sh
#--------------------------------------------------------------
# tst2a
#
# This script uses Bourne shell here files to add the MathProg
# statements.
#--------------------------------------------------------------

# create the first part of the MathProg data file statements

cat <<EOF
data;

param : I :   x    y :=

EOF

# pass stdin through to stdout

cat 

# add the last part of the MathProg data file

cat <<EOF
;
end;
EOF

#!/bin/sh
#--------------------------------------------------------------
# tst2b
#
# This script uses Bourne shell here files to add the MathProg
# statements.  It then uses the Unix utilities, awk and sort to
# randomly reorder the points.
#--------------------------------------------------------------

# write out the first part of the MathProg data file

cat <<EOF
data;

param : I :   x    y :=

EOF

# randomly reorder the input

nawk 'BEGIN{srand();}{ print $0 ,rand()}' ${1}  \
| sort -k 4n | nawk '{print NR ,$2, $3}'

# write out the last part of the MathProg data file

cat <<EOF
;
end;
EOF

#!/bin/sh

#-----------------------------------------------------------------------
# tst3
#
# create a set of data files in MathProg format using tst1 to generate 
# the data and either tst2a or tst2b to add the required MathProg 
# statements.
#
# ./tst1 generates points on a line with random errors in the X & Y values
# ./tst2a just sets up the MathProg statements
# ./tst2b also randomly reorders the data using Unix command line 
# utilities.  
#
# tst3 takes two arguments specifying the number of points on the line 
# and the number of data files to create.  The sleep following tst2b 
# is to ensure that a new seed is used for each instance.
#-----------------------------------------------------------------------
 
if [ ${#} -ne 2 ]
   then

   echo "usage:"
   echo "./tst3 <npoints> <ninstances>"
   exit 1
fi

J=1

while [ ${J} -le ${2} ]
   do

   #./tst1 ${1}  | ./tst2a >${J}.dat
   ./tst1 ${1}  | ./tst2b >${J}.dat; sleep 1;

   J=`expr ${J} + 1`

done
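
The sleep in tst3 is there because awk's srand(), when called without an argument, seeds the generator from the time of day, typically with one-second resolution; two runs inside the same second would repeat the same sequence. An alternative, sketched here under the assumption that a plain awk is on the PATH, is to pass an explicit seed such as the loop counter. The function name shuffle_lines is hypothetical and is not part of the original scripts.

```shell
# shuffle_lines SEED
# Randomly reorder the lines read from stdin, seeding awk's generator
# explicitly so that the ordering is repeatable for a given SEED.
shuffle_lines() {
    # prefix every line with a random key, sort on it, then drop the key
    awk -v seed="$1" 'BEGIN { srand(seed) }
        { printf "%.6f %s\n", rand(), $0 }' |
    sort -n |
    cut -d ' ' -f 2-
}
```

Inside tst3, something like `./tst1 ${1} | shuffle_lines ${J}` would give each instance its own seed without the sleep; the result would still need the renumbering and the MathProg header and footer that tst2b adds.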

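tst4 collects the list of jobs by the file extension .dat, splits the list into 4 groups, and runs each group in a background queue. There are many ways to implement batch queues on Unix. This is a minimal example suited to a single researcher using a single multicore workstation. If you need to run a large number of long-running jobs on a cluster, consider a more full-featured queueing system.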

All console output from glpsol is directed to one log file per queue, named Q0.log, Q1.log, and so on.
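Whether a log file receives error messages as well as normal output depends on the order of the redirections: the shell processes them left to right, and 2>&1 duplicates whatever stdout points to at that moment. The sketch below illustrates this with a hypothetical function noisy standing in for glpsol; the file names are likewise only for illustration.

```shell
# noisy is a stand-in that writes one line to stdout and one to stderr.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# stdout is pointed at the file first, then stderr is made a copy of it:
# both lines end up in both.log.
( noisy ) >both.log 2>&1

# stderr is made a copy of the original stdout (e.g. the terminal) first,
# and only then is stdout pointed at the file: the file gets one line.
( noisy ) 2>&1 >stdout_only.log
```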

#!/bin/sh 

#-----------------------------------------------------------------------
# tst4
#
# This script runs a collection of jobs in a set of 4 parallel queues.
#
# This can be extended to as many cores in a multicore processor as
# you wish.  If you plan to run a very large number of jobs that will
# require significant time to complete it is suggested that you use 
# N-1 queues where N is the number of cores in your system.  This 
# will ensure that you have a core free for interactive use.
#
# The jobs are identified by the extension .dat, however, any naming
# will work.
#
# White space matters.  In particular, the "\" must be followed by
# a newline (aka linefeed).
#-----------------------------------------------------------------------

# get a list of all jobs in a temporary file

/bin/ls *.dat >/tmp/tmp.$$

# break the list into 4 sublists

Q0=`nawk 'NR%4==0' /tmp/tmp.$$`
Q1=`nawk 'NR%4==1' /tmp/tmp.$$`
Q2=`nawk 'NR%4==2' /tmp/tmp.$$`
Q3=`nawk 'NR%4==3' /tmp/tmp.$$`

# remove the temporary file

rm /tmp/tmp.$$

# fire off the queues by putting shell loops into the background;
# stdout is redirected to the log first so that 2>&1 sends stderr there too

(for I in ${Q0};                                       \
   do                                                  \
   glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
done ) >Q0.log 2>&1 &


(for I in ${Q1};                                       \
   do                                                  \
   glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
done ) >Q1.log 2>&1 &


(for I in ${Q2};                                       \
   do                                                  \
   glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
done ) >Q2.log 2>&1 &


(for I in ${Q3};                                       \
   do                                                  \
   glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
done ) >Q3.log 2>&1 &
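
The four near-identical loops in tst4 can also be generated by a loop over the queue number. The following is a sketch rather than a drop-in replacement: it assumes a POSIX shell with $(( )) arithmetic, which an old Bourne shell such as the legacy Solaris /bin/sh may lack, and the names run_queues and solve_one are hypothetical. Unlike tst4, it ends with wait, so it returns only after every queue has finished.

```shell
# run_queues NQ CMD FILE...
# Deal FILE... round-robin into NQ queues and run "CMD FILE" for each
# file, one background subshell per queue; queue k logs to Q<k>.log.
run_queues() {
    nq=$1
    cmd=$2
    shift 2
    k=0
    while [ "$k" -lt "$nq" ]; do
        (
            n=0
            for f in "$@"; do
                # this queue takes every nq-th file, starting at offset k
                if [ $((n % nq)) -eq "$k" ]; then
                    $cmd "$f"
                fi
                n=$((n + 1))
            done
        ) >"Q$k.log" 2>&1 &
        k=$((k + 1))
    done
    wait    # block until every background queue has finished
}

# Example use with the glpsol command line from tst4:
# solve_one() { glpsol -m tst.mod -d "$1" -o "${1}_log" -y "${1}_out"; }
# run_queues 4 solve_one *.dat
```

Passing the number of queues as an argument makes it easy to follow the advice in the tst4 header and use N-1 queues on an N-core machine.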

The MathProg model is a slightly modified version of cf12a.mod from the examples directory of the GLPK distribution.

# set of points

set I;

# independent variable

param x {i in I};

# dependent variable

param y {i in I};

# define equation variables

var a;

var b;

var u {i in I}, >= 0;

var v {i in I}, >= 0;

# define objective function

minimize error: sum {i in I} u[i] + sum {i in I} v[i];

# define equation constraint

s.t. equation {i in I} : b * x[i] + a + u[i] - v[i] = y[i];

solve;

printf "y = %.4fx + %.4f\n", b, a;

end;
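
Read as a formula, the model fits the line y = bx + a by least absolute deviations. The restatement below is an interpretation of the objective and constraint above, not text from the original example:

```latex
\min_{a,\,b,\,u,\,v} \; \sum_{i \in I} (u_i + v_i)
\quad \text{s.t.} \quad
b\,x_i + a + u_i - v_i = y_i, \qquad u_i \ge 0,\; v_i \ge 0 \quad (i \in I)
```

At an optimum at most one of u_i and v_i is non-zero for each point, so u_i + v_i equals |y_i - (b x_i + a)| and the objective is the sum of the absolute residuals.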