Should I use indexed assignments or loop statements in GAMS?
Indexed assignments within GAMS are done using simultaneous or parallel assignments. In most cases you don’t need explicit loops, an assignment is an implicit loop. The loop statement is necessary when parallel assignments are not sufficient - an example is provided below. However, a loop statement is not executed in parallel, but sequentially. This may slow down the execution of your code tremendously. Below are some examples:
A simple example
set i / i1*i100000 /;
parameter u(i);
* bad!
loop(i, u(i) = uniform(0,2));
* good
u(i) = uniform(0,2);
If you run this fragment with the profile option set to one, you’ll get something like:
---- 4 Loop 7.235 7.235 SECS 14 MB
---- 6 Assignment u 0.000 7.235 SECS 14 MB 100000
The execution of the loop statement was significantly slower than the parallel assignment! Thus parallel assignments should be used wherever possible.
Note: That “parallel” assignment that use the same symbol on the left and on the right creates a copy of the symbol for the use on the right. Hence the updated value on the left is not used when referencing on the right as shown in the next example. If this is required a loop statement must be used. Here is an example, which illustrates this:
set j /j1*j4/
parameters a(j);
a('j1')=1;
* wrong!
a(j)$(ord(j)>1)=a(j-1)+1;
display 'Wrong:',a;
* correct!
loop(j$(ord(j)>1), a(j)=a(j-1)+1);
display 'Correct:',a;
---- 6 Wrong:
---- 6 PARAMETER a
j1 1.000, j2 2.000, j3 1.000, j4 1.000
---- 9 Correct:
---- 9 PARAMETER a
j1 1.000, j2 2.000, j3 3.000, j4 4.000
Another example
Let’s define three sets, stores, products, and weeks. How does the following iteration work exactly?
found =0;
loop((stores, products, weeks) $ (condition1(weeks)
and condition2(stores)
and condition3(products)), found =1);
Assuming that we have 6000 stores, 20 products, and 100 weeks. And only one week, week1, satisfies condition1; only one store, store1, satisfies condition2; only one product, product1, satisfies condition3. Will the iteration go through ALL the stores, products, and weeks even though only one element satisfies all the conditions. Given the conditional control, will the dimensions of the sets have a big impact on run time?
GAMS tries to be efficient when it comes to restricting the domains it needs to search. There is some optimization in place, but with a general construct as you have it there, GAMS will go through all the tuples of the loop. You have a better chance when you do:
set w(weeks), s(stores), p(products);
w(weeks) = condition1(weeks);
s(stores) = condition2(stores);
p(products) = condition3(products);
found = card(w) and card(s) and card(p) // or card(w)*card(s)*card(p)
Even if GAMS can’t be smart about the individual conditions, it only walks them once card(weeks)+card(stores)+card(products) in the original loop you probably get * instead of +). If the conditions are simple and can be optimized by the GAMS compiler you get constant time. It is a little like code optimization in other language compilers (C, FORTRAN, etc).
A third example
Here we have an example, where we have to match elements from two different sets. We show different formulation, which are much faster than using a loop statement.
set k /k1*k90000/, t /t1*t10000/;
parameter a(k), b(t), bref(t);
a(k)=normal(0,1);
* expansive loop
loop(k, b(t)$(ord(t) eq ord(k)-card(t))=a(k));
bref(t) = b(t);
option clear=b;
* better, but not yet optimal
b(t) = sum(k$(ord(k)=ord(t)), a(k+card(t)));
abort$(smax(t,abs(b(t)-bref(t)))>1e-6) b,bref;
option clear=b;
* much faster using element matching introduced in distribution 227
set sk(k) 'shifted k', kt(k,t);
sk(k+card(t)) = yes;
option kt(sk:t);
b(t)=sum(k$kt(k,t),a(k));
abort$(smax(t,abs(b(t)-bref(t)))>1e-6) b,bref;
option clear=b, clear=kt;
* this formulation is more compact
option kt(k:t);
b(t)=sum(kt(k,t), a(k+card(t)));
abort$(smax(t,abs(b(t)-bref(t)))>1e-6) b,bref;
Running the code using the GAMS profiler (profile 1) yields:
---- 7 Loop 29.218 29.234 SECS 14 MB
---- 12 Assignment b 20.188 49.422 SECS 16 MB 10000
---- 18 Assignment sk 0.015 49.437 SECS 17 MB 80000
---- 20 Assignment b 0.000 49.437 SECS 21 MB 10000
---- 26 Assignment b 0.125 49.562 SECS 21 MB 10000
A fourth example
In this example, a large matrix with a small band of non-zeros off the diagonal is copied from the name space j,jj to a new parallel matching name space x,xx where ‘j’ and ‘x’ match perfectly. The band matrix (PDmatrix_jj) is created and the copied to the same matrix with different labels (PDmatrix_xx). This is done once with the combination of two j:x mappings (jx and jxx) in a loop statement and once with a sequence of parallel assignment utilizing a temporary 4-dimensional map symbol ‘j_jj_x_xx’:
set x / x1*x50000 /;
singleton set xone(x) / x1 /;
$eval xCard card(x)
set j /1*%xCard% /;
set jx(j,x) / #j:#x /;
alias (x,xx,xxx), (j,jj,jjj), (jx,jxx);
parameter
LTmatrix_jj(j,jj)
PDmatrix_jj(j,jj)
scalar LTwidth / 3 /;
$eval LTWIDTH LTwidth
set jsub(j) / 1*%LTWIDTH% /;
LTmatrix_jj(j,jj+(ord(j)-LTwidth))$[jsub(jj)] = normal(0,1);
PDmatrix_jj(j,jj+(ord(j)-LTwidth))$[jsub(jj)] = sum(jjj, LTmatrix_jj(j,jjj)*LTmatrix_jj(jj+(ord(j)-LTwidth),jjj));
Parameter
PDmatrix_xx1(x,xx)
PDmatrix_xx2(x,xx);
* Slow loop
loop((jx(j,x),jxx(jj,xx))$PDmatrix_jj(j,jj), PDmatrix_xx1(x,xx) = PDmatrix_jj(j,jj));
* Fast sequence of
set j_jj_x_xx(j,jj,x,xx), x_xx(x,xx);
j_jj_x_xx(j,jj,x+(ord(j)-1),x+(ord(jj)-1))$xone(x) = PDmatrix_jj(j,jj);
option x_xx<j_jj_x_xx;
PDmatrix_xx2(x_xx(x,xx)) = SUM(j_jj_x_xx(j,jj,x,xx), PDmatrix_jj(j,jj));
abort$(smax(x_xx, abs(PDmatrix_xx1(x_xx)-PDmatrix_xx2(x_xx)))>1e-6) 'bad calculation';
The relevant part of the GAMS profiler report yields:
---- 24 Loop 14.625 18.657 SECS 32 MB
---- 28 Assignment j_jj_x_xx 3.250 21.907 SECS 45 MB 149997
---- 29 Other 0.000 21.907 SECS 49 MB
---- 30 Assignment PDmatrix_xx2 0.093 22.000 SECS 69 MB 149997
So the loop is more than 4 times slower than the sequence of parallel assignment statements.