Individual LP v.s. Integrated LP in GAMS

Hi all,

I have found an interesting characteristics in GAMS that it solves multiple simple LPs much slower than solving the integrated version of them. Here I show one illustrating example codes.

I defined 27 simpe LP problems. Firstly I solved them individually with respect to the indices. Then I solved the integrated version combining all of them.

Set
    x /x1*x3/
    y /y1*y3/
    z /z1*z3/;

Parameter        d1(x,y,z)  /
         x1.(y1*y3).(z1*z3) =      2.9
         x2.(y1*y3).(z1*z3) =      3
         x3.(y1*y3).(z1*z3) =      3.1
/;

Parameter        d2(x,y,z)  /
         (x1*x3).y1.(z1*z3) =      -4.8
         (x1*x3).y2.(z1*z3) =      -5
         (x1*x3).y3.(z1*z3) =      -5.2
/;

Parameter        d3(x,y,z)  /
         (x1*x3).(y1*y3).z1 =      0.8
         (x1*x3).(y1*y3).z2 =      1.0
         (x1*x3).(y1*y3).z3 =      1.2
/;

********* Individual Problems *************

Positive Variable
         x1;

Negative Variable
         x2;

Free Variable
         obj1
         x3;

Scalars
         dd1
         dd2
         dd3;

Equations
         objj1
         eq11
         eq12
         eq13
         eq14;

objj1..   obj1 =e= 5 * x1 + 4 * x2 + 6 * x3;
eq11..    x1 + 2 * x2 =g= 2;

eq12..    x1 + x3 =l= dd1;
eq13..    -3 * x1 + 2 * x2 + x3 =l= dd2;
eq14..    x1 - x2 + x3 =e= dd3;

Model sample1 / objj1, eq11, eq12, eq13, eq14 /;

Parameter
         store(x,y,z);

Scalar
         storesum;

Loop(x,
         Loop(y,
                 Loop(z,
                         dd1 = d1(x,y,z);
                         dd2 = d2(x,y,z);
                         dd3 = d3(x,y,z);
                         Solve sample1 maximizing obj1 using lp;
                         store(x,y,z) = obj1.l
                     );
         );
);

storesum = sum(x, sum(y, sum(z, store(x,y,z))));
display storesum;

*********  Integrated Problem *************

Positive Variable
         x12(x,y,z);

Negative Variable
         x22(x,y,z);

Free Variable
         obj2
         x32(x,y,z);

Equations
         objj2
         eq21
         eq22
         eq23
         eq24;

objj2..   obj2 =e= sum(x, sum(y, sum(z, 5 * x12(x,y,z) + 4 * x22(x,y,z) + 6 * x32(x,y,z))));
eq21(x,y,z)..    x12(x,y,z) + 2 * x22(x,y,z) =g= 2;

eq22(x,y,z)..    x12(x,y,z) + x32(x,y,z) =l= d1(x,y,z);
eq23(x,y,z)..    -3 * x12(x,y,z) + 2 * x22(x,y,z) + x32(x,y,z) =l= d2(x,y,z);
eq24(x,y,z)..    x12(x,y,z) - x22(x,y,z) + x32(x,y,z) =e= d3(x,y,z);

Model sample2 / objj2, eq21, eq22, eq23, eq24 /;

Solve sample2 maximizing obj2 using lp;
display obj2.l;

The elapse time of the first one was 7.982s, and the second one was only 0.042s. At the beginning I was considering that it might be due to the window updates and summary show-up took the majority of time, but it didn’t make sense because elapse time would not be affected by them.

If the circumstance changes to millions of LPs, solving such a large-scale LP with millions of variables and millions of constraints is exceedingly time consuming, which triggers the development of decomposition techniques like Benders and Dantzig-Wolfe. Those decomposition techniques basically are to decompose the large-scale LP to multiple individual LPs (they called subproblems) to reduce computational time. But the GAMS elapse time result here does not support this fact.

Does anyone have any idea about this? I guess it should be a way to do something like parallel computing, but I am a newbie to GAMS. Thanks in advance!

Gabriel

Hi,

why did you discard your first hypothesis ? If you look at the .lst file, you can see that the time is in fact spent in GAMS “execution” and “generation”, rather than in the actual “solve” time (which is detailed as “resource use” of each solve. Check: https://support.gams.com/gams:what_is_the_meaning_of_compilation_time_generation_time_and_execution_time).

This not only involves window updates and file writing, also, by looking at the .lst one gets the idea the GAMS is mostly “rebuilding” the model object and passing it again to the solver, as opposite to making low level calls to the solver API modifying coefficients (which is what you would do in an efficiency oriented optimization set up) (Although, I’m not sure of how much “rebuilding” is done, official GAMS experts can better confirm/discard/explain this). I believe GAMS has GUSS as a framework to efficiently deal with this kind of situations.

Hope to help your understanding and, if someone “official” chimes in, mine as well.

Best!
Claudio

Make sense! Thank you! Then the execution time is that solving the problem actually takes. Besides, is it any way to show the sum of CPU time of solving each simple LP? In .lst file, the execution time of each LP always displays 0.000…

The issue may not be so simple, look at this post by the great Michael Bussieck: https://newforum.gams.com/t/cplex-time-vs-resusd/1939/1

But, as far as I understand, no, “Execution” in this context means GAMS execution, not the solver. From the page I linked:

This is the time required for the second (execution) pass. Time consuming factors can be heavy numerical calculations on existing data and the use of “long” loops. Parallel assignments should be used wherever possible. The results of calculations and also execution errors (if any) are written to the lst file or othere external files

Also:

To get the time required to solve the model, you want to look at resource usage by the solver. This appears in the solve summary, just search for S O L V E. > The model attribute model_name.resusd > makes this information available within your model. Note that this time is not included in the GAMS execution time.

So, in your example, one can have:

Set
    x /x1*x3/
    y /y1*y3/
    z /z1*z3/;

Parameter        d1(x,y,z)  /
         x1.(y1*y3).(z1*z3) =      2.9
         x2.(y1*y3).(z1*z3) =      3
         x3.(y1*y3).(z1*z3) =      3.1
/;

Parameter        d2(x,y,z)  /
         (x1*x3).y1.(z1*z3) =      -4.8
         (x1*x3).y2.(z1*z3) =      -5
         (x1*x3).y3.(z1*z3) =      -5.2
/;

Parameter        d3(x,y,z)  /
         (x1*x3).(y1*y3).z1 =      0.8
         (x1*x3).(y1*y3).z2 =      1.0
         (x1*x3).(y1*y3).z3 =      1.2
/;

********* Individual Problems *************

Positive Variable
         x1;

Negative Variable
         x2;

Free Variable
         obj1
         x3;

Scalars
         dd1
         dd2
         dd3;

Equations
         objj1
         eq11
         eq12
         eq13
         eq14;

objj1..   obj1 =e= 5 * x1 + 4 * x2 + 6 * x3;
eq11..    x1 + 2 * x2 =g= 2;

eq12..    x1 + x3 =l= dd1;
eq13..    -3 * x1 + 2 * x2 + x3 =l= dd2;
eq14..    x1 - x2 + x3 =e= dd3;

Model sample1 / objj1, eq11, eq12, eq13, eq14 /;

Parameter
         store(x,y,z);

Scalar
         storesum;

set allLPs /lp1*lp28/;
parameter timeTook(allLPs);
scalar k /1/;

Loop(x,
         Loop(y,
                 Loop(z,
                         dd1 = d1(x,y,z);
                         dd2 = d2(x,y,z);
                         dd3 = d3(x,y,z);
                         Solve sample1 maximizing obj1 using lp;
                         store(x,y,z) = obj1.l;
                         timeTook(allLPs)$(ord(allLPs)=k)=sample1.resusd;
                         k = k+1;
                     );
         );
);

storesum = sum(x, sum(y, sum(z, store(x,y,z))));
display storesum;

*********  Integrated Problem *************

Positive Variable
         x12(x,y,z);

Negative Variable
         x22(x,y,z);

Free Variable
         obj2
         x32(x,y,z);

Equations
         objj2
         eq21
         eq22
         eq23
         eq24;

objj2..   obj2 =e= sum(x, sum(y, sum(z, 5 * x12(x,y,z) + 4 * x22(x,y,z) + 6 * x32(x,y,z))));
eq21(x,y,z)..    x12(x,y,z) + 2 * x22(x,y,z) =g= 2;

eq22(x,y,z)..    x12(x,y,z) + x32(x,y,z) =l= d1(x,y,z);
eq23(x,y,z)..    -3 * x12(x,y,z) + 2 * x22(x,y,z) + x32(x,y,z) =l= d2(x,y,z);
eq24(x,y,z)..    x12(x,y,z) - x22(x,y,z) + x32(x,y,z) =e= d3(x,y,z);

Model sample2 / objj2, eq21, eq22, eq23, eq24 /;

Solve sample2 maximizing obj2 using lp;

timeTook('lp28')=sample2.resusd;

display obj2.l,timeTook;

Looking at its output, it appears as there is somehow a non detection limit of time below 0.015/6, and time being reported in multiples of it, because in some problems (at least in my computer) the “resource usage” of the solver is effectively zero CPU seconds. I don’t know if there is any way to go “below” this detection limit.

Best

I got your point. Nevertheless here is still some way to let the user know the time consumption. This is just a tiny example and it should have been considered meaningless by GAMS system to show its time. I’ll try a bigger one with millions of rows/columns to check the time. Thank you so much!

Best,
Gabriel

Gabriel,

Please find a few comments below.

  1. Even if you would have found one meaningful example (sorry that I doubt you did) where the solution of multiple simple LPs takes longer than the integrated large LP, deriving a general rule from a single observation seems like a hasty conclusion.

  2. All LPs (including the large one) in this experiment are trivial and solved during presolve.

  3. It seems that the small problems are not connected at all in the integrated LP. Typically, in decomposition approaches the underlying problems have a certain structure. For example, there might be a block angular matrix where the blocks are linked via so called linking constraints and/or linking variables. The fact that the blocks are linked, makes the large problem difficult. If you combine 27 trivial problems without any additional restrictions they may very well result in a bigger but still trivial problem.

Decomposition methods can be very powerful and from my perspective this experiment does not prove the opposite! However, just to avoid misunderstandings, of course not every decomposition approach leads automatically to a speedup in solution time even though it might be correct in the mathematical sense. Implementing efficient decomposition methods is a non-trivial task.

Modelname.resusd is the time the solver reports that it has spent (in wall time). If you want GAMS to monitor the (wall) time the solver takes, you can access modelname.etsolver. If you want the (wall) time the entire solve statement including model generation and reading back the solution takes, you can look at modelname.etsolve.

I hope this helps!

Best,
Fred

Thank you Fred! I know what you mean. The experiment I made here is totally meaningless. I should test a 2-stage problem which makes the identical first-stage variable the complicating variable which connects the 27 second-stage linear programs together. Then I can test the CPU time that Bender’s decomposition takes with the time that solving the large problem takes. That would make more sense. Thank you for your kind suggestion! I really appreciate that. Once I have time to make the experiment I will post that here.

But one thing still haunts me that if the trivial large-scale program has millions/billions of variables and constraints, the computational time would still be horrible, based on the best of my knowledge. I will test it in GAMS too.

Best,
Gabriel