Different solutions with MCP and NLP formulation

Hi
I am running the well-known dice problem from entropy econometrics, formulated both as an NLP and as an MCP model (Golan, Judge, Miller, Maximum Entropy Econometrics, Chapter 2). If I assume that the average of the dice throws is 2, both models give the same probability distribution. If I choose 1.5 as the average, the NLP and MCP solutions differ depending on the starting point. With arbitrary starting points, the MCP returns different solutions, reports redefined equations, and yields a lower entropy than the NLP solution. If I use the NLP solution as the starting point, the MCP returns the same solution as the NLP.
I don’t understand why I get an inferior solution with the MCP model. I am probably overlooking something.
Thanks
Renger

Here is my code:

set d Dice numbers/1*6/;

positive variables
    P(d) Probability;

variables    
    ENT     Entropy
    LAMBDA  Lagrange multiplier
    GAMMA;

parameters
    x(d) Dice values,
    y    Mean  /1.5/,
    delta small value /0.00001/;

x(d) = d.val;

equations
    obj      Objective
    c_mean   Mean constraint
    c_unit   Unity constraint,
    c_foc    First order constraints;

obj..
    ENT =E= -sum(d, (P(d)) * log(P(d)));

c_mean..
    y =E= sum(d, P(d) * x(d));

c_unit..
    1 =E= sum(d, P(d));

c_foc(d)..
    -log(P(d)) - 1 -  LAMBDA * x(d) - GAMMA =E= 0;


model  entropy /obj, c_mean, c_unit/;

model entropyfoc /c_foc.P, c_unit.GAMMA,  c_mean.LAMBDA/;
P.L(d)  = 0.1;
P.LO(d) = 0.00000001;
P.Up(d) = 1;

parameter results;

set ls /1/;   

loop(ls,
    y = 1 + ls.val * 0.5;
*   P.L(d) = 0.1;
    solve entropy using nlp maximizing ENT;
    results("NLP",ls,"Entropy") = -sum(d, P.L(d) * log(P.L(d)));
    results("NLP",ls, d) = P.L(d);
    results("NLP",ls, "Sum") = sum(d, P.L(d));
    results("NLP",ls, "Mean") = sum(d, x(d) * P.L(d));        
);

loop(ls,
    y = 1 + ls.val * 0.5;
* If you comment out the following line, you get the same solution
    P.L(d) = 0.2;    
    solve entropyfoc using mcp;
    results("MCP",ls,"Entropy") = -sum(d, P.L(d) * log(P.L(d)));
    results("MCP",ls, d) = P.L(d);
    results("MCP",ls, "Sum") = sum(d, P.L(d));
    results("MCP",ls, "Mean") = sum(d, x(d) * P.L(d));        
   );

display results;

Renger,

Essentially, things work as you were expecting if you use the negative of the objective derivative in the FOC. This is to be expected, since you have a maximization; with a minimization you use the derivative as is. So, change:

c_foc(d)..
    -log(P(d)) - 1 -  LAMBDA * x(d) - GAMMA =E= 0;

to

c_foc(d)..
    log(P(d)) + 1 -  LAMBDA * x(d) - GAMMA =E= 0;

and you are all set.
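
For reference, a short derivation of that sign, written for the equivalent minimization of the negative entropy. The symbols p_d, x_d, y, λ and γ stand for P(d), x(d), y, LAMBDA and GAMMA in the model; the Lagrangian sign convention used here is an assumption chosen to match the corrected c_foc.

```latex
\begin{aligned}
\min_{p}\quad & \sum_{d} p_d \log p_d
  \quad\text{s.t.}\quad \sum_{d} p_d x_d = y,\qquad \sum_{d} p_d = 1,\\
\mathcal{L}(p,\lambda,\gamma) &= \sum_{d} p_d \log p_d
  - \lambda\Big(\sum_{d} p_d x_d - y\Big)
  - \gamma\Big(\sum_{d} p_d - 1\Big),\\
\frac{\partial \mathcal{L}}{\partial p_d} &= \log p_d + 1 - \lambda x_d - \gamma = 0
  \qquad \text{for each } d.
\end{aligned}
```

The last line agrees term by term with log(P(d)) + 1 - LAMBDA * x(d) - GAMMA =E= 0.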

This is a great illustration of how things can get confusing when there are only equality constraints. In such a case, we still get a solution to the MCP if we take the primal solution values from the NLP (i.e. the P levels) and the opposite of the duals from the NLP. Since the duals are unconstrained (i.e. we have only equality constraints), this works: the primal values satisfy the constraints c_unit and c_mean in the FOC system, and to satisfy c_foc we use the negative of the NLP duals to compensate for using the objective gradient instead of its negative. So it seems like we have the correct FOC system, but it is not correct: it admits solutions that don't solve the NLP.

One good way to check for this is to load up the NLP solution (including the duals!) as the initial point for the MCP, and then run the MCP with iterlim=0. The solver returns a merit function value for the initial point, i.e. a measure of how close the initial point is to being a solution. This merit function is quite small with the adjusted c_foc. The code is attached.
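
A minimal sketch of this check (not the attached file), assuming it runs right after the NLP solve of model entropy and that c_foc already has the adjusted signs. Copying the marginals with the sign shown is an assumption about how the equations are normalized; if the merit function at the initial point is not small, try the opposite sign.

```gams
* P.L(d) still holds the NLP primal solution, so it is not reset here.
* Load the NLP duals into the MCP multiplier variables.
* The sign used here is an assumption; flip it if the residual is large.
LAMBDA.L = c_mean.M;
GAMMA.L  = c_unit.M;

* With a zero iteration limit the solver only evaluates the initial point.
entropyfoc.iterlim = 0;
solve entropyfoc using mcp;
```

The merit function (residual) for the initial point is then reported in the solver log.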
renger2.gms (1.72 KB)
-Steve