How to Augment existing pre-defined sets

Hello,

I have been working on a project in GAMS and ran into an issue that I was hoping dynamic sets would help me deal with:

1.) I have a .gdx file of data for four regions, indexed A, B, C and D. Within each region there are 3 groups (household types, e.g. 3 income categories): A1,A2,A3, B1,B2,B3, C1,…
2.) The .gdx data, in a simplified form, looks like this:
income("","","A1")=10;
income("","","A2")=20;

Where the last element indexes household type.

So I have one row in the .gdx for each variable (income, expenditure, etc.), indexed over a designated household set “h” (so income("","",h), where h contains A1-D3).

3.) When I try to run the model, it uses the entire h set, which is /A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, D3/; so it runs the model with all 12 household groups (which is not what I want)

4.) Instead, I want to run the model via a selection mechanism, say id=1 for region A, id=2 for region B, and so on. So when I tell GAMS id=1, it only runs the model for A1, A2 and A3.

My question is as follows: is there a way to tell GAMS to drop elements of the “h” set (12 elements), depending on some external parameter? I tried using dynamic sets, but that doesn’t let me selectively read in a subset of “h”, because it is a pre-defined set (inside the .gdx data file). In a nutshell, I’m reading in data for 12 different agents (set “h”), but I only want to use 3 of them at a time in the model. Is there a good way to modify pre-existing sets that were read in from a .gdx file? Thank you!

Some details are not clear. A small working example is often better than a long text. Anyhow, if I understand correctly, you could use the GDX filtered load facility. For this you would need to define the households you want to use in your source:

set h / A1*A3 /;
Parameter income(*,*,h);
$gdxin input
$load income

With this, GAMS will filter out all records of income that don’t belong to h.

Now you can combine this with some GAMS compile time magic and double dash command line parameters:

$if not set ID $set ID A
set h / %ID%1*%ID%3 /;
Parameter income(*,*,h);
$gdxin input
$load income

Now you call gams with “gams mymodel --ID=B” and it will include the household groups B1*B3.
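
To make the substitution concrete, here is a small Python sketch (not GAMS, and not part of the model) of what the set definition ends up containing for a given --ID value. The function name and the default range 1 to 3 are assumptions taken from the example above.

```python
def expand_household_set(region_id: str, first: int = 1, last: int = 3) -> list[str]:
    """Mimic what `set h / %ID%1*%ID%3 /` contains once %ID% is replaced."""
    # GAMS substitutes %ID% textually at compile time; the asterisk then
    # expands into the consecutive labels between the two end points.
    return [f"{region_id}{n}" for n in range(first, last + 1)]


print(expand_household_set("B"))  # ['B1', 'B2', 'B3']
```

Note that this only works when the number of households per region is known up front, which is the limitation that comes up later in the thread.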

-Michael

Hello Michael,

Thanks so much for the assistance! Yes that makes sense to me, I didn’t realize I could pre-define the set as a filter; this is an elegant solution, as opposed to other methods I have been trying. However, I’d still like to compare and contrast a few different solutions (ideally not involving writing double-dash variables) to see what options I have.

Would there be a possible solution in which I read in the entire .gdx for all my groups:

Sets
h "household sets" 
income(h) "income";

Parameter data(*,*,h);

$gdxin data_file.gdx
$load H INCOME DATA;

This should load the entire data file (so set “h” is /A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, D3/). Then I want to modify the set “h” and have it drop B1-D3 (and with it all the .gdx data currently stored under those “h” elements). In other words, once I’ve read in all my groups, is there a way to drop elements from a set? (I was thinking of using embedded Python, but that’s a language I’m hugely unfamiliar with.) Thanks in advance if you can provide any tips!

Best,
HH

You can’t kick anything out of a set that is used as a domain. You would have to work with a dynamic set:

Sets
h "household sets", hh(h) 
incomeAll(h) "income", income(h);

* dataAll receives the full GDX content; data keeps only the selected households
Parameter dataAll(*,*,h), data(*,*,h);

$gdxin data_file.gdx
$load H incomeAll=income dataAll=data;

* How do you want to tell the code from outside what households to keep???
hh('A1') = yes;
hh('A2') = yes;
hh('A3') = yes;

* Copy only the selected households, then drop the full copies
income(hh) = incomeAll(hh);
alias (*,u1,u2);
data(u1,u2,hh) = dataAll(u1,u2,hh);
option clear=incomeAll, clear=dataAll;

The double-dash parameters are a way to parameterize the code. How do you want to tell the code to use either A1*A3 or B1*B3?

-Michael

I have the code that generates the .gdx, so I was hoping to pass one more input to it (e.g., region==A/B/C/D) and then, once the data is loaded, use that region variable to drop the elements of the “h” set that don’t belong. However, since I can’t kick items out of a set that is used as a domain, that might be an issue. Would I be able to redefine the set using embedded Python? Thanks again! (I should note that the reason I want to avoid making “h” a dynamic set is that later I need to define parameters with an “h” argument, which doesn’t seem possible when “h” is declared as a dynamic set.)

If you have control over the code that creates the GDX file, why not create the set h in the GDX file with only the households you want, even if you write data for all the households? If you would like to keep h as is, make another set (just as you propose to create another “region variable”) with the households you want and load it with $load h=h_small.

It seems the actual trouble you have is that you can’t easily create this set because you know the region, e.g. “A” but you don’t know how many actual households i.e. A1, A2, A3, … there are without going through the data (which you try to avoid in the creation of the GDX file). The right way of doing this would have been to not slap the region and the household number into one label but keep them apart:

Sets
r "region" 
h "household sets" 
income(r,h) "income";

Parameter data(*,*,r,h);

$gdxin data_file.gdx
$load R H INCOME DATA;

If things were laid out like this, you would just have to write a single region “A” to set r and the filtered loading would have done what you want. Now you need to take the label apart. GAMS does not deal well with label modifications. That’s why we have something like embedded Python. This can usually be avoided by using a proper relational data model that does not put semantics into the content of the data (r=A,h=1 versus h=A1).
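
As a rough illustration of “taking the label apart”, the Python sketch below splits combined labels such as A1 into separate (region, household) pairs, which is the relational layout described above. It assumes every label is a run of letters followed by a run of digits; the helper name split_label and the sample labels are made up for illustration.

```python
import re

# Assumption: a label is letters (the region) followed by digits (the household number).
LABEL = re.compile(r"([A-Za-z]+)(\d+)")


def split_label(label: str) -> tuple[str, str]:
    """Split a combined label such as 'A1' into ('A', '1')."""
    match = LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"label {label!r} is not of the form <letters><digits>")
    return match.group(1), match.group(2)


combined = ["A1", "A2", "B1", "C12"]
pairs = [split_label(label) for label in combined]
regions = sorted({region for region, _ in pairs})

print(pairs)    # [('A', '1'), ('A', '2'), ('B', '1'), ('C', '12')]
print(regions)  # ['A', 'B', 'C']
```

With the data stored as (region, household) pairs from the start, no such parsing step would be needed.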

Here is some code that creates the GDX file used by the next model:

set h / A1*A6, B1*B3, C1*C12 /;
set i /i1*i10/, j /j1*j5/;
parameter data(i,j,h);
data(i,j,h) = uniform(0,1);
set r / A /;
execute_unload 'data_file.gdx';

Here is the code that creates the domain set h based on data semantics (the labels of hAll (the original h) that start with the name of the (first) element in r):

set hAll, h;
parameter data(*,*,h);
set r;
$gdxin data_file
$load hAll=h r
$onEmbeddedCode Python:
hAll = list(gams.get('hAll'))
r = list(gams.get('r'))[0]
h = [ i for i in hAll if i.startswith(r)]
gams.set('h',h)
$offEmbeddedCode h
$load data
display h, data

The display shows:

----     13 SET h  
A1,    A2,    A3,    A4,    A5,    A6

----     13 PARAMETER data  
                A1          A2          A3          A4          A5          A6
i1 .j1       0.172       0.843       0.550       0.301       0.292       0.224
i1 .j2       0.351       0.131       0.150       0.589       0.831       0.231
i1 .j3       0.314       0.047       0.339       0.182       0.646       0.561
i1 .j4       0.750       0.178       0.034       0.585       0.621       0.389
i1 .j5       0.151       0.174       0.331       0.317       0.322       0.964
i2 .j1       0.226       0.396       0.276       0.152       0.936       0.423
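
For readers less familiar with Python, the selection logic inside the embedded block can be tried on its own, without GAMS. The sketch below is only an approximation of that step: plain lists stand in for what gams.get returns, the function name select_households is an assumption, and it accepts several regions rather than just the first element of r.

```python
def select_households(all_households, regions):
    """Keep the labels that start with one of the given region names."""
    # str.startswith accepts a tuple and matches if any prefix matches.
    prefixes = tuple(regions)
    return [h for h in all_households if h.startswith(prefixes)]


# Same labels as the generated GDX file: A1*A6, B1*B3, C1*C12
h_all = (
    [f"A{n}" for n in range(1, 7)]
    + [f"B{n}" for n in range(1, 4)]
    + [f"C{n}" for n in range(1, 13)]
)

h = select_households(h_all, ["A"])
print(h)  # ['A1', 'A2', 'A3', 'A4', 'A5', 'A6']
```

One caveat of prefix matching: if region names could share a prefix (say A and AB), selecting A would also pick up the AB households, so this relies on region names like the single letters used here.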

-Michael

This is helpful, thanks for sharing.

Hello Michael,

Thanks again for the pointers! The .gdx is laid out this way because my collaborator generated the file that way (though I have input and will likely switch to your suggestions, or simply generate 4 .gdx files).

The Python code example looks very useful! If I understand correctly, the “r” set (which is in the .gdx) is the control set for the regions, so I can generate an “r” list and use Python to redefine the “h” set. That should be a workable solution! Thanks again for all your pointers, this was very helpful.

Best,
HH