Hi,
I encountered a few issues when trying to parallelize the execution of GAMSPy models which I wanted to share. Maybe someone knows a sensible workaround.
What I want to do: I have a large model containing a set of hourly time steps, e.g. covering one year. I have decomposed this model into 8760 smaller ones, one per hour, which are the sub-problems that remain after removing all time dependencies. Now I want to solve these models in parallel, ideally using only Python's standard library (concurrent.futures in my case). I cannot share my whole code, but a sketch of the parallelization part looks like this:
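import concurrent.futures
import multiprocessing

# ip and build are project modules that are not shown here
mi = ip.ModelInput(source_file)  # object containing all necessary data for the model

def f(t):
    mi_t = mi.view(t)  # creates a view of mi, essentially sampling the time-dependent data
    ct, m = build.build(mi_t)  # builds the GAMSPy model
    m.solve()
    return ct  # sent back to the parent process, so it must be picklable

def main():
    ts = list(range(8760))
    # one chunk of time steps per CPU core; map() requires chunksize >= 1
    chunksize = max(1, len(ts) // multiprocessing.cpu_count())
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = list(executor.map(f, ts, chunksize=chunksize))
    return results

if __name__ == "__main__":  # guard is required for process pools on Windows
    main()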
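As a side note on the chunksize used above: one chunk per core means a worker that finishes early has nothing left to pick up. A minimal sketch of an alternative, assuming a few chunks per worker is acceptable, is shown below. The helper name pick_chunksize and the factor of 4 are hypothetical choices for illustration, not something taken from GAMSPy or from my actual code.

```python
import os
from typing import Optional


def pick_chunksize(n_tasks: int,
                   n_workers: Optional[int] = None,
                   chunks_per_worker: int = 4) -> int:
    """Return a chunksize >= 1 that gives every worker several chunks.

    chunks_per_worker is an arbitrary tuning knob (assumption): a larger
    value means smaller chunks and better load balancing, at the cost of
    more inter-process communication.
    """
    if n_workers is None:
        # os.cpu_count() may return None, so fall back to a single worker
        n_workers = os.cpu_count() or 1
    # Executor.map() raises ValueError for chunksize < 1, hence the max()
    return max(1, n_tasks // (n_workers * chunks_per_worker))
```

The result would then be passed as executor.map(f, ts, chunksize=pick_chunksize(len(ts))).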
The problems I encountered:
- gamspy.Container is not picklable: Pickling is needed because the return value of f() is sent from the worker process back to the parent process. This was easily fixed by letting f() return only the individual records DataFrames that are of interest.
- The temporary files written by GAMSPy trigger the Windows “Antimalware Service Executable”, which dramatically slows down the execution: At least that’s my assumption, and it seems very likely (this service uses 30-50% CPU during execution, while the worker processes are at ~0.1%). This may also be due to my organization’s restrictive security policy, but it is a problem nonetheless. I have tried creating a local working_directory for each process, but the scanning rules apparently also apply there.
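To illustrate the fix for the first problem, here is a minimal sketch of the pattern: the worker extracts plain, picklable data and returns only that. FakeContainer, solve_one and is_picklable are hypothetical stand-ins so the example runs without GAMSPy. In the real code the returned payload would be the records DataFrames, which can be pickled.

```python
import pickle
import threading


class FakeContainer:
    """Hypothetical stand-in for an object that cannot be pickled."""

    def __init__(self, t):
        self._lock = threading.Lock()  # lock objects cannot be pickled
        # placeholder result data; real code would hold the solved model's records
        self.records = {"t": t, "objective": 2.0 * t}


def solve_one(t):
    """Worker function: build and 'solve', then return only plain data."""
    container = FakeContainer(t)
    # copy out the values of interest instead of returning the container
    return dict(container.records)


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, which is what the pool needs."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False
```

Here is_picklable(FakeContainer(0)) is False, while is_picklable(solve_one(0)) is True, so only the latter can be returned through ProcessPoolExecutor.map.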
I have the option to parallelize this differently on a server, but would prefer to run it locally. Is there another way to make this work on my local machine?
Cheers,
Gereon