I would like to ask whether anyone has tried to implement an ARIMA model (or other time series methods) in GAMS.
I have not found any topics related to this model.

My main points are:

how to determine how many lags need to be taken into account in the model,

how to determine the order of differencing of the time series (I have data from 2000 to 2018; the last historical year is 2018),

the variable I want to forecast has at least 4-5 dimensions.
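For the first two points, the usual diagnostics are the sample autocorrelation function (for the lag order) and repeated differencing until the series looks stationary (for the differencing order). A minimal numpy sketch of both, on made-up annual data (in R, the forecast package has helpers such as Acf and ndiffs for exactly this):

```python
import numpy as np

def difference(y, d=1):
    """Apply d-th order differencing to a 1-D series."""
    for _ in range(d):
        y = y[1:] - y[:-1]
    return y

def acf(y, nlags):
    """Sample autocorrelation function up to nlags."""
    y = y - y.mean()
    denom = np.dot(y, y)
    return np.array([np.dot(y[k:], y[:-k or None]) / denom
                     for k in range(nlags + 1)])

# Made-up trending series, 2000-2018 (19 annual observations).
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(1.0, 0.5, size=19))

dy = difference(y, d=1)           # first difference removes the trend
rho = acf(dy, nlags=5)
# Rule of thumb: lags whose |ACF| exceeds ~2/sqrt(n) are candidates.
threshold = 2 / np.sqrt(len(dy))
candidate_lags = [k for k in range(1, 6) if abs(rho[k]) > threshold]
print(candidate_lags)
```

With only 19 observations the confidence band is wide, so these diagnostics should be read as a rough guide rather than a firm answer.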

These are the main points that come to my mind right now. If you have any suggestions or are familiar with applying time series models, I would be very happy to hear them.
I need to implement the ARIMA and ETS methods to check and compare their forecasting results against our benchmark method (which we have used for many years).

It may be possible (I work with GAMS weekly), but it is easier to do in R. Try the forecast package by Hyndman!

The R software is free, the package is free, and the techniques are well documented in his text, which is available free online at https://otexts.com/fpp2/

It’s worth investigating the auto.arima function, which is well documented online on the CRAN website.

Yes, you are right about R. There are functions for ARIMA models there.
The problem, or maybe not a problem, is that all of us here at work use GAMS, and we treat this program as a “bible”.
All forecasting methods so far have been implemented in this program.

The only thing I have experience with in both R and GAMS is creating gdx files in R (from dat files) that contain the raw data from a database. This is a conversion from a file type that GAMS cannot easily read.

Do you think it would be possible to pass the variables you want to forecast from GAMS to R, apply the new methods there, and then bring the results back into GAMS?
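In principle, yes: a common pattern is file-based exchange. GAMS writes the series to a file (GDX or CSV; the GAMS utilities gdxdump and csv2gdx convert between them), calls an external script with an execute statement, and reads the results back in. A toy sketch of the middle step, here in Python with a naive drift forecast standing in for the real ARIMA/ETS call (the CSV layout and the forecast rule are made up; in practice the script would be R calling the forecast package):

```python
import csv, io

def forecast_step(in_text, horizon=2):
    """Read 'year,value' rows, extend them with a naive drift forecast,
    and return the augmented CSV text.  (Stand-in for a real ARIMA/ETS
    call; swap in your forecasting routine here.)"""
    rows = list(csv.reader(io.StringIO(in_text)))
    years = [int(r[0]) for r in rows]
    vals = [float(r[1]) for r in rows]
    drift = (vals[-1] - vals[0]) / (len(vals) - 1)   # average yearly step
    out = io.StringIO()
    w = csv.writer(out)
    w.writerows(rows)
    for h in range(1, horizon + 1):
        w.writerow([years[-1] + h, round(vals[-1] + h * drift, 4)])
    return out.getvalue()

# The GAMS side would do something like:  execute 'Rscript forecast.R';
print(forecast_step("2016,100\n2017,110\n2018,120\n", horizon=2))
```

The point is only the round trip: data out, forecast, data back in, with GAMS orchestrating the call.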

Thank you for the clarification! (I also use GAMS as a “first choice” tool, whenever possible.)

I haven’t coded any forecast model identification problem, but it’s surely possible. It would require hard-coding the process (via loops or a similar structure) of altering the model parameters, recalculating the fitted values for the model in question, and refitting the new model. You would also need some parameters or scalars to store either the best solution found so far or, depending on your needs, all model parameters and performance measures.
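That loop structure could look roughly like the following toy Python sketch (not GAMS; it searches only over the AR order p and the differencing order d, fits by ordinary least squares, and keeps the specification with the best AIC-style score; a real implementation would also search the MA order q):

```python
import numpy as np

def fit_ar(y, p):
    """OLS fit of an AR(p) model; returns an AIC-style score
    (Gaussian log-likelihood up to constants)."""
    n = len(y)
    Y = y[p:]
    X = np.column_stack([y[p - k:n - k] for k in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    sigma2 = np.mean(resid ** 2)
    return len(Y) * np.log(sigma2) + 2 * p

def grid_search(y, max_p=3, max_d=2):
    """Loop over (p, d), refit each candidate, and keep the best found."""
    best = {"aic": np.inf, "p": None, "d": None}
    for d in range(max_d + 1):
        z = np.diff(y, n=d) if d else y
        for p in range(1, max_p + 1):
            if len(z) <= p + 2:          # too few observations to fit
                continue
            aic = fit_ar(z, p)
            if aic < best["aic"]:
                best = {"aic": aic, "p": p, "d": d}
    return best

rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(0.5, 1.0, size=40))
print(grid_search(series))
```

This is essentially what auto.arima automates (with a smarter stepwise search and proper maximum-likelihood estimation).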

Yes, this is how I see it.
Storing the coefficients is one thing. The other will be to “adjust” all the fitted models to a national value: I will not just fit an ARIMA to a single univariate series; after fitting all the models (say 10,000), I have to reconcile them with a national (“roof”) value.
What I mean is this: a single record, with a value in five dimensions, is:
AKM101 (x98) BKM101 (x98) HUD01 (x80) KOEN01 (x2) ALDER0029 (x3) 3,
It means that 3 people are employed in work municipality 101, live in municipality 101, have education HUD01, gender “men”, and are in the age interval 00-29.

You can see how many combinations there will be (the counts are in parentheses). And the important thing is, after all, to fit to the national employment value. Let’s say it has to be 2,900,000 employed. So the models cannot just “blow up”.
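The simplest way to respect that “roof” is to rescale all the cell-level forecasts proportionally so they sum to the national figure; with several margins to hit at once, this generalises to raking / iterative proportional fitting. A small numpy sketch with made-up numbers:

```python
import numpy as np

def rake_to_total(cell_forecasts, national_total):
    """Scale disaggregated forecasts so they add up to the national value."""
    s = cell_forecasts.sum()
    if s <= 0:
        raise ValueError("forecasts must sum to a positive value")
    return cell_forecasts * (national_total / s)

# Made-up forecasts for a handful of (municipality, education, ...) cells.
cells = np.array([120.0, 80.0, 50.0, 250.0])
adjusted = rake_to_total(cells, national_total=600.0)
print(adjusted, adjusted.sum())
```

Here the cells sum to 500, so each forecast is scaled by 1.2 and the total lands exactly on 600; the relative structure of the cells is preserved, only the level is adjusted.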

Thank you for the links. I am, fortunately, familiar with the second. The first one, to my surprise, is something I did not know, and it contradicts what my friend has said.

I have created a simple algorithm with loops that estimates an ARIMA model for each possible combination like the one I described above.
My problem is that it would probably take a month or so to finish …

So, …, I am looking for a method, possibly in R, to cluster or group a million observations in a multi-dimensional space into some clusters. It could be some kind of clustering of high-dimensional data, like a bottom-up (agglomerative) method.
Maybe I am not patient, or my algorithm is not that efficient, but it would be helpful to have something more efficient or time-saving.
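One way this could cut the runtime (an idea, not something from the thread): cluster similar series first and fit one model per cluster, or per cluster representative. Below is a plain Lloyd's-algorithm k-means in numpy on made-up 5-dimensional data; for a million rows the practical choices would be MiniBatchKMeans from scikit-learn, or scipy.cluster.hierarchy for the bottom-up (agglomerative) approach you mention:

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# Two well-separated blobs of 5-dimensional points (made-up data).
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, (50, 5)), rng.normal(5, 0.3, (50, 5))])
centroids, labels = kmeans(X, k=2)
print(np.bincount(labels))
```

Fitting one ARIMA per cluster instead of one per cell would reduce 10,000+ model fits to however many clusters you choose, at the cost of assuming the series within a cluster behave alike.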