How to Handle Multicollinearity in Stata

In regression analysis with time series or panel data, the independent variables in the model may be highly correlated with each other. When that happens, the individual coefficient estimates become unstable and their standard errors inflate, so the model can look misleadingly good, with a very high coefficient of determination (R²), while saying little about the separate effect of each variable. This issue is called multicollinearity among the independent variables. As long as it is severe, the individual regression coefficients cannot be estimated reliably.
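Before applying any remedy, it helps to check how strong the correlation actually is. The sketch below is only an illustration: it uses the auto dataset that ships with Stata, and the choice of regressors (weight, length, displacement) is an assumption made for the example, not something taken from a particular study. A variance inflation factor above roughly 10 is a common rule of thumb for problematic collinearity, not a hard threshold.

```stata
* Illustrative check for multicollinearity (dataset and variables are assumptions)
sysuse auto, clear

* Pairwise correlations among the candidate regressors
correlate weight length displacement

* Fit the regression, then report variance inflation factors
regress price weight length displacement
estat vif
```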

However, Stata makes this issue straightforward to address. The remedy depends on the nature of the data, and there are two cases:

  • Time series data: Multicollinearity is most common in this kind of data, because the past values of a series strongly influence its future values, so the variables tend to trend together. One remedy is to create a new series that is a lagged version of the variable in which the multicollinearity is present, either in the Data Editor or with the generate command. Another is to use a differenced version of the variables, which is usually the more appropriate of the two. Using these transformed variables in place of the original data can reduce the problem substantially.
  • Panel data: To handle multicollinearity in panel data, the independent variables need to be transformed into an orthogonal set, i.e. made uncorrelated with each other. Stata has a direct command for this, “orthog”, which transforms the variables so that the correlation coefficients among the new group of variables are zero (up to rounding error).
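The two remedies above can be sketched as follows. This is a minimal illustration on simulated data, not a recipe for any particular dataset: the variable names (t, y, x1, x2), the seed, and the data-generating process are all assumptions made for the example. The two regressors are built as independent trending series, so they are highly correlated in levels but not in first differences.

```stata
* Simulated example (all names and numbers are hypothetical)
clear
set obs 100
set seed 12345
generate t = _n
tsset t

* Two independent random walks with drift: they trend together in levels
generate x1 = sum(0.5 + rnormal())
generate x2 = sum(0.5 + rnormal())
generate y  = 1 + 0.5*x1 + 0.3*x2 + rnormal()

correlate x1 x2              // levels: expect a high correlation

* Remedy 1 (time series): lagged and differenced versions of the variables
generate lx1 = L.x1          // one-period lag of x1
generate dx1 = D.x1          // first difference of x1
generate dx2 = D.x2
generate dy  = D.y

correlate dx1 dx2            // differences: correlation should be much lower
regress dy dx1 dx2
estat vif

* Remedy 2 (orthogonalization): orthog creates uncorrelated versions
orthog x1 x2, generate(ox1 ox2)
correlate ox1 ox2            // correlation is zero up to rounding
regress y ox1 ox2
```

The same orthog call applies to the regressors of a panel dataset. Note that the orthogonalized variables depend on the order in which the originals are listed, and their coefficients no longer have the same interpretation as those on the original variables.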
