No, not necessarily. Two correlated variables can have any kind of relationship, not just a linear one.
The important point to note here is that two correlation coefficients are widely used in regression. One is Pearson's R, the correlation coefficient that you learnt about in the linear regression model. It is designed for linear relationships, so it might not be a good measure of a non-linear relationship between the variables. The other is Spearman's R, which is computed on the ranks of the values and is used when the relationship between the variables is monotonic but not linear. So, even though Pearson's R will still give a value for a non-linear relationship, that value might not be reliable. For example, for the relationship y = X^3 evaluated at 100 equally spaced values of X between 1 and 100, the two coefficients were found to be:
Pearson's R ≈ 0.91
Spearman's R ≈ 1
As we keep increasing the power, the Pearson's R value consistently drops, while Spearman's R stays at 1. For example, for the relationship y = X^10 over the same data points, the coefficients were:
Pearson's R ≈ 0.66
Spearman's R ≈ 1
So, the takeaway is that if you have some sense that the relationship is non-linear, you should look at Spearman's R instead of Pearson's R. Pearson's R might still come out high for a non-linear relationship, but it is simply not reliable there.
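The comparison between the two coefficients can be reproduced with a short script. What follows is a minimal sketch rather than the code used to obtain the figures quoted in this answer: the helper names pearson_r, ranks and spearman_r are made up for illustration, and Spearman's R is computed here as Pearson's R applied to the ranks of the values. With X taken as the integers 1 to 100, the printed values should come out close to the ones quoted.

```python
import math


def pearson_r(xs, ys):
    """Pearson's R: covariance scaled by the two standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_r(xs, ys):
    """Spearman's R: Pearson's R computed on the ranks instead of the raw values."""
    return pearson_r(ranks(xs), ranks(ys))


# Assumption: "100 equally separated values between 1 and 100" means 1, 2, ..., 100.
xs = list(range(1, 101))

for power in (1, 3, 10):
    ys = [x ** power for x in xs]
    print(f"y = X^{power}: Pearson's R = {pearson_r(xs, ys):.2f}, "
          f"Spearman's R = {spearman_r(xs, ys):.2f}")
```

Because raising X to a positive power keeps the ordering of the points, the ranks of X and y are identical, which is why Spearman's R stays at 1 while Pearson's R falls as the curve bends further away from a straight line.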