Posted: 10/29/2022 11:18:16 AM EDT
Non linear correlation?

I remember the names of two nonlinear methods, but can’t recall the handful of distance-based analyses, or which would be best.

I was not a math or stats major.  3 sem of calc, one undergrad stats for science majors, and one stats after undergrad.


Basically, some people were arguing about the correlation of something that had a Pearson correlation of 0.3.

And claiming it wasn’t related.

I said quit arguing, it’s related big time, just not linearly. I’ll shoot you a better way to compare next week.

But can’t recall how I used to do it.  Way to go Ramairthree.

It’s the kind of comparison for when something is very correlated, but in different ways at different points, and you still want an overall picture.

I explained it like, let’s say you compare motor vehicle speed to fatal vehicle accidents.
From zero to 35 mph, no linear relation as speed increases, essentially zero fatalities.
From 36 to 87, some linear correlation with increasing fatalities as speed increases,
from 88 to 109, poorer linear correlation, because you are doing a single-variable correlation while other variables (quality of vehicle, size and durability of vehicle, latest safety features, object struck, etc.) carry greater weight on the outcome at 109 mph than at 88 mph,
and at 110 and up you lose linear correlation, because survivability at 120 vs 130 or 187 vs 200 is equally bad even with those other variables in a regular civilian car.
I haven’t done this sort of thing in over a decade, and only a few times, off the top of my head I can’t recall, but I’ll shoot you what you could do.

And now can’t remember.





Link Posted: 10/30/2022 10:22:52 AM EDT
[#1]
Damn it math and science forum.

Come through for me.

I need the anonymity for this.

I have a circle of colleagues that mock each other mercilessly for entertainment when we help each other.

If I have to ping a killer math/stats guy I used to work with, he is going to flay me relentlessly in return for the brutal left handed compliments I was laying on him the last time he pinged me for some help.
Link Posted: 11/28/2022 6:14:45 PM EDT
[#2]
I'm not an expert on nonlinear correlation, so I would just start with the basics: Spearman's rank correlation (https://en.wikipedia.org/wiki/Spearman's_rank_correlation_coefficient). This is essentially just the correlation between the ranks of the two variables, rather than the raw values. If Wikipedia is to be trusted, this allows you to pick up general monotonic relationships between the two variables, but you may still be fooled by more complicated nonlinearities, such as a relationship that rises and then falls.
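
To make the Spearman point concrete, here is a minimal sketch in plain Python. The data are made up for illustration (an exponential curve and a U-shaped curve), and the helper names pearson, ranks and spearman are my own, not from any library. It shows Spearman picking up a monotonic but nonlinear relationship that Pearson understates, and then missing a U-shaped one.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical data, for illustration only.
x = list(range(1, 21))
y_monotonic = [math.exp(v) for v in x]       # always increasing, far from linear
y_u_shaped = [(v - 10.5) ** 2 for v in x]    # falls, then rises

pearson_monotonic = pearson(x, y_monotonic)
spearman_monotonic = spearman(x, y_monotonic)
spearman_u_shaped = spearman(x, y_u_shaped)

print(f"Pearson,  exponential curve: {pearson_monotonic:.2f}")
print(f"Spearman, exponential curve: {spearman_monotonic:.2f}")
print(f"Spearman, U-shaped curve:    {spearman_u_shaped:.2f}")
```

The U-shaped case is the "more complicated nonlinearity" caveat: Y is completely determined by X there, yet a rank correlation sees nothing because the relationship is not monotonic.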

That said, a correlation of 0.3 doesn't mean that the variables are unrelated even if the nature of the relationship is perfectly linear! There's two ways to think about this, depending on what you mean by "unrelated."

1) Is there statistical evidence for a relationship? This is a question of our uncertainty about the estimate. To answer this you need to perform the hypothesis test, construct a confidence interval, perform a Bayesian analysis, etc, to see if the estimated correlation is statistically distinguishable from zero. Essentially, what is the standard error of that correlation estimate? 0.3 +/- 0.01 is pretty strong evidence for a relationship, but 0.3 +/- 0.2 is very weak evidence for a relationship.

2) Does the magnitude of the relationship matter? This is a question about the practical significance of the relationship - suppose we are completely certain of that 0.3, does it matter? Do we care? Well, that depends on the variables. But in abstract terms, this means that a 1 standard deviation increase in X predicts a 0.3 standard deviation increase in Y. That may be a lot depending on the context - in the social sciences, for example, that's a perfectly respectable value.

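We can also compute the standard error of this difference. Let y1 be the outcome at X = x and y2 the outcome at X = x + Sx, so that the expected difference y2 - y1 is b * Sx = 0.3 Sy. Then: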

Var(y2 - y1) = Var(a + (x + Sx)b + e2 - a - xb - e1) = Var(e2 - e1)  = 2Se^2 where Se^2 is the error variance of the regression Y = a + bX + e.

The variance decomposition of the regression then gives us:

Sy^2 = b^2 * Sx^2 + Se^2

Then rearrange and plug in b = r * Sy / Sx to obtain:

Se^2 = Sy^2(1 - r^2)

So the standard error of the difference is sqrt(2 *(1 - 0.3^2)) * Sy = 1.35 Sy.

Then we can be about 95% sure that y2-y1 = 0.3 * Sy +/- 2.7 Sy

Or in standard deviation units: (y2 - y1) / Sy = 0.3 +/- 2.7. So yeah, the error is large relative to the expected difference, but in the context of the problem it still may matter to you a lot. Or it might not matter at all. Really hard to go any further without understanding what your variables are. And note that everything under 2) assumes that a) we know the correlation with certainty (e.g. sample size is infinite), and b) the relationship is in fact linear.
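
Both points above can be put into numbers with a short sketch. For 1), it uses the Fisher z-transformation to approximate a 95% confidence interval for a correlation; that approximation assumes roughly bivariate normal data, and the sample sizes 20 and 2000 are hypothetical, since the thread never gives one. For 2), it just evaluates the formula derived above. The function names are my own.

```python
import math

def correlation_ci(r, n, z_crit=1.96):
    """Approximate CI for a Pearson r via the Fisher z-transformation.

    Assumes roughly bivariate normal data and n > 3.
    """
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

def diff_standard_error(r, s_y=1.0):
    """SE of y2 - y1 from the derivation above: sqrt(2 * Sy^2 * (1 - r^2))."""
    return math.sqrt(2.0 * (1.0 - r ** 2)) * s_y

r = 0.3

# Point 1: is 0.3 distinguishable from zero? Depends heavily on sample size.
ci_small = correlation_ci(r, n=20)      # hypothetical small sample
ci_large = correlation_ci(r, n=2000)    # hypothetical large sample
print(f"n=20:   95% CI ({ci_small[0]:.2f}, {ci_small[1]:.2f})")
print(f"n=2000: 95% CI ({ci_large[0]:.2f}, {ci_large[1]:.2f})")

# Point 2: spread of the difference in Y, in units of Sy, for a 1 Sx step in X.
se_diff = diff_standard_error(r)
interval = (r - 2 * se_diff, r + 2 * se_diff)
print(f"SE of difference: {se_diff:.2f} Sy")
print(f"About 95% range:  ({interval[0]:.2f}, {interval[1]:.2f}) Sy")
```

With 20 observations the interval straddles zero, while with 2000 it sits well away from it, which is the 0.3 +/- 0.2 versus 0.3 +/- 0.01 contrast in point 1).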
Link Posted: 12/8/2022 7:41:35 PM EDT
[#3]
Quoted:
Damn it math and science forum.

Come through for me.

I need the anonymity for this.

I have a circle of colleagues that mock each other mercilessly for entertainment when we help each other.

If I have to ping a killer math/stats guy I used to work with, he is going to flay me relentlessly in return for the brutal left handed compliments I was laying on him the last time he pinged me for some help.

So, not knowing what the context is but...

Non-linear is a rabbit hole. I learned (the hard way) that the first thing to do is try transformations. Take the logarithm of all your data and do the correlation on that. That's still linear correlation, it's just linear in the log world. Or take the square root of everything and do the correlation on that.

Actually, I guess that's not the first thing you do. The first thing is to look at a residuals plot and see if anything jumps out there.
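
A rough sketch of both steps, in plain Python with no plotting library. The data are hypothetical (a noise-free power law, y = 2 * x^3), chosen only so the effect is easy to see, and the helper names are my own. Instead of drawing the residuals plot it prints the residuals at the low end, middle and high end, which is enough to spot the curved pattern.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def linear_fit(xs, ys):
    """Ordinary least squares line; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data: a power law with no noise, for illustration only.
x = [float(v) for v in range(1, 31)]
y = [2.0 * v ** 3 for v in x]

# Step 1: residuals from a straight-line fit on the raw data.
slope, intercept = linear_fit(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print("residual at low x: ", round(residuals[0]))
print("residual at mid x: ", round(residuals[14]))
print("residual at high x:", round(residuals[-1]))

# Step 2: correlation before and after a log transform of both variables.
r_raw = pearson(x, y)
r_log = pearson([math.log(v) for v in x], [math.log(v) for v in y])
print(f"Pearson on raw data: {r_raw:.3f}")
print(f"Pearson on log data: {r_log:.3f}")
```

Residuals that are positive at both ends and negative in the middle are the kind of thing that "jumps out": the straight line is the wrong shape. Logs only work when every value is positive; with zeros in the data (like zero fatalities) a square root or another transform would be needed instead.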
Link Posted: 2/21/2023 10:22:27 PM EDT
[#4]
I do work with non-linear models, but in a regression context.  These are empirical models of growth data or something similar, and I'm usually trying to estimate something from one or more variables that may have non-linear relationships to what I'm trying to estimate.  As has been said above, if you can linearize, that will greatly simplify things.  One thing you'll have to consider is that fitting parameters on a linear transform and then plugging them back into the non-linear form of your model might introduce bias.  I'd have to dig through notes that I haven't looked at in years, but there is a rule of thumb to correct for this.  It may not matter for what you're doing though.  Comparing non-linear models to each other or to linear models can get cumbersome...
Link Posted: 6/15/2023 7:29:13 AM EDT
[#5]
What about ANOVA (Analysis of Variance) tables?  We used these pretty extensively when I took stats in college - especially for sets with 2 variables.  They were a pain in the ass because for every (X,Y) data set you had to calculate Sum X, Sum Y, Sum X*Y, Sum X^2, Sum Y^2, Sum of (X*Y)^2.  We all had unique data sets and just calculating those numbers for 75+ data sets took hours.  I also ran an HP-15C calculator that could invert a 4x4 matrix in 2 minutes.  The only calculator at the time that could do so and it cost 190 bucks in the early 80's.  Man, I HATED stats.
Link Posted: 7/4/2023 6:16:50 PM EDT
[#6]
For follow up, I hadn’t heard back in time and just broke the data into four sections and did a weighted tail on each.
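
For anyone landing here later, a minimal sketch of the "break it into sections" idea, using the speed bands from the original example. Everything in it is hypothetical: the risk curve, the noise level and the data are invented for illustration, and it does not attempt to reproduce the weighted tail step, which isn't described in the thread. It fits a straight line within each band and reports the slope along with Pearson's r, since r alone says nothing about how steep the relationship is.

```python
import math
import random

# Speed bands (mph) taken from the example in the first post.
BANDS = [(0, 35), (36, 87), (88, 109), (110, 200)]

def true_risk(speed):
    """Invented fatality risk curve: flat, rising, rising faster, saturated."""
    if speed <= 35:
        return 0.0
    if speed <= 87:
        return 0.6 * (speed - 35) / 52
    if speed <= 109:
        return 0.6 + 0.4 * (speed - 87) / 22
    return 1.0

def slope_and_r(xs, ys):
    """Least squares slope and Pearson r; r is None if either variable is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0, None
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the made-up data are repeatable
speeds = list(range(0, 201))
risk = [true_risk(s) + rng.gauss(0, 0.02) for s in speeds]

overall_slope, overall_r = slope_and_r(speeds, risk)
print(f"overall: slope={overall_slope:.4f}, r={overall_r:.2f}")

results = []
for low, high in BANDS:
    xs = [s for s in speeds if low <= s <= high]
    ys = [y for s, y in zip(speeds, risk) if low <= s <= high]
    band_slope, band_r = slope_and_r(xs, ys)
    results.append((low, high, band_slope, band_r))
    print(f"{low:>3}-{high:<3} mph: slope={band_slope:.4f}, r={band_r:.2f}")
```

In the flat bands the slope is close to zero and r is basically noise, while in the middle bands both are large, which is the "very correlated, but in different ways at different points" picture.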