Mathematics desk

Welcome to the Wikipedia Mathematics Reference Desk Archives

July 25

In MATLAB, the absolute value of the Pearson correlation coefficient is much higher than sqrt(R^2) of the least squares fit

According to our article Pearson product-moment correlation coefficient, R^2 in linear regression should be the square of Pearson's r. However, in MATLAB I have data sets where the R^2 value of my least squares fit is something like 0.28, but r is something like -0.74 or -0.85. Is polyfit() not finding the best fit? I believe the original data follow a power law, so I have taken the log of both x and y to make a log-log data set, and used linear least squares regression on it. 76.104.28.221 (talk) 05:40, 25 July 2012 (UTC)

In general, engineers, statisticians, and mathematicians may use slightly different definitions of correlation, covariance, autocorrelation, etc. They are all related via scaling or centering, but this can be a pain to work out. See for instance Autocorrelation#Signal_processing for an alternate definition. As a wild guess, try passing the 'coeff' flag into the corr function, and that might force the definition that you are expecting. Here's MathWorks' info on xcorr [1]. SemanticMantis (talk) 15:16, 25 July 2012 (UTC)

It's simply impossible for r to be -0.74 for the best linear fit. The worst that r could ever be is zero. Something is not right here. Looie496 (talk) 16:15, 25 July 2012 (UTC)

A negative r implies a negative correlation, i.e. it is quite a good fit.

I've been using corr2() -- I guess this is not the Pearson correlation coefficient, but a "2D correlation coefficient". (However I am comparing linear vectors). Does anybody know why the 2D correlation coefficient is so much stronger for correlating one-dimensional vectors versus Pearson correlation? What method is used in 2D correlation? Does it allow cross-correlation? My two variables are time-series variables -- velocity (predictor) and curvature (predicted) of fruit flies. I do suspect sometimes that velocity predicts curvature a little later (e.g. 1/15 seconds later) rather than the curvature of that very moment. (Predictor and predicted could be reversed of course, I am just using one hypothesis.)

I should mention that the trajectories of fruit flies have several interesting properties; for example, the literature mentions that paths are often self-similar (fractal-like) regardless of scale (at least up to the temporal resolution limit of conventional digital imaging, i.e. 1/30 seconds). However, I am a biochemistry student, not an engineering / signals and systems student, so I am just figuring this out. (My engineering friend explained autocorrelation to me a few months back.) 137.54.30.45 (talk) 16:47, 25 July 2012 (UTC)
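The "velocity predicts curvature a little later" idea above can be checked by computing Pearson's r at several lags. The following is only a minimal Python sketch: the helper names (pearson_r, lagged_r) and the toy data are made up for illustration, and it says nothing about what corr2() does internally.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_r(predictor, predicted, lag):
    """Correlate predictor[t] with predicted[t + lag]; lag >= 0, in samples."""
    if lag == 0:
        return pearson_r(predictor, predicted)
    return pearson_r(predictor[:-lag], predicted[lag:])

# Hypothetical toy data: the response is the negated predictor, one sample late.
velocity = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0]
curvature = [0.0] + [-v for v in velocity[:-1]]

# Pick the lag with the most negative correlation among lags 0..3.
best_lag = min(range(4), key=lambda k: lagged_r(velocity, curvature, k))
print(best_lag, lagged_r(velocity, curvature, best_lag))
```

With real data sampled at 1/15 s, a lag of 1 corresponds to the 1/15 s delay mentioned above. Scanning many lags and keeping the best one inflates the apparent correlation, so the chosen lag should be checked on held-out trajectories.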

The closest equivalent of corr2() I can find on Wikipedia is Digital image correlation, which allows for stretching and various other transformations of the vectors. Is this what corr2() does? If I apply digital image correlation between two one-dimensional images (so to speak -- allowing for stretching and translation?) how do I find an equation that allows one to determine the other? What I am trying to determine is a power law, after all.... 137.54.30.45 (talk) 16:57, 25 July 2012 (UTC)

I would strongly advise that you find somebody local to help you. You're using words without really understanding what they mean, and so the questions you are asking don't make sense. A local person could look at your data and help you to find the right words to describe it. Looie496 (talk) 17:18, 25 July 2012 (UTC)

Okay, but I am absolutely certain that a Pearson r of approximately -0.74 should yield an R^2 of approximately 0.55. I mean, that's what our article seems to imply. I'm using definitions described in documentation and am simply trying to find the Wikipedia articles describing similar concepts. The MATLAB documentation for corr2 says that it finds the 2D correlation coefficient -- that is indeed digital image correlation, is it not? Digital image correlation is mentioned as an application of corr2(). It seems that corr2() returning a negative r should be possible... 137.54.30.45 (talk) 17:24, 25 July 2012 (UTC)
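The identity being relied on here (R^2 of a straight-line least squares fit with an intercept equals the square of Pearson's r) can be checked numerically. This is a minimal Python sketch rather than MATLAB; the function names and the six log-log data points are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Ordinary least squares fit of y ~ intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(x, y, intercept, slope):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    my = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical log-log data (log10 speed, log10 curvature).
log_speed = [0.1, 0.4, 0.7, 1.0, 1.3, 1.6]
log_curv = [0.9, 0.1, 0.4, -0.8, -0.5, -1.6]

r = pearson_r(log_speed, log_curv)
intercept, slope = fit_line(log_speed, log_curv)
R2 = r_squared(log_speed, log_curv, intercept, slope)
print(r, r * r, R2)      # r * r and R2 agree up to rounding
print((-0.74) ** 2)      # the figure quoted in the thread, about 0.55
```

If r^2 and R^2 disagree for the same x and y, the fit itself is rarely the problem; it is worth checking that both numbers were computed from the same (logged) vectors and that the residuals use the fitted line.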

Are you sure you included in your data a constant column? For example if I want to regress y on x the command is b=regress(y,[ones(size(x)) x]). Yaniv256 (talk) 23:40, 28 July 2012 (UTC)
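To illustrate why the constant column matters, here is a small Python sketch (not MATLAB) comparing a fit with an intercept against a fit forced through the origin. The data and the helper name r_squared are hypothetical.

```python
def r_squared(y, y_hat):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    my = sum(y) / len(y)
    ss_res = sum((b - p) ** 2 for b, p in zip(y, y_hat))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical data: y falls with x but starts from a large offset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.0, 8.2, 6.9, 6.1, 5.0]

# Fit with an intercept (equivalent to including a column of ones).
n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx
r2_with_intercept = r_squared(y, [intercept + slope * a for a in x])

# Fit forced through the origin (no column of ones).
slope0 = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
r2_through_origin = r_squared(y, [slope0 * a for a in x])

print(r2_with_intercept, r2_through_origin)
```

On data like these the through-origin fit gives a much lower (even negative) R^2, so a missing constant column is one way to end up with an R^2 far below r^2.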

polyfit() not finding the best fit?

So I explicitly called MATLAB's Pearson correlation coefficient function, corr(), rather than corr2(), but the results are exactly the same. However, in many cases the square root of the R^2 of the least squares fit found via polyfit() is significantly below the magnitude of Pearson's r, which is always around -0.7 to -0.85. (In one case, r is -0.77, but the polyfit() R^2 is 0.28?) Is polyfit() not finding the best fit? 137.54.30.45 (talk) 18:07, 25 July 2012 (UTC)

Term for visual appearance of slope

Figure (taken from Misleading graph): misleading line graph with no scale, in three panels labelled "Volatility", "Steady, fast growth", and "Slow growth". Caption: The lack of a scale allows the graph to be easily manipulated.

What is the term for the visual appearance of a slope? I'm writing the article misleading graph and want to express that visually, it appears that the slope is increasing/decreasing. I don't want to use slope or gradient as the underlying data is not changing. I'm currently using the terms volatility and growth, but these are imprecise economic terms. Anyone know how to better express Misleading_graph#Axis_changes and Misleading_graph#No scale? Smallman12q (talk) 12:07, 25 July 2012 (UTC)

Articles should be based on sources. What do the sources say? If they don't then we shouldn't either. Dmcq (talk) 12:10, 25 July 2012 (UTC)

The sources say that graphs without scales are misleading...and then give several examples. The economics texts say that one graph shows growth while the other one doesn't. I'm hoping there's some better way to express this... Smallman12q (talk) 12:22, 25 July 2012 (UTC)

I think it is valuable to make progress, even if citations are currently lacking (they can always be added later). I don't think there's a widely-used formal term for this. I would just describe the phenomenon as accurately as possible. Something like "Even though the actual slope of the (x,y) data is the same in both graphs, the way that the data is plotted can change the visual appearance of the angle made by the line on the graph. This is because each plot has a different scale on its vertical axis. Because the scale is not shown, these graphs can be misleading." Now that I think of it, slope is independent of any plotting scheme, but angle can (and does in this example) depend on plotting scheme. So that might be the best distinction to make. SemanticMantis (talk) 15:03, 25 July 2012 (UTC)
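The slope-versus-angle distinction can be made concrete with a short calculation. This Python sketch assumes a plot in which the data ranges fill the whole drawing area; the function name and the pixel sizes are illustrative only.

```python
import math

def apparent_angle_deg(slope, x_span, y_span, width_px, height_px):
    """Angle of a line as drawn, when x_span data units fill width_px pixels
    and y_span data units fill height_px pixels."""
    px_per_x = width_px / x_span
    px_per_y = height_px / y_span
    return math.degrees(math.atan(slope * px_per_y / px_per_x))

slope = 1.0  # same data slope in both plots

# Vertical axis spans 10 units: the line looks steep.
tight = apparent_angle_deg(slope, x_span=10, y_span=10, width_px=400, height_px=400)

# Vertical axis spans 100 units: the same line looks almost flat.
loose = apparent_angle_deg(slope, x_span=10, y_span=100, width_px=400, height_px=400)

print(tight, loose)
```

The data slope never changes; only the ratio of pixels per unit on the two axes does, and that ratio is exactly what a missing scale hides from the reader.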

Steepness. 207.224.43.139 (talk) 02:18, 30 July 2012 (UTC)

How does curvature correlate to velocity in a random walk with randomly-distributed speeds (i.e. Brownian motion)?

While investigating fly path curvature in a 2D arena as a statistical parameter, I have found that the curvature-speed relationship follows a power law where curvature (in 1/mm) is predicted by the relation C = 10^a * x^b, where a is ~0.8, b is ~ -1.6, and x is absolute speed in mm/s. Firstly, if behavior is random, doesn't the physical equation for curvature already "totally account" for velocity, i.e. curvature should be statistically independent of velocity in a random walk of random speeds? Curvature is |y"x' - x"y'| / speed^3, but both the numerator and denominator are dependent on velocity and should cancel each other out, right? Thus any correlation between curvature and velocity should be due to non-random behavior -- is this correct? 137.54.30.45 (talk) 17:24, 25 July 2012 (UTC)
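For reference, fitting C = 10^a * x^b amounts to a straight-line fit of log10(C) against log10(x), with slope b and intercept a. The Python sketch below uses noise-free synthetic data generated from the quoted parameters; the speed values are hypothetical.

```python
import math

a_true, b_true = 0.8, -1.6                               # parameters quoted above
speeds = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]                 # mm/s, hypothetical samples
curvs = [10 ** a_true * v ** b_true for v in speeds]     # 1/mm, synthetic and noise-free

# Taking log10 of both sides turns the power law into a straight line:
# log10(C) = a + b * log10(x)
lx = [math.log10(v) for v in speeds]
ly = [math.log10(c) for c in curvs]

n = len(lx)
mx, my = sum(lx) / n, sum(ly) / n
b_fit = sum((p - mx) * (q - my) for p, q in zip(lx, ly)) / sum((p - mx) ** 2 for p in lx)
a_fit = my - b_fit * mx

print(a_fit, b_fit)
```

With real, noisy data the log transform also changes how errors are weighted, so the fitted a and b describe the relationship on the log-log scale.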

The word "random" just means unpredictable in some way -- it doesn't mean anything specific. When you have a physical object, if it is moving very fast, momentum makes it impossible for it to curve very sharply, so there is always going to be a correlation between velocity (in the sense of speed) and curvature. But more basically, can I ask again that you look for somebody local to advise you? We can try to answer specific and well-formulated questions, but we really can't perform the functions of a mentor, because we can't look at your data. Looie496 (talk) 17:34, 25 July 2012 (UTC)

I'm talking about a random walk, where time progresses in fixed intervals so speed is distance/t; since the distance travelled is random, speed should have the same random distribution. Consider a random walk model that factors in momentum and damping parameters. A fruit fly, I think, has low momentum and high damping (there is probably some dimensionless number describing this), so a fruit fly behaving randomly would be well-approximated by Brownian motion, no? In that case, is the curvature of a Brownian-motion particle statistically independent of its velocity?

Also, my PI is very busy, I have to do all the work and literature research myself; however I am finding it difficult to search the literature on this question. I think my questions are very well-defined -- if they are too general, I can narrow them down further. I think my questions are reasonable? 137.54.30.45 (talk) 18:29, 25 July 2012 (UTC)

The relation between speed and curvature that you would expect just depends on the model you use to explain the fly's movement. If your model says that the two should be independent, but your data shows some relation, then it's not a good model. This doesn't have anything to do with whether the fly's behavior is random or not. First, whether the fly moves in a way that's truly random is a question of metaphysics that you probably are not trying to answer. But more importantly, any model is going to have randomness built into it since a deterministic model is out of reach, but there are many different random models. If one fails you just need to find a better one. Rckrone (talk) 20:52, 25 July 2012 (UTC)

Oh no, the random model is the null hypothesis. I am creating a model which implies non-random behavior but I need to compare it against a "null" model, in this case the random walk / Brownian hypothesis. I do not think a random model would predict a power law with an exponent of -1.6? 137.54.30.45 (talk) 00:32, 26 July 2012 (UTC)
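One way to see what such a null model predicts is to simulate it and run it through the same curvature formula. The Python sketch below rests on assumptions not stated in the thread: independent Gaussian steps, central differences for the derivatives, and an arbitrary step size and seed. For that toy model, the central-difference velocity (sum of two steps) and acceleration (difference of two steps) are independent, so the estimated curvature scales like 1/speed^2 and the fitted log-log exponent should come out near -2. That is a property of this estimator on this model, not a statement about real flies.

```python
import math
import random

def loglog_slope(n_steps=20000, sigma=1.0, dt=1.0 / 15.0, seed=0):
    """Fit log10(curvature) against log10(speed) for a 2D Gaussian random walk.

    All parameters are illustrative assumptions, not values from the thread.
    """
    rng = random.Random(seed)
    xs, ys = [0.0], [0.0]
    for _ in range(n_steps):
        xs.append(xs[-1] + rng.gauss(0.0, sigma))
        ys.append(ys[-1] + rng.gauss(0.0, sigma))

    log_speed, log_curv = [], []
    for i in range(1, len(xs) - 1):
        # Central differences for first and second derivatives.
        vx = (xs[i + 1] - xs[i - 1]) / (2 * dt)
        vy = (ys[i + 1] - ys[i - 1]) / (2 * dt)
        ax = (xs[i + 1] - 2 * xs[i] + xs[i - 1]) / dt ** 2
        ay = (ys[i + 1] - 2 * ys[i] + ys[i - 1]) / dt ** 2
        speed = math.hypot(vx, vy)
        cross = abs(vx * ay - vy * ax)
        if speed == 0.0 or cross == 0.0:
            continue  # the logarithm is undefined for these samples
        log_speed.append(math.log10(speed))
        log_curv.append(math.log10(cross / speed ** 3))

    # Ordinary least squares slope of log_curv on log_speed.
    n = len(log_speed)
    mx, my = sum(log_speed) / n, sum(log_curv) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(log_speed, log_curv))
    sxx = sum((p - mx) ** 2 for p in log_speed)
    return sxy / sxx

slope = loglog_slope()
print(slope)
```

So a negative exponent by itself does not distinguish the data from this kind of null model; the comparison would have to be between the observed exponent and the one the chosen null model produces under the same sampling and differencing scheme.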

So, you have (or at least "approximately have") something like the "ideal" Brownian motion, where the particle (in this case, the fly) moves straight all the time except at the points where it instantly changes direction? In that case the curvature of the trajectory is zero in the straight segments and infinite at the turning points. Thus, if that is so, it makes little sense to look for any "interesting" relationship between curvature and anything else. In other words, if you want to get curvature that makes sense, first of all make sure you have enough points to see how the trajectory is curved.

A second point. "Curvature is |y"x' - x"y'| / speed^3, but both the numerator and denominator are dependent on velocity and should cancel each other out, right?" - it is going to be easier if you forget the velocities for the moment. The curvature describes a point on a curve. In this case the curve is the trajectory. The trajectory does not tell you anything about the speed. Neither will curvature (unless there is something more interesting in the underlying process).

A third point. Look at how you compute the velocities and curvatures. If you use numerical differentiation, remember that it is not resistant to noise. --Martynas Patasius (talk) 23:16, 25 July 2012 (UTC)
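The noise sensitivity of numerical differentiation is easy to demonstrate. This Python sketch differentiates a noisy sine wave with a forward difference at two sampling intervals; the signal, noise level, and function name are all assumptions chosen for illustration.

```python
import math
import random

def rms_derivative_error(dt, noise_sd, seed=0, t_end=10.0):
    """RMS error of a forward-difference derivative of sin(t) plus Gaussian noise."""
    rng = random.Random(seed)
    n = int(round(t_end / dt))
    samples = [math.sin(i * dt) + rng.gauss(0.0, noise_sd) for i in range(n + 1)]
    sq = 0.0
    for i in range(n):
        estimate = (samples[i + 1] - samples[i]) / dt
        sq += (estimate - math.cos(i * dt)) ** 2   # true derivative is cos(t)
    return math.sqrt(sq / n)

# Same measurement noise, two sampling intervals.
coarse = rms_derivative_error(dt=0.1, noise_sd=0.001)
fine = rms_derivative_error(dt=0.001, noise_sd=0.001)

print(coarse, fine)
```

The noise contribution to the derivative grows like noise_sd / dt, so sampling faster without smoothing makes the derivative estimate worse, and a second derivative (as used in curvature) is affected more strongly still.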

The resolution I'm looking at is 1/15 s -- I think this is high enough to examine curvature. The fly actually "wobbles" and we can measure things like yaw or absolute angular difference from some orientation, but this results in a noisier measure of curvature. One measure of curvature I used initially was angular velocity over linear velocity, but this was noisy compared to using the |y"x' - x"y'| / (x'^2 + y'^2)^(3/2) definition. 137.54.30.45 (talk) 00:30, 26 July 2012 (UTC)
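As a sanity check on the curvature definition quoted above, it can be applied with central differences to a path whose curvature is known. This Python sketch samples a circle of radius 5 mm at 1/15 s intervals; the radius, angular speed, and function name are made up for the example.

```python
import math

def curvature(xs, ys, dt):
    """Curvature |x'y'' - y'x''| / (x'^2 + y'^2)^(3/2) via central differences."""
    out = []
    for i in range(1, len(xs) - 1):
        vx = (xs[i + 1] - xs[i - 1]) / (2 * dt)
        vy = (ys[i + 1] - ys[i - 1]) / (2 * dt)
        ax = (xs[i + 1] - 2 * xs[i] + xs[i - 1]) / dt ** 2
        ay = (ys[i + 1] - 2 * ys[i] + ys[i - 1]) / dt ** 2
        out.append(abs(vx * ay - vy * ax) / (vx * vx + vy * vy) ** 1.5)
    return out

radius = 5.0       # mm, hypothetical
omega = 2.0        # rad/s, hypothetical
dt = 1.0 / 15.0    # s, the sampling interval mentioned above

ts = [i * dt for i in range(30)]
xs = [radius * math.cos(omega * t) for t in ts]
ys = [radius * math.sin(omega * t) for t in ts]

ks = curvature(xs, ys, dt)
print(min(ks), max(ks), 1.0 / radius)
```

The estimates sit close to 1/radius, with a small bias from the finite sampling interval. Running the same function on a synthetic path with added position noise gives a feel for how much of a measured curvature distribution could be differencing noise.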

In Brownian motion the speed and curvature are everywhere infinite. -- Meni Rosenfeld (talk) 16:38, 29 July 2012 (UTC)