Why does corrcoef return a matrix?

Python, Math, Numpy

Python Problem Overview


It seems strange to me that np.corrcoef returns a matrix.

 correlation1 = corrcoef(Strategy1Returns,Strategy2Returns)

[[ 1.         -0.99598935]
 [-0.99598935  1.        ]]

Does anyone know why this is the case and whether it is possible to return just one value in the classical sense?

Python Solutions


Solution 1 - Python

Returning a matrix allows you to compute the correlation coefficients of more than two data sets at once, e.g.

>>> from numpy import *
>>> a = array([1,2,3,4,6,7,8,9])
>>> b = array([2,4,6,8,10,12,13,15])
>>> c = array([-1,-2,-2,-3,-4,-6,-7,-8])
>>> corrcoef([a,b,c])
array([[ 1.        ,  0.99535001, -0.9805214 ],
       [ 0.99535001,  1.        , -0.97172394],
       [-0.9805214 , -0.97172394,  1.        ]])

Here we get the correlation coefficients of a,b (0.995), a,c (-0.981) and b,c (-0.972) at once. The two-data-set case is just a special case of the N-data-set case, and it is probably better to keep the same return type. Since the "one value" can be obtained simply with

>>> corrcoef(a,b)[1,0]
0.99535001355530017

there's no big reason to create the special case.

Solution 2 - Python

corrcoef returns the normalised covariance matrix.

The covariance matrix is the matrix

Cov( X, X )    Cov( X, Y )
Cov( Y, X )    Cov( Y, Y )

Normalised (each entry Cov( A, B ) divided by the product of the standard deviations of A and B), this yields the matrix:

Corr( X, X )    Corr( X, Y )
Corr( Y, X )    Corr( Y, Y )

correlation1[0, 0] is the correlation between Strategy1Returns and itself, which must be 1. You just want correlation1[0, 1].
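The normalisation described above can be checked by hand. The following is a minimal sketch, assuming two illustrative arrays x and y (hypothetical data, reusing the values from Solution 1): it divides each covariance entry by the product of the two standard deviations and compares the result with np.corrcoef.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 13.0, 15.0])

cov = np.cov(x, y)                      # 2 x 2 covariance matrix
std = np.sqrt(np.diag(cov))             # standard deviations of x and y
corr_manual = cov / np.outer(std, std)  # Cov(i, j) / (std_i * std_j)

corr = np.corrcoef(x, y)
print(np.allclose(corr_manual, corr))   # the two matrices agree
print(corr[0, 1])                       # the single off-diagonal value
```

The diagonal of corr_manual is 1 because each variance is divided by itself, which is why only the off-diagonal entry carries information in the two-variable case.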

Solution 3 - Python

The correlation matrix is the standard way to express correlations between an arbitrary finite number of variables. The correlation matrix of N data vectors is a symmetric N × N matrix with unity diagonal. Only in the case N = 2 does this matrix have one free parameter.
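These properties are easy to verify numerically. As a rough sketch, using the three arrays from Solution 1, the code below checks symmetry and the unit diagonal, then pulls out the N(N-1)/2 entries above the diagonal, which are the only distinct values in the matrix.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 6, 7, 8, 9])
b = np.array([2, 4, 6, 8, 10, 12, 13, 15])
c = np.array([-1, -2, -2, -3, -4, -6, -7, -8])

corr = np.corrcoef([a, b, c])
n = corr.shape[0]

is_symmetric = np.allclose(corr, corr.T)
has_unit_diagonal = np.allclose(np.diag(corr), 1.0)

# Indices strictly above the diagonal: one entry per distinct pair
rows, cols = np.triu_indices(n, k=1)
pairwise = corr[rows, cols]  # order: (a, b), (a, c), (b, c)

print(is_symmetric, has_unit_diagonal)
print(len(pairwise))         # n * (n - 1) // 2 entries
print(pairwise)
```

For n = 2 the same indexing yields a single entry, which is the "one value" the question asks for.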

Solution 4 - Python

You can use the following function to return only the correlation coefficient:

import numpy as np

def pearson_r(x, y):
    """Compute Pearson correlation coefficient between two arrays."""

    # Compute correlation matrix
    corr_mat = np.corrcoef(x, y)

    # Return entry [0,1]
    return corr_mat[0, 1]

Solution 5 - Python

Consider using matplotlib.cbook pieces

for example:

import numpy as np
import matplotlib.cbook as cbook
segments = cbook.pieces(np.arange(20), 3)
for s in segments:
    print(s)

Solution 6 - Python

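The numpy function correlate takes two 1-D arrays and, for inputs of equal length in its default mode, returns a single value. Note that this value is the unnormalised cross-correlation (the sum of the elementwise products), not the Pearson correlation coefficient that corrcoef computes.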
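To see how np.correlate relates to the coefficient from corrcoef, here is a minimal sketch on hypothetical data (the arrays from Solution 1). It assumes equal-length 1-D inputs and the default 'valid' mode, where np.correlate reduces to a plain sum of products; centring each array and scaling it to unit norm first recovers the Pearson value.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 13.0, 15.0])

# Default mode 'valid' on equal-length inputs: a one-element array
raw = np.correlate(x, y)[0]

# Centre each array and scale it to unit Euclidean norm
xc = x - x.mean()
yc = y - y.mean()
xn = xc / np.linalg.norm(xc)
yn = yc / np.linalg.norm(yc)
r = np.correlate(xn, yn)[0]

print(np.isclose(raw, np.dot(x, y)))           # raw value is just sum(x * y)
print(np.isclose(r, np.corrcoef(x, y)[0, 1]))  # normalised value matches corrcoef
```

In practice, indexing the result of corrcoef is simpler than normalising by hand.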

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type           Original Author     Original Content on Stackoverflow
Question               Dan                 View Question on Stackoverflow
Solution 1 - Python    kennytm             View Answer on Stackoverflow
Solution 2 - Python    Katriel             View Answer on Stackoverflow
Solution 3 - Python    Philipp             View Answer on Stackoverflow
Solution 4 - Python    Arman Aynaszyan     View Answer on Stackoverflow
Solution 5 - Python    schwater            View Answer on Stackoverflow
Solution 6 - Python    Sergio              View Answer on Stackoverflow