I am a data scientist working on time series forecasting (using R and Python 3) at the London Ambulance Service NHS Trust. I earned my PhD in cognitive neuroscience at the University of Glasgow working with fMRI data and neural networks. I favour Linux machines, and working in the terminal with Vim as my editor of choice.
View the Project on GitHub Dr-Matthew-Bennett/Matt-A-Bennett.github.io
It is often useful to normalise data. One type of normalisation is to 'zero center' the data by subtracting its mean, which yields data with a new mean of zero. Often we want to do this separately for each column or row of a matrix. It is also common to want to perform this operation on a square matrix. In that case, we can multiply by a 'centering matrix':
Here the matrix $C$ had the effect of subtracting $1/3$ from the first column, subtracting $2/3$ from the second column, and adding $1$ to the last column - ensuring that all columns sum to zero.
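As a sketch of the general form (the symbols $n$, $I_n$ and $\mathbf{1}$ are notation introduced here for illustration), the centering matrix for a matrix with $n$ rows can be written as the identity minus a matrix whose entries are all $1/n$:

$$C = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^T$$

where $\mathbf{1}$ is a column vector of $n$ ones. For $n = 3$ this gives:

$$C = \begin{bmatrix} 2/3 & -1/3 & -1/3 \\ -1/3 & 2/3 & -1/3 \\ -1/3 & -1/3 & 2/3 \end{bmatrix}$$

Multiplying $CA$ subtracts from each column of $A$ that column's mean, because $\frac{1}{n}\mathbf{1}\mathbf{1}^T A$ stacks the row of column means $n$ times. Every column of $C$ sums to zero ($\mathbf{1}^T C = \mathbf{0}^T$), which is why every column of $CA$ sums to zero as well.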
We create a function to build the centering matrix:

```python
def gen_centering(size):
    # Accept a single int (square matrix) or a [rows, cols] pair.
    if isinstance(size, int):
        size = [size, size]
    # Identity matrix minus 1/n in every entry, where n is the number of rows.
    return la.eye(size).subtract(1/size[0])
```

We then define a function that can zero center each column (axis=0), each row (axis=1), or the matrix as a whole (axis=2). If the matrix is square, we use the centering matrix approach above. Otherwise, after taking the mean along the relevant axis, we build a matrix of the same size filled with those means and subtract it from the original matrix:
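```python
def zero_center(A, axis=0):
    if axis == 2:
        # Centre on the mean of all elements.
        global_mean = mean(mean(A)).data[0][0]
        return A.subtract(global_mean)
    elif axis == 1:
        # Work on the transpose so that rows are handled as columns.
        A = A.tr()
    if A.is_square():
        # Square case: multiply by the centering matrix.
        A = gen_centering(la.size(A)).multiply(A)
    else:
        # General case: build a matrix whose rows all hold the column means.
        A_mean = mean(A)
        ones = la.gen_mat([la.size(A)[0], 1], values=[1])
        A_mean_mat = ones.multiply(A_mean)
        A = A.subtract(A_mean_mat)
    if axis == 1:
        # Undo the earlier transpose.
        A = A.tr()
    return A
```

We create two matrices, call the zero center function on the first along each axis and on the second (square) matrix with the default axis, and print the results (rounding to 2 decimal places where that makes the output look prettier):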
```python
import linalg as la

A = la.Mat([[1, 2, 3],
            [-2, 1, 4],
            [0, 1, 2],
            [3, 6, 1]])

B = la.Mat([[1, 1, 1],
            [0, 2, 0],
            [0, 3, -4]])

result = la.stats.zero_center(A)
la.print_mat(result)

result = la.stats.zero_center(A, axis=1)
la.print_mat(result, 2)

result = la.stats.zero_center(B)
la.print_mat(result, 2)
```

Outputs:
```
>>> la.print_mat(result)
[0.5, -0.5, 0.5]
[-2.5, -1.5, 1.5]
[-0.5, -1.5, -0.5]
[2.5, 3.5, -1.5]

>>> la.print_mat(result, 2)
[-1.0, 0.0, 1.0]
[-3.0, 0.0, 3.0]
[-1.0, 0.0, 1.0]
[-0.33, 2.67, -2.33]

>>> la.print_mat(result, 2)
[0.67, -1.0, 2.0]
[-0.33, 0.0, 1.0]
[-0.33, 1.0, -3.0]
```
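The two routes described above (multiplying by a centering matrix, or subtracting a matrix of means) can be checked against each other with a minimal, self-contained sketch. It uses plain Python lists rather than the `linalg` module from this post, and the helper names (`centering_matrix`, `matmul`, `center_columns`, `rounded`) are hypothetical, chosen only for this illustration:

```python
# Pure-Python sketch; these helpers are illustrative assumptions and are
# not part of the `linalg` module used in this post.

def centering_matrix(n):
    """Return the n x n matrix I - (1/n) * ones as a list of rows."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def center_columns(A):
    """Subtract each column's mean from that column."""
    n_rows = len(A)
    col_means = [sum(col) / n_rows for col in zip(*A)]
    return [[a - m for a, m in zip(row, col_means)] for row in A]


def rounded(M, ndigits=2):
    """Round every entry; adding 0.0 turns a possible -0.0 into 0.0."""
    return [[round(x, ndigits) + 0.0 for x in row] for row in M]


B = [[1, 1, 1],
     [0, 2, 0],
     [0, 3, -4]]

via_matrix = matmul(centering_matrix(len(B)), B)  # centering matrix route
via_means = center_columns(B)                     # mean subtraction route

print(rounded(via_matrix))
# [[0.67, -1.0, 2.0], [-0.33, 0.0, 1.0], [-0.33, 1.0, -3.0]]
print(rounded(via_means))
# [[0.67, -1.0, 2.0], [-0.33, 0.0, 1.0], [-0.33, 1.0, -3.0]]
```

Both routes print the same rounded matrix, which agrees with the output shown for `B`. The rounding is only there to hide floating-point noise from the thirds in the centering matrix.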