Normalize a feature in this table

Machine Learning Problem Overview

This has become quite a frustrating question, but I've asked in the Coursera discussions and they won't help. Below is the question:

[Image: the quiz question and its training-set table; not reproduced here]

I've gotten it wrong 6 times now. How do I normalize the feature? Hints are all I'm asking for.

I'm assuming x_2^(2) is the value 5184, unless I am adding the x_0 column of 1's, which they don't mention but he certainly mentions in the lectures when talking about creating the design matrix X. In which case x_2^(2) would be the value 72. Assuming one or the other is right (I'm playing a guessing game), what should I use to normalize it? He talks about 3 different ways to normalize in the lectures: one using the maximum value, another with the range/difference between max and mins, and another the standard deviation -- they want an answer correct to the hundredths. Which one am I to use? This is so confusing.

Machine Learning Solutions

Solution 1 - Machine Learning

> ...use both feature scaling (dividing by the "max-min", or range, of a feature) and mean normalization.

So for any individual feature f:

f_norm = (f - f_mean) / (f_max - f_min)

e.g. for x2 = (midterm exam)^2 = {7921, 5184, 8836, 4761}

> x2 <- c(7921, 5184, 8836, 4761)
> mean(x2)
[1] 6675.5
> max(x2) - min(x2)
[1] 4075
> round((x2 - mean(x2)) / (max(x2) - min(x2)), 3)
[1]  0.306 -0.366  0.530 -0.470

Hence norm(5184) = -0.366 (note the negative sign: 5184 is below the mean)

(using R language, which is great at vectorizing expressions like this)

I agree it's confusing that they used the notation x_2^(2) to mean x_2 (norm) or x_2'.

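EDIT: in practice people often call the builtin scale(...) function. Note that by default it subtracts the mean but divides by the standard deviation, not the range, so it only reproduces the numbers above if you pass the range yourself, e.g. scale(x2, scale = max(x2) - min(x2)).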
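
As a cross-check, here is a minimal sketch comparing division by the range with division by the standard deviation (which is what R's scale() does by default). It is written in Python rather than R purely for illustration, and the variable names are my own, not from the course:

```python
from statistics import mean, stdev

# x2 = (midterm exam)^2 for the four training examples
x2 = [7921, 5184, 8836, 4761]

mu = mean(x2)                      # 6675.5
value_range = max(x2) - min(x2)    # 4075
sd = stdev(x2)                     # sample standard deviation (n - 1 denominator)

# Mean normalization with range scaling (the formula used above)
by_range = [(v - mu) / value_range for v in x2]

# Mean normalization with standard-deviation scaling (z-scores)
by_sd = [(v - mu) / sd for v in x2]

print([round(v, 3) for v in by_range])
print([round(v, 3) for v in by_sd])
```

The two lists differ, so which divisor the quiz expects matters; the quoted instructions ask for the range.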

Solution 2 - Machine Learning

It's asking you to normalize the second feature (the second column) of the second training example, using both feature scaling and mean normalization. Therefore,

(5184 - 6675.5) / 4075 = -0.366

Solution 3 - Machine Learning

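Usually we normalize all of them to have zero mean and lie within [-1, 1].

You can do that easily by subtracting the mean of the samples and then dividing by the maximum absolute value of the centered samples. (Doing it in the other order gives zero mean but no longer guarantees the [-1, 1] bound.)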
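
A rough sketch of that idea in Python, applied to the raw midterm scores from the question. The function name center_and_scale is hypothetical, and note this is a different scaling from the range-based one the quiz asks for:

```python
def center_and_scale(values):
    """Subtract the mean, then divide by the largest absolute deviation."""
    mu = sum(values) / len(values)
    centered = [v - mu for v in values]
    max_abs = max(abs(c) for c in centered)
    if max_abs == 0:
        # All values identical: nothing to scale, return zeros
        return [0.0 for _ in values]
    return [c / max_abs for c in centered]

midterm = [89, 72, 94, 69]
scaled = center_and_scale(midterm)
print([round(v, 3) for v in scaled])
```

The result has zero mean, and the value furthest from the mean maps to exactly 1 or -1.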

Solution 4 - Machine Learning

"I'm assuming x_2^(2) is the value 5184": is this because it's the second item in the list and you're reading the subscript _2 as the row? x_2 is just the name of a variable in maths; it applies to all rows in the list. Note that the student with the highest raw mid-term exam result (i.e. the one that is not squared) goes down on the final test, and the student with the lowest raw mid-term result increases the most for the final exam result. Theta is a fixed value, a coefficient, so somewhere your normalisation of x_1 and x_2 values must become (EDIT: not negative, less than 1) in order to allow for this behaviour. That should hopefully give you a starting basis, by identifying where the pivot point is.

Solution 5 - Machine Learning

I had the same problem. In my case, the mistake was that for the average I was using the maximum x2 value (8836) minus the minimum x2 value (4761), divided by two, instead of the sum of all x2 values divided by the number of examples.
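
To make the difference concrete, here is a small Python sketch that reads the description above literally, i.e. (max - min) / 2 used in place of the mean. The variable names are illustrative only:

```python
x2 = [7921, 5184, 8836, 4761]
value_range = max(x2) - min(x2)          # 4075

# The mistaken "average": half the range
half_range = (max(x2) - min(x2)) / 2     # 2037.5

# The actual mean: sum of the values divided by the number of examples
true_mean = sum(x2) / len(x2)            # 6675.5

wrong = (5184 - half_range) / value_range
right = (5184 - true_mean) / value_range

print(round(wrong, 3), round(right, 3))
```

Only the second value matches the result in the other answers.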

Solution 6 - Machine Learning

For the same training set, I got the question as: "What is the normalized feature x_1^(3)?"

That is the 1st feature of the 3rd training example, which is 94 in the table above. The normalized form is

x = (x - mean(x's)) / range(x)

Values are :

x = 94
mean = (89 + 72 + 94 + 69) / 4 = 81
range = 94 - 69 = 25

Normalized x = (94 - 81) / 25 = 0.52
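
The indexing can be sketched in Python as follows, assuming (as this answer does) that x_j^(i) means feature j of training example i, with x_1 the midterm score and x_2 its square. The helper normalized_feature is a hypothetical name of my own:

```python
# Feature columns from the training set, keyed by feature index j
midterm = [89, 72, 94, 69]
features = {
    1: midterm,                      # x_1: midterm exam
    2: [m ** 2 for m in midterm],    # x_2: (midterm exam)^2
}

def normalized_feature(j, i):
    """Mean-normalize and range-scale feature j of training example i (1-based)."""
    column = features[j]
    mu = sum(column) / len(column)
    value_range = max(column) - min(column)
    return (column[i - 1] - mu) / value_range

print(round(normalized_feature(1, 3), 2))  # 0.52
print(round(normalized_feature(2, 2), 2))  # -0.37
```

The statistics (mean and range) always come from the whole column; the superscript only selects which row's value gets normalized.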

Solution 7 - Machine Learning

I'm taking this course at the moment, and a really trivial mistake I made the first time I answered this question was using a comma instead of a dot in the answer, since I worked it out by hand and in my country we use a comma as the decimal separator (e.g. 0,52 instead of 0.52).

The second time I tried, I used a dot and it worked fine.

Content Type	Original Author	Original Content on Stackoverflow
Question	bjd2385	View Question on Stackoverflow
Solution 1 - Machine Learning	smci	View Answer on Stackoverflow
Solution 2 - Machine Learning	user6552158	View Answer on Stackoverflow
Solution 3 - Machine Learning	Royi	View Answer on Stackoverflow
Solution 4 - Machine Learning	roganjosh	View Answer on Stackoverflow
Solution 5 - Machine Learning	jordileft	View Answer on Stackoverflow
Solution 6 - Machine Learning	Siddhesh Suhas Sathe	View Answer on Stackoverflow
Solution 7 - Machine Learning	Everton Carneiro	View Answer on Stackoverflow