This project demonstrates how to measure similarities between data objects. The topics covered are mostly from chapter 2.4: dataType4.pdf
1) Based on Table 2.8 in Dissimilarities between Data Objects (2.4.3), show an example of calculating: P2dist.pdf
A) Euclidean distance
B) L1 distance
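
As a minimal sketch of task 1, the two distances can be computed in plain Python. The four 2-D points below are assumed values chosen for illustration; substitute the coordinates from Table 2.8 if they differ. The names `euclidean`, `l1`, and `points` are hypothetical and not part of the original material.

```python
import math

def euclidean(x, y):
    """Euclidean (L2) distance: square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def l1(x, y):
    """L1 (city block) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

# Assumed example points; replace with the values from Table 2.8.
points = {"p1": (0, 2), "p2": (2, 0), "p3": (3, 1), "p4": (5, 1)}

# Print one distance matrix per measure, rounded to one decimal place.
for name, dist in [("Euclidean", euclidean), ("L1", l1)]:
    print(name)
    for row_label, p in points.items():
        row = "  ".join(f"{dist(p, q):4.1f}" for q in points.values())
        print(f"  {row_label}  {row}")
```

Each matrix is symmetric with zeros on the diagonal, which already hints at the properties examined in task 2.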
2) Prove or disprove that the Euclidean and L1 distances satisfy:
A) Positivity
(i) d(x,y) >= 0 for all x and y,
(ii) d(x,y) == 0 only if x == y.
B) Symmetry
d(x,y) == d(y,x) for all x and y.
C) Triangle Inequality
d(x,z) <= d(x,y) + d(y,z) for all points x, y, and z.
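
Before writing a proof, it can help to search for counterexamples numerically. The sketch below checks the three properties on randomly generated points. A clean run does not prove anything; it can only fail to disprove. The helper `find_violations`, the tolerance `tol`, and the random test points are all assumptions made for illustration.

```python
import itertools
import math
import random

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def l1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def find_violations(d, points, tol=1e-9):
    """Return the set of property names that d violates on the given points."""
    violations = set()
    for x, y in itertools.product(points, repeat=2):
        dxy = d(x, y)
        if dxy < 0:
            violations.add("positivity (i)")
        if dxy == 0 and x != y:
            violations.add("positivity (ii)")
        if abs(dxy - d(y, x)) > tol:
            violations.add("symmetry")
    for x, y, z in itertools.product(points, repeat=3):
        # tol absorbs floating-point rounding in the comparison
        if d(x, z) > d(x, y) + d(y, z) + tol:
            violations.add("triangle inequality")
    return violations

random.seed(0)  # fixed seed so the run is repeatable
points = [tuple(random.uniform(-10, 10) for _ in range(3)) for _ in range(20)]

for name, d in [("Euclidean", euclidean), ("L1", l1)]:
    print(name, find_violations(d, points) or "no violations found")
```

If a violation were reported, the offending points would serve as a disproof; an empty result only suggests that a proof is worth attempting.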
3) Explain with an example:
A) What is Hamming distance?
B) Is it possible to rearrange data so Euclidean distance gives the same meaning as Hamming distance?
C) For the distance measure d(x,y) = 1 - cos(x,y), are Positivity, Symmetry, and the Triangle Inequality satisfied?
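
The three questions in task 3 can be explored with a short sketch. It assumes that Hamming distance is taken as the number of positions at which two equal-length sequences differ, and that "rearranging" the data means encoding it as 0/1 vectors. All function names and example vectors are hypothetical.

```python
import math

def hamming(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(x, y))

def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def cosine_distance(x, y):
    """1 - cos(x, y); undefined for zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1 - dot / (norm_x * norm_y)

# A) Hamming distance on strings and on binary vectors.
print(hamming("karolin", "kathrin"))
u, v = (1, 0, 1, 1, 0), (1, 1, 0, 1, 0)
print(hamming(u, v))

# B) On 0/1 vectors each mismatch contributes exactly 1 to the squared
#    Euclidean distance, so Euclidean distance is sqrt(Hamming) and
#    ranks pairs in the same order.
print(squared_euclidean(u, v) == hamming(u, v))

# C) Candidate counterexamples for d = 1 - cos(x, y).
x, y, z = (1, 0), (1, 1), (0, 1)
print(cosine_distance((1, 0), (2, 0)))  # distinct points, same direction
print(cosine_distance(x, z), cosine_distance(x, y) + cosine_distance(y, z))
```

The last two lines probe positivity (ii) and the triangle inequality: two distinct but parallel vectors, and three vectors spaced 45 degrees apart.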
4) Draw conclusions about whether it matters which distance measure is picked to evaluate the dissimilarities between data objects. Consider the Euclidean, L1, and 1-cos(x,y) measures.
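
One way to approach task 4 is to check whether the nearest neighbor of a query point changes with the measure. The sketch below uses a hand-picked query and three candidate points; these values and the helper `nearest` are assumptions made for illustration, not data from the source material.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def l1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine_distance(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1 - dot / (norm_x * norm_y)

def nearest(query, candidates, dist):
    """Return the name of the candidate with the smallest distance to query."""
    return min(candidates, key=lambda name: dist(query, candidates[name]))

# Assumed illustrative data.
query = (2, 1)
candidates = {
    "a": (20, 10),  # same direction as the query, but far away
    "b": (5, 4),    # offset (3, 3) from the query
    "c": (2, 6),    # offset (0, 5) from the query
}

for name, dist in [("Euclidean", euclidean), ("L1", l1),
                   ("1 - cos", cosine_distance)]:
    print(f"{name:10s} nearest: {nearest(query, candidates, dist)}")
```

On this data each measure selects a different neighbor, which suggests that the choice of measure can change the outcome of a similarity-based analysis.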