# Data Analysis: 5 Data Correlation Metrics

## 2. Metrics

### 2.1. Euclidean Distance

``````python
from scipy.spatial import distance

# Calculate the Euclidean distance between two points
point1 = [1, 2, 3]
point2 = [4, 5, 6]

# Use the euclidean function from SciPy's distance module
euclidean_distance = distance.euclidean(point1, point2)

# Print the result
print("Euclidean Distance between the given two points: " +
      str(euclidean_distance))
``````
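
To make explicit what the library call computes, here is a minimal standard-library sketch of the same calculation: the square root of the sum of squared coordinate differences. The helper name `euclidean_by_hand` is hypothetical and is not part of SciPy.

```python
import math


def euclidean_by_hand(p, q):
    """Square root of the sum of squared coordinate differences (hypothetical helper)."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


point1 = [1, 2, 3]
point2 = [4, 5, 6]

# Each coordinate differs by 3, so the distance is sqrt(9 + 9 + 9) = sqrt(27)
dist = euclidean_by_hand(point1, point2)
print(dist)  # roughly 5.196
```

For these two points the result should agree with `distance.euclidean` up to floating-point rounding.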

### 2.2. Manhattan Distance

``````python
from scipy.spatial import distance

# Calculate the Manhattan distance between two points
point1 = [1, 2, 3]
point2 = [4, 5, 6]

# Use the cityblock function from SciPy's distance module
manhattan_distance = distance.cityblock(point1, point2)

# Print the result
print("Manhattan Distance between the given two points: " +
      str(manhattan_distance))
``````

### 2.3. Cosine Similarity

``````python
from sklearn.metrics.pairwise import cosine_similarity

# Calculate the cosine similarity between two vectors
vector1 = [1, 2, 3]
vector2 = [4, 5, 6]

# cosine_similarity expects 2D inputs, so wrap each vector in a list
# and read the single entry of the resulting 1x1 matrix
cosine_sim = cosine_similarity([vector1], [vector2])[0][0]

# Print the result
print("Cosine Similarity between the given two vectors: " +
      str(cosine_sim))
``````
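
As a cross-check that needs only the standard library, the sketch below computes the same quantity from its definition: the dot product divided by the product of the two vector lengths. The helper name `cosine_by_hand` is hypothetical, and raising an error for a zero vector is a choice assumed here, not something the original states.

```python
import math


def cosine_by_hand(u, v):
    """Dot product divided by the product of the vector lengths (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: treat the zero vector as an error instead of returning 0
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


vector1 = [1, 2, 3]
vector2 = [4, 5, 6]

# dot = 32, lengths are sqrt(14) and sqrt(77)
sim = cosine_by_hand(vector1, vector2)
print(sim)  # roughly 0.9746
```

Because only the angle between the vectors matters, scaling either vector by a positive constant leaves the result unchanged.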

### 2.4. Jaccard Similarity

``````python
def jaccard_similarity(list1, list2):
    """
    Calculates the Jaccard similarity between two lists.

    Parameters:
    list1 (list): The first list to compare.
    list2 (list): The second list to compare.

    Returns:
    float: The Jaccard similarity between the two lists.
    """
    # Convert the lists to sets for easier comparison
    s1 = set(list1)
    s2 = set(list2)

    # Calculate the Jaccard similarity by taking the length of the intersection of the sets
    # and dividing it by the length of the union of the sets
    return float(len(s1.intersection(s2)) / len(s1.union(s2)))

# Calculate the Jaccard similarity between two lists, treated as sets
set1 = [1, 2, 3]
set2 = [2, 3, 4]
jaccard_sim = jaccard_similarity(set1, set2)

# Print the result
print("Jaccard Similarity between the given two sets: " +
      str(jaccard_sim))
``````
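
One edge case worth noting: when both inputs are empty, the union is empty and the division above raises `ZeroDivisionError`. The sketch below is one possible way to handle that. The function name `jaccard_or_default` and its `empty_value` parameter are hypothetical, and returning 1.0 for two empty inputs is a convention assumed here, not one taken from the original.

```python
def jaccard_or_default(items1, items2, empty_value=1.0):
    """Jaccard similarity with an explicit result for two empty inputs (hypothetical helper)."""
    s1, s2 = set(items1), set(items2)
    union = s1 | s2
    if not union:
        # Assumption: two empty collections are treated as identical
        return empty_value
    return len(s1 & s2) / len(union)


print(jaccard_or_default([1, 2, 3], [2, 3, 4]))  # 2 shared / 4 distinct = 0.5
print(jaccard_or_default([1, 1, 2], [1, 2]))     # duplicates are ignored: 1.0
print(jaccard_or_default([], []))                # 1.0 under the assumed convention
```

Callers who prefer 0.0, or `float("nan")`, for the empty case can pass it as `empty_value`.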

### 2.5. Pearson Correlation Coefficient

``````python
import numpy as np

# Calculate the Pearson correlation coefficient between two variables
x = [1, 2, 3, 4]
y = [2, 3, 4, 5]

# np.corrcoef returns the 2x2 correlation matrix (it does not compute a p-value);
# the off-diagonal entry [0, 1] is the correlation between x and y
pearson_corr = np.corrcoef(x, y)[0, 1]

# Print the result
print("Pearson Correlation between the given two variables: " +
      str(pearson_corr))
``````
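
To show where the coefficient comes from, here is a minimal standard-library sketch that follows the definition directly: the sum of products of deviations from the means, divided by the square root of the product of the summed squared deviations. The helper name `pearson_by_hand` is hypothetical, and raising an error for constant input is a choice assumed here.

```python
import math


def pearson_by_hand(x, y):
    """Pearson correlation computed from its definition (hypothetical helper)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (numerator)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (denominator terms)
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        # Assumption: a constant variable has no defined correlation
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(ss_x * ss_y)


x = [1, 2, 3, 4]
y = [2, 3, 4, 5]

# y is exactly x + 1, so the two variables are perfectly linearly related
r = pearson_by_hand(x, y)
print(r)  # 1.0
```

Since `y` is `x` shifted by a constant, the example data gives a perfect correlation of 1.0. Reversing one of the lists gives -1.0.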