Common distance functions for measuring user similarity


Manhattan distance

def manhattan(rating1, rating2):
    """Compute the Manhattan distance over the items both users rated.

    rating1 and rating2 are both dicts of the form {item: rating}.
    """
    distance = 0
    for key in rating1:
        if key in rating2:
            distance = distance + abs(rating1[key] - rating2[key])
    return distance

Euclidean distance

from math import sqrt

def euclid(rating1, rating2):
    """Compute the Euclidean distance over the items both users rated."""
    distance = 0
    for key in rating1:
        if key in rating2:
            distance += pow(rating1[key] - rating2[key], 2)
    return sqrt(distance)

Minkowski distance

def minkowski(rating1, rating2, r):
    """Compute the Minkowski distance of order r (r = 1 is Manhattan, r = 2 is Euclidean)."""
    distance = 0
    for key in rating1:
        if key in rating2:
            distance += pow(abs(rating1[key] - rating2[key]), r)
    return pow(distance, 1.0 / r)
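
Minkowski distance generalizes the two earlier functions: r = 1 gives the Manhattan distance and r = 2 gives the Euclidean distance. The following is a minimal sketch that checks this on made-up data. The users `alice` and `bob` and their ratings are hypothetical, and `minkowski_distance` is only a compact restatement of the function above so that the snippet runs on its own.

```python
def minkowski_distance(rating1, rating2, r):
    """Minkowski distance of order r over the items both users rated."""
    common = rating1.keys() & rating2.keys()
    total = sum(abs(rating1[k] - rating2[k]) ** r for k in common)
    return total ** (1.0 / r)

# Hypothetical ratings, for illustration only.
alice = {"A": 5, "B": 1, "C": 3}
bob = {"A": 2, "B": 5, "D": 4}

# Only items "A" and "B" are shared, so the differences are 3 and 4.
manhattan_ab = minkowski_distance(alice, bob, 1)  # 3 + 4
euclid_ab = minkowski_distance(alice, bob, 2)     # sqrt(9 + 16)
print(manhattan_ab, euclid_ab)
```

Items rated by only one user ("C" and "D" here) are ignored, which matches the `if key in rating2` check in the functions above.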

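Pearson correlation coefficient

from math import sqrt

def pearson(rating1, rating2):
    """Compute the Pearson correlation over the items both users rated."""
    sum_x = 0
    sum_y = 0
    sum_xy = 0
    sum_x2 = 0
    sum_y2 = 0
    n = 0
    for key in rating1:
        if key in rating2:
            n += 1
            sum_x += rating1[key]
            sum_y += rating2[key]
            sum_xy += rating1[key] * rating2[key]
            sum_x2 += rating1[key] ** 2
            sum_y2 += rating2[key] ** 2

    # With no co-rated items the correlation is undefined; return 0.
    if n == 0:
        return 0
    denominator = sqrt(sum_x2 - (sum_x ** 2) / n) * sqrt(sum_y2 - (sum_y ** 2) / n)
    # A zero denominator means at least one user gave every item the same rating.
    if denominator == 0:
        return 0
    return (sum_xy - (sum_x * sum_y) / n) / denominator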
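
The function above uses the single-pass form of the Pearson coefficient, which needs only running sums rather than the two means computed in advance. Written out, with x_i and y_i denoting the two users' ratings for the n items both have rated (the symbols are introduced here for illustration and map onto sum_x, sum_y, sum_xy, sum_x2 and sum_y2 in the code):

```latex
r = \frac{\sum_{i=1}^{n} x_i y_i - \dfrac{\sum_{i=1}^{n} x_i \, \sum_{i=1}^{n} y_i}{n}}
         {\sqrt{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}
          \;\sqrt{\sum_{i=1}^{n} y_i^2 - \dfrac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}}}
```

The result lies between -1 and 1. Unlike the distance functions, a larger value means the two users are more alike.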

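Cosine similarity

The numerator is the dot product of the two rating vectors x and y; the denominator is the product of their lengths.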

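from math import sqrt

def cosine_similarity(rating1, rating2):
    """Compute the cosine similarity over the items both users rated."""
    sum_xy = 0
    sum_x2 = 0
    sum_y2 = 0
    for key in rating1:
        if key in rating2:
            sum_xy += rating1[key] * rating2[key]
            sum_x2 += rating1[key] ** 2
            sum_y2 += rating2[key] ** 2
    denominator = sqrt(sum_x2) * sqrt(sum_y2)
    # Undefined when there are no co-rated items or one vector is all zeros.
    if denominator == 0:
        return 0
    return sum_xy / denominator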
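
One way to use a similarity function like this is to rank the other users against a target user. The sketch below is an illustration only: the `users` dict, the names in it, and the `most_similar` helper are all hypothetical, and `cosine` is a compact restatement of the function above so that the snippet runs on its own.

```python
from math import sqrt

def cosine(rating1, rating2):
    """Cosine similarity over co-rated items; 0 when it is undefined."""
    common = rating1.keys() & rating2.keys()
    dot = sum(rating1[k] * rating2[k] for k in common)
    norm1 = sqrt(sum(rating1[k] ** 2 for k in common))
    norm2 = sqrt(sum(rating2[k] ** 2 for k in common))
    if norm1 == 0 or norm2 == 0:
        return 0
    return dot / (norm1 * norm2)

# Hypothetical users and ratings, for illustration only.
users = {
    "ann": {"A": 4, "B": 2, "C": 5},
    "ben": {"A": 2, "B": 1, "C": 3},
    "cat": {"A": 1, "B": 5},
}

def most_similar(name, users):
    """Return (similarity, user) pairs for everyone else, most similar first."""
    target = users[name]
    scores = [(cosine(target, ratings), other)
              for other, ratings in users.items() if other != name]
    return sorted(scores, reverse=True)

ranking = most_similar("ann", users)
for score, other in ranking:
    print(other, round(score, 3))
```

Because similarity grows as users become more alike, the list is sorted in descending order. With a distance function such as `manhattan` the sort would go the other way, smallest distance first.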