[Machine Learning in Action, Task 1] (KNN) Applying the k-Nearest Neighbors Algorithm

toddmark, posted 2021-09-22 10:02 / 3719 reads

Summary: The k-nearest neighbors algorithm is an effective and easy-to-learn machine learning algorithm that classifies data by measuring the distances between feature values. A perfect classifier has an error rate of 0, while the worst possible classifier has an error rate of 1.

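1 Background

1.1 Overview of the k-nearest neighbors algorithm

(1) Introduction

The k-nearest neighbors (kNN) algorithm is an effective and easy-to-learn machine learning algorithm. Put simply, it classifies data by measuring the distances between feature values.

(2) How it works

We are given a set of samples, called the training set, in which every sample carries a label. For a new data point without a label, we compute its distance to every sample in the training set and select the k samples with the smallest distances (k is usually less than 20). The label that occurs most often among those k nearest samples becomes the label of the new data point.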

(3) An example

Suppose we have counted the kissing scenes and fight scenes in 6 movies. Given a movie we have not seen, how can we decide whether it is a romance or an action movie?

Movie title	Fight scenes	Kiss scenes	Genre
California Man	3	104	Romance
He's Not Really into Dudes	2	100	Romance
Beautiful Woman	1	81	Romance
Kevin Longblade	101	10	Action
Robo Slayer 3000	99	5	Action
Amped II	98	2	Action
?	18	90	Unknown

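Following the kNN principle, we can compute the distance between the unknown movie and each known movie (Euclidean distance is used here).

Taking California Man as an example:

    >>> ((3-18)**2 + (104-90)**2)**(1/2)
    20.518284528683193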

Movie title	Distance to the unknown movie
California Man	20.5
He's Not Really into Dudes	18.9
Beautiful Woman	19.2
Kevin Longblade	115.3
Robo Slayer 3000	117.4
Amped II	118.9

We can now pick the k movies closest to the unknown one. With k=3, the three nearest movies are all romances, so we classify the unknown movie as a romance.
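
The worked example above can be reproduced in a few lines. This is only a minimal sketch using the numbers from the table; the names `movies`, `euclidean`, and `predicted` are illustrative and not part of the original code.

```python
import math
from collections import Counter

# (fight scenes, kiss scenes, genre), copied from the table above
movies = {
    "California Man": (3, 104, "romance"),
    "He's Not Really into Dudes": (2, 100, "romance"),
    "Beautiful Woman": (1, 81, "romance"),
    "Kevin Longblade": (101, 10, "action"),
    "Robo Slayer 3000": (99, 5, "action"),
    "Amped II": (98, 2, "action"),
}
unknown = (18, 90)


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Sort all movies by their distance to the unknown movie
distances = sorted(
    (euclidean(unknown, (fights, kisses)), title, genre)
    for title, (fights, kisses, genre) in movies.items()
)

k = 3
votes = Counter(genre for _, _, genre in distances[:k])
predicted = votes.most_common(1)[0][0]
print(predicted)
```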

1.2 Implementing the k-nearest neighbors algorithm in Python

(1) Compute the distance between the current point and every point in the labeled dataset

(2) Sort the distances in increasing order

(3) Take the k points closest to the current point

(4) Count how often each class occurs among those k points

(5) Return the most frequent class among the k points as the predicted class of the current point

    import numpy as np
    import operator

    def classify0(inX, dataSet, labels, k):
        dataSetSize = dataSet.shape[0]
        # Step 1: Euclidean distance from inX to every row of dataSet
        diffMat = np.tile(inX, (dataSetSize, 1)) - dataSet
        sqDiffMat = diffMat**2
        sqDistances = sqDiffMat.sum(axis=1)
        distances = sqDistances**0.5
        # Step 2: indices that sort the distances in increasing order
        sortedDistIndicies = distances.argsort()
        classCount = {}
        # Steps 3 and 4: count the labels of the k nearest points
        for i in range(k):
            voteIlabel = labels[sortedDistIndicies[i]]
            classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
        # Step 5: return the label with the most votes
        sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
        return sortedClassCount[0][0]

(6) Example

    >>> group = np.array([[1, 1.1],
    ...                   [1, 1],
    ...                   [0, 0],
    ...                   [0, 0.1]])
    >>> labels = ["A", "A", "B", "B"]
    >>> classify0([0, 0], group, labels, 3)
    'B'
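
The same five steps can also be written with NumPy broadcasting and `collections.Counter` instead of `np.tile` and a hand-built dictionary. The function below is a sketch of that alternative, not the article's implementation; the name `knn_classify` is hypothetical.

```python
from collections import Counter

import numpy as np


def knn_classify(query, data, labels, k):
    """Return the majority label among the k rows of data nearest to query."""
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    # Broadcasting subtracts query from every row, so no np.tile is needed
    dists = np.sqrt(((data - query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


group = np.array([[1, 1.1], [1, 1], [0, 0], [0, 0.1]])
labels = ["A", "A", "B", "B"]
result = knn_classify([0, 0], group, labels, 3)
print(result)
```

On the small dataset from the example above, the three nearest points carry the labels B, B and A, so the vote goes to B.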

1.3 How to test a classifier

To judge how well a classifier works, we usually compute its error rate: the number of misclassified samples divided by the total number of samples classified. A perfect classifier has an error rate of 0, while the worst possible classifier has an error rate of 1.
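
As a small illustration of this definition, the sketch below counts mismatches between predicted and true labels. The helper name `error_rate` and the sample labels are made up for the example.

```python
def error_rate(predictions, truths):
    """Fraction of predictions that differ from the true labels."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not truths:
        raise ValueError("at least one sample is required")
    wrong = sum(1 for p, t in zip(predictions, truths) if p != t)
    return wrong / len(truths)


# One of four predictions is wrong, so the error rate is 0.25
rate = error_rate([1, 2, 3, 1], [1, 2, 1, 1])
print(rate)
```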

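2 Using kNN to improve matches on a dating site

2.1 The case

Our friend Helen uses a dating app to find people to date. The site recommends various candidates, but she does not like all of them. The people she has met fall into three groups: people she did not like, people of average charm, and people of great charm. Even after noticing this pattern, Helen still cannot sort the site's recommendations into the right group, so she would like our classifier to help assign each match to the correct category.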

2.2 Preparing the data

The dataset can be downloaded from either of these two sources:

Machine Learning in Action official download (Python 2 code)

202xxx's GitHub download (Python 3 code)

The data is stored in datingTestSet2.txt, one sample per line and 1000 lines in total. Each sample has the following three features:

Frequent flyer miles earned per year, percentage of time spent playing video games, liters of ice cream consumed per week

Before the data is fed to the classifier, it has to be converted into a format the classifier can accept.

    import numpy as np

    def file2matrix(filename):
        # Read all lines once; the with-block closes the file afterwards
        with open(filename) as fr:
            lines = fr.readlines()
        numberOfLines = len(lines)                  # number of lines in the file
        returnMat = np.zeros((numberOfLines, 3))    # feature matrix to return
        classLabelVector = []                       # labels to return
        for index, line in enumerate(lines):
            listFromLine = line.strip().split("\t")  # fields are tab-separated
            returnMat[index, :] = listFromLine[0:3]
            classLabelVector.append(int(listFromLine[-1]))
        return returnMat, classLabelVector

The feature data read by file2matrix (datingDataMat) looks like this:

    array([[4.0920000e+04, 8.3269760e+00, 9.5395200e-01],
           [1.4488000e+04, 7.1534690e+00, 1.6739040e+00],
           [2.6052000e+04, 1.4418710e+00, 8.0512400e-01],
           ...,
           [2.6575000e+04, 1.0650102e+01, 8.6662700e-01],
           [4.8111000e+04, 9.1345280e+00, 7.2804500e-01],
           [4.3757000e+04, 7.8826010e+00, 1.3324460e+00]])

The label data (datingLabels) looks like this:

    [3,2,1,1,1,1,3,3,...,3,3,3]
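
For a file with this layout, `np.loadtxt` offers a shorter way to get the same two results. The sketch below assumes tab-separated rows with three features followed by an integer label; it reads from an in-memory `io.StringIO` holding the first three rows shown above instead of the real file, and the variable names are illustrative.

```python
import io

import numpy as np

# Stand-in for datingTestSet2.txt: the first three rows shown above
sample = io.StringIO(
    "40920\t8.326976\t0.953952\t3\n"
    "14488\t7.153469\t1.673904\t2\n"
    "26052\t1.441871\t0.805124\t1\n"
)

raw = np.loadtxt(sample, delimiter="\t")
features = raw[:, :3]                     # the three feature columns
labels = raw[:, 3].astype(int).tolist()   # last column holds the class label
print(features.shape, labels)
```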

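2.3 Analyzing the data: creating scatter plots with Matplotlib

(1) Percentage of time spent playing video games versus liters of ice cream consumed per week

    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(111)
    # column 1: video game time percentage, column 2: liters of ice cream
    ax.scatter(datingDataMat[:, 1], datingDataMat[:, 2],
               15.0*np.array(datingLabels), 15.0*np.array(datingLabels))
    plt.show()

Here the y-axis is liters of ice cream consumed per week and the x-axis is the percentage of time spent playing video games.

Purple marks people Helen did not like, green marks average charm, and yellow marks great charm.

(2) Frequent flyer miles versus percentage of time spent playing video games

    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(111)
    # column 0: frequent flyer miles, column 1: video game time percentage
    ax.scatter(datingDataMat[:, 0], datingDataMat[:, 1],
               15.0*np.array(datingLabels), 15.0*np.array(datingLabels))
    plt.show()

Here the y-axis is the percentage of time spent playing video games and the x-axis is frequent flyer miles.

Purple marks people Helen did not like, green marks average charm, and yellow marks great charm.

(3) Frequent flyer miles versus liters of ice cream consumed per week

    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(111)
    # column 0: frequent flyer miles, column 2: liters of ice cream
    ax.scatter(datingDataMat[:, 0], datingDataMat[:, 2],
               15.0*np.array(datingLabels), 15.0*np.array(datingLabels))
    plt.show()

Here the y-axis is liters of ice cream consumed per week and the x-axis is frequent flyer miles.

Purple marks people Helen did not like, green marks average charm, and yellow marks great charm.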

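2.4 Preparing the data: normalizing numeric values

When the distance between samples is computed with Euclidean distance, the frequent flyer miles feature has very large values, so it carries far more weight in the result than the other two features. As one of three equally important features, it should not dominate the result like this. For example, the distance between samples 3 and 4 in the table below is:

    ((0-67)**2 + (20000-32000)**2 + (1.1-0.1)**2)**(1/2)

Sample	Percentage of time spent playing video games	Frequent flyer miles	Liters of ice cream consumed per week	Class
1	0.8	400	0.5	1
2	12	134000	0.9	3
3	0	20000	1.1	2
4	67	32000	0.1	2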
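
To make the imbalance concrete, the sketch below computes how much each feature contributes to the squared distance between samples 3 and 4. The variable names are illustrative only.

```python
import numpy as np

# Samples 3 and 4 from the table: (game time %, flyer miles, ice cream liters)
a = np.array([0.0, 20000.0, 1.1])
b = np.array([67.0, 32000.0, 0.1])

squared_diffs = (a - b) ** 2
distance = np.sqrt(squared_diffs.sum())
# Share of the squared distance that comes from each feature
share = squared_diffs / squared_diffs.sum()

print(distance)
print(share)
```

The flyer miles term accounts for more than 99.9% of the squared distance, so the other two features have almost no say in which neighbors are nearest.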

When features have different value ranges, we usually normalize them, mapping the values into the range 0 to 1 or -1 to 1. Here each column is rescaled as newValue = (oldValue - column minimum) / (column maximum - column minimum).

    def autoNorm(dataSet):
        minVals = dataSet.min(0)      # per-column minimum
        maxVals = dataSet.max(0)      # per-column maximum
        ranges = maxVals - minVals
        m = dataSet.shape[0]
        normDataSet = dataSet - np.tile(minVals, (m, 1))
        normDataSet = normDataSet/np.tile(ranges, (m, 1))   # element-wise divide
        return normDataSet, ranges, minVals
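
Applied to the four samples from the table above, min-max normalization puts every column on the same 0 to 1 scale. The sketch below uses NumPy broadcasting rather than `np.tile`; it is an illustration of the formula, and the variable names are assumptions.

```python
import numpy as np

# Rows: samples 1 to 4; columns: game time %, flyer miles, ice cream liters
samples = np.array([
    [0.8, 400.0, 0.5],
    [12.0, 134000.0, 0.9],
    [0.0, 20000.0, 1.1],
    [67.0, 32000.0, 0.1],
])

min_vals = samples.min(axis=0)
ranges = samples.max(axis=0) - min_vals
# Broadcasting applies the per-column minimum and range to every row
normalized = (samples - min_vals) / ranges

print(normalized)
```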

2.5 Testing the algorithm: validating the classifier as a complete program

Evaluating accuracy is a very important step for any machine learning algorithm. We usually use 90% of the samples to train the classifier and the remaining 10% to test it. To keep the evaluation unbiased, the 10% should be chosen at random.

(1) Load the samples with file2matrix

(2) Normalize the data with autoNorm

(3) Run classify0 with 90% of the data as the training set and 10% as the test set

(4) Print the error rate on the test set

    def datingClassTest():
        hoRatio = 0.10      # hold out 10% of the data for testing
        datingDataMat, datingLabels = file2matrix("datingTestSet2.txt")   # load data set from file
        normMat, ranges, minVals = autoNorm(datingDataMat)
        m = normMat.shape[0]
        numTestVecs = int(m*hoRatio)
        errorCount = 0.0
        for i in range(numTestVecs):
            # the first numTestVecs rows are test samples; the rest form the training set
            classifierResult = classify0(normMat[i, :], normMat[numTestVecs:m, :], datingLabels[numTestVecs:m], 3)
            print("the classifier came back with: %d, the real answer is: %d" % (classifierResult, datingLabels[i]))
            if classifierResult != datingLabels[i]:
                errorCount += 1.0
        print("the total error rate is: %f" % (errorCount/float(numTestVecs)))
        print(errorCount)

The classifier's error rate on the dating dataset comes out at 2.4%, which is quite a good result. We can also change hoRatio and k to check whether the error rate rises as these values change.
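
One way to run that experiment is to loop over several hold-out ratios and values of k. Because the dating file is not bundled here, the sketch below uses synthetic two-cluster data as a stand-in; `knn_predict` and `holdout_error` are hypothetical helpers that mirror the split used in datingClassTest (first rows for testing, the rest for training).

```python
from collections import Counter

import numpy as np


def knn_predict(query, data, labels, k):
    """Majority label among the k rows of data nearest to query."""
    dists = np.sqrt(((data - query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def holdout_error(data, labels, ho_ratio, k):
    """Error rate when the first ho_ratio share of rows is held out for testing."""
    n_test = int(len(data) * ho_ratio)
    errors = 0
    for i in range(n_test):
        pred = knn_predict(data[i], data[n_test:], labels[n_test:], k)
        errors += int(pred != labels[i])
    return errors / n_test


# Synthetic stand-in data: two well-separated clusters, shuffled
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(10, 1, (50, 2))])
labels = np.array([1] * 50 + [2] * 50)
order = rng.permutation(len(data))
data, labels = data[order], labels[order]

results = {
    (ho, k): holdout_error(data, labels, ho, k)
    for ho in (0.1, 0.3, 0.5)
    for k in (1, 3, 5)
}
for (ho, k), rate in results.items():
    print("hoRatio=%.1f k=%d error rate=%.3f" % (ho, k, rate))
```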

2.6 Using the algorithm: building a complete, usable system

With everything above in place, we can write a small program for Helen. She looks up a person's details on the dating site and enters them into the program, which predicts how much she will like that person: not at all, in small doses, or in large doses.

    import numpy as np
    import operator

    def file2matrix(filename):
        # Read all lines once; the with-block closes the file afterwards
        with open(filename) as fr:
            lines = fr.readlines()
        returnMat = np.zeros((len(lines), 3))       # feature matrix to return
        classLabelVector = []                       # labels to return
        for index, line in enumerate(lines):
            listFromLine = line.strip().split("\t")  # fields are tab-separated
            returnMat[index, :] = listFromLine[0:3]
            classLabelVector.append(int(listFromLine[-1]))
        return returnMat, classLabelVector

    def autoNorm(dataSet):
        minVals = dataSet.min(0)
        maxVals = dataSet.max(0)
        ranges = maxVals - minVals
        m = dataSet.shape[0]
        normDataSet = dataSet - np.tile(minVals, (m, 1))
        normDataSet = normDataSet/np.tile(ranges, (m, 1))   # element-wise divide
        return normDataSet, ranges, minVals

    def classify0(inX, dataSet, labels, k):
        dataSetSize = dataSet.shape[0]
        diffMat = np.tile(inX, (dataSetSize, 1)) - dataSet
        sqDiffMat = diffMat**2
        sqDistances = sqDiffMat.sum(axis=1)
        distances = sqDistances**0.5
        sortedDistIndicies = distances.argsort()
        classCount = {}
        for i in range(k):
            voteIlabel = labels[sortedDistIndicies[i]]
            classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
        sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
        return sortedClassCount[0][0]

    def classifyPerson():
        resultList = ["not at all", "in small doses", "in large doses"]
        percentTats = float(input("percentage of time spent playing video games?"))
        ffMiles = float(input("frequent flier miles earned per year?"))
        iceCream = float(input("liters of ice cream consumed per week?"))
        datingDataMat, datingLabels = file2matrix("knn/datingTestSet2.txt")   # load data set from file
        normMat, ranges, minVals = autoNorm(datingDataMat)
        # the order must match the file's columns: miles, game time %, ice cream
        inArr = np.array([ffMiles, percentTats, iceCream])
        classifierResult = classify0((inArr-minVals)/ranges, normMat, datingLabels, 3)
        print("You will probably like this person:", resultList[classifierResult-1])

    if __name__ == "__main__":
        classifyPerson()
    # sample inputs: 10    10000    0.5

Entering test data (the program then prints its prediction, one of the three entries in resultList):

    percentage of time spent playing video games?10
    frequent flier miles earned per year?10000
    liters of ice cream consumed per week?0.5

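3 Using kNN to build a handwriting recognition system

3.1 The case

This case uses the classification of the digits 0-9 to show how the k-nearest neighbors algorithm can recognize handwritten digits.

Handwritten digits usually arrive as images, so we first have to convert each image into structured data that kNN can work with. Put simply, we read the pixels of the image. Pixel values usually range from 0 to 255, where 0 is black and 255 is white, so pixels with a value greater than 250 can be marked as 1 and all others as 0. The handwritten digit 1 can then be represented by the following data: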

1	1	1	1	1	1	1	1	1	1
1	1	1	1	0	0	0	1	1	1
1	1	1	1	0	0	0	1	1	1
1	1	1	1	0	0	1	1	1	1
1	1	1	1	0	0	1	1	1	1
1	1	1	1	0	0	1	1	1	1
1	1	1	1	0	0	1	1	1	1
1	1	1	1	0	0	1	1	1	1
1	1	1	0	0	0	0	1	1	1
1	1	1	1	1	1	1	1	1	1
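
The thresholding rule described above (values greater than 250 become 1, everything else becomes 0) is a one-line comparison in NumPy. The tiny 3x3 `pixels` array below is invented purely for illustration.

```python
import numpy as np

# Hypothetical 3x3 grayscale image: 255 is white, 0 is black
pixels = np.array([
    [255, 255, 255],
    [255, 12, 251],
    [250, 0, 255],
], dtype=np.uint8)

# Pixels brighter than 250 become 1, all others become 0
binary = (pixels > 250).astype(int)
print(binary)
```

Note that a pixel with the value 250 itself is mapped to 0, because the rule is "greater than 250".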

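3.2 Preparing the data: converting images into test vectors

The dataset can be downloaded from either of these two sources:

Machine Learning in Action official download (Python 2 code)

202xxx's GitHub download (Python 3 code)

The dataset is stored in digits.zip. In these files, 1 marks the handwritten strokes and 0 marks blank areas. Note that this is the reverse of the illustration in 3.1, where 0 marks the strokes.

(Happy Mid-Autumn Festival, everyone!)

The img2vector function reads a 32x32 text image and returns it as a 1x1024 array:

    def img2vector(filename):
        returnVect = np.zeros((1, 1024))
        with open(filename) as fr:
            for i in range(32):
                lineStr = fr.readline()
                for j in range(32):
                    # row i, column j of the image goes to position 32*i+j
                    returnVect[0, 32*i+j] = int(lineStr[j])
        return returnVect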
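
The same flattening can be sketched without touching the file system by parsing a string that has the layout of one digit file. `text_to_vector` and the `demo` image are hypothetical; unlike img2vector, this version returns integers rather than floats.

```python
import numpy as np


def text_to_vector(text, size=32):
    """Flatten a size x size grid of '0'/'1' characters into a 1 x size*size array."""
    rows = text.splitlines()[:size]
    grid = np.array([[int(ch) for ch in row[:size]] for row in rows])
    return grid.reshape(1, size * size)


# Made-up 32x32 image: a vertical bar four characters wide
demo = "\n".join("0" * 14 + "1111" + "0" * 14 for _ in range(32))
vec = text_to_vector(demo)
print(vec.shape, vec.sum())
```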

3.3 Testing the algorithm: recognizing handwritten digits with kNN

(1) Use listdir to read every file in the trainingDigits directory as training data

(2) Use listdir to read every file in the testDigits directory as test data

(3) Feed the training and test data into the kNN algorithm

    from os import listdir

    def handwritingClassTest():
        hwLabels = []
        trainingFileList = listdir("trainingDigits")    # load the training set
        m = len(trainingFileList)
        trainingMat = np.zeros((m, 1024))
        for i in range(m):
            fileNameStr = trainingFileList[i]
            fileStr = fileNameStr.split(".")[0]          # take off .txt
            classNumStr = int(fileStr.split("_")[0])     # the digit is the part before "_"
            hwLabels.append(classNumStr)
            trainingMat[i, :] = img2vector("trainingDigits/%s" % fileNameStr)
        testFileList = listdir("testDigits")             # iterate through the test set
        errorCount = 0.0
        mTest = len(testFileList)
        for i in range(mTest):
            fileNameStr = testFileList[i]
            fileStr = fileNameStr.split(".")[0]          # take off .txt
            classNumStr = int(fileStr.split("_")[0])
            vectorUnderTest = img2vector("testDigits/%s" % fileNameStr)
            classifierResult = classify0(vectorUnderTest, trainingMat, hwLabels, 3)
            print("the classifier came back with: %d, the real answer is: %d" % (classifierResult, classNumStr))
            if classifierResult != classNumStr:
                errorCount += 1.0
        print("\nthe total number of errors is: %d" % errorCount)
        print("\nthe total error rate is: %f" % (errorCount/float(mTest)))

The results are printed below. The error rate is 1.1628%; changing k or the training samples will change the error rate.

    the classifier came back with: 7, the real answer is: 7
    the classifier came back with: 7, the real answer is: 7
    the classifier came back with: 9, the real answer is: 9
    the classifier came back with: 0, the real answer is: 0
    the classifier came back with: 0, the real answer is: 0
    the classifier came back with: 4, the real answer is: 4
    the classifier came back with: 9, the real answer is: 9
    the classifier came back with: 7, the real answer is: 7
    the classifier came back with: 7, the real answer is: 7
    the classifier came back with: 1, the real answer is: 1
    the classifier came back with: 5, the real answer is: 5
    the classifier came back with: 4, the real answer is: 4
    the classifier came back with: 3, the real answer is: 3
    the classifier came back with: 3, the real answer is: 3

    the total number of errors is: 11

    the total error rate is: 0.011628

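4 Summary

4.1 Pros and cons of k-nearest neighbors

(1) Pros: high accuracy, insensitive to outliers, no assumptions about the input data

(2) Cons: high computational complexity, high space complexity

Works with: numeric and nominal values

4.2 General approach to k-nearest neighbors

(1) Collect the data: any method

(2) Prepare the data: numeric values are needed for the distance calculation; a structured data format is best

(3) Analyze the data: any method

(4) Train the algorithm: this step does not apply to k-nearest neighbors

(5) Test the algorithm: calculate the error rate

(6) Use the algorithm: first take some input sample data and produce structured output, then run k-nearest neighbors to decide which class the input belongs to, and finally act on the computed class.

4.3 Things to watch out for when using k-nearest neighbors

(1) When features have different units or scales, the data must be normalized; otherwise features with large values drown out those with small values

(2) The distance between data points is usually computed as Euclidean distance

(3) The choice of k has a large effect on the result; as a rule of thumb, k should be smaller than the square root of the number of training samples

(4) Cross-validation is usually used to choose the best k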
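
Point (4) can be sketched as a small k-fold cross-validation loop. The data below is synthetic (two overlapping clusters), and `knn_predict`, `cross_val_error`, and the candidate values of k are assumptions made for the example rather than anything prescribed by the article.

```python
from collections import Counter

import numpy as np


def knn_predict(query, data, labels, k):
    """Majority label among the k rows of data nearest to query."""
    dists = np.sqrt(((data - query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def cross_val_error(data, labels, k, n_folds=5):
    """Average error rate when each fold in turn is used as the test set."""
    folds = np.array_split(np.arange(len(data)), n_folds)
    errors = 0
    for fold in folds:
        train_mask = np.ones(len(data), dtype=bool)
        train_mask[fold] = False          # exclude the test fold from training
        for i in fold:
            pred = knn_predict(data[i], data[train_mask], labels[train_mask], k)
            errors += int(pred != labels[i])
    return errors / len(data)


# Synthetic stand-in data: two overlapping clusters, shuffled
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, (60, 2)), rng.normal(3, 1, (60, 2))])
labels = np.array([1] * 60 + [2] * 60)
order = rng.permutation(len(data))
data, labels = data[order], labels[order]

candidates = (1, 3, 5, 7)
scores = {k: cross_val_error(data, labels, k) for k in candidates}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

The value of k with the lowest cross-validated error rate is the one to keep; on real data the winner depends on the dataset.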

5 Reference

Machine Learning in Action


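Copyright of this article belongs to the author. Please do not repost without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reposting, please cite this article's address: http://m.hztianpu.com/yun/120005.html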
