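Summary: The importance score reflects how much influence each feature has. The larger the value, the more that feature contributes to the classification.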
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib.pyplot as plt

Split train and test
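As a small illustration of the split itself, here is a minimal sketch on a made-up frame. The name `toy` and its columns are hypothetical and not part of the original data; `random_state` is added only so the split is reproducible.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data: two feature columns followed by a target column.
toy = pd.DataFrame({
    "f1": range(10),
    "f2": range(10, 20),
    "tar": [0, 1] * 5,
})

# All columns but the last are features; the last column is the target.
features = toy.iloc[:, :-1]
target = toy.iloc[:, -1]

# Hold out 20% of the rows for testing.
x_train, x_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=0)

print(x_train.shape, x_test.shape)
```

Selecting columns with `.iloc[:, :-1]` and `.iloc[:, -1]` avoids hard-coding the column count.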
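# sklearn.cross_validation has been removed; train_test_split lives in sklearn.model_selection
from sklearn.model_selection import train_test_split

# Features are every column except the last; the last column is the target.
# DataFrame.ix has been removed from pandas, so positional .iloc is used instead.
x_train, x_test, y_train, y_test = train_test_split(
    customer.iloc[:, :-1], customer.iloc[:, -1], test_size=0.2)

# The same split for the order data (this reuses the variable names above)
x_train, x_test, y_train, y_test = train_test_split(
    order.iloc[:, :-1], order.iloc[:, -1], test_size=0.2)

Pearson Correlation for Order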
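For reference, the coefficient that `pearsonr` reports is the covariance of the two series divided by the product of their standard deviations. The following is a minimal pure-Python sketch of that formula; the helper name `pearson_r` is made up for illustration, and it does not compute the p-value that `scipy.stats.pearsonr` also returns.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same factor)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # A constant series has zero variance and raises ZeroDivisionError here
    return cov / math.sqrt(var_x * var_y)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3], [3, 2, 1]))
```

A value near 1 or -1 indicates a strong linear relationship with the target, while a value near 0 indicates little linear relationship.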
from scipy.stats import pearsonr

# Correlate every feature column of order with the target (the last column)
prr = []
for i in range(order.columns.size - 1):
    r, p = pearsonr(order.iloc[:, i], order.iloc[:, -1])
    prr.append((r, p))

# Use only the feature names (drop the target) so names and scores line up row by row
result = pd.concat([pd.DataFrame(order.columns[:-1].tolist()), pd.DataFrame(prr)], axis=1)
result.columns = ["Features", "Pearson", "Pvalue"]
result  # displays the table in a notebook
result.to_csv("result.csv", index=True, header=True)

Pearson Correlation for Customer
from scipy.stats import pearsonr

# Correlate every feature column of customer with the target (the last column)
prr = []
for i in range(customer.columns.size - 1):
    r, p = pearsonr(customer.iloc[:, i], customer.iloc[:, -1])
    prr.append((r, p))

# Use only the feature names (drop the target) so names and scores line up row by row
result = pd.concat([pd.DataFrame(customer.columns[:-1].tolist()), pd.DataFrame(prr)], axis=1)
result.columns = ["Features", "Pearson", "Pvalue"]
result  # displays the table in a notebook
result.to_csv("result.csv", index=True, header=True)

Random forest
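# Regressor, for a continuous target
from sklearn.ensemble import RandomForestRegressor
clf = RandomForestRegressor()
clf.fit(x_train, y_train)

# Classifier, for a categorical target
from sklearn.ensemble import RandomForestClassifier
# n_jobs sets the number of parallel workers, not the number of trees
clf = RandomForestClassifier(n_jobs=100)
clf.fit(x_train, y_train)

MIC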
from minepy import MINE

m = MINE()
mic = []
for i in range(customer.columns.size - 1):
    # compute_score() stores its result on the estimator; mic() reads it back.
    # Column 34 is the target column here (assumed to be the last column).
    m.compute_score(customer.iloc[:, i], customer.iloc[:, 34])
    mic.append(m.mic())

result = pd.concat([pd.DataFrame(customer.columns[:-1].tolist()), pd.DataFrame(mic)], axis=1)
result.columns = ["Features", "MIC"]
result.to_csv("result.csv", index=True, header=True)

Feature Correlation
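The two ways of correlating every column with a target column can be compared on a made-up frame. This is a minimal sketch; `toy` and its values are hypothetical, with `tar` standing in for the target column.

```python
import pandas as pd

# Hypothetical toy data: "a" rises with the target, "b" falls with it.
toy = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [5, 4, 3, 2, 1],
    "tar": [2, 4, 6, 8, 10],
})

# apply() passes each column as a Series, so correlate it with the target column
by_apply = toy.apply(lambda col: col.corr(toy["tar"]))

# corrwith() does the same thing in one call
by_corrwith = toy.corrwith(toy["tar"])

print(by_corrwith)
```

Both results are Series indexed by column name, and the `tar` entry is the target's correlation with itself.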
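# Pairwise correlation matrix of all columns
corr = customer.corr()
corr.to_csv("result.csv", index=True, header=True)

# Correlation of every column with the target column "tar".
# apply() passes each column as a Series, so the target is taken from the frame itself.
tar_corr = lambda x: x.corr(cus_call["tar"])
cus_call.apply(tar_corr)
# Equivalent and shorter
cus_call.corrwith(cus_call.tar)

Feature Importance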
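The importance score reflects how much influence each feature has. The larger the value, the more that feature contributes to the classification.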
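# Pair each feature with its rounded importance and sort by importance, largest first.
# (Sorting the bare tuples would order the rows by feature name instead.)
importances = pd.DataFrame(sorted(
    zip(x_train.columns, (round(v, 4) for v in clf.feature_importances_)),
    key=lambda pair: pair[1], reverse=True))
importances.columns = ["Features", "Importance"]
importances.to_csv("result.csv", index=True, header=True)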
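The ranking step can be tried end to end on synthetic data. The sketch below is only illustrative: the dataset from `make_classification`, the feature names `f0` to `f5`, and the parameters `n_estimators=50` and `random_state=0` are all assumptions, not values from the original analysis.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: 6 features, 3 of which are informative.
X, y = make_classification(n_samples=200, n_features=6, n_informative=3,
                           random_state=0)
X = pd.DataFrame(X, columns=[f"f{i}" for i in range(6)])

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

# One row per feature, ordered from most to least important.
ranking = (
    pd.DataFrame({"Features": X.columns,
                  "Importance": clf.feature_importances_})
    .sort_values("Importance", ascending=False)
    .reset_index(drop=True)
)
print(ranking)
```

The impurity-based importances of a fitted forest are normalized to sum to 1, so each value can be read as a share of the total.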
When reposting, please cite the address of this article: http://m.hztianpu.com/yun/44567.html