Big Data Information Station, 2018-08-13 18:00:52

Today we will take a deeper look at eight top Python machine learning algorithms and implement each of them.

Let us begin our tour of machine learning algorithms in Python.

8 Python Machine Learning Algorithms You Must Learn

The Python machine learning algorithms covered here are:

1. Linear Regression

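Linear regression is a supervised Python machine learning algorithm that observes continuous features and predicts an outcome. Depending on whether it runs on a single variable or on many features, we call it simple linear regression or multiple linear regression.

This is one of the most popular Python ML algorithms, and it is often underrated. It assigns optimal weights to the variables to create a line ax + b that predicts the output. We often use linear regression to estimate real values, such as the number of calls or the cost of houses, based on continuous variables. The regression line is the best-fitting line Y = a * X + b, and it describes the relationship between the independent and dependent variables.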
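To make "the best-fitting line" concrete, here is a minimal sketch of ordinary least squares for a single feature, written in plain Python. The helper name `fit_line` and the toy data are assumptions made for illustration; scikit-learn's `LinearRegression` solves the same problem for any number of features.

```python
def fit_line(xs, ys):
    """Return (a, b) minimising the squared error of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - a * mean_x
    return a, b


# Hypothetical toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(a, b)
```

With noisy data the same formulas still apply; the line then balances the errors rather than passing through every point.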


Let us plot this for the diabetes dataset.

>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> from sklearn import datasets, linear_model
>>> from sklearn.metrics import mean_squared_error, r2_score
>>> diabetes = datasets.load_diabetes()
>>> diabetes_X = diabetes.data[:, np.newaxis, 2]  # use a single feature
>>> diabetes_X_train = diabetes_X[:-30]  # split the data into training and test sets
>>> diabetes_X_test = diabetes_X[-30:]
>>> diabetes_y_train = diabetes.target[:-30]  # split the targets into training and test sets
>>> diabetes_y_test = diabetes.target[-30:]
>>> regr = linear_model.LinearRegression()  # linear regression object
>>> regr.fit(diabetes_X_train, diabetes_y_train)  # train the model on the training set

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

>>> diabetes_y_pred = regr.predict(diabetes_X_test)  # make predictions
>>> regr.coef_

array([941.43097333])

>>> mean_squared_error(diabetes_y_test, diabetes_y_pred)

3035.0601152912695

>>> r2_score(diabetes_y_test, diabetes_y_pred)  # variance score

0.410920728135835

>>> plt.scatter(diabetes_X_test, diabetes_y_test, color='lavender')

<matplotlib.collections.PathCollection object at 0x0584FF70>

>>> plt.plot(diabetes_X_test, diabetes_y_pred, color='pink', linewidth=3)

[<matplotlib.lines.Line2D object at 0x0584FF30>]

>>> plt.xticks(())

([], <a list of 0 Text xticklabel objects>)

>>> plt.yticks(())

([], <a list of 0 Text yticklabel objects>)

>>> plt.show()

Python Machine Learning Algorithms - Linear Regression

2. Logistic Regression

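Logistic regression is a supervised classification algorithm used to estimate discrete values such as 0/1, yes/no, and true/false, based on a given set of independent variables. We use a logistic function to predict the probability of an event, which gives an output between 0 and 1.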
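As a rough sketch of how the logistic function turns a linear score into a probability and then into a class, consider the plain-Python example below. The `weight`, `bias`, and 0.5 threshold are hypothetical values chosen for illustration; a fitted model learns its coefficients from data.

```python
import math


def logistic(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_class(x, weight, bias, threshold=0.5):
    """Return (probability, label) for a single feature value x."""
    p = logistic(weight * x + bias)
    return p, int(p >= threshold)


# Hypothetical coefficients, chosen only for illustration.
weight, bias = 2.0, -1.0
for x in (-3.0, 0.5, 3.0):
    p, label = predict_class(x, weight, bias)
    print(f"x={x:+.1f}  p={p:.3f}  class={label}")
```

The probability crosses 0.5 exactly where the linear score `weight * x + bias` is zero, which is why that point acts as the decision boundary.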

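Although the name says "regression", this is actually a classification algorithm. Logistic regression fits the data to a logit function and is also called logit regression. Let us plot it.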

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from sklearn import linear_model
>>> xmin, xmax = -7, 7  # test set: a straight line with Gaussian noise
>>> n_samples = 77
>>> np.random.seed(0)
>>> x = np.random.normal(size=n_samples)
>>> y = (x > 0).astype(float)
>>> x[x > 0] *= 3
>>> x += .4 * np.random.normal(size=n_samples)
>>> x = x[:, np.newaxis]
>>> clf = linear_model.LogisticRegression(C=1e4)  # classifier
>>> clf.fit(x, y)
>>> plt.figure(1, figsize=(3, 4))
<Figure size 300x400 with 0 Axes>
>>> plt.clf()
>>> plt.scatter(x.ravel(), y, color='lavender', zorder=17)

<matplotlib.collections.PathCollection object at 0x057B0E10>

>>> x_test = np.linspace(-7, 7, 277)
>>> def model(x):
...     return 1 / (1 + np.exp(-x))
...
>>> loss = model(x_test * clf.coef_ + clf.intercept_).ravel()
>>> plt.plot(x_test, loss, color='pink', linewidth=2.5)

[<matplotlib.lines.Line2D object at 0x057BA090>]

>>> ols = linear_model.LinearRegression()
>>> ols.fit(x, y)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

>>> plt.plot(x_test, ols.coef_ * x_test + ols.intercept_, linewidth=1)

[<matplotlib.lines.Line2D object at 0x057BA0B0>]

>>> plt.axhline(.4, color='.4')

<matplotlib.lines.Line2D object at 0x05860E70>

>>> plt.ylabel('y')

Text(0,0.5,'y')

>>> plt.xlabel('x')

Text(0.5,0,'x')

>>> plt.xticks(range(-7, 7))
>>> plt.yticks([0, 0.4, 1])
>>> plt.ylim(-.25, 1.25)

(-0.25, 1.25)

>>> plt.xlim(-4, 10)

(-4, 10)

>>> plt.legend(('Logistic Regression', 'Linear Regression'), loc='lower right', fontsize='small')

<matplotlib.legend.Legend object at 0x057C89F0>

>>> plt.show()

Machine Learning Algorithms - Logistic Regression

3. Decision Trees

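Decision trees belong to supervised Python machine learning and are used for both classification and regression, although mostly for classification. The model takes an instance, traverses the tree, and compares important features against fixed conditional statements. Whether it descends to the left child branch or the right one depends on the result. Usually, the more important features sit closer to the root.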
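The traversal described above can be sketched with a small hand-built tree. The feature names, thresholds, and labels below are hypothetical; a real decision tree learns them from the training data.

```python
# A hand-built tree stored as nested dicts. Inner nodes hold a condition,
# leaves hold a label. All values here are hypothetical.
tree = {
    "feature": "petal_length", "threshold": 2.5,
    "left": {"label": "setosa"},
    "right": {
        "feature": "petal_width", "threshold": 1.7,
        "left": {"label": "versicolor"},
        "right": {"label": "virginica"},
    },
}


def classify(node, instance):
    """Walk from the root to a leaf and return the leaf's label."""
    while "label" not in node:
        # Go left when the condition holds, right otherwise.
        if instance[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["label"]


print(classify(tree, {"petal_length": 1.4, "petal_width": 0.2}))
print(classify(tree, {"petal_length": 5.1, "petal_width": 2.0}))
```

Each prediction costs only as many comparisons as the tree is deep, which is part of why decision trees are fast at prediction time.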

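This Python machine learning algorithm works with both categorical and continuous dependent variables. Here, we split the population into two or more homogeneous sets. Let us see the algorithm in code.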

>>> import pandas as pd
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.tree import DecisionTreeClassifier
>>> from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
>>> def importdata():  # import the data
...     balance_data = pd.read_csv(
...         'https://archive.ics.uci.edu/ml/machine-learning-' +
...         'databases/balance-scale/balance-scale.data',
...         sep=',', header=None)
...     print(len(balance_data))
...     print(balance_data.shape)
...     print(balance_data.head())
...     return balance_data
...
>>> def splitdataset(balance_data):  # split the data
...     x = balance_data.values[:, 1:5]
...     y = balance_data.values[:, 0]
...     x_train, x_test, y_train, y_test = train_test_split(
...         x, y, test_size=0.3, random_state=100)
...     return x, y, x_train, x_test, y_train, y_test
...
>>> def train_using_gini(x_train, x_test, y_train):  # train with the Gini index
...     clf_gini = DecisionTreeClassifier(criterion="gini",
...         random_state=100, max_depth=3, min_samples_leaf=5)
...     clf_gini.fit(x_train, y_train)
...     return clf_gini
...
>>> def train_using_entropy(x_train, x_test, y_train):  # train with entropy
...     clf_entropy = DecisionTreeClassifier(
...         criterion="entropy", random_state=100,
...         max_depth=3, min_samples_leaf=5)
...     clf_entropy.fit(x_train, y_train)
...     return clf_entropy
...
>>> def prediction(x_test, clf_object):  # make predictions
...     y_pred = clf_object.predict(x_test)
...     print(f"Predicted values: {y_pred}")
...     return y_pred
...
>>> def cal_accuracy(y_test, y_pred):  # compute accuracy
...     print(confusion_matrix(y_test, y_pred))
...     print(accuracy_score(y_test, y_pred) * 100)
...     print(classification_report(y_test, y_pred))
...
>>> data = importdata()

625
(625, 5)
   0  1  2  3  4
0  B  1  1  1  1
1  R  1  1  1  2
2  R  1  1  1  3
3  R  1  1  1  4
4  R  1  1  1  5

>>> x, y, x_train, x_test, y_train, y_test = splitdataset(data)
>>> clf_gini = train_using_gini(x_train, x_test, y_train)
>>> clf_entropy = train_using_entropy(x_train, x_test, y_train)
>>> y_pred_gini = prediction(x_test, clf_gini)

Python Machine Learning Algorithms - Decision Trees

>>> cal_accuracy(y_test, y_pred_gini)

[[ 0  6  7]
 [ 0 67 18]
 [ 0 19 71]]

73.40425531914893

>>> y_pred_entropy = prediction(x_test, clf_entropy)
>>> cal_accuracy(y_test, y_pred_entropy)

[[ 0  6  7]
 [ 0 63 22]
 [ 0 20 70]]

70.74468085106383
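The two trees above differ only in their `criterion`, which measures how mixed the class labels in a node are. Here is a minimal sketch of both measures in plain Python; the helper names and the toy label lists are assumptions for illustration, not scikit-learn's internal code.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1 / p) over the classes."""
    n = len(labels)
    return sum((count / n) * math.log2(n / count)
               for count in Counter(labels).values())


pure = ["L", "L", "L", "L"]    # a node holding a single class
mixed = ["L", "L", "R", "R"]   # a node split evenly between two classes
print(gini(pure), entropy(pure))
print(gini(mixed), entropy(mixed))
```

Both measures are zero for a pure node and largest for an even mix, so a tree builder prefers splits that lower them the most.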

4. Support Vector Machine (SVM)

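SVM is a supervised classification Python machine learning algorithm that draws a line dividing the different categories of data. In this ML algorithm, we calculate the vector that optimizes the line, so that the closest points in each group lie as far from each other as possible. You will almost always find this to be a linear vector, but it does not have to be.

In this Python machine learning tutorial, we plot each data item as a point in n-dimensional space. We have n features, and each feature supplies the value of one coordinate.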
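The idea of keeping the closest points as far from the line as possible can be sketched by measuring the margin of a candidate line. The helper names and toy points below are hypothetical; a real SVM solver searches for the line that maximises this quantity rather than comparing hand-picked candidates.

```python
import math


def signed_distance(point, w, b):
    """Signed distance from a 2-D point to the line w[0]*x + w[1]*y + b = 0."""
    norm = math.hypot(w[0], w[1])
    return (w[0] * point[0] + w[1] * point[1] + b) / norm


def margin(points, labels, w, b):
    """Smallest distance from any point to the line, or None if the line
    puts some point on the wrong side. Labels are +1 or -1."""
    distances = [label * signed_distance(p, w, b)
                 for p, label in zip(points, labels)]
    smallest = min(distances)
    return smallest if smallest > 0 else None


# Hypothetical toy data: two classes on either side of x = 2.
points = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.0), (4.0, 1.0)]
labels = [-1, -1, +1, +1]

# Both vertical lines separate the classes; compare their margins.
print(margin(points, labels, w=(1.0, 0.0), b=-1.5))  # line x = 1.5
print(margin(points, labels, w=(1.0, 0.0), b=-2.0))  # line x = 2.0
```

Of the two candidates, the line through the middle of the gap has the larger margin, which is the one an SVM would prefer.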

First, let us plot a dataset.

>>> from sklearn.datasets import make_blobs
>>> x, y = make_blobs(n_samples=500, centers=2,
...                   random_state=0, cluster_std=0.40)
>>> import matplotlib.pyplot as plt
>>> plt.scatter(x[:, 0], x[:, 1], c=y, s=50, cmap='plasma')

<matplotlib.collections.PathCollection object at 0x04E1BBF0>

>>> plt.show()

Python Machine Learning Algorithms - SVM

>>> import numpy as np
>>> xfit = np.linspace(-1, 3.5)
>>> plt.scatter(x[:, 0], x[:, 1], c=y, s=50, cmap='plasma')

<matplotlib.collections.PathCollection object at 0x07318C90>

>>> for m, b, d in [(1, 0.65, 0.33), (0.5, 1.6, 0.55), (-0.2, 2.9, 0.2)]:
...     yfit = m * xfit + b
...     plt.plot(xfit, yfit, '-k')  # candidate separating line
...     plt.fill_between(xfit, yfit - d, yfit + d, edgecolor='none',
...                      color='#AFFEDC', alpha=0.4)  # its margin
...

[<matplotlib.lines.Line2D object at 0x07318FF0>]

<matplotlib.collections.PolyCollection object at 0x073242D0>

[<matplotlib.lines.Line2D object at 0x07318B70>]

<matplotlib.collections.PolyCollection object at 0x073246F0>

[<matplotlib.lines.Line2D object at 0x07324370>]

<matplotlib.collections.PolyCollection object at 0x07324B30>

>>> plt.xlim(-1, 3.5)

(-1, 3.5)

>>> plt.show()

Python Machine Learning Algorithms - SVM

5. Naive Bayes

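Naive Bayes is a classification method based on Bayes' theorem. It assumes independence between predictors: a Naive Bayes classifier assumes that a feature in a class is unrelated to any other feature. Consider a fruit. It is an apple if it is round, red, and 2.5 inches in diameter. A Naive Bayes classifier will say that each of these characteristics contributes independently to the probability that the fruit is an apple, even if the features actually depend on each other.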
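Here is a minimal sketch of that independence assumption applied to the fruit example. Every probability below is a made-up number for illustration, and the second class, "orange", is an assumption added so there is something to compare against; a real model estimates these values from training data.

```python
# Hypothetical prior and per-feature probabilities, for illustration only.
priors = {"apple": 0.5, "orange": 0.5}
likelihoods = {
    "apple":  {"round": 0.9, "red": 0.8, "about_2.5_inches": 0.7},
    "orange": {"round": 0.9, "red": 0.1, "about_2.5_inches": 0.6},
}


def posterior(features):
    """Return P(class | features) under the naive independence assumption."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        # Independence lets us simply multiply the per-feature probabilities.
        for feature in features:
            score *= likelihoods[cls][feature]
        scores[cls] = score
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


result = posterior(["round", "red", "about_2.5_inches"])
print(result)
```

The final division by `total` is what turns the raw products into probabilities that sum to one.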

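For very large datasets, a Naive Bayes model is easy to build. The model is very simple, yet it can perform competitively with far more sophisticated classification methods. Let us build one.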

>>> from sklearn.naive_bayes import GaussianNB
>>> from sklearn.naive_bayes import MultinomialNB
>>> from sklearn import datasets
>>> from sklearn.metrics import confusion_matrix
>>> from sklearn.model_selection import train_test_split
>>> iris = datasets.load_iris()
>>> x = iris.data
>>> y = iris.target
>>> x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)
>>> gnb = GaussianNB()
>>> mnb = MultinomialNB()
>>> y_pred_gnb = gnb.fit(x_train, y_train).predict(x_test)
>>> cnf_matrix_gnb = confusion_matrix(y_test, y_pred_gnb)
>>> cnf_matrix_gnb

array([[16,  0,  0],
       [ 0, 18,  0],
       [ 0,  0, 11]], dtype=int64)

>>> y_pred_mnb = mnb.fit(x_train, y_train).predict(x_test)
>>> cnf_matrix_mnb = confusion_matrix(y_test, y_pred_mnb)
>>> cnf_matrix_mnb

array([[16,  0,  0],
       [ 0,  0, 18],
       [ 0,  0, 11]], dtype=int64)

6. kNN (k-Nearest Neighbors)

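This is a Python machine learning algorithm for classification and regression, used mostly for classification. It is a supervised learning algorithm that keeps the training points and compares distances to them, usually with the Euclidean distance function. It classifies a new case by a majority vote of its k nearest neighbors: the case is assigned to the class that is most common among those k neighbors.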
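The majority vote can be sketched in a few lines of plain Python. The function name `knn_predict` and the toy points are assumptions for illustration; scikit-learn's `KNeighborsClassifier`, used in the rest of this section, adds faster neighbor search and more options.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every stored training point.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D toy data with two classes.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
                (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.2, 1.4)))
print(knn_predict(train_points, train_labels, (6.8, 7.0)))
```

There is no training step beyond storing the data, so all of the work happens at prediction time.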

I. Training and testing on the entire dataset

>>> from sklearn.datasets import load_iris
>>> iris = load_iris()
>>> x = iris.data
>>> y = iris.target
>>> from sklearn.linear_model import LogisticRegression
>>> logreg = LogisticRegression()
>>> logreg.fit(x, y)

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)

>>> logreg.predict(x)

array([0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
       0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
       0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
       2,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,1,1,
       1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,
       2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,2,2,
       2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2])

>>> y_pred = logreg.predict(x)
>>> len(y_pred)

150

>>> from sklearn import metrics
>>> metrics.accuracy_score(y, y_pred)

0.96

>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn = KNeighborsClassifier(n_neighbors=5)
>>> knn.fit(x, y)

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=5, p=2,
           weights='uniform')

>>> y_pred = knn.predict(x)
>>> metrics.accuracy_score(y, y_pred)

0.9666666666666667

>>> knn = KNeighborsClassifier(n_neighbors=1)
>>> knn.fit(x, y)

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=1, p=2,
           weights='uniform')

>>> y_pred = knn.predict(x)
>>> metrics.accuracy_score(y, y_pred)

1.0

II. Splitting into train/test sets

>>> x.shape

(150, 4)

>>> y.shape

(150,)

>>> from sklearn.model_selection import train_test_split
>>> x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=4)
>>> x_train.shape

(90, 4)

>>> x_test.shape

(60, 4)

>>> y_train.shape

(90,)

>>> y_test.shape

(60,)

>>> logreg = LogisticRegression()
>>> logreg.fit(x_train, y_train)
>>> y_pred = knn.predict(x_test)  # note: this scores the knn model fitted earlier; logreg.predict(x_test) would score logreg
>>> metrics.accuracy_score(y_test, y_pred)

0.9666666666666667

>>> knn = KNeighborsClassifier(n_neighbors=5)
>>> knn.fit(x_train, y_train)

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=5, p=2,
           weights='uniform')

>>> y_pred = knn.predict(x_test)
>>> metrics.accuracy_score(y_test, y_pred)

0.9666666666666667

>>> k_range = range(1, 26)
>>> scores = []
>>> for k in k_range:
...     knn = KNeighborsClassifier(n_neighbors=k)
...     knn.fit(x_train, y_train)
...     y_pred = knn.predict(x_test)
...     scores.append(metrics.accuracy_score(y_test, y_pred))
...
>>> scores

[0.95, 0.95, 0.9666666666666667, 0.9666666666666667, 0.9666666666666667, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9666666666666667, 0.9833333333333333, 0.9666666666666667, 0.9666666666666667, 0.9666666666666667, 0.9666666666666667, 0.95, 0.95]

>>> import matplotlib.pyplot as plt
>>> plt.plot(k_range, scores)

[<matplotlib.lines.Line2D object at 0x05FDECD0>]

>>> plt.xlabel('k for kNN')

Text(0.5,0,'k for kNN')

>>> plt.ylabel('Testing Accuracy')

Text(0,0.5,'Testing Accuracy')

>>> plt.show()

Python Machine Learning Algorithms - kNN (k-Nearest Neighbors)


7. K-Means

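k-Means is an unsupervised algorithm that solves clustering problems. It groups the data into a chosen number of clusters. Data points within a cluster are homogeneous, and they are heterogeneous to the points in other clusters.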
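How the clusters are found can be sketched with the standard two-step loop (Lloyd's algorithm) in plain Python. The six points are the ones used in this section's scikit-learn example; starting from the first two points is an assumption made to keep the sketch deterministic, whereas scikit-learn's `KMeans` uses k-means++ initialisation by default.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                              + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if not members:  # leave an empty cluster's centroid in place
                new_centroids.append(centroids[c])
                continue
            new_centroids.append((sum(m[0] for m in members) / len(members),
                                  sum(m[1] for m in members) / len(members)))
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids, labels


points = [(1, 2), (5, 8), (1.5, 1.8), (8, 8), (1, 0.6), (9, 11)]
# Assumption: initialise with the first two points instead of k-means++.
centroids, labels = kmeans(points, centroids=[(1, 2), (5, 8)])
print(centroids)
print(labels)
```

Because the result depends on the starting centroids, library implementations typically run several initialisations and keep the best one.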

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from matplotlib import style
>>> style.use('ggplot')
>>> from sklearn.cluster import KMeans
>>> x = [1, 5, 1.5, 8, 1, 9]
>>> y = [2, 8, 1.7, 6, 0.2, 12]
>>> plt.scatter(x, y)

<matplotlib.collections.PathCollection object at 0x0642AF30>

>>> x = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8], [1, 0.6], [9, 11]])
>>> kmeans = KMeans(n_clusters=2)
>>> kmeans.fit(x)

KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
    n_clusters=2, n_init=10, n_jobs=1, precompute_distances='auto',
    random_state=None, tol=0.0001, verbose=0)

>>> centroids = kmeans.cluster_centers_
>>> labels = kmeans.labels_
>>> centroids

array([[1.16666667, 1.46666667],
       [7.33333333, 9.        ]])

>>> labels

array([0, 1, 0, 1, 0, 1])

>>> colors = ['g.', 'r.', 'c.', 'y.']
>>> for i in range(len(x)):
...     print(x[i], labels[i])
...     plt.plot(x[i][0], x[i][1], colors[labels[i]], markersize=10)
...

[1. 2.] 0

[<matplotlib.lines.Line2D object at 0x0642AE10>]

[5. 8.] 1

[<matplotlib.lines.Line2D object at 0x06438930>]

[1.5 1.8] 0

[<matplotlib.lines.Line2D object at 0x06438BF0>]

[8. 8.] 1

[<matplotlib.lines.Line2D object at 0x06438EB0>]

[1.  0.6] 0

[<matplotlib.lines.Line2D object at 0x06438FB0>]

[ 9. 11.] 1

[<matplotlib.lines.Line2D object at 0x043B1410>]

>>> plt.scatter(centroids[:, 0], centroids[:, 1], marker='x', s=150, linewidths=5, zorder=10)

<matplotlib.collections.PathCollection object at 0x043B14D0>

>>> plt.show()

8. Random Forest

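A random forest is an ensemble of decision trees. To classify a new object based on its attributes, each tree gives a classification, that is, it votes for a class. The classification with the most votes wins in the forest.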
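The voting step can be sketched as follows. The three "trees" are hand-written rules with hypothetical thresholds, standing in for trained decision trees; in a real random forest each tree is trained on a bootstrap sample of the data and considers a random subset of features at each split.

```python
from collections import Counter


# Three hypothetical trees, written by hand as simple rules.
def tree_a(flower):
    return "setosa" if flower["petal_length"] < 2.5 else "versicolor"


def tree_b(flower):
    if flower["petal_width"] < 0.8:
        return "setosa"
    return "versicolor" if flower["petal_width"] < 1.75 else "virginica"


def tree_c(flower):
    if flower["petal_length"] < 2.5:
        return "setosa"
    return "versicolor" if flower["petal_length"] < 4.9 else "virginica"


def forest_predict(trees, instance):
    """Collect one vote per tree and return (winning class, vote counts)."""
    votes = Counter(tree(instance) for tree in trees)
    return votes.most_common(1)[0][0], votes


winner, votes = forest_predict([tree_a, tree_b, tree_c],
                               {"petal_length": 5.0, "petal_width": 1.9})
print(winner, dict(votes))
```

Here one tree is wrong on its own, but the other two outvote it, which is the effect the ensemble relies on.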

>>> import numpy as np
>>> import pylab as pl
>>> x = np.random.uniform(1, 100, 1000)
>>> y = np.log(x) + np.random.normal(0, .3, 1000)
>>> pl.scatter(x, y, s=1, label='log(x) with noise')

<matplotlib.collections.PathCollection object at 0x0434EC50>

>>> pl.plot(np.arange(1, 100), np.log(np.arange(1, 100)), c='b', label='log(x) true function')

[<matplotlib.lines.Line2D object at 0x0434EB30>]

>>> pl.xlabel('x')

Text(0.5,0,'x')

>>> pl.ylabel('f(x) = log(x)')

Text(0,0.5,'f(x) = log(x)')

>>> pl.legend(loc='best')

<matplotlib.legend.Legend object at 0x04386450>

>>> pl.title('A basic log function')

Text(0.5,1,'A basic log function')

>>> pl.show()

Python Machine Learning Algorithms -

>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>> import pandas as pd
>>> import numpy as np
>>> iris = load_iris()
>>> df = pd.DataFrame(iris.data, columns=iris.feature_names)
>>> df['is_train'] = np.random.uniform(0, 1, len(df)) <= .75  # roughly 75% of rows go to training
>>> df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)
>>> df.head()

   sepal length (cm)  sepal width (cm)  ...  is_train  species
0                5.1               3.5  ...      True   setosa
1                4.9               3.0  ...      True   setosa
2                4.7               3.2  ...      True   setosa
3                4.6               3.1  ...      True   setosa
4                5.0               3.6  ...     False   setosa

[5 rows x 6 columns]

>>> train, test = df[df['is_train'] == True], df[df['is_train'] == False]
>>> features = df.columns[:4]
>>> clf = RandomForestClassifier(n_jobs=2)
>>> y, _ = pd.factorize(train['species'])
>>> clf.fit(train[features], y)

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=2,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)

>>> preds = iris.target_names[clf.predict(test[features])]
>>> pd.crosstab(test['species'], preds, rownames=['actual'], colnames=['preds'])

preds       setosa  versicolor  virginica
actual
setosa          12           0          0
versicolor       0          17          2
virginica        0           1         15

So, that was the Python machine learning algorithms tutorial, in which we discussed eight important Python machine learning algorithms. We hope you liked it.
