
Plotting a Confusion Matrix with matplotlib in Python: An Example

Date: 2022-07-21 09:43:50

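As mentioned in an earlier post, the confusion matrix is an important metric when working on classification problems. So how can we present it well? You can build a table by hand or visualize it in a front-end tool. I once tried drawing it with a front-end library (D5) and taking a screenshot, but the result did not look very good.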

Code:

import itertools

import matplotlib.pyplot as plt
import numpy as np


def plot_confusion_matrix(cm, classes, normalize=False,
                          title='Confusion matrix', cmap=plt.cm.Blues):
    '''
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    '''
    if normalize:
        # Divide each row by its total so every cell is a fraction of the true class
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print('Normalized confusion matrix')
    else:
        print('Confusion matrix, without normalization')
    print(cm)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    # Write the value in every cell, in white on dark cells and black on light ones
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment='center',
                 color='white' if cm[i, j] > thresh else 'black')

    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.tight_layout()  # called after the axis labels so they are not clipped
    # To save the figure, call savefig before show():
    # plt.savefig('confusion_matrix', dpi=200)
    plt.show()


cnf_matrix = np.array([
    [4101, 2, 5, 24, 0],
    [50, 3930, 6, 14, 5],
    [29, 3, 3973, 4, 0],
    [45, 7, 1, 3878, 119],
    [31, 1, 8, 28, 3936],
])
class_names = ['Buildings', 'Farmland', 'Greenbelt', 'Wasteland', 'Water']

# Plot non-normalized confusion matrix
# plt.figure()
# plot_confusion_matrix(cnf_matrix, classes=class_names,
#                       title='Confusion matrix, without normalization')

# Plot normalized confusion matrix
plt.figure()
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

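Replace cnf_matrix with your own confusion matrix and the script is ready to use. The visualization step can also be done directly while the model is running.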
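The normalization step in the function above divides every cell by the total of its row, so each row shows how the samples of one true class were distributed across the predicted classes. The sketch below reproduces that arithmetic in plain Python on the article's sample matrix. The helper name `normalize_rows` is made up for illustration, and the zero-row guard is an added assumption (the NumPy expression in the article would produce NaN for a class with no samples).

```python
# Row-normalize a confusion matrix using only the standard library.
cnf_matrix = [
    [4101, 2, 5, 24, 0],
    [50, 3930, 6, 14, 5],
    [29, 3, 3973, 4, 0],
    [45, 7, 1, 3878, 119],
    [31, 1, 8, 28, 3936],
]
class_names = ["Buildings", "Farmland", "Greenbelt", "Wasteland", "Water"]


def normalize_rows(cm):
    """Divide each cell by its row total (the number of true samples of that class)."""
    normalized = []
    for row in cm:
        total = sum(row)
        # Assumption: a class with no true samples becomes a row of zeros.
        normalized.append([cell / total if total else 0.0 for cell in row])
    return normalized


normalized = normalize_rows(cnf_matrix)
for name, row in zip(class_names, normalized):
    print(name.ljust(10), " ".join(format(value, ".2f") for value in row))
```

After normalization the diagonal entry of each row is the fraction of that class predicted correctly, which is why the normalized plot is easier to compare across classes of different sizes.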

Supplementary notes: how the confusion matrix works and how to use it (scikit-learn and TensorFlow)

How it works

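In machine learning, a confusion matrix is an error matrix commonly used to visually evaluate the performance of a supervised learning algorithm. It is a square matrix of size (n_classes, n_classes), where n_classes is the number of classes. Each row represents the instances of a true class, and each column represents the instances of a predicted class (the convention used by TensorFlow and scikit-learn). The opposite convention also exists, in which each row represents a predicted class and each column a true class (the definition given in the Wikipedia article "Confusion matrix"). The matrix makes it easy to see whether the system confuses two classes, which is where the name comes from.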
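The two conventions described above differ only by a transpose. The following is a minimal sketch in plain Python, not the scikit-learn or TensorFlow implementation; the function names and the sample labels are made up for illustration.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count samples with rows indexing the true class and columns the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def transpose(cm):
    """Swap rows and columns to get the rows-are-predictions convention."""
    return [list(column) for column in zip(*cm)]


# Hypothetical labels for a three-class problem.
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
print(cm)             # rows = true class, columns = predicted class
print(transpose(cm))  # rows = predicted class, columns = true class
```

Here `cm[2][0] == 1` reads as "one sample of true class 2 was predicted as class 0"; in the transposed layout the same count sits at position `[0][2]`.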

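A confusion matrix is a special kind of contingency table (also called a cross tabulation or crosstab). It has two dimensions (the actual values and the predicted values), and both dimensions share the same set of classes. In a contingency table, each combination of dimension and class is a variable. A contingency table presents the frequency distribution of multiple variables visually, in tabular form.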

Using the confusion matrix (scikit-learn and TensorFlow)

This section first introduces the API (Application Programming Interface) functions that compute a confusion matrix in scikit-learn and TensorFlow, and then uses both functions in one example.

The scikit-learn confusion matrix function: the sklearn.metrics.confusion_matrix API

sklearn.metrics.confusion_matrix(
    y_true,              # array, ground truth (correct) target values
    y_pred,              # array, estimated targets as returned by a classifier
    labels=None,         # array, list of labels to index the matrix
    sample_weight=None   # array-like of shape = [n_samples], optional sample weights
)

In scikit-learn, the confusion matrix is computed to evaluate the accuracy of a classification.

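By definition, the element C[i, j] of a confusion matrix C is the number of observations whose true group is i and whose predicted group is j. For a binary classification task, the number of true negatives (TN) is therefore C[0, 0], the number of false negatives (FN) is C[1, 0], the number of true positives (TP) is C[1, 1], and the number of false positives (FP) is C[0, 1].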
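The binary-case mapping above can be checked with a few lines of plain Python. This is only a sketch with made-up labels (0 is the negative class, 1 the positive class); with scikit-learn, the same four numbers are commonly unpacked from the flattened 2x2 matrix in the order tn, fp, fn, tp.

```python
# Hypothetical binary labels: 0 = negative, 1 = positive.
y_true = [0, 1, 0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

# C[i][j] counts samples whose true class is i and predicted class is j.
C = [[0, 0], [0, 0]]
for t, p in zip(y_true, y_pred):
    C[t][p] += 1

tn, fp = C[0]  # true class 0: predicted 0 (TN), predicted 1 (FP)
fn, tp = C[1]  # true class 1: predicted 0 (FN), predicted 1 (TP)

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print("TN=%d FP=%d FN=%d TP=%d" % (tn, fp, fn, tp))
print("precision=%.2f recall=%.2f" % (precision, recall))
```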

If labels is None, scikit-learn uses every value that appears in y_true or y_pred as the label list, in sorted order.
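The default-label behavior can be imitated in plain Python as follows. This is a sketch of the idea rather than scikit-learn's code: the function names are hypothetical, and skipping samples whose label is not in an explicitly passed `labels` list is an assumption made for this illustration.

```python
def infer_labels(y_true, y_pred):
    """Sorted union of every value seen in either list."""
    return sorted(set(y_true) | set(y_pred))


def confusion_matrix(y_true, y_pred, labels=None):
    if labels is None:
        labels = infer_labels(y_true, y_pred)
    index = {label: i for i, label in enumerate(labels)}
    cm = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        # Assumption for this sketch: ignore pairs with a label outside `labels`.
        if t in index and p in index:
            cm[index[t]][index[p]] += 1
    return labels, cm


labels, cm = confusion_matrix([1, 2, 4], [2, 2, 4])
print(labels)  # the matrix rows and columns follow this order
print(cm)
```

Because only the values 1, 2 and 4 occur, the result is a 3x3 matrix indexed by those three labels, not a matrix with a row for every integer from 0 to 4.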

The TensorFlow confusion matrix function: the tf.confusion_matrix API

tf.confusion_matrix(
    labels,            # 1-D Tensor of real labels for the classification task
    predictions,       # 1-D Tensor of predictions for a given classification
    num_classes=None,  # The possible number of labels the classification task can have
    dtype=tf.int32,    # Data type of the confusion matrix
    name=None,         # Scope name
    weights=None,      # An optional Tensor whose shape matches predictions
)

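The num_classes parameter of tf.confusion_matrix plays a role similar to the labels parameter of sklearn.metrics.confusion_matrix: both describe the labels, but num_classes gives only the total number of classes without listing the label values. TensorFlow generally uses integers as labels, so labels of a non-integer type such as strings must first be converted to integers. If num_classes is None, the largest value found in labels and predictions, plus 1, is used as num_classes. Note that tf.confusion_matrix is the TensorFlow 1.x name; in TensorFlow 2.x the function is available as tf.math.confusion_matrix.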
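The effect of num_classes can be imitated without TensorFlow. The sketch below is a plain-Python stand-in, not the TensorFlow implementation; the function name and the string-to-integer mapping at the end are illustrative assumptions.

```python
def confusion_matrix(labels, predictions, num_classes=None):
    """Integer-label confusion matrix; rows are true labels, columns are predictions."""
    if num_classes is None:
        # Default described in the text: the largest value seen, plus 1.
        num_classes = max(max(labels), max(predictions)) + 1
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(labels, predictions):
        cm[t][p] += 1
    return cm


cm_default = confusion_matrix([1, 2, 4], [2, 2, 4])            # 5 x 5, since max label is 4
cm_six = confusion_matrix([1, 2, 4], [2, 2, 4], num_classes=6)  # 6 x 6
print(len(cm_default), len(cm_six))

# String labels must be mapped to integers first (one possible mapping).
names = ["water", "farmland", "water"]
to_id = {name: i for i, name in enumerate(sorted(set(names)))}
ids = [to_id[name] for name in names]
print(to_id, ids)
```

Unlike the scikit-learn default, the integer-indexed matrix keeps empty rows and columns for unused values such as 0 and 3.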

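The weights parameter of tf.confusion_matrix and the sample_weight parameter of sklearn.metrics.confusion_matrix have the same meaning: each sample's prediction is weighted, and the value of every confusion matrix cell is computed from the weighted counts.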
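With weights, each sample adds its weight to its cell instead of adding 1. The sketch below shows this in plain Python with the same labels and weights as the usage example; the function name is hypothetical and the code is not taken from either library.

```python
def weighted_confusion_matrix(y_true, y_pred, num_classes, weights=None):
    """Each sample contributes its weight (1.0 by default) to cell [true][predicted]."""
    if weights is None:
        weights = [1.0] * len(y_true)
    cm = [[0.0] * num_classes for _ in range(num_classes)]
    for t, p, w in zip(y_true, y_pred, weights):
        cm[t][p] += w
    return cm


cm = weighted_confusion_matrix([1, 2, 4], [2, 2, 4], num_classes=6,
                               weights=[0.3, 0.4, 0.3])
for row in cm:
    print(row)
```

The sum of all cells equals the sum of the weights, so weights that add up to 1 turn the matrix into a weighted frequency table.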

Usage example

#!/usr/bin/env python
# -*- coding: utf8 -*-
'''
Author: klchang
Description: A simple example for tf.confusion_matrix and sklearn.metrics.confusion_matrix.
Date: 2018.9.8
'''
from __future__ import print_function

import tensorflow as tf
import sklearn.metrics

y_true = [1, 2, 4]
y_pred = [2, 2, 4]

# Build graph with tf.confusion_matrix operations (TensorFlow 1.x API)
sess = tf.InteractiveSession()
op = tf.confusion_matrix(y_true, y_pred)
op2 = tf.confusion_matrix(y_true, y_pred, num_classes=6, dtype=tf.float32,
                          weights=tf.constant([0.3, 0.4, 0.3]))
# Execute the graph
print('confusion matrix in tensorflow: ')
print('1. default: \n', op.eval())
print('2. customized: \n', sess.run(op2))
sess.close()

# Use sklearn.metrics.confusion_matrix function
print('\nconfusion matrix in scikit-learn: ')
print('1. default: \n', sklearn.metrics.confusion_matrix(y_true, y_pred))
print('2. customized: \n', sklearn.metrics.confusion_matrix(
    y_true, y_pred, labels=list(range(6)), sample_weight=[0.3, 0.4, 0.3]))

That is everything I have to share in this example of plotting a confusion matrix with matplotlib in Python. I hope it serves as a useful reference.
