pycm

Multi-class confusion matrix library in Python



Table of contents

Overview

Installation

⚠️ PyCM 2.4 is the last version to support Python 2.7 & Python 3.4

Source code

  • Download Version 2.5 or Latest Source
  • Run pip install -r requirements.txt or pip3 install -r requirements.txt (Need root access)
  • Run python3 setup.py install or python setup.py install (Need root access)

PyPI

Conda

Easy install

  • Run easy_install --upgrade pycm (Need root access)

Docker

  • Run docker pull sepandhaghighi/pycm (Need root access)
  • Configuration :
    • Ubuntu 16.04
    • Python 3.6

Usage

From vector

>>> from pycm import *
>>> y_actu = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2] # or y_actu = numpy.array([2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2])
>>> y_pred = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2] # or y_pred = numpy.array([0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2])
>>> cm = ConfusionMatrix(actual_vector=y_actu, predict_vector=y_pred) # Create CM From Data
>>> cm.classes
[0, 1, 2]
>>> cm.table
{0: {0: 3, 1: 0, 2: 0}, 1: {0: 0, 1: 1, 2: 2}, 2: {0: 2, 1: 1, 2: 3}}
>>> print(cm)
Predict 0       1       2       
Actual
0       3       0       0       

1       0       1       2       

2       2       1       3       





Overall Statistics : 

95% CI                                                            (0.30439,0.86228)
ACC Macro                                                         0.72222
AUNP                                                              0.66667
AUNU                                                              0.69444
Bennett S                                                         0.375
CBA                                                               0.47778
CSI                                                               0.17778
Chi-Squared                                                       6.6
Chi-Squared DF                                                    4
Conditional Entropy                                               0.95915
Cramer V                                                          0.5244
Cross Entropy                                                     1.59352
F1 Macro                                                          0.56515
F1 Micro                                                          0.58333
Gwet AC1                                                          0.38931
Hamming Loss                                                      0.41667
Joint Entropy                                                     2.45915
KL Divergence                                                     0.09352
Kappa                                                             0.35484
Kappa 95% CI                                                      (-0.07708,0.78675)
Kappa No Prevalence                                               0.16667
Kappa Standard Error                                              0.22036
Kappa Unbiased                                                    0.34426
Lambda A                                                          0.16667
Lambda B                                                          0.42857
Mutual Information                                                0.52421
NIR                                                               0.5
Overall ACC                                                       0.58333
Overall CEN                                                       0.46381
Overall J                                                         (1.225,0.40833)
Overall MCC                                                       0.36667
Overall MCEN                                                      0.51894
Overall RACC                                                      0.35417
Overall RACCU                                                     0.36458
P-Value                                                           0.38721
PPV Macro                                                         0.56667
PPV Micro                                                         0.58333
Pearson C                                                         0.59568
Phi-Squared                                                       0.55
RCI                                                               0.34947
RR                                                                4.0
Reference Entropy                                                 1.5
Response Entropy                                                  1.48336
SOA1(Landis & Koch)                                               Fair
SOA2(Fleiss)                                                      Poor
SOA3(Altman)                                                      Fair
SOA4(Cicchetti)                                                   Poor
SOA5(Cramer)                                                      Relatively Strong
SOA6(Matthews)                                                    Weak
Scott PI                                                          0.34426
Standard Error                                                    0.14232
TPR Macro                                                         0.61111
TPR Micro                                                         0.58333
Zero-one Loss                                                     5

Class Statistics :

Classes                                                           0             1             2             
ACC(Accuracy)                                                     0.83333       0.75          0.58333       
AGF(Adjusted F-score)                                             0.9136        0.53995       0.5516        
AGM(Adjusted geometric mean)                                      0.83729       0.692         0.60712       
AM(Difference between automatic and manual classification)        2             -1            -1            
AUC(Area under the ROC curve)                                     0.88889       0.61111       0.58333       
AUCI(AUC value interpretation)                                    Very Good     Fair          Poor          
AUPR(Area under the PR curve)                                     0.8           0.41667       0.55          
BCD(Bray-Curtis dissimilarity)                                    0.08333       0.04167       0.04167       
BM(Informedness or bookmaker informedness)                        0.77778       0.22222       0.16667       
CEN(Confusion entropy)                                            0.25          0.49658       0.60442       
DOR(Diagnostic odds ratio)                                        None          4.0           2.0           
DP(Discriminant power)                                            None          0.33193       0.16597       
DPI(Discriminant power interpretation)                            None          Poor          Poor          
ERR(Error rate)                                                   0.16667       0.25          0.41667       
F0.5(F0.5 score)                                                  0.65217       0.45455       0.57692       
F1(F1 score - harmonic mean of precision and sensitivity)         0.75          0.4           0.54545       
F2(F2 score)                                                      0.88235       0.35714       0.51724       
FDR(False discovery rate)                                         0.4           0.5           0.4           
FN(False negative/miss/type 2 error)                              0             2             3             
FNR(Miss rate or false negative rate)                             0.0           0.66667       0.5           
FOR(False omission rate)                                          0.0           0.2           0.42857       
FP(False positive/type 1 error/false alarm)                       2             1             2             
FPR(Fall-out or false positive rate)                              0.22222       0.11111       0.33333       
G(G-measure geometric mean of precision and sensitivity)          0.7746        0.40825       0.54772       
GI(Gini index)                                                    0.77778       0.22222       0.16667       
GM(G-mean geometric mean of specificity and sensitivity)          0.88192       0.54433       0.57735       
IBA(Index of balanced accuracy)                                   0.95062       0.13169       0.27778       
ICSI(Individual classification success index)                     0.6           -0.16667      0.1           
IS(Information score)                                             1.26303       1.0           0.26303       
J(Jaccard index)                                                  0.6           0.25          0.375         
LS(Lift score)                                                    2.4           2.0           1.2           
MCC(Matthews correlation coefficient)                             0.68313       0.2582        0.16903       
MCCI(Matthews correlation coefficient interpretation)             Moderate      Negligible    Negligible    
MCEN(Modified confusion entropy)                                  0.26439       0.5           0.6875        
MK(Markedness)                                                    0.6           0.3           0.17143       
N(Condition negative)                                             9             9             6             
NLR(Negative likelihood ratio)                                    0.0           0.75          0.75          
NLRI(Negative likelihood ratio interpretation)                    Good          Negligible    Negligible    
NPV(Negative predictive value)                                    1.0           0.8           0.57143       
OC(Overlap coefficient)                                           1.0           0.5           0.6           
OOC(Otsuka-Ochiai coefficient)                                    0.7746        0.40825       0.54772       
OP(Optimized precision)                                           0.70833       0.29545       0.44048       
P(Condition positive or support)                                  3             3             6             
PLR(Positive likelihood ratio)                                    4.5           3.0           1.5           
PLRI(Positive likelihood ratio interpretation)                    Poor          Poor          Poor          
POP(Population)                                                   12            12            12            
PPV(Precision or positive predictive value)                       0.6           0.5           0.6           
PRE(Prevalence)                                                   0.25          0.25          0.5           
Q(Yule Q - coefficient of colligation)                            None          0.6           0.33333       
RACC(Random accuracy)                                             0.10417       0.04167       0.20833       
RACCU(Random accuracy unbiased)                                   0.11111       0.0434        0.21007       
TN(True negative/correct rejection)                               7             8             4             
TNR(Specificity or true negative rate)                            0.77778       0.88889       0.66667       
TON(Test outcome negative)                                        7             10            7             
TOP(Test outcome positive)                                        5             2             5             
TP(True positive/hit)                                             3             1             3             
TPR(Sensitivity, recall, hit rate, or true positive rate)         1.0           0.33333       0.5           
Y(Youden index)                                                   0.77778       0.22222       0.16667       
dInd(Distance index)                                              0.22222       0.67586       0.60093       
sInd(Similarity index)                                            0.84287       0.52209       0.57508

>>> cm.print_matrix()
Predict          0    1    2    
Actual
0                3    0    0    

1                0    1    2    

2                2    1    3    

>>> cm.print_normalized_matrix()
Predict          0          1          2          
Actual
0                1.0        0.0        0.0        

1                0.0        0.33333    0.66667    

2                0.33333    0.16667    0.5        

>>> cm.print_matrix(one_vs_all=True,class_name=0)   # One-Vs-All, new in version 1.4
Predict          0    ~    
Actual
0                3    0    

~                2    7    
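The session above can be mirrored in plain Python to see what `cm.table`, the normalized matrix, and the one-vs-all view contain. This is a minimal sketch that does not use PyCM itself; the helper names `build_table`, `normalize_rows`, and `one_vs_all` are hypothetical and are not part of the library.

```python
def build_table(actual, predicted):
    """Count (actual, predicted) pairs into a nested dict: table[actual][predicted]."""
    classes = sorted(set(actual) | set(predicted))
    table = {a: {p: 0 for p in classes} for a in classes}
    for a, p in zip(actual, predicted):
        table[a][p] += 1
    return table


def normalize_rows(table, digits=5):
    """Divide each row by its total so every actual class sums to 1."""
    normalized = {}
    for a, row in table.items():
        total = sum(row.values())
        normalized[a] = {p: round(n / total, digits) if total else 0.0
                         for p, n in row.items()}
    return normalized


def one_vs_all(table, class_name):
    """Collapse a multi-class table to TP/FN/FP/TN for a single class."""
    tp = table[class_name][class_name]
    fn = sum(table[class_name].values()) - tp                    # rest of the actual row
    fp = sum(row[class_name] for row in table.values()) - tp     # rest of the predicted column
    total = sum(sum(row.values()) for row in table.values())
    return {"TP": tp, "FN": fn, "FP": fp, "TN": total - tp - fn - fp}


y_actu = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2]
y_pred = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2]

table = build_table(y_actu, y_pred)
normalized = normalize_rows(table)
binary_view = one_vs_all(table, 0)
print(table)
print(normalized[1])
print(binary_view)
```

The counts agree with the `cm.table`, `print_normalized_matrix()`, and one-vs-all outputs shown above (for class 0: TP=3, FN=0, FP=2, TN=7).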


Direct CM

>>> from pycm import *
>>> cm2 = ConfusionMatrix(matrix={"Class1": {"Class1": 1, "Class2":2}, "Class2": {"Class1": 0, "Class2": 5}}) # Create CM Directly
>>> cm2
pycm.ConfusionMatrix(classes: ['Class1', 'Class2'])
>>> print(cm2)
Predict      Class1       Class2       
Actual
Class1       1            2            

Class2       0            5            





Overall Statistics : 

95% CI                                                            (0.44994,1.05006)
ACC Macro                                                         0.75
AUNP                                                              0.66667
AUNU                                                              0.66667
Bennett S                                                         0.5
CBA                                                               0.52381
CSI                                                               0.52381
Chi-Squared                                                       1.90476
Chi-Squared DF                                                    1
Conditional Entropy                                               0.34436
Cramer V                                                          0.48795
Cross Entropy                                                     1.2454
F1 Macro                                                          0.66667
F1 Micro                                                          0.75
Gwet AC1                                                          0.6
Hamming Loss                                                      0.25
Joint Entropy                                                     1.29879
KL Divergence                                                     0.29097
Kappa                                                             0.38462
Kappa 95% CI                                                      (-0.354,1.12323)
Kappa No Prevalence                                               0.5
Kappa Standard Error                                              0.37684
Kappa Unbiased                                                    0.33333
Lambda A                                                          0.33333
Lambda B                                                          0.0
Mutual Information                                                0.1992
NIR                                                               0.625
Overall ACC                                                       0.75
Overall CEN                                                       0.44812
Overall J                                                         (1.04762,0.52381)
Overall MCC                                                       0.48795
Overall MCEN                                                      0.29904
Overall RACC                                                      0.59375
Overall RACCU                                                     0.625
P-Value                                                           0.36974
PPV Macro                                                         0.85714
PPV Micro                                                         0.75
Pearson C                                                         0.43853
Phi-Squared                                                       0.2381
RCI                                                               0.20871
RR                                                                4.0
Reference Entropy                                                 0.95443
Response Entropy                                                  0.54356
SOA1(Landis & Koch)                                               Fair
SOA2(Fleiss)                                                      Poor
SOA3(Altman)                                                      Fair
SOA4(Cicchetti)                                                   Poor
SOA5(Cramer)                                                      Relatively Strong
SOA6(Matthews)                                                    Weak
Scott PI                                                          0.33333
Standard Error                                                    0.15309
TPR Macro                                                         0.66667
TPR Micro                                                         0.75
Zero-one Loss                                                     2

Class Statistics :

Classes                                                           Class1        Class2        
ACC(Accuracy)                                                     0.75          0.75          
AGF(Adjusted F-score)                                             0.53979       0.81325       
AGM(Adjusted geometric mean)                                      0.73991       0.5108        
AM(Difference between automatic and manual classification)        -2            2             
AUC(Area under the ROC curve)                                     0.66667       0.66667       
AUCI(AUC value interpretation)                                    Fair          Fair          
AUPR(Area under the PR curve)                                     0.66667       0.85714       
BCD(Bray-Curtis dissimilarity)                                    0.125         0.125         
BM(Informedness or bookmaker informedness)                        0.33333       0.33333       
CEN(Confusion entropy)                                            0.5           0.43083       
DOR(Diagnostic odds ratio)                                        None          None          
DP(Discriminant power)                                            None          None          
DPI(Discriminant power interpretation)                            None          None          
ERR(Error rate)                                                   0.25          0.25          
F0.5(F0.5 score)                                                  0.71429       0.75758       
F1(F1 score - harmonic mean of precision and sensitivity)         0.5           0.83333       
F2(F2 score)                                                      0.38462       0.92593       
FDR(False discovery rate)                                         0.0           0.28571       
FN(False negative/miss/type 2 error)                              2             0             
FNR(Miss rate or false negative rate)                             0.66667       0.0           
FOR(False omission rate)                                          0.28571       0.0           
FP(False positive/type 1 error/false alarm)                       0             2             
FPR(Fall-out or false positive rate)                              0.0           0.66667       
G(G-measure geometric mean of precision and sensitivity)          0.57735       0.84515       
GI(Gini index)                                                    0.33333       0.33333       
GM(G-mean geometric mean of specificity and sensitivity)          0.57735       0.57735       
IBA(Index of balanced accuracy)                                   0.11111       0.55556       
ICSI(Individual classification success index)                     0.33333       0.71429       
IS(Information score)                                             1.41504       0.19265       
J(Jaccard index)                                                  0.33333       0.71429       
LS(Lift score)                                                    2.66667       1.14286       
MCC(Matthews correlation coefficient)                             0.48795       0.48795       
MCCI(Matthews correlation coefficient interpretation)             Weak          Weak          
MCEN(Modified confusion entropy)                                  0.38998       0.51639       
MK(Markedness)                                                    0.71429       0.71429       
N(Condition negative)                                             5             3             
NLR(Negative likelihood ratio)                                    0.66667       0.0           
NLRI(Negative likelihood ratio interpretation)                    Negligible    Good          
NPV(Negative predictive value)                                    0.71429       1.0           
OC(Overlap coefficient)                                           1.0           1.0           
OOC(Otsuka-Ochiai coefficient)                                    0.57735       0.84515       
OP(Optimized precision)                                           0.25          0.25          
P(Condition positive or support)                                  3             5             
PLR(Positive likelihood ratio)                                    None          1.5           
PLRI(Positive likelihood ratio interpretation)                    None          Poor          
POP(Population)                                                   8             8             
PPV(Precision or positive predictive value)                       1.0           0.71429       
PRE(Prevalence)                                                   0.375         0.625         
Q(Yule Q - coefficient of colligation)                            None          None          
RACC(Random accuracy)                                             0.04688       0.54688       
RACCU(Random accuracy unbiased)                                   0.0625        0.5625        
TN(True negative/correct rejection)                               5             1             
TNR(Specificity or true negative rate)                            1.0           0.33333       
TON(Test outcome negative)                                        7             1             
TOP(Test outcome positive)                                        1             7             
TP(True positive/hit)                                             1             5             
TPR(Sensitivity, recall, hit rate, or true positive rate)         0.33333       1.0           
Y(Youden index)                                                   0.33333       0.33333       
dInd(Distance index)                                              0.66667       0.66667       
sInd(Similarity index)                                            0.5286        0.5286
   
>>> cm2.stat(summary=True)
Overall Statistics : 

ACC Macro                                                         0.75
F1 Macro                                                          0.66667
Kappa                                                             0.38462
Overall ACC                                                       0.75
PPV Macro                                                         0.85714
SOA1(Landis & Koch)                                               Fair
TPR Macro                                                         0.66667
Zero-one Loss                                                     2

Class Statistics :

Classes                                                           Class1        Class2        
ACC(Accuracy)                                                     0.75          0.75          
AUC(Area under the ROC curve)                                     0.66667       0.66667       
AUCI(AUC value interpretation)                                    Fair          Fair          
F1(F1 score - harmonic mean of precision and sensitivity)         0.5           0.83333       
FN(False negative/miss/type 2 error)                              2             0             
FP(False positive/type 1 error/false alarm)                       0             2             
N(Condition negative)                                             5             3             
P(Condition positive or support)                                  3             5             
POP(Population)                                                   8             8             
PPV(Precision or positive predictive value)                       1.0           0.71429       
TN(True negative/correct rejection)                               5             1             
TON(Test outcome negative)                                        7             1             
TOP(Test outcome positive)                                        1             7             
TP(True positive/hit)                                             1             5             
TPR(Sensitivity, recall, hit rate, or true positive rate)         0.33333       1.0 
                               
>>> cm3 = ConfusionMatrix(matrix={"Class1": {"Class1": 1, "Class2":0}, "Class2": {"Class1": 2, "Class2": 5}},transpose=True) # Transpose Matrix      
>>> cm3.print_matrix()
Predict          Class1    Class2    
Actual
Class1           1         2         

Class2           0         5         

  • matrix() and normalized_matrix() were renamed to print_matrix() and print_normalized_matrix() in version 1.5
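To make a few of the reported numbers concrete, the sketch below recomputes Overall ACC, Overall RACC, and Kappa for the `cm2` matrix by hand, using the standard definition of Cohen's kappa. It is plain Python for illustration only, not PyCM's implementation, and the function name `acc_racc_kappa` is hypothetical.

```python
def acc_racc_kappa(matrix):
    """Return (overall accuracy, random accuracy, Cohen's kappa) for a nested-dict matrix."""
    classes = list(matrix)
    pop = sum(sum(row.values()) for row in matrix.values())
    acc = sum(matrix[c][c] for c in classes) / pop
    # Chance agreement: sum over classes of (actual count * predicted count) / POP^2
    racc = sum(
        sum(matrix[c].values()) * sum(matrix[r][c] for r in classes)
        for c in classes
    ) / pop ** 2
    kappa = (acc - racc) / (1 - racc)  # undefined when racc == 1
    return acc, racc, kappa


matrix = {"Class1": {"Class1": 1, "Class2": 2}, "Class2": {"Class1": 0, "Class2": 5}}
acc, racc, kappa = acc_racc_kappa(matrix)
print(acc, racc, round(kappa, 5))
```

For this matrix the values agree with the Overall ACC (0.75), Overall RACC (0.59375), and Kappa (0.38462) rows printed above.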

Activation threshold

The threshold parameter was added in version 0.9 to support real-valued predictions (for example, probabilities).

For more information visit Example3
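As a rough illustration of the idea, a threshold is a function (or lambda) that maps each real-valued prediction to a class label. The sketch below applies such a function by hand in plain Python; the `activation` name, the 0.5 cutoff, and the sample data are all assumptions made for this example. With PyCM, the function would be passed through the threshold argument instead of being applied manually.

```python
def activation(probability, cutoff=0.5):
    """Map a predicted probability to a class label (cutoff of 0.5 is an assumption)."""
    return 1 if probability >= cutoff else 0


# Illustrative data, not taken from the PyCM examples
y_actu = [0, 0, 1, 0]
y_prob = [0.87, 0.34, 0.9, 0.12]

# Apply the threshold to every real-valued prediction
y_pred = [activation(p) for p in y_prob]
print(y_pred)
```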

Load from file

The file parameter was added in version 0.9.5 to load a saved confusion matrix in the .obj format generated by the save_obj method.

For more information visit Example4

Sample weights

The sample_weight parameter was added in version 1.2.

For more information visit Example5
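Sample weights are commonly understood as letting each sample contribute its weight to the matrix instead of a count of 1. The sketch below shows that interpretation in plain Python; it assumes this is the intended meaning and is not taken from PyCM's source. The `weighted_table` helper and the data are hypothetical.

```python
def weighted_table(actual, predicted, sample_weight):
    """Accumulate each sample's weight into table[actual][predicted]."""
    classes = sorted(set(actual) | set(predicted))
    table = {a: {p: 0 for p in classes} for a in classes}
    for a, p, w in zip(actual, predicted, sample_weight):
        table[a][p] += w
    return table


# Illustrative data: the third sample counts three times
y_actu = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
weights = [2, 1, 3, 1]

weighted = weighted_table(y_actu, y_pred, weights)
print(weighted)
```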

Transpose

The transpose parameter was added in version 1.2 to transpose the input matrix (Direct CM mode only).
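Transposing swaps the roles of the outer and inner keys of the input dictionary. The plain-Python sketch below reproduces the cm3 example from the Direct CM section; `transpose_matrix` is a hypothetical helper, not a PyCM function.

```python
def transpose_matrix(matrix):
    """Swap rows and columns of a nested-dict matrix."""
    classes = list(matrix)
    return {a: {p: matrix[p][a] for p in classes} for a in classes}


matrix = {"Class1": {"Class1": 1, "Class2": 0}, "Class2": {"Class1": 2, "Class2": 5}}
transposed = transpose_matrix(matrix)
print(transposed)
```

The result has the same rows as the matrix printed by cm3.print_matrix() above (Class1: 1, 2 and Class2: 0, 5).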

Relabel

The relabel method was added in version 1.5 to change the class names of a ConfusionMatrix.

>>> cm.relabel(mapping={0:"L1",1:"L2",2:"L3"})
>>> cm
pycm.ConfusionMatrix(classes: ['L1', 'L2', 'L3'])
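Conceptually, relabeling renames both the row keys and the column keys of the table while leaving the counts untouched. A minimal plain-Python sketch of that idea, using the table from the From vector section (the `relabel_table` helper is hypothetical and only illustrates the effect):

```python
def relabel_table(table, mapping):
    """Rename actual (outer) and predicted (inner) class keys using mapping."""
    return {
        mapping[a]: {mapping[p]: n for p, n in row.items()}
        for a, row in table.items()
    }


table = {0: {0: 3, 1: 0, 2: 0}, 1: {0: 0, 1: 1, 2: 2}, 2: {0: 2, 1: 1, 2: 3}}
relabeled = relabel_table(table, {0: "L1", 1: "L2", 2: "L3"})
print(list(relabeled))
print(relabeled["L3"])
```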

Online help

The online_help function was added in version 1.1 to open the definition of each statistic in a web browser.


>>> from pycm import online_help
>>> online_help("J")
>>> online_help("SOA1(Landis & Koch)")
>>> online_help(2)

  • The list of items is available by calling online_help() (without arguments)
  • If the PyCM website is not available, set alt_link = True (new in version 2.4)

Parameter recommender

This option was added in version 1.9 to recommend the most relevant parameters given the characteristics of the input dataset. Suggestions are based on two characteristics: balanced versus imbalanced, and binary versus multi-class. All suggestions fall into three main groups: imbalanced dataset, binary classification for a balanced dataset, and multi-class classification for a balanced dataset. The recommendation lists were compiled from the paper that introduced each parameter and the capabilities claimed in that paper.

>>> cm.imbalance
False
>>> cm.binary
False
>>> cm.recommended_list
['MCC', 'TPR Micro', 'ACC', 'PPV Macro', 'BCD', 'Overall MCC', 'Hamming Loss', 'TPR Macro', 'Zero-one Loss', 'ERR', 'PPV Micro', 'Overall ACC']
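The grouping described above can be sketched as a small decision function. The text does not say how imbalance is detected, so the ratio test below (largest class more than three times the smallest) is purely an assumption made for this example, as is the `dataset_group` name.

```python
def dataset_group(support, imbalance_ratio=3):
    """Pick one of the three recommendation groups.

    support: dict mapping class name -> number of actual samples (the P row).
    imbalance_ratio: hypothetical cutoff, not taken from the PyCM documentation.
    """
    counts = list(support.values())
    imbalanced = max(counts) > imbalance_ratio * min(counts)
    binary = len(support) == 2
    if imbalanced:
        return "imbalanced dataset"
    if binary:
        return "binary classification for a balanced dataset"
    return "multi-class classification for a balanced dataset"


# Support (P) of the three classes in the From vector example
group = dataset_group({0: 3, 1: 3, 2: 6})
print(group)
```

Under that assumption, the example dataset lands in the balanced multi-class group, in line with cm.imbalance and cm.binary both being False.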

Compare

Version 2.0 introduced a method for comparing several confusion matrices. This option combines several overall and class-based benchmarks. Each benchmark rates the performance of the classification algorithm from good to poor and assigns it a numeric score: 1 for good performance and 0 for poor performance.

Two scores are then calculated for each confusion matrix: overall and class-based. The overall score is the average of the scores of six overall benchmarks: Landis & Koch, Fleiss, Altman, Cicchetti, Cramer, and Matthews. In the same manner, the class-based score is the average of the scores of five class-based benchmarks: Positive Likelihood Ratio Interpretation, Negative Likelihood Ratio Interpretation, Discriminant Power Interpretation, AUC value Interpretation, and Matthews Correlation Coefficient Interpretation. Note that if one of the benchmarks returns None for one of the classes, that benchmark is excluded from the averaging. If the user sets weights for the classes, the average over the class-based benchmark scores becomes a weighted average.

If the user sets the by_class boolean input to True, the best confusion matrix is the one with the maximum class-based score. Otherwise, a confusion matrix is reported as the best only if it obtains the maximum of both the overall and the class-based scores; in any other case, the Compare object does not select a best confusion matrix.

>>> cm2 = ConfusionMatrix(matrix={0:{0:2,1:50,2:6},1:{0:5,1:50,2:3},2:{0:1,1:7,2:50}})
>>> cm3 = ConfusionMatrix(matrix={0:{0:50,1:2,2:6},1:{0:50,1:5,2:3},2:{0:1,1:55,2:2}})
>>> cp = Compare({"cm2":cm2,"cm3":cm3})
>>> print(cp)
Best : cm2

Rank  Name   Class-Score         Overall-Score
1     cm2    4.15                1.48333
2     cm3    2.75                0.95

>>> cp.best
pycm.ConfusionMatrix(classes: [0, 1, 2])
>>> cp.sorted
['cm2', 'cm3']
>>> cp.best_name
'cm2'
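The selection rule described above can be sketched in a few lines. This is an illustration of the rule as stated in the text, not PyCM's implementation; `select_best` is a hypothetical helper that takes already-computed scores and does not handle ties.

```python
def select_best(scores, by_class=False):
    """scores maps a name to (class_score, overall_score); returns the best name or None."""
    best_class = max(scores, key=lambda name: scores[name][0])
    if by_class:
        # Only the class-based score matters
        return best_class
    best_overall = max(scores, key=lambda name: scores[name][1])
    # A winner exists only if the same matrix leads on both scores
    return best_class if best_class == best_overall else None


# Scores taken from the printed comparison above
scores = {"cm2": (4.15, 1.48333), "cm3": (2.75, 0.95)}
best = select_best(scores)
print(best)

# Made-up scores where no matrix leads on both criteria
mixed = {"a": (4.0, 1.0), "b": (3.0, 2.0)}
print(select_best(mixed), select_best(mixed, by_class=True))
```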

Acceptable data types

ConfusionMatrix

  1. actual_vector : python list or numpy array of any stringable objects
  2. predict_vector : python list or numpy array of any stringable objects
  3. matrix : dict
  4. digit: int
  5. threshold : FunctionType (function or lambda)
  6. file : File object
  7. sample_weight : python list or numpy array of numbers
  8. transpose : bool
  • Run help(ConfusionMatrix) for ConfusionMatrix object details

Compare

  1. cm_dict : python dict of ConfusionMatrix object (str : ConfusionMatrix)
  2. by_class : bool
  3. weight : python dict of class weights (class_name : float)
  4. digit: int
  • Run help(Compare) for Compare object details

For more information visit here

Try PyCM in your browser!

PyCM can be used online in interactive Jupyter Notebooks via the Binder service. Try it out now:

Binder

  • Check Examples in Document folder

Issues & bug reports

Just file an issue and describe the problem. We'll check it ASAP!
Alternatively, send an email to info@pycm.ir.

  • Please complete the issue template

Outputs

  1. HTML
  2. CSV
  3. PyCM
  4. OBJ
  5. COMP

Dependencies

References

Cite

If you use PyCM in your research, we would appreciate citations to the following paper :

Download PyCM.bib

License

FOSSA Status

If you like our project, and we hope that you do, please consider supporting us. Our project is not, and never will be, run for profit. We need the money just so we can continue doing what we do ;-)

Key metrics

Overview
Name and owner: sepandhaghighi/pycm
Primary language: Python
Languages: Python (6 languages)
License: MIT License

Owner activity
Created: 2018-01-22 19:46:54
Last push: 2025-04-22 16:34:14
Last commit: 2025-04-04 15:56:14
Releases: 46
Latest release: v4.3
First release: v0.1

User engagement
Stars: 1.5k
Watchers: 35
Forks: 126
Commits: 3.1k
Issues: 208
Open issues: 16
Pull requests: 355
Open pull requests: 1
Closed pull requests: 39