bayesian

Naive Bayesian Classification for Golang.

  • Owner: jbrukh/bayesian
  • Platform:
  • License:: Other
  • Category::
  • Topic:
  • Like:
    0
      Compare:

Github stars Tracking Chart

Naive Bayesian Classification

Perform naive Bayesian classification into an arbitrary number of classes on sets of strings. bayesian also supports term frequency-inverse document frequency calculations (TF-IDF).

Copyright (c) 2011-2017. Jake Brukhman. (jbrukh@gmail.com).
All rights reserved. See the LICENSE file for BSD-style license.


Background

This is meant to be an low-entry barrier Go library for basic Bayesian classification. See code comments for a refresher on naive Bayesian classifiers, and please take some time to understand underflow edge cases as this otherwise may result in innacurate classifications.


Installation

Using the go command:

go get github.com/jbrukh/bayesian
go install !$

Documentation

See the GoPkgDoc documentation here.


Features

  • Conditional probability and "log-likelihood"-like scoring.
  • Underflow detection.
  • Simple persistence of classifiers.
  • Statistics.
  • TF-IDF support.

Example 1 (Simple Classification)

To use the classifier, first you must create some classes
and train it:

import . "bayesian"

const (
    Good Class = "Good"
    Bad Class = "Bad"
)

classifier := NewClassifier(Good, Bad)
goodStuff := []string{"tall", "rich", "handsome"}
badStuff  := []string{"poor", "smelly", "ugly"}
classifier.Learn(goodStuff, Good)
classifier.Learn(badStuff,  Bad)

Then you can ascertain the scores of each class and
the most likely class your data belongs to:

scores, likely, _ := classifier.LogScores(
                        []string{"tall", "girl"}
                     )

Magnitude of the score indicates likelihood. Alternatively (but
with some risk of float underflow), you can obtain actual probabilities:

probs, likely, _ := classifier.ProbScores(
                        []string{"tall", "girl"}
                     )

Example 2 (TF-IDF Support)

To use the TF-IDF classifier, first you must create some classes
and train it and you need to call ConvertTermsFreqToTfIdf() AFTER training
and before calling classification methods such as LogScores, SafeProbScores, and ProbScores)

import . "bayesian"

const (
    Good Class = "Good"
    Bad Class = "Bad"
)

// Create a classifier with TF-IDF support.
classifier := NewClassifierTfIdf(Good, Bad)

goodStuff := []string{"tall", "rich", "handsome"}
badStuff  := []string{"poor", "smelly", "ugly"}

classifier.Learn(goodStuff, Good)
classifier.Learn(badStuff,  Bad)

// Required
classifier.ConvertTermsFreqToTfIdf()

Then you can ascertain the scores of each class and
the most likely class your data belongs to:

scores, likely, _ := classifier.LogScores(
                        []string{"tall", "girl"}
                     )

Magnitude of the score indicates likelihood. Alternatively (but
with some risk of float underflow), you can obtain actual probabilities:

probs, likely, _ := classifier.ProbScores(
                        []string{"tall", "girl"}
                     )

Use wisely.

Main metrics

Overview
Name With Ownerjbrukh/bayesian
Primary LanguageGo
Program languageGo (Language Count: 1)
Platform
License:Other
所有者活动
Created At2011-11-23 04:17:00
Pushed At2023-11-17 14:32:45
Last Commit At2023-11-17 09:32:45
Release Count2
Last Release Name1.0 (Posted on )
First Release Name0.9 (Posted on )
用户参与
Stargazers Count808
Watchers Count33
Fork Count127
Commits Count96
Has Issues Enabled
Issues Count11
Issue Open Count7
Pull Requests Count16
Pull Requests Open Count1
Pull Requests Close Count4
项目设置
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private