ir.classifiers
Class NaiveBayes

java.lang.Object
  extended by ir.classifiers.Classifier
      extended by ir.classifiers.NaiveBayes

public class NaiveBayes
extends Classifier

Implements a naive Bayes classifier with optional Laplace smoothing. Probabilities are stored internally as logarithms to prevent floating-point underflow.
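
To illustrate why log storage helps, here is a minimal standalone sketch. It is not taken from this class: `LogSpaceSketch`, `product`, and `logProduct` are hypothetical names. It compares multiplying many small probabilities directly against summing their logarithms.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of ir.classifiers.
public class LogSpaceSketch {
    /** Multiplies the probabilities directly; long products underflow to 0.0. */
    public static double product(double[] probs) {
        double p = 1.0;
        for (double x : probs) {
            p *= x;
        }
        return p;
    }

    /** Sums the natural logs instead, which stays finite for the same input. */
    public static double logProduct(double[] probs) {
        double sum = 0.0;
        for (double x : probs) {
            sum += Math.log(x);
        }
        return sum;
    }

    public static void main(String[] args) {
        // 400 features, each with probability 1e-3: the true product is 1e-1200,
        // far below the smallest positive double.
        double[] probs = new double[400];
        Arrays.fill(probs, 1e-3);
        System.out.println(product(probs));    // underflows to 0.0
        System.out.println(logProduct(probs)); // finite, roughly -2763.1
    }
}
```

Since the logarithm is monotonic, comparing summed logs picks the same winning category as comparing the raw products would.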


Field Summary
static java.lang.String name
          Name of classifier
 
Fields inherited from class ir.classifiers.Classifier
categories, random
 
Constructor Summary
NaiveBayes(java.lang.String[] categories, boolean debug)
          Creates a naive Bayes classifier for the given categories, with optional debug output
 
Method Summary
protected  double[] calculatePriors(java.util.List<Example> trainExamples)
          Calculates the class priors
protected  double[] calculateProbs(Example testExample)
          Calculates the probability of the test example being generated by each category
protected  java.util.Hashtable<java.lang.String,double[]> conditionalProbs(java.util.List<Example> trainExamples)
          Calculates the conditional probability of each feature given each category
protected  void displayProbs(double[] classPriors, java.util.Hashtable<java.lang.String,double[]> featureHash)
          Displays the class priors and the probability of each feature in each category
 double getEpsilon()
          Returns value of EPSILON
 boolean getIsLaplace()
          Returns value of isLaplace
 java.lang.String getName()
          Returns the name
 BayesResult getTrainResult()
          Returns training result
 void setDebug(boolean bool)
          Sets the debug flag
 void setEpsilon(double ep)
          Sets the value of EPSILON (default 1e-6)
 void setLaplace(boolean bool)
          Sets the Laplace smoothing flag
 boolean test(Example testExample)
          Categorizes the test example using the trained naive Bayes classifier, returning true if the predicted category is the same as the actual category
 void train(java.util.List<Example> trainExamples)
          Trains the naive Bayes classifier: estimates the class prior probabilities and computes the per-category counts for each feature
 
Methods inherited from class ir.classifiers.Classifier
argMax, getCategories
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

name

public static final java.lang.String name
Name of classifier

See Also:
Constant Field Values

Constructor Detail

NaiveBayes

public NaiveBayes(java.lang.String[] categories,
                  boolean debug)
Creates a naive Bayes classifier for the given categories, with optional debug output

Parameters:
categories - The array of Strings containing the category names
debug - Flag to turn on detailed output

Method Detail

setDebug

public void setDebug(boolean bool)
Sets the debug flag


setLaplace

public void setLaplace(boolean bool)
Sets the Laplace smoothing flag


setEpsilon

public void setEpsilon(double ep)
Sets the value of EPSILON (default 1e-6)


getName

public java.lang.String getName()
Returns the name

Specified by:
getName in class Classifier
Returns:
the name of a particular classifier

getEpsilon

public double getEpsilon()
Returns value of EPSILON


getTrainResult

public BayesResult getTrainResult()
Returns training result


getIsLaplace

public boolean getIsLaplace()
Returns value of isLaplace


train

public void train(java.util.List<Example> trainExamples)
Trains the naive Bayes classifier: estimates the class prior probabilities and computes the per-category counts for each feature

Specified by:
train in class Classifier
Parameters:
trainExamples - The list of training examples

test

public boolean test(Example testExample)
Categorizes the test example using the trained naive Bayes classifier, returning true if the predicted category is the same as the actual category

Specified by:
test in class Classifier
Parameters:
testExample - The test example to be categorized

calculatePriors

protected double[] calculatePriors(java.util.List<Example> trainExamples)
Calculates the class priors

Parameters:
trainExamples - The training examples from which class priors will be estimated
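
As a rough sketch of what estimating class priors can look like (not the actual implementation), the standalone example below assumes that labels are plain category indices rather than `Example` objects and that priors are unsmoothed relative frequencies stored as logs. `PriorSketch` and `logPriors` are hypothetical names.

```java
// Hypothetical illustration; not part of ir.classifiers.
public class PriorSketch {
    /**
     * Returns log P(c) for each category index c, estimated as the fraction
     * of training labels equal to c. A category with no training examples
     * gets Double.NEGATIVE_INFINITY, because Math.log(0.0) is -Infinity.
     */
    public static double[] logPriors(int[] labels, int numCategories) {
        double[] counts = new double[numCategories];
        for (int label : labels) {
            counts[label]++;
        }
        double[] result = new double[numCategories];
        for (int c = 0; c < numCategories; c++) {
            result[c] = Math.log(counts[c] / labels.length);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three examples in category 0 and one in category 1.
        double[] lp = logPriors(new int[] {0, 0, 1, 0}, 2);
        System.out.println(Math.exp(lp[0]) + " " + Math.exp(lp[1]));
    }
}
```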

conditionalProbs

protected java.util.Hashtable<java.lang.String,double[]> conditionalProbs(java.util.List<Example> trainExamples)
Calculates the conditional probability of each feature given each category

Parameters:
trainExamples - The training examples from which counts will be estimated
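
One common way to build such a feature table is add-one (Laplace) smoothing over token counts. The sketch below assumes a multinomial event model, documents given as token arrays, and labels given as category indices. None of that is confirmed by this page, and the role of EPSILON is not modeled. `ConditionalSketch` and `logConditionals` are hypothetical names.

```java
import java.util.Hashtable;

// Hypothetical illustration; not part of ir.classifiers.
public class ConditionalSketch {
    /**
     * Maps each token to an array of log P(token | c), one entry per category,
     * using (count(token, c) + 1) / (totalTokens(c) + vocabularySize).
     */
    public static Hashtable<String, double[]> logConditionals(
            String[][] docs, int[] labels, int numCategories) {
        Hashtable<String, double[]> counts = new Hashtable<>();
        double[] totals = new double[numCategories];
        for (int d = 0; d < docs.length; d++) {
            for (String token : docs[d]) {
                // Create the per-category count row on first sight of a token.
                counts.computeIfAbsent(token, k -> new double[numCategories])[labels[d]]++;
                totals[labels[d]]++;
            }
        }
        int vocabSize = counts.size();
        Hashtable<String, double[]> logProbs = new Hashtable<>();
        for (String token : counts.keySet()) {
            double[] row = new double[numCategories];
            for (int c = 0; c < numCategories; c++) {
                // Add-one smoothing keeps unseen (token, category) pairs above zero.
                row[c] = Math.log((counts.get(token)[c] + 1.0) / (totals[c] + vocabSize));
            }
            logProbs.put(token, row);
        }
        return logProbs;
    }

    public static void main(String[] args) {
        String[][] docs = { {"a", "b"}, {"a"} };
        int[] labels = {0, 1};
        Hashtable<String, double[]> table = logConditionals(docs, labels, 2);
        // "b" never occurs in category 1, yet its smoothed probability is nonzero.
        System.out.println(Math.exp(table.get("b")[1]));
    }
}
```

Without the added count, a feature never seen in a category would have probability zero (log of negative infinity) and would veto that category outright.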

calculateProbs

protected double[] calculateProbs(Example testExample)
Calculates the probability of the test example being generated by each category

Parameters:
testExample - The test example to be categorized
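
A minimal sketch of per-category scoring in log space, under the assumption that the score is the log prior plus the summed log conditionals of the example's tokens, and that tokens missing from the table are skipped. `ScoreSketch`, `logScores`, and the local `argMax` are hypothetical stand-ins. The inherited `argMax` from `Classifier` may behave differently.

```java
import java.util.Hashtable;

// Hypothetical illustration; not part of ir.classifiers.
public class ScoreSketch {
    /** Returns log P(c) + sum over tokens of log P(token | c) for each category c. */
    public static double[] logScores(String[] tokens, double[] logPriors,
                                     Hashtable<String, double[]> logConds) {
        double[] scores = logPriors.clone();
        for (String token : tokens) {
            double[] row = logConds.get(token);
            if (row == null) {
                continue; // token not seen in training: skipped in this sketch
            }
            for (int c = 0; c < scores.length; c++) {
                scores[c] += row[c];
            }
        }
        return scores;
    }

    /** Index of the largest value; ties go to the lowest index. */
    public static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] logPriors = { Math.log(0.5), Math.log(0.5) };
        Hashtable<String, double[]> logConds = new Hashtable<>();
        logConds.put("a", new double[] { Math.log(0.9), Math.log(0.1) });
        logConds.put("b", new double[] { Math.log(0.2), Math.log(0.8) });
        double[] scores = logScores(new String[] {"a", "a", "b"}, logPriors, logConds);
        System.out.println(argMax(scores)); // category 0 wins: 0.081 vs 0.004
    }
}
```

The scores are unnormalized log joint probabilities. That is enough for picking the best category, though turning them into posteriors would require an additional normalization step.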

displayProbs

protected void displayProbs(double[] classPriors,
                            java.util.Hashtable<java.lang.String,double[]> featureHash)
Displays the class priors and the probability of each feature in each category

Parameters:
classPriors - The class prior probabilities
featureHash - Feature hashtable after training