ir.vsr
Class TextStringDocument

java.lang.Object
  extended by ir.vsr.Document
      extended by ir.vsr.TextStringDocument

public class TextStringDocument
extends Document

A simple document represented by a String.


Field Summary
protected  java.util.StringTokenizer tokenizer
          The tokenizer for this document.
static java.lang.String tokenizerDelim
          StringTokenizer delimiter string, chosen so that only alphabetic strings are returned as tokens.
 
Fields inherited from class ir.vsr.Document
nextToken, numStopWords, numTokens, stem, stemmer, stopWords, stopWordsFile
 
Constructor Summary
TextStringDocument(java.lang.String string, boolean stem)
          Creates a simple Document for the given string.
 
Method Summary
protected  java.lang.String getNextCandidateToken()
          Gets the next candidate token from this document's string.
static void main(java.lang.String[] args)
          For testing: prints the bag-of-words vector for the given string.
 
Methods inherited from class ir.vsr.Document
allLetters, hashMapVector, hasMoreTokens, loadStopWords, nextToken, numberOfTokens, prepareNextToken, printVector
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

tokenizerDelim

public static final java.lang.String tokenizerDelim
StringTokenizer delimiter string, chosen so that only alphabetic strings are returned as tokens.

See Also:
Constant Field Values
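
The following is a minimal sketch of how a delimiter string can make java.util.StringTokenizer return only alphabetic tokens. The class name AlphabeticDelimSketch and the DELIM value are hypothetical, chosen for illustration; the actual value of tokenizerDelim is listed under Constant Field Values and may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class AlphabeticDelimSketch {
    // Hypothetical delimiter set: whitespace, digits, and common ASCII punctuation.
    // This is an assumption, not the real value of tokenizerDelim.
    static final String DELIM = " \t\n\r\f0123456789.,;:!?'\"()[]{}<>-_/\\@#$%^&*+=|~";

    // Splits the text on every character in DELIM and collects what remains.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, DELIM);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [Hello, world, times, at, least]
        System.out.println(tokens("Hello, world! 42 times (at least)."));
    }
}
```

Because StringTokenizer treats each character of the delimiter string as a separator, any character left out of the set (for example, an accented letter or an unlisted symbol) stays inside a token.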

tokenizer

protected java.util.StringTokenizer tokenizer
The tokenizer for this document.

Constructor Detail

TextStringDocument

public TextStringDocument(java.lang.String string,
                          boolean stem)
Creates a simple Document for the given string.

Method Detail

getNextCandidateToken

protected java.lang.String getNextCandidateToken()
Gets the next candidate token from this document's string.

Specified by:
getNextCandidateToken in class Document
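
The relationship between a base class that declares getNextCandidateToken and a String-backed subclass that implements it might look roughly like the sketch below. SketchDocument and SketchStringDocument are hypothetical stand-ins, not the ir.vsr classes. The sketch assumes that null signals exhaustion and leaves out the stop-word and stemming steps suggested by the inherited fields.

```java
import java.util.StringTokenizer;

// Hypothetical stand-in for a base document class; not the library's code.
abstract class SketchDocument {
    private String next; // token prefetched for the next call to nextToken()

    // Subclasses supply raw candidate tokens; null means none are left (assumed).
    protected abstract String getNextCandidateToken();

    // Prefetches one token so hasMoreTokens() can answer without consuming it.
    protected void prepareNextToken() {
        next = getNextCandidateToken();
    }

    public boolean hasMoreTokens() {
        return next != null;
    }

    public String nextToken() {
        String current = next;
        prepareNextToken();
        return current;
    }
}

// Hypothetical String-backed subclass that delegates to a StringTokenizer.
class SketchStringDocument extends SketchDocument {
    protected final StringTokenizer tokenizer;

    SketchStringDocument(String text) {
        tokenizer = new StringTokenizer(text); // default whitespace delimiters
        prepareNextToken();                    // prime the first token
    }

    @Override
    protected String getNextCandidateToken() {
        return tokenizer.hasMoreTokens() ? tokenizer.nextToken() : null;
    }

    // Helper for demonstration: joins all tokens with '|'.
    static String joinTokens(String text) {
        SketchStringDocument doc = new SketchStringDocument(text);
        StringBuilder sb = new StringBuilder();
        while (doc.hasMoreTokens()) {
            if (sb.length() > 0) sb.append('|');
            sb.append(doc.nextToken());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints a|simple|document
        System.out.println(joinTokens("a simple document"));
    }
}
```

Keeping the token source behind one protected method lets the base class own any shared filtering logic while each subclass only decides where raw tokens come from.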

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
For testing: prints the bag-of-words vector for the given string.

Throws:
java.io.IOException
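
As an illustration of what a bag-of-words vector for a string contains, here is a minimal sketch that counts token occurrences using only the standard library. BagOfWordsSketch is a hypothetical class. It uses a TreeMap rather than the library's vector type, lowercases tokens as an assumption, and skips stop-word removal and stemming.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class BagOfWordsSketch {
    // Maps each lowercased token to the number of times it occurs in the text.
    static Map<String, Integer> bagOfWords(String text) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys for stable output
        StringTokenizer st = new StringTokenizer(text); // whitespace delimiters
        while (st.hasMoreTokens()) {
            counts.merge(st.nextToken().toLowerCase(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = args.length > 0 ? String.join(" ", args) : "the cat and the hat";
        // With no arguments, prints {and=1, cat=1, hat=1, the=2}
        System.out.println(bagOfWords(text));
    }
}
```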