A B C D E F G H I L M N O P Q R S T U V W

A

add(HashMapVector) - Method in class ir.vsr.HashMapVector
Destructively add the given vector to the current vector
add(String) - Method in class ir.webutils.RobotExclusionSet
 
addBad(DocumentReference) - Method in class ir.vsr.Feedback
Add a document to the list of those deemed irrelevant
addEdge(String, String) - Method in class ir.webutils.Graph
Adds an edge from xName to yName.
addEdge(Node) - Method in class ir.webutils.Node
Adds an outgoing edge
addEndSlash(URL) - Static method in class ir.webutils.HTMLPage
If URL looks like a directory rather than a file, then add a "/" at the end so that it acts as a proper base URL for completing URLs in this page
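The reason for the trailing slash can be sketched as follows. This is a hypothetical stand-in that works on plain strings, not the actual ir.webutils.HTMLPage code; the class name BaseUrlSketch and the "no dot in the last path segment" heuristic are assumptions made for illustration only.

```java
import java.net.URI;

// Hypothetical illustration; not the ir.webutils.HTMLPage implementation.
final class BaseUrlSketch {

    // Assumed heuristic: a URL "looks like a directory" when it has no query
    // or fragment and its last path segment contains no '.' (no file extension).
    static String addEndSlash(String url) {
        if (url.endsWith("/") || url.contains("?") || url.contains("#")) {
            return url;
        }
        int schemeEnd = url.indexOf("://");
        int pathStart = url.indexOf('/', schemeEnd < 0 ? 0 : schemeEnd + 3);
        if (pathStart < 0) {
            return url + "/"; // host only, e.g. http://example.com
        }
        String lastSegment = url.substring(url.lastIndexOf('/') + 1);
        return lastSegment.contains(".") ? url : url + "/";
    }

    public static void main(String[] args) {
        String base = addEndSlash("http://example.com/docs");
        // With the slash, a relative link resolves inside "docs/" instead of
        // replacing the "docs" segment.
        System.out.println(URI.create(base).resolve("page.html"));
        // prints http://example.com/docs/page.html
    }
}
```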
addGood(DocumentReference) - Method in class ir.vsr.Feedback
Add a document to the list of those deemed relevant
addLink(MutableAttributeSet, HTML.Attribute) - Method in class ir.webutils.AnchoredLinkExtractor
Retrieves a link from an attribute set and completes it against the base URL.
addLink(MutableAttributeSet, HTML.Attribute) - Method in class ir.webutils.LinkExtractor
Retrieves a link from an attribute set and completes it against the base URL.
addLink(MutableAttributeSet, HTML.Attribute) - Method in class ir.webutils.ScoredAnchoredLinkExtractor
Retrieves a link from an attribute set and completes it against the base URL.
addNode(String) - Method in class ir.webutils.Graph
Adds a node if it is not already present.
addResult(int, double) - Method in class ir.classifiers.PointResults
Set the nth result
addScaled(HashMapVector, double) - Method in class ir.vsr.HashMapVector
Destructively add a scaled version of the given vector to the current vector
addVectors(double[], double[]) - Static method in class ir.utilities.MoreMath
Add two vectors and return the vector sum
allLetters(String) - Method in class ir.vsr.Document
Check if this token consists of all Unicode letters to eliminate other bizarre tokens
ALPHA - Static variable in class ir.vsr.Feedback
A Rocchio/Ide algorithm parameter
AnchoredLink - Class in ir.webutils
Link with included anchor text
AnchoredLink(URL, String) - Constructor for class ir.webutils.AnchoredLink
Constructs a link with specified URL and anchor text
AnchoredLink(URL) - Constructor for class ir.webutils.AnchoredLink
Constructs a link with specified URL
AnchoredLink(String) - Constructor for class ir.webutils.AnchoredLink
Construct a link with specified URL string
AnchoredLinkExtractor - Class in ir.webutils
Extractor for AnchoredLink's.
AnchoredLinkExtractor(HTMLPage) - Constructor for class ir.webutils.AnchoredLinkExtractor
Create an anchored link extractor for the given page
anchorText - Variable in class ir.webutils.AnchoredLinkExtractor
Buffer to store anchor text encountered between an "a" start tag and end tag.
appendTag(StringBuffer, HTML.Tag, MutableAttributeSet) - Static method in class ir.webutils.AnchoredLinkExtractor
Write this tag with attributes out to the buffer
argMax(double[]) - Method in class ir.classifiers.Classifier
Returns the array index with the maximum value
averageVectors(ArrayList<double[]>) - Static method in class ir.utilities.MoreMath
Average all of the vectors in a list of vectors and return the average vector.
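A minimal sketch of the two vector helpers described above, assuming dense double[] vectors of equal length. The class VectorMathSketch is a hypothetical stand-in and does not reproduce the ir.utilities.MoreMath source; the error handling in particular is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the ir.utilities.MoreMath implementation.
final class VectorMathSketch {

    // Element-wise sum of two equal-length vectors; the inputs are not modified.
    static double[] addVectors(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double[] sum = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            sum[i] = x[i] + y[i];
        }
        return sum;
    }

    // Element-wise mean of a non-empty list of equal-length vectors.
    static double[] averageVectors(List<double[]> vectors) {
        if (vectors.isEmpty()) {
            throw new IllegalArgumentException("Need at least one vector");
        }
        double[] total = new double[vectors.get(0).length];
        for (double[] v : vectors) {
            total = addVectors(total, v);
        }
        for (int i = 0; i < total.length; i++) {
            total[i] /= vectors.size();
        }
        return total;
    }

    public static void main(String[] args) {
        List<double[]> vectors = new ArrayList<>();
        vectors.add(new double[] {1.0, 2.0});
        vectors.add(new double[] {3.0, 6.0});
        System.out.println(Arrays.toString(averageVectors(vectors)));
        // prints [2.0, 4.0]
    }
}
```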

B

badDocRefs - Variable in class ir.vsr.Feedback
The list of DocumentReference's that were rated irrelevant
BayesResult - Class in ir.classifiers
An object to hold the result of training a NaiveBayes classifier.
BayesResult() - Constructor for class ir.classifiers.BayesResult
 
BeamSearchSiteSpider - Class in ir.webutils
A BeamSearchSpider that limits itself to a given site (web host).
BeamSearchSiteSpider() - Constructor for class ir.webutils.BeamSearchSiteSpider
 
BeamSearchSpider - Class in ir.webutils
A spider that uses heuristic beam search to find a web page that contains a set of "want strings" using a set of "help strings" to guide the search.
BeamSearchSpider() - Constructor for class ir.webutils.BeamSearchSpider
 
beamSize - Variable in class ir.webutils.BeamSearchSpider
The beam width to use.
BETA - Static variable in class ir.vsr.Feedback
A Rocchio/Ide algorithm parameter
binExamples() - Method in class ir.classifiers.CVLearningCurve
Set the fold Bins from the total Examples -- this effectively stores the training-test split
Browser - Class in ir.utilities
Utilities for displaying a URL or local file in the current Browser window using "browser -remote".
Browser() - Constructor for class ir.utilities.Browser
 
BROWSER_NAME - Static variable in class ir.utilities.Browser
 

C

calculatePriors(List<Example>) - Method in class ir.classifiers.NaiveBayes
Calculates the class priors
calculateProbs(Example) - Method in class ir.classifiers.NaiveBayes
Calculates the probability of the testExample being generated by each category
categories - Variable in class ir.classifiers.Classifier
Array of categories (classes) in the data
categories - Variable in class ir.classifiers.DirectoryExamplesConstructor
Array of categories (classes) in the data
categories - Variable in class ir.classifiers.PerceptronUnit
Array of categories (classes) in the data
category - Variable in class ir.classifiers.Example
Category index of the example
Classifier - Class in ir.classifiers
Abstract class specifying the functionality of a classifier.
Classifier() - Constructor for class ir.classifiers.Classifier
 
classifier - Variable in class ir.classifiers.CVLearningCurve
The classifier for which the K-fold CV learning curve is to be generated
classify(Example) - Method in class ir.classifiers.PerceptronUnit
Classify an example using the perceptron and return the amount by which the net input exceeds the threshold as a measure of confidence of the prediction of being positive for this perceptron
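The confidence value described above can be sketched as the weighted sum of the feature values minus the threshold. The class below is a hypothetical stand-in that uses a plain Map of feature weights; it is not the ir.classifiers.PerceptronUnit source, and the field names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the ir.classifiers.PerceptronUnit implementation.
final class PerceptronUnitSketch {
    final Map<String, Double> weights = new HashMap<>();
    double threshold = 0.0;

    // Returns (net input - threshold). A positive value predicts the positive
    // class, and a larger margin is read as higher confidence.
    double classify(Map<String, Double> features) {
        double netInput = 0.0;
        for (Map.Entry<String, Double> feature : features.entrySet()) {
            // Features without a learned weight contribute nothing.
            netInput += weights.getOrDefault(feature.getKey(), 0.0) * feature.getValue();
        }
        return netInput - threshold;
    }

    public static void main(String[] args) {
        PerceptronUnitSketch unit = new PerceptronUnitSketch();
        unit.weights.put("spam", 2.0);
        unit.weights.put("ham", -1.0);
        unit.threshold = 0.5;

        Map<String, Double> example = new HashMap<>();
        example.put("spam", 1.0);
        example.put("ham", 1.0);
        System.out.println(unit.classify(example)); // prints 0.5
    }
}
```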
classPriors - Variable in class ir.classifiers.BayesResult
Stores the prior probabilities of each class
cleanURL(URL) - Static method in class ir.webutils.Link
Standardize URL by removing trailing slashes, URL decoding it, replacing the UTCS-specific "/users/user" with "/~user", and removing a set of common index pages.
clear() - Method in class ir.classifiers.PerceptronUnit
Clear the weights and threshold all back to zero
clear() - Method in class ir.vsr.HashMapVector
Clears the vector back to all zeros
clear() - Method in class ir.vsr.InvertedIndex
Clear all documents from the inverted index
compareTo(Object) - Method in class ir.vsr.Retrieval
Compares this Retrieval to another for sorting from best to worst.
compareTo(Link) - Method in class ir.webutils.Link
Compares this Link to another for sorting from best to worst.
computeIDFandDocumentLengths() - Method in class ir.vsr.InvertedIndex
Compute the IDF factor for every token in the index and the length of the document vector for every document referenced in the index.
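As a rough sketch of what such a pass involves, the code below assumes the common formulation idf = log2(N / df) and a document length equal to the Euclidean norm of the tf*idf weights. Both formulas, the nested-Map postings layout, and the class name IdfSketch are assumptions for illustration; the index entry does not state the exact weighting used by ir.vsr.InvertedIndex.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the ir.vsr.InvertedIndex implementation.
final class IdfSketch {

    // postings: token -> (document id -> occurrence count).
    // Assumed formula: idf(token) = log2(numDocs / documentFrequency).
    static Map<String, Double> computeIdf(Map<String, Map<String, Integer>> postings, int numDocs) {
        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> entry : postings.entrySet()) {
            int documentFrequency = entry.getValue().size();
            idf.put(entry.getKey(), Math.log((double) numDocs / documentFrequency) / Math.log(2.0));
        }
        return idf;
    }

    // Assumed formula: length(doc) = sqrt(sum over tokens of (count * idf)^2).
    static Map<String, Double> documentLengths(Map<String, Map<String, Integer>> postings,
                                               Map<String, Double> idf) {
        Map<String, Double> sumOfSquares = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> entry : postings.entrySet()) {
            double tokenIdf = idf.get(entry.getKey());
            for (Map.Entry<String, Integer> occurrence : entry.getValue().entrySet()) {
                double weight = occurrence.getValue() * tokenIdf;
                sumOfSquares.merge(occurrence.getKey(), weight * weight, Double::sum);
            }
        }
        Map<String, Double> lengths = new HashMap<>();
        for (Map.Entry<String, Double> entry : sumOfSquares.entrySet()) {
            lengths.put(entry.getKey(), Math.sqrt(entry.getValue()));
        }
        return lengths;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> postings = new HashMap<>();
        Map<String, Integer> rare = new HashMap<>();
        rare.put("d1", 2);
        postings.put("rare", rare);
        Map<String, Double> idf = computeIdf(postings, 4);
        System.out.println(idf);
        System.out.println(documentLengths(postings, idf));
    }
}
```

A token that occurs in every document gets an IDF of zero under this formula, so it contributes nothing to any document length.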
conditionalProbs(List<Example>) - Method in class ir.classifiers.NaiveBayes
Calculates the conditional probabilities of each feature in the different categories
constructLinkHeuristic() - Method in class ir.webutils.BeamSearchSpider
Return default LinkHeuristic.
contains(String) - Method in class ir.webutils.RobotExclusionSet
Checks to see if a path is prohibited by this set.
copy() - Method in class ir.vsr.HashMapVector
Produce a copy of this HashMapVector with a new HashMap and new Weight's
corpusDir - Variable in class ir.eval.Experiment
The directory from which the indexed documents come.
cosineTo(HashMapVector) - Method in class ir.vsr.HashMapVector
Computes cosine of angle to otherVector.
cosineTo(HashMapVector, double) - Method in class ir.vsr.HashMapVector
Computes cosine of angle to otherVector when also given otherVector's Euclidean length (allows saving computation if the length is already known).
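The two cosineTo variants can be illustrated with a sparse vector held in a plain Map. This is a sketch under assumptions: the real HashMapVector maps tokens to Weight objects, whereas the hypothetical SparseCosineSketch below uses Double values directly, and returning 0 for a zero-length vector is a choice made here, not documented behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the ir.vsr.HashMapVector implementation.
final class SparseCosineSketch {

    // Euclidean length: square root of the sum of squared weights.
    static double length(Map<String, Double> vector) {
        double sumOfSquares = 0.0;
        for (double weight : vector.values()) {
            sumOfSquares += weight * weight;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine when the other vector's length is already known, which saves
    // recomputing it for every comparison.
    static double cosine(Map<String, Double> a, Map<String, Double> b, double bLength) {
        double dot = 0.0;
        for (Map.Entry<String, Double> entry : a.entrySet()) {
            Double other = b.get(entry.getKey());
            if (other != null) {
                dot += entry.getValue() * other; // only shared tokens contribute
            }
        }
        double denominator = length(a) * bLength;
        return denominator == 0.0 ? 0.0 : dot / denominator;
    }

    // Convenience form that computes the other vector's length itself.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        return cosine(a, b, length(b));
    }

    public static void main(String[] args) {
        Map<String, Double> a = new HashMap<>();
        a.put("x", 3.0);
        a.put("y", 4.0);
        Map<String, Double> b = new HashMap<>();
        b.put("y", 1.0);
        System.out.println(cosine(a, b)); // prints 0.8
    }
}
```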
count - Variable in class ir.utilities.Counter
The integer count
count - Variable in class ir.vsr.TokenOccurrence
The number of times it occurs in the Document
count - Variable in class ir.webutils.Spider
The number of pages indexed.
Counter - Class in ir.utilities
A simple wrapper data structure for storing an integer count as an Object that can be put into lists, maps, etc.
Counter() - Constructor for class ir.utilities.Counter
 
countPhrase(String, String, int) - Static method in class ir.utilities.MoreString
Counts the number of times that a given substring appears in a string using matching as defined in startsWithPhrase and starting in the string from fromIndex
countPhrase(String, String) - Static method in class ir.utilities.MoreString
Counts the number of times that a given substring appears in a string using matching as defined in startsWithPhrase
covariance(double[], double[]) - Static method in class ir.utilities.Stats
Return the covariance between the vectors x and y.
currentLink - Variable in class ir.webutils.AnchoredLinkExtractor
The current link being processed
CVLearningCurve - Class in ir.classifiers
Gives learning curves with K-fold cross validation for a classifier.
CVLearningCurve(int, Classifier, List<Example>, double[], long, boolean) - Constructor for class ir.classifiers.CVLearningCurve
Creates a CVLearning curve object
CVLearningCurve(Classifier, List<Example>) - Constructor for class ir.classifiers.CVLearningCurve
Creates a CVLearning curve object with 10 folds and default points

D

debug - Variable in class ir.classifiers.CVLearningCurve
Flag for debug display
debug - Variable in class ir.classifiers.Perceptron
Flag for debug print statements
debug - Variable in class ir.classifiers.PerceptronUnit
Flag for debug print statements
decrement() - Method in class ir.utilities.Counter
Decrement and return the new count
decrement(int) - Method in class ir.utilities.Counter
Decrement by n and return the new count
decrement() - Method in class ir.utilities.Weight
Decrement and return the new count
decrement(int) - Method in class ir.utilities.Weight
Decrement by n and return the new count
decrement(double) - Method in class ir.utilities.Weight
Decrement by n and return the new count
DEFAULT_POINTS - Static variable in class ir.classifiers.CVLearningCurve
Default points
DirectoryExamplesConstructor - Class in ir.classifiers
Creates a list of examples from a directory where file names contain the category name as a substring.
DirectoryExamplesConstructor(String, String[], short, boolean) - Constructor for class ir.classifiers.DirectoryExamplesConstructor
Construct an ExamplesConstructor for the given directory and category labels
DirectoryExamplesConstructor(String, String[]) - Constructor for class ir.classifiers.DirectoryExamplesConstructor
Construct an ExamplesConstructor for the given directory and category labels
DirectorySpider - Class in ir.webutils
Spider that limits itself to the directory it started in.
DirectorySpider() - Constructor for class ir.webutils.DirectorySpider
 
dirFile - Variable in class ir.vsr.InvertedIndex
The directory from which the indexed documents come.
dirName - Variable in class ir.classifiers.DirectoryExamplesConstructor
Name of the directory where the example files are stored.
display(String) - Static method in class ir.utilities.Browser
Make browser display a given URL
display(File) - Static method in class ir.utilities.Browser
Make browser display a given file
displayProbs(double[], Hashtable<String, double[]>) - Method in class ir.classifiers.NaiveBayes
Displays the probabilities for each feature in the different categories
displayURL(URL) - Method in class ir.webutils.WebPageViewer
 
doCrawl() - Method in class ir.webutils.BeamSearchSpider
Crawls the web using beam search with given heuristic to find a page that satisfies goal.
doCrawl() - Method in class ir.webutils.Spider
Performs the crawl.
docRef - Variable in class ir.vsr.Retrieval
A reference to the Document being retrieved
docRef - Variable in class ir.vsr.TokenOccurrence
A reference to the Document where it occurs
docRefs - Variable in class ir.vsr.InvertedIndex
A list of all indexed documents.
docType - Variable in class ir.classifiers.DirectoryExamplesConstructor
Type of document (text or HTML)
docType - Variable in class ir.vsr.DocumentIterator
The type of documents to be created
docType - Variable in class ir.vsr.InvertedIndex
The type of Documents (text, HTML).
document - Variable in class ir.classifiers.Example
FileDocument object for the example
Document - Class in ir.vsr
Document is an abstract class that provides for tokenization of a document with stop-word removal and an iterator-like interface similar to StringTokenizer.
Document(boolean) - Constructor for class ir.vsr.Document
Creates a new Document making sure that the stopwords are loaded, indexed, and ready for use.
DocumentIterator - Class in ir.vsr
An object for iterating over a set of documents in a directory.
DocumentIterator(File, short, boolean, FilenameFilter) - Constructor for class ir.vsr.DocumentIterator
Create an iterator with these attributes
DocumentIterator(File, short, boolean) - Constructor for class ir.vsr.DocumentIterator
Create an iterator with these attributes
DocumentIterator(File) - Constructor for class ir.vsr.DocumentIterator
Create an iterator for TextFileDocuments
DocumentReference - Class in ir.vsr
A simple data structure for storing a reference to a document file that includes information on the length of its document vector.
DocumentReference(File, double) - Constructor for class ir.vsr.DocumentReference
 
DocumentReference(FileDocument) - Constructor for class ir.vsr.DocumentReference
Create a reference to this document, initializing its length to 0
DoubleValue - Class in ir.utilities
A simple wrapper data structure for storing a double real value as an Object whose value can be reset.
DoubleValue(double) - Constructor for class ir.utilities.DoubleValue
 

E

empty() - Method in class ir.webutils.HTMLPage
Returns true if the page is empty or a 404 error.
entrySet() - Method in class ir.vsr.HashMapVector
Returns the Set of MapEntries in the hashMap
equals(Object) - Method in class ir.webutils.Link
 
Example - Class in ir.classifiers
An object to hold training or test examples for categorization.
Example(HashMapVector, int, String, FileDocument) - Constructor for class ir.classifiers.Example
 
ExamplesConstructor - Class in ir.classifiers
Creates a list of Examples from data files. Specializations handle various ways of storing examples.
ExamplesConstructor() - Constructor for class ir.classifiers.ExamplesConstructor
 
Experiment - Class in ir.eval
Contains methods for running evaluation experiments for information retrieval, specifically the generation of recall-precision curves for a given test corpus of query/relevant-documents pairs.
Experiment(File, File, File, short, boolean) - Constructor for class ir.eval.Experiment
Create an Experiment object for generating Recall/Precision curves
Experiment(InvertedIndex, File, File) - Constructor for class ir.eval.Experiment
Create an Experiment object for generating Recall/Precision curves using a provided InvertedIndex
extractLinks() - Method in class ir.webutils.LinkExtractor
Extracts links from the given page.

F

featureTable - Variable in class ir.classifiers.BayesResult
Stores the counts for each feature: an entry in the hashTable stores the array of class counts for a feature
Feedback - Class in ir.vsr
Gets and stores information about relevance feedback from the user and computes an updated query based on original query and retrieved documents that are rated relevant and irrelevant.
Feedback(HashMapVector, Retrieval[], InvertedIndex) - Constructor for class ir.vsr.Feedback
Create a feedback object for this query with initial retrievals to be rated
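The query update that such a feedback object computes can be sketched in the Ide regular form: the revised query is alpha times the original query, plus beta times each relevant document vector, minus gamma times each irrelevant one. The class IdeRegularSketch, the plain Map representation, and the decision to keep negative weights are assumptions for illustration; the sketch does not reproduce the ir.vsr.Feedback source or its ALPHA, BETA and GAMMA values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the ir.vsr.Feedback implementation.
final class IdeRegularSketch {

    // target += scale * source, token by token (compare HashMapVector.addScaled).
    static void addScaled(Map<String, Double> target, Map<String, Double> source, double scale) {
        for (Map.Entry<String, Double> entry : source.entrySet()) {
            target.merge(entry.getKey(), scale * entry.getValue(), Double::sum);
        }
    }

    // q' = alpha * q + beta * sum(relevant) - gamma * sum(irrelevant).
    // Negative weights are kept as-is in this sketch.
    static Map<String, Double> newQuery(Map<String, Double> query,
                                        List<Map<String, Double>> relevant,
                                        List<Map<String, Double>> irrelevant,
                                        double alpha, double beta, double gamma) {
        Map<String, Double> result = new HashMap<>();
        addScaled(result, query, alpha);
        for (Map<String, Double> doc : relevant) {
            addScaled(result, doc, beta);
        }
        for (Map<String, Double> doc : irrelevant) {
            addScaled(result, doc, -gamma);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> query = new HashMap<>();
        query.put("java", 1.0);
        Map<String, Double> good = new HashMap<>();
        good.put("java", 1.0);
        good.put("jvm", 2.0);
        Map<String, Double> bad = new HashMap<>();
        bad.put("coffee", 4.0);

        List<Map<String, Double>> relevant = new ArrayList<>();
        relevant.add(good);
        List<Map<String, Double>> irrelevant = new ArrayList<>();
        irrelevant.add(bad);

        // Tokens from the relevant document are boosted; "coffee" is penalized.
        System.out.println(newQuery(query, relevant, irrelevant, 1.0, 1.0, 0.5));
    }
}
```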
feedback - Variable in class ir.vsr.InvertedIndex
Whether relevance feedback using the Ide_regular algorithm is used
file - Variable in class ir.vsr.DocumentReference
The file where the referenced document is stored.
file - Variable in class ir.vsr.FileDocument
The name of the file
FileDocument - Class in ir.vsr
A Document stored as a file.
FileDocument(File, boolean) - Constructor for class ir.vsr.FileDocument
Creates a FileDocument and initializes its name and reader.
fileExtension(String) - Static method in class ir.utilities.MoreString
 
FilePrefixer - Class in ir.utilities
Prefix all files in a directory with a particular prefix.
FilePrefixer(File, FilenameFilter) - Constructor for class ir.utilities.FilePrefixer
 
FilePrefixer(File) - Constructor for class ir.utilities.FilePrefixer
 
files - Variable in class ir.vsr.DocumentIterator
An array of files in the directory
fileToString(String) - Static method in class ir.utilities.MoreString
Load the stopwords from file to the hashtable where they are indexed.
findClassID(String) - Method in class ir.classifiers.DirectoryExamplesConstructor
Finds the class ID from the name of the document file.
foldBins - Variable in class ir.classifiers.CVLearningCurve
foldBins[i][j] stores the examples for class i in fold j.

G

GAMMA - Static variable in class ir.vsr.Feedback
A Rocchio/Ide algorithm parameter
getAnchorText() - Method in class ir.webutils.AnchoredLink
Return anchor text for link
getBackLink() - Method in class ir.webutils.ScoredAnchoredLink
Return backLink for link
getCategories() - Method in class ir.classifiers.Classifier
Returns the categories (classes) in the data
getCategory() - Method in class ir.classifiers.Example
Returns the category of the example
getClassifier() - Method in class ir.classifiers.CVLearningCurve
Return classifier
getClassPriors() - Method in class ir.classifiers.BayesResult
Returns the class priors
getDocument() - Method in class ir.classifiers.Example
Returns the document of the example
getDocument(short, boolean) - Method in class ir.vsr.DocumentReference
Get the full Document for this Document reference by recreating it with the given docType and stemming
getEdgesIn() - Method in class ir.webutils.Node
Gives the list of incoming edges
getEdgesOut() - Method in class ir.webutils.Node
Gives the list of outgoing edges
getEndPosition() - Method in class ir.webutils.ScoredAnchoredLink
Return endPosition for link
getEpsilon() - Method in class ir.classifiers.NaiveBayes
Returns value of EPSILON
getExamples() - Method in class ir.classifiers.DirectoryExamplesConstructor
Get the examples from the directory, process them into HashMapVector's and label them with the correct category label
getExamples() - Method in class ir.classifiers.ExamplesConstructor
Return the list of examples for this dataset
getExistingNode(String) - Method in class ir.webutils.Graph
Returns the node with that name
getFeatureTable() - Method in class ir.classifiers.BayesResult
Returns the feature hash
getFeedback(int) - Method in class ir.vsr.Feedback
Prompt the user for feedback on this numbered retrieval
getFoldBins() - Method in class ir.classifiers.CVLearningCurve
Return the fold Bins
getHashMapVector() - Method in class ir.classifiers.Example
Returns the hashVector of the example
getHTMLPage(Link) - Method in class ir.webutils.HTMLPageRetriever
Downloads a web page from a given URL.
getHTMLPage(Link) - Method in class ir.webutils.SafeHTMLPageRetriever
Tries to download the given web page.
getIsLaplace() - Method in class ir.classifiers.NaiveBayes
Returns value of isLaplace
getLink() - Method in class ir.webutils.HTMLPage
Returns the Link object that was used to access this page.
getName() - Method in class ir.classifiers.Classifier
The name of a classifier
getName() - Method in class ir.classifiers.Example
Returns the name of the example
getName() - Method in class ir.classifiers.NaiveBayes
Returns the name
getName() - Method in class ir.classifiers.Perceptron
Returns the name
getNewLinks(HTMLPage) - Method in class ir.webutils.BeamSearchSiteSpider
Gets links from the given page that are on the same host as the page.
getNewLinks(HTMLPage) - Method in class ir.webutils.BeamSearchSpider
Returns a list of scored links to follow from a given page.
getNewLinks(HTMLPage) - Method in class ir.webutils.DirectorySpider
Gets links from the page that are in or below the starting directory.
getNewLinks(HTMLPage) - Method in class ir.webutils.SiteSpider
Gets links from the given page that are on the same host as the page.
getNewLinks(HTMLPage) - Method in class ir.webutils.Spider
Returns a list of links to follow from a given page.
getNextCandidateToken() - Method in class ir.vsr.Document
Return the next possible token in the document.
getNextCandidateToken() - Method in class ir.vsr.HTMLFileDocument
Return the next purely alpha-character token in the document, or null if none left.
getNextCandidateToken() - Method in class ir.vsr.TextFileDocument
Return the next purely alpha-character token in the document, or null if none left.
getNextCandidateToken() - Method in class ir.vsr.TextStringDocument
Get the next token from this string
getNode(String) - Method in class ir.webutils.Graph
Returns the node with that name, creates one if not already present.
getOutLinks() - Method in class ir.webutils.HTMLPage
Get the list of out links from this page.
getParser() - Method in class ir.webutils.HTMLParserMaker
Returns a parser.
getPoint() - Method in class ir.classifiers.PointResults
 
getResults() - Method in class ir.classifiers.PointResults
 
getRetrieval(double, DocumentReference, double) - Method in class ir.vsr.InvertedIndex
Calculate the final score for a retrieval and return a Retrieval object representing the retrieval with its final score.
getScore() - Method in class ir.webutils.AnchoredLink
 
getScore() - Method in class ir.webutils.Link
 
getScore() - Method in class ir.webutils.ScoredAnchoredLink
 
getStartPosition() - Method in class ir.webutils.ScoredAnchoredLink
Return startPosition for link
getTestCV(int) - Method in class ir.classifiers.CVLearningCurve
Creates the testing set for one fold of a cross-validation on the dataset.
getText() - Method in class ir.webutils.HTMLPage
Returns the full text of this page.
getTotalExamples() - Method in class ir.classifiers.CVLearningCurve
Return all the examples
getTrainCV(int, double) - Method in class ir.classifiers.CVLearningCurve
Creates the training set for one fold of a cross-validation on the dataset.
getTrainResult() - Method in class ir.classifiers.NaiveBayes
Returns training result
getURL() - Method in class ir.webutils.Link
Returns the URL of this link.
getURL(String) - Static method in class ir.webutils.URLChecker
Returns a URL for the given string after correcting simple errors.
getValue() - Method in class ir.utilities.Counter
Get the current count
getValue() - Method in class ir.utilities.Weight
Get the current count
getWebPage(String) - Static method in class ir.webutils.WebPage
Downloads the web page specified by the URL represented by a given string.
getWebPage(URL) - Static method in class ir.webutils.WebPage
Downloads the web page specified by the given URL object.
getWeight(String) - Method in class ir.vsr.HashMapVector
Return the weight of the given token in the vector
go(String[]) - Method in class ir.webutils.BeamSearchSpider
Interprets command line arguments and performs the crawl.
go(String[]) - Method in class ir.webutils.Spider
Checks command line arguments and performs the crawl.
goal - Variable in class ir.webutils.BeamSearchSpider
Defines the goal predicate over HTMLPage's that is to be satisfied.
goalPage - Variable in class ir.webutils.BeamSearchSpider
The page found that satisfies the goal
goodDocRefs - Variable in class ir.vsr.Feedback
The list of DocumentReference's that were rated relevant
Graph - Class in ir.webutils
Graph data structure.
Graph() - Constructor for class ir.webutils.Graph
Basic constructor.

H

handleBCommandLineOption(String) - Method in class ir.webutils.BeamSearchSpider
Called when "-b" is passed in on the command line to set the beam width.
handleCCommandLineOption(String) - Method in class ir.webutils.Spider
Called when "-c" is passed in on the command line.
handleDCommandLineOption(String) - Method in class ir.webutils.Spider
Called when "-d" is passed in on the command line.
handleEndTag(HTML.Tag, int) - Method in class ir.webutils.AnchoredLinkExtractor
Executed when a closing HTML tag is found in the document.
handleEndTag(HTML.Tag, int) - Method in class ir.webutils.LinkExtractor
Executed when a closing HTML tag is found in the document.
handleEndTag(HTML.Tag, int) - Method in class ir.webutils.ScoredAnchoredLinkExtractor
Executed when a closing HTML tag is found in the document.
handleHCommandLineOption(String) - Method in class ir.webutils.BeamSearchSpider
Called when "-h" is passed in on the command line to set help strings.
handleSafeCommandLineOption() - Method in class ir.webutils.Spider
Called when "-safe" is passed in on the command line.
handleSimpleTag(HTML.Tag, MutableAttributeSet, int) - Method in class ir.webutils.AnchoredLinkExtractor
Executed when an HTML tag that has no closing tag is found in the document.
handleSimpleTag(HTML.Tag, MutableAttributeSet, int) - Method in class ir.webutils.LinkExtractor
Executed when an HTML tag that has no closing tag is found in the document.
handleSimpleTag(HTML.Tag, MutableAttributeSet, int) - Method in class ir.webutils.RobotsMetaTagParser
Checks for robots META tags.
handleSlowCommandLineOption() - Method in class ir.webutils.Spider
Called when "-slow" is passed in on the command line.
handleStartTag(HTML.Tag, MutableAttributeSet, int) - Method in class ir.webutils.AnchoredLinkExtractor
Executed when an opening HTML tag is found in the document.
handleStartTag(HTML.Tag, MutableAttributeSet, int) - Method in class ir.webutils.LinkExtractor
Executed when an opening HTML tag is found in the document.
handleStartTag(HTML.Tag, MutableAttributeSet, int) - Method in class ir.webutils.ScoredAnchoredLinkExtractor
Executed when an opening HTML tag is found in the document.
handleText(char[], int) - Method in class ir.webutils.AnchoredLinkExtractor
Executed when a block of text is encountered.
handleText(char[], int) - Method in class ir.webutils.LinkExtractor
Executed when a block of text is encountered.
handleUCommandLineOption(String) - Method in class ir.webutils.BeamSearchSpider
Called when "-u" is passed in on the command line.
handleUCommandLineOption(String) - Method in class ir.webutils.DirectorySpider
Sets the initial URL from the "-u" argument, then calls the corresponding superclass method.
handleUCommandLineOption(String) - Method in class ir.webutils.Spider
Called when "-u" is passed in on the command line.
handleWCommandLineOption(String) - Method in class ir.webutils.BeamSearchSpider
Called when "-w" is passed in on the command line to set "want strings".
hashCode() - Method in class ir.webutils.Link
 
hashMap - Variable in class ir.vsr.HashMapVector
The HashMap that stores the mapping of tokens to Weights
hashMapVector() - Method in class ir.vsr.Document
Returns a hashmap version of the term-vector (bag of words) for this document, where each token is a key whose value is the number of times it occurs in the document as stored in a Weight.
HashMapVector - Class in ir.vsr
A data structure for a term vector for a document stored as a HashMap that maps tokens to Weight's that store the weight of that token in the document.
HashMapVector() - Constructor for class ir.vsr.HashMapVector
 
hashVector - Variable in class ir.classifiers.Example
Representation of the example as a vector of (feature -> weight) mappings
hasMoreDocuments() - Method in class ir.vsr.DocumentIterator
Returns true iff there are more documents in this directory
hasMoreTokens() - Method in class ir.vsr.Document
Returns true iff the document contains more tokens
haveFeedback(int) - Method in class ir.vsr.Feedback
Has the user already provided feedback on this numbered retrieval?
helpStrings - Variable in class ir.webutils.LinkHeuristic
The array of help strings to help find the want strings
heuristic - Variable in class ir.webutils.BeamSearchSpider
Defines the heuristic that is used to sort ScoredAnchoredLink's in the queue
HTMLFileDocument - Class in ir.vsr
An HTML file document where HTML commands are removed from the token stream.
HTMLFileDocument(File, boolean) - Constructor for class ir.vsr.HTMLFileDocument
Create a new text document for the given file.
HTMLFileDocument(String, boolean) - Constructor for class ir.vsr.HTMLFileDocument
Create a new text document for the given file name.
HTMLPage - Class in ir.webutils
HTMLPage is a representation of information about a web page.
HTMLPage(Link, String) - Constructor for class ir.webutils.HTMLPage
Constructs an HTMLPage with the given link and text.
HTMLPageRetriever - Class in ir.webutils
HTMLPageRetriever allows clients to download web pages from URLs.
HTMLPageRetriever() - Constructor for class ir.webutils.HTMLPageRetriever
Constructs an HTMLPageRetriever object.
HTMLParserMaker - Class in ir.webutils
HTMLParserMaker allows clients to retrieve an HTMLEditorKit.Parser instance.
HTMLParserMaker() - Constructor for class ir.webutils.HTMLParserMaker
 

I

idf - Variable in class ir.vsr.TokenInfo
The IDF (inverse document frequency) factor for this token, which indicates how much to weight an occurrence.
incorporateToken(String, double, Map<DocumentReference, DoubleValue>) - Method in class ir.vsr.InvertedIndex
Retrieve the documents indexed by this token in the inverted index, add each one to the retrievalHash if needed, and update its running total score.
increment() - Method in class ir.utilities.Counter
Increment and return the new count
increment(int) - Method in class ir.utilities.Counter
Increment by n and return the new count
increment() - Method in class ir.utilities.Weight
Increment and return the new count
increment(int) - Method in class ir.utilities.Weight
Increment by n and return the new count
increment(double) - Method in class ir.utilities.Weight
Increment by n and return the new count
increment(String, double) - Method in class ir.vsr.HashMapVector
Increment the weight for the given token in the vector by the given amount.
increment(String) - Method in class ir.vsr.HashMapVector
Increment the weight for the given token in the vector by 1.
increment(String, int) - Method in class ir.vsr.HashMapVector
Increment the weight for the given token in the vector by the given int
index() - Method in class ir.webutils.RobotsMetaTagParser
Indicates whether the page can be indexed.
indexAllowed() - Method in class ir.webutils.HTMLPage
Clients should always call this method before indexing an HTML page if they want to obey the "NOINDEX" directive in the Robots META tag.
indexAllowed() - Method in class ir.webutils.SafeHTMLPage
Indicates whether or not indexing has been disallowed by a Robots META tag.
indexDocument(FileDocument, HashMapVector) - Method in class ir.vsr.InvertedIndex
Index the given document using its corresponding vector
indexDocuments() - Method in class ir.vsr.InvertedIndex
Index the documents in dirFile.
indexDocuments(List<Example>) - Method in class ir.vsr.InvertedIndex
Index the documents in the List of Examples for text categorization.
indexOfIgnoreCase(String, String, int) - Static method in class ir.utilities.MoreString
 
indexOfIgnoreCase(String, String) - Static method in class ir.utilities.MoreString
 
indexOfPhrase(String, String, int) - Static method in class ir.utilities.MoreString
Version of String method indexOf that treats all whitespace characters as equivalent and matches lowercase characters in the substring to either lower or uppercase in the string, but uppercase characters in the substring must match uppercase in the string
indexOfPhrase(String, String) - Static method in class ir.utilities.MoreString
Version of String method indexOf that treats all whitespace characters in substring as matching any "word boundary" and matches lowercase characters in the substring to either lower or uppercase in the string, but uppercase characters in the substring must match uppercase in the string
indexPage(HTMLPage) - Method in class ir.webutils.Spider
"Indexes" a HTMLpage.
indexToken(String, int, DocumentReference) - Method in class ir.vsr.InvertedIndex
Add a token occurrence to the index.
invertedIndex - Variable in class ir.vsr.Feedback
The current InvertedIndex
InvertedIndex - Class in ir.vsr
An inverted index for vector-space information retrieval.
InvertedIndex(File, short, boolean, boolean) - Constructor for class ir.vsr.InvertedIndex
Create an inverted index of the documents in a directory.
InvertedIndex(List<Example>) - Constructor for class ir.vsr.InvertedIndex
Create an inverted index of the documents in a List of Example objects of documents for text categorization.
ir.classifiers - package ir.classifiers
Provides methods for classifying text documents using machine learning.
ir.eval - package ir.eval
Provides methods for running experiments for evaluating information retrieval.
ir.utilities - package ir.utilities
Provides utility methods for manipulating various types of data for the overall IR package
ir.vsr - package ir.vsr
Provides basic vector-space information retrieval system.
ir.webutils - package ir.webutils
Provides web utilities for downloading web pages and spidering the web.
isEmpty() - Method in class ir.vsr.Feedback
Has the user rated any documents yet?
isWordBoundary(char) - Static method in class ir.utilities.MoreString
Returns true iff character is in a specific set considered to mark a word boundary
iterator() - Method in class ir.webutils.RobotExclusionSet
 

L

learningRate - Variable in class ir.classifiers.PerceptronUnit
The learning rate for weight updating
length - Variable in class ir.vsr.DocumentReference
The length of the corresponding Document vector.
length() - Method in class ir.vsr.HashMapVector
Compute Euclidean length (sqrt of sum of squares) of vector
link - Variable in class ir.webutils.HTMLPage
The original link to this page
Link - Class in ir.webutils
Link is a class that contains a URL.
Link() - Constructor for class ir.webutils.Link
May be subclassed.
Link(URL) - Constructor for class ir.webutils.Link
Constructs a link with specified URL.
Link(String) - Constructor for class ir.webutils.Link
Construct a link with specified URL string
LinkExtractor - Class in ir.webutils
LinkExtractor defines a callback that extracts the links from an HTML document and provides functionality to parse a document.
LinkExtractor(HTMLPage) - Constructor for class ir.webutils.LinkExtractor
Create a link extractor for the given page
LinkHeuristic - Class in ir.webutils
Evaluates a web link (ScoredAnchoredLink) based on satisfying a set of "want strings" and "help strings".
LinkHeuristic() - Constructor for class ir.webutils.LinkHeuristic
Construct an empty heuristic
LinkHeuristic(String[], String[]) - Constructor for class ir.webutils.LinkHeuristic
Construct a heuristic with the given wantStrings and helpStrings
links - Variable in class ir.webutils.LinkExtractor
The current list of extracted links
linksToVisit - Variable in class ir.webutils.Spider
The queue of links maintained by the spider
linkToHTMLPage(Link) - Method in class ir.webutils.Spider
Check if this is a link to an HTML page.
loadStopWords() - Static method in class ir.vsr.Document
Load the stopwords from file to the hashtable where they are indexed.
log(double, double) - Static method in class ir.utilities.MoreMath
Return logarithm of a given base
log(int, int) - Static method in class ir.utilities.MoreMath
 
log(double, int) - Static method in class ir.utilities.MoreMath
 
log(int, double) - Static method in class ir.utilities.MoreMath
 

M

main(String[]) - Static method in class ir.classifiers.DirectoryExamplesConstructor
Test loading a sample directory of examples
main(String[]) - Static method in class ir.classifiers.TestNaiveBayes
A driver method for testing the NaiveBayes classifier using 10-fold cross validation.
main(String[]) - Static method in class ir.classifiers.TestPerceptron
 
main(String[]) - Static method in class ir.eval.Experiment
Evaluate retrieval performance on a given query test corpus and generate a recall/precision graph.
main(String[]) - Static method in class ir.utilities.Browser
Test interface
main(String[]) - Static method in class ir.utilities.FilePrefixer
 
main(String[]) - Static method in class ir.utilities.MoreString
 
main(String[]) - Static method in class ir.utilities.Porter
For testing, print the stemmed version of a word
main(String[]) - Static method in class ir.vsr.DocumentIterator
Test by printing the bag-of-words for each file in the given directory
main(String[]) - Static method in class ir.vsr.HTMLFileDocument
For testing, print the bag-of-words vector for a given HTML file
main(String[]) - Static method in class ir.vsr.InvertedIndex
Index a directory of files and then interactively accept retrieval queries.
main(String[]) - Static method in class ir.vsr.TextFileDocument
For testing, print the bag-of-words vector for a given file
main(String[]) - Static method in class ir.vsr.TextStringDocument
For testing, print the bag-of-words vector for the given string
main(String[]) - Static method in class ir.webutils.AnchoredLinkExtractor
 
main(String[]) - Static method in class ir.webutils.BeamSearchSiteSpider
Search the web using beam search according to the following command options, but stay within the initial host site.
main(String[]) - Static method in class ir.webutils.BeamSearchSpider
Search the web using beam search according to the following command options: -safe : Check for and obey robots.txt and robots META tag directives. -c <maxCount> : Download at most <maxCount> pages (default is 10,000). -u <url> : Start at <url>. -w <strings> : <strings> should be a list of "need strings" separated by ";". -h <strings> : <strings> should be a list of "help strings" separated by ";". -b <size> : Use a beam width of given <size> (default is 100). -slow : Pause briefly before getting a page.
main(String[]) - Static method in class ir.webutils.DirectorySpider
Spider the web according to the following command options, but only below the start URL directory.
main(String[]) - Static method in class ir.webutils.Graph
 
main(String[]) - Static method in class ir.webutils.Link
 
main(String[]) - Static method in class ir.webutils.RobotExclusionSet
For testing only.
main(String[]) - Static method in class ir.webutils.SiteSpider
Spider the web according to the following command options, but stay within the given site (same URL host).
main(String[]) - Static method in class ir.webutils.Spider
Spider the web according to the following command options: -safe : Check for and obey robots.txt and robots META tag directives. -d <directory> : Store indexed files in <directory>. -c <maxCount> : Store at most <maxCount> files (default is 10,000). -u <url> : Start at <url>. -slow : Pause briefly before getting a page.
main(String[]) - Static method in class ir.webutils.WebPage
Retrieve the page on the URL given and output its contents to STDOUT.
main(String[]) - Static method in class ir.webutils.WebPageViewer
 
makeRpCurve() - Method in class ir.eval.Experiment
Process and evaluate all queries and generate recall-precision curve
MAX_RETRIEVALS - Static variable in class ir.vsr.InvertedIndex
The maximum number of retrieved documents for a query to present to the user at a time
maxCount - Variable in class ir.webutils.Spider
The maximum number of pages to be indexed.
maxEpochs - Variable in class ir.classifiers.PerceptronUnit
Maximum number of training epochs allowed
maxWeight() - Method in class ir.vsr.HashMapVector
Returns the maximum weight of any token in the vector.
mean(double[]) - Static method in class ir.utilities.Stats
Return the arithmetic mean of the argument values.
MoreMath - Class in ir.utilities
A place to put some additional math functions
MoreMath() - Constructor for class ir.utilities.MoreMath
 
MoreString - Class in ir.utilities
A place to put some additional string functions
MoreString() - Constructor for class ir.utilities.MoreString
 
multiply(double) - Method in class ir.vsr.HashMapVector
Destructively multiply the vector by a constant
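As a rough illustration of the HashMapVector operations indexed above (increment, length, maxWeight, multiply), the following is a minimal sketch of a sparse token-weight vector. The class name SparseVectorSketch is hypothetical, and the code is not the ir.vsr.HashMapVector source; the value returned by maxWeight for an empty vector is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a sparse token-weight vector (not ir.vsr.HashMapVector itself). */
final class SparseVectorSketch {
    private final Map<String, Double> weights = new HashMap<>();

    /** Increment the weight of the given token by the given amount. */
    void increment(String token, double amount) {
        weights.merge(token, amount, Double::sum);
    }

    /** Destructively multiply every weight by a constant. */
    void multiply(double factor) {
        weights.replaceAll((token, w) -> w * factor);
    }

    /** Maximum weight of any token; 0.0 for an empty vector (assumed convention). */
    double maxWeight() {
        return weights.values().stream()
                .mapToDouble(Double::doubleValue)
                .max()
                .orElse(0.0);
    }

    /** Euclidean length: square root of the sum of squared weights. */
    double length() {
        double sumOfSquares = 0.0;
        for (double w : weights.values()) {
            sumOfSquares += w * w;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```

"Destructively" here means the method changes the receiver in place rather than returning a new vector, which is why multiply returns nothing.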

N

NaiveBayes - Class in ir.classifiers
Implements the NaiveBayes Classifier with Laplace smoothing.
NaiveBayes(String[], boolean) - Constructor for class ir.classifiers.NaiveBayes
Create a naive Bayes classifier with these attributes
name - Variable in class ir.classifiers.Example
Name of the example
name - Static variable in class ir.classifiers.NaiveBayes
Name of classifier
name - Static variable in class ir.classifiers.Perceptron
Name of classifier
newQuery() - Method in class ir.vsr.Feedback
Use the Ide_regular algorithm to compute a new revised query.
nextDocument() - Method in class ir.vsr.DocumentIterator
Get the next document
nextNode() - Method in class ir.webutils.Graph
Returns the next node in an iterator over the nodes of the graph
nextToken - Variable in class ir.vsr.Document
The next token in the document
nextToken() - Method in class ir.vsr.Document
Returns the next token in the document or null if there are none
Node - Class in ir.webutils
Node in the Graph data structure.
Node(String) - Constructor for class ir.webutils.Node
Constructs a node with that name.
nodeArray() - Method in class ir.webutils.Graph
Returns all the nodes of the graph.
numberFound - Variable in class ir.webutils.StringSearchResult
Number of different strings found
numberOccurrences - Variable in class ir.webutils.StringSearchResult
Total number of occurrences of any of the strings
numberOfTokens() - Method in class ir.vsr.Document
Returns the total number of tokens in the document or -1 if there are still more tokens to be read and the total count is not yet available.
numCategories - Variable in class ir.classifiers.Perceptron
Number of categories
numClasses - Variable in class ir.classifiers.CVLearningCurve
Number of classes in the data
numFolds - Variable in class ir.classifiers.CVLearningCurve
Number of folds of cross validation to run
numStopWords - Static variable in class ir.vsr.Document
The number of stopwords in this file
numTokens - Variable in class ir.vsr.Document
The number of tokens currently read from document

O

occList - Variable in class ir.vsr.TokenInfo
A list of TokenOccurrences giving documents where this token occurs
outFile - Variable in class ir.eval.Experiment
The output file where final recall/precision result data is printed.
outLinks - Variable in class ir.webutils.HTMLPage
The links on this page

P

padTo(String, int, char) - Static method in class ir.utilities.MoreString
Pad a string with a specific char on the right to make it the specified length
padTo(String, int) - Static method in class ir.utilities.MoreString
Pad a string with blanks on the right to make it the specified length
padToLeft(String, int, char) - Static method in class ir.utilities.MoreString
Pad a string with a specific char on the left to make it the specified length
padToLeft(String, int) - Static method in class ir.utilities.MoreString
Pad a string with blanks on the left to make it the specified length
padToLeft(double, int) - Static method in class ir.utilities.MoreString
Convert a double to a string and pad with blanks on the left to make it the specified length
padToLeft(int, int) - Static method in class ir.utilities.MoreString
Convert an int to a string and pad with blanks on the left to make it the specified length
padWithZeros(int, int) - Static method in class ir.utilities.MoreString
 
padWithZeros(double, int) - Static method in class ir.utilities.MoreString
 
page - Variable in class ir.webutils.LinkExtractor
The page from which to extract links
PageGoal - Class in ir.webutils
Object for defining the goal in a heuristic web search.
PageGoal(String[]) - Constructor for class ir.webutils.PageGoal
Construct a PageGoal with these wantStrings
pageScore - Variable in class ir.webutils.ScoredAnchoredLink
The heuristic score assigned to the page to which this link points
parseMetaTags() - Method in class ir.webutils.RobotsMetaTagParser
Parses the document and returns a list of links that cannot be followed.
PathDisallowedException - Exception in ir.webutils
PathDisallowedException is thrown to indicate that a client program tried to access a path that was disallowed by either a robots.txt file or a robots META tag.
PathDisallowedException() - Constructor for exception ir.webutils.PathDisallowedException
 
PathDisallowedException(String) - Constructor for exception ir.webutils.PathDisallowedException
 
pearsonCorrelation(double[], double[]) - Static method in class ir.utilities.Stats
Return the Pearson Correlation between the vectors x and y.
Perceptron - Class in ir.classifiers
A perceptron classifier that trains a perceptron to recognize each category.
Perceptron(String[], boolean) - Constructor for class ir.classifiers.Perceptron
Create a Perceptron classifier with these attributes
PerceptronUnit - Class in ir.classifiers
An individual perceptron object used by the Perceptron classifier. Includes methods for classifying an Example and training the unit to fire only for a given category of examples.
PerceptronUnit(String[], boolean) - Constructor for class ir.classifiers.PerceptronUnit
Create a PerceptronUnit with these attributes
point - Variable in class ir.classifiers.PointResults
Point on the curve that these results are for
PointResults - Class in ir.classifiers
Utility class for generating average result curves.
PointResults(int) - Constructor for class ir.classifiers.PointResults
Create a vector of results for a point
points - Variable in class ir.classifiers.CVLearningCurve
Points on the X axis (percentage of train data) to plot
Porter - Class in ir.utilities
The Porter stemmer for reducing words to their base stem form.
Porter() - Constructor for class ir.utilities.Porter
 
position - Variable in class ir.vsr.DocumentIterator
The current position of the iterator in this array
precision - Variable in class ir.eval.RecallPrecisionPair
 
prefix(String) - Method in class ir.utilities.FilePrefixer
 
prepareNextToken() - Method in class ir.vsr.Document
The nextToken slot is always precomputed and stored by this method.
presentRetrievals(HashMapVector, Retrieval[]) - Method in class ir.vsr.InvertedIndex
Print out a ranked set of retrievals.
print() - Method in class ir.vsr.HashMapVector
Print out the vector showing the tokens and their weights
print() - Method in class ir.vsr.InvertedIndex
Print out an inverted index by listing each token and the documents it occurs in.
print() - Method in class ir.webutils.Graph
Prints the entire graph on stdout.
printRetrievals(Retrieval[], int) - Method in class ir.vsr.InvertedIndex
Print out at most MAX_RETRIEVALS ranked retrievals starting at given starting rank number.
printVector(double[]) - Static method in class ir.utilities.MoreMath
Print a vector in the form [x,y,...z] to standard out
printVector(double[], PrintStream) - Static method in class ir.utilities.MoreMath
Print a vector in the form [x,y,...z] to the print stream
printVector() - Method in class ir.vsr.Document
Compute and print out (one line per term) the term-vector (bag of words) for this document
processArgs(String[]) - Method in class ir.webutils.BeamSearchSpider
Processes command-line arguments.
processArgs(String[]) - Method in class ir.webutils.Spider
Processes command-line arguments.
processQueries() - Method in class ir.vsr.InvertedIndex
Enter an interactive user-query loop, accepting queries and showing the retrieved documents in ranked order.
prompt(String) - Static method in class ir.utilities.UserInput
Prompt the user with a string and then get a line of input
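The padTo and padToLeft entries above describe right and left padding to a fixed width. A minimal sketch of that behavior follows; the class name PadSketch is hypothetical, this is not the ir.utilities.MoreString source, and returning an over-long input unchanged is an assumption.

```java
/** Hypothetical sketch of fixed-width padding (not ir.utilities.MoreString itself). */
final class PadSketch {
    /** Pad s on the right with pad until it reaches the given length. */
    static String padTo(String s, int length, char pad) {
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < length) {
            sb.append(pad);
        }
        return sb.toString();
    }

    /** Pad s on the left with pad until it reaches the given length. */
    static String padToLeft(String s, int length, char pad) {
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < length; i++) {
            sb.append(pad);
        }
        return sb.append(s).toString();
    }

    /** Convert an int to a string and pad with blanks on the left. */
    static String padToLeft(int value, int length) {
        return padToLeft(Integer.toString(value), length, ' ');
    }
}
```

Left padding is the variant that lines up numeric columns in printed tables of results.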

Q

queryFile - Variable in class ir.eval.Experiment
The file with the list of queries and results to be tested.
queryVector - Variable in class ir.vsr.Feedback
The original query vector for this query

R

random - Static variable in class ir.classifiers.Classifier
Used for breaking ties in argMax()
randomSeed - Variable in class ir.classifiers.CVLearningCurve
Seed for random number generator
reader - Variable in class ir.vsr.FileDocument
The I/O reader for accessing the file
readFromFile(String) - Method in class ir.webutils.Graph
Reads graph from file where each line consists of a node-name followed by a list of the names of nodes to which it points
readLine() - Static method in class ir.utilities.UserInput
Read a line of input from the user
recall - Variable in class ir.eval.RecallPrecisionPair
 
RECALL_LEVELS - Static variable in class ir.eval.Experiment
The standard recall levels for which we want to plot precision values
RecallPrecisionPair - Class in ir.eval
A lightweight object for storing a pair of recall precision measures
RecallPrecisionPair(double, double) - Constructor for class ir.eval.RecallPrecisionPair
 
removeEndSlash(URL) - Static method in class ir.webutils.Link
Removes slash at end of URL to normalize
removeRef(URL) - Static method in class ir.webutils.Link
Remove the internal "ref" pointer in a URL if there is one.
resetIterator() - Method in class ir.webutils.Graph
Resets the iterator.
results - Variable in class ir.classifiers.PointResults
Sampled values of result at this point
Retrieval - Class in ir.vsr
A lightweight object for storing information about a retrieved Document.
Retrieval(DocumentReference, double) - Constructor for class ir.vsr.Retrieval
Create a retrieval with these values
retrievals - Variable in class ir.vsr.Feedback
The current list of ranked retrievals
retrieve(String) - Method in class ir.vsr.InvertedIndex
Perform ranked retrieval on this input query.
retrieve(Document) - Method in class ir.vsr.InvertedIndex
Perform ranked retrieval on this input query Document.
retrieve(HashMapVector) - Method in class ir.vsr.InvertedIndex
Perform ranked retrieval on this input query Document vector.
retriever - Variable in class ir.webutils.Spider
The object to be used to retrieve pages
RobotExclusionSet - Class in ir.webutils
RobotExclusionSet provides support for the Robots Exclusion Protocol.
RobotExclusionSet() - Constructor for class ir.webutils.RobotExclusionSet
Constructs an empty set.
RobotExclusionSet(String) - Constructor for class ir.webutils.RobotExclusionSet
Constructs a set containing the paths in the robots.txt file for this site.
RobotsMetaTagParser - Class in ir.webutils
Parser callback that extracts robots META tag information.
RobotsMetaTagParser() - Constructor for class ir.webutils.RobotsMetaTagParser
 
RobotsMetaTagParser(URL) - Constructor for class ir.webutils.RobotsMetaTagParser
 
RobotsMetaTagParser(URL, String) - Constructor for class ir.webutils.RobotsMetaTagParser
 
roundTo(double, int) - Static method in class ir.utilities.MoreMath
Round a double to the given number of decimalPlaces
run() - Method in class ir.classifiers.CVLearningCurve
Run a CV learning curve test, print total training and test time, and generate average learning curve plot output files suitable for gnuplot
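The roundTo(double, int) entry above rounds a double to a given number of decimal places. One common way to do this is sketched below; the class name RoundSketch is hypothetical, and the scale-and-round approach is an assumption rather than the confirmed ir.utilities.MoreMath implementation.

```java
/** Hypothetical sketch of decimal rounding (not ir.utilities.MoreMath itself). */
final class RoundSketch {
    /** Round value to the given number of decimal places by scaling, rounding, and unscaling. */
    static double roundTo(double value, int decimalPlaces) {
        double scale = Math.pow(10, decimalPlaces);
        // Math.round returns a long; dividing by the double scale yields a double again.
        return Math.round(value * scale) / scale;
    }
}
```

Because doubles are binary fractions, the result is the nearest representable double to the rounded decimal value, which is adequate for printing results but not for exact decimal arithmetic.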

S

SafeHTMLPage - Class in ir.webutils
SafeHTMLPage is an immutable representation of information about a web page that includes information about whether or not this page can be indexed.
SafeHTMLPage(Link, String, boolean) - Constructor for class ir.webutils.SafeHTMLPage
Constructs a SafeHTMLPage with the given link, text, and indication whether or not indexing is allowed.
SafeHTMLPageRetriever - Class in ir.webutils
Keeps track of Robot Exclusion information.
SafeHTMLPageRetriever() - Constructor for class ir.webutils.SafeHTMLPageRetriever
 
satisfiedBy(HTMLPage) - Method in class ir.webutils.PageGoal
Returns true if this page satisfies the goal by containing all of the wantStrings
saveDir - Variable in class ir.webutils.Spider
The directory to save the downloaded files to.
score - Variable in class ir.vsr.Retrieval
The score given to this document by a retrieval engine.
score - Variable in class ir.webutils.ScoredAnchoredLink
The heuristic score assigned to this link
ScoredAnchoredLink - Class in ir.webutils
An AnchoredLink that can be used in heuristic web search where links are scored for their promise.
ScoredAnchoredLink(URL, String, Link, int) - Constructor for class ir.webutils.ScoredAnchoredLink
Constructs a link with specified URL and anchor text and backLink
ScoredAnchoredLink(URL, Link, int) - Constructor for class ir.webutils.ScoredAnchoredLink
Constructs a link with specified URL and backLink
ScoredAnchoredLink(String) - Constructor for class ir.webutils.ScoredAnchoredLink
Construct a link with specified URL string
ScoredAnchoredLinkExtractor - Class in ir.webutils
An AnchoredLinkExtractor that extracts ScoredAnchoredLinks that can be scored and used in heuristic web search.
ScoredAnchoredLinkExtractor(HTMLPage) - Constructor for class ir.webutils.ScoredAnchoredLinkExtractor
Create a ScoredAnchoredLink extractor for the given page
scoreLink(ScoredAnchoredLink, HTMLPage) - Method in class ir.webutils.LinkHeuristic
Heuristically score the given link appearing on the given page
scoreLinks(List<Link>, HTMLPage) - Method in class ir.webutils.BeamSearchSpider
Use the heuristic to score each of the new links on a given page that was expanded.
segment(String, char) - Static method in class ir.utilities.MoreString
Segment a string into substrings by breaking at occurrences of the given character and returning a list of segments
segmentToArray(String, char) - Static method in class ir.utilities.MoreString
Segment a string into substrings by breaking at occurrences of the given character and returning an array of all the segments, in order
setAnchorText(String) - Method in class ir.webutils.AnchoredLink
Set anchor text for link
setCategory(int) - Method in class ir.classifiers.Example
Sets the category of the example
setClassifier(Classifier) - Method in class ir.classifiers.CVLearningCurve
Set the classifier
setClassPriors(double[]) - Method in class ir.classifiers.BayesResult
Sets the class priors
setDebug(boolean) - Method in class ir.classifiers.NaiveBayes
Sets the debug flag
setDebug(boolean) - Method in class ir.classifiers.Perceptron
Sets the debug flag
setDocument(FileDocument) - Method in class ir.classifiers.Example
Sets the document of the example
setEndPosition(int) - Method in class ir.webutils.ScoredAnchoredLink
Set endPosition for link
setEpsilon(double) - Method in class ir.classifiers.NaiveBayes
Sets the value of EPSILON (default 1e-6)
setFeatureTable(Hashtable<String, double[]>) - Method in class ir.classifiers.BayesResult
Sets the feature hash
setFoldBins(Vector<Example>[][]) - Method in class ir.classifiers.CVLearningCurve
Set the fold Bins
setHashMapVector(HashMapVector) - Method in class ir.classifiers.Example
Sets the hashVector of the example
setInvertedIndex(InvertedIndex) - Method in class ir.classifiers.Perceptron
Since Perceptron does not use an inverted Index, this function does nothing
setLaplace(boolean) - Method in class ir.classifiers.NaiveBayes
Sets the Laplace smoothing flag
setName(String) - Method in class ir.classifiers.Example
Sets the name of the example
setOutLinks(List<Link>) - Method in class ir.webutils.HTMLPage
Set the outLinks for this page to the given list
setPage(String) - Method in class ir.webutils.RobotsMetaTagParser
 
setPoint(double) - Method in class ir.classifiers.PointResults
 
setTotalExamples(Vector<Example>[]) - Method in class ir.classifiers.CVLearningCurve
Set all the examples
setTotalExamples(List<Example>) - Method in class ir.classifiers.CVLearningCurve
Sets the totalExamples by partitioning examples into categories to get a stratified sample
setUrl(URL) - Method in class ir.webutils.RobotsMetaTagParser
 
setValue(int) - Method in class ir.utilities.Counter
Set the current count
setValue(int) - Method in class ir.utilities.Weight
Set the current count
setValue(double) - Method in class ir.utilities.Weight
Set the current count
showRetrievals(Retrieval[]) - Method in class ir.vsr.InvertedIndex
Show the top retrievals to the user if there are any.
SiteSpider - Class in ir.webutils
A spider that limits itself to a given site.
SiteSpider() - Constructor for class ir.webutils.SiteSpider
 
size() - Method in class ir.vsr.HashMapVector
Returns the number of tokens in the vector.
size() - Method in class ir.vsr.InvertedIndex
Return the number of tokens indexed.
size() - Method in class ir.webutils.RobotExclusionSet
 
sizeOfFold(int) - Method in class ir.classifiers.CVLearningCurve
Computes the total number of examples in given fold
slow - Variable in class ir.webutils.Spider
Flag to purposely slow the crawl for debugging purposes
Spider - Class in ir.webutils
Spider defines a framework for writing a web crawler.
Spider() - Constructor for class ir.webutils.Spider
 
standardDeviation(double[]) - Static method in class ir.utilities.Stats
Return the standard deviation of the argument values.
startsWithIgnoreCase(String, String, int) - Static method in class ir.utilities.MoreString
 
startsWithIgnoreCase(String, String) - Static method in class ir.utilities.MoreString
 
startsWithPhrase(String, String, int) - Static method in class ir.utilities.MoreString
Version of String method startsWith that treats all whitespace characters in substring as matching any "word boundary" and matches lowercase characters in the substring to either lower or uppercase in the string, but uppercase characters in the substring must match uppercase in the string
startsWithPhrase(String, String) - Static method in class ir.utilities.MoreString
Version of String method startsWith that treats all whitespace characters in substring as matching any "word boundary" and matches lowercase characters in the substring to either lower or uppercase in the string, but uppercase characters in the substring must match uppercase in the string
Stats - Class in ir.utilities
A place to put statistical routines
Stats() - Constructor for class ir.utilities.Stats
 
stem - Variable in class ir.classifiers.DirectoryExamplesConstructor
Flag set to stem words to their root forms
stem - Variable in class ir.vsr.Document
Whether to stem tokens with the Porter stemmer
stem - Variable in class ir.vsr.DocumentIterator
Whether tokens should be stemmed with Porter stemmer
stem - Variable in class ir.vsr.InvertedIndex
Whether tokens should be stemmed with Porter stemmer
stemmer - Static variable in class ir.vsr.Document
The Porter stemmer
stopWords - Static variable in class ir.vsr.Document
The hashtable where stopwords are indexed
stopWordsFile - Static variable in class ir.vsr.Document
The file where a list of stopwords, 1 per line, is stored
StringSearchResult - Class in ir.webutils
Lightweight object for storing both the number of DIFFERENT strings in a set of search strings that are found in a text as well as the total number of occurrences in the text of ANY of the strings in the set.
StringSearchResult(int, int) - Constructor for class ir.webutils.StringSearchResult
Construct result with a given numberFound and numberOccurrences
stripAffixes(String) - Method in class ir.utilities.Porter
Takes a String as input and returns its stem as a String.
subtract(HashMapVector) - Method in class ir.vsr.HashMapVector
Destructively subtract the given vector from the current vector
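The segment and segmentToArray entries above split a string at each occurrence of a delimiter character. A minimal sketch follows; the class name SegmentSketch is hypothetical, this is not the ir.utilities.MoreString source, and keeping empty segments between adjacent delimiters is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of character-delimited segmentation (not ir.utilities.MoreString itself). */
final class SegmentSketch {
    /** Break s at each occurrence of delim and return the segments in order. */
    static List<String> segment(String s, char delim) {
        List<String> segments = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == delim) {
                segments.add(s.substring(start, i)); // may be empty (assumed to be kept)
                start = i + 1;
            }
        }
        segments.add(s.substring(start)); // trailing segment after the last delimiter
        return segments;
    }

    /** Same segmentation, returned as an array. */
    static String[] segmentToArray(String s, char delim) {
        return segment(s, delim).toArray(new String[0]);
    }
}
```

This is the kind of splitting needed for the ";"-separated string lists that the BeamSearchSpider command-line options accept.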

T

test(Example) - Method in class ir.classifiers.Classifier
Returns true if the predicted category of the test example matches the correct category, false otherwise
test(Example) - Method in class ir.classifiers.NaiveBayes
Categorizes the test example using the trained Naive Bayes classifier, returning true if the predicted category is same as the actual category
test(Example) - Method in class ir.classifiers.Perceptron
Categorizes the test example using the trained Perceptron classifier, returning true if the predicted category is same as the actual category
TestNaiveBayes - Class in ir.classifiers
Wrapper class to test NaiveBayes classifier using 10-fold CV.
TestNaiveBayes() - Constructor for class ir.classifiers.TestNaiveBayes
 
TestPerceptron - Class in ir.classifiers
Wrapper class to test Perceptron classifier using 10-fold CV.
TestPerceptron() - Constructor for class ir.classifiers.TestPerceptron
 
testResults - Variable in class ir.classifiers.CVLearningCurve
Accuracy results for test data, one PointResults for each point on the curve
testTime - Variable in class ir.classifiers.CVLearningCurve
Total Testing time
testTimeNum - Variable in class ir.classifiers.CVLearningCurve
Total number of examples tested in test time
text - Variable in class ir.webutils.HTMLPage
The text of the page
TextFileDocument - Class in ir.vsr
A normal ASCII text file Document
TextFileDocument(File, boolean) - Constructor for class ir.vsr.TextFileDocument
Create a new text document for the given file.
TextFileDocument(String, boolean) - Constructor for class ir.vsr.TextFileDocument
Create a new text document for the given file name.
textReader - Variable in class ir.vsr.HTMLFileDocument
The I/O reader for accessing the output of the HTML parser.
TextStringDocument - Class in ir.vsr
A simple document represented by a String
TextStringDocument(String, boolean) - Constructor for class ir.vsr.TextStringDocument
Create a simple Document for this string
threshold - Variable in class ir.classifiers.PerceptronUnit
The threshold of the perceptron.
tokenHash - Variable in class ir.vsr.InvertedIndex
A HashMap where tokens are indexed.
TokenInfo - Class in ir.vsr
A lightweight object for storing information about a token (a.k.a. word, term) in an inverted index.
TokenInfo() - Constructor for class ir.vsr.TokenInfo
Create an initially empty data structure
tokenizer - Variable in class ir.vsr.HTMLFileDocument
The tokenizer for lines read from this document.
tokenizer - Variable in class ir.vsr.TextFileDocument
The tokenizer for lines read from this document.
tokenizer - Variable in class ir.vsr.TextStringDocument
The tokenizer for this document.
tokenizerDelim - Static variable in class ir.vsr.HTMLFileDocument
StringTokenizer delim for tokenizing only alphabetic strings.
tokenizerDelim - Static variable in class ir.vsr.TextFileDocument
StringTokenizer delim for tokenizing only alphabetic strings.
tokenizerDelim - Static variable in class ir.vsr.TextStringDocument
StringTokenizer delim for tokenizing only alphabetic strings.
TokenOccurrence - Class in ir.vsr
A lightweight object for storing information about an occurrence of a token (a.k.a. word, term) in a Document.
TokenOccurrence(DocumentReference, int) - Constructor for class ir.vsr.TokenOccurrence
Create an occurrence with these values
toString() - Method in class ir.classifiers.Example
Returns the String representation of the example object
toString() - Method in class ir.classifiers.PointResults
 
toString() - Method in class ir.eval.RecallPrecisionPair
 
toString() - Method in class ir.vsr.DocumentReference
 
toString() - Method in class ir.vsr.HashMapVector
Return String of the vector showing the tokens and their weights
toString() - Method in class ir.vsr.TokenOccurrence
 
toString() - Method in class ir.webutils.AnchoredLink
 
toString() - Method in class ir.webutils.Link
 
toString() - Method in class ir.webutils.Node
Returns the name of the node
totalExamples - Variable in class ir.classifiers.CVLearningCurve
Stores all the examples for each class
totalNumTrain - Variable in class ir.classifiers.CVLearningCurve
Total number of training examples per fold
train(List<Example>) - Method in class ir.classifiers.Classifier
Trains the classifier on the training examples
train(List<Example>) - Method in class ir.classifiers.NaiveBayes
Trains the Naive Bayes classifier - estimates the prior probs and calculates the counts for each feature in different categories
train(List) - Method in class ir.classifiers.Perceptron
Trains the perceptron by training a PerceptronUnit for each category.
trainAndTest() - Method in class ir.classifiers.CVLearningCurve
Run training and test for each point to be plotted, gathering a result for each fold.
trainAndTestFold(Vector<Example>, Vector<Example>, int, PointResults, PointResults) - Method in class ir.classifiers.CVLearningCurve
Train and test on given example sets for the given fold:
trainCategory(List, int) - Method in class ir.classifiers.PerceptronUnit
Trains the perceptron to only fire for examples in the given category, i.e.
trainResults - Variable in class ir.classifiers.CVLearningCurve
Accuracy results for training data, one PointResults for each point on the curve
trainTime - Variable in class ir.classifiers.CVLearningCurve
Total training time
TYPE_HTML - Static variable in class ir.vsr.DocumentIterator
docType for HTML files
TYPE_TEXT - Static variable in class ir.vsr.DocumentIterator
docType for ASCII text files

U

url - Variable in class ir.webutils.LinkExtractor
The URL for this page
URLChecker - Class in ir.webutils
URLChecker tries to clean up some URLs that do not conform to the standard and cause confusion.
UserInput - Class in ir.utilities
A place to put some helper functions for interacting with the user
UserInput() - Constructor for class ir.utilities.UserInput
 
usesInvertedIndex() - Method in class ir.classifiers.Perceptron
Function to indicate that this class does not use an inverted index

V

value - Variable in class ir.utilities.DoubleValue
 
value - Variable in class ir.utilities.Weight
A numerical weight value
vectorDivide(double[], double) - Static method in class ir.utilities.MoreMath
Divide a vector by a scalar and return the resulting vector
vectorLength(double[]) - Static method in class ir.utilities.MoreMath
 
vectorOneNorm(double[]) - Static method in class ir.utilities.MoreMath
 
visited - Variable in class ir.webutils.Spider
The URLs that have already been visited.

W

wantStrings - Variable in class ir.webutils.LinkHeuristic
The array of desired ("want") strings
WebPage - Class in ir.webutils
WebPage is a static utility class that provides operations for downloading web pages.
WebPage() - Constructor for class ir.webutils.WebPage
 
WebPageViewer - Class in ir.webutils
WebPageViewer contains utilities to download and display HTML pages.
WebPageViewer() - Constructor for class ir.webutils.WebPageViewer
 
Weight - Class in ir.utilities
A simple wrapper data structure for storing a double weight as an Object that can be put into lists, maps, etc.
Weight() - Constructor for class ir.utilities.Weight
 
weights - Variable in class ir.classifiers.PerceptronUnit
The weights of the perceptron.
write(File, String) - Method in class ir.webutils.HTMLPage
Writes web page to a file with a BASE HTML element with the original URL.
