java.lang.Object
  ir.webutils.Spider
    ir.webutils.BeamSearchSpider
      ir.webutils.BeamSearchSiteSpider
public class BeamSearchSiteSpider extends BeamSearchSpider
A BeamSearchSpider that limits itself to a given site (web host).
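
The same-host restriction described above can be illustrated with a minimal, standalone sketch. It does not use the `ir.webutils` classes: `SameHostFilter`, `hostOf`, and `sameHostLinks` are hypothetical names, links are modelled as plain URL strings instead of `Link` objects, and host names are assumed to be compared case-insensitively. The actual `getNewLinks` implementation may differ.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of ir.webutils.
class SameHostFilter {

    /** Returns the lower-cased host of a URL string, or null if it cannot be parsed or has no host. */
    static String hostOf(String url) {
        try {
            String host = new URI(url).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    /** Keeps only the links whose host matches the host of the page they were found on. */
    static List<String> sameHostLinks(String pageUrl, List<String> links) {
        List<String> kept = new ArrayList<>();
        String pageHost = hostOf(pageUrl);
        if (pageHost == null) {
            return kept; // no usable host on the page, so nothing can match
        }
        for (String link : links) {
            if (pageHost.equals(hostOf(link))) {
                kept.add(link);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList(
                "http://www.example.com/a.html",
                "http://other.example.org/b.html",
                "http://WWW.EXAMPLE.COM/c.html");
        // The link on other.example.org is dropped; the other two are kept.
        System.out.println(sameHostLinks("http://www.example.com/index.html", links));
    }
}
```

In this sketch a link that cannot be parsed is silently dropped, which seems a reasonable default for a crawler that must stay on one site.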
| Field Summary | 
|---|
| Fields inherited from class ir.webutils.BeamSearchSpider | 
|---|
| beamSize, goal, goalPage, heuristic | 
| Fields inherited from class ir.webutils.Spider | 
|---|
| count, linksToVisit, maxCount, retriever, saveDir, slow, visited | 
| Constructor Summary | |
|---|---|
| BeamSearchSiteSpider() | |
| Method Summary | |
|---|---|
| java.util.List<Link> | getNewLinks(HTMLPage page): Gets links from the given page that are on the same host as the page. |
| static void | main(java.lang.String[] args): Searches the web using beam search according to its command-line options, but stays within the initial host site. |
| Methods inherited from class ir.webutils.BeamSearchSpider | 
|---|
| constructLinkHeuristic, doCrawl, go, handleBCommandLineOption, handleHCommandLineOption, handleUCommandLineOption, handleWCommandLineOption, processArgs, scoreLinks | 
| Methods inherited from class ir.webutils.Spider | 
|---|
| handleCCommandLineOption, handleDCommandLineOption, handleSafeCommandLineOption, handleSlowCommandLineOption, indexPage, linkToHTMLPage | 
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Constructor Detail | 
|---|
public BeamSearchSiteSpider()
| Method Detail | 
|---|
public java.util.List<Link> getNewLinks(HTMLPage page)
Overrides: getNewLinks in class BeamSearchSpider
Parameters: page - The current page.
Returns: The links on page that have the same host as the page's URL.

public static void main(java.lang.String[] args)
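
For readers unfamiliar with beam search, the pruning step it relies on can be sketched as follows. This is an illustration only, not the `ir.webutils` implementation: `BeamPruneSketch`, `ScoredLink`, and `keepBest` are hypothetical names, and the sketch assumes that a higher score means a more promising link. The inherited `beamSize` field and `scoreLinks` method suggest a similar idea, but this page does not document their exact behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of beam pruning, not part of ir.webutils.
class BeamPruneSketch {

    /** A candidate link paired with a heuristic score (assumption: higher is better). */
    static final class ScoredLink {
        final String url;
        final double score;

        ScoredLink(String url, double score) {
            this.url = url;
            this.score = score;
        }
    }

    /** Returns at most beamSize candidates, best score first, without modifying the input list. */
    static List<ScoredLink> keepBest(List<ScoredLink> candidates, int beamSize) {
        List<ScoredLink> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((ScoredLink l) -> l.score).reversed());
        int keep = Math.max(0, Math.min(beamSize, sorted.size()));
        return new ArrayList<>(sorted.subList(0, keep));
    }

    public static void main(String[] args) {
        List<ScoredLink> frontier = Arrays.asList(
                new ScoredLink("http://www.example.com/a.html", 0.2),
                new ScoredLink("http://www.example.com/b.html", 0.9),
                new ScoredLink("http://www.example.com/c.html", 0.5));
        // With a beam of 2, only the two highest-scoring links survive.
        for (ScoredLink link : keepBest(frontier, 2)) {
            System.out.println(link.url + " " + link.score);
        }
    }
}
```

A site-restricted spider would apply a same-host filter to the candidate links before a pruning step like this one, so that the beam is never filled with off-site links.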