public final class SafeHTMLPageRetriever extends HTMLPageRetriever
| Constructor and Description |
|---|
| SafeHTMLPageRetriever() |
| Modifier and Type | Method and Description |
|---|---|
| HTMLPage | getHTMLPage(Link link): Tries to download the given web page. |
public HTMLPage getHTMLPage(Link link) throws PathDisallowedException
Tries to download the given web page. Throws PathDisallowedException if access to the page is prohibited. Also updates Robots Exclusion information based on the new page.

Overrides: getHTMLPage in class HTMLPageRetriever
Parameters: link - The Link to follow and download.
Throws: PathDisallowedException - If url is disallowed by a robots.txt file or Robots META tag.
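
Because getHTMLPage declares a checked PathDisallowedException, callers have to catch or propagate it. The following is a minimal sketch of that calling pattern, not the library's implementation. Every nested type is a hypothetical stand-in so the example compiles on its own: the stub retriever returns canned markup instead of downloading, the set-based constructor replaces the real no-argument constructor and its robots.txt and META tag handling, and the helper name fetchOrSkip and the example.com URLs are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SafeRetrieverSketch {

    // Hypothetical stand-ins for the library types; the real classes are richer.
    static class PathDisallowedException extends Exception {
        PathDisallowedException(String message) { super(message); }
    }

    static class Link {
        final String url;
        Link(String url) { this.url = url; }
    }

    static class HTMLPage {
        final Link link;
        final String text;
        HTMLPage(Link link, String text) { this.link = link; this.text = text; }
    }

    static class HTMLPageRetriever {
        // Stub: returns canned markup instead of performing a network download.
        HTMLPage getHTMLPage(Link link) throws PathDisallowedException {
            return new HTMLPage(link, "<html>stub page for " + link.url + "</html>");
        }
    }

    static final class SafeHTMLPageRetriever extends HTMLPageRetriever {
        // Assumption: a fixed set of disallowed URLs stands in for robots.txt rules.
        private final Set<String> disallowed;

        SafeHTMLPageRetriever(Set<String> disallowed) {
            this.disallowed = disallowed;
        }

        @Override
        HTMLPage getHTMLPage(Link link) throws PathDisallowedException {
            if (disallowed.contains(link.url)) {
                throw new PathDisallowedException("Disallowed: " + link.url);
            }
            return super.getHTMLPage(link);
        }
    }

    // Returns the page text, or null when the retriever refuses the URL.
    static String fetchOrSkip(HTMLPageRetriever retriever, String url) {
        try {
            return retriever.getHTMLPage(new Link(url)).text;
        } catch (PathDisallowedException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Set<String> blocked = new HashSet<>(Arrays.asList("http://example.com/private"));
        HTMLPageRetriever retriever = new SafeHTMLPageRetriever(blocked);
        System.out.println(fetchOrSkip(retriever, "http://example.com/index.html"));
        System.out.println(fetchOrSkip(retriever, "http://example.com/private"));
    }
}
```

Returning null for a disallowed page keeps the sketch short; a crawler would more likely log the refusal and move on to the next link.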