Script started on Thu Jan 3 14:37:49 2008
proj1-draft> setenv CLASSPATH '/u/mooney/ir-course/'
proj1-draft> java ir.vsr.InvertedIndex -html ~/ir-code/corpora/yahoo-science/
Indexing documents in /u/mooney/ir-code/corpora/yahoo-science
java.lang.ArrayIndexOutOfBoundsException: 88
    at javax.swing.text.html.parser.ContentModel.first(ContentModel.java:160)
    at javax.swing.text.html.parser.ContentModel.first(ContentModel.java:139)
    at javax.swing.text.html.parser.ContentModelState.advance(ContentModelState.java:174)
    at javax.swing.text.html.parser.TagStack.advance(TagStack.java:137)
    at javax.swing.text.html.parser.Parser.legalElementContext(Parser.java:519)
    at javax.swing.text.html.parser.Parser.legalElementContext(Parser.java:621)
    at javax.swing.text.html.parser.Parser.legalElementContext(Parser.java:683)
    at javax.swing.text.html.parser.Parser.legalTagContext(Parser.java:694)
    at javax.swing.text.html.parser.Parser.parseTag(Parser.java:1890)
    at javax.swing.text.html.parser.Parser.parseContent(Parser.java:1940)
    at javax.swing.text.html.parser.Parser.parse(Parser.java:2107)
    at javax.swing.text.html.parser.DocumentParser.parse(DocumentParser.java:105)
    at javax.swing.text.html.parser.ParserDelegator.parse(ParserDelegator.java:73)
    at ir.vsr.HTMLFileParserThread.run(HTMLFileDocument.java:46)
Indexed 900 documents with 40039 unique terms.
Now able to process queries. When done, enter an empty query to exit.
Enter query: computer science
Top 10 matching Documents from most to least relevant:
1. phys-page173.html Score: 0.25404
2. phys-page135.html Score: 0.12011
3. phys-page10.html Score: 0.08825
4. chem-page16.html Score: 0.07895
5. chem-page200.html Score: 0.07532
6. chem-page68.html Score: 0.07235
7. bio-page100.html Score: 0.07235
8. phys-page251.html Score: 0.06712
9. chem-page65.html Score: 0.06664
10. phys-page39.html Score: 0.06659
Enter `m' to see more, a number to show the nth document, nothing to exit.
Enter command: 1
Showing document 1 in the firefox window.
Enter command: 2
Showing document 2 in the firefox window.
Enter command: 3
Showing document 3 in the firefox window.
Enter command: 4
Showing document 4 in the firefox window.
Enter command: 5
Showing document 5 in the firefox window.
Enter command: 6
Showing document 6 in the firefox window.
Enter command: 7
Showing document 7 in the firefox window.
Enter command: 8
Showing document 8 in the firefox window.
Enter command:
Enter query: information theory
Top 10 matching Documents from most to least relevant:
1. phys-page77.html Score: 0.33049
2. phys-page225.html Score: 0.23927
3. phys-page300.html Score: 0.22832
4. chem-page240.html Score: 0.21768
5. phys-page47.html Score: 0.15728
6. phys-page114.html Score: 0.14943
7. phys-page94.html Score: 0.1466
8. phys-page239.html Score: 0.14327
9. chem-page2.html Score: 0.14256
10. phys-page167.html Score: 0.14181
Enter `m' to see more, a number to show the nth document, nothing to exit.
Enter command: 1
Showing document 1 in the firefox window.
Enter command: 2
Showing document 2 in the firefox window.
Enter command: 3
Showing document 3 in the firefox window.
Enter command: 4
Showing document 4 in the firefox window.
Enter command: 5
Showing document 5 in the firefox window.
Enter command: 6
Showing document 6 in the firefox window.
Enter command: 7
Showing document 7 in the firefox window.
Enter command: 8
Showing document 8 in the firefox window.
Enter command:
Enter query: einstein rosen
Top 10 matching Documents from most to least relevant:
1. phys-page257.html Score: 0.36458
2. phys-page131.html Score: 0.30719
3. phys-page193.html Score: 0.30013
4. phys-page114.html Score: 0.22944
5. phys-page18.html Score: 0.15494
6. phys-page62.html Score: 0.13327
7. phys-page245.html Score: 0.12645
8. phys-page154.html Score: 0.10634
9. phys-page47.html Score: 0.08603
10. phys-page108.html Score: 0.08533
Enter `m' to see more, a number to show the nth document, nothing to exit.
Enter command: 1
Showing document 1 in the firefox window.
Enter command: 2
Showing document 2 in the firefox window.
Enter command: 3
Showing document 3 in the firefox window.
Enter command: 4
Showing document 4 in the firefox window.
Enter command: 5
Showing document 5 in the firefox window.
Enter command: 6
Showing document 6 in the firefox window.
Enter command: 7
Showing document 7 in the firefox window.
Enter command: 8
Showing document 8 in the firefox window.
Enter command: 9
Showing document 9 in the firefox window.
Enter command: 10
Showing document 10 in the firefox window.
Enter command: m
11. phys-page77.html Score: 0.07336
12. phys-page82.html Score: 0.06451
13. phys-page155.html Score: 0.06312
14. phys-page225.html Score: 0.05817
15. phys-page196.html Score: 0.05698
16. phys-page288.html Score: 0.05082
17. phys-page262.html Score: 0.04894
18. phys-page216.html Score: 0.04377
19. phys-page53.html Score: 0.04328
20. phys-page294.html Score: 0.04055
Enter command: 11
Showing document 11 in the firefox window.
Enter command: 12
Showing document 12 in the firefox window.
Enter command: 13
Showing document 13 in the firefox window.
Enter command:
Enter query: human genome
Top 10 matching Documents from most to least relevant:
1. bio-page60.html Score: 0.54776
2. bio-page34.html Score: 0.2903
3. bio-page256.html Score: 0.19846
4. bio-page6.html Score: 0.10647
5. bio-page170.html Score: 0.10526
6. bio-page296.html Score: 0.10052
7. bio-page280.html Score: 0.08524
8. bio-page242.html Score: 0.08312
9. bio-page138.html Score: 0.06435
10. bio-page169.html Score: 0.06308
Enter `m' to see more, a number to show the nth document, nothing to exit.
Enter command: 1
Showing document 1 in the firefox window.
Enter command: 2
Showing document 2 in the firefox window.
Enter command: 3
Showing document 3 in the firefox window.
Enter command: 4
Showing document 4 in the firefox window.
Enter command:
Enter query: computer information science
Top 10 matching Documents from most to least relevant:
1. phys-page173.html Score: 0.23745
2. phys-page135.html Score: 0.11972
3. chem-page68.html Score: 0.08984
4. bio-page100.html Score: 0.08984
5. phys-page10.html Score: 0.08741
6. chem-page200.html Score: 0.08253
7. chem-page16.html Score: 0.0755
8. bio-page170.html Score: 0.0723
9. phys-page39.html Score: 0.06926
10. phys-page161.html Score: 0.0687
Enter `m' to see more, a number to show the nth document, nothing to exit.
Enter command: 1
Showing document 1 in the firefox window.
Enter command: 2
Showing document 2 in the firefox window.
Enter command: 3
Showing document 3 in the firefox window.
Enter command: 4
Showing document 4 in the firefox window.
Enter command: 5
Showing document 5 in the firefox window.
Enter command:
Enter query:
proj1-draft> ^Dexit
Script done on Thu Jan 3 14:51:05 2008