Relaxation in Web Search:
A New Paradigm for Search by Boolean Queries

Parke Godfrey


The complete paper (8 pages) is available in

March 1998 draft. Submitted.


Abstract

Search by boolean queries suffers from a paradox of precision: if the query is too general (with respect to the corpus), the response is an avalanche; if the query is too specific, the response is empty. It is difficult to cast a query balanced on this scale of specificity that retrieves a reasonable number of items. Nowhere is this more evident, perhaps, than in search on the World Wide Web.

For this reason, many Web search engines employ information retrieval (IR) techniques other than boolean search. Indeed, this problem of specificity is a key reason, in general, that other IR techniques have been researched. For instance, AltaVista's standard interface employs an IR scoring technique across the document list to rank-order a result list. (AltaVista's "advanced" interface is for boolean search.) However, the IR searches have problems too. Quite often, IR search queries result in avalanches. If the IR scoring is not "natural" for the query at hand, the documents-of-interest may be buried in the results, never to be found.

The utility of boolean queries could be saved if a good method to recover from over-specified queries were evident. In this paper, we work to devise such a method. We show how boolean queries can be relaxed automatically, by throwing away just as many keywords as necessary for the query to succeed. With this relaxation facility, a new search paradigm becomes possible: ask a specific query; it will be relaxed to the points of success to find the closest related documents.


Bibliography
  1. [paper] Digital's AltaVista web search engine. Web site.
  2. N. D. Belnap and T. B. Steel. The Logic of Questions and Answers. Yale University Press, New Haven, CT, 1976.
  3. Tom Bylander, Dean Allemang, Michael C. Tanner, and John R. Josephson. The computational complexity of abduction. Artificial Intelligence, 49:25--60, 1991.
  4. A. Colmerauer and J. F. Pique. About natural logic. In Gallaire et al. [12], pages 343-365.
  5. Francisco Corella, S. Jerrold Kaplan, Gio Wiederhold, and Lena Yesil. Cooperative responses to boolean queries. In First International Conference on Data Engineering, pages 77-85, Silver Spring, Maryland, 1984. IEEE Computer Society Press.
  6. Veronica Dahl and Patrick Saint-Dizier, editors. Natural Language Understanding and Logic Programming. North Holland, 1988.
  7. Robert Demolombe and Tomasz Imielinski, editors. Nonstandard Queries and Nonstandard Answers. Studies in Logic and Computation 3. Clarendon Press, Oxford, 1994.
  8. [paper] Excite web search engine. Web site.
  9. Terry Gaasterland. Cooperative Answers for Database Queries. PhD thesis, University of Maryland, Department of Computer Science, College Park, 1992.
  10. Terry Gaasterland, Parke Godfrey, and Jack Minker. An overview of cooperative answering. Journal of Intelligent Information Systems, 1(2):123-157, 1992.
  11. [paper] Terry Gaasterland, Parke Godfrey, and Jack Minker. An overview of cooperative answering. In Demolombe and Imielinski [6], chapter 1, pages 1-40. Appears orginally as [8].
  12. Annie Gal and Jack Minker. A natural language database interface that provides cooperative answers. Proceedings of the Second Conference on Artificial Intelligence Applications, December 11--13 1985.
  13. Annie Gal and Jack Minker. Informative and cooperative answers in databases using integrity constraints. In Dahl and Saint-Dizier [5], pages 277-300.
  14. H. Gallaire, J. Minker, and J-M. Nicolas, editors. Advances in Database Theory, Volume 1. Plenum Press, New York, 1981.
  15. [paper] Parke Godfrey. Minimization in cooperative response to failing database queries. International Journal of Cooperative Information Systems (IJCIS), 6(2):95-149, June 1997.
  16. Jürgen M. Janas. On the feasibility of informative answers. In Gallaire et al. [12], pages 397-414.
  17. A. Joshi, B. Webber, and I. Sag, editors. Elements of Discourse Understanding. Cambridge University Press, 1981.
  18. S. Jerrold Kaplan. Appropriate responses to inappropriate questions. In Joshi et al. [15], pages 127-144.
  19. S. Jerrold Kaplan. Cooperative responses from a portable natural language query system. Artificial Intelligence, 19(2):165-187, October 1982.
  20. Ronald M. Lee. Conversational aspects of database interactions. In Proceedings of the 4\em th International Conference on Very Large Data Bases, Berlin, 1978.
  21. [paper] Lexis-nexis. Web site.
  22. Udi Manber, Michael Smith, and Burra Gopal. WebGlimpse---combining browsing and searching. In Proceedings of the Usenix Technical Conference, Los Angeles, California, January 1997.
  23. Udi Manber and Sum Wu. GLIMPSE: A tool to search through entire file systems. Technical Report TR 93-34, Department of Computer Science, The University of Arizona, Tucson, Arizona, October 1993.
  24. Amihai Motro. SEAVE: A mechanism for verifying user presuppositions in query systems. ACM Transactions on Office Information Systems, 4(4):312--330, October 1986.
  25. Amihai Motro. FLEX: A tolerant and cooperative user interface to databases. IEEE Transactions on Knowledge and Data Engineering, 2(2):231-246, June 1990.
  26. [paper] NASA's earth observing system (EOS). Web site.
  27. Peter Frederick Strawson. Introduction to Logical Theory. Methuen, London, 1974.
  28. William A. Woods. Beyond ignorance-based systems (abstract). In Jon Doyle, Erik Sandewall, and Pietro Torasso, editors, Proceedings of the 4th International Conference on Principles of Knowledge Representation and Reasoning (KR'94), page 646, Bonn, Germany, May 1994. Morgan Kaufmann.
  29. William A. Woods. Beyond search engines. Talk at University of Maryland Institute for Advanced Studies Colloquium Series, September 1997.