Abstract
There have been many attempts in the history of Information Retrieval (IR) to add some linguistic capabilities to standard IR systems in order to improve their performance (mainly, their precision). These attempts have not been very successful so far, at least not in the standard IR settings.
The two main reasons are the (related but not identical) problems of data volume and of scalability. First, the volume of data typically processed by IR systems is so large that the use of more than a few isolated linguistic components seems out of the question, and linguistic components do not work well in isolation. Second, NLP systems that work reasonably well in small-scale laboratory contexts will often not scale up to real-world domains like those for which IR is standardly used. Both of these points seem to all but rule out the use of full-fledged NLP methods in standard text retrieval applications.