Searching for numbers in help system [message #242943] Thu, 20 May 2004 16:07
Eclipse User
Why do I have to search for numbers by enclosing them in quotes or using
a wildcard character?

I tried a few different searches at http://help.eclipse.org/help21/

Search on:
- results of search

2.1
- returned no results

"2.1"
- returned the expected results

2.*
- returned the expected results

It would be really nice if the search for 2.1 without quotes returned
the same results as a search within quotes. I'm guessing that Lucene's
stemming algorithm is kicking in and causing the unexpected results for
digits; as a workaround, do you think it would be possible to parse the
query and treat any keywords that contain numbers as exact string matches?

Dan
Re: Searching for numbers in help system [message #243246 is a reply to message #242943] Fri, 21 May 2004 13:10
Eclipse User
Ordinary query terms (2.1) are analyzed (on an English system) using the
LowerCaseTokenizer->StopFilter->PorterStemFilter chain from Lucene.
LowerCaseTokenizer treats non-letters as word breaks, so 2.1 yields no
tokens at all.
Quoted terms and terms containing wildcard characters ("2.1" and 2.*) are
analyzed using our own tokenizer, based on java.text, which treats anything
containing a letter or digit as a word. These terms are searched for in a
separate field of the index.

It would be possible to add artificial quotes to query terms that contain
digits in an English locale, but a proper fix would be to create a better
stemming analyzer that preserves digits. Maybe it could still reuse
PorterStemFilter as is, with a different tokenizer plugged into it. Please
open a feature request against the Help component.
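
For the "artificial quotes" workaround, a rough sketch of the query preprocessing might look like the following. This is not code from the Help component. The class and method names (DigitQuoter, quoteDigitTerms) are hypothetical, the locale check is omitted, and treating both * and ? as wildcard characters is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of the Eclipse Help search code.
class DigitQuoter {

    // Wraps each whitespace-separated term that contains a digit in double
    // quotes, unless the term is already quoted, lies inside a quoted
    // phrase, or contains a wildcard character.
    static String quoteDigitTerms(String query) {
        List<String> out = new ArrayList<>();
        boolean inPhrase = false; // true while between an opening and closing quote
        for (String term : query.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            long quotes = term.chars().filter(c -> c == '"').count();
            boolean leaveAlone = inPhrase || quotes > 0;
            if (quotes % 2 == 1) {
                inPhrase = !inPhrase; // an odd number of quotes opens or closes a phrase
            }
            boolean hasWildcard = term.indexOf('*') >= 0 || term.indexOf('?') >= 0;
            boolean hasDigit = term.chars().anyMatch(Character::isDigit);
            if (!leaveAlone && hasDigit && !hasWildcard) {
                out.add("\"" + term + "\"");
            } else {
                out.add(term);
            }
        }
        return String.join(" ", out);
    }
}
```

Under these assumptions, the query 2.1 becomes "2.1", while "2.1" and 2.* pass through unchanged, so all three searches from the original post would take the quoted or wildcard path.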
Konrad
