
Re: [che-dev] Lucene search enhancements

Hello Sergii,

 

First of all, thank you very much for the prompt response :)

I am glad to hear that we agree on 2.

 

Regarding 1, I am not sure I understand the proposed solution. Could you please elaborate on it a little more?

As you can see in the QueryParser documentation http://lucene.apache.org/core/5_0_0/queryparser/org/apache/lucene/queryparser/classic/QueryParserBase.html#setAllowLeadingWildcard(boolean)

the allowLeadingWildcard property is configurable and defaults to false because of the possible performance penalty on big indexes. So maybe the decision whether to enable it should be made by the client rather than on the server, as I originally proposed. I think I should add another query parameter, allowLeadingWildcard, and enable leading wildcards only if the client has explicitly set its value to true.

I should replace this code

if (text.startsWith("*") || text.startsWith("?")) {
    qParser.setAllowLeadingWildcard(true);
}

With this one

qParser.setAllowLeadingWildcard(allowLeadingWildcard);
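
To make the idea concrete, here is a minimal sketch of the client-controlled flag. It uses only the Java standard library, so it runs without Lucene on the classpath. The class name LeadingWildcardPolicy and its methods are hypothetical and not part of che-core; the assumption is that the query parameters arrive as a simple string map and that the resulting boolean is what gets passed to qParser.setAllowLeadingWildcard(...).

```java
import java.util.Map;

// Hypothetical helper, not part of che-core: decides whether leading
// wildcards are enabled based on an explicit client-supplied parameter.
final class LeadingWildcardPolicy {
    // Parameter name taken from the proposal in this mail.
    static final String PARAM = "allowLeadingWildcard";

    /** Returns true only if the client explicitly sent allowLeadingWildcard=true. */
    static boolean fromQueryParams(Map<String, String> params) {
        // Boolean.parseBoolean(null) is false, so an absent parameter
        // keeps the Lucene default (leading wildcards disabled).
        return Boolean.parseBoolean(params.get(PARAM));
    }

    /** True if the search text begins with a wildcard character (* or ?). */
    static boolean hasLeadingWildcard(String text) {
        return text.startsWith("*") || text.startsWith("?");
    }

    public static void main(String[] args) {
        boolean allow = fromQueryParams(Map.of("text", "*uer*", PARAM, "true"));
        System.out.println("allowLeadingWildcard = " + allow);
        // In the real code this value would be handed to the parser:
        // qParser.setAllowLeadingWildcard(allow);
    }
}
```

With this shape, a query such as *uer* sent without the parameter would still be rejected by the parser, which is the default Lucene behaviour; hasLeadingWildcard is only there in case the server wants to return a clearer error message in that situation.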

 

What do you think?

Thanks & Best Regards,

Rima Sirich

 

From: che-dev-bounces@xxxxxxxxxxx [mailto:che-dev-bounces@xxxxxxxxxxx] On Behalf Of Sergii Kabashniuk
Sent: Monday 25 May 2015 14:38
To: che developer discussions
Subject: Re: [che-dev] Lucene search enhancements

 

Hello

 

On Mon, May 25, 2015 at 1:10 PM, Sirich, Rima <rima.sirich@xxxxxxx> wrote:

Hello,

 

I would like to enhance the Lucene search capabilities, so I created a pull request

( https://github.com/codenvy/che-core/pull/77 ) with the following changes:

 

1. queryParser.setAllowLeadingWildcard(true) enables "contains" queries ( like *uer* )

I'm not sure about that. I think different vendors may want custom QueryParser behaviour. What do you think: maybe we should allow extending it, in the same way as we do for Analyzer, Directory or IndexWriter?

2. SimpleAnalyzer uses LetterTokenizer with LowerCaseFilter, which means that it creates tokens by breaking the text at non-letter characters; therefore, in the current implementation it is not possible to search for text containing special characters. Using WhitespaceTokenizer solves this issue, as it creates tokens by breaking the text at whitespace.

 

I tend to agree with you. If the discussion of 1 takes long, maybe you should decouple these tasks.
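
The difference described in point 2 can be illustrated with a small standard-library sketch. This is an emulation of the two splitting rules, not Lucene code: the class TokenizerSketch and its methods are hypothetical, and the whitespace variant here does not lowercase (whether the pull request adds a lowercase filter on top is not stated in this thread).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical emulation of the two tokenization rules; not Lucene classes.
final class TokenizerSketch {

    /** Emulates LetterTokenizer + LowerCaseFilter: split at non-letters, then lowercase. */
    static List<String> letterLowercaseTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /** Emulates WhitespaceTokenizer: split at whitespace only, keep everything else. */
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "user_name = getUser2();";
        System.out.println(letterLowercaseTokens(line)); // [user, name, getuser]
        System.out.println(whitespaceTokens(line));      // [user_name, =, getUser2();]
    }
}
```

With the letter-based rule the underscore, the digit and the punctuation never reach the index, so a search for user_name or getUser2 cannot match as a single term; with the whitespace rule those characters stay inside the tokens.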

 

 

I would very much appreciate it if you could have a look at this pull request and consider merging it to master.

 

Thanks & Best Regards,

Rima Sirich


_______________________________________________
che-dev mailing list
che-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/che-dev

 

Sergii Kabashniuk

