Re: [che-dev] Lucene search enhancements

From: "Sirich, Rima" <rima.sirich@xxxxxxx>
Date: Mon, 25 May 2015 12:34:19 +0000

Hello Sergii,

First of all, thank you very much for the prompt response :)

I am glad to hear that we agree on point 2.

Regarding point 1, I am not sure I understand the proposed solution. Could you please elaborate on it a little more?

As you can see in the QueryParser documentation ( http://lucene.apache.org/core/5_0_0/queryparser/org/apache/lucene/queryparser/classic/QueryParserBase.html#setAllowLeadingWildcard(boolean) ), the allowLeadingWildcard property is configurable and is set to false by default because of a possible performance penalty on big indexes. So maybe the decision whether to enable it should be made by the client rather than by the server, as I had proposed. I think I should add another query parameter, allowLeadingWildcard, and enable leading wildcards only if the client has explicitly set its value to true.

I should replace this code:

    if (text.startsWith("*") || text.startsWith("?")) {
        qParser.setAllowLeadingWildcard(true);
    }

with this one:

    qParser.setAllowLeadingWildcard(allowLeadingWildcard);
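
To make the client-side opt-in concrete, here is a minimal sketch in plain Java with no Lucene dependency. The class name, the method names, and the idea of receiving the query parameters as a Map are assumptions for illustration only and are not taken from the pull request. The one point it is meant to show is that a missing or non-"true" parameter leaves leading wildcards disabled.

```java
import java.util.Map;

// Hypothetical helper, not part of che-core: reads the proposed
// allowLeadingWildcard query parameter and defaults to false.
final class LeadingWildcardOption {

    // Parameter name as proposed in this mail.
    static final String PARAM = "allowLeadingWildcard";

    // Returns true only if the client explicitly sent "true" (case-insensitive).
    // Boolean.parseBoolean(null) is false, so an absent parameter keeps the default.
    static boolean fromParams(Map<String, String> params) {
        return Boolean.parseBoolean(params.get(PARAM));
    }

    // Same check the current code uses to detect a leading wildcard.
    static boolean startsWithWildcard(String text) {
        return text.startsWith("*") || text.startsWith("?");
    }

    // A query is accepted if it has no leading wildcard, or if the client opted in.
    static boolean isAccepted(String text, Map<String, String> params) {
        return fromParams(params) || !startsWithWildcard(text);
    }
}
```

In the real code, the value returned by fromParams would simply be passed to qParser.setAllowLeadingWildcard(...), and the parser itself would decide whether to reject the query.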

What do you think?

Thanks & Best Regards,

Rima Sirich

From: che-dev-bounces@xxxxxxxxxxx [mailto:che-dev-bounces@xxxxxxxxxxx] On Behalf Of Sergii Kabashniuk
Sent: Monday 25 May 2015 14:38
To: che developer discussions
Subject: Re: [che-dev] Lucene search enhancements

Hello

On Mon, May 25, 2015 at 1:10 PM, Sirich, Rima <rima.sirich@xxxxxxx> wrote:

Hello,

I would like to enhance the Lucene search capabilities, so I created a pull request

( https://github.com/codenvy/che-core/pull/77 ) with the following changes:

1. queryParser.setAllowLeadingWildcard(true) enables "contains" queries ( like *uer* ).

I'm not sure about that. I think different vendors may want custom QueryParser behaviour. WDYT, maybe we should allow extending it in the same way we do for Analyzer, Directory, or IndexWriter?

2. SimpleAnalyzer uses LetterTokenizer with LowerCaseFilter, which means that it creates tokens by breaking the text at non-letter characters. Therefore, in the current implementation it is not possible to search for text containing special characters. Using WhitespaceTokenizer solves this issue, as it creates tokens by breaking the text at whitespace.

I tend to agree with you. If the discussion of 1 turns out to be long, maybe you should decouple these tasks.

I would very much appreciate it if you could have a look at this pull request and consider merging it to master.
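
The difference described in point 2 can be illustrated with a small plain-Java imitation of the two tokenization strategies. This is only a sketch: it does not use Lucene, the class and method names are made up for illustration, and the real tokenizers handle more cases (for example token length limits and supplementary characters).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical imitation of the two strategies, not Lucene code.
final class TokenizerSketch {

    // Imitates LetterTokenizer + LowerCaseFilter:
    // tokens are maximal runs of letters, lowercased.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Imitates WhitespaceTokenizer: tokens are split at whitespace only,
    // so special characters (and the original case) are preserved.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }
}
```

For the input "Foo_bar = my-value;", letterTokens yields foo, bar, my, value, while whitespaceTokens yields Foo_bar, =, my-value;. Note that whitespace splitting on its own does not lowercase, so case-insensitive search would presumably still need a lowercasing step.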

Thanks & Best Regards,

Rima Sirich

Sergii Kabashniuk
