Re: [smila-dev] SMILA/Specifications/CrawlerAPIDiscussion09
Hi Allan,
Thank you for your response to the crawler API discussion
(http://wiki.eclipse.org/SMILA/Specifications/CrawlerAPIDiscussion09).
This very important question had been left in a frozen state.
In my opinion, a crawler developer should not need to know anything about
SMILA's internal objects and transports (MObject, Record, Delta Indexing,
SCA, etc.). They should only have to implement a simple, understandable
data-source iterator.
An approximate interface:
interface Crawler {
    void start(IndexOrderConfiguration config);
    DataSourceReference next();
    void finish();
}
interface DataSourceReference {
    Object getAttribute(String name);
    byte[] getAttachment(String name);
}
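To make the proposal a bit more concrete, here is a minimal sketch of how a crawler and its caller might look under this interface. Everything beyond the two interfaces is an assumption made for illustration: `IndexOrderConfiguration` is reduced to a hypothetical stand-in holding a list of paths, `InMemoryCrawler` and `CrawlerDemo` are invented names, the attribute name "path" and attachment name "content" are arbitrary, and `next()` returning null at the end of the data source is only one possible convention.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the real configuration type: just a list of paths.
class IndexOrderConfiguration {
    final List<String> paths;
    IndexOrderConfiguration(List<String> paths) { this.paths = paths; }
}

interface DataSourceReference {
    Object getAttribute(String name);
    byte[] getAttachment(String name);
}

interface Crawler {
    void start(IndexOrderConfiguration config);
    DataSourceReference next(); // assumption: null means "no more entries"
    void finish();
}

// A toy crawler that iterates over the configured paths, without touching
// any SMILA-internal types.
class InMemoryCrawler implements Crawler {
    private Iterator<String> it;

    public void start(IndexOrderConfiguration config) {
        it = config.paths.iterator();
    }

    public DataSourceReference next() {
        if (it == null || !it.hasNext()) {
            return null;
        }
        final String path = it.next();
        return new DataSourceReference() {
            public Object getAttribute(String name) {
                return "path".equals(name) ? path : null;
            }
            public byte[] getAttachment(String name) {
                return "content".equals(name)
                        ? ("content of " + path).getBytes(StandardCharsets.UTF_8)
                        : null;
            }
        };
    }

    public void finish() {
        it = null;
    }
}

class CrawlerDemo {
    // The caller (e.g. a controller) drives the iterator and would be the
    // only place that builds framework-specific objects from each reference.
    static List<String> crawl(Crawler crawler, IndexOrderConfiguration config) {
        List<String> seen = new ArrayList<>();
        crawler.start(config);
        try {
            DataSourceReference ref;
            while ((ref = crawler.next()) != null) {
                seen.add((String) ref.getAttribute("path"));
            }
        } finally {
            crawler.finish();
        }
        return seen;
    }

    public static void main(String[] args) {
        IndexOrderConfiguration config =
                new IndexOrderConfiguration(Arrays.asList("a.txt", "b.txt"));
        System.out.println(crawl(new InMemoryCrawler(), config)); // prints [a.txt, b.txt]
    }
}
```

The point of the sketch is only that the crawler author writes the three methods and nothing else; how the caller turns each reference into records is left open here.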
I would be glad to hear and discuss other ideas and opinions.
--
Ivan
Allan Kaufmann wrote:
Hi all,
I have read the interesting discussion about the crawler API
(http://wiki.eclipse.org/SMILA/Specifications/CrawlerAPIDiscussion09).
In my opinion the crawler API is currently not easy to understand,
but I believe making it so should be a goal if you want the project to
attract users and developers who enjoy working with it. I looked at the
filesystem crawler sample in your current SMILA trunk and needed a lot of
time to understand it.
So what about keeping the crawler API simple, as discussed on that page?
I think a nice way would be to reduce the MObject and Record creation to
make it easier, perhaps by delivering all information together to the
CrawlerController in an ArrayList. I understand that you probably need
communication between the CrawlerController and the crawler to make
generation indexing possible. So what about the second alternative,
in which getNextDeltaIndexing returns a Record? In that case the
CrawlerController receives the ID and hash. Then, if the information
has changed, the getRecord method delivers the other attributes, also as
a Record, and the CrawlerController can merge the two. I think that would
be easier to understand, but the other alternatives discussed on that
page are also worth discussing and deciding on.
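A rough Java sketch of this second alternative might look like the following. Only the method names getNextDeltaIndexing and getRecord come from the discussion; their signatures, the use of a plain Map as a stand-in for a Record, the keys "id", "hash" and "title", the null-at-end convention, and the names `TwoStepCrawlerSketch`, `TwoStepCrawler` and `sample` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TwoStepCrawlerSketch {

    // Hypothetical crawler contract: a cheap first step, an expensive second one.
    interface TwoStepCrawler {
        // Returns a record with only "id" and "hash"; null at the end (assumption).
        Map<String, Object> getNextDeltaIndexing();
        // Returns the remaining attributes for the given id.
        Map<String, Object> getRecord(String id);
    }

    // Controller side: skip unchanged entries, otherwise fetch the remaining
    // attributes and merge them into the id/hash record.
    static List<Map<String, Object>> crawl(TwoStepCrawler crawler,
                                           Map<String, String> knownHashes) {
        List<Map<String, Object>> changed = new ArrayList<>();
        Map<String, Object> delta;
        while ((delta = crawler.getNextDeltaIndexing()) != null) {
            String id = (String) delta.get("id");
            String hash = (String) delta.get("hash");
            if (hash.equals(knownHashes.get(id))) {
                continue; // hash unchanged: no need to call getRecord
            }
            Map<String, Object> merged = new LinkedHashMap<>(delta);
            merged.putAll(crawler.getRecord(id));
            knownHashes.put(id, hash);
            changed.add(merged);
        }
        return changed;
    }

    // Toy data source with two entries, used only to exercise the loop.
    static TwoStepCrawler sample() {
        final String[][] items = { {"doc1", "h1"}, {"doc2", "h2"} };
        return new TwoStepCrawler() {
            private int pos = 0;

            public Map<String, Object> getNextDeltaIndexing() {
                if (pos >= items.length) {
                    return null;
                }
                Map<String, Object> delta = new LinkedHashMap<>();
                delta.put("id", items[pos][0]);
                delta.put("hash", items[pos][1]);
                pos++;
                return delta;
            }

            public Map<String, Object> getRecord(String id) {
                Map<String, Object> rest = new LinkedHashMap<>();
                rest.put("title", "Title of " + id);
                return rest;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, String> known = new HashMap<>();
        known.put("doc1", "h1"); // doc1 was seen before with the same hash
        System.out.println(crawl(sample(), known));
        // prints [{id=doc2, hash=h2, title=Title of doc2}]
    }
}
```

In this form the crawler never decides what has changed; it only answers the two calls, and the comparison and merging stay with the controller.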
Greetings
Allan
Allan Kaufmann
brox IT-Solutions GmbH
An der Breiten Wiese 9
30625 HANNOVER (Germany)
Tel: +49 (5 11) 33 65 28 – 67
eFax: +49 (5 11) 33 65 28 – 98 78
Fax: +49 (5 11) 33 65 28 – 29
Mail: akaufmann@xxxxxxx
Web: http://www.brox.de/
_______________________________________________
smila-dev mailing list
smila-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/smila-dev