Re: [smila-user] File formats supported by SMILA (file) crawlers?
Dear SMILA user,
To access information in files with formats other than plain text and html, we plan to integrate the Aperture (http://aperture.sourceforge.net/) conversion/extraction libraries into SMILA. The initial Aperture integration hasn't been made public yet because we were waiting for the new Aperture release. That release has just come out under a BSD license, which makes it much easier to get through the IP process that we need to apply to all 3rd party libs that we want to use and distribute with SMILA.
So, you did nothing wrong - SMILA currently doesn't support any other file format besides plain text or html/xml, but this will change soon.
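Until that changes, it may help to check up front which files under a crawl directory fall into the formats mentioned above. The following is only a rough sketch and not part of SMILA: the function name `split_by_support` and the extension list `TEXT_LIKE` are assumptions made for illustration, and SMILA itself decides by its own configuration, not by this list.

```python
from pathlib import Path

# Assumption: illustrative extensions for "plain text or html/xml".
# SMILA does not define this set; adjust it to your own data.
TEXT_LIKE = {".txt", ".html", ".htm", ".xml"}


def split_by_support(base_dir):
    """Return (indexable, skipped) lists of files found under base_dir.

    'indexable' holds files whose extension is in TEXT_LIKE,
    'skipped' holds everything else (e.g. .pdf, .doc, .zip, images).
    """
    indexable, skipped = [], []
    for path in sorted(Path(base_dir).rglob("*")):
        if not path.is_file():
            continue  # ignore directories
        if path.suffix.lower() in TEXT_LIKE:
            indexable.append(path)
        else:
            skipped.append(path)
    return indexable, skipped
```

Calling `split_by_support` on the directory configured as <BaseDir> gives a quick overview of which files would need the planned Aperture integration before their content can be indexed.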
Cheers
Igor
> -----Original Message-----
> From: smila-user-bounces@xxxxxxxxxxx [mailto:smila-user-bounces@xxxxxxxxxxx] On
> Behalf Of bin.immer@xxxxxxxxxxx
> Sent: Tuesday, 1 September 2009 12:15
> To: smila-user@xxxxxxxxxxx
> Subject: [smila-user] File formats supported by SMILA (file) crawlers?
>
> Hello there,
>
> I found the note that "currently only plain text and html files are crawled and
> indexed correctly by SMILA crawlers" in your 5 minutes to success documentation.
> Is this information still up to date? I guess it was written based on 0.5 M1.
>
> I also did some quick tests today, trying to crawl .pdf, .doc and so on via the
> file crawler. SMILA "only" found plain text and html files, .zip and images. Or
> did I do something wrong? Within
> configuration/org.eclipse.smila.connectivity.framework/file.xml I included all
> file extensions I wanted SMILA to find and set a new <BaseDir>, but changed/configured
> nothing else.
>
> Keep up the great work and thank you very much.
> _______________________________________________
> smila-user mailing list
> smila-user@xxxxxxxxxxx
> https://dev.eclipse.org/mailman/listinfo/smila-user