Project Plan For SMILA, version 1.2
Introduction
SMILA is an extensible framework for building big data and/or search solutions to access and process unstructured information in
the enterprise. Besides providing essential infrastructure components and services, SMILA also delivers
ready-to-use add-on components, such as connectors to the most relevant data sources. Using the framework as their
basis enables developers to concentrate on creating higher-value solutions, such as semantic-driven
applications.
Release Deliverables
- Core and add-ons (core components plus ready-to-use add-on components such as various data connectors and BPEL services), available as a compressed archive (ZIP file).
Release Milestones
| 1.2 M1 | 2013-02-27 |
The target date for availability of SMILA 1.2 is April 17th, 2013.
Target Environments
SMILA 1.2 depends on Equinox 4.2. For this release, the sources will be written and compiled against
Java Development Kit (JDK) 7 and designed to run on Java Runtime Environment (JRE) 7, Standard Edition.
Themes and Priorities
SMILA 1.2
Proposed
- Apache Tika integration - extracting text from binary content
- JDBC-Crawler: Splitting functionality for scaling
- Web-Crawling enhancements (robots.txt, boilerpipe integration)
- Remote Crawling
- Cluster setup tutorial
