Project Plan
All previous releases and milestones are documented in the project plan archive.
SMILA Version 1.2
This release was published on April 17, 2013.
- Apache Tika integration: extracting text from binary content
- JDBC-Crawler: Splitting functionality for scaling
- Web-Crawling enhancements (robots.txt, boilerpipe integration)
- Remote-Crawling
- Cluster setup tutorial
Further plans
- HDFS objectstore
- Solr 4 integration/clustering
- Alternative (Scripting) engine for synchronous workflows
- Basic MapReduce support
- General configuration management