Code Recommenders is a proposed open source project under the Eclipse Technology Container Project.
This proposal is in the Project Proposal Phase (as defined in the Eclipse Development Process) and is written to declare its intent and scope. We solicit additional participation and input from the Eclipse community. Please send all feedback to the Eclipse Proposals Forum.
This proposal is structured as follows. Section "Background" gives the motivation of the project and provides some background information about the origins of the proposed project, namely, the Code Recommenders Project developed at Darmstadt University of Technology. Section "Scope" outlines the initial set of tools and platforms this project aims to deliver to its users. Section "Initial Contributions" describes the current state of the project and the initial contributions that will be made. Section "Description" gives a little more detail on the intermediate goals. Section "Related Eclipse Projects" describes potential future connections between current Eclipse Projects and the Code Recommenders project as well as likely collaborations. The remaining sections (Committers, Mentors, Interested Parties, Additional Information) describe what their names suggest.
Under the right circumstances, groups are remarkably intelligent and are often better than the smartest person in them. - James Surowiecki: "The Wisdom of Crowds"
Application frameworks have become an integral part of today's software development. This is hardly surprising given their promised benefits, such as reduced costs, higher quality, and shorter time to market. But using an application framework is not free of cost. Before frameworks can be used efficiently, software developers have to learn their correct usage, which often results in high initial training costs.
To reduce these training costs, framework developers provide diverse documentation addressing different information needs. Tutorials, for instance, describe typical usage scenarios and thus give the application developer an initial insight into the workings of the framework. However, their benefit quickly disappears when the problem to be solved differs from the standard usage scenarios. At that point, API documentation becomes the most important resource for software developers. Developers scan the documentation for hints relevant to the problem at hand, but if it does not provide the required information, the most costly part of the research begins: they investigate the source code of other programs that successfully use the framework in a similar way. Learning correct framework usage from these real-world examples is difficult, however, because the examples also contain application-specific code that obscures what is really important for using the framework. This significantly complicates understanding and again makes the training a challenging and time-consuming task. Nevertheless, the source code of other applications seems to be a valuable source of information. Code-search engines like Google Code Search or Krugle owe their popularity not least to the fact that existing framework documentation seems insufficient to support developers in their daily work.
But despite their widespread use, it is an open question whether code-search engines solve the problem of missing documentation in a satisfactory manner. When looking at how developers use code-search engines, it turns out that they rarely create a single query and study just a single example; instead, they typically refine their queries several times, investigate a number of examples, compare them to each other, and try to extract a pattern that underlies all these examples, i.e., a common way to use the API in question.
Although this task is very time-consuming, analyzing example code seems worth doing. Apparently, example code provides important insights into how to use a given API. Given this observation, the question arises whether such information can be extracted from example code automatically, i.e., without large manual effort. Furthermore, if valuable information can be found, how can these findings be made accessible to support developers in their daily work?
The Code Recommenders project developed at Darmstadt University of Technology investigates exactly these two questions. In a nutshell, tools are developed that automatically analyze large-scale code repositories, extract various interesting data from them, and integrate this information back into the IDE, where it is reused by developers in their daily work. The vision of the project is to create a context-sensitive IDE that learns from its users what is relevant in a given code situation and, in turn, gives this knowledge back to other users. If you like, you may think of it as a collaborative way of sharing knowledge through the IDE.
This Eclipse proposal is the next step towards the goal of building the next generation of collaborative IDE services, which we call "the IDE 2.0", inspired by the success of Web 2.0. The complete vision and an explanation of the analogy between IDE 2.0 and Web 2.0 are given in "IDE 2.0: Collective Intelligence in Software Development", published at the Working Conference on the "Future of Software Engineering Research (FoSER) 2010".
However, the scope of the Recommenders project is not limited to these kinds of tools, and we encourage the community to discuss ideas for new tools that might be helpful for software engineers.
Components like the Stacktrace search engine or the API Usage bug detector are still under development and will follow when ready.
The proposed namespace of the project will be
The goal of the Code Recommenders project is to build IDE tools, such as intelligent code completion and extended API docs, that continuously improve themselves by leveraging implicit and explicit knowledge about how APIs are used by their clients and, in turn, give this information back to other developers to ease their work with new and unfamiliar frameworks and development environments.
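To make the idea of usage-driven code completion more concrete, the following is a minimal sketch and not the project's actual implementation. It assumes that observed API usages can be reduced to (receiver type, method name) pairs and ranks completion proposals by plain call frequency; the class `UsageBasedCompletion` and its methods `observeCall` and `recommend` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch only: ranks method proposals for a receiver type
 * by how often each method was called in previously observed example code.
 * All names are hypothetical and not part of any Eclipse API.
 */
public class UsageBasedCompletion {

    // receiver type -> (method name -> number of observed calls)
    private final Map<String, Map<String, Integer>> callCounts = new HashMap<>();

    /** Records one observed call of methodName on an instance of receiverType. */
    public void observeCall(String receiverType, String methodName) {
        callCounts
            .computeIfAbsent(receiverType, type -> new HashMap<>())
            .merge(methodName, 1, Integer::sum);
    }

    /**
     * Returns up to 'limit' method names for receiverType, most frequently
     * observed first; ties are broken alphabetically for stable output.
     */
    public List<String> recommend(String receiverType, int limit) {
        Map<String, Integer> counts =
            callCounts.getOrDefault(receiverType, Collections.<String, Integer>emptyMap());

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        List<String> proposals = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            if (proposals.size() >= limit) {
                break;
            }
            proposals.add(entry.getKey());
        }
        return proposals;
    }
}
```

Each call found in example code would be fed into `observeCall`, and the completion popup would then show `recommend(type, n)` ahead of the remaining, alphabetically sorted proposals. A context-sensitive recommender would also have to take the surrounding code into account, which this frequency count deliberately ignores.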
In the current state of the initial contribution, these systems are fed more or less manually by an administrator who collects example applications from large code repositories like EclipseSource's Yoxos and then starts the analysis and data extraction process to build new models. This approach may be further automated to leverage the existing infrastructure of the Eclipse Marketplace and p2 to continuously scan and update API usages and build up-to-date models for the Eclipse APIs.
Unfortunately, such a manual approach does not scale well if potentially thousands of (non-Eclipse-based) frameworks are to be supported. It is simply too difficult to find enough example applications to make this approach work. Thus, in the long term this manual data collection process should be replaced by a community-driven approach in which users can voluntarily share their knowledge about how to use these APIs by giving either explicit or implicit feedback (cf. the position paper about user feedback and information sharing). Clearly, special requirements for privacy have to be met so that no individual's private data or company's critical data is collected or published. Different models of data sharing have to be developed and discussed with the community.
As one of the first steps, a platform allowing developers to share knowledge will be developed, and the existing tools (i.e., intelligent code completion and usage-driven Javadocs) will be based on these concepts as a proof of concept. A community-driven approach may follow.
The following individuals are proposed as initial committers to the project: The Code Recommenders project is developed at Darmstadt University of Technology. The project is led by Marcel Bruch and advised by Mira Mezini. Although the number of initial committers is low, we expect this set to grow quickly. In the past, the project was supported by more than 50 students through various hands-on trainings and bachelor and master theses; future contributions will be made directly under the proposed project. Thus, the initial committers will be
We welcome additional committers and contributions.
The following Architecture Council members will mentor this project:
The following individuals, organisations, companies and projects have expressed interest in this project:
22-November-2010: Updated Initial Contributions (added proposed namespace), Interested Parties (added new interested parties), Mentors (added second mentor), Committers (added three initial committers)