CloMan: A Clone Management Tool
We have been developing an IDE-based code clone management system to flexibly detect, manage, and refactor both exact and near-miss code clones. Detection relies on a k-difference hybrid suffix tree algorithm, which finds both kinds of clones efficiently. We have implemented the algorithm as a plugin to the Eclipse IDE, and we are extending the plugin to support real-time code clone management with semi-automated refactoring during the actual development process.
The current prototype of the clone search tool is available here.
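To illustrate what "k-difference" means for near-miss clones, the sketch below labels two token sequences as clones when they differ by at most k token edits. It is only an illustration under stated assumptions: it uses a plain dynamic-programming edit distance rather than the hybrid suffix tree used in CloMan, it tokenizes on whitespace, and the names `edit_distance` and `classify_clone` are hypothetical.

```python
def edit_distance(a, b):
    """Token-level Levenshtein distance between sequences a and b."""
    prev = list(range(len(b) + 1))
    for i, token_a in enumerate(a, 1):
        cur = [i]
        for j, token_b in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                         # delete token_a
                cur[j - 1] + 1,                      # insert token_b
                prev[j - 1] + (token_a != token_b),  # substitute or match
            ))
        prev = cur
    return prev[-1]


def classify_clone(fragment_a, fragment_b, k=2):
    """Label a pair of fragments using whitespace tokens (an assumption)."""
    distance = edit_distance(fragment_a.split(), fragment_b.split())
    if distance == 0:
        return "exact"
    return "near-miss" if distance <= k else "not a clone"


if __name__ == "__main__":
    print(classify_clone("int x = a + b ;", "int y = a + b ;"))
```

Comparing every pair of fragments this way is quadratic in both the number of fragments and their length, which is the kind of cost a suffix-tree-based detector is meant to avoid.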
SentiStrength-SE: A Sentiment Detection Tool for the Software Engineering Domain
Automated sentiment analysis of software engineering textual artifacts has long suffered from the inaccuracy of the few tools available for the purpose. We conduct an in-depth qualitative study to identify the difficulties responsible for this low accuracy. The exposed difficulties are then carefully addressed in developing SentiStrength-SE, a tool for improved sentiment analysis designed specifically for the software engineering domain.
The prototype of the tool, along with the required dictionary, can be downloaded from here.
Code Clone Refactoring Scheduler
Duplicated code, also known as code clones, is one of the malicious 'code smells' that often need to be removed through refactoring to enhance maintainability. Among all the potential refactoring opportunities, the choice and order of a set of refactoring activities may have a distinguishable effect on the design/code quality. Moreover, there may be dependencies and conflicts among those refactoring activities. The organization may also impose priorities on certain refactoring activities. Given all these conflicts, priorities, and dependencies, manually formulating an optimal refactoring schedule is very expensive, if not impossible. Therefore, an automated refactoring scheduler is necessary, one that maximizes benefit and minimizes refactoring effort. We propose a refactoring effort model, and apply a constraint programming approach for conflict-aware optimal scheduling of code clone refactoring.
The OPL implementation of the code clone refactoring scheduler and the normalized data can be found here.
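The shape of the scheduling problem can be seen in a toy example. The sketch below is not the OPL model: it replaces the constraint programming solver with exhaustive search, treats benefit minus effort as the objective, uses ordering only to satisfy dependencies, and omits priorities. The function name `best_schedule`, the refactoring names, and all numbers are hypothetical.

```python
from itertools import combinations, permutations


def best_schedule(gain, effort, requires, conflicts):
    """Return (ordered refactorings, net benefit) found by exhaustive search.

    gain, effort: dict mapping refactoring name -> number
    requires: set of (a, b) pairs; b may only be applied after a
    conflicts: set of frozensets {a, b} that cannot both be applied
    """
    best, best_score = (), 0
    items = sorted(gain)
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            chosen = set(subset)
            # Reject subsets containing a conflicting pair.
            if any(pair <= chosen for pair in conflicts):
                continue
            # Reject subsets that miss a prerequisite.
            if any(b in chosen and a not in chosen for a, b in requires):
                continue
            score = sum(gain[r] - effort[r] for r in subset)
            if score <= best_score:
                continue
            # Look for an order that respects every dependency.
            for order in permutations(subset):
                pos = {r: i for i, r in enumerate(order)}
                if all(pos[a] < pos[b] for a, b in requires if b in chosen):
                    best, best_score = order, score
                    break
    return list(best), best_score


if __name__ == "__main__":
    gain = {"extract_method": 8, "pull_up": 6, "inline": 5}
    effort = {"extract_method": 3, "pull_up": 2, "inline": 4}
    requires = {("extract_method", "pull_up")}
    conflicts = {frozenset({"extract_method", "inline"})}
    print(best_schedule(gain, effort, requires, conflicts))
```

Exhaustive search grows factorially with the number of candidate refactorings, which is why a dedicated solver is the practical choice for realistic inputs.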
Genealogical Study on Clone Removal
This is an empirical study, based on clone genealogies from a significant number of releases of open-source software systems, that characterizes the patterns of clone change and removal in evolving software systems. For this work, we made significant extensions to the basic gCad (clone genealogy extractor), which was originally developed by Ripon K. Saha.
The extended version of gCad can be downloaded from here. This distribution also includes the NiCad-2.6.3 clone detector that is necessary for gCad's operation.
A Study on API Usability
Software development today depends largely on API libraries, frameworks, and reusable components. However, while writing client code against these APIs, developers often face difficulties that increase development cost (e.g., time, effort) and lower code quality. In this regard, we study 1,513 bug-posts across five different bug repositories, using qualitative and quantitative analysis, including topic modeling.
This work makes three main contributions. First, we identify the API usability issues that are reflected in the bug-posts from API users, and distinguish the relative significance of the usability factors. Second, from the lessons learned through manual investigation of the bug-posts, we propose recommendations for designing APIs with better usability. Third, we demonstrate how topic modeling techniques can be applied for concept localization in bug-reports, and explore avenues for automating similar studies on a larger scale.
ZibJana: A Sensor Fusion Framework
ZibJana is a localization application we developed (joint work with Farjana Zebin Eishita) on an HTC Magic smartphone running Android 1.6. ZibJana collects sensor information from the smartphone’s built-in GPS receiver, camera, and accelerometer. It then applies a Kalman filter to combine the data from these sources, yielding a more accurate estimate of the smartphone’s location than is obtainable from GPS alone.
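As a rough illustration of the fusion step, a one-dimensional Kalman filter can use accelerometer readings to predict motion and GPS fixes to correct the prediction. This is a minimal sketch and not ZibJana's implementation: the position/velocity state, the noise variances, the function name `kalman_fuse`, and the omission of camera input are all assumptions.

```python
def kalman_fuse(gps_positions, accelerations, dt=1.0,
                gps_var=25.0, accel_var=1.0):
    """Fuse 1-D GPS fixes with accelerometer readings.

    accelerations[i] is the acceleration between fix i and fix i + 1,
    so it should have one element fewer than gps_positions.
    Returns one position estimate per GPS fix.
    """
    # State: position x and velocity v, initialised from the first fix.
    x, v = float(gps_positions[0]), 0.0
    # Covariance P stored as four scalars (assumed initial uncertainty).
    p00, p01, p10, p11 = gps_var, 0.0, 0.0, 1.0
    # Process noise Q caused by uncertainty in the measured acceleration.
    q00 = 0.25 * dt ** 4 * accel_var
    q01 = 0.5 * dt ** 3 * accel_var
    q11 = dt ** 2 * accel_var

    estimates = [x]
    for z, a in zip(gps_positions[1:], accelerations):
        # Predict with a constant-acceleration motion model.
        x += v * dt + 0.5 * a * dt * dt
        v += a * dt
        # P = F P F^T + Q, with F = [[1, dt], [0, 1]].
        p00, p01, p10, p11 = (
            p00 + dt * (p10 + p01) + dt * dt * p11 + q00,
            p01 + dt * p11 + q01,
            p10 + dt * p11 + q01,
            p11 + q11,
        )
        # Update with the GPS position measurement, H = [1, 0].
        innovation = z - x
        s = p00 + gps_var
        k0, k1 = p00 / s, p10 / s
        x += k0 * innovation
        v += k1 * innovation
        # P = (I - K H) P
        p00, p01, p10, p11 = (
            (1 - k0) * p00,
            (1 - k0) * p01,
            p10 - k1 * p00,
            p11 - k1 * p01,
        )
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    print(kalman_fuse([0.0, 1.2, 1.9, 3.3], [0.0, 0.0, 0.0]))
```

The gain `k0` always lies between 0 and 1, so each estimate falls between the motion-model prediction and the raw GPS fix. A smaller `gps_var` pulls the estimate toward GPS, and a smaller `accel_var` pulls it toward the prediction.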
MSched: A University Course Timetabler
MSched is university course timetabling software. It takes into account the available resources (teaching staff, classrooms, courses, etc.) and the associated constraints and preferences to produce a feasible timetable that optimizes the utilization of those resources. MSched applies a multiphase approach to solve the timetabling problem. The entire timetabling problem is decomposed into several subproblems, and each subproblem is mathematically modeled using constraint programming (CP) or integer programming (IP) techniques. Each of these models is solved in a separate phase, and the overall timetable is generated by accumulating the solutions from all these phases.
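The kind of constraint handled in a single phase can be sketched with a small backtracking search. This example is an assumption-laden illustration and not MSched's CP or IP models: it covers feasibility only, with two hard constraints (no room is double-booked, and no teacher teaches two courses in the same slot), and the function name `schedule` and the sample data are hypothetical.

```python
def schedule(courses, slots, rooms):
    """Assign each course a (slot, room) pair by backtracking search.

    courses: dict mapping course name -> teacher name
    Returns a dict course -> (slot, room), or None if infeasible.
    """
    names = list(courses)
    assignment = {}

    def consistent(course, slot, room):
        # A clash occurs when another course in the same slot uses
        # the same room or is taught by the same teacher.
        for other, (s, r) in assignment.items():
            if s == slot and (r == room or courses[other] == courses[course]):
                return False
        return True

    def backtrack(i):
        if i == len(names):
            return True
        course = names[i]
        for slot in slots:
            for room in rooms:
                if consistent(course, slot, room):
                    assignment[course] = (slot, room)
                    if backtrack(i + 1):
                        return True
                    del assignment[course]  # undo and try the next option
        return False

    return assignment if backtrack(0) else None


if __name__ == "__main__":
    courses = {"CS101": "Alice", "CS102": "Alice", "MA201": "Bob"}
    print(schedule(courses, ["Mon9", "Mon10"], ["R1", "R2"]))
```

Preferences and resource-utilization objectives would need a cost function on top of this feasibility check, which is where CP or IP models are the more natural fit.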