Online Topic Modeling for Software Maintenance Using a Changeset-Based Approach
Title:
Online Topic Modeling for Software Maintenance Using a Changeset-Based Approach
Author:
Corley, Christopher Scott, author.
ISBN:
9780438040977
Author Added Entry:
Physical Description:
1 electronic resource (195 pages)
General Note:
Source: Dissertation Abstracts International, Volume: 79-10(E), Section: B.
Advisors: Nicholas A. Kraft; Jeffrey C. Carver. Committee members: Travis L. Atkison; Jeffrey G. Gray; Randy K. Smith.
Abstract:
Topic modeling is a machine learning technique for discovering thematic structure within a corpus. Topic models have been applied to several areas of software engineering, including bug localization, feature location, triaging change requests, and traceability link recovery. Many of these approaches train topic models on a source code snapshot -- a revision or state of code at a particular point in time, such as a versioned release. However, source code evolution leads to model obsolescence and thus to the need to retrain the model from the latest snapshot, which incurs the non-trivial computational cost of relearning the model.
This work proposes and investigates an approach that can remedy the obsolescence problem. Conventional wisdom in the software maintenance research community holds that the topic model training information must be the same information that is of interest for retrieval. The primary insight of this work is that topic models can infer the topics of any information, regardless of the information used to train the model. By pairing online topic modeling with mining software repositories, I can remove the need to retrain a model and achieve model persistence. To this end, I propose training topic models on the software repository history in the form of the changeset -- a textual representation of the changes that occur between two source code snapshots.
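As an illustration of what a changeset might look like as training text, the following minimal Python sketch builds a unified diff between two snapshots of one file and reduces it to a bag of words. It is not taken from the dissertation: the function names `changeset_text` and `changeset_tokens`, the file name `Example.java`, and the choice to keep added, removed, and context lines while splitting camelCase identifiers are all assumptions made for the example.

```python
import difflib
import re
from collections import Counter


def changeset_text(old_source: str, new_source: str, path: str = "Example.java") -> str:
    """Return a unified diff between two snapshots of a single file.

    The path is a hypothetical placeholder used only for the diff headers.
    """
    diff = difflib.unified_diff(
        old_source.splitlines(),
        new_source.splitlines(),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
        lineterm="",
    )
    return "\n".join(diff)


def changeset_tokens(diff_text: str) -> Counter:
    """Count lowercase word tokens in the body lines of a unified diff.

    Assumption: added, removed, and context lines all contribute tokens;
    the two file-header lines and the "@@" hunk headers are skipped.
    """
    words = Counter()
    for index, line in enumerate(diff_text.splitlines()):
        if index < 2 or line.startswith("@@"):
            continue
        body = line[1:]  # drop the leading '+', '-', or ' ' marker
        for identifier in re.findall(r"[A-Za-z]+", body):
            # Split camelCase identifiers such as totalCount into total, count.
            for part in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", identifier):
                words[part.lower()] += 1
    return words


old = "int total = 0;\nreturn total;"
new = "int totalCount = 0;\nreturn totalCount;"
tokens = changeset_tokens(changeset_text(old, new))
```

Each commit in a repository history would yield one such bag of words, which an online topic model could consume as it arrives instead of being retrained on a full snapshot.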
To show the feasibility of this approach, I investigate two popular applications of text retrieval in software maintenance: feature location and developer identification. Feature location is a search activity for locating the source code entity that relates to a feature of interest. Developer identification is similar, but focuses on identifying the developer best suited to work on a feature of interest. Further, to demonstrate the usability of changeset-based topic models, I investigate whether I can coalesce topic-modeling-based maintenance tasks into using a single model, rather than needing to train a model for each task at hand. In sum, this work aims to show that training online topic models on software repositories removes retraining costs while maintaining the accuracy of a traditional snapshot-based topic model for different software maintenance problems.
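Feature location as a search activity can be pictured as ranking source code entities by how close their inferred topic distributions are to that of a query. The sketch below is one possible reading, not the dissertation's implementation: the Hellinger distance is an assumed similarity measure, and the file names and topic proportions are invented for illustration. In practice the distributions would come from a trained topic model.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total) / math.sqrt(2)


def rank_entities(query_topics, entity_topics):
    """Return entity names ordered from closest to farthest from the query."""
    return sorted(
        entity_topics,
        key=lambda name: hellinger(query_topics, entity_topics[name]),
    )


# Hypothetical topic proportions over three topics (each row sums to 1).
query = [0.8, 0.1, 0.1]
entities = {
    "Logger.java": [0.1, 0.1, 0.8],
    "Lexer.java": [0.5, 0.4, 0.1],
    "Parser.java": [0.7, 0.2, 0.1],
}
ranking = rank_entities(query, entities)
```

Under the same assumptions, developer identification could reuse `rank_entities` unchanged by keying the dictionary on developers rather than files, which is the sense in which a single model might serve both tasks.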
Notes:
School code: 0004
Subject Heading:
Corporate Name Added Entry:
Available:*
Call Number | Item Number | Shelf Location | Location / Status / Due Date |
---|---|---|---|
XX(678827.1) | 678827-1001 | ProQuest E-Thesis Collection | Searching... |