博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
[EMSE'17] A Correlation Study between Automated Program Repair and Test-Suite Metrics
阅读量:5237 次
发布时间:2019-06-14

本文共 5144 字,大约阅读时间需要 17 分钟。

Basic Information

  • Authors: Jooyong Yi, Shin Hwei Tan, Sergey Mechtaev, Marcel Böhme, Abhik Roychoudhury
  • Publication: EMSE'17
  • Conclusion: In general, with the increase of traditional test suite metrics, the reliability of repairs tend to increase. In particular, such a trend is most strongly observed in statement coverage. Their results imply that the traditional test suite metrics proposed for software testing can also be used for automated program repair to improve the reliability of repairs.

Interesting Points

Correlation between Mutation Testing and Automated Program Repair:

To some extent, automated program repair and mutation testing are very similar. It can be viewed that automated program repair “mutates" the original program, this time in an attempt to find a repair. As in mutation testing, mutants that fail to pass all tests in the provided test-suite are considered buggy (hence, incorrect repairs). This conceptual similarity between mutation testing and automated program repair suggests the plausibility of using the mutation score to measure the quality of a test-suite not only for mutation testing but also for automated program repair. Just as a higher mutation score is associated with a better fault-detection ability in mutation testing, it appears plausible to associate a higher mutation score with a better ability to guide a reliable repair.

There is not only similarity but also duality between mutation testing and automated program repair. As pointed out by Weimer et al (2013), “our confidence in mutant testing increases with the set of non-redundant mutants considered, but our confidence in the quality of a program repair gains increases with the set of non-redundant tests." Note that mutation score measures the non-redundancy of killed mutants, not the non-redundancy of tests capable of killing mutants. We introduce a new metric called capable-tests ratio in the next section that measures the non-redundancy of capable tests.

Measure quality of Test-suite quality and Repair

This paper mainly explore the correlation between quality of automated program repair (APR) and test-suite.
To evaluate quality of APR, traditional metrics (i.e., 1) statement coverage, 2) branch coverage, 3) test-suite size, 4) mutation score) and capable-tests ratio are used.

RQs and Results

RQ1: : Is there a negative correlation between the metrics of a testsuite and the regression ratio of automatically generated repairs? In other words, are generated repairs less likely to cause regressions, as test-suite metrics increase?

As the traditional test-suite metrics (statement coverage, branch coverage, test-suite size, and mutation score) increase, the regression rate of automatically generated repairs generally decreases, showing the promise of using the traditional test-suite metrics to control the regression ratio of automatically generated repairs. Capable-tests ratio does not seem as useful as the traditional metrics in controlling the quality of generated repairs.

RQ2: Which test-suite metric is most strongly correlated with the regression ratio of automatically generated repairs?

In our experiments, statement coverage is, on average, more strongly correlated with regression ratio than other metrics we investigate. Our results suggest that to reduce the regression ratio, increasing statement coverage is more promising than improving the other test-suite metrics.

RQ3: Is there a negative correlation between the metrics of a test-suite and the repairability of automated program repair? In other words, should repairability be sacrificed in an attempt to obtain a higher-quality repair via a higher-quality testsuite?

Our experimental results are inconclusive about the correlation between test-suites and repairability. However, we note that increasing test-suite metric does not always decrease repairability. Im some subjects, positive correlations were observed between test-suite metrics and repairability, indicating that as the test-suite metrics increase, repairability tends to increase.

RQ4: Is there a negative correlation between the metrics of a test-suite and repair time? In other words, should more time be spent in an attempt to obtain a higher-quality repair via a higher-quality test-suite?

Our experimental results are inconlusive about the correlation between test-suites and repair time. However, we note that increasing test-suite metric does not always increase repair time. In some subjects, negative correlations were observed between test-suite metrics and repair time, indicating that as the test-suite metrics increase, repair time tends to decrease.

Different Repair Algorithm: SEMFIX

Our experimental results from SEMFIX generally coincide with our finding from the GENPROG experiment, despite the differences in repair algorithms and fault localization techniques. The traditional test-suite metrics are, overall, negatively correlated with regression ratio, similar to our GENPROG experimental results. In particular, **statement coverage** is again shown to be most strongly correlated with regression ratio.

转载于:https://www.cnblogs.com/XBWer/p/9204746.html

你可能感兴趣的文章
DateTime日期格式获取 分类: C# 2014...
查看>>
ACM/ICPC 之 差分约束系统两道(ZOJ2770-POJ1201)
查看>>
转载-我的生活我做主!工作一年的理财小心得~!
查看>>
js检测对象中是否存在某个属性
查看>>
js事件应用
查看>>
C++奇数魔方阵
查看>>
搭建MySQL高可用负载均衡集群
查看>>
Gradle 语法
查看>>
Ureal Engine 4中getWorldLocation在Physics Constraint中的表现
查看>>
深入浅出JSONP:解决AJAX跨域问题
查看>>
oracle中去掉文本中的换行符、回车符、制表符
查看>>
实在自动现在APK,微信跳浏览器下载
查看>>
java之static关键字
查看>>
Linux系统下python代码运行shell命令的方法
查看>>
剑指offer——python【第38题】二叉树的深度
查看>>
Log4Net日志组件使用
查看>>
Windows 10 使用压缩包安装MySQL8.0.12
查看>>
bzoj 1689: [Usaco2005 Open] Muddy roads 泥泞的路
查看>>
Android 代码库(自定义一套 Dialog通用提示框 )
查看>>
[Java]ArrayList、LinkedList、Vector、Stack的比较
查看>>