
CS Faculty Member Dr. Lingming Zhang Wins the Best Industry Paper Award at ICST’19 and Publishes Three Papers at ISSTA’19

UT Dallas Computer Science Professor Dr. Lingming Zhang, an expert in the field of Software Engineering (specifically software testing, analysis, and verification), recently had three papers accepted for presentation at the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2019). Two of the three papers feature Dr. Zhang’s Ph.D. students Ali Ghanbari, Samuel Benton, and Xia Li as first authors, while the remaining paper features Yiling Lou, a visiting student from China, as first author. This year, ISSTA accepted 29 papers (plus three conditionally accepted papers) out of 144 submissions, including the three authored by Dr. Zhang and his students. ISSTA is the leading research symposium on software testing and analysis, bringing together academics, industrial researchers, and practitioners to exchange new ideas, problems, and experience on how to analyze and test software systems.

Dr. Zhang (right) and Hua Zhong (left) accepting their award for Best Industry Paper at ICST’19.

Dr. Zhang’s main research interests lie in the field of Software Engineering (SE), more specifically in applying techniques from the Machine Learning, Data Mining, Programming Languages, and Formal Methods areas to Software Testing problems, including automatically detecting, localizing, and fixing software bugs efficiently and effectively. One of Dr. Zhang’s recent focuses is the synergy between Deep Learning and Software Testing, e.g., using Deep Learning techniques to solve Software Testing problems and vice versa. Besides publishing in top-tier conferences and journals, Dr. Zhang’s group is also enthusiastic about building robust tools for practical impact. For example, the group has released various practical testing and debugging tools to the Maven Central Repository. In addition, the group has been working closely with industry and government partners (e.g., Baidu, Microsoft, Google, Intel, and NASA), and has won a Google Faculty Research Award, an Amazon AWS Research Grant, an NVIDIA GPU Grant, and a Samsung GRO Award.

Dr. Zhang was also awarded Best Industry Paper for his article titled “TestSage: Regression Test Selection for Large-scale Web Service Testing,” presented at the 12th IEEE International Conference on Software Testing, Verification and Validation (ICST 2019). In this paper, he presented TestSage, a novel regression test selection (RTS) technique that performs RTS for web service tests on large-scale commercial software; the TestSage framework now runs day-to-day at Google. ICST is a flagship conference on software testing, intended to provide a common forum for researchers and practitioners from academia, industry, and government to present their latest research findings, ideas, developments, and applications in the area of software testing, verification, and validation.

The following is the list of the ISSTA accepted papers and their respective abstracts.

Automated Program Repair via Bytecode Mutation: An Extensive Study – Ali Ghanbari, Samuel Benton, and Lingming Zhang

Abstract: Automated Program Repair (APR) is one of the most recent advances in automated debugging, and can directly fix buggy programs with minimal human intervention. Although various advanced APR techniques (including search-based or semantic-based ones) have been proposed, they mainly work at the source-code level, and it is not clear how bytecode-level APR performs in practice. Also, extensive studies of the existing techniques on bugs beyond what has been reported in the original papers are rather limited. In this paper, we implement the first practical bytecode-level APR technique, PraPR, and present the first extensive study on fixing real-world bugs (e.g., Defects4J bugs) using JVM bytecode mutation. The experimental results show that surprisingly even PraPR with only the basic traditional mutators can produce genuine fixes for 17 bugs; with some additional commonly used APR mutators, PraPR is able to produce genuine fixes for 43 bugs, significantly outperforming state-of-the-art APR and being an order of magnitude faster. Furthermore, we also performed an extensive study of PraPR and other recent APR tools on a large number of real-world “unexpected” bugs. Lastly, PraPR has also successfully fixed bugs for other JVM languages (e.g., Kotlin), indicating that bytecode-mutation-based APR can greatly complement the existing source-code-level APR.
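To make the core idea of mutation-based repair concrete, here is a minimal, source-level sketch of the generate-and-validate loop the abstract describes. Note the assumptions: PraPR itself mutates JVM bytecode, not Python source, and its real mutator set is far richer; the `max_of` function, the mutator list, and the tests below are invented for illustration only.

```python
import ast

# Toy analogue of mutation-based program repair (PraPR works on JVM
# bytecode; this source-level Python version is illustrative only).
BUGGY_SRC = """
def max_of(a, b):
    return a if a < b else b   # bug: comparison operator is flipped
"""

# Basic "traditional" mutators: candidate replacement comparison operators.
MUTATORS = [ast.Lt, ast.Gt, ast.LtE, ast.GtE]

def mutate_compare(tree, new_op):
    """Replace every comparison operator in the tree with new_op."""
    class Swap(ast.NodeTransformer):
        def visit_Compare(self, node):
            node.ops = [new_op()]
            return node
    return ast.fix_missing_locations(Swap().visit(tree))

def passes_tests(namespace):
    """The (hypothetical) failing test suite that drives repair."""
    f = namespace["max_of"]
    return f(1, 2) == 2 and f(5, 3) == 5 and f(0, 0) == 0

def repair():
    # Generate-and-validate: apply each mutator, rerun the tests,
    # and report the first mutant on which the whole suite passes.
    for op in MUTATORS:
        tree = mutate_compare(ast.parse(BUGGY_SRC), op)
        ns = {}
        exec(compile(tree, "<patch>", "exec"), ns)
        if passes_tests(ns):   # a "plausible" patch: all tests pass
            return ast.unparse(tree)
    return None

patch = repair()
```

Working at the bytecode level gives PraPR the same loop without recompiling source for each mutant, which is one reason the paper reports an order-of-magnitude speedup.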

DeepFL: Integrating Multiple Fault Diagnosis Dimensions for Deep Fault Localization – Xia Li, Wei Li, Yuqun Zhang, and Lingming Zhang

Abstract: Learning-based fault localization has been intensively studied recently. Prior studies have shown that traditional Learning-to-Rank techniques can help precisely diagnose fault locations using various dimensions of fault-diagnosis features, such as suspiciousness values computed by various off-the-shelf fault localization techniques. However, with the increasing dimensions of features considered by advanced fault localization techniques, it can be quite challenging for the traditional Learning-to-Rank algorithms to automatically identify effective existing/latent features. In this work, we propose DeepFL, a deep learning approach to automatically learn the most effective existing/latent features for precise fault localization. Although the approach is general, in this work, we collect various suspiciousness-value-based, fault-proneness-based, and textual-similarity-based features from the fault localization, defect prediction and information retrieval areas, respectively. The corresponding DeepFL techniques have been studied on 395 real bugs from the widely used Defects4J benchmark. The experimental results show that DeepFL can significantly outperform state-of-the-art TraPT/FLUCCS (e.g., localizing 50+ more faults within Top-1). We also investigate the impacts of deep model configurations (e.g., loss functions and epoch settings) and features. Furthermore, DeepFL is also surprisingly effective for cross-project prediction.
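One of the feature dimensions the abstract mentions, suspiciousness values from off-the-shelf fault localization, can be sketched as follows. This computes the classic Ochiai spectrum-based suspiciousness score from test coverage; DeepFL would feed many such scores (plus fault-proneness and textual-similarity features) into a deep model. The coverage data and element names here are made up for illustration.

```python
import math

# coverage[test] = set of program elements the test executes
coverage = {
    "t1": {"e1", "e2"},
    "t2": {"e2", "e3"},
    "t3": {"e1", "e3"},
}
failed = {"t2"}   # t2 fails, so the elements it covers are suspects

def ochiai(element):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep))."""
    ef = sum(1 for t in failed if element in coverage[t])
    ep = sum(1 for t in coverage
             if t not in failed and element in coverage[t])
    denom = math.sqrt(len(failed) * (ef + ep))
    return ef / denom if denom else 0.0

elements = sorted({e for covered in coverage.values() for e in covered})
# Rank elements from most to least suspicious; in DeepFL, scores like
# these become one input dimension rather than the final ranking.
ranking = sorted(elements, key=ochiai, reverse=True)
```

Here `e2` and `e3` (covered by the failing test) outrank `e1`; a learned model can break such ties using the other feature dimensions, which is precisely the gap DeepFL targets.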

History-driven Build Failure Fixing: How Far Are We? – Yiling Lou, Junjie Chen, Lingming Zhang, Dan Hao, and Lu Zhang

Abstract: Build systems are essential for modern software development and maintenance since they are widely used to transform source code artifacts into executable software. Previous work shows that build systems break frequently during software evolution. Therefore, automated build-fixing techniques are in huge demand. In this paper, we target a mainstream build system, Gradle, which has become the most widely used build system for Java projects in the open-source community (e.g., GitHub). HireBuild, a state-of-the-art build-fixing tool for Gradle, has been recently proposed to fix Gradle build failures by mining the history of prior fixes. Although HireBuild has been shown to be effective for fixing real-world Gradle build failures, it was evaluated on only a limited set of build failures, and largely depends on the quality and availability of historical fix information. To investigate the efficacy and limitations of history-driven build fixing, we first construct a new and large build failure dataset from the Top-1000 GitHub projects. Then, we evaluate HireBuild on the extended dataset both quantitatively and qualitatively. Inspired by the findings of the study, we propose a simplistic new technique that generates potential patches by searching the present project under test and external resources rather than historical fix information. According to our experimental results, the simplistic approach based on present information successfully fixes 2X more reproducible build failures than the state-of-the-art HireBuild based on historical fix information. Furthermore, our results also reveal various findings and guidelines for future advanced build failure fixing.

The following is the abstract for the paper “TestSage: Regression Test Selection for Large-scale Web Service Testing,” which won Best Industry Paper at the 12th IEEE International Conference on Software Testing, Verification and Validation (ICST). The paper was co-authored with Hua Zhong and Sarfraz Khurshid.

Abstract: Regression testing is an important but expensive activity in software development. Among various types of tests, web service tests are usually one of the most expensive (due to network communications) but widely adopted types of tests in commercial software development. Regression test selection (RTS) aims to reduce the number of tests that need to be rerun by only running tests that are affected by code changes. Although a large number of RTS techniques have been proposed in the past few decades, these techniques have not been adopted in large-scale web service testing. This is because most existing RTS techniques either require direct code dependency between tests and code under test or cannot be applied to large-scale systems efficiently enough. In this paper, we present a novel RTS technique, TestSage, that performs RTS for web service tests on large-scale commercial software. With a small overhead, TestSage is able to collect fine-grained (function-level) dependencies between tests and services under test that do not directly depend on each other. TestSage has also been successfully applied to large complex systems with over a million functions. We conducted experiments with TestSage on a large-scale backend service at Google. Experimental results show that TestSage reduces testing time by 34% when running all AEC (Analysis, Execution, and Collection) phases, and by 50% when running without the collection phase. TestSage has been integrated with the internal testing framework at Google and runs day-to-day at the company.
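The selection step at the heart of an RTS technique like TestSage can be sketched in a few lines: keep exactly the tests whose recorded function-level dependencies intersect the set of changed functions. TestSage's actual contribution is collecting those dependencies cheaply across web-service boundaries at Google scale; the test names, function names, and dependency map below are invented for illustration.

```python
# deps[test] = qualified names of functions that test exercised,
# as recorded during a prior (collection-phase) run.
deps = {
    "CheckoutTest": {"cart.total", "payment.charge"},
    "SearchTest":   {"search.query", "search.rank"},
    "ProfileTest":  {"user.load", "user.render"},
}

def select_tests(deps, changed_functions):
    """Classic RTS rule: rerun a test iff its dependency set
    intersects the functions modified by the code change."""
    return {test for test, fns in deps.items() if fns & changed_functions}

# A change touching only payment code selects only the checkout test.
selected = select_tests(deps, {"payment.charge"})
```

The reported 34%/50% time savings come from skipping the unselected tests; the trade-off is the overhead of keeping the dependency map fresh, which is why the collection phase is measured separately.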


The UT Dallas Computer Science program is one of the largest Computer Science departments in the United States with over 2,800 bachelor’s-degree students, more than 1,000 master’s students, 190 Ph.D. students, 52 tenure-track faculty members, and 41 full-time senior lecturers, as of Fall 2018. With The University of Texas at Dallas’ unique history of starting as a graduate institution first, the CS Department is built on a legacy of valuing innovative research and providing advanced training for software engineers and computer scientists.