SCG News

First-class artifacts as building blocks for live in-IDE documentation

Nitish Patkar, Andrei Chis, Nataliia Stulova, and Oscar Nierstrasz. First-class artifacts as building blocks for live in-IDE documentation. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2022.


A traditional round-trip engineering approach based on model transformations does not scale well to modern agile development environments, where numerous artifacts are produced using a range of heterogeneous tools and technologies. To boost artifact connectivity and maintain consistency among artifacts, we propose to create and manage software-related artifacts as first-class entities directly in an integrated development environment (IDE). This approach has two advantages: (i) compared to employing separate tools, creating various artifacts directly within a development platform eliminates the need to recover trace links, and (ii) first-class artifacts can be composed into stakeholder-specific live document-artifacts. We detail and exemplify our approach in the Glamorous Toolkit IDE (henceforth, Glamorous Toolkit), and discuss the results of a semi-structured pilot survey we conducted with practitioners and researchers to evaluate its usefulness in practice.

Posted by scg at 13 January 2022, 4:15 pm

An Exploratory Study on the Usage of Gherkin Features in Open-Source Projects

Adwait Chandorkar, Nitish Patkar, Andrea Di Sorbo, and Oscar Nierstrasz. An Exploratory Study on the Usage of Gherkin Features in Open-Source Projects. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2022.


With behavior-driven development (BDD), domain experts describe system behavior and desired outcomes through natural language-like sentences, e.g., using the Gherkin language. BDD frameworks partially convert the content of Gherkin specifications into executable test code. Previous studies have reported several issues with current BDD practice, for example long, repetitive Gherkin specifications and slow-running test suites. Data tables and additional features were added to the Gherkin syntax to express test inputs compactly (e.g., to provide different combinations of input values and desired outputs so that tests run multiple times) and to improve the readability of Gherkin files (henceforth called spec files). However, there is no empirical evidence about the actual usage of these Gherkin features. To fill this gap, we analyzed the content of 1,572 spec files extracted from 23 open-source projects. For each spec file, we collected a set of metrics modeling the structure and the usage of the different Gherkin features. We found that only a minority of the considered spec files (i.e., 590) used data tables, and that these tables contain two rows on average. We also used statistical tests to compare the contents of spec files with and without data tables and found significant differences between the two populations, especially in the number of lines of code (LoC). On the one hand, our results shed some light on the discrepancies between the recommendations for defining Gherkin specifications and their actual adoption in practice. On the other hand, our findings demonstrate that the adoption of additional features, such as data tables, might only partially help to reduce the length of Gherkin specifications.
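To make the idea of per-file metrics concrete, the following is a minimal Python sketch of how such counts could be collected from one spec file. It is not the tooling used in the study: the `SpecMetrics` fields, the `spec_metrics` function, and the sample feature are all hypothetical, and the sketch deliberately treats every pipe-delimited line as a table row (header rows included) without distinguishing step data tables from `Examples` tables.

```python
import re
from dataclasses import dataclass


@dataclass
class SpecMetrics:
    """Hypothetical per-file metrics; not the metric set used in the paper."""
    lines_of_code: int  # non-blank, non-comment lines
    scenarios: int      # "Scenario:" and "Scenario Outline:" headers
    table_rows: int     # pipe-delimited lines, header rows included
    tables: int         # maximal runs of consecutive table rows


def spec_metrics(text: str) -> SpecMetrics:
    """Count simple structural metrics for the text of one Gherkin spec file."""
    loc = scenarios = rows = tables = 0
    in_table = False
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and comments do not count as code and end any table.
        if not line or line.startswith("#"):
            in_table = False
            continue
        loc += 1
        if re.match(r"Scenario( Outline)?:", line):
            scenarios += 1
        if line.startswith("|") and line.endswith("|"):
            rows += 1
            if not in_table:
                tables += 1  # first row of a new table
            in_table = True
        else:
            in_table = False
    return SpecMetrics(loc, scenarios, rows, tables)


SAMPLE = """\
Feature: Login
  # happy path
  Scenario: Valid users can log in
    Given the following users exist:
      | name  | password |
      | alice | secret   |
    When alice logs in with "secret"
    Then she sees her dashboard
"""

if __name__ == "__main__":
    print(spec_metrics(SAMPLE))
```

Running such a function over every spec file in a project would yield one record per file, which could then be split into files with and without tables for comparison.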

Posted by scg at 13 January 2022, 4:15 pm

Implementing Mondrian in Glamorous Toolkit

Cyrill J. Rohrbach. Implementing Mondrian in Glamorous Toolkit. Bachelor’s thesis, University of Bern, September 2021.


Developers spend a lot of time reverse engineering software. To do this they often rely on reading the code, which is a slow and unscalable process. They rely on code reading because development environments are centered around the code editor and offer few other tools to help the developer understand the software. The goal of moldable development is to change this. The environment should provide tools to help the user understand software. To keep the tools suited to the application, the developer should adapt and develop them alongside the software. Glamorous Toolkit is a new development environment based on Pharo and built around the philosophy of moldable development. One of the tools offered by Glamorous Toolkit to help understand a piece of software is a multipurpose visualization tool called GtMondrian. GtMondrian takes scripts and turns them into interactive visualizations. These scripts allow for endless customizability, but to use them the user has to know how the graphical elements of Glamorous Toolkit work. Since it takes time to become familiar with those elements, this could well prevent developers from using GtMondrian to adapt the development tools, and thereby undermine the concept of moldable development. We propose a tool similar to GtMondrian called CRMondrian. It offers much of the same functionality, with the major difference that the most commonly used customizations, such as changing the shape of the nodes within a graph, are done using builders. It therefore requires just one simple keyword from the user and eliminates the need for the user to know how the graphical elements work.

Posted by scg at 17 September 2021, 12:15 pm

Adherence of class comments to style guidelines

Suada Abukar. Adherence of class comments to style guidelines. Bachelor’s thesis, University of Bern, August 2021.


Code comments play an important role in program comprehension and maintenance tasks. They are written in natural language and are semi-structured or unstructured, which makes assessing their quality a difficult task. To control certain aspects of quality such as consistency, readability, or preciseness, programming languages provide comment-related conventions in their coding style guidelines. One way to assess comment quality in these aspects is to verify whether the code follows the respective coding style guidelines. However, which specific comment-related conventions these guidelines suggest, and whether developers follow them when writing comments, has not yet been explored. Previous works have proposed to automatically assess code quality using various linters or static tools. However, the extent to which these tools support comments is limited, and comment validation on a semantic level is not provided. Additionally, one project can follow more than one style guideline. Thus, verifying which convention comes from which guideline and to what extent it is followed is an essential but nontrivial task. This thesis presents an empirical investigation of the content of popular commenting style guidelines and of commenting practices in Java and Python. We extract comment-related rules from style guidelines used by 13 open-source projects. Furthermore, we assess nearly 700 statistically significant samples of class comments originating from these 13 projects. The projects vary in domain, size, and number of contributors and are written in two popular programming languages: Java and Python. This thesis uncovers the quality of class comments written in open-source projects and the content of the comment style guidelines. We discovered that 57% of the comment convention rules do not apply to the class comment samples. Within the applicable portion, most class comments follow the convention rules. The most frequently followed rules concern the content and writing style of comments. Our results highlight the importance of writing clear and straightforward rules in the style guidelines, since they are used and interpreted by developers with different levels of coding experience. In addition, the high percentage of adherence proves that developers do consult style guidelines when coding.
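As an illustration of what checking a class comment against a single convention can look like, here is a minimal Python sketch. It is not the procedure used in the thesis; the `class_docstring_report` function is hypothetical, and the one rule it checks (the first line of a class docstring ends with a period, in the spirit of PEP 257) is chosen only as an example and is not claimed to be among the rules the thesis extracted.

```python
import ast


def class_docstring_report(source: str) -> dict[str, str]:
    """Map each class name in `source` to a verdict for one example rule.

    The rule is an assumption made for illustration: the summary (first)
    line of a class docstring should end with a period.
    """
    report = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        doc = ast.get_docstring(node)
        if not doc:
            # No docstring at all, or an empty one: the rule cannot be met.
            report[node.name] = "missing"
            continue
        summary = doc.splitlines()[0].strip()
        report[node.name] = "ok" if summary.endswith(".") else "summary lacks period"
    return report


EXAMPLE = '''
class Account:
    """Represent a bank account."""

class Ledger:
    """collection of accounts"""

class Helper:
    pass
'''

if __name__ == "__main__":
    for name, verdict in class_docstring_report(EXAMPLE).items():
        print(f"{name}: {verdict}")
```

A purely syntactic check like this covers only formatting rules; rules about what a comment should say would need manual or semantic assessment.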

Posted by scg at 1 September 2021, 11:15 am

How to Identify Class Comment Types? A Multi-language Approach for Class Comment Classification

Pooja Rani, Sebastiano Panichella, Manuel Leuenberger, Andrea Di Sorbo, and Oscar Nierstrasz. How to Identify Class Comment Types? A Multi-language Approach for Class Comment Classification. In Journal of Systems and Software, vol. 181, p. 111047, 2021.


Most software maintenance and evolution tasks require developers to understand the source code of their software systems. Software developers usually inspect class comments to gain knowledge about program behavior, regardless of the programming language they are using. Unfortunately, (i) different programming languages present language-specific code commenting notations and guidelines; and (ii) the source code of software projects often lacks comments that adequately describe the class behavior, which complicates program comprehension and evolution activities. To handle these challenges, this paper investigates the different language-specific class commenting practices of three programming languages: Python, Java, and Smalltalk. In particular, we systematically analyze the similarities and differences of the information types found in class comments of projects developed in these languages. We propose an approach that leverages two techniques, namely Natural Language Processing and Text Analysis, to automatically identify class comment types, i.e., the specific types of semantic information found in class comments. To the best of our knowledge, no previous work has provided a comprehensive taxonomy of class comment types for these three programming languages with the help of a common automated approach. Our results confirm that our approach can classify frequent class comment information types with high accuracy for the Python, Java, and Smalltalk programming languages. We believe this work can help in monitoring and assessing the quality and evolution of code comments in different programming languages, and thus support maintenance and evolution tasks.

Posted by scg at 25 August 2021, 2:16 pm
Last changed by admin on 21 April 2009