Developing a user interface for a CLI application to classify comments


Developers record various types of information in code comments, such as a summary of the class, its authors, or descriptions of its methods and variables. These information types help developers understand and modify the code. However, identifying them is not trivial because comments are written in natural language without a strict syntax.

In our previous work, we developed a command-line pipeline in Java to identify various information types in class comments [1]. The pipeline preprocesses the comments stored in a database, processes them, and prepares a machine-learning-based classification model. The figure below presents an overview of the pipeline.


Your task is to develop a prototype tool (a browser extension for a GitHub repository or a GitHub application [2][3]) for this command-line pipeline so that developers can use it to classify the comments in their repositories.


  • Java (to understand the existing pipeline)
  • Abstract Syntax Tree (to extract comments)
  • HTML/CSS/JavaScript (to develop the web extension)
  • The choice of technology can be discussed further
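
To illustrate what extracting comments through an Abstract Syntax Tree can look like, here is a minimal sketch. It uses Python and its standard `ast` module only because that keeps the example self-contained; the existing pipeline is written in Java, and a tool targeting Java sources would need a Java parser instead. The function name `extract_class_comments` and the sample source are hypothetical and are not part of the pipeline.

```python
import ast
from typing import Dict


def extract_class_comments(source: str) -> Dict[str, str]:
    """Map each class name in `source` to its class comment (docstring)."""
    tree = ast.parse(source)
    comments = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # get_docstring returns None when the class has no docstring.
            doc = ast.get_docstring(node)
            if doc is not None:
                comments[node.name] = doc
    return comments


# Hypothetical input: one documented class and one undocumented class.
EXAMPLE = '''
class Stack:
    """A last-in, first-out container.

    Author: Jane Doe
    """

    def push(self, item):
        """Add an item to the top."""


class Empty:
    pass
'''

if __name__ == "__main__":
    for name, comment in extract_class_comments(EXAMPLE).items():
        print(f"{name}: {comment!r}")
```

Only class-level comments are collected here, so the method docstring of `push` is ignored and `Empty` is skipped. The extracted text could then be handed to the classification step, however that hand-off is eventually designed.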


[1] How to identify class comment types? A multi-language approach for class comment classification
[2] Building GitHub Apps
[3] Example App


Pooja Rani

Last changed by pooja on 20 September 2021