SCG News

Moldable, context-aware searching with Spotter

Andrei Chiş, Tudor Gîrba, Juraj Kubelka, Oscar Nierstrasz, Stefan Reichhart, and Aliaksei Syrel. Moldable, context-aware searching with Spotter. In Proceedings of the 2016 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, Onward! 2016 p. to appear, ACM, New York, NY, USA, 2016. Details.

Abstract

Software systems involve many different kinds of domain-specific and interrelated software entities. A common strategy employed by developers to deal with this reality is to perform exploratory investigations by means of searching. Nevertheless, most integrated development environments (IDEs) support searching through generic and disconnected search tools. This impedes search tasks over domain-specific entities as considerable effort is wasted by developers locating and linking data and concepts relevant to their application domains. To tackle this problem we propose Spotter, a moldable framework for supporting context-aware searching in IDEs by enabling developers to easily create custom searches for domain objects. In this paper we motivate a set of requirements for Spotter and show, through usage scenarios, that Spotter improves program comprehension by reducing the effort required to find and search through concepts from a wide range of domains. Furthermore, we show that by taking code into account, Spotter can provide a single entry point for embedding search support within an IDE.

Posted by scg at 26 August 2016, 12:15 pm

EggShell — A workbench for modeling scientific communities

Dominik Seliner. EggShell — A workbench for modeling scientific communities. Bachelor’s thesis, University of Bern, August 2016. Details.

Abstract

The collaboration in a scientific community can be analysed through the publication record of its members. The analysis of the metadata (e.g., title and authors) of those publications can help researchers to identify groups of collaboration, their evolution, and key authors. However, the criteria for collecting the papers of some communities might exceed the expressiveness offered by the available public databases and search engines. Hence, the data has to be retrieved from the papers’ files themselves. Usually, scientific papers are available in unstructured file formats for which automatic extraction of data poses a challenge. To model the metadata of a community, users have to define a pipeline. In it, each step contributes to the accuracy of the extracted data. The main challenge is to identify to which type of field of the document a piece of text corresponds. Previous research proposed heuristics to identify certain fields like the title and authors from papers’ files by analyzing their layout. The performance of such heuristics might vary across papers that use different layouts. Hence, ensuring the accuracy of a given heuristic is a challenging problem. Small improvements in a heuristic that tackles a popular layout can have a high impact on its overall performance. However, identifying popular layouts and evaluating the impact of improvements can be a laborious task. Visualization offers techniques that fit the analysis of such multivariate data. Through visualization, a developer who is implementing a heuristic for data extraction can obtain an overview of how it performs and find hotspots that can lead to improvements that impact the overall efficacy. In this thesis, we propose EggShell, a workbench that incorporates visualization to assess the performance of modeling pipelines for scientific papers in PDF format. We elaborate on examples of how EggShell allows users to define multiple pipelines. Pipelines can then be improved by assessing their output using visualization. We collected a corpus of 300 papers published by SOFTVIS/VISSOFT venues. We used a subset of 100 papers as a learning set to develop the pipelines, and then used the remaining 200 papers to evaluate their performance for modeling collaboration in the community. We observed that our best performing pipeline exhibits an accuracy of 70%.

Posted by scg at 26 August 2016, 11:15 am

Optimizing Parser Combinators

Jan Kurš, Jan Vraný, Mohammad Ghafari, Mircea Lungu, and Oscar Nierstrasz. Optimizing Parser Combinators. In Proceedings of International Workshop on Smalltalk Technologies (IWST 2016), 2016. To Appear. Details.

Abstract

Parser combinators are a popular approach to parsing. Parser combinators follow the structure of an underlying grammar, are modular, well-structured, easy to maintain, and can recognize a large variety of languages including context-sensitive ones. However, their universality and flexibility introduce a noticeable performance overhead. Time-wise, parser combinators cannot compete with parsers generated by well-performing parser generators or optimized hand-written code. Techniques exist to achieve a linear asymptotic performance of parser combinators, yet there is still a significant constant multiplier. This can be further lowered using meta-programming techniques. In this work we present a more traditional approach to optimization, a compiler, applied to the domain of parser combinators. A parser combinator compiler (pc-compiler) analyzes a parser combinator, applies parser combinator-specific optimizations, and generates an equivalent high-performance top-down parser. Such a compiler preserves the advantages of parser combinators while complementing them with better performance.
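
For readers unfamiliar with the term, the following is a minimal sketch of the parser combinator style in Python. It is not the paper's pc-compiler, nor the API of any Smalltalk parsing library; all names (char, seq, many, fmap, sum_expr) are hypothetical and chosen for illustration. It only shows how small parsers compose along the shape of a grammar, and why each parse step goes through several layers of indirection, which is the kind of constant overhead the abstract refers to.

```python
from typing import Any, Callable, Optional, Tuple

# Assumption for this sketch: a parser is a function (text, pos) -> (value, new_pos),
# or None when it fails to match at pos.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def char(pred: Callable[[str], bool]) -> Parser:
    """Match a single character satisfying pred."""
    def parse(text: str, pos: int):
        if pos < len(text) and pred(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(text: str, pos: int):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def many(parser: Parser, at_least: int = 0) -> Parser:
    """Apply parser repeatedly; require at_least successful matches."""
    def parse(text: str, pos: int):
        values = []
        while True:
            result = parser(text, pos)
            if result is None:
                break
            value, pos = result
            values.append(value)
        return (values, pos) if len(values) >= at_least else None
    return parse


def fmap(parser: Parser, fn: Callable[[Any], Any]) -> Parser:
    """Transform the value produced by a successful parse."""
    def parse(text: str, pos: int):
        result = parser(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos
    return parse


# Grammar mirrored by the combinators:  sum <- number ('+' number)*
number = fmap(many(char(str.isdigit), at_least=1), lambda ds: int("".join(ds)))
plus = char(lambda c: c == "+")
sum_expr = fmap(
    seq(number, many(seq(plus, number))),
    lambda v: v[0] + sum(n for _, n in v[1]),
)

print(sum_expr("1+20+3", 0))  # (24, 6)
```

Every character matched here passes through nested closures (fmap, seq, many, char). A compiler for such combinators, as proposed in the paper, would aim to replace this layering with a specialized top-down parser while keeping the grammar definition unchanged.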

Posted by scg at 21 August 2016, 12:15 pm

CHOOSE Forum Sept 9, 2016 on Microservices

The CHOOSE Forum 2016, on the topic of Microservices, will be held at UZH on 9 September 2016. Details: http://www.choose.s-i.ch/events/forum2016/index.html

Posted by scg at 17 August 2016, 11:41 am
Last changed by scg on 17 August 2016