Andrea Caracciolo. A Unified Approach to Automatic Testing of Architectural Constraints. In Proceedings of ICSE 2015 (37st International Conference on Software Engineering), Doctoral Symposium, ACM Press, 2015. Details.
Architectural decisions are often encoded in the form of constraints and guidelines. Non-functional requirements can be ensured by verifying that some of the formulated assumptions are correctly reflected in the implementation. Conformance checking is often a costly and error-prone process that involves the use of multiple tools, differing in effectiveness, complexity and scope of applicability. To reduce the overall effort entailed by this activity, we propose a novel approach that supports verification of human-readable declarative rules through the use of adapted off-the-shelf tools. Our approach has been implemented in a soon to be evaluated prototype called Dicto.
Aaron Karper. Efficient regular expressions that produce parse trees. Masters thesis, University of Bern, December 2014. Details.
Regular expressions naturally and intuitively define parse trees that describe the text that they’re parsing. We describe a technique for building up the complete parse tree resulting from matching a text against a regular expression. In previous tagged deterministic finite-state automaton (TDFA) matching implementations, all paths through the non-deterministic finite-state automaton (NFA) are walked simultaneously, in different co-routines, where inside each co-routine, it is fully known when which capture group was entered or left. We extend this model to keep track of not just the last opening and closing of capture groups, but all of them. We do this by storing in every co-routine a history of the all groups using the flyweight pattern. Thus, we log enough information during parsing to build up the complete parse tree after matching in a single pass, making it possible to use our algorithm with strings exceeding the machine’s memory. Further we construct the automata such that a simulation of backtracking like behaviour is possible. This is achieved in worst case time $Theta$(mn), providing full parse trees with only constant slowdown compared to matching.
Boris Spasojević, Mircea Lungu, and Oscar Nierstrasz. Overthrowing the Tyranny of Alphabetical Ordering in Documentation Systems. In 2014 IEEE International Conference on Software Maintenance and Evolution, p. 511-515, September 2014. Details.
Software developers are often unsure of the exact name of the API method they need to use to invoke the desired behavior. Most state-of-the-art documentation browsers present API artefacts in alphabetical order. Albeit easy to implement, alphabetical order does not help much: if the developer knew the name of the required method, he could have just searched for it in the first place. In a context where multiple projects use the same API, and their source code is available, we can improve the API presentation by organizing the elements in the order in which they are more likely to be used by the developer. Usage frequency data for methods is gathered by analyzing other projects from the same ecosystem and this data is used then to improve tools. We present a preliminary study on the potential of this approach to improve the API presentation by reducing the time it takes to find the method that implements a given feature. We also briefly present our experience with two proof-of-concept tools implemented for Smalltalk and Java.
Boris Spasojević, Mircea Lungu, and Oscar Nierstrasz. Mining the Ecosystem to Improve Type Inference For Dynamically Typed Languages. In Onward! ’14, ACM, New York, NY, USA, 2014. Details.
Dynamically typed languages lack information about the types of variables in the source code. Developers care about this information as it supports program comprehension. Ba- sic type inference techniques are helpful, but may yield many false positives or negatives. We propose to mine information from the software ecosys- tem on how frequently given types are inferred unambigu- ously to improve the quality of type inference for a single system. This paper presents an approach to augment existing type inference techniques by supplementing the informa- tion available in the source code of a project with data from other projects written in the same language. For all available projects, we track how often messages are sent to instance variables throughout the source code. Predictions for the type of a variable are made based on the messages sent to it. The evaluation of a proof-of-concept prototype shows that this approach works well for types that are sufficiently popular, like those from the standard librarie, and tends to create false positives for unpopular or domain specific types. The false positives are, in most cases, fairly easily identifiable. Also, the evaluation data shows a substantial increase in the number of correctly inferred types when compared to the non-augmented type inference.
Jan Kurs, Mircea Lungu, and Oscar Nierstrasz. Bounded Seas: Island Parsing Without Shipwrecks. In Benoit Combemale, DavidJ. Pearce, Olivier Barais, and JurgenJ. Vinju (Ed.), Software Language Engineering, Lecture Notes in Computer Science 8706 p. 62-81, Springer International Publishing, 2014. Details.
Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language syntax (islands), while the rest of the syntax (water) is defined imprecisely. Usually, water is defined as the negation of islands. Albeit simple, such a definition of water is naive and impedes composition of islands. When developing an island grammar, sooner or later a programmer has to create water tailored to each individual island. Such an approach is fragile, however, because water can change with any change of a grammar. It is time-consuming, because water is defined manually by a programmer and not automatically. Finally, an island surrounded by water cannot be reused because water has to be defined for every grammar individually. In this paper we propose a new technique of island parsing - bounded seas. Bounded seas are composable, robust, reusable and easy to use because island-specific water is created automatically. We integrated bounded seas into a parser combinator framework as a demonstration of their composability and reusability.