Automatic Token Classification

An attempt to mine useful information for parsing

Developers need software models to make decisions while developing software systems. Static software models are commonly imported by parsing source code, but building a custom parser is a difficult task for most programming languages. Building such a parser requires the grammar of the programming language. If the grammar is not available, which is the case for many languages and notably for dialects, it has to be inferred from the source code. Automatically finding the keywords of those languages can help the process of inferring a grammar, because many keywords mark the beginnings and endings of the basic building blocks of programs.