jQAssistant Language Concept Extractor Architecture

The Language Concept Extractor (LCE) architecture for jQAssistant provides a generic framework for building native tools to scan the source code of arbitrary programming languages and extract relevant language concepts from it. It then consolidates the extracted information into an easy-to-process JSON format for a jQA plugin.

Key Goals

extensibility: easily implement the detection and extraction of new language concepts
maintainability: the implementation should be easily adaptable to changes in the programming language
up-to-date: the APIs and libraries used need to closely follow the release cycles of the analyzed programming language to allow for the fast adoption of new syntax constructs, etc.

Solution

Core Idea:

split the scanning process of source code into two parts:
1. processing of the AST using a natively implemented tool for the programming language, to easily extract/consolidate relevant information
2. graph generation from the consolidated information of step one, using standard jQA scanner mechanisms
use JSON as the intermediary format, as it can easily be processed on most platforms

Basic Overall Process:

flowchart LR
	source[(Source Code)]
	json[[JSON Representation]]
	neo4j[(Neo4j Graph)]
	source-->|LCE Tool|json
	json-->|jQA Plugin|neo4j
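
To make the hand-off between the two stages more concrete, the following is a minimal TypeScript sketch of what an LCE tool might write and what a consumer would read back. The `ExportedProject` and `ExportedConcept` shapes and their field names (`projectRoot`, `concepts`, `kind`, `fqn`) are assumptions made purely for illustration; the actual JSON schema is defined by each LCE tool and its jQA plugin.

```typescript
// Hypothetical shapes; the real JSON schema is defined by each LCE tool.
interface ExportedConcept {
  kind: string; // e.g. "class" or "function"
  fqn: string;  // fully qualified name of the language concept
}

interface ExportedProject {
  projectRoot: string;
  concepts: ExportedConcept[];
}

// Stage 1 (LCE tool side): serialize the extracted project objects.
function exportProjects(projects: ExportedProject[]): string {
  return JSON.stringify(projects, null, 2);
}

// Stage 2 (consumer side): any platform with a JSON parser can read the result.
function importProjects(json: string): ExportedProject[] {
  return JSON.parse(json) as ExportedProject[];
}

const exported = exportProjects([
  { projectRoot: "/demo", concepts: [{ kind: "class", fqn: "demo.Foo" }] },
]);
const roundTripped = importProjects(exported);
```

In the real architecture the consumer is the jQA plugin, which would turn the parsed data into graph nodes using the standard scanner mechanisms; the round trip here only illustrates that the JSON file is the sole contract between the two stages.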

Language Concept Extraction Process (performed by the LCE Tool):

The Extractor API orchestrates the extraction process to obtain project objects, which are then exported to a JSON file. The orchestration process encompasses the following steps:

native tools and APIs are used to get an enriched, structured view on the source code in the form of ASTs and other data structures
traversers traverse the ASTs of all source files and execute different processors to extract information on a file-by-file basis
- the decision whether a processor is executed is defined in its execution condition
- the extracted information is stored in language concept objects that are organized in concept maps
- during the traversal/processing of the AST a processing context is maintained that can be used to access and/or share all information necessary for processing
- all available traversers/processors are dynamically registered in central feature collections (which enables extensions)
- metadata assignment rules can be used to enrich language concept objects with additional information that can in turn be used by other processors further up the tree, or by post processors
- all extracted language concepts are bundled into individual project objects
the project objects with the extracted language concepts are re-processed by post processors on a project-wide/cross-project basis, allowing for advanced resolution algorithms
- post processors have no access to the AST data; they only work on language concept objects (which may, however, carry metadata attached by metadata assignment rules)
the processed project objects are then exported to a JSON file
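
The steps above can be sketched in TypeScript as follows. This is a simplified illustration under stated assumptions, not the actual Extractor API: every type and function name (`AstNode`, `LanguageConcept`, `Processor`, `traverse`, `markSharedNames`) is hypothetical, the AST is a hand-written stand-in for what a native parser would produce, the feature collection is a plain array, and metadata assignment rules are reduced to a `metadata` record filled in by the processor itself.

```typescript
// All names here are hypothetical and only illustrate the described mechanism.
interface AstNode {
  type: string;
  name?: string;
  children: AstNode[];
}

interface LanguageConcept {
  conceptId: string;
  name: string;
  metadata: Record<string, string>;
}

// Concept map: concept id -> extracted concepts of that kind.
type ConceptMap = Record<string, LanguageConcept[]>;

interface ProcessingContext {
  sourceFile: string;
  parents: AstNode[]; // ancestors of the node currently being processed
}

interface Processor {
  // Execution condition: decides whether this processor runs for a node.
  condition(node: AstNode, ctx: ProcessingContext): boolean;
  process(node: AstNode, ctx: ProcessingContext): LanguageConcept[];
}

// Central feature collection; extensions would register further processors here.
const processors: Processor[] = [
  {
    condition: (node) => node.type === "ClassDeclaration",
    process: (node, ctx) => [
      { conceptId: "class", name: node.name ?? "<anonymous>", metadata: { file: ctx.sourceFile } },
    ],
  },
];

// Depth-first traversal that runs every processor whose condition matches.
function traverse(node: AstNode, ctx: ProcessingContext, out: ConceptMap): void {
  for (const p of processors) {
    if (!p.condition(node, ctx)) continue;
    for (const concept of p.process(node, ctx)) {
      const bucket = out[concept.conceptId] ?? [];
      bucket.push(concept);
      out[concept.conceptId] = bucket;
    }
  }
  ctx.parents.push(node);
  for (const child of node.children) traverse(child, ctx, out);
  ctx.parents.pop();
}

// Post processor: works on concept objects of all projects, never on the AST.
function markSharedNames(projects: ConceptMap[]): void {
  const seen: Record<string, number> = {};
  for (const project of projects) {
    for (const c of project["class"] ?? []) seen[c.name] = (seen[c.name] ?? 0) + 1;
  }
  for (const project of projects) {
    for (const c of project["class"] ?? []) c.metadata["occurrences"] = String(seen[c.name] ?? 0);
  }
}

// Stand-in AST for a single source file.
const ast: AstNode = {
  type: "SourceFile",
  children: [
    { type: "ClassDeclaration", name: "Foo", children: [] },
    { type: "FunctionDeclaration", name: "bar", children: [] },
  ],
};

const projectConcepts: ConceptMap = {};
traverse(ast, { sourceFile: "foo.ts", parents: [] }, projectConcepts);
markSharedNames([projectConcepts]);
const json = JSON.stringify([projectConcepts]); // final export step
```

Because no registered processor has an execution condition matching `FunctionDeclaration`, that node is traversed but yields no concept. Supporting it would mean adding one more entry to `processors`, which is the kind of extension point the feature collections are meant to provide.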

Concepts & Mechanisms

Projects using the LCE Architecture

jQA TypeScript Plugin (Source)
jQA Dart Plugin (Source, Docs)
