
Sunday, March 16, 2014

SVEditor: What's that reference? (Part 1)


One key feature of integrated development environments (especially those for object-oriented languages) is the ability to find the references to a data type, a method, or a data field. A few months back, I implemented initial reference-finding functionality focused on module declarations and module instantiations. This functionality was used to support the design hierarchy view. Being able to quickly identify root modules by finding module declarations that were not referenced anywhere in the project was key to making the design hierarchy view efficient on large designs. Now, I've started work on more general support for locating references to types and fields.

SVEditor creates an AST (abstract syntax tree) for each source file that it parses. The ASTs are stored on the filesystem, and the most recently used ones are cached in memory. This enables SVEditor to manage large projects without being limited by the amount of memory available to Java, and to avoid re-parsing source files on startup (provided they haven't been modified). Bringing an AST in from disk is faster than re-parsing the source file, but it is still a time-consuming operation. Consequently, all SVEditor operations seek to minimize the number of ASTs that must be traversed.
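To make the caching idea concrete, here is a minimal sketch of a most-recently-used cache built on `java.util.LinkedHashMap`. It is not SVEditor's actual code: the `AstCache` class, its `loader` callback (standing in for "read the AST from disk"), and the `loadCount` counter are all hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache for ASTs; illustrative only, not SVEditor's class. */
class AstCache<K, V> {
    private final Map<K, V> cache;
    private final Function<K, V> loader; // assumed: loads an AST from disk
    private int loads = 0;               // counts the expensive loads

    AstCache(final int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true keeps entries ordered from least to most recently used
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once capacity is exceeded
                return size() > capacity;
            }
        };
    }

    /** Returns the cached AST, loading it only on a cache miss. */
    V get(K fileId) {
        V ast = cache.get(fileId);
        if (ast == null) {
            ast = loader.apply(fileId);
            loads++;
            cache.put(fileId, ast);
        }
        return ast;
    }

    int loadCount() {
        return loads;
    }
}
```

Every call to `get` that misses the cache pays the cost of a load, which is why an operation that touches fewer files is cheaper even when nothing needs to be re-parsed.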

Finding references is one of those global operations that requires information from all (or nearly all) the files in the environment. When performing a single reference lookup, waiting for a while is not a huge problem. However, reference searching is also a building block for other features. As noted before, reference lookups for every module in a design (often hundreds or more) are used to build the design hierarchy. In cases like these, each reference lookup must be very fast.
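As a rough illustration of why the design hierarchy needs one lookup per module, the sketch below finds root modules as the declared modules that no other module instantiates. It assumes the instantiation information has already been gathered into a map; the `RootModuleFinder` class and the shape of its input are hypothetical, not SVEditor's data model.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper; illustrative only. */
class RootModuleFinder {
    /**
     * @param instantiations maps each declared module name to the names of
     *                       the modules it instantiates (assumed input shape)
     * @return declared modules that are never instantiated, in sorted order
     */
    static Set<String> findRoots(Map<String, List<String>> instantiations) {
        // Start by treating every declared module as a candidate root
        Set<String> roots = new TreeSet<>(instantiations.keySet());
        // Any module that is instantiated somewhere is referenced, so not a root
        for (List<String> children : instantiations.values()) {
            roots.removeAll(children);
        }
        return roots;
    }
}
```

In a real project the "is this module referenced?" question has to be answered from the source files, which is where the cost of each lookup comes from.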

The approach currently being implemented within SVEditor has two steps: coarse data collection during parsing and fine-grained data analysis during a lookup.

During parsing, all identifiers within a file are collected into a hash set that is associated with the file id. This per-file identifier hash allows a quick check to be performed to see whether a file could contain any references to a given element: if the element's name is not in the set, the file can be skipped.

During a reference-finding operation, a set of files on which to perform more-involved analysis is built based on the per-file identifier hash. This first-pass filter enables more-detailed analysis to be performed on a much smaller set of files, while requiring very little storage overhead.
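The two steps can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the SVEditor implementation: `IdentifierIndex`, `record`, and `candidateFiles` are hypothetical names, and file ids are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical per-file identifier index; illustrative only. */
class IdentifierIndex {
    // file id -> every identifier seen in that file (insertion-ordered by file)
    private final Map<String, Set<String>> idsByFile = new LinkedHashMap<>();

    /** Step 1 (during parsing): record each identifier the parser encounters. */
    void record(String fileId, String identifier) {
        idsByFile.computeIfAbsent(fileId, k -> new HashSet<>()).add(identifier);
    }

    /**
     * Step 2 (during lookup): first-pass filter. Returns the files whose
     * identifier set contains the name. These are only candidates; the
     * detailed AST analysis still has to confirm an actual reference.
     */
    List<String> candidateFiles(String name) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : idsByFile.entrySet()) {
            if (e.getValue().contains(name)) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

Note that a hit in the identifier set does not prove a reference exists: the name might be the declaration itself, or an unrelated symbol with the same name. The filter only rules files out, so that ASTs are loaded just for the files that remain.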

Next time: more on the fine-grained AST analysis used to identify class field and method references.
