PyStructure – Automated Structure and Dependency Analysis of Python Code

The goal of this project is to develop a structural analyser for programs written in Python. The analyser should parse an application's source code, analyse it, and generate a graph representing the project's internal structure.

As Python is a dynamic language, most of the interesting details (e.g. types) are not known before the application is running. The analyser therefore has to 'guess' the correct types by analysing the code base (see TypeInferencerIntroduction).
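
To illustrate why types have to be guessed, here is a small sketch in Python. The classes `Duck` and `Robot` and the functions `make_speaker` and `announce` are hypothetical names made up for this example; they are not part of PyStructure.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def make_speaker(kind):
    # The returned type depends on a runtime value, so an analyser that
    # only reads the source can at best infer the set {Duck, Robot}.
    if kind == "duck":
        return Duck()
    return Robot()


def announce(speaker):
    # Nothing in the source declares the type of `speaker`. To know which
    # `speak` method is called, an analyser has to trace the call sites.
    return speaker.speak()


result = announce(make_speaker("duck"))
```

Resolving the call `speaker.speak()` to `Duck.speak` or `Robot.speak` is exactly the kind of question a type inference engine has to answer before it can record a dependency.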

Structure101 showing dependencies between classes

The following information will be provided by the analyser:

  • All involved components, like modules, packages, classes or methods
  • Dependencies between such components, caused by:
    • method calls
    • using variables/parameters/fields of a certain type
    • inheritance
  • Module layout/structure
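
As a rough illustration of how some of this information can be read directly from source code, the sketch below uses Python's standard `ast` module. It is not how PyStructure works (PyStructure is written in Java), and the sample classes `Engine`, `Vehicle` and `Car` as well as the helper `class_dependencies` are hypothetical. It only finds dependencies that are visible syntactically: inheritance and direct instantiation.

```python
import ast

SOURCE = '''
class Engine:
    def start(self):
        return "running"

class Vehicle:
    pass

class Car(Vehicle):
    def __init__(self):
        self.engine = Engine()

    def drive(self):
        return self.engine.start()
'''


def class_dependencies(source):
    """Map each class name to a set of (kind, target) dependencies."""
    tree = ast.parse(source)
    classes = [n for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]
    class_names = {cls.name for cls in classes}
    deps = {name: set() for name in class_names}

    for cls in classes:
        # Inheritance: base classes written as plain names.
        for base in cls.bases:
            if isinstance(base, ast.Name) and base.id in class_names:
                deps[cls.name].add(("inherits", base.id))
        # Instantiation: calls such as Engine() inside the class body.
        for node in ast.walk(cls):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in class_names):
                deps[cls.name].add(("instantiates", node.func.id))
    return deps


deps = class_dependencies(SOURCE)
```

Here `deps["Car"]` contains `("inherits", "Vehicle")` and `("instantiates", "Engine")`. The method call `self.engine.start()` is not resolved, because that requires knowing the type of `self.engine`, which is where type inference comes in.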

Possible applications of this information include:

  • Visualisation of a program's architecture / internal structure
  • Code completion and navigation in IDEs

It might also improve the accuracy of tools that:

  • Detect unused/dead code
  • Look for possible bugs in code (like FindBugs for Java)
  • Do type checks and optimisations at compile time

Documentation

Bachelor Thesis: PyStructure – Automated Structure and Dependency Analysis of Python Code (PDF)

Try it out

  • Download – Try out the latest build of PyStructure (just follow the instructions in the README)

News

14th April 2008 – First release

We are happy to announce our first release :)! You can download pystructure.zip from here, extract it, and follow the instructions in the README.

Although the engine already supports a wide variety of cases, it still lacks some very important features:

  • No support for inheritance: Currently the engine ignores everything that involves inheritance. For example, if a method is implemented in a base class, it won't be found when it is called on an instance of a subclass.
  • Type of list/dict elements is not known: The type of container elements cannot be determined yet. For projects which rely heavily on lists, this means that a lot of types can't be determined.
  • Limited support for built-ins: Only a few built-in operations are recognised. For example, the type inference engine doesn't yet know that len("str") returns an integer.
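
The three gaps can be illustrated with a few lines of Python. This is only a sketch of the kind of code that is affected; the classes `Base` and `Child` are hypothetical and not taken from the PyStructure test suite.

```python
class Base:
    def greet(self):
        return "hello"


class Child(Base):
    pass


# 1. Inheritance: greet() is implemented in Base but called on a Child
#    instance, so resolving it means walking up the class hierarchy.
greeting = Child().greet()

# 2. Container elements: the list itself is easy to type, but knowing
#    that items[0] is a Child requires tracking the element types.
items = [Child(), Child()]
first = items[0]

# 3. Built-ins: len() returns an int, which an engine has to be told,
#    since the implementation is not available as Python source.
length = len("str")
```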

We are working on tackling these issues in the next two milestones and we hope to improve the accuracy of the engine significantly.

Find out more

PyStructure itself is written in Java because it was originally part of an Eclipse plugin (see the PEPTIC project). It is now a stand-alone library released under the LGPL (v2 or later). Follow these links to find out more:

Organisation

PyStructure is a bachelor thesis by Robin Stocker and Reto Schüttel in collaboration with Headway Software. Headway is the creator of the structural analysis software Structure101, which is available for Java as well as in a generic version for other systems (both are free for FOSS projects to use). We are using the XML interface of the generic version to display structure and dependency data for Python. Structure101 output is, however, only one of many possible applications of our analyser.

Headway logo

Hochschule für Technik Rapperswil logo
