CS 395T: Program Transformation and Analysis using Industrial
Tools
Fall 2012
Tuesday/Thursday 3:30-5pm Welch 3.402
Instructor:
Dr. Ira Baxter
Course Description
Personal note to CS
and ECE graduate students from
Don Batory, CS Professor:
This may be a once-in-a-lifetime opportunity for you to get hands-on experience
with a commercial-grade program transformation system (there are very few of
these) and the opportunity to learn from someone who has been at, what I
consider, to be at the leading edge of truly modern software engineering.
As software systems grow more complex, they become more
difficult to understand and revise by purely manual approaches.
Metaprogramming applies programming to the task
of analysis and modification programs. While many programming languages provide
some built-in but limited metaprogramming facilities (templates, reflection,
...), program transformation systems provide the metaprogrammer with complete
access to the program structure to enable arbitrary metaprogramming.
This class provides students with hands-on training in use of
one of the extremely few industrial-strength program transformation systems, the
DMS Software Reengineering Toolkit®
from Austin-local
Semantic Designs. DMS has been used
to implement static analysis tools such as clone detection, dynamic analysis
tools including erroneous pointer usage in C programs, and complete language
migration tools.
Students will learn the basic architecture of a practical
program transformation system, detailed techniques for defining languages
("domains") to be processed, how to implement various types of analyzers for
those languages, and how to write and apply source-to-source program
transformation rules, both to simple languages and to real languages such as
Java or C++. The course is structured as lectures on theory, tool concepts and mechanics;
students are expected to do a variety of exercises using DMS.
The instructor (Dr. Ira Baxter) will be assisted by Semantic Designs
staff both in lecture and exercise coaching. Students will be provided with
personal copies of DMS for class use; those that complete the course will be
granted a research license. Students must bring an x86 based PC with a minimum
of 4Gb RAM and 200Mb of disk space.
Topics will include the list below. Topics in
yellow assume
the corresponding background knowledge;
they are covered to enable the students to describe
the required knowledge quickly, to enable
the student to work on interesting applications of DMS rather than
struggling to describe the language he wants to process.
- Program Transformation Theory
-
Transform(ations): What is a transform,
the distinction between a transform (potential) and transformation (an instance),
procedural transforms versus rewrite rules. Characteristics, structures, and mechanisms
of typical program transformation systems.
-
Types of Specifications and Semantics. Specifications, domain-specific
languages
and general purpose languages, and how a transformation system views them.
-
Transformation Mechanics: Rewriting over structures
-
The Control Problem: Choosing sequences of transformations. Procedural control.
Goal directed control. Domain-hierarchy driven control.
-
Design as captured knowledge in a transformational context.
Incremental design updates.
- Program Transformation Practice using DMS
-
DMS Overview: Structure of the tool and some
applications
-
DMS Operation: Basics of building domain
components
-
PARLANSE: DMS's underlying
parallel programming language
Lexer: How to define a
lexer, format capture and value conversions for domain
Parser: How to define a
context-free grammar for a domain, producing a syntax tree
Pretty Printer: How to
define an anti-parser for a domain
AST Interface: APIs for direct manipulation of
syntax trees
Attribute Evaluation:
Implement analyzers with tree-shaped information flow
Symbol Table: Capturing
scope and type information for identifiers
Rule Specification Language: writing
tree-manipulation rules and patterns using domain syntax
Registry: APIs for integrating PARLANSE, rules
and patterns
Domain Engineering Tools: DMS Tools to define
composable analysis and transformation subsystems
Flow Analysis: Generic
control and data flow analysis machinery
Predefined Domains: COTS language modules and
maturity levels; complications with real languages
Java Domain: Putting it all together for a real
language
Refactoring by renaming: A specific example of
DMS applied to Java
Other topics to be determined by student
interest
Lectures build strongly on one another. For a final project,
students will build a custom tool of their choice on a domain-specific language
or on Java.
Prerequisites: programming maturity, effective working
knowledge of C and Make, some knowledge of Java, and a basic compiler course.
Student Presentations
Students will present a related technical paper of their choice, and later, a description of thier class project. Exact schedule for students can be found here.
Office Hours and Location
Thursdays 2:00-3:30 ACES 6.302
Link to shared class forum for DMS class questions and/or programming issues.
Help for students checked twice a day by Dr. Baxter or SD staff.
Useful to log your issues here, as other students may see (or even respond!)
Forum signup link, do this first!
Technical papers
Accessible here
- Balzer, R., A 15 Year Perspective on Automatic Programming, Transactions on Software Engineering, #11, pp. 1257-1268, 1985.
- Batory, D., Feature-Oriented Programming and the AHEAD Tool Suite. ICSE 2004: 702-703
- Baxter, I.D., Pidgeon, C., Mehlich, M.: DMS: Program Transformations for Practical Scalable Software Evolution. ICSE 2004: 625-634
- Bravenboer, M, Kalleberg, K., Vermaas, R. and Visser, E. Stratego/XT 0.17. A Language and Toolset for Program Transformation, Science of Computer Programming, 72(1-2):52--70, June 2008. Special issue on experimental software and toolkits.
- J.R. Cordy, 2006. The TXL Source Transformation Language. Science of Computer Programming 61,3 (August 2006), 190-210.
- Kant, E. Synthesis of mathematical-modeling software, IEEE Software, May 1993, #10(3),pp. 30 - 41
- Klint, P. van der Storm, T., Vinju, J. Rascal, A Domain Specific Langauge for Source Code Analysis and Manipulation, Source Code Analysis and Manipulation 2009
- Neighbors, James M., The Draco Approach to Constructing Software from Reusable Components, Proceedings 1983 Workshop on Reusability in Programming, pp. 167-178, ITT Programming, September 1983.
- Schurr, A., Winter, A., Zundorf, A., The PROGRES Approach: Language and Environment Chapter 1, 1999
Class Schedule (subject to change)
- Thursday Aug 30: Lecture: Introduction to Program Transformation and DMS
Reading: (hyperlink) DMS: Program Transformations for Practical Scalable Software Evolution.
- Tuesday Sept 4: Lecture: DMS discussion+overview; maturation of a domain ( language ); Algebra Example;
PARLANSE1; Libraries
DMS Installation:[Bring your PC/laptop Windows 2Gb min, 6Gb 64bit preferred]
As an additional step, you need to install Cygwin, as described under "DMS/Documentation/Installation"
DMS Documentation is at "DMS/Documentation/newindex.html" including links to PARLANSE reference manual.
PARLANSE compiler and runtime "tools" are documented under "PARLANSE Domain"
PARLANSE Tutorial slides (shown in Aug 30 lecture) are at "DMS/Documentation/Tutorial/PARLANSE.pdf"
Assignment: PARLANSE: Recursive descent parser
a) Using the PARLANSE compiler and runtime tools, compile and run "DMS/PARLANSE/Examples/HelloWorld/HelloWorld.par"
b) Fetch all the files for AlgebraParser (See sources below).
Compile and run AlgebraMain.par, a recursive descent parser for algebra.bnf
c) Write your own recursive descent parser for "Toy.bnf", compile
and run it.
You do not need the Cygwin install, or a DMS registration to accomplish this assignment.
- Thursday Sept 6: Lecture: Program Transformation theory
Reading: The Draco Approach to Constructing Software from Reusable Components
- Tuesday Sept 11: Lecture: Discuss Draco / PARLANSE 2
Assignment: PARLANSE: Build tree management module to support recursive descent parser
- Thursday Sept 13: Lecture: Types of specification and semantics
Reading: A 15 Year Perspective on Automatic Programming
- Tuesday Sep 18: Lecture: Discuss Balzer /Lexers / Java domain lexer
Assignment: Students pick languages (eventually projects), build lexers
- Thursday Sept 20: Lecture: Parsers / Java Domain Parser
Assignment: Students construct language grammars
Reading: The Graph Rewriting Language and Environment PROGRES
Project: Prepare/submit sketch of DMS project proposal. Identify language to process, and specific analysis/transformation task you might like to do. Email message to me is good enough. One paragraph describing the language and your experience with it. One paragraph describing the task, and (ignoring what DMS may or may not be capable of doing) some argument as to why it is technically feasible. The actual project is an attempt to achieve your goal.
- Tuesday Sept 25: Lecture: Discuss PROGRES
Transformation Mechanics: Rewriting over structures
(Assignment: continuing work on grammars)
- Thursday Sept 27: Lecture: The Control Problem: Choosing sequences of transformations
Reading: Stratego/XT 0.17. A Language and Toolset for Program Transformation
- Tuesday Oct 2: Lecture: ASTs and AST Interface
- Thursday Oct 4: Lecture: Makefiles and tool building
PARLANSE program to extract AST facts
Reading: The TXL Source Transformation Language
- Tuesday Oct 9: Lecture: PrettyPrinters / Java PrettyPrinter
Transformational Design
Reading: Design Maintenance Systems
- Thursday Oct 11: Lecture: Attribute Evaluation
Java Metrics Attribute Evaluator
Domain Engineering Tools
Fact collection program (metrics?)
Reading: Kant, E. Synthesis of mathematical-modeling software
- Tuesday Oct 16: Lecture: Symbol Tables
Symbol table construction
(Students choose projects)
- Thursday Oct 18: Lecture: Java Symbol Table
Reading: Feature-Oriented Programming and the AHEAD Tool Suite.
- Tuesday Oct 23: Lecture: Type inference
- Thursday Oct 25: Java Domain with Type Annotations
Reading: Rascal, A Domain Specific Language for Source Code Analysis and Manipulation
- Tuesday Oct 30: Lecture: Rule Specification Language
- Thursday Nov 1: Lecture: The Registry
- Tuesday Nov 6: Lecture: Data Flow Analysis
- Tuesday Nov 13: Student Paper presentations (20-30 minutes each)
- Thursday Nov 15: Student Paper presentations (20-30 minutes each)
- Tuesday Nov 20: Student Paper presentations (20-30 minutes each)
- Nov 22-23
- Tuesday Nov 27: Lecture: DMS Future?
- **** NOTE CHANGE OF SCHEDULE ****
- Thursday Nov 29: Presentations: Student Final Projects (20-30 minutes each)
- Tuesday Dec 4: Presentations: Student Final Projects (20-30 minutes each)
- Thursday Dec 6: Presentations: Student Final Projects (20-30 minutes each)
Accessing documents from this page
Students have reported trouble accessing documents through links
on this page. Make sure you access this page from the official
University course page www.cs.utexas.edu/graduate-program/courses/class-homepages or this link to get the right permissions.
Completed Lectures
Source Files
- A toy language grammar Toy.bnf
- A simple top-down parser in PARLANSE for Algebra
You can copy these files into a single directory anywhere on your DMS machine, compile and run this.
Simple algebra.bnf
Main Program AlgebraMain.par
Module AlgebraParser.par
Windows Command script compileit.cmd
Screenshot: compile_run_algebra_parser.jpg
Execute this by "run AlgebraMain".
- Editor support for Emacs and Notepad++ can be found in your DMS/Documentation/Installation directory. Editor support for VIM and Sublime can be found here