Publications

Here are the publications of our research group listed in order of publication date. Each entry has a citation, an abstract, and a hypertext link to a PDF copy of the paper. See the copyright notice.

To search multiple keywords separate
each keyword by "|" ex: keyword | keyword

Total Publications: 107

D. Batory. Using Modern Mathematics as an FOSD Modeling Langauge, Generative Programming and Component Engineering (GPCE), October 2008.

Modeling languages are a fundamental part of automated software development. MDD, for example, uses UML class diagrams and state machines as languages to define applications. In this paper, we explore how Feature Oriented Software Development (FOSD) uses modern mathematics as a modeling language to express the design and synthesis of programs in software product lines, but demands little mathematical sophistication from its users. Doing so has three practical benefits: (1) it offers a simple and principled mathematical description of how FOSD transforms, derives, and relates program artifacts, (2) it exposes previously unrecognized commuting relationships among tool chains, thereby providing new ways to debug tools, and (3) it reveals new ways to optimize software synthesis.

C.H.P. Kim, C. Kaestner, and D. Batory. On the Modularity of Feature Interactions, Generative Programming and Component Engineering (GPCE), October 2008.

Feature modules are the building blocks of programs in software product lines (SPLs). A foundational assumption of feature-based program synthesis is that features are composed in a predefined order. Recent work on virtual separation of concerns reveals a new model of feature interactions that shows that feature modules can be quantized as compositions of smaller modules called derivatives. We present this model and examine some of its unintuitive consequences, namely, that (1) a given program can be reconstructed by composing features in any order, and (2) the contents of a feature module (as expressed as a composition of derivatives) is determined automatically by a feature order. We show that different orders allow one to adjust the contents of a feature module to isolate and study the impact of interactions that a feature has with other features. Using derivatives, we show the utility of generalizing safe composition (SC), a basic analysis of SPLs that verifies program type-safety, to prove that every legal composition of derivatives (and thus any composition order of features) produces a type-safe program, which is a much stronger SC property. Click here for code downloads.

S. Apel, C. Kaestner, and D. Batory. Program Refactoring using Functional Aspects, Generative Programming and Component Engineering (GPCE), October 2008.

A functional aspect is an aspect that has the semantics of a transformation; it is a function that maps a program to an advised program. Functional aspects are composed by function composition. In this paper, we explore functional aspects in the context of aspect-oriented refactoring. We show that refactoring legacy applications using functional aspects is just as flexible as traditional aspects in that (a) the order in which aspects are refactored does not matter, and (b) the number of potential aspect interactions is decreased. We analyze several aspectoriented programs of different sizes to support our claims.

D. Batory and E. Boerger. Modularizing Theorems for Software Product Lines: The Jbook Case Study to appear, Jour. Universal Computer Science (JUCS).

A goal of software product lines is the economical assembly of programs in a family of programs. In this paper, we explore how theorems about program properties may be integrated into feature-based development of software product lines. As a case study, we analyze an existing Java/JVM compilation correctness proof for defining, interpreting, compiling, and executing bytecode for the Java language. We show how features modularize program source, theorem statements and their proofs. By composing features, the source code, theorem statements and proofs for a program are assembled. The investigation in this paper reveals a striking similarity of the refinement concepts used in Abstract State Machines (ASM) based system development and Feature-Oriented Programming (FOP) of software product lines. We suggest to exploit this observation for a fruitful interaction of researchers in the two communities.

G. Freeman, D. Batory, and G. Lavender Lifting Transformational Models of Product Lines: A Case Study. Int. Conference on Model Transformations (ICMT), 2008.

Model driven development (MDD) of software product lines (SPLs) merges two increasing important paradigms that synthesize programs by transformation. MDD creates programs by transforming models, and SPLs elaborate programs by applying transformations called features. In this paper, we present the design and implementation of a transformational model of a product line of scalar vector graphics and JavaScript applications. We explain how we simplified our implementation by lifting selected features and their compositions from our original product line (whose implementations were complex) to features and their compositions of another product line (whose specifications were simple). We used operators to map higher-level features and their compositions to their lower-level counterparts. Doing so exposed commuting relationships among transformations (features) in both product lines that helped validate our model and implementation.

S. Apel and D. Batory How AspectJ is Used: An Analysis of Eleven AspectJ Programs. TR MIP-0801, Dept. Informatics and Mathematics, University of Passau, Germany, April 2008.

While it is well-known that crosscutting concerns occur in many software projects, little is known on how aspect-oriented program- ming and in particular AspectJ have been used. In this paper, we analyze eleven AspectJ programs by different authors to answer the questions: which mechanisms are used, to what extent, and for what purpose. We found the code of these programs to be on average 86% objectoriented, 12% basic AspectJ mechanisms (introductions and method extensions), and 2% advanced AspectJ mechanisms (homogeneous advice or advanced dynamic advice). There is one class of crosscutting concerns { which is mostly concerned with introductions and method extensions { that matches this result well: collaborations. These results and our discussions with program authors indicate the bulk of coding activities was implementing collaborations. Several studies and researchers suggest that languages explicitly supporting collaborations are better suited than aspects a la AspectJ for this task.

C.H.P. Kim, K. Czarnecki, and D. Batory On-Demand Materialization of Aspects for Application Development. Software Engineering Properties of Languages and Aspect Technologies (SPLAT), 2008.

Framework-based application development requires applications to be implemented according to rules, recipes and conventions that are documented or assumed by the framework’s Application Programming Interface (API), thereby giving rise to systematic usage patterns. While such usage patterns can be captured cleanly using conventional aspects, their variations, which arise in exceptional conditions, typically cannot be. In this position paper, we propose material- izable aspects as a solution to this problem. A materializable aspect behaves as a normal aspect for default joinpoints, but for exceptional joinpoints, it turns into a program transformation and analysis mechanism, with the IDE transforming the advice in-place and allowing the application developer to modify the materialized advice within the semantics of the aspect. We demonstrate the need for materializable aspects through a preliminary study of open-source SWT-based applications and describe our initial implementation of materializable aspects in Eclipse.

D. Batory and D. Smith. Finite Map Spaces and Quarks: Algebras of Program Structure. University of Texas at Austin, Dept. of Computer Sciences, TR-07-66.

We present two algebras that unify the disparate software composition models of Feature-Oriented Programming and Aspect-Oriented Programming. In one algebra, a finite map space underlies program synthesis, where adding finite maps and modifying their contents are fundamental operations. A second and more general algebra uses quarks, a construct that represents both expressions and their transformations. Special cases of our algebras correspond to existing systems and languages, and thus can serve as a foundation for next-generation tools and languages for feature-based program synthesis.

S. Apel, C. Lengauer, D. Batory, B. Möller, C. Kästner, An Algebra for Feature-Oriented Software Development. Fakultät für Informatik und Mathematik, Universität Passau, MIP-0706, 2007.

Feature-Oriented Software Development (FOSD) provides a multitude of formalisms, methods, languages, and tools for building variable, customizable, and extensible software. Along different lines of research different ideas of what a feature is have been developed. Although the existing approaches have similar goals, their representations and formalizations have not been integrated so far into a common framework. We present a feature algebra as a foundation of FOSD. The algebra captures the key ideas and provides a common ground for current and future research in this field, in which also alternative options can be explored.

keywords

E. Uzuncaova, D. Garcia, S. Khurshid, and D. Batory , A Specification-Based Approach to Testing Software Product Lines (Poster Paper) . SIGSOFT FSE, 2007.

This paper presents a specification-based approach for systematic testing of products from a software product line. Our approach uses specifications given as formulas in Alloy, a first-order logic based on relations. Alloy formulas can be checked for satisfiability using the Alloy Analyzer. The fully automatic analyzer, given an Alloy formula and a scope, i.e., a bound on the universe of discourse, searches for an instance, i.e., a valuation to the relations in the formula such that it evaluates to true. The analyzer translates an Alloy formula (for the given scope) to a propositional formula and finds an instance using an off-the-shelf SAT solver. The use of an enumerating solver enables systematic test generation. We have developed a prototype based on the AHEAD theory. The prototype uses the recently developed Kodkod model finding engine of the Alloy Analyzer. We illustrate our approach using a data structure product line.

S. Thaker, D. Batory, D. Kitchin, and W. Cook, Safe Composition of Product Lines , Generative Programming and Component Engineering (GPCE), October 2007.

Programs of a software product line can be synthesized by composing modules that implement features. Besides high-level domain constraints that govern the compatibility of features, there are also low-level implementation constraints: a feature module can reference elements that are defined in other feature modules. Safe composition is the guarantee that all programs in a product line are type safe: i.e., absent of references to undefined elements (such as classes, methods, and variables). We show how safe composition properties can be verified for AHEAD product lines using feature models and SAT solvers.

C. Kaestner, S. Apel, D. Batory, A Case Study Implementing Features Using AspectJ, Software Product Line Conference (SPLC), September 2007.

Software product lines aim to create highly configurable programs from a set of features. Common belief and recent studies suggest that aspects are well-suited for implementing features. We evaluate the suitability of AspectJ with respect to this task by a case study that refactors the embedded database system Berkeley DB into 38 features. Contrary to our initial expectations, the results were not encouraging. As the number of aspects in a feature grows, there is a noticeable decrease in code readability and maintainability. Most of the unique and powerful features of AspectJ were not needed. We document where AspectJ is unsuitable for implementing features of refactored legacy applications and explain why.

S. Trujillo, D. Batory, O. Diaz, Feature Oriented Model Driven Development: A Case Study for Portlets, International Conference on Software Engineering (ICSE), 2007.

Model Driven Development (MDD) is an emerging paradigm for software construction that uses models to specify programs, and model transformations to synthesize executables. Feature Oriented Programming (FOP) is a paradigm for software product lines where programs are synthesized by composing features. Feature Oriented Model Driven Development (FOMDD) is a blend of FOP and MDD that shows how products in a software product line can be synthesized in an MDD way by composing models from features, and then transforming these models into executables. We present a case study of FOMDD on a product line of portlets, which are components of web portals. We reveal mathematical properties of portlet synthesis that helped us to validate the correctness of our abstractions, tools, and specifications, as well as optimize portlet synthesis.

D. Batory Program Refactorings, Program Synthesis, and Model-Driven Design . Keynote at European Joint Conferences on Theory and Practice of Software (ETAPS) Compiler Construction Conference (April 2007)

Program refactoring, feature-based and aspect-oriented software synthesis, and model-driven development are disjoint research areas. However, they are all architectural metaprogramming technologies as they treat programs as values and use functions (a.k.a. transformations) to map programs to other programs. In this paper, I explore their underlying connections by reviewing recent advances in each area from an architectural metaprogramming perspective. I conjecture how these areas can converge and outline a theory that may unify them.

D. Batory From Implementation to Theory in Product Synthesis . Keynote at Principles of Programming Languages (POPL) January 2007

Future software development will rely on product synthesis, i.e., the synthesis of code and non-code artifacts for a target component or application. Prior work on feature-based product synthesis can be unified by applying elementary ideas from category theory. Doing so reveals (a) important and previously unrecognized properties that product synthesis tools must satisfy, and (b) non-obvious generalizations of current techniques that will guide future research efforts in automated product development.

R. Lopez-Herrejon and D. Batory, Modeling Features in Aspect-Based Product Lines with Use Case Slices: An Exploratory Case Study." Aspect Oriented Modeling Workshop at Model Driven Engineering Languages and Systems (MoDELS). Genoa, Italy 2006. Also, in Models in Software Engineering. Workshops and Symposia at MoDELS 2006, Genoa, Italy, October 1-6, 2006, Reports and Revised Selected Papers. Series: Lecture Notes in Computer Science , Vol. 4364 Sublibrary: Programming and Software Engineering 2007.

A significant number of techniques that exploit aspects in software design have been proposed in recent years. One technique is use case slices by Jacobson and Ng, that builds upon the success of use cases as a common modeling practice. A use case slice modularizes the implementation of a use case and typically consists of a set of aspects, classes, and interfaces. Work on Feature Oriented Programming (FOP) has shown how features, increments in program functionality, can be modularized and algebraically modeled for the synthesis of product lines. When AspectJ is used in FOP, the structure of feature modules resembles that of use case slices. In this paper, we explore the relations between use case slices modeling and FOP program synthesis and describe their potential synergy for modeling and synthesizing aspect-based product lines.

D. Batory, D. Benavides, and A. Ruiz-Cortes. "Automated Analyses of Feature Models: Challenges Ahead", CACM, Special Issue on Software Product Lines, December 2006.

This paper describes recent advances in formalizing feature models and the challenges ahead in automating product specification and design.

S. Apel and J. Liu. "On the Notion of Functional Aspects in Aspect-Oriented Refactoring", Workshop on Aspects, Dependencies, and Interactions (ADI), Nantes, France, 2006.

In this paper, we examine the notion of functional aspects in context of aspect-oriented refactoring. Treating aspects as functions reduces the potential interactions between aspects significantly. We propose a simple mathematical model that incorporates fundamental properties of aspects and their interactions. Using this model, we show that with regard to refactoring functional aspects are as flexible as traditional aspects, but with reduced program complexity.

S. Apel, D. Batory, and M. Rosenmueller. "On the Structure of Crosscutting Concerns:Using Aspects or Collaborations?", 1st Workshop on Aspect-Oriented Product Line Engineering (AOPLE), October 2006.

While it is well known that crosscutting concerns occur in many software projects, little is known about the inherent properties of these concerns nor how aspects (should) deal with them. We present a framework for classifying the structural properties of crosscutting concerns into (1) those that benefit from AOP and (2) those that should be implemented by OOP mechanisms. Further, we propose a set of code metrics to perform this classification. Applying them to a case study is a first to step toward revealing the current practice of AOP.

G.T. Leavens, J-R Abrial, D. Batory, M. Butler, A. Coglio, K. Fisler, E. Hehner, C. Jones, D. Miller, S. Peyton-Jones, M. Sitaraman, D.R. Smith, and A. Stump, "Roadmap for Enhanced Languages and Methods to Aid Verification", Generative Programming and Component Engineering (GPCE), October 2006.

This road map describes ways that researchers in four areas - specification languages, program generation, correctness by construction, and programming languages - might help further the goal of verified software. It also describes what advances the "verified softwar" grand challenge might anticipate or demand from work in these areas. That is, the roadmap is intended to help foster collaboration between the grand challenge and these research areas. A common goal for research in these areas is to establish language designs and tool architectures that would allow multiple annotations and tools to be used on a single program. In the long term, researchers could try to unify these annotations and integrate such tools.

S. Apel and D. Batory. When to Use Features and Aspects? A Case Study ,Generative Programming and Component Engineering (GPCE), October 2006.

Aspect-Oriented Programming (AOP) and Feature-Oriented Programming (FOP) are complementary technologies that can be combined to overcome their individual limitations. Aspectual Mixin Layers (AML) is a representative approach that unifies AOP and FOP. We use AML in a non-trivial case study to create a product line of overlay networks. We also present a set of guidelines to assist programmers in how and when to use AOP and FOP techniques for implementing product lines in a stepwise and generative manner.

S. Trujillo, D. Batory, and O. Diaz. Feature Refactoring a Multi-Representation Program into a Product Line , Generative Programming and Component Engineering (GPCE), October 2006.

Feature refactoring is the process of decomposing a program into a set of modules, called features, that encapsulate increments in program functionality. Different compositions of features yield different programs. As programs are defined using multiple representations, such as code, makefiles, and documentation, feature refactoring requires all representations to be factored. Thus, composing features produces consistent representations of code, makefiles, documentation, etc. for a target program. We present a case study of feature refactoring a substantial tool suite that uses multiple representations. We describe the key technical problems encountered, and sketch the tool support needed for simplifying such refactorings in the future.

D. Batory. Multi-Level Models in Model Driven Development, Product-Lines, and Metaprogramming , IBM Systems Journal, Volume 45, Number 3, 2006.

Model Driven Development (MDD) aims to raise the level of abstraction in program specification and increase automation in program development. These are also the goals of product-lines (creating a family of related programs) and metaprogramming (programming as a computation). We show that the confluence of MDD, product-lines, and metaprogramming exposes a multilevel paradigm of program development, and further, we can use OO design techniques to represent programs, the metaprograms that produced these programs, and the meta-meta programs that produced these metaprograms, recursively. The paradigm is based on a small number of simple and well-known ideas, scales to the synthesis of applications of substantial size, and helps clarify concepts of MDD.

J. Liu, D. Batory, and C. Lengauer. "Feature Oriented Refactoring of Legacy Applications", International Conference on Software Engineering 2006 (ICSE), Shanghai, China.

Feature oriented refactoring (FOR) is the process of decomposing a program into features, where a feature is an increment in program functionality. We develop a theory of FOR that relates code refactoring to algebraic factoring. Our theory explains relationships between features and their implementing modules, and why features in different programs of a product-line can have different implementations. We describe a tool and refactoring methodology based on our theory, and present a validating case study.

R. Lopez-Herrejon, D. Batory, and C. Lengauer. "A Disciplined Approach to Aspect Composition", ACM SIGPLAN 2006 Symposium on Partial Evaluation and Program Manipulation (PEPM), January 2006, Charleston, SC.

Aspect-oriented programming is a promising paradigm that challenges traditional notions of program modularity. Despite its increasing acceptance, aspects have been documented to suffer limited reuse, hard to predict behavior, and difficult modular reasoning. We develop an algebraic model that relates aspects to program transformations and uncovers aspect composition as a significant source of the problems mentioned. We propose an alternative model of composition that eliminates these problems, preserves the power of aspects, and lays an algebraic foundation on which to build and understand AOP tools.

M. Grechanik, D.E. Perry, and D. Batory. A Scalable Security Mechanism For Component-Based Systems, Int. Conf. COTS-Based Software Systems, April 2006, Orlando FL.

Security, scalability, and performance are critical for large-scale component-based applications. Weaving security solutions into the fabric of component-based architectures often worsens the scalability and performance of the resulting system. In this paper we analyze the sources of non-scalability and conduct an empirical study that shows that close to 80% of interactions between components and their clients in different commercial systems occur within protected security boundaries. Based on these findings we propose a novel scalable security mechanism for component based systems called Component Adaptive Scalable Secure Infrastructure Architecture (CASSIA). CASSIA utilizes the topology of the security boundaries and patterns of interactions among components to achieve noticeable improvements in scalability and performance for component-based applications. We conduct a case study that confirms the scalability of CASSIA, and propose a Secure COmponent Protocol (SCOP) that incorporates our mechanism into a component infrastructure.

D. Batory.A Tutorial on Feature Oriented Programming and the AHEAD Tool Suite, Proc. Summer School on Generative and Transformation Techniques in Software Engineering, Braga, Portugal, R. Laemmel, J. Saraiva, and J. Visser, ed., Vol 4143, Lecture Notes in Computer Science, Springer-Verlag, 2006.

Feature oriented programming (FOP) is the study of feature modularity and its use in program synthesis. AHEAD is a theory of FOP that is based on a fundamental concept of generative programming that functions map programs. This enables the design of programs to be expressed compositionally as algebraic expressions, which are suited for automated analysis, manipulation, and program synthesis. This paper is a tutorial on FOP and AHEAD. We review AHEAD's theory and the tool set that implements it.

E. Jung, C. Kapoor, and D. Batory. "Automatic Code Generation for Actuator Interfacing from a Declarative Specification", International Conference on Robotics and Systems (IROS), August 2005, Edmonton, Canada.

Common software design practices use object-oriented (OO) frameworks that structure software in terms of objects, classes, and package; designers then create programs by inheritance and composition of classes and objects. Operational Software Components for Advanced Robotics (OSCAR) is one such framework for robot control software with abstractions for generalized kinematics, dynamics, performance criteria, decision making, and hardware interfacing. Even with OSCAR, writing new programs still requires a significant amount of manual labor. Feature-Oriented Programming (FOP) is method for software design that models and specifies programs in terms of features, where a feature encapsulates the common design decisions that occur in a domain. A set of features then forms a domain model for a Product Line Architecture. Product variants in this product line can then be generated from a declarative specification. FOP and related technologies are emerging software engineering techniques for automatically generating programs. Our research applies FOP to robot controller software. As an example, the domain of hardware interfacing is analyzed and 41 features identified. A GUI for specifying and generating programs is presented as well. Analysis of features shows 200 possible different programs could be generated.

D. Batory. "Feature Models, Grammars, and Propositional Formulas", Software Product Line Conference (SPLC) September 2005.

Feature models are used to specify members of a product-line. Despite years of progress, contemporary tools provide limited support for feature constraints and offer little or no support for debugging feature models. We integrate prior results to connect feature models, grammars, and propositional formulas. This connection allows arbitrary propositional constraints to be defined among features and enables off-the-shelf satisfiability solvers to debug feature models. We also show how our ideas can generalize recent results on the staged configuration of feature models.

J. Liu, D. Batory, and S. Nedunuri. Modeling Interactions in Feature Oriented Designs, International Conference on Feature Interactions (ICFI), June 2005.

Feature Oriented Programming (FOP) is a general theory of software development where programs are assembled by composing feature modules. A feature X interacts structurally with another feature Y by changing Y's source code. We advance FOP by proposing an algebraic theory of structural feature interactions that models feature interactions as derivatives. We use our theory to show how a legacy Java application can be refactored into a feature-based design.

R. Lopez-Herrejon, D. Batory, and W. Cook. "Evaluating Support for Features in Advanced Modularization Technologies", European Conference on Object-Oriented Programming (ECOOP), July 2005.

A software product-line is a family of related programs. Each program is defined by a unique combination of features, where a feature is an incremental change in program functionality. Modularizing features is difficult, as feature-specific code often cuts across class boundaries. New modularization technologies have been proposed in recent years, but their support for feature modules has not been thoroughly examined. In this paper, we propose a variant of the expression problem as a canonical problem in product-line design. The problem reveals a set of technology-independent properties that feature modules should exhibit. We use these properties to evaluate five technologies: AspectJ, Hyper/J, Jiazzi, Scala, and AHEAD. The results suggest an abstract model of feature composition that is technology-independent and that relates compositional reasoning with algebraic reasoning.

M. Grechanik, D.E. Perry, and D. Batory. Using AOP to Monitor and Administer Software for Grid Computing Environments, Int. Computer Software and Applications Conference (COMPSAC), Edinburgh, Scotland, July 2005.

Monitoring is a task of collecting measurements that reflect the state of a system. Administration is a collection of tasks for control and manipulation of computer systems. Monitoring and Administering computer ResourceS (MARS) in a distributed grid computing environment (i.e. a distributed environment for coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations) is an important, expensive, and critical task. We present a novel solution based on applying crosscuts using binary rewriters and an event-based model that allows developers to create non-trivial MARS programs easily and uniformly.

Our approach converts low-level API resource calls into system-wide events that MARS programs can monitor. This is accomplished by introducing advice that contains event-generating code at join points in programs that represent computer resources. We categorize low-level resource APIs by imposing a transactional metaphor to simplify the complexity of interactions between resources and to enable reasoning about MARS programs. We report both a case study and simulation that supports the viability of our approach.

R. Lopez-Herrejon and D. Batory. Improving Incremental Development in Aspectj by Bounding Quantification, Software Engineering Properties and Languages for Aspect Technologies (SPLAT), March 2005.

Incremental software development is a process of building complex programs from simple ones by successively adding programmatic details. It is an effective and common design practice that helps control program complexity. However, incrementally building software using aspects can be more challenging than using traditional modules. AspectJ quantification mechanisms do not distinguish different developmental stages, and thus pointcuts can capture join points from later stages that they originally were not intended to advice.

In this paper we present an algebraic model to describe aspects and their composition in AspectJ. We show that the way AspectJ's composes aspects plays a significant role in this problem. We propose an alternative model to compose aspects that improves the support for incremental development. It bounds the scope of quantification and still preserves the same power of AspectJ. We also show how bounded quantification contributes to aspect reuse.

J. Blair and D. Batory. A Comparison of Generative Approaches: XVCL and GenVoca, Technical Report December 2004.

We report one of the first comparative studies on two mature approaches to generative programming: XVCL and GenVoca. XVCL is the latest incarnation of Bassett's frames; GenVoca melds feature modularity with compositional programming. Both approaches explicitly rest on a pair of ideas: (1) parameterized functions return source code and (2) composing such functions synthesizes applications. Although their foundations are identical, XVCL and GenVoca have very different design philosophies. These differences raise an interesting debate on what design methodologies and programming constructs should be used in writing generative programs.

Code for this paper is here.

D. Batory A Science of Software Design . Keynote at International Conference on Algebraic Methodology And Software Technology (AMAST) 2004

Underlying large-scale software design and program synthesis are simple and powerful algebraic models. In this paper, I review the elementary ideas upon which these algebras rest and argue that they define the basis for a science of software design.

D. Batory Program Comprehension in Generative Programming: A History of Grand Challenges . Keynote at International Workshop on Program Comprehension (IWPC) 2004

The communities of Generative Programming (GP) and Program Comprehension (PC) look at similar problems: GP derives a program from a specification, PC derives a specification from a program. A basic difference between the two is GP’s use of specific knowledge representations and mental models that are essential for program synthesis. In this paper, I present a historical review of the Grand Challenges, results, and outlook for GP as they pertain to PC.

J. Liu and D. Batory. " Automatic Remodularization and Optimized Synthesis of Product-Families, Generative Programming and Component Engineering (GPCE) ,October 2004.

A product-family is a suite of integrated tools that share a common infrastructure. Program synthesis of individual tools can replicate common code, leading to unnecessarily large executables and longer build times. In this paper, we present remodularization techniques that optimize the synthesis of product-families. We show how tools can be automatically remodularized so that shared files are identified and extracted into a common package,and how isomorphic class inheritance hierarchies can be merged into a single hierarchy. Doing so substantially reduces the cost of program synthesis, product-family build times, and executable sizes. We present a case study of a product-family with five tools totalling over 170K LOC, where our optimizations reduce archive size and build times by 40%.

D. Batory, J.N. Sarvela, and A. Rauschmayer. Scaling Step-Wise Refinement, IEEE Transactions on Software Engineering (IEEE TSE), June 2004.

Step-wise refinement is a powerful paradigm for developing a complex program from a simple program by adding features incrementally. We present the AHEAD (Algebraic Hierarchical Equations for Application Design) model that shows how step-wise refinement scales to synthesize multiple programs and multiple noncode representations. AHEAD shows that software can have an elegant, hierarchical mathematical structure that is expressible as nested sets of equations. We review a tool set that supports AHEAD. As a demonstration of its viability, we have bootstrapped AHEAD tools from equational specifications, refining Java and non-Java artifacts automatically; a task that was accomplished only by ad hoc means previously.

M. Grechanik, D. Batory, and D.E. Perry.Design of Large-Scale Polylingual Systems. International Conference on Software Engineering (ICSE), Edinburgh, Scotland, UK, May 2004.

Building systems from existing applications written in two or more languages is common practice. Such systems are polylingual. Polylingual systems are relatively easy to build when the number of APIs needed to achieve language interoperability is small. However, when the number of distinct APIs become large, maintaining and evolving polylingual systems becomes a notoriously difficult task. In this paper, we present a simple, practical, and effective way to develop, maintain, and evolve large-scale polylingual systems. Our approach relies on recursive type systems whose instances can be manipulated by reflection. Foreign objects (i.e. objects that are not defined in a host programming language) are abstracted as graphs and path expressions are used for accessing and manipulating data. Path expressions are implemented by type reification ¡ª turning foreign type instances into first-class objects and enabling access to and manipulation of them in a host programming language. Doing this results in multiple benefits, including coding simplicity and uniformity that we demonstrate in a complex commercial project.

D. Batory. Feature-Oriented Programming and the AHEAD Tool Suite, International Conference on Software Engineering (ICSE), Edinburgh, Scotland, 2004.

Feature Oriented Programming (FOP) is an emerging paradigm for application synthesis, analysis, and optimization. A target application is specified declaratively as a set of features, like many consumer products (e.g., personal computers, automobiles). FOP technology translates such declarative specifications into efficient programs.

AHEAD is a model of FOP that is based on step-wise refinement, which advocates that complex programs can be synthesized from simple programs by incrementally adding features. The AHEAD Tool Suite (ATS) supports program development in AHEAD. AHEAD and ATS are among the most advanced models/tools for large-scale program synthesis.

M. Grechanik, D.E. Perry, and D. Batory.Reengineering Large-Scale Polylingual Systems. International Workshop on Incorporating COTS into Software Systems: Tools and Techniques (IWICSS) co-located with the 3rd International Conference on COTS-Based Software Systems (ICCBSS), Redondo Beach, CA, February 2004.

Building systems from existing applications written in two or more languages is common practice. Such systems are polylingual. Polylingual systems are relatively easy to build when the number of APIs needed to achieve language interoperability is small. However, when the number of distinct APIs become large, maintaining and evolving them becomes a notoriously difficult task. We present a practical and effective process to reengineer large-scale polylingual systems. We offer a programming model that is based on the uniform abstraction of polylingual systems as graphs and the use of path expressions for traversing and manipulating data. This model enables us to achieve multiple benefits, including coding simplicity and uniformity (where neither was present before) that facilitate further reverse engineering. By performing control and data flow analyses of polylingual systems we infer the schemas used by all participating programs and the actions performed by each program on others. Finally, we describe a tool called FORTRESS that automates our reverse engineering process. The contribution of this paper is a process that allows programmers to reverse engineer foreign type systems and their instances semiautomatically at the highest level of design. We know of no other approach with comparable benefits.

D. Batory, J. Liu, J.N. Sarvela. Refinements and Multi-Dimensional Separation of Concerns, ACM SIGSOFT FSE 2003.

Step-wise refinement (SWR) asserts that complex programs can be derived from simple programs by progressively adding features. The length of a program specification is the number of features that the program has. Critical to the scalability of SWR are multi-dimensional models that separate orthogonal feature sets. Let n be the dimensionality of a model and k be the number of features along a dimension. We show program specifications that could be O(k**n) features long have short and easy-to-understand specifications of length O(kn) when multi-dimensional models are used. We present new examples of multidimensional models: a micro example of a product-line (whose programs are 30 lines of code) and isomorphic macro examples (whose programs exceed 30K lines of code). Our work provides strong evidence that SWR scales to synthesis of large systems.

D. Batory, J.N. Sarvela, and A. Rauschmayer. Scaling Step-Wise Refinement, International Conference on Software Engineering (ICSE), Portland, Oregon, 2003.

Step-wise refinement is a powerful paradigm for developing a complex program from a simple program by adding features incrementally. We present the AHEAD (Algebraic Hierarchical Equations for Application Design) model that shows how step-wise refinement scales to synthesize multiple programs and multiple non-code representations. AHEAD shows that software can have an elegant, hierarchical mathematical structure that is expressible as nested sets of equations. We review a tool set that supports AHEAD. As a demonstration of it's viability, we have bootstrapped AHEAD tools solely from equational specifications, generating Java and non-Java artifacts automatically, a task that was accomplished only by ad hoc means previously.

D. Batory, J.N. Sarvela, and A. Rauschmayer. Scaling Step-Wise Refinement, International Conference on Software Engineering (ICSE), Portland, Oregon, 2003.

Step-wise refinement is a powerful paradigm for developing a complex program from a simple program by adding features incrementally. We present the AHEAD (Algebraic Hierarchical Equations for Application Design) model that shows how step-wise refinement scales to synthesize multiple programs and multiple non-code representations. AHEAD shows that software can have an elegant, hierarchical mathematical structure that is expressible as nested sets of equations. We review a tool set that supports AHEAD. As a demonstration of it's viability, we have bootstrapped AHEAD tools solely from equational specifications, generating Java and non-Java artifacts automatically, a task that was accomplished only by ad hoc means previously.

M. Grechanik, D.E. Perry, and D. Batory. An Approach to Evolving Database Dependent Systems . International Workshop on Principles of Software Evolution, Orlando, Florida, May 2002.

It is common in client/server architectures for components to have SQL statements embedded in their source code. Components submit queries to relational databases using such standards as Universal Data Access (UDA) and Open Database Connectivity (ODBC). The API that implements these standards is complex and requires the embedding of SQL statements in the language that is used to write the components. Such programming practices are widespread and result in increased complexity in maintaining systems. We propose an approach that eliminates the embedding of SQL in programming languages, thereby enabling the automation of important software maintenance tasks.

D. Batory, R. Lopez-Herrejon, and J-P. Martin. Generating Product-Lines of Product-Families, Automated Software Engineering Conference (ASE), 2002. Extended draft is here.

GenVoca is a methodology and technology for generating product-lines, i.e., building variations of a program. The primitive components from which applications are constructed are refinements or layers, which are modules that implement a feature that many programs of a product-line can share. Unlike conventional components (e.g., COM, CORBA, EJB), a layer encapsulates fragments of multiple classes. Sets of fully-formed classes can be produced by composing layers. Layers are modular, albeit unconventional, building blocks of programs.But what are the building blocks of layers? We argue that facets is an answer. A facet encapsulates fragments of multiple layers, and compositions of facets yields sets of fully-formed layers. Facets arise when refinements scale from producing variants of individual programs to producing variants of multiple, integrated programs, as typified by product families (e.g., MS. Office).

We present a mathematical model that explains relationships between layers and facets. We use the model to develop a generator for tools (i.e., product-family) that are used in language-extensible Integrated Development Environments (IDEs).

D. Batory, C. Johnson, B. MacDonald, and D. von Heeder. Achieving Extensibility Through Product-Lines and Domain-Specific Languages: A Case Study ", ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 11#2, April 2002, 191-214.

This is a case study in the use of product-line architectures (PLAs) and domain-specific languages (DSLs) to design an extensible command-and-control simulator for Army fire support. The reusable components of our PLA are layers or "aspects" whose addition or removal simultaneously impacts the source code of multiple objects in multiple, distributed programs. The complexity of our component specifications is substantially reduced by using a DSL for defining and refining state machines, abstractions that are fundamental to simulators. We present preliminary results that show how our PLA and DSL synergistically produce a more flexible way of implementing state-machine-based simulators than is possible with a pure Java implementation.

M.Grechanik, D. Batory, and Dewayne E. Perry. Integrating and Reusing GUI-Driven Applications, International Conference on Software Reuse (ICSR), Austin, Texas, April 2002.

Graphical User Interface (GUI) Driven Applications (GDAs) are ubiquitous. We present a model and techniques that take closed and monolithic GDAs and integrate them into an open, collaborative environment. The central idea is to objectify the GUI of a GDA, thereby creating an object that enables programmatic control of that GDA.

We demonstrate a non-trivial application of these ideas by integrating a standalone internet application with a stand-alone Win32 application, and explain how PDAs (Personal Digital Assistants) can be used to remotely control their combined execution. Further, we explain how Integrated Development Environment (IDEs) may be extended to integrate and reuse GDAs using our approach. We believe our work is unique: we know of no other technology that could have integrated the GDAs of our example.

Y.Smaragdakis and D. Batory. Mixin Layers: An Object-Oriented Implementation Technique for Refinements and Collaboration-Based Designs, ACM Transactions on Software Engineering and Methodology (TOSEM), April 2002.

A "refinement" is a functionality addition to a software project that can affect multiple dispersed implementation entities (functions, classes, etc.). In this paper, we examine large-scale refinements in terms of a fundamental object-oriented technique called collaboration-based design. We explain how collaborations can be expressed in existing programming languages or be supported with new language constructs (which we have implemented as extensions to the Java language). We present a specific expression of large-scale refinements called mixin layers, and demonstrate how it overcomes the scalability difficulties that plagued prior work. We also show how we used mixin layers as the primary implementation technique for building an extensible Java compiler, JTS.

A.Rauschmayer. Probe: Visualizing Algebraic Transformations in the GenBorg Generative Software Environment ,Institut fuer Informatik Ludwig Maximilians Universitaet Muenchen, December, 2001.

Software is becoming increasingly pervasive and complex. The greatest challenge for the industry today is how to produce more code without compromising quality or cost. GenBorg is a programming paradigm whose answer is to scale the units of reuse from components of individual applications to components of application suites (e. g., MS Office). While traditional approaches trade size of code units for flexibility, GenBorg uses techniques from generative programming and feature-oriented programming in order to provide models that instantiate code and non-code artifacts of customizable application suites. Because developing software is not limited to writing code, but current ways of automating development are, a lot of human energy is wasted for keeping related data such as documentation and formal properties consistent with source code. GenBorg manages to evolve all these artifacts automatically and in parallel. GenBorg's ideas are backed by a theoretical foundation, the GenBorg algebra. The Java application Probe is a first implementation of the GenBorg programming paradigm and uses a simple, file-based data structure for defining GenBorg models. Its graphical user interface makes it easy to navigate through and instantiate from large models.

D. Batory, D. Brant, M. Gibson, and M. Nolen. ExCIS: An Integration of Domain-Specific Languages and Feature-Oriented Programming, Workshop on New Visions for Software Design and Productivity: Research and Applications, Vanderbilt University, Nashville, Tennessee, December 13-14, 2001.

ExCIS is a methodology and tool suite that integrates the technologies of domain-specific languages (DSLs) and feature-oriented programming (FOP). DSLs offer compact specifications of programs in domain-specific notations that are easier to write, maintain, and evolve. FOP raises the level of abstraction in system programming from compositions of code-centric components to compositions of modules that implement individual and largely orthogonal features. ExCIS provides state-of-the-art tools for creating easier-to-specify, easier-to-maintain, and easier-to-change systems. It is being developed for STRICOM to create next-generation simulators for the U.S. Army.

R. E. Lopez-Herrejon and D. Batory. A Standard Problem for Evaluating Product-Line Methodologies, Generative and Component-Based Software Engineering (GCSE/GPCE), September 9-13, 2001 Messe Erfurt, Erfurt, Germany.

We propose a standard problem to evaluate product-line methodologies. It relies on common knowledge from Computer Science, so that domain-knowledge can be easily acquired, and it is complex enough to expose the fundamental concepts of product-line methodologies. As a reference point, we present a solution to this problem using the GenVoca design methodology. We explain a series of modeling, implementation, and benchmarking issues that we encountered, so that others can understand and compare our solution with theirs.

K. Fisler, S. Krishnamurthi, D. Batory, and J. Liu. A Model Checking Framework for Layered Command and Control Software, Workshop on Engineering Automation for Software Intensive System Integration, Monterrey, California, June 18-22, 2001.

Most existing modular model checking techniques betray their hardware roots: they assume that modules compose in parallel. In contrast, layered software systems, which have proven successful in many domains, are really quasi-sequential compositions of parallel compositions. Most such systems demand and inspire new modular verification techniques. This paper presents algorithms that exploit a layered (or feature-based) decomposition to drive verification. Our technique can verify most properties locally within a layer; we also characterize when a global state space construction is unavoidable. This work is motivated by our efforts to verify a military fire simulation and support software system called FSATS.

K. Fisler, S. Krishnamurthi, and D. Batory. Verifying Component-Based Collaboration Designs, 4th ICSE Workshop on Component-Based Software Engineering: Component Certification and System Prediction. Toronto, Ontario, Canada, May 2001.

Y. Smaragdakis and D. Batory. Mixin-Based Programming in C++, Second International Symposium on Generative and Component-Based Software Engineering (GCSE now GPCE), Erfurt, Germany, October 9-12, 2000.

Combinations of C++ features, like inheritance, templates, and class nesting, allow for the expression of powerful component patterns. In particular, research has demonstrated that, using C++ mixin classes, one can express layered component-based designs concisely with efficient implementations. In this paper, we discuss pragmatic issues related to component-based programming using C++ mixins. We explain surprising interactions of C++ features and policies that sometimes complicate mixin implementations, while other times enable additional functionality without extra effort.

D. Batory, C. Johnson, B. MacDonald, and D. von Heeder., FASTS: An Extensible C4I Simulator for Army Fire Support, Workshop on Product Lines for Command-and-Control Ground Systems at the First International Software Product Line Conference (SPLC) in Denver, Colorado, August 2000.

L. Tokuda and D. Batory. Evolving Object-Oriented Designs with Refactorings. Journal of Automated Software Engineering 8, 89-120, 2001. This is an enlarged version of our October 1999 ASE Conference paper.

Refactorings are behavior-preserving program transformations that automate design evolution in object-oriented applications. Three kinds of design evolution are: schema transformations, design pattern micro-architectures, and the hot-spot-driven-approach. This research shows that all three are automatable with refactorings. A comprehensive list of refactorings for design evolution is provided and an analysis of supported schema trans-formations, design patterns, and hot-spot meta patterns is presented. Further, we evaluate whether refactoring technology can be transferred to the mainstream by restructuring non-trivial C++ applications. The applications that we examine were evolved manually by software engineers. We show that an equivalent evolution could be reproduced significantly faster and cheaper by applying a handful of general-purpose refactorings. In one application, over 14K lines of code were transformed automatically that otherwise would have been coded by hand. Our experiments identify benefits, limitations, and topics of further research related to the transfer of refactoring technology to a production environment.

D. Batory. Refinements and Separation of Concerns. Second Workshop on Multi-Dimensional Separation of Concerns, International Conference on Software Engineering (ICSE), Limerick, Ireland, 2000.

Today's notions of encapsulation are very restricted - a module or component contains only source code. What we really need is for modules or components to encapsulate not only source code that will be installed when the component is used, but also encapsulate corresponding changes to documentation, formal properties, and performance properties - i.e., changes to the central concerns of software development. The general abstraction that encompasses this broad notion of encapsulation is called a "refinement".

D. Batory, Rich Cardone, and Y. Smaragdakis. Object-Oriented Frameworks and Product-Lines. 1st Software Product-Line Conference (SPLC), Denver, Colorado, August 2000.

Frameworks are a common object-oriented code-structuring technique that is used in application product-lines. A framework is a set of abstract classes that embody an abstract design; a framework instance is a set of concrete classes that subclass abstract classes to provide an executable subsystem. Frameworks are designed for reuse: abstract classes encapsulate common code and concrete classes encapsulate instance-specific code. Unfortunately, this delineation of reusable vs. instance-specific code is problematic. Concrete classes of different framework instances can have much in common and there can be variations in abstract classes, all of which lead to unnecessary code replication. In this paper, we show how to overcome these limitations by decomposing frameworks and framework instances into primitive and reusable components. Doing so reduces code replication and creates a component-based product-line of frameworks and framework instances.

D. Batory, Clay Johnson, Bob MacDonald, and Dale von Heeder. Achieving Extensibility Through Product-Lines and Domain-Specific Languages: A Case Study, International Conference on Software Reuse (ICSR), Vienna, Austria, 2000. Here is a pointer to an expanded version of this paper.

This is a case study in the use of product-line architectures (PLAs) and domain-specific languages (DSLs) to design an extensible command-and-control simulator for Army fire support. The reusable components of our PLA are layers or "aspects" whose addition or removal simultaneously impacts the source code of multiple objects in multiple, distributed programs. The complexity of our component specifications is substantially reduced by using a DSL for defining and refining state machines, abstractions that are fundamental to simulators. We present preliminary results that show how our PLA and DSL synergistically produce a more flexible way of implementing state-machine-based simulators than is possible with a pure Java implementation.

Y. Smaragdakis and D. Batory. Scoping Constructs for Program Generators ,Generative and Component-Based Software Engineering (GCSE now GPCE), Erfurt, Germany, September 1999.

A well-known problem in program generation is scoping. When identifiers (i.e., symbolic names) are used to refer to variables, types, or functions, program generators must ensure that generated identifiers are bound to their intended declarations. This is the standard scoping issue in programming languages, only automatically generated programs can quickly become too complex and maintaining bindings manually is hard. In this paper we present generation scoping: a language mechanism to facilitate the handling of scoping concerns. Generation scoping offers control over identifier scoping beyond the scoping mechanism of the target programming language (i.e., the language in which the generator output is expressed). Generation scoping was originally implemented as an extension of the code template operators in the Intentional Programming platform, under development by Microsoft Research. Subsequently, generation scoping has also been integrated in the JTS language extensibility tools. The capabilities of generation scoping were invaluable in the implementation of two actual software generators: DiSTiL (implemented using the Intentional Programming system), and P3 (implemented using JTS).

Y. Smaragdakis and D. Batory. Application Generators, based on an article to appear in the Software Engineering volume of the Encyclopedia of Electrical and Electronics Engineering, (John Wiley and Sons).

D. Batory and Y. Smaragdakis. Building Product-Lines with Mixin-Layers ,ECOOP 99 Workshop on Product-Line Architectures.

A mixin-layer is a building block for assembling applications of a product-line. We explain mixin-layers, their relationship to collaboration-based designs, layered designs, and GenVoca. We also summarize some of the product-lines that we have built using mixin-layers.

D. Batory, G. Chen, E. Robertson, and T. Wang.Design Wizards and Visual Programming Environments for GenVoca Generators, IEEE Transactions on Software Engineering (IEEE TSE), May 2000, 441-452. Best papers from 1998 International Conference on Software Engineering. Click here for original conference paper.

Domain-specific generators will increasingly rely on graphical languages for declarative specifications of target applications. Such languages will provide front-ends to generators and related tools to produce customized code on demand. Critical to the success of this approach will be domain-specific design wizards, tools that guide users in their selection of components for constructing particular applications. In this paper, we present the P3 ContainerStore graphical language, its generator, and design wizard.

L. Tokuda and D. Batory. Evolving Object-Oriented Architectures with Refactorings Automated Software Engineering (ASE), October 1999.

This research evaluates whether refactoring technology can make a successful transition to the mainstream by restructuring non-trivial evolving C++ applications. Results reveal that refactorings can safely automate significant architectural changes. In one example, 14K lines of code are transformed.

D. Batory, Y. Smaragdakis, and Lou Coglianese. Architectural Styles as Adaptors Software Architecture, Kluwer Academic Publishers, Patrick Donohoe, ed., 1999.

The essence of architectural styles is component communication. In this paper, we try to relate architectural styles to adaptors in the GenVoca model of software construction. GenVoca components are refinements that can have a wide range of implementations, from binaries to rule-sets of program transformation systems. We suggest that abstracting adaptors to refinements allows for program transformation implementations of adaptors that can express complex architectural styles that could not be expressed by other means. Examples from avionics are given.

Y. Smaragdakis and D. Batory Implementing Layered Designs with Mixin Layers European Conference on Object-Oriented Programming, (ECOOP), July 1998.

Mixin layers are a technique for implementing layered object-oriented designs (e.g., collaboration-based designs). Mixin layers are similar to abstract subclasses (mixin classes) but scaled to a multiple-class granularity. We describe mixin layers from a programming language viewpoint, discuss checking the consistency of a mixin layer composition, and analyze the language support issues involved.

L. Tokuda and D. Batory.Automating Three Modes of Evolution for Object-Oriented Software Architectures 5th Conference on Object-Oriented Technologies (COOTS), May 1999.

Refactorings are behavior preserving program transformations that can be applied to compilable source code to automate design level changes. This paper demonstrates that three common forms of object-oriented architectural evolution can be automated with refactorings: schema transformations, design pattern microarchitectures, and hot-spot meta patterns.

D. Batory Product-Line Architectures . Keynote at Smalltalk und Java in Industrie and Ausbildung, Erfurt, Germany, October 1998.

Today's software design methodologies are aimed at one-of-a-kind applications, designs are expressed in terms of objects and classes, and software must be coded manually. We argue that future software development will be very different and will center around product-line architectures (i.e., designs for families of related applications), refinements (a generalization of today's components), and software plug-and-play (a codeless form of programming).

D. Batory, Bernie Lofaso, and Y. Smaragdakis.JTS: Tools for Implementing Domain-Specific Languages.5th International Conference on Software Reuse (ICSR), Victoria, Canada, June 1998.

The Jakarta Tool Suite (JTS) aims to reduce substantially the cost of generator development by providing domain-independent tools for creating domain-specific languages and component-based generators called GenVoca generators. JTS is a set of precompiler-compiler tools for extending industrial programming languages (e.g., Java) with domain-specific constructs. JTS is itself a GenVoca generator, where precompilers for JTS-extended languages are constructed from components.

Y. Smaragdakis and D. Batory. Implementing Reusable Object-Oriented Components. 5th International Conference on Software Reuse (ICSR), Victoria, Canada, June 1998.

Object-oriented (OO) classes are generally not reusable because they are not meaningful in isolation; most classes only have meaning as members of cooperating suites of classes (e.g., design patterns). These suites usually arise in designs, but rarely exist as encapsulated entities in OO implementations. In this paper we present a method for directly mapping cooperating suites of classes into encapsulated C++ implementations. Our method is an improvement over the VanHilst and Notkin approach for implementing collaboration-based designs and constitutes a step towards more reusable (object-oriented) components.

(If you are thinking of using the C++ technique described in this paper, read some implementation advice: Y. Smaragdakis, Practical Advice on Using Mixin Layers (and Mixins) in C++

D. Batory, G. Chen, E. Robertson, and T. Wang. Web-Advertised Generators and Design Wizards. 5th International Conference on Software Reuse (ICSR), Victoria, Canada, June 1998.

Domain-specific generators will increasingly rely on Web-based applets for declarative specifications of target applications. Applets will communicate with generators via servers to produce customized code on demand. Critical to the success of this approach will be domain-specific design wizards, tools that guide users in their selection of components for constructing particular applications. In this paper, we present the P3 ContainerStore applet, its generator, and design wizard.

Y. Smaragdakis and D. Batory.DiSTiL: A Transformation Library for Data Structures. USENIX Conference on Domain-Specific Languages, October 1997.

DiSTiL is a representative of a new approach to domain-specific language implementation. Instead of being the usual one-of-a-kind stand-alone compiler, DiSTiL is an extension library for the Intentional Programming (IP) transformation system (currently under development by Microsoft Research). DiSTiL relies on several reusable, general purpose infrastructure tools offered by IP that substantially simplify DSL implementation.

D. Batory, D. Miranker, and D. Brant.Jakarta: A Tool Suite for Constructing Software Generators.

Jakarta is a set of compiler tools to build extensible languages and to build compilers through component composition. Jakarta is being written in a Jakarta-produced superset of the Java language. We describe briefly how Jakarta will be used to build tools to automate OO design patterns, and the design wizard tool, which captures domain-specific knowledge of software architectures using domain-specific algebras.

G. Jimenez-Perez and D. Batory. Memory Simulators and Software Generators, 1997 Symposium on Software Reuse (SSR), 136-145.

We present results on re-engineering a highly tuned and hand-coded memory simulator using the P2 container data structure generator. This application was chosen because synthesizing the simulator's data structures would not exploit P2's primary advantage of automatically applying sophisticated code optimization techniques. Thus we initially believed that using P2 would be an overkill and that P2's generated code would provide no performance advantages over hand coding. On the contrary, we found that P2 produced more efficient code and that it offered significant advantages to software development that we had not previously realized.

D. Batory and B. J. Geraci. Composition Validation and Subjectivity in GenVoca Generators, IEEE Transactions on Software Engineering (IEEE TSE) (special issue on Software Reuse), February 1997, 67-82.

GenVoca generators synthesize software systems by composing components from reuse libraries. GenVoca components are designed to export and import standardized interfaces, and thus be plug-compatible, interchangeable, and interoperable with other components. In this paper, we examine two different but important issues in software system synthesis. First, not all syntactically correct compositions of components are semantically correct. We present simple, efficient, and domain-independent algorithms for validating compositions of GenVoca components. Second, components that export and import immutable interfaces are too restrictive for software system synthesis. We show that the interfaces and bodies of GenVoca components are subjective, i.e., they mutate and enlarge upon instantiation. This mutability enables software systems with customized interfaces to be composed from components with "standardized" interfaces.

D. Batory Intelligent Components and Software Generators . Keynote at Software Quality Institute Symposium on Software Reliability, Austin, Texas, April 1997

The production of well-understood software will eventually be the responsibility of software generators. Generators will enable high-performance, customized software systems and subsystems to be assembled quickly and cheaply from component libraries. These components will be intelligent: they will encapsulate domain-specific knowledge (e.g., 'best practice' approaches) so that their instances will automatically customize and optimize themselves to the system in which they are being used. In this paper, we explore the topics intelligent components and software generation as they pertain to the issues of software productivity, performance, reliability, and quality.

E.E. Villarreal and D. Batory. Rosetta: A Generator of Data Language Compilers, 1997 Symposium on Software Reuse (SSR), 146-156.

A data language is a declarative language that enables database users to access and manipulate data. There are families of related data languages; each family member is targeted for a particular application. Unfortunately, building compilers for such languages is largely an ad hoc process; there are no tools and design methods that allow programmers to leverage the design and code of compilers for similar languages, or to simplify the evolution of existing languages to include more features. Rosetta is a generator of relational data language compilers that demonstrates practical solutions to these problems. We explain how domain analysis identifies primitive building blocks of these compilers, and how grammar-based definitions (e.g. GenVoca) of the legal compositions of these blocks yields compact and easily-evolvable specifications of data languages. Rosetta automatically transforms such specifications into compilers. Experiences with Rosetta are discussed.

Dinesh Das and D. Batory. Synthesizing Rule Sets for Query Optimizers from Components, Technical Report TR-96-05, Department of Computer Sciences, University of Texas at Austin, April 1996.

Query optimizers are complex subsystems of database management systems. Modifying query optimizers to admit new algorithms or storage structures is quite difficult, but partly alleviated by extensible approaches to optimizer construction. Rule-based optimizers are a step in that direction, but from our experience, the rule sets of such optimizers are rather monolithic and brittle. Conceptually minor changes often require wholesale modifications to a rule set. Consequently, much can be done to improve the extensibility of rule-based optimizers. As a remedy, we present a tool called Prairie that is based on an algebra of layered optimizers. This algebra naturally leads to a building-blocks approach to rule-set construction. Defining customized rule sets and evolving previously defined rule sets is accomplished by composing building-blocks. We explain an implementation of Prairie and present experimental results that show how classical relational optimizers can be synthesized from building-blocks, and that the efficiency of query optimization is not sacrificed.

D. Batory. Software System Generators, Transformation Systems,and Compilers. Working Paper, October 1995.

GenVoca generators assemble customized, high-performance software systems automatically from components. In this paper, we explain how GenVoca generators are actually compilers for domain-specific module interconnection languages and that the underlying compilation technology is a special class of transformation systems.

D. Batory. Software Component Technologies and Space Applications. International Conference on Integrated Micro-Nano Technology for Space Applications, November 1995.

In the near future, software systems will be as reconfigurable than hardware. This will be possible through the advent of software component technologies, which have been prototyped in universities and research labs. In this paper, we outline the foundations for these technologies and suggest how they might impact software for space applications.

L. Tokuda. Program Transformations for Evolving SoftwareArchitectures. OOPSLA'95 position paper for workshop on Adaptable and Adaptive Software, 1995.

Software evolution is often driven by the need to extend existing software. "Design patterns" express preferred ways to extend object-oriented software and provide desirable target states for software designs. This paper demonstrates that some design patterns can be expressed as a series of parameterized program transformations applied to a plausible initial software state. A software tool is proposed that uses primitive transformations to allow users to evolve object-oriented applications by visually altering design diagrams.

D. Batory. Subjectivityand GenVoca Generators. International Conference on Software Reuse (ICSR) (Orlando), 1996. See IEEE TSE journal version. Expanded Technical Report TR-95-32, Department of Computer Sciences, University of Texas at Austin, June 1995.

The tenet of subjectivity is that no single interface can adequately describe any object; interfaces to the same object will vary among different applications. Thus, objects with standardized interfaces seem too brittle a concept to meet the demands of a wide variety of applications. Yet, objects with standardized interfaces is a central idea in domain modeling and software generation. Standard interfaces make objects plug-compatible and interchangeable, and it is this feature that is exploited by generators to synthesize high-performance, domain-specific software systems. Interestingly, generated systems have customized interfaces that can be quite different from the interfaces of their constituent objects.

In this paper, we reconcile this apparent contradiction by showing that the objects (components) in the GenVoca model of software generation are not typical software modules; their interfaces and bodies mutate upon instantiation to a "standard" that is application-dependent.

D. Batory. Issues in Domain Modeling and Software System Generation. OOPSLA'95 position paper for Panel on Objects and Domain Engineering, 1995.

D. Batory and J. Thomas. P2: A Lightweight DBMS Generator. Journal of Intelligent Information Systems. Also, Technical Report TR-95-26, Department of Computer Sciences, University of Texas at Austin, June 1995.

A lightweight database system (LWDB) is a high-performance, application-specific DBMS. It differs from a general-purpose (heavyweight) DBMS in that it omits one or more features and specializes the implementation of its features to maximize performance. Although heavyweight monolithic and extensible DBMSs might be able to emulate LWDB capabilities, they cannot match LWDB performance.

In this paper, we describe P2, a generator of lightweight DBMSs, and explain how it was used to reengineer a hand-coded, highly-tuned LWDB used in a production system compiler (LEAPS). We present results that show P2-generated LWDBs reduced the development time and code size of LEAPS by a factor of three and that the generated LWDBs executed substantially faster than versions built by hand or using an extensible heavy weight DBMS.

D. Batory, Lou Coglianese, M. Goodwin, and S. Shafer. Creating Reference Architectures: An Example from Avionics. Symposium on Software Reusability (SSR), Seattle Washington, April 1995.

ADAGE is a project to define and build a domain-specific software architecture (DSSA) environment for assisting the development of avionics software. A central concept of DSSA is the use of software system generators to implement component-based models of software synthesis in the target domain. In this paper, we present the ADAGE component-based model (or reference architecture) for avionics software synthesis. We explain the modeling procedures used, review our initial goals, and examine what we were (and were not) able to accomplish. The contributions of our paper are the lessons that we learned; they may be beneficial to others in future modeling efforts.

D. Batory, Lou Coglianese, S. Shafer, and W. Tracz. The ADAGE Avionics Reference Architecture. AIAA Computing in Aerospace-10 Conference, San Antonio, March 1995.

ADAGE is a project to define and build a domain-specific software architecture (DSSA) environment for avionics. A central concept of ADAGE is the use of generators to implement scalable, component-based models of avionics software. In this paper, we review the ADAGE model (or reference architecture) of avionics software and describe techniques for avionics software synthesis.

Dinesh Das and D. Batory. Prairie: A Rule Specification Framework for Query Optimizers. International Conference on Data Engineering (Taipei), March 1995.

From our experience, current rule-based query optimizers do not provide a very intuitive and well-defined framework to define rules and actions. To remedy this situation, we propose an extensible and structured algebraic framework called Prairie for specifying rules. Prairie facilitates rule-writing by enabling a user to write rules and actions more quickly, correctly and in an easy-to-understand and easy-to-debug manner.

Query optimizers consist of three major parts: a search space, a cost model and a search strategy. The approach we take is only to develop the algebra which defines the search space and the cost model and use the Volcano optimizer-generator as our search engine. Using Prairie as a front-end, we translate Prairie rules to Volcano to validate our claim that Prairie makes it easier to write rules.

We describe our algebra and present experimental results which show that using a high-level framework like Prairie to design large-scale optimizers does not sacrifice efficiency.

D. Batory, D. McAllester, L. Coglianese, and W. Tracz. Domain Modeling in Engineering of Computer-Based Systems. 1995 International Symposium and Workshop on Systems Engineering of Computer Based Systems, Tucson, Arizona, February 1995.

Domain modeling is believed to be a key factor in developing an economical and scalable means for constructing families of related software systems. In this paper, we review the current state of domain modeling, and present some of our work on the ADAGE project, an integrated environment that relies heavily on domain models for generating real-time avionics applications. Specifically, we explain how we detect errors in the design of avionics systems that are expressed in terms of compositions of components. We also offer insights on how domain modeling can benefit the engineering of computer-based systems in other domains.

L. Tokuda and D. Batory. Automated Software Evolution via Design Pattern Transformations. 3rd International Symposium on Applied Corporate Computing, Monterrey, Mexico, October 1995. Also TR-95-06, Department of Computer Sciences, University of Texas at Austin, February 1995.

Software evolution is often driven by the need to extend existing software. "Design patterns" express preferred ways to extend object-oriented software and provide desirable target states for software designs. This paper demonstrates that some design patterns can be expressed as a series of parameterized program transformations applied to a plausible initial software state. A software tool is proposed that uses primitive transformations to allow users to evolve object-oriented applications by visually altering design diagrams.

J. Thomas and D. Batory. P2: An extensible Lightweight DBMS. Technical Report TR-95-04, Department of Computer Sciences, University of Texas at Austin, February 1995.

A lightweight database system (LWDB) is a high-performance, application-specific DBMS. It differs from a general-purpose (heavyweight) DBMS in that it omits one or more features and specializes the implementation of its features to maximize performance. Although heavyweight monolithic and extensible DBMSs might be able to emulate LWDB capabilities, they cannot match LWDB performance.

In this paper, we explore LWDB applications, systems, and implementation techniques. We describe P2, an extensible lightweight DBMS, and explain how it was used to reengineer a hand-coded, highly-tuned LWDB used in a production system compiler (LEAPS). We present results that show P2-generated LWDBs for LEAPS executes substantially faster than versions built by hand or that use an extensible heavyweight DBMS.

D. Batory and B.J. Geraci. Validating Component Compositions in Software System Generators, International Conference on Software Reuse (ICSR) (Orlando), 1996. See IEEE TSE journal version. Also, Expanded Technical Report TR-95-03, Department of Computer Sciences, University of Texas at Austin, June 1995.

Generators synthesize software systems by composing components from reuse libraries. In general, not all syntactically correct compositions are semantically correct. In this paper, we present domain-independent algorithms for the GenVoca model of software generators to validate component compositions. Our work relies on attribute grammars and offers powerful debugging capabilities with explanation-based error reporting. We illustrate our approach by showing how compositions are debugged by a GenVoca generator for container data structures.

D. Batory, J. Thomas, and M. Sirkin. Reengineering a Complex Application Using a Scalable Data Structure Compiler. ACM SIGSOFT FSE 1994 (New Orleans), December 1994.

P2 is a scalable compiler for collection data structures. High-level abstractions insulate P2 users from data structure implementation details. By specifying a target data structure as a composition of components from a reuse library, the P2 compiler replaces abstract operations with their concrete implementations.

LEAPS is a production system compiler that produces the fastest sequential executables of OPS5 rule sets. LEAPS is a hand-written, highly-tuned, performance-driven application that relies on complex data structures. Reengineering LEAPS using P2 was an acid test to evaluate P2's scalability, productivity benefits, and generated code performance.

In this paper, we present some of our experimental results and experiences in this reengineering exercise. We show that P2 scaled to this complex application, substantially increased productivity, and provided unexpected performance gains.

D. Batory. The LEAPS Algorithms. Technical Report TR-94-28, Department of Computer Sciences, University of Texas at Austin, November 1994.

LEAPS is a state of the art production system compiler that produces the fastest sequential executable OPS5 rule sets. The performance of LEAPS is due to its reliance on complex data structures and search algorithms to speed rule processing. In this paper, we explain the LEAPS algorithms in terms of the programming abstractions of the P2 data structure compiler.

D. Batory, B. Geraci, and J. Thomas. Introductory P2 System Manual. Technical Report TR-94-26, Department of Computer Sciences, University of Texas at Austin, November 1994.

P2 is a prototype container data structure precompiler. It is a superset of the C language offering container and cursor abstractions as part of the linguistic extensions to C. P2 is based on the GenVoca model of software system generators. This document is the users manual for programming in the P2 language.

D. Batory, B. Geraci, and J. Thomas. Advanced P2 System Manual. Technical Report TR-94-27, Department of Computer Sciences, University of Texas at Austin, November 1994.

This manual documents how layers are written in P2. There is a special language, XP, which was designed specifically for defining P2 building blocks (i.e., primitive data structure layers).

D. Batory, V. Singhal, J. Thomas, S. Dasari, B. Geraci, and M. Sirkin. The GenVoca Model of Software-System Generators. IEEE Software, September 1994.

An emerging breed of generators synthesize complex software systems from libraries of reusable components. These generators, called GenVoca generators, produce high-performance software and offer substantial increases in productivity.

D. Batory. Products of Domain Models. ARPA Domain Modeling Workshop, George Mason University, September 1994.

We argue that domain models should produce four basic products: identification of reusable software components, definition of software architectures that explain how components can be composed, a demonstration of architecture scalability, and a direct relationship of these results to software generation of target systems.

D. Batory, V. Singhal, J. Thomas, and M. Sirkin. Scalable Software Libraries. ACM SIGSOFT FSE 1993 (Los Angeles), December 1993.

Many software libraries (e.g., the Booch C++ Components, libg++, NIHCL, COOL) provide components (classes) that implement data structures. Each component is written by hand and represents a unique combination of features (e.g. concurrency, data structure, memory allocation algorithms) that distinguishes it from other components.

We argue that this way of building data structure component libraries is inherently unscalable. Libraries should not enumerate complex components with numerous features; rather, libraries should take a minimalist approach: they should provide only primitive building blocks and be accompanied by generators that can combine these blocks to yield complex custom data structures.

In this paper, we describe a prototype data structure generator and the building blocks that populate its library. We also present preliminary experimental results which suggest that this approach does not compromise programmer productivity nor the run-time performance of generated data structures.

V. Singhal and D. Batory. P++: A Language for Software System Generators. Technical Report TR-93-16, Department of Computer Sciences, University of Texas at Austin, November 1993.

P++ is a programming language that supports the GenVoca model, a particular style of software design that is intended for building software system generators. P++ is an enhanced version of C++: it offers linguistic extensions for component encapsulation, abstraction, parameterization, and inheritance, where a component is a suite of interrelated classes and functions. This paper describes the motivations for P++, the ideas which underlie its design, the syntax and features of the language, and related areas of research.

J. Thomas, D. Batory, V. Singhal, and M. Sirkin. A Scalable Approach to Software Libraries. 6th Annual Workshop on Software Reuse (Owego, New York), November 1993.

Software libraries offer a convenient and accessible means to achieve the benefits of reuse. The components of these libraries are written by hand, and each represents a unique combination of features that distinguishes it from other components. Unfortunately, as the number of features grows, the size of these libraries grows exponentially, making them unscalable.

Predator is a research project to develop abstractions and tools to provide the benefits of software libraries without incurring the scalability disadvantages just mentioned. Our approach relies on a careful analysis of an application domain to arrive at appropriate high-level abstractions, standardized (i.e., plug-compatible) interfaces, and layered decompositions. Predator defines language extensions for implementing components, and compilers to automatically convert component compositions into efficient programs.

V. Singhal and D. Batory. P++: A Language for Large-Scale Reusable Software Components. 6th Annual Workshop on Software Reuse (Owego, New York), November 1993.

P++ is a programming language that supports the GenVoca model, a particular style of software design that is intended for building software system generators. P++ is an enhanced version of C++: it offers linguistic extensions for component encapsulation, abstraction, parameterization, and inheritance, where a component is a subsystem, i.e., a suite of interrelated classes and functions.

M. Sirkin. Predator: A Data Structure Compiler. A manual describing the features and syntax of P1, a prototype data structure compiler (unpublished).

M. Sirkin, D. Batory, and V. Singhal. Software Components in a Data Structure Precompiler. International Conference on Software Engineering (ICSE) (Baltimore, MD), May 1993, pages 437-446.

PREDATOR is a data structure precompiler. that generates efficient code for maintaining and querying complex data structures. It embodies a novel component reuse technology that transcends traditional generic data types. In this paper, we explain the concepts of our work and our prototype system. We show how complex data structures can be specified as compositions of software building blocks, and present performance results that compare PREDATOR output to hand-optimized programs

D. Batory, V. Singhal, and J. Thomas. Database Challenge: Single Schema Database Management Systems. Technical Report TR-92-47, Department of Computer Sciences, University of Texas at Austin, December 1992.

Many data-intensive applications require high-performance data management facilities, but utilize only a small fraction of the power of a general-purpose database system (DBMS). We believe single schema database systems (SSTs), i.e., special-purpose DBMSs that are designed for a single schema and a predeclared set of database operations, are a vital need of today's software industry. The challenge is to create a technology for economically building high-performance SSTs. SST research will combine results from object-oriented databases, persistent object stores, module interconnection languages, rule-based optimizers, open-architecture systems, extensible databases, and generic data types.

D. Batory and S. O'Malley. The Design and Implementation of Hierarchical Software Systems with Reusable Components. ACM Transactions on Software Engineering and Methodology (TOSEM), 1(4):355-398, October 1992.

We present a domain-independent model of hierarchical software system design and construction that is based on interchangeable software components and large-scale reuse. The model unifies the conceptualizations of two independent projects, Genesis and Avoca, that are successful examples of software component/building-block technologies and domain modeling. Building-block technologies exploit large-scale reuse, rely on open architecture software, and elevate the granularity of programming to the subsystem level. Domain modeling formalizes the similarities and differences among systems of a domain. We believe our model is a blue-print for achieving software component technologies in many domains.

D. Batory, V. Singhal, and M. Sirkin. Implementing a Domain Model for Data Structures. International Journal of Software Engineering and Knowledge Engineering, 2(3):375-402, September 1992.

We present a model of the data structure domain that is expressed in terms of the GenVoca domain modeling concepts. We show how familiar data structures can be encapsulated as realms of plug-compatible, symmetric, and reusable components, and we show how complex data structures can be formed from their composition. The target application of our research is a precompiler for specifying and generating customized data structures.