Publications

Here are the publications of our research group listed in order of publication date. Each entry has a citation, an abstract, and a hypertext link to a PDF copy of the paper. See the copyright notice.

To search for multiple keywords separate each keyword by a comma: keyword , keyword

Keywords include:

architectures, analysis, aspects, categories, commuting diagrams, computational design, derivatives, Dense Linear Algebra (DLA),
Design by Transformation (DxT), design rules, Design Rule Checking (DRC), evolution, feature interactions, feature models, geodesic,
higher-order transformations, kubes, MDA, MDD, MDE, mixin layers, optimization, optional feature problem, origami, refactoring,
streaming, safe composition, semantics, testing, UML, verification, wizards, P3 (P1/P2/P3/DiSTiL data structure generator).

See all publications

Total Publications: 156

J. Feigenspan, D. Batory, and T. Riche. Is the Derivation of a Model Easier to Understand than the Model Itself? . International Conference on Program Comprehenseion (ICPC), June 2012.

Software architectures can be presented by graphs with components as nodes and connectors as edges. These graphs, or models, typically encode expert domain knowledge, which makes them difficult to understand. Hence, instead of presenting a complete complex model, we can derive it from a simple, easy-to-understand model by a set of easy-to-understand transformations. In two controlled experiments, we evaluate whether a derivation of a model is easier to understand than the model itself.

architectures, DxT, MDE, MDA, MDD.

N. Siegmund, S.S. Kolesnikov, C. Kaestner, S. Apel, D. Batory, M. Rosenmueller, G. Saake. Predicting Performance via Automated Feature-Interaction Detection . International Conference on Software Engineering (ICSE), June 2012.

Customizable programs and program families provide user-selectable features to tailor a program to an application scenario. Knowing in advance which feature selection yields the best performance is difficult because a direct measurement of all possible feature combinations is infeasible. Our work aims at predicting program performance based on selected features. The challenge is predicting performance accurately when features interact. An interaction occurs when a feature combination has an unexpected influence on performance. We present a method that automatically detects performance feature interactions to improve prediction accuracy. To this end, we propose three heuristics to reduce the number of measurements required to detect interactions. Our evaluation consists of six real-world case studies from varying domains (e.g. databases, compression libraries, and web server) using different configuration techniques (e.g., configuration files and preprocessor flags). Results show, on average, a prediction accuracy of 95%.

feature interactions, performance prediction.

B. Marker, A. Terrel, J. Poulson, R. van de Geijn, and D. Batory. Mechanizing the Expert Dense Linear Algebra Developer (Poster Paper) . Principles and Practice of Parallel Programming (PPoPP), February 2012.

The efforts of an expert to parallelize and optimize a dense linear algebra algorithm for distributed-memory targets are largely mechanical and repetitive. We demonstrate that these efforts can be encoded and automatically applied to obviate the manual implementation of many algorithms in high-performance code.

DxT, MDE, DLA, dense linear algebra

D. Batory, P. Hoefner, and J. Kim. Feature Interactions, Products, and Composition . Generative Programming and Component Engineering (GPCE), October 2011.

The relationship between feature modules and feature interactions is not well-understood. To explain classic examples of feature interactions, we show that features are not only composed sequentially by the (dot) operation, but also by the cross-product (x) and interaction (#) operations that heretofore were implicit in the literature. We present a formal model of these operations, show how it connects and explains previously unrelated results in Feature Oriented Soft- ware Development (FOSD), and describe a tool, based on our formalism, that demonstrates how changes in composed documents can be back-propagated to their original feature module definitions, thereby improving FOSD tooling.

feature interactions, optional feature problem

B. Delaware, W. Cook, and D. Batory. Product Lines of Theorems . SPLASH/OOPSLA Conference , October 2011.

Mechanized proof assistants are powerful verification tools, but proof development can be difficult and time-consuming. When verifying a family of related programs, the effort can be reduced by proof reuse. In this paper, we show how to engineer product lines with theorems and proofs built from feature modules. Each module contains proof fragments which are composed together to build a complete proof of correctness for each product. We consider a product line of programming languages, where each variant includes metatheory proofs verifying the correctness of its semantic definitions. This approach has been realized in the Coq proof assistant, with the proofs of each feature independently certifiable by Coq. These proofs are composed for each language variant, with Coq mechanically verifying that the composite proofs are correct. As validation, we formalize a core calculus for Java in Coq which can be extended with any combination of casts, interfaces, or generics.

oopsla, semantics, verification, feature interaction

D. Batory and B. Delaware Towards Verification of Product Lines . Keynote at Interactive Theorem Proving (ITP), August 2011.

Although mechanized proof assistants are powerful verification tools, proof development can still be difficult and time-consuming. It becomes even more challenging when proofs are needed for product lines. A product line is a family of similar programs. Each program is constructed by a distinct (linear) combination of features, where a feature or feature module encapsulates program fragments that are added to a program to introduce a new capability or functionality to that program.
The first part of my presentation reviews basic concepts on product lines and how programs are synthesized using feature modules. The second part addresses product line verification. I explain how proofs for product lines can be engineered using feature modules. Each module contains proof fragments which are composed to build a complete proof of correctness for each product. A product line of programming languages is considered, where each variant includes metatheory proofs verifying the correctness of its syntax and semantic definitions. The approach has been realized in the Coq proof assistant, where the proofs of each feature are independently certifiable by Coq. Proofs are composed for each language variant, where Coq mechanically verifies that the composite proofs are correct. As validation, a core calculus for Java in Coq was formalized which can be extended with any combination of casts, interfaces, or generics.
This is joint work with Benjamin Delaware and William Cook.

Presentation Slides

keynote, semantics, safe composition, verification, feature interaction

B. Marker, D. Batory, J. Poulson, and R. van de Geijn. How a Domain-Specific Language Enables the Automation of Optimized Code for Dense Linear Algebra (Extended Abstract) . International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), May 2011.

DxT, MDE, MDA, MDE, streaming architectures, DLA

J. Kim. Paan: A Tool for Back-Propagating Changes to Projected Documents . M.Sc. Thesis, May 2011, The Department of Computer Sciences, The University of Texas at Austin.

Research in Software Product Line Engineering (SPLE) traditionally focuses on product derivation. Prior work has explored the automated derivation of products by module composition. However, it has so far neglected propagating changes (edits) in a product back to the product line definition. A domain-specific product should be possible to update its features locally, and later these changes should be propagated back to the product line definition automatically. Otherwise, the entire product line has to be revised manually in order to make the changes permanent. Although this is the current state, it is a very error-prone process. To address these issues, we present a tool called Paan to create product lines of MS Word documents with back-propagation support. It is a diff-based tool that ignores unchanged fragments and reveals fragments that are changed, added or deleted. Paan takes a document with variation points (VPs) as input, and shreds it into building blocks called tiles. Only those tiles that are new or have changed must be updated in the tile repository. In this way, changes in composed documents can be back-propagated to their original feature module definitions. A document is synthesized by retrieving the appropriate tiles and composing them.

Paan web page.

student, analysis, origami, safe composition, verification

D. Dig and D. Batory, The 4th Workshop on Refactoring Tools (WRT’11) . March 2011.

Refactoring is the process of applying behavior-preserving transformations to a program with the objective of improving the program's design. A specific refactoring is identified by a name (e.g., Extract Method), a set of preconditions, and a set of transformations that need to be performed. Tool support for refactoring is essential because checking the preconditions of refactoring often requires nontrivial program analysis, and applying transformations may affect many locations throughout a program. In recent years, the emergence of light-weight programming methodologies such as Extreme Programming has generated a great amount of interest in refactoring, and refactoring support has become a required feature in today's IDEs. This workshop is a continuation of a series of previous workshops (ECOOP 2007, OOPSLA 2008 and 2009 -- see http://refactoring.info/WRT) where researchers and developers of refactoring tools can meet, discuss recent ideas and work, and view tool demonstrations.

refactoring

C. Kaestner. Virtual Separation of Concerns: Towards Preprocessors 2.0 . Ph.D. Thesis, Department of Technical and Business Information Systems, University of Magdeburg, Germany, May 2010.

Conditional compilation with preprocessors such as cpp is a simple but effective means to implement variability. By annotating code fragments with #ifdef and #endif directives, different program variants with or without these annotated fragments can be created, which can be used (among others) to implement software product lines. Although, such annotation-based approaches are frequently used in practice, researchers often criticize them for their negative effect on code quality and maintainability. In contrast to modularized implementations such as components or aspects, annotation-based implementations typically neglect separation of concerns, can entirely obfuscate the source code, and are prone to introduce subtle errors. Our goal is to rehabilitate annotation-based approaches by showing how tool support can address these problems. With views, we emulate modularity; with a visual representation of annotations, we reduce source code obfuscation and increase program comprehension; and with disciplined annotations and a product-line-aware type system, we prevent or detect syntax and type errors in the entire software product line. At the same time we emphasize unique benefits of annotations, including simplicity, expressiveness, and being language independent. All in all, we provide tool-based separation of concerns without necessarily dividing source code into physically separated modules; we name this approach virtual separation of concerns. We argue that with these improvements over contemporary preprocessors, virtual separation of concerns can compete with modularized implementation mechanisms. Despite our focus on annotation-based approaches, we do intend not give a definite answer on how to implement software product lines. Modular implementations and annotation-based implementations both have their advantages; we even present an integration and migration path between them. Our goal is to rehabilitate preprocessors and show that they are not a lost cause as many researchers think. On the contrary, we argue that -- with the presented improvements -- annotation-based approaches are a serious alternative for product-line implementation.

student, CIDE, safe composition

S. Apel, D. Batory, K. Czarnecki, F. Heidenreich, C. Kaestner, and O. Nierstrasz. Proceedings of the 2nd International Workshop on Feature-Oriented Software Development (FOSD), October 2010. .

C.H.P. Kim, D. Batory, and S. Khurshid. Reducing Combinatorics in Testing Product Lines . Aspect Oriented Software Development (AOSD), March 2011.

A Software Product Line (SPL) is a family of programs where each program is defined by a unique combination of features. Testing or checking properties of an SPL is hard as it may require the examination of a combinatorial number of programs. In reality, however, features are often irrelevant for a given test | they augment, but do not change, existing behavior, making many feature combinations unnecessary as far as testing is concerned. In this paper we show how to reduce the amount of effort in testing an SPL. We represent an SPL in a form where conventional static program analysis techniques can be applied to find irrelevant features for a test. We use this information to reduce the combinatorial number of SPL programs to examine.

testing, analysis, feature models, analysis, feature models, testing

D. Batory FOSD - A Science of Software Design . Keynote at Dagstuhl on Feature Oriented Software Development, January 2011.


Presentation Slides

keynote, categories, commuting diagrams, feature interactions, MDA, MDE, MDD, mixin layers

D. Batory. Thoughts on Automated Software Design and Synthesis . Future of Software Engineering Research Workshop (FoSER), November 2010.

I summarize some personal observations on the topic of automated software design and synthesis that I accumulated over twenty years. They are intended to alert researchers to pitfalls that may be encountered and to identify goals for future efforts in advancing software engineering education and research.

computational design, refactoring, composition.

C.H.P. Kim, E. Bodden, D. Batory, and S. Khurshid. Enforcing Safety Properties in Product Lines . Runtime Verification (RV), November 2010.

A product line is a family of programs where each program is defined by a unique combination of features. Product lines, like conventional programs, can be checked for safety properties through execution monitoring. However, applying execution monitoring techniques for conventional programs against product lines is expensive because one would have to monitor all of the product line's programs, the number of which can be exponential in the number of features. Also, doing so would be wasteful because many programs can provably never violate a stated property. We introduce a monitoring technique dedicated to product lines that given a safety property, determines the statements that need to be instrumented for the corresponding execution monitor to trigger and the feature combinations required for these statements to be reached. Thus, we identify feature combinations that cannot possibly violate the stated safety property, reducing the number of programs to monitor. Experiments show that our technique is effective, particularly for safety properties that crosscut many optional features.

testing, analysis, feature models

C.H.P. Kim, D. Batory, and S. Khurshid. Eliminating Products to Test in a Software Product Line (Short Paper) . Automated Software Engineering (ASE), September 2010.

A Software Product Line (SPL) is a family of programs. Testing an SPL is a challenge because the number of programs to examine may be exponential in the number of features. However, there are features whose absence or presence has no bearing on the outcome of a test. We can ignore such irrelevant features and consider combinations of only the remaining features, thereby eliminating unnecessary test runs. In this paper, we propose a product line representation that enables a conventional static program analysis to be applied. We then present a classification of features that can be used to narrow down the search for relevant features. Conditions of relevance that a static analysis can check are outlined and a procedure that uses the set of relevant features to reduce the combinatorial number of programs to test is sketched.

analysis, feature models, testing

T.L. Riche, H.M. Vin, and D. Batory. Transformation-Based Parallelization of Request-Processing Applications . Model Driven Engineering Languages and Systems (MODELS), October 2010.

Multicore, multithreaded processors are rapidly becoming the platform of choice for high-throughput request-processing applications (RPAs). We refer to this class of modern parallel platforms as multi-* systems. In this paper, we describe the design and implementation of Lagniappe, a translator that simplifies RPA development by transforming portable models of RPAs to highthroughput multi-* executables. We demonstrate Lagniappe’s effectiveness with a pair of stateful RPA case studies.

MDD MDE computational design award streaming architectures DxT

M. Azanza, D. Batory, O. Díaz, and S. Trujillo. Domain-Specific Composition of Model Deltas . Int. Conference on Model Transformations (ICMT) , LNCS 6142, July 2010.

We present a general approach to the incremental development of model-based applications using endogenous transformations, i.e. transformations whose input and output models conform to the same metamodel. Our work describes these transformations as model deltas that, when composed, deliver a complete model. We establish a relationship between a metamodel and its corresponding delta metamodel, show how model deltas can be defined as model changes (additions), explain how deltas can be composed using domain-specific composition algorithms, and propose metamodel annotations to specify these algorithms. We present different case studies as proofs of concept.

MDD MDE categories computational design

E. Uzuncaova, S. Khurshid, and D. Batory. Incremental Test Generation for Software Product Lines . IEEE Transactions on Software Engineering (IEEE TSE), Feb. 2010.

Recent advances in mechanical techniques for systematic testing have increased our ability to automatically find subtle bugs, and hence to deploy more dependable software. This paper builds on one such systematic technique, scope-bounded testing, to develop a novel specification-based approach for efficiently generating tests for products in a software product line. Given properties of features as firstorder logic formulas in Alloy, our approach uses SAT-based analysis to automatically generate test inputs for each product in a product line. To ensure soundness of generation, we introduce an automatic technique for mapping a formula that specifies a feature into a transformation that defines incremental refinement of test suites. Our experimental results using different data structure product lines show that an incremental approach can provide an order of magnitude speed-up over conventional techniques. We also present a further optimization using dedicated integer constraint solvers for feature properties that introduce integer constraints, and show how to use a combination of solvers in tandem for solving Alloy formulas.

MDD MDE analysis commuting diagrams geodesic testing verification

D. Batory. On the Importance and Challenges of FOSD. Keynote at First Workshop on Feature Oriented Software Development (FOSD), Denver, Colorado, October 2009.

Among the key elements of mature engineering is automated production: we understand the technical problems and we understand their solutions; our goal is to automate production as much as possible to increase product quality, reduce costs and time-to-market, and be adept at creating new products quickly and cheaply. Automated production is a technological statement of maturity: 'We have built these products so often by hand that we have gotten it down to a Science'. Models of automated production are indeed the beginnings of a Science of Automated Design (SOAD). Feature Oriented Software Development (FOSD) will play a fundamental role in SOAD, and I believe also play a fundamental role in the future of software engineering. In this presentation, I explain what distinguishes FOSD from other software design disciplines and enumerate key technical barriers that lie ahead for FOSD and SOAD.

PDF of presentation
Survey and responses for this keynote

keynote, evolution, feature interactions, featue models, kubes, origami, refactoring, semantics, testing, verification

D. Batory (work with T. Riche). Refinement and Optimization of Streaming Architectures. Keynote at Software Engineering and Data Engineering (SEDE), Las Vegas, June 2009 and the Software Engineering and Databases (JISDB), San Sebastian, Spain, September 2009.

We present a Model Driven Engineering approach to explain, verify, build, and test dataflow or streaming software architectures that are to be parallelized for performance or availability. Componentconnector models are incrementally elaborated by transformations that refine or optimize architectural designs. We re-engineered two significant case studies to illustrate the generality of our work: (1) recoverable crash fault tolerant servers and (2) join parallelizations in database machines.

PDF of presentation

DxT, keynote, pipe-and-filter, architectural refinement, optimization

G. Freeman, D. Batory, G. Lavender, and J.N. Sarvela. Lifting Transformational Models of Product Lines: A Case Study. Software and Systems Modeling. October 2009. This is a journal version of our 2008 ICMT paper.

Model driven development (MDD) of software product lines (SPLs) merges two increasing important paradigms that synthesize programs by transformation. MDD creates programs by transforming models, and SPLs elaborate programs by applying transformations called features. In this paper, we present the design and implementation of a transformational model of a product line of scalar vector graphics and JavaScript applications. We explain how we simplified our implementation by lifting selected features and their compositions from our original product line (whose implementations were complex) to features and their compositions of another product line (whose specifications were simple). We used operators to map higher-level features and their compositions to their lower-level counterparts. Doing so exposed commuting relationships among transformations (features) in both product lines that helped validate our model and implementation.

higher-order transformations, MDD, MDE, MDA, award

M. Kuhlemann, D. Batory, and S. Apel. Safe Composition of Non-Monotonic Features . Generative Programming and Component Engineering (GPCE), October 2009.

Programs can be composed from features. We want to verify automatically that all legal combinations of features can be composed safely without errors. Prior work on this problem assumed that features add code monotonically. We generalize prior work to enable features to both add and remove code, describe our analyses and implementation, and review case studies. We observe that more expressive features can increase the complexity of developed programs rapidly up to the point where automated concepts as presented in this paper are not a helpful tool but a necessity for verification.

refactoring, safe composition

M. Kuhlemann, D. Batory, and S. Apel. Refactoring Feature Modules . International Conference on Software Reuse (ICSR), September 2009.

In feature-oriented programming, a feature is an increment in program functionality and is implemented by a feature module. Programs are generated by composing feature modules. A generated program may be used by other client programs but occasionally must be transformed to match a particular legacy interface before it can be used. We call the mismatch of the interface of a generated program and a client-desired interface an incompatibility. We introduce the notion of refactoring feature modules (RFMs) that extend feature modules with refactorings. We explain how RFMs reduce incompatibilities and facilitate reuse. We report our experiences on five case studies.

refactoring

B. Delaware, W. Cook, and D. Batory. Fitting the Pieces Together: A Machine-Checked Model of Safe Composition . ACM SIGSOFT FSE, August 2009.

Programs of a software product line can be synthesized by composing features which implement a unit of program functionality. In most product lines, only some combination of features are meaningful; feature models express the highlevel domain constraints that govern feature compatibility. Product line developers also face the problem of safe composition --whether every product allowed by a feature model is type-safe when compiled and run. To study the problem of safe composition, we present Lightweight Feature Java (LFJ), an extension of Lightweight Java with support for features. We define a constraint-based type system for LFJ and prove its soundness using a full formalization of LFJ in Coq. In LFJ, soundness means that any composition of features that satisfies the typing constraints will generate a well-formed LJ program. If the constraints of a feature model imply these typing constraints then all programs allowed by the feature model are type-safe.

feature models, mixin layers, semantics, safe composition

C. Kaestner, S. Apel, S.S. ur Rahman, M. Rosenmueller, and D. Batory. On the Impact of the Optional Feature Problem: Analysis and Case Studies. Software Product Line Conference (SPLC), September 2009.

A software product-line is a family of related programs that are distinguished in terms of features. A feature implements a stakeholders’ requirement. Different program variants specified by distinct feature selections are produced from a common code base. The optional feature problem describes a common mismatch between variability intended in the domain and dependencies in the implementation. When this occurs, some variants that are valid in the domain cannot be produced due to implementation issues. There are many different solutions to the optional feature problem, but they all suffer from drawbacks such as reduced variability, increased development effort, reduced efficiency, or reduced source code quality. In this paper, we examine the impact of the optional feature problem in two case studies in the domain of embedded database systems, and we survey different state-ofthe- art solutions and their trade-offs. Our intension is to raise awareness of the problem, to guide developers in selecting an appropriate solution for their product-line project, and to identify opportunities for future research.

derivatives, optional feature problem

C. Kaestner, S. Apel, S. Trujillo, M. Kuhlemann, and D. Batory. Guaranteeing Syntactic Correctness for All Product Line Variants: A Language-Independent Approach . International Conference Objects, Models, Components, Patterns (TOOLS), June 2009.

A software product line (SPL) is a family of related program variants in a well-defined domain, generated from a set of features. A fundamental difference from classical application development is that engineers develop not a single program but a whole family with hundreds to millions of variants. This makes it infeasible to separately check every distinct variant for errors. Still engineers want guarantees on the entire SPL. A further challenge is that an SPL may contain artifacts in different languages (code, documentation, models, etc.) that should be checked. In this paper, we present CIDE, an SPL development tool that guarantees syntactic correctness for all variants of an SPL. We show how CIDE’s underlying mechanism abstracts from textual representation and we generalize it to arbitrary languages. Furthermore, we automate the generation of safe plug-ins for additional languages from annotated grammars. To demonstrate the language-independent capabilities, we applied CIDE to a series of case studies with artifacts written in Java, C++, C, Haskell, ANTLR, HTML, and XML.

feature interactions, verification

T. Thuem. Reasoning About Feature Model Edits . B.Sc. Thesis, Department of Technical and Business Information Systems, University of Magdeburg, Germany, May 2008.

Feature modeling is a technique in domain analysis to describe the variabilities and commonalities among similar programs. The valid combinations of features are specified in a feature model. Automated analysis of feature models reveals errors and supports the configuration process. Feature models evolve over time and there is a need for edits on feature models. We categorize edits into refactorings, specializations, and generalizations, and arbitrary edits to support the modeling process. This paper presents an algorithm to reason about feature model edits and their categories. Two stand-alone feature models can be taken as input, where the set of features is not necessary identical. We evaluate the algorithm with random feature models to show that the algorithm scales even for larger feature models.

student, feature models, analysis

B. Delaware, W. Cook, and D. Batory. A Machine-Checked Model of Safe Composition. Foundations of Aspect-Oriented Languages (FOAL), March 2009.

Programs of a software product line can be synthesized by composing features which implement some unit of program functionality. In most product lines, only some combination of features are meaningful; feature models express the high-level domain constraints that govern feature compatibility. Product line developers also face the problem of safe composition -- whether every product allowed by a feature model is type-safe when compiled and run To study the problem of safe composition, we present Lightweight Feature Java (LFJ), an extension of Lightweight Java with support for features. We define a constraint-based type system for LFJ and prove its soundness using a full formalization of LFJ in Coq. In LFJ, soundness means that any composition of features that satisfies the typing constraints will generate a well-formed LJ program. If the constraints of a feature model imply these typing constraints then all programs allowed by the feature model are type-safe.

PDF of Presentation

constraints, safe composition, feature models, verification, semantics

T. Thuem, D. Batory, and C. Kaestner. Reasoning about Edits to Feature Models. International Conference on Software Engineering (ICSE), May 2009.

Features express the variabilities and commonalities among programs in a software product line (SPL). A feature model defines the valid combinations of features, where each combination corresponds to a program in an SPL. SPLs and their feature models evolve over time. We classify the evolution of a feature model via modifications as refactorings, specializations, generalizations, or arbitrary edits. We present an algorithm to reason about feature model edits to help designers to determine how the program membership of an SPL has changed. Our algorithm takes two feature models as input (before and after edit versions), where the set of features in both models are not necessarily the same, and it determines the change classification. Our algorithm efficiently classifies edits to models that have thousands of features.

feature models, recommended design rule checking

D. Batory. Program Kubes: Dimensions of Variability in Software Synthesis. Keynote at Variability Modeling in Software-Intensive Systems (VAMOS), January 2009, and the Software Product Line Evolution Workshop, Pune, India, February 2009.

It is common in software product lines to have orthogonal and interacting sets of features. In such cases, the design of a program can expressed as a matrix of program transformations, where both rows and columns denote features. The technique is called Origami, as the matrix is folded in precise ways (thereby composing transformations) until a scalar is produced. This scalar is an expression (a composition of transformations) that synthesizes the program when it is evaluated. Origami generalizes to n-dimensional arrays, where each axis defines a dimension of variability. Previously our understanding of Origami was intuitive; here we present a mathematical explanation of Origami that exposes its principles and integrates diverse topics: database technology (data cubes), mathematics (ideas from tensors and categories), programming languages (the expression problem), and software design (feature interactions).

PDF of presentation

keynote, origami, categories, commuting diagrams, computational design, feature interactions, feature models, tensors

D. Batory, M. Azanza, and J. Saraiva. The Objects and Arrows of Computational Design. Keynote at Model Driven Engineering Languages and Systems (MODELS), October 2008.

Computational Design (CD) is a paradigm where both program design and program synthesis are computations. CD merges Model Driven Engineering (MDE) which synthesizes programs by transforming models, with Software Product Lines (SPL) where programs are synthesized by composing transformations called features. In this paper, basic relationships between MDE and SPL are explored using the language of modern mathematics.

PDF of Presentation
Interesting blogs on MODELS keynote: blog1, blog2.

MDE, MDD, MDA, categories, keynote, recommended theory

E. Uzuncaova, D. Garcia, S. Khurshid, and D. Batory. Testing Software Product Lines Using Incremental Test Generation. Int. Symposium on Software Reliability Engineering (ISSRE), November 2008.

We present a novel specification-based approach for generating tests for products in a software product line. Given properties of features as first-order logic formulas, our approach uses SAT-based anaylsis to automatically generate test inputs for each product in a product line. To ensure soundness of generation, we introduce an automatic technique for mapping a formula that specifies a feature into a transformation that defines incremental refinement of test suites. Our experimental results using different data structure product lines show that our approach can provide an order of magnitude speed-up over conventional techniques.

testing, commuting diagrams, geodesic

D. Batory. A Modeling Language for Program Design and Synthesis. Advances in Software Engineering, E. Börger and A. Cisternino (editors), LNCS 5316. Keynote at Brazilian Symposium on Software Engineering, October 2007.

Software engineers define structures called programs and use tools to manipulate, transform, and analyze them. A modeling language is needed to express program design and synthesis as a computation, and elementary algebra fits the bill. I review recent results in automated software design, testing, and maintenance and use the language of elementary mathematics to explain and relate them. Doing so outlines a general and simple way to express and understand the relationships between different topics in program synthesis.

Computational Design, commuting diagrams, keynote, geodesic

D. Batory. Using Modern Mathematics as an FOSD Modeling Langauge. Generative Programming and Component Engineering (GPCE), October 2008.

Modeling languages are a fundamental part of automated software development. MDD, for example, uses UML class diagrams and state machines as languages to define applications. In this paper, we explore how Feature Oriented Software Development (FOSD) uses modern mathematics as a modeling language to express the design and synthesis of programs in software product lines, but demands little mathematical sophistication from its users. Doing so has three practical benefits: (1) it offers a simple and principled mathematical description of how FOSD transforms, derives, and relates program artifacts, (2) it exposes previously unrecognized commuting relationships among tool chains, thereby providing new ways to debug tools, and (3) it reveals new ways to optimize software synthesis.

categories, Computational Design, commuting diagrams, geodesic

C.H.P. Kim, C. Kaestner, and D. Batory. On the Modularity of Feature Interactions. Generative Programming and Component Engineering (GPCE), October 2008.

Feature modules are the building blocks of programs in software product lines (SPLs). A foundational assumption of feature-based program synthesis is that features are composed in a predefined order. Recent work on virtual separation of concerns reveals a new model of feature interactions that shows that feature modules can be quantized as compositions of smaller modules called derivatives. We present this model and examine some of its unintuitive consequences, namely, that (1) a given program can be reconstructed by composing features in any order, and (2) the contents of a feature module (as expressed as a composition of derivatives) is determined automatically by a feature order. We show that different orders allow one to adjust the contents of a feature module to isolate and study the impact of interactions that a feature has with other features. Using derivatives, we show the utility of generalizing safe composition (SC), a basic analysis of SPLs that verifies program type-safety, to prove that every legal composition of derivatives (and thus any composition order of features) produces a type-safe program, which is a much stronger SC property.

Click here for code downloads.

derivatives, feature interactions, safe composition, commuting diagrams, optional feature problem

S. Apel, C. Kaestner, and D. Batory. Program Refactoring using Functional Aspects. Generative Programming and Component Engineering (GPCE), October 2008.

A functional aspect is an aspect that has the semantics of a transformation; it is a function that maps a program to an advised program. Functional aspects are composed by function composition. In this paper, we explore functional aspects in the context of aspect-oriented refactoring. We show that refactoring legacy applications using functional aspects is just as flexible as traditional aspects in that (a) the order in which aspects are refactored does not matter, and (b) the number of potential aspect interactions is decreased. We analyze several aspectoriented programs of different sizes to support our claims.

aspects, commuting diagrams

A. Hemakumar. Finding Contradictions in Feature Models. Workshop on the Analysis of Software Product Lines (ASPL), September 2008.

A feature model defines each product in a product-line by a unique combination of features. Feature compatibilities are expressed as constraints in feature models and may be contradictory. We suggest a run-time approach to expose contradictions in feature models when they are uncovered. However, the emphasis of this paper is to explore the possibility of finding contradictions statically using model checking and an incremental consistency algorithm.

Click here for the feature models used in this work.

constraints, feature models

D. Batory and E. Boerger. Modularizing Theorems for Software Product Lines: The Jbook Case Study. Journal of Universal Computer Science (JUCS). Keynote at Abstract State Machine (ASM) Workshop, June 2007.

A goal of software product lines is the economical assembly of programs in a family of programs. In this paper, we explore how theorems about program properties may be integrated into feature-based development of software product lines. As a case study, we analyze an existing Java/JVM compilation correctness proof for defining, interpreting, compiling, and executing bytecode for the Java language. We show how features modularize program source, theorem statements and their proofs. By composing features, the source code, theorem statements and proofs for a program are assembled. The investigation in this paper reveals a striking similarity of the refinement concepts used in Abstract State Machines (ASM) based system development and Feature-Oriented Programming (FOP) of software product lines. We suggest to exploit this observation for a fruitful interaction of researchers in the two communities.

verification, semantics

G. Freeman, D. Batory, and G. Lavender. Lifting Transformational Models of Product Lines: A Case Study. Int. Conference on Model Transformations (ICMT), 2008. ( see journal version.)

Model driven development (MDD) of software product lines (SPLs) merges two increasing important paradigms that synthesize programs by transformation. MDD creates programs by transforming models, and SPLs elaborate programs by applying transformations called features. In this paper, we present the design and implementation of a transformational model of a product line of scalar vector graphics and JavaScript applications. We explain how we simplified our implementation by lifting selected features and their compositions from our original product line (whose implementations were complex) to features and their compositions of another product line (whose specifications were simple). We used operators to map higher-level features and their compositions to their lower-level counterparts. Doing so exposed commuting relationships among transformations (features) in both product lines that helped validate our model and implementation.

higher-order transformations, MDD, MDE, MDA

S. Apel and D. Batory. How AspectJ is Used: An Analysis of Eleven AspectJ Programs. TR MIP-0801, Dept. Informatics and Mathematics, University of Passau, Germany, April 2008.

While it is well-known that crosscutting concerns occur in many software projects, little is known on how aspect-oriented program- ming and in particular AspectJ have been used. In this paper, we analyze eleven AspectJ programs by different authors to answer the questions: which mechanisms are used, to what extent, and for what purpose. We found the code of these programs to be on average 86% objectoriented, 12% basic AspectJ mechanisms (introductions and method extensions), and 2% advanced AspectJ mechanisms (homogeneous advice or advanced dynamic advice). There is one class of crosscutting concerns which is mostly concerned with introductions and method extensions that matches this result well: collaborations. These results and our discussions with program authors indicate the bulk of coding activities was implementing collaborations. Several studies and researchers suggest that languages explicitly supporting collaborations are better suited than aspects a la AspectJ for this task.

Journal of Object Technology

aspects

C.H.P. Kim, K. Czarnecki, and D. Batory. On-Demand Materialization of Aspects for Application Development. Software Engineering Properties of Languages and Aspect Technologies (SPLAT), 2008.

Framework-based application development requires applications to be implemented according to rules, recipes and conventions that are documented or assumed by the framework’s Application Programming Interface (API), thereby giving rise to systematic usage patterns. While such usage patterns can be captured cleanly using conventional aspects, their variations, which arise in exceptional conditions, typically cannot be. In this position paper, we propose material- izable aspects as a solution to this problem. A materializable aspect behaves as a normal aspect for default joinpoints, but for exceptional joinpoints, it turns into a program transformation and analysis mechanism, with the IDE transforming the advice in-place and allowing the application developer to modify the materialized advice within the semantics of the aspect. We demonstrate the need for materializable aspects through a preliminary study of open-source SWT-based applications and describe our initial implementation of materializable aspects in Eclipse.

aspects

D. Batory and D. Smith. Finite Map Spaces and Quarks: Algebras of Program Structure. University of Texas at Austin, Dept. of Computer Sciences, TR-07-66.

We present two algebras that unify the disparate software composition models of Feature-Oriented Programming and Aspect-Oriented Programming. In one algebra, a finite map space underlies program synthesis, where adding finite maps and modifying their contents are fundamental operations. A second and more general algebra uses quarks, a construct that represents both expressions and their transformations. Special cases of our algebras correspond to existing systems and languages, and thus can serve as a foundation for next-generation tools and languages for feature-based program synthesis.

computational design

S. Apel, C. Lengauer, D. Batory, B. Möller, and C. Kästner. An Algebra for Feature-Oriented Software Development. Fakultät für Informatik und Mathematik, Universität Passau, MIP-0706, 2007.

Feature-Oriented Software Development (FOSD) provides a multitude of formalisms, methods, languages, and tools for building variable, customizable, and extensible software. Along different lines of research different ideas of what a feature is have been developed. Although the existing approaches have similar goals, their representations and formalizations have not been integrated so far into a common framework. We present a feature algebra as a foundation of FOSD. The algebra captures the key ideas and provides a common ground for current and future research in this field, in which also alternative options can be explored.

computational design

E. Uzuncaova, D. Garcia, S. Khurshid, and D. Batory. A Specification-Based Approach to Testing Software Product Lines (Poster Paper) . SIGSOFT FSE, 2007.

This paper presents a specification-based approach for systematic testing of products from a software product line. Our approach uses specifications given as formulas in Alloy, a first-order logic based on relations. Alloy formulas can be checked for satisfiability using the Alloy Analyzer. The fully automatic analyzer, given an Alloy formula and a scope, i.e., a bound on the universe of discourse, searches for an instance, i.e., a valuation to the relations in the formula such that it evaluates to true. The analyzer translates an Alloy formula (for the given scope) to a propositional formula and finds an instance using an off-the-shelf SAT solver. The use of an enumerating solver enables systematic test generation. We have developed a prototype based on the AHEAD theory. The prototype uses the recently developed Kodkod model finding engine of the Alloy Analyzer. We illustrate our approach using a data structure product line.

testing, geodesic

S. Thaker, D. Batory, D. Kitchin, and W. Cook. Safe Composition of Product Lines. Generative Programming and Component Engineering (GPCE), October 2007.

Programs of a software product line can be synthesized by composing modules that implement features. Besides high-level domain constraints that govern the compatibility of features, there are also low-level implementation constraints: a feature module can reference elements that are defined in other feature modules. Safe composition is the guarantee that all programs in a product line are type safe: i.e., absent of references to undefined elements (such as classes, methods, and variables). We show how safe composition properties can be verified for AHEAD product lines using feature models and SAT solvers.

constraints, recommended safe composition, verification

C. Kaestner, S. Apel, and D. Batory. A Case Study Implementing Features Using AspectJ. Software Product Line Conference (SPLC), September 2007.

Software product lines aim to create highly configurable programs from a set of features. Common belief and recent studies suggest that aspects are well-suited for implementing features. We evaluate the suitability of AspectJ with respect to this task by a case study that refactors the embedded database system Berkeley DB into 38 features. Contrary to our initial expectations, the results were not encouraging. As the number of aspects in a feature grows, there is a noticeable decrease in code readability and maintainability. Most of the unique and powerful features of AspectJ were not needed. We document where AspectJ is unsuitable for implementing features of refactored legacy applications and explain why.

aspects

S. Trujillo. Feature Oriented Model Driven Product Lines . Ph.D. Dissertation, Department of Computer Sciences, The University of the Basque Country, Spain, March 2007.

Model Driven Development (MDD) is an emerging paradigm for software construction that uses models to specify programs, and model transformations to synthesize executables. Feature Oriented Programming (FOP) is a paradigm for software product lines where programs are synthesized by composing features. Feature Oriented Model Driven Development (FOMDD) is a blend of FOP and MDD that shows how programs in a software product line can be synthesized in an MDD way by composing models from features, and then transforming these models into executables. A case study on a product line of portlets, which are components of web portals, is used to illustrate FOMDD. This case reveals mathematical properties (i.e., commuting diagrams) of portlet synthesis that helped us to validate the correctness of our approach (abstractions, tools and specifications), as well as optimize portlet synthesis. These properties expose the nature of metaprograms for program synthesis (i.e., synthesis is a metaprogram combining primitive operations that form a geometry). We exploit this to synthesize metaprograms, which when executed, will synthesize a target program of a product-line. Specifically, we elaborate on the generation of metaprograms (not programs) from abstract specifications. This is the core of GROVE: the GeneRative metaprOgramming for Variable structurE approach. Variability of synthesis is also considered. Nonetheless, the ultimate envision is a structural theory behind program synthesis.

student, MDD, MDE, MDA, geodesic, higher-order transformations, kubes, origami, categories, commuting diagrams

C. Kaestner. Aspect-Oriented Refactoring of Berkeley DB . M.Sc. Thesis, Department of Technical and Business Information Systems, University of Magdeburg, Germany, March 2007.

Aspect-oriented programming has been repeatedly suggested to implement software product lines (SPL). In this thesis we create a large case study by refactoring the open source project Berkeley DB into an SPL with 38 features using AspectJ. We focus on expressibility, readability, the role of bounded quantification, and feature interactions. Contrary to our initial expectations, the results were not encouraging. We document problems encountered and suggest solutions and directions for further research.

student, aspects

S. Apel. The Role of Features and Aspects in Software Development . Ph.D. Dissertation, Department of Technical and Business Information Systems, University of Magdeburg, Germany, March 2007.

Feature-Oriented Programming (FOP) and Aspect-Oriented Programming (AOP) are complementary technologies. Though they aim at crosscutting modularity, they do so in different ways. We observed that FOP and AOP can be combined to overcome their individual limitations. Consequently, we propose Aspectual Feature Modules (AFMs), a representative approach that unifies AOP and FOP. From this symbiosis we derive the novel notion of Aspect Refinement (AR) that integrates aspects into the stepwise development philosophy of FOP. We use AFMs and AR in a non-trivial case study to create a product line of overlay networks. We also present a set of guidelines to assist programmers in how and when to use FOP and AOP techniques for implementing product lines in a stepwise and generative manner. Finally, we answer the question of how FOP and AOP-related implementation techniques are used today by analyzing a representative set of AspectJ programs of different sizes. We observe that aspects are used frequently for implementation problems that are closely related to FOP. We discuss why this is not surprising.

student, aspects

S. Trujillo, D. Batory, and O. Diaz Feature Oriented Model Driven Development: A Case Study for Portlets. International Conference on Software Engineering (ICSE), 2007.

Model Driven Development (MDD) is an emerging paradigm for software construction that uses models to specify programs, and model transformations to synthesize executables. Feature Oriented Programming (FOP) is a paradigm for software product lines where programs are synthesized by composing features. Feature Oriented Model Driven Development (FOMDD) is a blend of FOP and MDD that shows how products in a software product line can be synthesized in an MDD way by composing models from features, and then transforming these models into executables. We present a case study of FOMDD on a product line of portlets, which are components of web portals. We reveal mathematical properties of portlet synthesis that helped us to validate the correctness of our abstractions, tools, and specifications, as well as optimize portlet synthesis.

MDD, MDE, MDA, categories, commuting diagrams

D. Batory. Program Refactorings, Program Synthesis, and Model-Driven Design . Keynote at European Joint Conferences on Theory and Practice of Software (ETAPS) Compiler Construction Conference, April 2007.

Program refactoring, feature-based and aspect-oriented software synthesis, and model-driven development are disjoint research areas. However, they are all architectural metaprogramming technologies as they treat programs as values and use functions (a.k.a. transformations) to map programs to other programs. In this paper, I explore their underlying connections by reviewing recent advances in each area from an architectural metaprogramming perspective. I conjecture how these areas can converge and outline a theory that may unify them.

MDD, MDE, MDA, categories, commuting diagrams, refactoring, keynote, recommended theory

D. Batory. From Implementation to Theory in Product Synthesis . Keynote at Principles of Programming Languages (POPL), January 2007.

Future software development will rely on product synthesis, i.e., the synthesis of code and non-code artifacts for a target component or application. Prior work on feature-based product synthesis can be unified by applying elementary ideas from category theory. Doing so reveals (a) important and previously unrecognized properties that product synthesis tools must satisfy, and (b) non-obvious generalizations of current techniques that will guide future research efforts in automated product development.

Published version of this paper.

MDD, MDE, MDA, categories, commuting diagrams, computational design, geodesic

R. Lopez-Herrejon and D. Batory, Modeling Features in Aspect-Based Product Lines with Use Case Slices: An Exploratory Case Study." Aspect Oriented Modeling Workshop at Model Driven Engineering Languages and Systems (MODELS). Genoa, Italy 2006. Also, in Models in Software Engineering. Workshops and Symposia at MoDELS 2006, Genoa, Italy, October 1-6, 2006, Reports and Revised Selected Papers. Series: Lecture Notes in Computer Science, Vol. 4364 Sublibrary: Programming and Software Engineering 2007.

A significant number of techniques that exploit aspects in software design have been proposed in recent years. One technique is use case slices by Jacobson and Ng, that builds upon the success of use cases as a common modeling practice. A use case slice modularizes the implementation of a use case and typically consists of a set of aspects, classes, and interfaces. Work on Feature Oriented Programming (FOP) has shown how features, increments in program functionality, can be modularized and algebraically modeled for the synthesis of product lines. When AspectJ is used in FOP, the structure of feature modules resembles that of use case slices. In this paper, we explore the relations between use case slices modeling and FOP program synthesis and describe their potential synergy for modeling and synthesizing aspect-based product lines.

aspects, UML, award

S. Thaker. Design and Analysis of Multidimensional Program Structures . M.Sc. Thesis, December 2006, The Department of Computer Sciences, The University of Texas at Austin.

Software design is an art. Just as understanding of the structure of atomic elements is critical to the natural sciences, it too is fundamental to the science behind software engineering. Feature-Oriented Programming and compositional programming embody ideas of software decomposition structure. Within feature-decomposition, multidimensional structures are of particular interest. Multidimensional structures capture a higher-order structural relationship within conventional hierarchal decomposition. They simplify a complex decomposition's representation, understanding, and compositional specification. In this thesis we first overview multidimensional structures. We then identify two critical properties a decomposition must satisfy and devise algorithms for efficiently verifying them. The Safe Composition property guarantees lack of type-errors at composition-time. The Orthogonality property requires a multidimensional decomposition to be identical along all dimensions. In identifying and verifying key properties we have also demonstrated how a structural model can simplify our understanding and applicability of essential principles. When the fundamental structure is well-understood, software design is amenable to automation.

student, analysis, origami, safe composition, verification

D. Batory, D. Benavides, and A. Ruiz-Cortes. Automated Analyses of Feature Models: Challenges Ahead. CACM, Special Issue on Software Product Lines, December 2006.

This paper describes recent advances in formalizing feature models and the challenges ahead in automating product specification and design.

feature models

S. Apel and J. Liu. "On the Notion of Functional Aspects in Aspect-Oriented Refactoring". Keynote at Workshop on Aspects, Dependencies, and Interactions (ADI), Nantes, France, 2006.

In this paper, we examine the notion of functional aspects in context of aspect-oriented refactoring. Treating aspects as functions reduces the potential interactions between aspects significantly. We propose a simple mathematical model that incorporates fundamental properties of aspects and their interactions. Using this model, we show that with regard to refactoring functional aspects are as flexible as traditional aspects, but with reduced program complexity.

aspects, keynote

S. Apel, D. Batory, and M. Rosenmueller. "On the Structure of Crosscutting Concerns:Using Aspects or Collaborations?", 1st Workshop on Aspect-Oriented Product Line Engineering (AOPLE), October 2006.

While it is well known that crosscutting concerns occur in many software projects, little is known about the inherent properties of these concerns nor how aspects (should) deal with them. We present a framework for classifying the structural properties of crosscutting concerns into (1) those that benefit from AOP and (2) those that should be implemented by OOP mechanisms. Further, we propose a set of code metrics to perform this classification. Applying them to a case study is a first to step toward revealing the current practice of AOP.

aspects

G.T. Leavens, J-R Abrial, D. Batory, M. Butler, A. Coglio, K. Fisler, E. Hehner, C. Jones, D. Miller, S. Peyton-Jones, M. Sitaraman, D.R. Smith, and A. Stump, "Roadmap for Enhanced Languages and Methods to Aid Verification", Generative Programming and Component Engineering (GPCE), October 2006.

This road map describes ways that researchers in four areas - specification languages, program generation, correctness by construction, and programming languages - might help further the goal of verified software. It also describes what advances the "verified softwar" grand challenge might anticipate or demand from work in these areas. That is, the roadmap is intended to help foster collaboration between the grand challenge and these research areas. A common goal for research in these areas is to establish language designs and tool architectures that would allow multiple annotations and tools to be used on a single program. In the long term, researchers could try to unify these annotations and integrate such tools.

verification, semantics

S. Apel and D. Batory. When to Use Features and Aspects? A Case Study ,Generative Programming and Component Engineering (GPCE), October 2006.

Aspect-Oriented Programming (AOP) and Feature-Oriented Programming (FOP) are complementary technologies that can be combined to overcome their individual limitations. Aspectual Mixin Layers (AML) is a representative approach that unifies AOP and FOP. We use AML in a non-trivial case study to create a product line of overlay networks. We also present a set of guidelines to assist programmers in how and when to use AOP and FOP techniques for implementing product lines in a stepwise and generative manner.

aspects

S. Trujillo, D. Batory, and O. Diaz. Feature Refactoring a Multi-Representation Program into a Product Line , Generative Programming and Component Engineering (GPCE), October 2006.

Feature refactoring is the process of decomposing a program into a set of modules, called features, that encapsulate increments in program functionality. Different compositions of features yield different programs. As programs are defined using multiple representations, such as code, makefiles, and documentation, feature refactoring requires all representations to be factored. Thus, composing features produces consistent representations of code, makefiles, documentation, etc. for a target program. We present a case study of feature refactoring a substantial tool suite that uses multiple representations. We describe the key technical problems encountered, and sketch the tool support needed for simplifying such refactorings in the future.

refactoring

M. Grechanik Design and Analysis of Interoperating Components . Ph.D. dissertation, September 2006, The Department of Computer Sciences, The University of Texas at Austin.

Components are modular units (e.g., objects, modules, or programs) that interact by exchanging data. Two or more components interoperate when they exchange information. We propose an abstraction in which foreign objects (i.e., objects that are not defined in a host programming language) are abstracted as graphs and abstract operations access and manipulate them. Using this abstraction, we build a framework called Reification Object-Oriented Framework (ROOF) to reduce the multiplicity of platform API calls to a small set of operations on foreign objects that are common to all platforms. Based on our abstraction, we designed Foreign Object REification Language (FOREL), an extension for object-oriented languages whose type checking coupled with a conservative static analysis mechanism reports potential errors when referencing foreign components. Also, we design and build a tool for legacy code called a Verifier for Interoperating cOmponents for finding Logic fAults (Viola) that finds errors in components exchanging XML data.

student, analysis, verification

R.E. Lopez-Herrejon. Understanding Feature Modularity . .Ph.D. dissertation, September 2006, The Department of Computer Sciences, The University of Texas at Austin.

An undesirable consequence is that developers must lower their abstractions from features to those provided by the underlying implementation languages, a process that is far from simple let alone amenable to significant automation. The conceptual gap created between feature abstractions and their modularization hinders program understanding and product line development. The root of the problem is the fact that feature modularity is not well understood and thus not well supported in conventional programming languages, modularization mechanisms, and design techniques. In this dissertation, we explore language and modularity support for features founded on an algebraic model geared for program synthesis. Our model integrates ideas from collaboration-based designs, mixin layers, aspect oriented programming, multidimensional separation of concerns, and generative programming. We assess our model with an implementation of a non-trivial product line case study, and evaluate feature support in emerging modularization technologies.

student, origami, aspects

D. Batory. Multi-Level Models in Model Driven Development, Product-Lines, and Metaprogramming , IBM Systems Journal, Volume 45, Number 3, 2006.

Model Driven Development (MDD) aims to raise the level of abstraction in program specification and increase automation in program development. These are also the goals of product-lines (creating a family of related programs) and metaprogramming (programming as a computation). We show that the confluence of MDD, product-lines, and metaprogramming exposes a multilevel paradigm of program development, and further, we can use OO design techniques to represent programs, the metaprograms that produced these programs, and the meta-meta programs that produced these metaprograms, recursively. The paradigm is based on a small number of simple and well-known ideas, scales to the synthesis of applications of substantial size, and helps clarify concepts of MDD.

MDD, MDE, MDA

J. Liu, D. Batory, and C. Lengauer. "Feature Oriented Refactoring of Legacy Applications", International Conference on Software Engineering 2006 (ICSE), Shanghai, China.

Feature oriented refactoring (FOR) is the process of decomposing a program into features, where a feature is an increment in program functionality. We develop a theory of FOR that relates code refactoring to algebraic factoring. Our theory explains relationships between features and their implementing modules, and why features in different programs of a product-line can have different implementations. We describe a tool and refactoring methodology based on our theory, and present a validating case study.

refactoring, computational design, optional feature problem

R. Lopez-Herrejon, D. Batory, and C. Lengauer. "A Disciplined Approach to Aspect Composition", ACM SIGPLAN 2006 Symposium on Partial Evaluation and Program Manipulation (PEPM), January 2006, Charleston, SC.

Aspect-oriented programming is a promising paradigm that challenges traditional notions of program modularity. Despite its increasing acceptance, aspects have been documented to suffer limited reuse, hard to predict behavior, and difficult modular reasoning. We develop an algebraic model that relates aspects to program transformations and uncovers aspect composition as a significant source of the problems mentioned. We propose an alternative model of composition that eliminates these problems, preserves the power of aspects, and lays an algebraic foundation on which to build and understand AOP tools.

aspects, computational design

M. Grechanik, D.E. Perry, and D. Batory. A Scalable Security Mechanism For Component-Based Systems, Int. Conf. COTS-Based Software Systems, April 2006, Orlando FL.

Security, scalability, and performance are critical for large-scale component-based applications. Weaving security solutions into the fabric of component-based architectures often worsens the scalability and performance of the resulting system. In this paper we analyze the sources of non-scalability and conduct an empirical study that shows that close to 80% of interactions between components and their clients in different commercial systems occur within protected security boundaries. Based on these findings we propose a novel scalable security mechanism for component based systems called Component Adaptive Scalable Secure Infrastructure Architecture (CASSIA). CASSIA utilizes the topology of the security boundaries and patterns of interactions among components to achieve noticeable improvements in scalability and performance for component-based applications. We conduct a case study that confirms the scalability of CASSIA, and propose a Secure COmponent Protocol (SCOP) that incorporates our mechanism into a component infrastructure.

D. Batory.A Tutorial on Feature Oriented Programming and the AHEAD Tool Suite, Proc. Summer School on Generative and Transformation Techniques in Software Engineering, Braga, Portugal, R. Laemmel, J. Saraiva, and J. Visser, ed., Vol 4143, Lecture Notes in Computer Science, Springer-Verlag, 2006.

Feature oriented programming (FOP) is the study of feature modularity and its use in program synthesis. AHEAD is a theory of FOP that is based on a fundamental concept of generative programming that functions map programs. This enables the design of programs to be expressed compositionally as algebraic expressions, which are suited for automated analysis, manipulation, and program synthesis. This paper is a tutorial on FOP and AHEAD. We review AHEAD's theory and the tool set that implements it.

E. Jung, C. Kapoor, and D. Batory. "Automatic Code Generation for Actuator Interfacing from a Declarative Specification", International Conference on Robotics and Systems (IROS), August 2005, Edmonton, Canada.

Common software design practices use object-oriented (OO) frameworks that structure software in terms of objects, classes, and package; designers then create programs by inheritance and composition of classes and objects. Operational Software Components for Advanced Robotics (OSCAR) is one such framework for robot control software with abstractions for generalized kinematics, dynamics, performance criteria, decision making, and hardware interfacing. Even with OSCAR, writing new programs still requires a significant amount of manual labor. Feature-Oriented Programming (FOP) is method for software design that models and specifies programs in terms of features, where a feature encapsulates the common design decisions that occur in a domain. A set of features then forms a domain model for a Product Line Architecture. Product variants in this product line can then be generated from a declarative specification. FOP and related technologies are emerging software engineering techniques for automatically generating programs. Our research applies FOP to robot controller software. As an example, the domain of hardware interfacing is analyzed and 41 features identified. A GUI for specifying and generating programs is presented as well. Analysis of features shows 200 possible different programs could be generated.

D. Batory. "Feature Models, Grammars, and Propositional Formulas", Software Product Line Conference (SPLC) September 2005.

Feature models are used to specify members of a product-line. Despite years of progress, contemporary tools provide limited support for feature constraints and offer little or no support for debugging feature models. We integrate prior results to connect feature models, grammars, and propositional formulas. This connection allows arbitrary propositional constraints to be defined among features and enables off-the-shelf satisfiability solvers to debug feature models. We also show how our ideas can generalize recent results on the staged configuration of feature models.

constraints, feature models, recommended design rule checking

J. Liu, D. Batory, and S. Nedunuri. Modeling Interactions in Feature Oriented Designs, International Conference on Feature Interactions (ICFI), June 2005.

Feature Oriented Programming (FOP) is a general theory of software development where programs are assembled by composing feature modules. A feature X interacts structurally with another feature Y by changing Y's source code. We advance FOP by proposing an algebraic theory of structural feature interactions that models feature interactions as derivatives. We use our theory to show how a legacy Java application can be refactored into a feature-based design.

feature interactions, derivatives, optional feature problem

R. Lopez-Herrejon, D. Batory, and W. Cook. "Evaluating Support for Features in Advanced Modularization Technologies", European Conference on Object-Oriented Programming (ECOOP), July 2005.

A software product-line is a family of related programs. Each program is defined by a unique combination of features, where a feature is an incremental change in program functionality. Modularizing features is difficult, as feature-specific code often cuts across class boundaries. New modularization technologies have been proposed in recent years, but their support for feature modules has not been thoroughly examined. In this paper, we propose a variant of the expression problem as a canonical problem in product-line design. The problem reveals a set of technology-independent properties that feature modules should exhibit. We use these properties to evaluate five technologies: AspectJ, Hyper/J, Jiazzi, Scala, and AHEAD. The results suggest an abstract model of feature composition that is technology-independent and that relates compositional reasoning with algebraic reasoning.

origami, kubes, tensors, aspects

M. Grechanik, D.E. Perry, and D. Batory. Using AOP to Monitor and Administer Software for Grid Computing Environments, Int. Computer Software and Applications Conference (COMPSAC), Edinburgh, Scotland, July 2005.

Monitoring is a task of collecting measurements that reflect the state of a system. Administration is a collection of tasks for control and manipulation of computer systems. Monitoring and Administering computer ResourceS (MARS) in a distributed grid computing environment (i.e. a distributed environment for coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations) is an important, expensive, and critical task. We present a novel solution based on applying crosscuts using binary rewriters and an event-based model that allows developers to create non-trivial MARS programs easily and uniformly. Our approach converts low-level API resource calls into system-wide events that MARS programs can monitor. This is accomplished by introducing advice that contains event-generating code at join points in programs that represent computer resources. We categorize low-level resource APIs by imposing a transactional metaphor to simplify the complexity of interactions between resources and to enable reasoning about MARS programs. We report both a case study and simulation that supports the viability of our approach.

R. Lopez-Herrejon and D. Batory. Improving Incremental Development in Aspectj by Bounding Quantification, Software Engineering Properties and Languages for Aspect Technologies (SPLAT), March 2005.

Incremental software development is a process of building complex programs from simple ones by successively adding programmatic details. It is an effective and common design practice that helps control program complexity. However, incrementally building software using aspects can be more challenging than using traditional modules. AspectJ quantification mechanisms do not distinguish different developmental stages, and thus pointcuts can capture join points from later stages that they originally were not intended to advice. In this paper we present an algebraic model to describe aspects and their composition in AspectJ. We show that the way AspectJ's composes aspects plays a significant role in this problem. We propose an alternative model to compose aspects that improves the support for incremental development. It bounds the scope of quantification and still preserves the same power of AspectJ. We also show how bounded quantification contributes to aspect reuse.

aspects, computational design

J. Blair and D. Batory. A Comparison of Generative Approaches: XVCL and GenVoca, Technical Report December 2004.

We report one of the first comparative studies on two mature approaches to generative programming: XVCL and GenVoca. XVCL is the latest incarnation of Bassett's frames; GenVoca melds feature modularity with compositional programming. Both approaches explicitly rest on a pair of ideas: (1) parameterized functions return source code and (2) composing such functions synthesizes applications. Although their foundations are identical, XVCL and GenVoca have very different design philosophies. These differences raise an interesting debate on what design methodologies and programming constructs should be used in writing generative programs.
Code for this paper is here.

D. Batory A Science of Software Design . Keynote at International Conference on Algebraic Methodology And Software Technology (AMAST) 2004.

Underlying large-scale software design and program synthesis are simple and powerful algebraic models. In this paper, I review the elementary ideas upon which these algebras rest and argue that they define the basis for a science of software design.

computational design

D. Batory Program Comprehension in Generative Programming: A History of Grand Challenges . Keynote at International Workshop on Program Comprehension (IWPC) 2004.

The communities of Generative Programming (GP) and Program Comprehension (PC) look at similar problems: GP derives a program from a specification, PC derives a specification from a program. A basic difference between the two is GP’s use of specific knowledge representations and mental models that are essential for program synthesis. In this paper, I present a historical review of the Grand Challenges, results, and outlook for GP as they pertain to PC.

keynote

J. Liu and D. Batory. " Automatic Remodularization and Optimized Synthesis of Product-Families, Generative Programming and Component Engineering (GPCE), October 2004.

A product-family is a suite of integrated tools that share a common infrastructure. Program synthesis of individual tools can replicate common code, leading to unnecessarily large executables and longer build times. In this paper, we present remodularization techniques that optimize the synthesis of product-families. We show how tools can be automatically remodularized so that shared files are identified and extracted into a common package,and how isomorphic class inheritance hierarchies can be merged into a single hierarchy. Doing so substantially reduces the cost of program synthesis, product-family build times, and executable sizes. We present a case study of a product-family with five tools totalling over 170K LOC, where our optimizations reduce archive size and build times by 40%.

recommended optimization

D. Batory, J.N. Sarvela, and A. Rauschmayer. Scaling Step-Wise Refinement , IEEE Transactions on Software Engineering (IEEE TSE), June 2004. This is a journal version of our 2002 ICSE paper.

Step-wise refinement is a powerful paradigm for developing a complex program from a simple program by adding features incrementally. We present the AHEAD (Algebraic Hierarchical Equations for Application Design) model that shows how step-wise refinement scales to synthesize multiple programs and multiple noncode representations. AHEAD shows that software can have an elegant, hierarchical mathematical structure that is expressible as nested sets of equations. We review a tool set that supports AHEAD. As a demonstration of its viability, we have bootstrapped AHEAD tools from equational specifications, refining Java and non-Java artifacts automatically; a task that was accomplished only by ad hoc means previously.

computational design, award, recommended theory

M. Grechanik, D. Batory, and D.E. Perry.Design of Large-Scale Polylingual Systems. International Conference on Software Engineering (ICSE), Edinburgh, Scotland, UK, May 2004.

Building systems from existing applications written in two or more languages is common practice. Such systems are polylingual. Polylingual systems are relatively easy to build when the number of APIs needed to achieve language interoperability is small. However, when the number of distinct APIs become large, maintaining and evolving polylingual systems becomes a notoriously difficult task. In this paper, we present a simple, practical, and effective way to develop, maintain, and evolve large-scale polylingual systems. Our approach relies on recursive type systems whose instances can be manipulated by reflection. Foreign objects (i.e. objects that are not defined in a host programming language) are abstracted as graphs and path expressions are used for accessing and manipulating data. Path expressions are implemented by type reification which turns foreign type instances into first-class objects and enabling access to and manipulation of them in a host programming language. Doing this results in multiple benefits, including coding simplicity and uniformity that we demonstrate in a complex commercial project.

D. Batory. Feature-Oriented Programming and the AHEAD Tool Suite, International Conference on Software Engineering (ICSE), Edinburgh, Scotland, 2004.

Feature Oriented Programming (FOP) is an emerging paradigm for application synthesis, analysis, and optimization. A target application is specified declaratively as a set of features, like many consumer products (e.g., personal computers, automobiles). FOP technology translates such declarative specifications into efficient programs. AHEAD is a model of FOP that is based on step-wise refinement, which advocates that complex programs can be synthesized from simple programs by incrementally adding features. The AHEAD Tool Suite (ATS) supports program development in AHEAD. AHEAD and ATS are among the most advanced models/tools for large-scale program synthesis.

M. Grechanik, D.E. Perry, and D. Batory.Reengineering Large-Scale Polylingual Systems. International Workshop on Incorporating COTS into Software Systems: Tools and Techniques (IWICSS) co-located with the 3rd International Conference on COTS-Based Software Systems (ICCBSS), Redondo Beach, CA, February 2004.

Building systems from existing applications written in two or more languages is common practice. Such systems are polylingual. Polylingual systems are relatively easy to build when the number of APIs needed to achieve language interoperability is small. However, when the number of distinct APIs become large, maintaining and evolving them becomes a notoriously difficult task. We present a practical and effective process to reengineer large-scale polylingual systems. We offer a programming model that is based on the uniform abstraction of polylingual systems as graphs and the use of path expressions for traversing and manipulating data. This model enables us to achieve multiple benefits, including coding simplicity and uniformity (where neither was present before) that facilitate further reverse engineering. By performing control and data flow analyses of polylingual systems we infer the schemas used by all participating programs and the actions performed by each program on others. Finally, we describe a tool called FORTRESS that automates our reverse engineering process. The contribution of this paper is a process that allows programmers to reverse engineer foreign type systems and their instances semiautomatically at the highest level of design. We know of no other approach with comparable benefits.

D. Batory, J. Liu, J.N. Sarvela. Refinements and Multi-Dimensional Separation of Concerns, ACM SIGSOFT FSE 2003.

Step-wise refinement (SWR) asserts that complex programs can be derived from simple programs by progressively adding features. The length of a program specification is the number of features that the program has. Critical to the scalability of SWR are multi-dimensional models that separate orthogonal feature sets. Let n be the dimensionality of a model and k be the number of features along a dimension. We show program specifications that could be O(k**n) features long have short and easy-to-understand specifications of length O(kn) when multi-dimensional models are used. We present new examples of multidimensional models: a micro example of a product-line (whose programs are 30 lines of code) and isomorphic macro examples (whose programs exceed 30K lines of code). Our work provides strong evidence that SWR scales to synthesis of large systems.

recommended origami, kubes, tensors

D. Batory, J.N. Sarvela, and A. Rauschmayer. Scaling Step-Wise Refinement, International Conference on Software Engineering (ICSE), Portland, Oregon, 2003.

Step-wise refinement is a powerful paradigm for developing a complex program from a simple program by adding features incrementally. We present the AHEAD (Algebraic Hierarchical Equations for Application Design) model that shows how step-wise refinement scales to synthesize multiple programs and multiple non-code representations. AHEAD shows that software can have an elegant, hierarchical mathematical structure that is expressible as nested sets of equations. We review a tool set that supports AHEAD. As a demonstration of it's viability, we have bootstrapped AHEAD tools solely from equational specifications, generating Java and non-Java artifacts automatically, a task that was accomplished only by ad hoc means previously.

computational design

D. Batory. The Road to Utopia: A Future for Generative Programming . Keynote at Dagstuhl for Domain-Specific Program Generation, March 2003.

The future of software engineering lies in automation, and will exploit the combined strengths of generative programming, domain-specific languages, and automatic programming. While each of these areas is still in its infancy, a spectacularly successful example of their combination was realized twenty-five years ago: relational query optimization. In this paper, I chart the successes and mindset used by database researchers to generate efficient query processing programs automatically. I argue that the road that they have so successfully followed is the same road that the generative programming, domain-specific languages, and automatic programming communities are now traversing.

keynote

M. Grechanik, D.E. Perry, and D. Batory. An Approach to Evolving Database Dependent Systems . International Workshop on Principles of Software Evolution, Orlando, Florida, May 2002.

It is common in client/server architectures for components to have SQL statements embedded in their source code. Components submit queries to relational databases using such standards as Universal Data Access (UDA) and Open Database Connectivity (ODBC). The API that implements these standards is complex and requires the embedding of SQL statements in the language that is used to write the components. Such programming practices are widespread and result in increased complexity in maintaining systems. We propose an approach that eliminates the embedding of SQL in programming languages, thereby enabling the automation of important software maintenance tasks.

evolution

D. Batory, R. Lopez-Herrejon, and J-P. Martin. Generating Product-Lines of Product-Families, Automated Software Engineering Conference (ASE), 2002.

GenVoca is a methodology and technology for generating product-lines, i.e., building variations of a program. The primitive components from which applications are constructed are refinements or layers, which are modules that implement a feature that many programs of a product-line can share. Unlike conventional components (e.g., COM, CORBA, EJB), a layer encapsulates fragments of multiple classes. Sets of fully-formed classes can be produced by composing layers. Layers are modular, albeit unconventional, building blocks of programs.But what are the building blocks of layers? We argue that facets is an answer. A facet encapsulates fragments of multiple layers, and compositions of facets yields sets of fully-formed layers. Facets arise when refinements scale from producing variants of individual programs to producing variants of multiple, integrated programs, as typified by product families (e.g., MS. Office). We present a mathematical model that explains relationships between layers and facets. We use the model to develop a generator for tools (i.e., product-family) that are used in language-extensible Integrated Development Environments (IDEs).
Example is here. Extended draft is here.

origami, tensors, kubes, award

D. Batory, C. Johnson, B. MacDonald, and D. von Heeder. Achieving Extensibility Through Product-Lines and Domain-Specific Languages: A Case Study ", ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 11#2, April 2002, 191-214. This is a journal version of our 2000 ICSR paper.

This is a case study in the use of product-line architectures (PLAs) and domain-specific languages (DSLs) to design an extensible command-and-control simulator for Army fire support. The reusable components of our PLA are layers or "aspects" whose addition or removal simultaneously impacts the source code of multiple objects in multiple, distributed programs. The complexity of our component specifications is substantially reduced by using a DSL for defining and refining state machines, abstractions that are fundamental to simulators. We present preliminary results that show how our PLA and DSL synergistically produce a more flexible way of implementing state-machine-based simulators than is possible with a pure Java implementation.

award

M.Grechanik, D. Batory, and Dewayne E. Perry. Integrating and Reusing GUI-Driven Applications, International Conference on Software Reuse (ICSR), Austin, Texas, April 2002.

Graphical User Interface (GUI) Driven Applications (GDAs) are ubiquitous. We present a model and techniques that take closed and monolithic GDAs and integrate them into an open, collaborative environment. The central idea is to objectify the GUI of a GDA, thereby creating an object that enables programmatic control of that GDA. We demonstrate a non-trivial application of these ideas by integrating a standalone internet application with a stand-alone Win32 application, and explain how PDAs (Personal Digital Assistants) can be used to remotely control their combined execution. Further, we explain how Integrated Development Environment (IDEs) may be extended to integrate and reuse GDAs using our approach. We believe our work is unique: we know of no other technology that could have integrated the GDAs of our example.

Y.Smaragdakis and D. Batory. Mixin Layers: An Object-Oriented Implementation Technique for Refinements and Collaboration-Based Designs, ACM Transactions on Software Engineering and Methodology (TOSEM), April 2002.

A "refinement" is a functionality addition to a software project that can affect multiple dispersed implementation entities (functions, classes, etc.). In this paper, we examine large-scale refinements in terms of a fundamental object-oriented technique called collaboration-based design. We explain how collaborations can be expressed in existing programming languages or be supported with new language constructs (which we have implemented as extensions to the Java language). We present a specific expression of large-scale refinements called mixin layers, and demonstrate how it overcomes the scalability difficulties that plagued prior work. We also show how we used mixin layers as the primary implementation technique for building an extensible Java compiler, JTS.

mixin layers

A. Rauschmayer. Probe: Visualizing Algebraic Transformations in the GenBorg Generative Software Environment, Institut fuer Informatik Ludwig Maximilians Universitaet Muenchen, December, 2001.

Software is becoming increasingly pervasive and complex. The greatest challenge for the industry today is how to produce more code without compromising quality or cost. GenBorg is a programming paradigm whose answer is to scale the units of reuse from components of individual applications to components of application suites (e. g., MS Office). While traditional approaches trade size of code units for flexibility, GenBorg uses techniques from generative programming and feature-oriented programming in order to provide models that instantiate code and non-code artifacts of customizable application suites. Because developing software is not limited to writing code, but current ways of automating development are, a lot of human energy is wasted for keeping related data such as documentation and formal properties consistent with source code. GenBorg manages to evolve all these artifacts automatically and in parallel. GenBorg's ideas are backed by a theoretical foundation, the GenBorg algebra. The Java application Probe is a first implementation of the GenBorg programming paradigm and uses a simple, file-based data structure for defining GenBorg models. Its graphical user interface makes it easy to navigate through and instantiate from large models.

student

D. Batory, D. Brant, M. Gibson, and M. Nolen. ExCIS: An Integration of Domain-Specific Languages and Feature-Oriented Programming, Workshop on New Visions for Software Design and Productivity: Research and Applications, Vanderbilt University, Nashville, Tennessee, December 13-14, 2001.

ExCIS is a methodology and tool suite that integrates the technologies of domain-specific languages (DSLs) and feature-oriented programming (FOP). DSLs offer compact specifications of programs in domain-specific notations that are easier to write, maintain, and evolve. FOP raises the level of abstraction in system programming from compositions of code-centric components to compositions of modules that implement individual and largely orthogonal features. ExCIS provides state-of-the-art tools for creating easier-to-specify, easier-to-maintain, and easier-to-change systems. It is being developed for STRICOM to create next-generation simulators for the U.S. Army.

R. E. Lopez-Herrejon and D. Batory. A Standard Problem for Evaluating Product-Line Methodologies, Generative and Component-Based Software Engineering (GCSE/GPCE), September 9-13, 2001 Messe Erfurt, Erfurt, Germany.

We propose a standard problem to evaluate product-line methodologies. It relies on common knowledge from Computer Science, so that domain-knowledge can be easily acquired, and it is complex enough to expose the fundamental concepts of product-line methodologies. As a reference point, we present a solution to this problem using the GenVoca design methodology. We explain a series of modeling, implementation, and benchmarking issues that we encountered, so that others can understand and compare our solution with theirs.

GPL

E.W. Dijkstra. What is Refinement? E.W.D. Memo, August 30, 2001.

This is E.W. Dijkstra's response to the question 'What is a Refinement?

semantics

K. Fisler, S. Krishnamurthi, D. Batory, and J. Liu. A Model Checking Framework for Layered Command and Control Software, Workshop on Engineering Automation for Software Intensive System Integration, Monterrey, California, June 18-22, 2001.

Most existing modular model checking techniques betray their hardware roots: they assume that modules compose in parallel. In contrast, layered software systems, which have proven successful in many domains, are really quasi-sequential compositions of parallel compositions. Most such systems demand and inspire new modular verification techniques. This paper presents algorithms that exploit a layered (or feature-based) decomposition to drive verification. Our technique can verify most properties locally within a layer; we also characterize when a global state space construction is unavoidable. This work is motivated by our efforts to verify a military fire simulation and support software system called FSATS.

verification, semantics

K. Fisler, S. Krishnamurthi, and D. Batory. Verifying Component-Based Collaboration Designs, 4th ICSE Workshop on Component-Based Software Engineering: Component Certification and System Prediction. Toronto, Ontario, Canada, May 2001.

verification, semantics

Y. Smaragdakis and D. Batory. Mixin-Based Programming in C++, Second International Symposium on Generative and Component-Based Software Engineering (GCSE now GPCE), Erfurt, Germany, October 9-12, 2000.

Combinations of C++ features, like inheritance, templates, and class nesting, allow for the expression of powerful component patterns. In particular, research has demonstrated that, using C++ mixin classes, one can express layered component-based designs concisely with efficient implementations. In this paper, we discuss pragmatic issues related to component-based programming using C++ mixins. We explain surprising interactions of C++ features and policies that sometimes complicate mixin implementations, while other times enable additional functionality without extra effort.

mixin layers

D. Batory, C. Johnson, B. MacDonald, and D. von Heeder., FASTS: An Extensible C4I Simulator for Army Fire Support, Workshop on Product Lines for Command-and-Control Ground Systems at the First International Software Product Line Conference (SPLC) in Denver, Colorado, August 2000.

L. Tokuda and D. Batory. Evolving Object-Oriented Designs with Refactorings. Journal of Automated Software Engineering 8, 89-120, 2001. This is an enlarged version of our 1999 ASE paper.

Refactorings are behavior-preserving program transformations that automate design evolution in object-oriented applications. Three kinds of design evolution are: schema transformations, design pattern micro-architectures, and the hot-spot-driven-approach. This research shows that all three are automatable with refactorings. A comprehensive list of refactorings for design evolution is provided and an analysis of supported schema trans-formations, design patterns, and hot-spot meta patterns is presented. Further, we evaluate whether refactoring technology can be transferred to the mainstream by restructuring non-trivial C++ applications. The applications that we examine were evolved manually by software engineers. We show that an equivalent evolution could be reproduced significantly faster and cheaper by applying a handful of general-purpose refactorings. In one application, over 14K lines of code were transformed automatically that otherwise would have been coded by hand. Our experiments identify benefits, limitations, and topics of further research related to the transfer of refactoring technology to a production environment.

refactoring award

D. Batory. Refinements and Separation of Concerns. Second Workshop on Multi-Dimensional Separation of Concerns, International Conference on Software Engineering (ICSE), Limerick, Ireland, 2000.

Today's notions of encapsulation are very restricted - a module or component contains only source code. What we really need is for modules or components to encapsulate not only source code that will be installed when the component is used, but also encapsulate corresponding changes to documentation, formal properties, and performance properties - i.e., changes to the central concerns of software development. The general abstraction that encompasses this broad notion of encapsulation is called a "refinement".

D. Batory, Rich Cardone, and Y. Smaragdakis. Object-Oriented Frameworks and Product-Lines. 1st Software Product-Line Conference (SPLC), Denver, Colorado, August 2000.

Frameworks are a common object-oriented code-structuring technique that is used in application product-lines. A framework is a set of abstract classes that embody an abstract design; a framework instance is a set of concrete classes that subclass abstract classes to provide an executable subsystem. Frameworks are designed for reuse: abstract classes encapsulate common code and concrete classes encapsulate instance-specific code. Unfortunately, this delineation of reusable vs. instance-specific code is problematic. Concrete classes of different framework instances can have much in common and there can be variations in abstract classes, all of which lead to unnecessary code replication. In this paper, we show how to overcome these limitations by decomposing frameworks and framework instances into primitive and reusable components. Doing so reduces code replication and creates a component-based product-line of frameworks and framework instances.

mixin layers, recommended frameworks

D. Batory, Clay Johnson, Bob MacDonald, and Dale von Heeder. Achieving Extensibility Through Product-Lines and Domain-Specific Languages: A Case Study, International Conference on Software Reuse (ICSR), Vienna, Austria, 2000. Here is a pointer to an expanded version of this paper.

This is a case study in the use of product-line architectures (PLAs) and domain-specific languages (DSLs) to design an extensible command-and-control simulator for Army fire support. The reusable components of our PLA are layers or "aspects" whose addition or removal simultaneously impacts the source code of multiple objects in multiple, distributed programs. The complexity of our component specifications is substantially reduced by using a DSL for defining and refining state machines, abstractions that are fundamental to simulators. We present preliminary results that show how our PLA and DSL synergistically produce a more flexible way of implementing state-machine-based simulators than is possible with a pure Java implementation.

mixin layers

Y. Smaragdakis and D. Batory. Scoping Constructs for Program Generators ,Generative and Component-Based Software Engineering (GCSE now GPCE), Erfurt, Germany, September 1999.

A well-known problem in program generation is scoping. When identifiers (i.e., symbolic names) are used to refer to variables, types, or functions, program generators must ensure that generated identifiers are bound to their intended declarations. This is the standard scoping issue in programming languages, only automatically generated programs can quickly become too complex and maintaining bindings manually is hard. In this paper we present generation scoping: a language mechanism to facilitate the handling of scoping concerns. Generation scoping offers control over identifier scoping beyond the scoping mechanism of the target programming language (i.e., the language in which the generator output is expressed). Generation scoping was originally implemented as an extension of the code template operators in the Intentional Programming platform, under development by Microsoft Research. Subsequently, generation scoping has also been integrated in the JTS language extensibility tools. The capabilities of generation scoping were invaluable in the implementation of two actual software generators: DiSTiL (implemented using the Intentional Programming system), and P3 (implemented using JTS).

P3

Y. Smaragdakis and D. Batory. Application Generators, based on an article to appear in the Software Engineering volume of the Encyclopedia of Electrical and Electronics Engineering, (John Wiley and Sons).

D. Batory and Y. Smaragdakis. Building Product-Lines with Mixin-Layers ,ECOOP 99 Workshop on Product-Line Architectures.

A mixin-layer is a building block for assembling applications of a product-line. We explain mixin-layers, their relationship to collaboration-based designs, layered designs, and GenVoca. We also summarize some of the product-lines that we have built using mixin-layers.

mixin layers

D. Batory, G. Chen, E. Robertson, and T. Wang. Design Wizards and Visual Programming Environments for GenVoca Generators, IEEE Transactions on Software Engineering (IEEE TSE), May 2000, 441-452. This is a journal version of our 1998 ICSR paper.

Domain-specific generators will increasingly rely on graphical languages for declarative specifications of target applications. Such languages will provide front-ends to generators and related tools to produce customized code on demand. Critical to the success of this approach will be domain-specific design wizards, tools that guide users in their selection of components for constructing particular applications. In this paper, we present the P3 ContainerStore graphical language, its generator, and design wizard.

P3, recommended optimization, wizards, award

L.A. Tokuda. Evolving Object-Oriented Designs with Refactorings . Ph.D. dissertation, September 1999, The Department of Computer Sciences, The University of Texas at Austin.

Refactorings are behavior-preserving program transformations that automate design evolution in object-oriented applications. Three kinds of design evolution are: schema transformations, the introduction of design pattern micro-architectures, and the hot-spot-driven-approach. This research shows that all three are automatable with refactorings. A comprehensive list of refactorings for design evolution is provided and an analysis of supported schema transformations, design patterns, and hot-spot meta patterns is presented. Further, we evaluate whether refactoring technology can be transferred to the mainstream by restructuring non-trivial C++ applications. The applications that we examine were evolved manually by software engineers. We show that an equivalent evolution could be reproduced significantly faster and cheaper by applying a handful of general-purpose refactorings. In one application, over 14K lines of code were transformed automatically that otherwise would have been coded by hand. Our experiments identify benefits, limitations, and topics of further research related to the transfer of refactoring technology to a production environment.

student, refactoring, analysis

Y. Smaragdakis. Implementing Large-Scale Object-Oriented Components . Ph.D. dissertation, September 1999, The Department of Computer Sciences, The University of Texas at Austin.

The complexity of software has driven both researchers and practitioners toward design methodologies that decompose problems into intellectually manageable pieces and that assemble partial products into complete software artifacts. Modularity in design, however, rarely translates into modularity at the implementation level. Hence, an important problem is to provide implementation (i.e., programming language) support for expressing modular designs concisely. This dissertation shows that software can be conveniently modularized using large-scale object-oriented software components. Such large-scale components encapsulate multiple classes but can themselves be viewed as classes, as they support the object-oriented mechanisms of encapsulation and inheritance. This conceptual approach has several concrete applications. First, existing language mechanisms, like a pattern of inheritance, class nesting, and parameterization, can be used to simulate large-scale components called mixin layers. Mixin layers are ideal for implementing certain forms of object-oriented designs and result in simpler and more concise implementations than those possible with previous methodologies. Second, we propose new language constructs to provide better support for component-based programming. These constructs express components cleanly (i.e., without unwanted interactions with other language mechanisms) and address the issue of component type-checking.

mixin layers, student

L. Tokuda and D. Batory. Evolving Object-Oriented Architectures with Refactorings Automated Software Engineering (ASE), October 1999.

This research evaluates whether refactoring technology can make a successful transition to the mainstream by restructuring non-trivial evolving C++ applications. Results reveal that refactorings can safely automate significant architectural changes. In one example, 14K lines of code are transformed.

refactoring

D. Batory, Y. Smaragdakis, and Lou Coglianese. Architectural Styles as Adaptors Software Architecture, Kluwer Academic Publishers, Patrick Donohoe, ed., 1999.

The essence of architectural styles is component communication. In this paper, we try to relate architectural styles to adaptors in the GenVoca model of software construction. GenVoca components are refinements that can have a wide range of implementations, from binaries to rule-sets of program transformation systems. We suggest that abstracting adaptors to refinements allows for program transformation implementations of adaptors that can express complex architectural styles that could not be expressed by other means. Examples from avionics are given.

Y. Smaragdakis and D. Batory Implementing Layered Designs with Mixin Layers European Conference on Object-Oriented Programming, (ECOOP), July 1998.

Mixin layers are a technique for implementing layered object-oriented designs (e.g., collaboration-based designs). Mixin layers are similar to abstract subclasses (mixin classes) but scaled to a multiple-class granularity. We describe mixin layers from a programming language viewpoint, discuss checking the consistency of a mixin layer composition, and analyze the language support issues involved.

L. Tokuda and D. Batory.Automating Three Modes of Evolution for Object-Oriented Software Architectures 5th Conference on Object-Oriented Technologies (COOTS), May 1999.

Refactorings are behavior preserving program transformations that can be applied to compilable source code to automate design level changes. This paper demonstrates that three common forms of object-oriented architectural evolution can be automated with refactorings: schema transformations, design pattern microarchitectures, and hot-spot meta patterns.

D. Batory Product-Line Architectures . Keynote at Smalltalk und Java in Industrie and Ausbildung, Erfurt, Germany, October 1998.

Today's software design methodologies are aimed at one-of-a-kind applications, designs are expressed in terms of objects and classes, and software must be coded manually. We argue that future software development will be very different and will center around product-line architectures (i.e., designs for families of related applications), refinements (a generalization of today's components), and software plug-and-play (a codeless form of programming).

keynote

D. Batory, Bernie Lofaso, and Y. Smaragdakis.JTS: Tools for Implementing Domain-Specific Languages.5th International Conference on Software Reuse (ICSR), Victoria, Canada, June 1998.

The Jakarta Tool Suite (JTS) aims to reduce substantially the cost of generator development by providing domain-independent tools for creating domain-specific languages and component-based generators called GenVoca generators. JTS is a set of precompiler-compiler tools for extending industrial programming languages (e.g., Java) with domain-specific constructs. JTS is itself a GenVoca generator, where precompilers for JTS-extended languages are constructed from components.

mixin layers

Y. Smaragdakis and D. Batory. Implementing Reusable Object-Oriented Components. 5th International Conference on Software Reuse (ICSR), Victoria, Canada, June 1998.

Object-oriented (OO) classes are generally not reusable because they are not meaningful in isolation; most classes only have meaning as members of cooperating suites of classes (e.g., design patterns). These suites usually arise in designs, but rarely exist as encapsulated entities in OO implementations. In this paper we present a method for directly mapping cooperating suites of classes into encapsulated C++ implementations. Our method is an improvement over the VanHilst and Notkin approach for implementing collaboration-based designs and constitutes a step towards more reusable (object-oriented) components.

(If you are thinking of using the C++ technique described in this paper, read some implementation advice: Y. Smaragdakis, >Practical Advice on Using Mixin Layers (and Mixins) in C++

mixin layers

D. Batory, G. Chen, E. Robertson, and T. Wang. Web-Advertised Generators and Design Wizards. 5th International Conference on Software Reuse (ICSR), Victoria, Canada, June 1998.

Domain-specific generators will increasingly rely on Web-based applets for declarative specifications of target applications. Applets will communicate with generators via servers to produce customized code on demand. Critical to the success of this approach will be domain-specific design wizards, tools that guide users in their selection of components for constructing particular applications. In this paper, we present the P3 ContainerStore applet, its generator, and design wizard.

P3 optimization, wizards

Y. Smaragdakis and D. Batory. DiSTiL: A Transformation Library for Data Structures. USENIX Conference on Domain-Specific Languages, October 1997.

DiSTiL is a representative of a new approach to domain-specific language implementation. Instead of being the usual one-of-a-kind stand-alone compiler, DiSTiL is an extension library for the Intentional Programming (IP) transformation system (currently under development by Microsoft Research). DiSTiL relies on several reusable, general purpose infrastructure tools offered by IP that substantially simplify DSL implementation.

P3

D. Batory, D. Miranker, and D. Brant. Jakarta: A Tool Suite for Constructing Software Generators.

Jakarta is a set of compiler tools to build extensible languages and to build compilers through component composition. Jakarta is being written in a Jakarta-produced superset of the Java language. We describe briefly how Jakarta will be used to build tools to automate OO design patterns, and the design wizard tool, which captures domain-specific knowledge of software architectures using domain-specific algebras.

G. Jimenez-Perez and D. Batory. Memory Simulators and Software Generators, 1997 Symposium on Software Reuse (SSR), 136-145.

We present results on re-engineering a highly tuned and hand-coded memory simulator using the P2 container data structure generator. This application was chosen because synthesizing the simulator's data structures would not exploit P2's primary advantage of automatically applying sophisticated code optimization techniques. Thus we initially believed that using P2 would be an overkill and that P2's generated code would provide no performance advantages over hand coding. On the contrary, we found that P2 produced more efficient code and that it offered significant advantages to software development that we had not previously realized.

P3

D. Batory and B. J. Geraci. Composition Validation and Subjectivity in GenVoca Generators, IEEE Transactions on Software Engineering (IEEE TSE) (special issue on Software Reuse), February 1997, 67-82. This is a journal version of papers Validating Component Compositions in Software System Generators and Subjectivity and GenVoca Generators.

GenVoca generators synthesize software systems by composing components from reuse libraries. GenVoca components are designed to export and import standardized interfaces, and thus be plug-compatible, interchangeable, and interoperable with other components. In this paper, we examine two different but important issues in software system synthesis. First, not all syntactically correct compositions of components are semantically correct. We present simple, efficient, and domain-independent algorithms for validating compositions of GenVoca components. Second, components that export and import immutable interfaces are too restrictive for software system synthesis. We show that the interfaces and bodies of GenVoca components are subjective, i.e., they mutate and enlarge upon instantiation. This mutability enables software systems with customized interfaces to be composed from components with "standardized" interfaces.

constraints, safe composition, award, design rules, DRC, recommended design rule checking

D. Batory Intelligent Components and Software Generators . Keynote at Software Quality Institute Symposium on Software Reliability, Austin, Texas, April 1997

The production of well-understood software will eventually be the responsibility of software generators. Generators will enable high-performance, customized software systems and subsystems to be assembled quickly and cheaply from component libraries. These components will be intelligent: they will encapsulate domain-specific knowledge (e.g., 'best practice' approaches) so that their instances will automatically customize and optimize themselves to the system in which they are being used. In this paper, we explore the topics intelligent components and software generation as they pertain to the issues of software productivity, performance, reliability, and quality.

keynote

E.E. Villarreal and D. Batory. Rosetta: A Generator of Data Language Compilers, 1997 Symposium on Software Reuse (SSR), 146-156.

A data language is a declarative language that enables database users to access and manipulate data. There are families of related data languages; each family member is targeted for a particular application. Unfortunately, building compilers for such languages is largely an ad hoc process; there are no tools and design methods that allow programmers to leverage the design and code of compilers for similar languages, or to simplify the evolution of existing languages to include more features. Rosetta is a generator of relational data language compilers that demonstrates practical solutions to these problems. We explain how domain analysis identifies primitive building blocks of these compilers, and how grammar-based definitions (e.g. GenVoca) of the legal compositions of these blocks yields compact and easily-evolvable specifications of data languages. Rosetta automatically transforms such specifications into compilers. Experiences with Rosetta are discussed.

V.P. Singhal. A Programming Language for Writing Domain-Specific Software System Generators . Ph.D. Dissertation. Department of Computer Sciences, University of Texas at Austin, September 1996.

Automating routine programming tasks is an effective way to increase the productivity of software development. Software system generators have the potential to achieve this goal: customized software systems can be quickly and easily assembled from component libraries. Our research demonstrates that for generators to be successful, component libraries must be scalable. Scalability enables libraries to be small, because the components of the library implement distinct and largely orthogonal features. These components can be combined to yield an enormous family of software systems and subsystems. Generators thus become tools for combining components to manufacture these systems and subsystems. In GenVoca, the programming model that forms the foundation of our research, components act as large-scale refinements which simultaneously transform multiple classes from one abstraction to another. Because GenVoca advocates a novel style of program organization, there is little language or tool support for this paradigm. We have developed a programming language called P++, which extends C++ with specialized constructs to support the GenVoca model. It permits components to be treated as transformations which can simultaneously refine several classes in a consistent manner. To validate the utility of this language, we solved a challenge problem in software reuse: we reimplemented the Booch C++ Components data structures library as a scalable P++ library. We were able to reduce the volume of code and number of components by approximately a factor of four, without compromising the performance of generated systems.

student

Dinesh Das and D. Batory. Synthesizing Rule Sets for Query Optimizers from Components, Technical Report TR-96-05, Department of Computer Sciences, University of Texas at Austin, April 1996.

Query optimizers are complex subsystems of database management systems. Modifying query optimizers to admit new algorithms or storage structures is quite difficult, but partly alleviated by extensible approaches to optimizer construction. Rule-based optimizers are a step in that direction, but from our experience, the rule sets of such optimizers are rather monolithic and brittle. Conceptually minor changes often require wholesale modifications to a rule set. Consequently, much can be done to improve the extensibility of rule-based optimizers. As a remedy, we present a tool called Prairie that is based on an algebra of layered optimizers. This algebra naturally leads to a building-blocks approach to rule-set construction. Defining customized rule sets and evolving previously defined rule sets is accomplished by composing building-blocks. We explain an implementation of Prairie and present experimental results that show how classical relational optimizers can be synthesized from building-blocks, and that the efficiency of query optimization is not sacrificed.

D. Batory. Software System Generators, Transformation Systems,and Compilers. Working Paper, October 1995.

GenVoca generators assemble customized, high-performance software systems automatically from components. In this paper, we explain how GenVoca generators are actually compilers for domain-specific module interconnection languages and that the underlying compilation technology is a special class of transformation systems.

D. Batory. Software Component Technologies and Space Applications. International Conference on Integrated Micro-Nano Technology for Space Applications, November 1995.

In the near future, software systems will be as reconfigurable than hardware. This will be possible through the advent of software component technologies, which have been prototyped in universities and research labs. In this paper, we outline the foundations for these technologies and suggest how they might impact software for space applications.

L. Tokuda. Program Transformations for Evolving Software Architectures. OOPSLA'95 position paper for workshop on Adaptable and Adaptive Software, 1995.

Software evolution is often driven by the need to extend existing software. "Design patterns" express preferred ways to extend object-oriented software and provide desirable target states for software designs. This paper demonstrates that some design patterns can be expressed as a series of parameterized program transformations applied to a plausible initial software state. A software tool is proposed that uses primitive transformations to allow users to evolve object-oriented applications by visually altering design diagrams.

D. Batory. Subjectivity and GenVoca Generators. International Conference on Software Reuse (ICSR) (Orlando), 1996. See IEEE TSE journal version. Expanded Technical Report TR-95-32, Department of Computer Sciences, University of Texas at Austin, June 1995.

The tenet of subjectivity is that no single interface can adequately describe any object; interfaces to the same object will vary among different applications. Thus, objects with standardized interfaces seem too brittle a concept to meet the demands of a wide variety of applications. Yet, objects with standardized interfaces is a central idea in domain modeling and software generation. Standard interfaces make objects plug-compatible and interchangeable, and it is this feature that is exploited by generators to synthesize high-performance, domain-specific software systems. Interestingly, generated systems have customized interfaces that can be quite different from the interfaces of their constituent objects.
In this paper, we reconcile this apparent contradiction by showing that the objects (components) in the GenVoca model of software generation are not typical software modules; their interfaces and bodies mutate upon instantiation to a "standard" that is application-dependent.

D. Batory. Issues in Domain Modeling and Software System Generation. OOPSLA'95 position paper for Panel on Objects and Domain Engineering, 1995.

D. Batory and J. Thomas. P2: A Lightweight DBMS Generator. Journal of Intelligent Information Systems. Also, Technical Report TR-95-26, Department of Computer Sciences, University of Texas at Austin, June 1995.

A lightweight database system (LWDB) is a high-performance, application-specific DBMS. It differs from a general-purpose (heavyweight) DBMS in that it omits one or more features and specializes the implementation of its features to maximize performance. Although heavyweight monolithic and extensible DBMSs might be able to emulate LWDB capabilities, they cannot match LWDB performance. In this paper, we describe P2, a generator of lightweight DBMSs, and explain how it was used to reengineer a hand-coded, highly-tuned LWDB used in a production system compiler (LEAPS). We present results that show P2-generated LWDBs reduced the development time and code size of LEAPS by a factor of three and that the generated LWDBs executed substantially faster than versions built by hand or using an extensible heavy weight DBMS.

P3

D. Das. Making Database Optimizers More Extensible . Ph.D. Dissertation. Department of Computer Sciences, University of Texas at Austin, May 1995.

Query optimizers are fundamental components of database management systems (DBMSs). An optimizer consists of three features: a search space, a cost model, and a search strategy. The experience of many researchers has shown that hard-wiring these features results in an optimizer that is very inflexible and difficult to modify. Rule-based optimizers have been developed to alleviate some of the problems of monolithic optimizers. Unfortunately, contemporary rule-based optimizers do not provide enough support to enable database implementers (DBI) to fully realize the potential of open systems. We have identified four requirements that a rule-based optimizer should satisfy to address these needs. First, rules should be specified using high-level abstractions, insulating the DBI from underlying implementation details. Second, rule sets should be easily extensible, with a minimum of reprogramming required. Third, rule sets should be easily reconfigurable, that is, changeable to meet a variety of user needs, interfaces, database schemes, etc. Fourth, rule-based optimizers should be fast, that is, performance should not be sacrificed for the sake of high-level specifications. In this dissertation, we describe Prairie, an environment for specifying rules for rule-based optimizers that satisfies all four of the above requirements. The Prairie specification language is presented and we show how it allows a DBI to design an easily extensible rule set for a rule-based optimizer. Experimental results are presented using the Texas Instruments Open ODD optimizer rule set to validate the claim of good performance using Prairie. Finally, a building blocks approach of constructing rule sets is presented; this results in easily reconfigurable rule sets whose features are changeable simply by assembling the blocks in various ways.

student

D. Batory, Lou Coglianese, M. Goodwin, and S. Shafer. Creating Reference Architectures: An Example from Avionics. Symposium on Software Reusability (SSR), Seattle Washington, April 1995.

ADAGE is a project to define and build a domain-specific software architecture (DSSA) environment for assisting the development of avionics software. A central concept of DSSA is the use of software system generators to implement component-based models of software synthesis in the target domain. In this paper, we present the ADAGE component-based model (or reference architecture) for avionics software synthesis. We explain the modeling procedures used, review our initial goals, and examine what we were (and were not) able to accomplish. The contributions of our paper are the lessons that we learned; they may be beneficial to others in future modeling efforts.

award

D. Batory, Lou Coglianese, S. Shafer, and W. Tracz. The ADAGE Avionics Reference Architecture. AIAA Computing in Aerospace-10 Conference, San Antonio, March 1995.

ADAGE is a project to define and build a domain-specific software architecture (DSSA) environment for avionics. A central concept of ADAGE is the use of generators to implement scalable, component-based models of avionics software. In this paper, we review the ADAGE model (or reference architecture) of avionics software and describe techniques for avionics software synthesis.

Dinesh Das and D. Batory. Prairie: A Rule Specification Framework for Query Optimizers. International Conference on Data Engineering (Taipei), March 1995.

From our experience, current rule-based query optimizers do not provide a very intuitive and well-defined framework to define rules and actions. To remedy this situation, we propose an extensible and structured algebraic framework called Prairie for specifying rules. Prairie facilitates rule-writing by enabling a user to write rules and actions more quickly, correctly and in an easy-to-understand and easy-to-debug manner.
Query optimizers consist of three major parts: a search space, a cost model and a search strategy. The approach we take is only to develop the algebra which defines the search space and the cost model and use the Volcano optimizer-generator as our search engine. Using Prairie as a front-end, we translate Prairie rules to Volcano to validate our claim that Prairie makes it easier to write rules.
We describe our algebra and present experimental results which show that using a high-level framework like Prairie to design large-scale optimizers does not sacrifice efficiency.

D. Batory, D. McAllester, L. Coglianese, and W. Tracz. Domain Modeling in Engineering of Computer-Based Systems. 1995 International Symposium and Workshop on Systems Engineering of Computer Based Systems, Tucson, Arizona, February 1995.

Domain modeling is believed to be a key factor in developing an economical and scalable means for constructing families of related software systems. In this paper, we review the current state of domain modeling, and present some of our work on the ADAGE project, an integrated environment that relies heavily on domain models for generating real-time avionics applications. Specifically, we explain how we detect errors in the design of avionics systems that are expressed in terms of compositions of components. We also offer insights on how domain modeling can benefit the engineering of computer-based systems in other domains.

L. Tokuda and D. Batory. Automated Software Evolution via Design Pattern Transformations. 3rd International Symposium on Applied Corporate Computing, Monterrey, Mexico, October 1995. Also TR-95-06, Department of Computer Sciences, University of Texas at Austin, February 1995.

Software evolution is often driven by the need to extend existing software. "Design patterns" express preferred ways to extend object-oriented software and provide desirable target states for software designs. This paper demonstrates that some design patterns can be expressed as a series of parameterized program transformations applied to a plausible initial software state. A software tool is proposed that uses primitive transformations to allow users to evolve object-oriented applications by visually altering design diagrams.

refactoring

J. Thomas and D. Batory. P2: An extensible Lightweight DBMS. Technical Report TR-95-04, Department of Computer Sciences, University of Texas at Austin, February 1995.

A lightweight database system (LWDB) is a high-performance, application-specific DBMS. It differs from a general-purpose (heavyweight) DBMS in that it omits one or more features and specializes the implementation of its features to maximize performance. Although heavyweight monolithic and extensible DBMSs might be able to emulate LWDB capabilities, they cannot match LWDB performance.
In this paper, we explore LWDB applications, systems, and implementation techniques. We describe P2, an extensible lightweight DBMS, and explain how it was used to reengineer a hand-coded, highly-tuned LWDB used in a production system compiler (LEAPS). We present results that show P2-generated LWDBs for LEAPS executes substantially faster than versions built by hand or that use an extensible heavyweight DBMS.

P3

D. Batory and B.J. Geraci. Validating Component Compositions in Software System Generators, International Conference on Software Reuse (ICSR) (Orlando), 1996. See IEEE TSE journal version. Also, Expanded Technical Report TR-95-03, Department of Computer Sciences, University of Texas at Austin, June 1995.

Generators synthesize software systems by composing components from reuse libraries. In general, not all syntactically correct compositions are semantically correct. In this paper, we present domain-independent algorithms for the GenVoca model of software generators to validate component compositions. Our work relies on attribute grammars and offers powerful debugging capabilities with explanation-based error reporting. We illustrate our approach by showing how compositions are debugged by a GenVoca generator for container data structures.

constraints, safe composition, DRC, design rules, design rule checking

E.E. Villarreal. Automated Compiler Generation for Extensible Data Languages . Ph.D. Dissertation. Department of Computer Sciences, University of Texas at Austin, December 1994.

To meet the changing needs of the DBMS community, e.g., to support new database applications such as geographic or temporal databases, new data languages are frequently proposed. Most offer extensions to previously defined languages such as SQL or Quel. Few are ever implemented. The maturity of the area of data languages demands that researchers go beyond the proposal stage to have hands-on experience with their languages, if only to separate the good ideas from the bad. Tools and methodologies for building families of similar languages are definitely needed; we solve this problem by automating the generation of compilers for data languages. Our work, Rosetta, is based on two concepts. First, underlying the domain of data languages is a common backplane of relational operations. Backplane operations are primitive building blocks for language execution and construction, where a building block has standardized semantics. The definition of a well-designed backplane is implementation-independent; that is, the backplane is defined once but can be used to model arbitrarily many data languages. Second, there exist primitive building-blocks for language construction. From our analysis of the database data language domain, we have identified three classes of building-blocks: one class maps language syntax to backplane functions, another builds an internal representation of the backplane operator tree, and a third class manages contextual information. For modeling data languages, we define the Rosetta specification language, a grammar-based specification language tailored to our needs with the power to define syntax, map it to the target language, and build an operator tree all in one rule. Thus each rule is a microcosmic model of a language clause which encapsulates input parsing and code generation. Our specification language models data languages based on the composition of primitive building blocks for semantics and the customization of the syntax for invoking the compositions. A compiler for a data language is generated by first modeling the language and then compiling the specification. The ease and efficiency with which Rosetta customizes languages derives from the reuse of the backplane operations and the high-level specification supported.

student

D. Batory, J. Thomas, and M. Sirkin. Reengineering a Complex Application Using a Scalable Data Structure Compiler. ACM SIGSOFT FSE 1994 (New Orleans), December 1994.

P2 is a scalable compiler for collection data structures. High-level abstractions insulate P2 users from data structure implementation details. By specifying a target data structure as a composition of components from a reuse library, the P2 compiler replaces abstract operations with their concrete implementations.
LEAPS is a production system compiler that produces the fastest sequential executables of OPS5 rule sets. LEAPS is a hand-written, highly-tuned, performance-driven application that relies on complex data structures. Reengineering LEAPS using P2 was an acid test to evaluate P2's scalability, productivity benefits, and generated code performance.
In this paper, we present some of our experimental results and experiences in this reengineering exercise. We show that P2 scaled to this complex application, substantially increased productivity, and provided unexpected performance gains.

P3

D. Batory. The LEAPS Algorithms. Technical Report TR-94-28, Department of Computer Sciences, University of Texas at Austin, November 1994.

LEAPS is a state of the art production system compiler that produces the fastest sequential executable OPS5 rule sets. The performance of LEAPS is due to its reliance on complex data structures and search algorithms to speed rule processing. In this paper, we explain the LEAPS algorithms in terms of the programming abstractions of the P2 data structure compiler.

D. Batory, B. Geraci, and J. Thomas. Introductory P2 System Manual. Technical Report TR-94-26, Department of Computer Sciences, University of Texas at Austin, November 1994.

P2 is a prototype container data structure precompiler. It is a superset of the C language offering container and cursor abstractions as part of the linguistic extensions to C. P2 is based on the GenVoca model of software system generators. This document is the users manual for programming in the P2 language.

P3

D. Batory, B. Geraci, and J. Thomas. Advanced P2 System Manual. Technical Report TR-94-27, Department of Computer Sciences, University of Texas at Austin, November 1994.

This manual documents how layers are written in P2. There is a special language, XP, which was designed specifically for defining P2 building blocks (i.e., primitive data structure layers).

P3

D. Batory, V. Singhal, J. Thomas, S. Dasari, B. Geraci, and M. Sirkin. The GenVoca Model of Software-System Generators. IEEE Software, September 1994.

An emerging breed of generators synthesize complex software systems from libraries of reusable components. These generators, called GenVoca generators, produce high-performance software and offer substantial increases in productivity.

P3

M.J. Sirkin. A Software System Generator for Data Structures . Ph.D. Dissertation. Department of Computer Science and Engineering, University of Washington, March 1994.

Although data structures are a fundamental part of most applications, using and writing data structures is time-consuming, difficult, and error-prone. Programmers often select inappropriate data structures for their applications because they do not know which data structure to use, they do not know how to implement a particular data structure, or they do not have an existing implementation of the data structure to use. This dissertation describes a model and a technology for overcoming these problems. Our approach is based on non-traditional parameterized types (NPTs). NPTs are an extension to traditional parameterized types (TPTs), which are already familiar to most programmers. Our NPTs are based on the GenVoca domain modeling concepts, vertical parameterization, a consistent high-level interface, and a transformational compiler. Our research has led to the construction of a software system generator for data structures called Predator. Predator is able to transform data structure declarations and data structure-independent functions into efficient code. Predator also allows programmers to adjust a data structure's implementation by simply changing its declaration and recompiling. This dissertation discusses our model (and how it differs from standard models), our Predator compiler, and the results of our validation efforts.

student P3

D. Batory. Products of Domain Models. ARPA Domain Modeling Workshop, George Mason University, September 1994.

We argue that domain models should produce four basic products: identification of reusable software components, definition of software architectures that explain how components can be composed, a demonstration of architecture scalability, and a direct relationship of these results to software generation of target systems.

D. Batory, V. Singhal, J. Thomas, and M. Sirkin. Scalable Software Libraries. ACM SIGSOFT FSE 1993 (Los Angeles), December 1993.

Many software libraries (e.g., the Booch C++ Components, libg++, NIHCL, COOL) provide components (classes) that implement data structures. Each component is written by hand and represents a unique combination of features (e.g. concurrency, data structure, memory allocation algorithms) that distinguishes it from other components. We argue that this way of building data structure component libraries is inherently unscalable. Libraries should not enumerate complex components with numerous features; rather, libraries should take a minimalist approach: they should provide only primitive building blocks and be accompanied by generators that can combine these blocks to yield complex custom data structures. In this paper, we describe a prototype data structure generator and the building blocks that populate its library. We also present preliminary experimental results which suggest that this approach does not compromise programmer productivity nor the run-time performance of generated data structures.

P3

V. Singhal and D. Batory. P++: A Language for Software System Generators. Technical Report TR-93-16, Department of Computer Sciences, University of Texas at Austin, November 1993.

P++ is a programming language that supports the GenVoca model, a particular style of software design that is intended for building software system generators. P++ is an enhanced version of C++: it offers linguistic extensions for component encapsulation, abstraction, parameterization, and inheritance, where a component is a suite of interrelated classes and functions. This paper describes the motivations for P++, the ideas which underlie its design, the syntax and features of the language, and related areas of research.

J. Thomas, D. Batory, V. Singhal, and M. Sirkin. A Scalable Approach to Software Libraries. 6th Annual Workshop on Software Reuse (Owego, New York), November 1993.

Software libraries offer a convenient and accessible means to achieve the benefits of reuse. The components of these libraries are written by hand, and each represents a unique combination of features that distinguishes it from other components. Unfortunately, as the number of features grows, the size of these libraries grows exponentially, making them unscalable.
Predator is a research project to develop abstractions and tools to provide the benefits of software libraries without incurring the scalability disadvantages just mentioned. Our approach relies on a careful analysis of an application domain to arrive at appropriate high-level abstractions, standardized (i.e., plug-compatible) interfaces, and layered decompositions. Predator defines language extensions for implementing components, and compilers to automatically convert component compositions into efficient programs.

P3

V. Singhal and D. Batory. P++: A Language for Large-Scale Reusable Software Components. 6th Annual Workshop on Software Reuse (Owego, New York), November 1993.

P++ is a programming language that supports the GenVoca model, a particular style of software design that is intended for building software system generators. P++ is an enhanced version of C++: it offers linguistic extensions for component encapsulation, abstraction, parameterization, and inheritance, where a component is a subsystem, i.e., a suite of interrelated classes and functions.

award

M. Sirkin. Predator: A Data Structure Compiler. A manual describing the features and syntax of P1, a prototype data structure compiler (unpublished).

M. Sirkin, D. Batory, and V. Singhal. Software Components in a Data Structure Precompiler. International Conference on Software Engineering (ICSE) (Baltimore, MD), May 1993, pages 437-446.

PREDATOR is a data structure precompiler. that generates efficient code for maintaining and querying complex data structures. It embodies a novel component reuse technology that transcends traditional generic data types. In this paper, we explain the concepts of our work and our prototype system. We show how complex data structures can be specified as compositions of software building blocks, and present performance results that compare PREDATOR output to hand-optimized programs

D. Batory, V. Singhal, and J. Thomas. Database Challenge: Single Schema Database Management Systems. Technical Report TR-92-47, Department of Computer Sciences, University of Texas at Austin, December 1992.

Many data-intensive applications require high-performance data management facilities, but utilize only a small fraction of the power of a general-purpose database system (DBMS). We believe single schema database systems (SSTs), i.e., special-purpose DBMSs that are designed for a single schema and a predeclared set of database operations, are a vital need of today's software industry. The challenge is to create a technology for economically building high-performance SSTs. SST research will combine results from object-oriented databases, persistent object stores, module interconnection languages, rule-based optimizers, open-architecture systems, extensible databases, and generic data types.

P3

D. Batory and S. O'Malley. The Design and Implementation of Hierarchical Software Systems with Reusable Components. ACM Transactions on Software Engineering and Methodology (TOSEM), 1(4):355-398, October 1992.

We present a domain-independent model of hierarchical software system design and construction that is based on interchangeable software components and large-scale reuse. The model unifies the conceptualizations of two independent projects, Genesis and Avoca, that are successful examples of software component/building-block technologies and domain modeling. Building-block technologies exploit large-scale reuse, rely on open architecture software, and elevate the granularity of programming to the subsystem level. Domain modeling formalizes the similarities and differences among systems of a domain. We believe our model is a blue-print for achieving software component technologies in many domains.

GenVoca

D. Batory, V. Singhal, and M. Sirkin. Implementing a Domain Model for Data Structures. International Journal of Software Engineering and Knowledge Engineering, 2(3):375-402, September 1992.

We present a model of the data structure domain that is expressed in terms of the GenVoca domain modeling concepts. We show how familiar data structures can be encapsulated as realms of plug-compatible, symmetric, and reusable components, and we show how complex data structures can be formed from their composition. The target application of our research is a precompiler for specifying and generating customized data structures.

D.S. Batory. The Genesis Papers: Volume 1 . UTCS Technical Report, June 1991.

This is a collection of major publications on the Genesis extensible DBMS project. There are six papers overall. Three explain the Genesis open-system architecture concept, two papers discuss its implementation, and the last paper is the Genesis Unix User's Manual.

feature modules, layered designs, GenVoca, DRC, design rules, design rule checking