Executive Summary


As the dramatic changes of the information age take hold, two fundamental CISE challenges have captured the interest of the Computer Science faculty and many of our colleagues throughout the University. First, we see computer simulation and modeling becoming established as the third pillar of science and engineering, complementing the traditional activities of theory and experimentation, and expanding to encompass information processing activities, such as database indexing and financial modeling. These activities demand not only tremendous computational and I/O capacity that is easily accessible, but also new methods of interacting with data, with ongoing simulations, and with research colleagues. Second, the increasing level of computer integration requires that design principles be developed for large-scale "systems of systems", where the individual components are complete systems with complex behaviors that must operate together in a coherent fashion. Soon, large organizations will be dealing with systems on the scale of millions of processors. As the scale of the system increases, as the individual components become more complex, and as the range of application demands broadens, the system can no longer be viewed as a set of rigid, closely interlocking components like a mechanical system; nor can it be decomposed into simple client-server hierarchies. Such systems function more like an economy, where many complex components take local actions that influence one another and implicitly shape the behavior of the system as a whole. In addition, the behavior of these systems is strongly influenced by how they are used, so research on design principles cannot be conducted in isolation from its application and user context.

The faculty of the Electrical Engineering and Computer Sciences Department, along with colleagues in other departments, propose to conduct a major experiment in the design, implementation, and application of a novel multi-level system of systems for advanced, interactive computer simulation and modeling.

This proposal is the capstone of three long-term efforts centered on the expanding role of computing within the research and education enterprise of the University. First, in recognition of the increasing importance of computer modeling and simulation in science and engineering, a set of graduate programs in computational science was formed within various disciplines, sharing a core portion of the curriculum. The process of forming these programs forged a community of faculty with common interests, a shared vision, and a collection of exciting investigations that are currently out of reach. Second, in response to the broadening role of information management and processing, a new school was formed, the School of Information Management and Systems (SIMS), with joint faculty in Computer Science, Business, and Law. Third, members of the Computer Science Division have investigated deeply the problems of system, language, and programming support for large-scale modeling and information management applications and have developed core technology to solve these problems in the context of emerging networks and state-of-the-art commercial operating systems. Based largely on these three efforts, UCB recently received a donation of six million dollars of computing equipment from the Intel Corporation and a major commitment of software support from Microsoft and Sun Microsystems to develop novel applications of high performance computing in research and education. The program we have assembled to advance the state of computational science and engineering in this way is called Millennium.

Building upon these three long-term efforts, we propose to transform the Intel donation into an experimental infrastructure for examining a set of deep questions that cut across traditional Computer Science, Information Management, and Economics in support of the computational modeling activity. The hardware organization of the proposed system-of-systems consists of a federation of systems at five levels that mirror the organizational structure of the institution. The individual desktop and departmental symmetric multiprocessor (SMP) server levels are augmented first with a "local" high-performance cluster of SMPs, called a CLUMP, which utilizes extensions of the communications, systems, and programming technology developed in the Berkeley Network of Workstations (NOW) project and other cluster research efforts. We believe this kind of computing environment will become common in future scientific laboratories. These group CLUMPs are further augmented by a large "campus" CLUMP, which pushes the envelope of existing cluster technology and provides enough resources for breakthrough investigations. The entire collection of clusters will be interconnected across campus with high-bandwidth Gigabit Ethernet links to form a large cluster of clusters of SMPs, called an intercluster. This presents an exciting, real environment in which to investigate and test design principles for large-scale complex systems. In addition to driving systems research, the advance of computational science and engineering within this interdisciplinary community creates an opportunity to investigate novel means of visualizing and interacting with computational models, information, and other researchers. The intercluster provides a natural testbed for investigating these issues.
The desktops and servers have substantial graphics capability, and a few sites will be created within the complex that can provide intense immersive interaction environments or large active workspaces in which to assimilate information from many sources. These new technologies and new applications will drive development of new forms of user interfaces.

The key conceptual change offered by emerging networks is that resources in the system need no longer be viewed as tightly associated with the physical box in which they reside. Instead, resources can be integrated according to application needs. While resource sharing can be obtained with centralization, the user community has become increasingly empowered with local resources over more than a decade and strongly associates this autonomy with personal productivity. There is little desire to create a strong central authority that manages resources to maximize its own gain. Moreover, locality of use provides substantial gains in efficiency.

We see the deployment of a computational economy, both as a user model and as a fundamental system design principle, as a potential breakthrough of great significance. In essence, users, or their encompassing departments, own resources that they can utilize to their own benefit. Most of these groups have applications that can consume far more than their local resources; however, the use of these applications varies tremendously depending on whether the research is closing in on an important finding (or is up against an important deadline). Also, some groups possess unique resources that are valuable to others. Making the buying, selling, and using of resources over time an explicit feature of the system design provides a potentially elegant solution to a number of technical challenges that have plagued distributed systems. It offers a framework in which the efficiency of the system can be optimized relative to user demand, rather than relative to the system's charging algorithm. For example, it creates an incentive for decentrally administered systems to be kept compatible and establishes an exchange rate across different kinds of resources. It is inherently adaptive, providing not only feedback from resource availability to user demand, as does any charging algorithm, but also feedback between users, since the price is determined by current user demand. The algorithms are, potentially, very scalable and robust.
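
To make the feedback between users concrete, the following is a minimal sketch of one possible pricing rule, not the mechanism this proposal commits to. It assumes a single resource type (CPU-hours), sealed bids, and a uniform clearing price set by the lowest accepted bid; the names `Bid` and `allocate` and the credit units are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    user: str          # hypothetical group name; assumed unique per round
    cpu_hours: int     # quantity requested
    max_price: float   # most the group will pay, in credits per CPU-hour


def allocate(supply, bids):
    """Serve the highest bids first until the supply is exhausted.

    Returns (allocation, price), where price is the lowest accepted bid,
    so every winner pays the same rate (an illustrative assumption).
    """
    allocation = {}
    price = 0.0
    remaining = supply
    for bid in sorted(bids, key=lambda b: b.max_price, reverse=True):
        if remaining == 0:
            break
        granted = min(bid.cpu_hours, remaining)
        allocation[bid.user] = granted
        remaining -= granted
        price = bid.max_price  # the last accepted bid sets the price
    return allocation, price


if __name__ == "__main__":
    bids = [
        Bid("physics", 80, 3.0),
        Bid("chemistry", 50, 2.0),
        Bid("biology", 40, 1.0),
    ]
    # Scarce supply: competition among users pushes the price up.
    print(allocate(100, bids))
    # Ample supply: everyone is served and the price falls.
    print(allocate(200, bids))
```

In this sketch the price depends only on what other users are currently willing to pay, which is the sense in which a market provides feedback between users and not just from resource availability to demand. A group facing a deadline would raise its `max_price` and displace lower bidders.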

This approach has been studied extensively from a theoretical viewpoint (several of the experts in the field are represented within the projects); however, only limited experimental evidence exists. To our knowledge, no comparable experimental testbed for the study of large-scale systems-of-systems exists. This multi-level system of systems provides the raw experimental testbed for an array of research projects in computer systems, algorithms, languages, networking, information management, and user interfaces, as well as a tremendously powerful engine for applications, both within computer science and in the other disciplines. The wealth of resources and the capability of the pathways interconnecting them are essential for an active and efficient computational economy. The critical enabling technologies for such an aggressive experiment have come to fruition through work on clusters, global computing environments, platform-independent languages (e.g., Java), universal naming and name services, and high-bandwidth, scalable networks.

The usage model for the system-of-systems has four general classes of applications: routine computational needs, large-scale simulation and modeling, persistent information services, and on-line interaction with data and users.