Adaptive Construction of Hierarchy from Peer-to-Peer Network

by Yitao Duan and Ling Huang
duan@eecs.berkeley.edu
hlion@newton.berkeley.edu

In our initial proposal we tried to attack the service discovery problem in a dynamic network. In particular, we planned to improve or modify algorithms that discover the existence of all network nodes. As has been pointed out, this area is too broad and has already seen much research.

In rethinking our project plan, we realized that, along the same general direction as our initial proposal, many systems that provide or use some kind of service discovery (such as SDS, distributed web caching, or ad hoc sensor networks) either assume the existence of a hierarchical structure in the network or do without one. There is little literature showing how to construct such a hierarchy other than by manual configuration. We feel that having some form of structure in the network is beneficial in many ways. For example, nodes are asymmetric in terms of capabilities (CPU, memory, bandwidth, etc.), and it would be more efficient to have them perform asymmetric tasks. As another example, if a sensor network consisting of many small sensors and a few big sensors can self-organize into a hierarchy, it can be more efficient and robust than simple peer-to-peer cooperation.

Therefore we narrowed our focus to finding a reasonably good algorithm/protocol for constructing a network hierarchy from a flat peer-to-peer node mesh, to support efficient routing and data queries based on the notion of a "supernode". A supernode is just a node that is powerful in terms of capabilities (CPU, bandwidth, etc.) and willing to be a query server. All the peer-to-peer clients can be partitioned into disjoint "local network groups". Each group is served by a nearby, strategically selected supernode. The supernodes collaboratively provide the entire query service. Supernodes can also form higher levels of hierarchy if desired.
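
To make the idea of disjoint local network groups concrete, here is a minimal sketch in Python. It is not the protocol we intend to design. It assumes the supernodes have already been chosen, that "nearby" means fewest hops, and that knowledge links can be treated as undirected. The function names assign_to_supernodes and groups are hypothetical and used only for illustration.

```python
from collections import deque


def assign_to_supernodes(neighbors, supernodes):
    """Map every reachable node to its closest supernode (multi-source BFS).

    neighbors  -- dict: node -> iterable of nodes it initially knows about.
                  ASSUMPTION: links are treated as undirected here.
    supernodes -- iterable of nodes already selected as supernodes.
                  ASSUMPTION: selection happened in an earlier step.
    Returns a dict: node -> supernode serving it.
    """
    # Build a symmetric adjacency so that hop distance is well defined.
    adj = {}
    for u, vs in neighbors.items():
        adj.setdefault(u, set())
        for v in vs:
            adj[u].add(v)
            adj.setdefault(v, set()).add(u)

    owner = {}
    queue = deque()
    # Seed the search from all supernodes at once; sorting makes ties
    # between equally distant supernodes resolve deterministically.
    for s in sorted(supernodes):
        owner[s] = s
        queue.append(s)

    # The first supernode whose search front reaches a node claims it.
    while queue:
        u = queue.popleft()
        for v in sorted(adj.get(u, ())):
            if v not in owner:
                owner[v] = owner[u]
                queue.append(v)
    return owner


def groups(owner):
    """Invert the node -> supernode map into supernode -> set of members."""
    out = {}
    for node, s in owner.items():
        out.setdefault(s, set()).add(node)
    return out
```

For example, calling assign_to_supernodes on a five-node chain a-b-c-d-e with supernodes a and e places every node in exactly one group. The result is a forest rooted at the supernodes. A real protocol would have to compute this in a distributed way and with a better proximity metric than hop count.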
Clients can locate a nearby supernode and tap into the query service via that supernode.

The problem can be abstracted as follows: given a partially connected directed graph in which each edge represents a node's initial knowledge of another, find a tree (possibly of a given depth) or forest, according to certain criteria, that encompasses all nodes. Different variants of the problem specification can be applied to other applications. For example, in a modified Napster service model, there is no central server (nothing to be shut down!); instead, a number of participating machines elect a local supernode, which cooperates with supernodes from other groups to provide services to its clients. Whatever protocol is used to form such a two-level hierarchy must be adaptive, in that adding or removing supernodes and clients must not disrupt operation.

We believe this is more manageable than our initial proposal because it is clearly defined and we already have some rough ideas of how it will work. Also, many of the difficulties we anticipate are design decisions. Issues we expect to handle include, but are not limited to:

1. Node selection: which node, out of those willing, should be selected as supernode, and how is the selection done?
2. Node discovery between supernodes: how would the roots of the trees in the resulting forest discover each other (the Name Dropper algorithm?)
3. Dynamic joining and disappearing of supernodes and clients.
4. Optimization of the architecture: optimizing the supernode mesh as more supernodes become known. This will make the supernode mesh more wisely connected and minimize the number of "hops" a query must travel before it finds the metadata for its destination.
5. Evaluation: we have three ways to evaluate our algorithm: (1) mathematical analysis, (2) simulation, and (3) comparing application performance with systems that do not use hierarchy (e.g., is routing more efficient?).

Tentative Timeline

Week 8: Literature survey.
This has already been under way for a while. We are trying to get a better idea of the algorithm and application space. We will also develop our algorithm concurrently.
Week 9: Node selection/(discovery?).
Week 10: Message-passing mechanism and format. Local data structure.
Week 11: Prove algorithm correctness. Optimization.
Week 12 - 13: Simulation/algorithm refinement.
Week 14 - 15: Paper write-up.

The above is only tentative. We will try to stay ahead of schedule to avoid any problems.