A Reliable Multicast Framework for Light-weight Sessions and Application
Level Framing
Sally Floyd, Van Jacobson, Steve McCanne, LBL; C.G. Liu, USC; Lixia
Zhang, Xerox PARC
One-line summary: Reliable multicast that leaves reliability and
ordering policies to the application rather than building them into
the protocol. Scalability is achieved by having a receiver multicast a
repair request only if it sees none from anyone else within a small
random delay.
Overview/Main Points
- Many-to-many multicast (based on IP multicast) where source IDs
of mcasts are persistent. Every piece of data has a unique name.
- Ordering and naming/sequence semantics are application-level
only; thus they are used only when needed, and apps can choose
whatever naming convention works well for them, e.g. in wb (the
shared whiteboard) you need to know which seq numbers you've missed
right after you join.
- BW management/allocation are also at app level.
- To prevent ACK implosion, NAKs are used.
- When someone needs repair, they wait a random amt of time for
someone else to mcast a repair request; if not seen, they
mcast it themselves.
- Since ADU names can be independent of the original sender, anyone
who has a copy of the requested data may transmit it.
- If a repair is lost, the requester's timeout triggers a retransmit
of the request.
- Wb instantiation: most ops are idempotent; time stamps on drawops
are used to determine rendering order. Captures reasonable
temporal causality without heavyweight causal delivery. For
repairs, anyone holding a copy of the requested data mcasts it
after a random delay unless another repair is seen first; this
prevents response implosion.
- Some nice simulations of request/repair behavior on various
topologies suggest that it works well even with large numbers of
nodes.
- Other apps for which SRM (the paper's framework) would be
interesting: distributed Web caching; Usenet; internet routing
information exchange.
- Related work: mostly distinguishes between token-based and
distributed responsibility for reliability and ordering.
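The request-suppression idea in the bullets above can be sketched as a toy simulation. This is a minimal illustration, not the paper's algorithm or simulator: the function name `count_requests`, the uniform delay interval `[0, max_delay]`, and the single fixed `propagation` delay between any two receivers are all assumptions made here to keep the model small.

```python
import random

def count_requests(n_receivers, propagation, max_delay=1.0, rng=None):
    """Return how many repair requests get multicast in one recovery round.

    Model (all assumptions of this sketch): every one of n_receivers >= 1
    receivers missed the same packet at time 0, each draws a delay uniformly
    from [0, max_delay], and a multicast request reaches every other receiver
    `propagation` time units after it is sent.
    """
    rng = rng or random.Random()
    timers = [rng.uniform(0.0, max_delay) for _ in range(n_receivers)]
    first = min(timers)
    # A receiver stays silent only if the first request arrives before its own
    # timer fires; otherwise it sends a (duplicate) request itself.
    return sum(1 for t in timers if t == first or t < first + propagation)

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 1000):
        trials = [count_requests(n, propagation=0.01, rng=rng) for _ in range(200)]
        print(n, sum(trials) / len(trials))
```

In this toy model, widening the delay interval relative to the propagation delay reduces duplicate requests at the cost of slower recovery, which is the trade-off behind the random wait.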
Relevance
Convincing application of the end-to-end argument to reliable multicast.
Effective prevention of ACK implosion and response implosion.
Flaws
- There are lots of parameters in the simulations whose values were
fixed (timeout, etc.) - not clear how they affect performance.
- If repair is dropped, have to rely on timer retransmit. On lossy
networks, recovery could be slow.
- WWW application seems somewhat gratuitous - not clear if mcast is
the right way to deliver WWW updates to distributed caches.
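To make the timer-retransmit concern concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the name `expected_retry_wait`, independent failures with probability `loss_prob` per request/repair exchange, a timeout that doubles after each failure, and a give-up limit `max_retries`. The notes above only say that a timeout triggers a retransmit.

```python
def expected_retry_wait(loss_prob, base_timeout=1.0, max_retries=10):
    """Expected time spent waiting on timeouts before recovery succeeds.

    Assumptions of this sketch: each request/repair exchange fails
    independently with probability `loss_prob`; the k-th timeout lasts
    base_timeout * 2**(k-1); after `max_retries` failures the receiver
    gives up (the time waited until then is still counted).
    """
    expected = 0.0
    waited = 0.0      # total timeout time accumulated after k failures
    p_fail_k = 1.0    # probability that the first k attempts all failed
    for k in range(1, max_retries + 1):
        waited += base_timeout * 2 ** (k - 1)
        p_fail_k *= loss_prob
        # Exactly k failures then a success, or give up at k == max_retries.
        p_stop = p_fail_k * (1.0 - loss_prob) if k < max_retries else p_fail_k
        expected += p_stop * waited
    return expected

if __name__ == "__main__":
    for p in (0.01, 0.1, 0.3, 0.5):
        print(p, expected_retry_wait(p))
```

Under these assumptions the expected wait grows quickly as the loss probability rises, because each additional failure doubles the next timeout.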