Paper Review: A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing

Reviewer: Kenneth Chin

Purpose

This paper describes a framework, Scalable Reliable Multicast (SRM), that provides primitive functionality for different multicast applications. Although different applications have varying multicast requirements, they do share some common ones. A framework that supports those common requirements should therefore be a good basis for all kinds of multicast applications. The primitive functionalities of SRM are scalability and reliability.

Something from Others

Summary

The paper describes SRM with the help of the wb application and uses the application data unit (ADU) model. It assumes that session members have persistent unique IDs and that each data item has a unique local name at the sender. Hence each data item has a unique, persistent name: the combination of a persistent source ID and the unique local name. Session messages are used for state exchange and help estimate the one-way distance between nodes. The state exchange also helps determine other things, such as the number of active participants in the group, and helps detect transmission losses. The model is based on the standard TCP/IP model for transmission of data, but the receivers are responsible for the reliability of the data they receive.

For loss recovery, a node sets a request timer according to its distance from the source. If it sees another request for the same data before its timer expires, it performs an exponential backoff; otherwise, it multicasts the request when its timer expires. Any host that receives the recovery request and has the requested data sets a repair timer; it multicasts the data if it sees no repair before its timer expires, and otherwise cancels its timer.

The authors discuss loss recovery performance for three topologies (star, tree, and chain) and the effect of the timer parameters (C1, C2, D1, and D2) on the loss recovery mechanism in various scenarios. The choice of these parameters depends on the tradeoff between loss recovery delay and the suppression of duplicate recovery requests. The authors present simulation results and graphs to support their claims.

Towards the end of the paper, local recovery mechanisms are discussed as a way to conserve bandwidth when the neighborhood affected by a loss is small. Such neighborhoods can be identified by methods such as administrative scoping and TTL-based scoping. Finally, the paper discusses some application-specific aspects of reliable multicast, with examples such as BGP routing tables and web caches.
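The request and repair timers described in the summary can be sketched in a few lines. This is only an illustration, assuming each timer is drawn uniformly from an interval scaled by the estimated distance and that a backoff doubles the request interval. The function names and the default parameter values are hypothetical choices for the example, not values taken from this review.

```python
import random


def request_timer(distance, c1=2.0, c2=2.0, backoff_round=0, rng=random):
    """Draw a request timer for a node at `distance` from the source.

    Assumed form: uniform on 2^i * [C1*d, (C1+C2)*d], where i is the
    number of backoffs so far. Defaults for c1/c2 are illustrative only.
    """
    scale = 2 ** backoff_round
    low = scale * c1 * distance
    high = scale * (c1 + c2) * distance
    return rng.uniform(low, high)


def repair_timer(distance, d1=1.0, d2=1.0, rng=random):
    """Draw a repair timer for a node at `distance` from the requester.

    Assumed form: uniform on [D1*d, (D1+D2)*d]. Defaults are illustrative.
    """
    return rng.uniform(d1 * distance, (d1 + d2) * distance)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print("first request timer:", request_timer(10.0, rng=rng))
    print("after one backoff:  ", request_timer(10.0, backoff_round=1, rng=rng))
    print("repair timer:       ", repair_timer(5.0, rng=rng))
```

Under these assumptions, nodes closer to the source tend to draw shorter timers, so they are likely to fire first and suppress duplicates from nodes farther away. Larger C2 or D2 values widen the interval, which improves suppression at the cost of longer recovery delay.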

3 Main Ideas

Potential Problems with SRM Framework
  • Flooding of data packets: the paper only describes how control packets can be suppressed; nothing is done about the data packets.
  • Wasted bandwidth: because data packets flood the network, they consume a lot of bandwidth, especially in large groups with dispersed members.
  • Session messages are not scalable: all members send session messages periodically to estimate the distance to every other group member, which does not scale to a large group. The paper mentions a hierarchical approach for scalable session messages that trades accuracy in 'distance' estimation for scalability, but that approach does not work, because it drops information previously carried in the session message that tells the sender whether a particular receiver has received the very last 'object'.
  • The estimated 'distance' metric may not reflect real network conditions: its accuracy depends on the persistence and prevalence of the routes. The paper assumes that routes are symmetric.
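To make the last point concrete, here is a small sketch of how asymmetric routes distort the distance estimate. It assumes an NTP-style timestamp exchange in session messages: A sends at t1, B replies after holding the message for delta, and A halves the remaining round-trip time. The function names and the delay values are hypothetical, chosen only for illustration.

```python
def estimate_distance(t1, delta, t4):
    """One-way distance estimate at node A.

    t1:    time A sent its session message (A's clock)
    delta: time B held the message before replying (B's clock)
    t4:    time A received B's reply (A's clock)
    Assumes symmetric paths: half of the round trip minus the hold time.
    """
    return (t4 - t1 - delta) / 2.0


def simulate_exchange(forward_delay, reverse_delay, hold_time, t1=0.0):
    """Simulate one exchange and return A's distance estimate."""
    t2 = t1 + forward_delay   # B receives A's message
    t3 = t2 + hold_time       # B sends its own session message
    t4 = t3 + reverse_delay   # A receives B's message
    return estimate_distance(t1, t3 - t2, t4)


if __name__ == "__main__":
    # Symmetric path: the estimate matches the true one-way delay.
    print(simulate_exchange(20.0, 20.0, 100.0))  # 20.0
    # Asymmetric path (30 ms out, 10 ms back): the estimate is the
    # average, so it is wrong in both directions.
    print(simulate_exchange(30.0, 10.0, 100.0))  # 20.0
```

In the asymmetric case the estimate is 20 ms although the true one-way delays are 30 ms and 10 ms, so timers scaled by this distance would be too short in one direction and too long in the other.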

Potential Areas to Work On

Grading

This is a very good paper in the sense that the SRM framework addresses, and to a certain extent solves, the scalability issue in multicast. Although it is not a perfect solution, it is a big step forward. I would grade this paper a 3. However, the paper is too lengthy.