AbstractWriting and debugging distributed programs can be difficult. When a program is working, it can be difficult to achieve reasonable execution performance. A major cause of these difficulties is a lack of tools for the programmer. We use a model of distributed computation and measurement to implement a program monitoring system for programs running on the Berkeley UNIX 4.2BSD operating system. The model of distributed computation describes the activities of the processes within a distributed program in terms of computation (internal events) and communication (external events). The measurement model focuses on external events and separates the detection of external events, event record selection and data analysis. The implementation of the measurement tools involved changes to the Berkeley UNIX kernel, and the addition of daemon processes to allow the monitoring activity to take place across machine boundaries. A user interface has also been implemented.
Read full abstract