Abstract

The impact of computer science on statistical computing, while important, has not been as great as it should be. Part of the reason is that statisticians are unaware of relevant research in computer science. A related, and possibly more serious, problem is that computer scientists are not aware of some of the interesting problems in statistical computing. This paper discusses several such problems. For example, statisticians are more concerned about “nice” case behavior than “worst” case behavior, so their primary interest is in expected running time under realistic models for typical data sets. Improvements in algorithms can often be made in “nice” cases with little or no sacrifice in worst case behavior, by being optimistic as well as pessimistic in the design of algorithms. Statisticians are also often willing to accept approximate solutions, or solutions which are asymptotically equivalent to the correct solution. A second example is the behavior of algorithms when data are stored in virtual or auxiliary memory. While some theoretical work has been done, only results on sorting and a few matrix operations have had an impact on statistical computing. There also seems to be a lack of models which give general predictions about the behavior of programs using virtual or auxiliary memory, especially for portable programs which are not designed for specific page sizes. A third example is the problem of explaining issues of numerical accuracy to users of statistical programs: if an ill-conditioned problem is diagnosed, it is necessary to explain to users what is wrong with their data. This might require finding a statistically meaningful index related to condition numbers. A fourth example is language design for statistical packages. These languages do not need to deal with the complex flow of control that programming languages do. On the other hand, they must be designed to be useful for the occasional and novice user.
As a final example, while the developing technology of software engineering is having a significant influence on the writing of statistical software (especially statistical packages), many questions remain to be answered, particularly concerning the portability of large applications programs.
