Measurement and modeling of computer reliability as affected by system activity
ACM Transactions on Computer Systems
M. C. Hsueh
R. K. Iyer
Exploring event correlation for failure prediction in coalitions of clusters