A repetition based measure for verification of text collections and for text categorization
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR ’03
Dmitry V. Khmelev
Legal documents categorization by compression