Many websites run deprecated software products that their producers no longer maintain. This paper explores whether an existing big data collection can be used to predict the likelihood of deprecated PHP releases from the abstract components of modern web deployment stacks. Building on web intelligence, software security, and data-based industry rationales, the paper examines this question by focusing on the most popular domains in the contemporary web-facing Internet. Logistic regression is used for classification. Although the statistical classification performance is modest, the results indicate that deprecated PHP releases are associated with Linux and other open source software components. Geographical variation is small. Beyond these results, the paper contributes to web intelligence research by evaluating the feasibility of using existing big data collections for mass-scale fingerprinting.
Jukka Ruohonen, Sami Hyrynsalmi, Ville Leppänen (University of Turku): Exploring the Use of Deprecated PHP Releases in the Wild Internet: Still a LAMP Issue?
Presented at WIMS ’16: Proceedings of the 6th International Conference on Web Intelligence, Mining and Semantics