Volume 43 (2018): Issue 4 (December 2018)
Journal Details
First Published
24 Oct 2012
Publication timeframe
4 times per year
Open Access

Predicting Aggregated User Satisfaction in Software Projects

Published Online: 31 Dec 2018
Page range: 335 - 357
Received: 07 Aug 2018
Accepted: 19 Oct 2018

User satisfaction is an important feature of software quality. However, it has rarely been studied in the software engineering literature. Building on earlier research, this paper focuses on predicting user satisfaction with machine learning techniques, using software development data from an extended ISBSG dataset. The study involved building, evaluating, and comparing a total of 15,600 prediction schemes. Each scheme is a different combination of components: manual feature preselection, handling of missing values, outlier elimination, value normalization, automated feature selection, and a classifier. The research procedure involved 10-fold cross-validation and separate testing, both repeated 10 times, to train and evaluate each prediction scheme. The accuracy of the best-performing schemes, expressed as the Matthews correlation coefficient, was about 0.5 in cross-validation and about 0.5–0.6 in the testing stage. The study identified the most accurate settings for the components of prediction schemes.
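As an illustration of the evaluation procedure summarized above, the following Python sketch computes the Matthews correlation coefficient from confusion-matrix counts and generates the index splits for a 10-fold cross-validation repeated 10 times. It is a minimal sketch, not the authors' implementation: it assumes a two-class target, and the function names `matthews_corrcoef` and `repeated_kfold`, the shuffling strategy, and the seed are hypothetical choices made for this example.

```python
import math
import random


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Assumption: a two-class target; the multiclass form is not shown here.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0.0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def repeated_kfold(n_samples: int, k: int = 10, repeats: int = 10, seed: int = 0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper: each repeat reshuffles the sample indices and
    partitions them into k disjoint test folds.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        for f in range(k):
            test = idx[f::k]  # every k-th shuffled index forms one fold
            test_set = set(test)
            train = [i for i in idx if i not in test_set]
            yield r, f, train, test
```

For example, `matthews_corrcoef(tp=40, tn=35, fp=15, fn=10)` evaluates to roughly 0.50, the order of magnitude reported for the best-performing schemes. Iterating over `repeated_kfold(n)` yields 100 train/test splits, one per fold per repeat, each of which would be used to train and score a single prediction scheme.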


