[Ahmadov, A., Thiele, M., Eberius, J., Lehner, W. and Wrembel, R. (2015). Towards a hybrid imputation approach using web tables, IEEE/ACM International Symposium on Big Data Computing (BDC), Limassol, Cyprus, pp. 21-30.]Search in Google Scholar
[Bekkerman, R., Bilenko, M. and Langford, J. (2011). Scaling Up Machine Learning: Parallel and Distributed Approaches, Cambridge University Press, New York, NY.10.1145/2107736.2107740]Search in Google Scholar
[Benjelloun, O., Garcia-Molina, H., Menestrina, D., Su, Q., Whang, S.E. and Widom, J. (2009). Swoosh: A generic approach to entity resolution, The VLDB Journal 18(1): 255-276. 10.1007/s00778-008-0098-x]Open DOISearch in Google Scholar
[Bayer, M.A. and Edjlali, R. (2014). Magic Quadrant for Data Warehouse Database Management Systems, Gartner Publications, Stamford, CT, https://www.gartner.com/doc/2678018/magic-quadrant-data-warehouse-database.]Search in Google Scholar
[Beyer, M. and Laney, D. (2012). The Importance of “Big Data”: A Definition, Gartner Publications, Stamford, CT. ]Search in Google Scholar
[Boyd, D. and Crawford, K. (2012). Critical questions for big data, Information, Communication and Society 15(5): 662-679.10.1080/1369118X.2012.678878]Open DOISearch in Google Scholar
[Brzezinski, D. and Stefanowski, J. (2014). Combining block-based and online methods in learning ensembles from concept drifting data streams, Information Sciences 265: 50-67.10.1016/j.ins.2013.12.011]Search in Google Scholar
[Che, D., Safran, M. and Peng, Z. (2013). From big data to big data mining: Challenges, issues, and opportunities, in10.1007/978-3-642-40270-8_1]Search in Google Scholar
[B. Hong et al. (Eds.), International Conference on Database Systems for Advanced Applications, Lecture Notes in Computer Science, Vol. 7827, Springer, Berlin/Heidelberg, pp. 1-15.]Search in Google Scholar
[Chen, C.L.P. and Zhang, C. (2014). Data-intensive applications, challenges, techniques and technologies: A survey on big data, Information Sciences 275(10): 314-347.10.1016/j.ins.2014.01.015]Search in Google Scholar
[Custers, B., Calders, T., Schermer, B. and Zarsky, T.Z. (Eds.) (2013). Discrimination and Privacy in the Information Society-Data Mining and Profiling in Large Databases, Studies in Applied Philosophy, Epistemology and Rational Ethics, Vol. 3, Springer, Berlin/Heidelberg.]Search in Google Scholar
[Ditzler, G., Roveri, M., Alippi, C. and Polikar, R. (2015). Learning in nonstationary environments: A survey, IEEE Computational Intelligence Magazine 10(4): 12-25.10.1109/MCI.2015.2471196]Search in Google Scholar
[Domingos, P. and Hulten, G. (2000). Mining high-speed data streams, ACM SIGKDD International Conference on Knowledge Discovery Data Mining, Boston, MA, USA, pp. 71-80.]Search in Google Scholar
[Duggan, J., Elmore, A.J., Stonebraker, M., Balazinska, M., Howe, B., Kepner, J., Madden, S., Maier, D., Mattson, T. and Zdonik, S. (2015). The BigDAWG polystore system, SIGMOD Record 44(2): 11-16.10.1145/2814710.2814713]Search in Google Scholar
[Elmagarmid, A., Rusinkiewicz, M. and Sheth, A. (Eds.) (1999). Management of Heterogeneous and Autonomous Database Systems, Morgan Kaufmann, San Francisco, CA.]Search in Google Scholar
[Fernández, A., del Río, S., Chawla, N.V. and Herrera, F. (2017). An insight into imbalanced big data classification: Outcomes and challenges, Complex & Intelligent Systems 3(2): 105-120.10.1007/s40747-017-0037-9]Search in Google Scholar
[Francisco, P. (2012). Oracle Exadata and IBM Netezza data warehouse appliance compared, IBM White Paper, www.ibmbigdatahub.com/pdf/Oracle_Exadata_IBMNetezza_Compared_WP_EN.pdf.]Search in Google Scholar
[Gama, J. (2010). Knowledge Discovery from Data Streams, Chapman and Hall, Boca Raton, FL.10.1201/EBK1439826119]Search in Google Scholar
[Gama, J., Zliobaite, I., Bifet, A., Pechenizkiy, M. and Bouchachia, A. (2014). A survey on concept drift adaptation, ACM Computing Surveys 46(4): 44:1-44:37.10.1145/2523813]Open DOISearch in Google Scholar
[Gens, F. (2011). IDC predictions 2012: Competing for 2020. IDC analyze the future, https://www.virtustream.com/sites/default/files/IDCTOP10Predictions2012.pdf.]Search in Google Scholar
[Gessert, F., Schaarschmidt, M., Wingerath, W., Witt, E., Yoneki, E. and Ritter, N. (2017). Quaestor: Query web caching for database-as-a-service providers, PVLDB 10(12): 1670-1681.10.14778/3137765.3137773]Search in Google Scholar
[Glavic, B. (2014). Big data provenance: Challenges and implications for benchmarking, in T. Rabl et al. (Eds.), Specifying Big Data Benchmarks, Springer, New York, NY, pp. 72-80.10.1007/978-3-642-53974-9_7]Search in Google Scholar
[Gupta, A. (2009). Data provenance, in L. Liu and M.T. O¨ zsu (Eds.), Encyclopedia of Database Systems, Springer, Berlin, pp. 608-608.10.1007/978-0-387-39940-9_1305]Search in Google Scholar
[Han, J. and Kamber, M. (Eds.) (2011). Data Mining. Concepts and Techniques, Morgan Kaufmann, San Francisco, CA.]Search in Google Scholar
[Hashem, H. and Ranc, D. (2016). Pre-processing and modeling tools for bigdata, Foundations of Computing and Decision Sciences 41(3): 151-162.10.1515/fcds-2016-0009]Search in Google Scholar
[Japkowicz, N. and Stefanowski, J. (2016a). A machine learning perspective on big data analysis, in N. Japkowicz and J.10.1007/978-3-319-26989-4_1]Open DOISearch in Google Scholar
[Stefanowski (Eds.), Big Data Analysis: New Algorithms for a New Society, Springer, Cham, pp. 1-31.]Search in Google Scholar
[Japkowicz, N. and Stefanowski, J. (Eds.) (2016b). Big Data Analysis: New Algorithms for a New Society, Studies in Big Data, Vol. 16, Springer, Cham.10.1007/978-3-319-26989-4]Search in Google Scholar
[Kingma, D.P. and Welling, M. (2013). Auto-encoding variational Bayes, ArXiv e-prints, 1312.6114a.]Search in Google Scholar
[Krawczyk, B., Minku, L.L., Gama, J., Stefanowski, J. and Wozniak, M. (2017). Ensemble learning for data stream analysis: A survey, Information Fusion 37: 132-156.10.1016/j.inffus.2017.02.004]Search in Google Scholar
[Krempl, G., Zliobaite, I., Brzezinski, D., H¨ullermeier, E., Last, M., Lemaire, V., Noack, T., Shaker, A., Sievi, S., Spiliopoulou, M. and Stefanowski, J. (2014). Open challenges for data stream mining research, SIGKDD Explorations 16(1): 1-10.10.1145/2674026.2674028]Search in Google Scholar
[Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012). Imagenet classification with deep convolutional neural networks, in F. Pereira et al. (Eds.), Advances in Neural Information Processing Systems 25, Curran Associates, Inc., Red Hook, NY, pp. 1097-1105.]Search in Google Scholar
[Krawiec, K. (2016). Evolutionary feature selection and construction, in S. Claude and G. Webb (Eds.), Encyclopedia of Machine Learning and Data Mining, Springer, Boston, MA.10.1007/978-1-4899-7502-7_90-1]Search in Google Scholar
[Langegger, A., Wöß, W. and Blöchl, M. (2008). A semantic web middleware for virtual data integration on the web, European Semantic Web Conference on the Semantic Web: Research and Applications (ESWC), Tenerife, Canary Islands, Spain, pp. 493-507.]Search in Google Scholar
[LeCun, Y., Bengio, Y. and Hinton, G. (2015). Deep learning, Nature 521(7553): 436-444.10.1038/nature1453926017442]Search in Google Scholar
[Liu, M. and Wang, Q. (2016). Rogas: A declarative framework for network analytics, International Conference on Very Large Data Bases (VLDB), New Delhi, India, pp. 1561-1564.]Search in Google Scholar
[Matwin, S. (2013). Privacy-preserving data mining techniques: Survey and challenges, in B. Custers et al. (Eds.), Discrimination and Privacy in the Information Society, Vol 3. Springer, Berlin/Heidelberg, pp. 209-221.10.1007/978-3-642-30487-3_11]Search in Google Scholar
[Mauro, A.D., Greco, M. and Grimaldi, M. (2015). What is big data? A consensual definition and a review of key research topics, International Conference on Integrated Information, Madrid, Spain, pp. 97-104.]Search in Google Scholar
[Miao, X., Gao, Y., Guo, S. and Liu, W. (2017). Incomplete data management: A survey, Frontiers of Computer Science, DOI: 10.1007/s11704-016-6195-x.10.1007/s11704-016-6195-x]Open DOISearch in Google Scholar
[Moreau, L., Clifford, B., Freire, J., Futrelle, J., Gil, Y., Groth, P., Kwasnikowska, N., Miles, S., Missier, P., Myers, J., Plale, B., Simmhan, Y., Stephan, E. and den Bussche, J.V. (2011). The open provenance model core specification (v1.1), Future Generation Computer Systems 27(6): 743-756.10.1016/j.future.2010.07.005]Open DOISearch in Google Scholar
[Napierala, K. and Stefanowski, J. (2016). Types of minority class examples and their influence on learning classifiers from imbalanced data, Journal of Intelligent Information Systems 46(3): 563-597.10.1007/s10844-015-0368-1]Open DOISearch in Google Scholar
[Naumann, F. (2014). Data profiling revisited, SIGMOD Record 42(4): 40-49.10.1145/2590989.2590995]Open DOISearch in Google Scholar
[Rudin, C. (2014). Algorithms for interpretable machine learning, ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, pp. 1519-1519.]Search in Google Scholar
[Russom, P. (2017). Data lakes: Purposes, practices, patterns, and platforms. TDWI White Paper, https://info.talend.com/rs/talend/images/WP_EN_BD_TDWI_DataLakes.pdf.]Search in Google Scholar
[Schmidhuber, J. (2015). Deep learning in neural networks: An overview, Neural Networks 61(C): 85-117.10.1016/j.neunet.2014.09.00325462637]Search in Google Scholar
[Shaker, A. and Hüllermeier, E. (2014). Survival analysis on data streams: Analyzing temporal events in dynamically changing environments, International Journal of Applied Mathematics and Computer Science 24(1): 199-212, DOI: 10.2478/amcs-2014-0015.10.2478/amcs-2014-0015]Open DOISearch in Google Scholar
[Soltanpoor, R. and Sellis, T. (2016). Prescriptive analytics for big data, Australasian Database Conference on Databases Theory and Applications (ADC), Sydney, Australia, pp. 245-256.]Search in Google Scholar
[Sun, Y., Tang, K., Minku, L.L., Wang, S. and Yao, X. (2016). Online ensemble learning of data streams with gradually evolved classes, IEEE Transactions on Knowledge and Data Engineering 28(6): 1532-1545.10.1109/TKDE.2016.2526675]Search in Google Scholar
[Terrizzano, I., Schwarz, P., Roth, M. and Colino, J.E. (2015). Data wrangling: The challenging journey from the wild to the lake, Conference on Innovative Data Systems Research (CIDR), Asiloma, CA, USA.]Search in Google Scholar
[Wang, J., Crawl, D., Purawat, S., Nguyen, M.H. and Altintas, I. (2015). Big data provenance: Challenges, state of the art and opportunities, IEEE International Conference on Big Data, Santa Clara, CA, USA, pp. 2509-2516.]Search in Google Scholar
[Wiederhold, G. (1992). Mediators in the architecture of future information systems, IEEE Computer 25(3): 38-49.10.1109/2.121508]Open DOISearch in Google Scholar
[Wylot, M., Cudré-Mauroux, P., Hauswirth, M. and Groth, P.T. (2017). Storing, tracking, and querying provenance in linked data, IEEE Transactions on Knowledge and Data Engineering 29(8): 1751-1764.10.1109/TKDE.2017.2690299]Search in Google Scholar
[Zakhary, V., Agrawa, D. and El Abbadi, A. (2017). Caching at the web scale, International Conference on World Wide Web Companion, Perth, Australia, pp. 909-912.]Search in Google Scholar