1. http://www.brightplanet.com/completeplanet/
2. Su, W., H. Wu, Y. Li et al. Understanding Query Interfaces by Statistical Parsing. - ACM Transactions on the Web (TWEB), Vol. 7, 2013, No 2, p. 8. 10.1145/2460383.2460387
3. Dragut, E. C., W. Meng, C. T. Yu. Deep Web Query Interface Understanding and Integration. - Synthesis Lectures on Data Management, Vol. 7, 2012, No 1, pp. 1-168. 10.2200/S00419ED1V01Y201205DTM026
4. Lu, Y., H. He, H. Zhao et al. Annotating Search Results from Web Databases. - IEEE Transactions on Knowledge and Data Engineering, Vol. 25, 2013, No 3, pp. 514-527. 10.1109/TKDE.2011.175
5. Palekar, V. R., M. S. Ali, R. Meghe. Deep Web Data Extraction Using Web Programming-Language Independent Approach. - Journal of Data Mining and Knowledge Discovery, Vol. 3, 2012, No 2, p. 69.
6. Wang, Z., G. Xu, H. Li et al. A Probabilistic Approach to String Transformation. - IEEE Transactions on Knowledge and Data Engineering, Vol. 26, 2014, No 5, pp. 1063-1075. 10.1109/TKDE.2013.11
7. Sood, S., D. Loguinov. Probabilistic Near-Duplicate Detection Using Simhash. - In: Proc. of 20th ACM International Conference on Information and Knowledge Management, ACM, 2011, pp. 1117-1126. 10.1145/2063576.2063737
8. Zhao, W. L., C. W. Ngo, H. K. Tan et al. Near-Duplicate Keyframe Identification with Interest Point Matching and Pattern Learning. - IEEE Transactions on Multimedia, Vol. 9, 2007, No 5, pp. 1037-1048. 10.1109/TMM.2007.898928
9. Hajishirzi, H., W. Yih, A. Kolcz. Adaptive Near-Duplicate Detection via Similarity Learning. - In: Proc. of 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 2010, pp. 419-426. 10.1145/1835449.1835520
10. Zhao, P., J. Xin, X. Xian et al. Active Learning for Duplicate Record Identification in Deep Web. - In: Foundations of Intelligent Systems. Berlin, Heidelberg, Springer, 2014, pp. 125-134. 10.1007/978-3-642-54924-3_12
11. Xiao, C., W. Wang, X. Lin et al. Efficient Similarity Joins for Near-Duplicate Detection. - ACM Transactions on Database Systems (TODS), Vol. 36, 2011, No 3, p. 15. 10.1145/2000824.2000825
12. He, B., K. C.-C. Chang. Making Holistic Schema Matching Robust: An Ensemble Approach. - KDD, 2005, pp. 429-438. 10.1145/1081870.1081920
13. Fellegi, I. P., A. B. Sunter. A Theory for Record Linkage. - Journal of the American Statistical Association, Vol. 64, December 1969, No 328, pp. 1183-1210. 10.1080/01621459.1969.10501049
14. Newcombe, H. B., J. M. Kennedy, S. J. Axford, A. P. James. Automatic Linkage of Vital Records. - Science, Vol. 130, October 1959, No 3381, pp. 954-959. 10.1126/science.130.3381.954
15. Jaro, M. A. Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa, Florida. - Journal of the American Statistical Association, Vol. 84, June 1989, No 406, pp. 414-420. 10.1080/01621459.1989.10478785
16. Dempster, A., N. Laird, D. Rubin. Maximum Likelihood from Incomplete Data via the EM Algorithm. - Journal of the Royal Statistical Society, Series B, Vol. 39, 1977, No 1, pp. 1-38. 10.1111/j.2517-6161.1977.tb01600.x
17. Winkler, W. E. Improved Decision Rules in the Fellegi-Sunter Model of Record Linkage. Technical Report, Statistical Research Report Series RR93/12, U.S. Bureau of the Census, Washington, D.C., 1993.
18. Cochinwala, M., V. Kurien et al. Efficient Data Reconciliation. - Information Sciences, Vol. 137, September 2001, No 1-4, pp. 1-15. 10.1016/S0020-0255(00)00070-0
19. Breiman, L., J. Friedman et al. Classification and Regression Trees. CRC Press, July 1984.
20. Hastie, T., R. Tibshirani, J. Friedman. The Elements of Statistical Learning. Springer Verlag, August 2001. 10.1007/978-0-387-21606-5
21. Bilenko, M., R. Mooney et al. Adaptive Name Matching in Information Integration. - IEEE Intelligent Systems, Vol. 18, 2003, No 5, pp. 16-23. 10.1109/MIS.2003.1234765
22. Chang, K. C., B. He, C. Li, M. Patel, Z. Zhang. Structured Databases on the Web: Observations and Implications. - SIGMOD Record, Vol. 33, 2004, No 3, pp. 61-70. 10.1145/1031570.1031584
23. Cohen, W., J. Richman. Learning to Match and Cluster Large High-Dimensional Data Sets for Data Integration. - In: Proc. of 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002. 10.1145/775047.775116
24. McCallum, A., B. Wellner. Conditional Models of Identity Uncertainty with Application to Noun Coreference. - In: Proc. of Advances in Neural Information Processing Systems (NIPS’2004), 2004.
25. Xiao, C., W. Wang, X. Lin et al. Efficient Similarity Joins for Near-Duplicate Detection. - ACM Transactions on Database Systems (TODS), Vol. 36, 2011, No 3, p. 15. 10.1145/2000824.2000825
26. Tejada, S., C. Knoblock, S. Minton. Learning Domain-Independent String Transformation Weights for High Accuracy Object Identification. - In: Proc. of 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002. 10.1145/775047.775099
27. Ananthakrishna, R., S. Chaudhuri, V. Ganti. Eliminating Fuzzy Duplicates in Data Warehouses. - In: Proc. of 28th International Conference on Very Large Databases, 2002.
28. Guha, S., N. Koudas et al. Merging the Results of Approximate Match Operations. - In: Proc. of 30th International Conference on Very Large Databases, 2004, pp. 636-647. 10.1016/B978-012088469-8.50057-7
29. Chaudhuri, S., V. Ganti, R. Motwani. Robust Identification of Fuzzy Duplicates. - In: Proc. of 21st IEEE International Conference on Data Engineering (ICDE’2005), 2005, pp. 865-876.
30. Christen, P. A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication. - IEEE Transactions on Knowledge and Data Engineering, Vol. 24, 2012, No 9, pp. 1537-1555. 10.1109/TKDE.2011.127