Data Science Structure and the role of Statistics
Conference
64th ISI World Statistics Congress
Format: IPS Paper
Keywords: citizen science, data science, machine learning
Session: IPS 90 - Fourth Industrial Revolution, Data Science and the Future of Official Statistics
Monday 17 July 2 p.m. - 3:40 p.m. (Canada/Eastern)
Abstract
Data science has emerged as a strong, visible, and publicly recognized label for problem-solving with ever-growing, large datasets and new data sources such as administrative registers, satellites and aircraft, webcams, data voluntarily provided by internet users, and data harvested from the web. Applications of data science tools range from earth observation to official statistics. The discussion of the advantages, disadvantages, limitations, and requirements of integrating alternative data sources with probability sample surveys is informing the debate in national and international statistical systems all over the world. The crucial question is: what is “data science”? Is it statistics or computer science? What is its relationship with “big data”, “new data sources”, “machine learning”, “artificial intelligence” and “smart statistics”? This paper addresses these questions.