For more details on registration and submissions for the Large-Scale Spatial Data Science course, please first log in to your account. If you do not have an account, you can create one below:
The course, designed for data scientists, geospatial analysts, and researchers, will provide a comprehensive understanding of advanced methods in large-scale geospatial data science. The focus will be on three key topics: large-scale data modeling and prediction, accelerating geospatial data processing with multi- and mixed-precision techniques on modern hardware architectures, and parallelizing related R code using the first parallel runtime system package in R. Participants will first explore ExaGeoStatCPP, a parallel framework for high-performance geostatistical computations that enables efficient modeling and prediction of large-scale geospatial datasets within C++ and R environments. The course will then cover the MPCR package, which provides multi- and mixed-precision support on CPUs and GPUs; attendees will learn how to integrate MPCR functions into their R workflows to balance performance against precision in computational tasks. Finally, participants will be introduced to RCOMPSs, a new runtime system designed to parallelize R code across HPC systems. Hands-on sessions will demonstrate, through practical examples, how RCOMPSs can accelerate R code execution in high-performance computing environments. By the end of the course, participants will have gained advanced skills in large-scale geospatial data science and be ready to apply them in their professional roles.
Marc G. Genton is Al-Khawarizmi Distinguished Professor of Statistics at the King Abdullah University of Science and Technology (KAUST) in Saudi Arabia. He received his Ph.D. degree in Statistics (1996) from the Swiss Federal Institute of Technology (EPFL), Lausanne. He is a fellow of the American Statistical Association (ASA), of the Institute of Mathematical Statistics (IMS), and of the American Association for the Advancement of Science (AAAS), and is an elected member of the International Statistical Institute (ISI). In 2010, he received the El-Shaarawi award for excellence from the International Environmetrics Society (TIES) and the Distinguished Achievement award from the Section on Statistics and the Environment (ENVR) of the ASA. He received an ISI Service award in 2019 and the Georges Matheron Lectureship award in 2020 from the International Association for Mathematical Geosciences (IAMG). He led a Gordon Bell Prize finalist team with the ExaGeoStat software at Supercomputing 2022. He received the Royal Statistical Society (RSS) 2023 Barnett Award for his outstanding research in environmental statistics and the 2024 Don Owen Award from the ASA’s San Antonio Chapter. He then led the winning team of the Gordon Bell Prize in Climate Modeling with Exascale Climate Emulators at Supercomputing 2024. His research interests include statistical analysis, flexible modeling, prediction, and uncertainty quantification of spatio-temporal data, with applications in environmental and climate science, as well as renewable energies.
Personal webpage: http://stsds.kaust.edu.sa
Sameh Abdulah obtained his M.S. and Ph.D. degrees from Ohio State University, Columbus, USA, in 2014 and 2016, respectively. He is currently a research scientist at the Extreme Computing Research Center (ECRC), King Abdullah University of Science and Technology, Saudi Arabia. His research covers high-performance computing applications, big data, bitmap indexing, handling large spatial datasets, parallel spatial statistics applications, algorithm-based fault tolerance, and machine learning and data mining algorithms. Sameh was part of the KAUST team that was nominated for the ACM Gordon Bell Prize in 2022 and won it in 2024 (climate track) for its work on large-scale climate/weather modeling and prediction.
Personal webpage: https://sites.google.com/view/samehabdulah
Mary Lai O. Salvaña is an Assistant Professor in Statistics at the University of Connecticut (UConn). Prior to joining UConn, she was a Postdoctoral Fellow in the Department of Mathematics at the University of Houston. She received her B.S. and M.S. degrees in Applied Mathematics from Ateneo de Manila University, Philippines, in 2015 and 2016, respectively, and her Ph.D. degree from the King Abdullah University of Science and Technology (KAUST), Saudi Arabia. Her research interests include extreme and catastrophic events, risks, disasters, space-time statistics, environmental statistics, high-performance computing, and computational statistics.
Personal webpage: https://marylaisalvana.com/
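The proposed course topics can be summarised as follows: large-scale geospatial data modeling and prediction with ExaGeoStatCPP; multi- and mixed-precision computing on CPUs and GPUs with the MPCR package; and parallelization of R code on HPC systems with RCOMPSs.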
Attendees should have a background in data science, geospatial analysis, or related research. The short course is designed for individuals who want to advance their skills in large-scale geospatial data science, specifically those interested in geospatial data modeling, multi-precision computing, and parallelization of R code for high-performance computing.