Development of a Supervised Model of Regression with Machine Learning for Particulate Matter in the Metropolitan Area of Guadalajara
Abstract
The Metropolitan Area of Guadalajara (MAG) is the second most contaminated area in Mexico. Changes in the region's climate system, caused mainly by anthropogenic activities that produce atmospheric pollutants, can be seen in their effects on the ecosystem and on the health of the population. A greater understanding of the atmosphere, and of the behavior and effects of pollutants on the air quality of the MAG, is needed. Particulate matter (PM) directly affects air quality, human health and the environment. The MAG does not have measurements of PM at 2.5 micrometers (anthropogenic) covering its entire surface; only PM at 10 micrometers is measured. In this project, data for the MAG and its surroundings were extracted with Python from files downloaded from the NASA Giovanni web page, using the netCDF4 library. Three measurements of interest (Ångström Exponent, Aerosol Optical Depth and Mass Concentration) were integrated into a single dataset. From this dataset, descriptive statistics and visualizations were produced with Seaborn, Matplotlib and Tableau. Seasonal patterns stood out and motivated the search for a machine learning regressor that would “learn” and predict the size of the particles by month. Given the distribution of these patterns, a “boosting” model was chosen. Three ways of “training” the model were proposed. The first used the complete dataset. The second and third used “feature engineering” with a “sliding window”, which takes into account the trends followed each month; the second used data from the MAG only, and the third used data from the MAG and its surroundings. The final model makes accurate quantitative and qualitative forecasts of particle size in a given month, which are useful for taking preventive measures.
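The “sliding window” idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the window length, the feature layout or the library used, so the function name `sliding_window_features`, the window of 3 months and the sample values below are all assumptions made only for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def sliding_window_features(
    series: Sequence[float], window: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a monthly series into (lagged features, target) pairs.

    Each feature row holds the `window` previous values; the target is
    the value that immediately follows them.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    features: List[List[float]] = []
    targets: List[float] = []
    for end in range(window, len(series)):
        features.append(list(series[end - window:end]))
        targets.append(series[end])
    return features, targets


# Hypothetical monthly values (illustrative only, not data from the study).
monthly_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = sliding_window_features(monthly_values, window=3)
```

Each row of `X` could then be passed, together with `y`, to a boosting regressor; which boosting implementation the project used is not stated in the abstract.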