An Anomaly Detection Process for a Business Solution





López-Miranda, Ana V.

In this work, a business solution is implemented using machine learning algorithms. The solution addresses a realistic case of a corporation whose financial department has struggled to evaluate large quantities of information in order to capture cost irregularities in its external suppliers’ billing process. The solution is designed to solve three critical business problems. First, provide a tool that detects anomalies automatically, which increases savings by reducing the number of experts currently needed solely to detect them. Second, implement unsupervised machine learning methods that allow massive tagging of the data, given the current lack of labels. Third, apply a method whose results are validated and reviewed by the business Subject Matter Experts (SMEs) before use. A process that simulates the experts’ classification is generated, using a method that tags the historical data and accelerates the SMEs’ manual labeling. The overall workflow consists of five phases. First, the organization’s information is gathered into a single database, to which feature transformation and selection are applied. Once the features are defined and ready to use, the process continues with unsupervised training using a probabilistic method that provides the massive tagging for our binary classification. The labeled dataset is then shared with the business experts for review and feedback, in which they provide the correct classification for the observations that went into the model. Finally, the data is fed into a supervised algorithm selected using a fixed accuracy threshold and contamination rate. Using these parameters as conditions, the model that fits better than the probabilistic unsupervised approach is selected. When the criteria are met, the model is deployed to a production environment for user consultation.
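
The tagging and model-selection phases described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the thesis's actual method: the abstract does not name the probabilistic model or the supervised algorithm, so a single-component Gaussian density (scikit-learn's GaussianMixture) and a random forest stand in for them, the data is synthetic, the threshold values are hypothetical, and the SME review is simulated by accepting the pseudo-labels unchanged.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

# Hypothetical parameters; the abstract does not state their values.
CONTAMINATION = 0.05       # assumed share of anomalous invoices
ACCURACY_THRESHOLD = 0.90  # assumed minimum accuracy required to deploy

# Synthetic stand-in for the billing features (e.g. amount, quantity).
rng = np.random.default_rng(0)
usual = rng.normal(loc=[100.0, 10.0], scale=[10.0, 2.0], size=(950, 2))
unusual = rng.normal(loc=[300.0, 40.0], scale=[20.0, 5.0], size=(50, 2))
X = np.vstack([usual, unusual])

# Unsupervised phase: fit a density model and tag the least likely
# observations as anomalies, using the contamination rate as the cutoff.
density_model = GaussianMixture(n_components=1, random_state=0).fit(X)
log_density = density_model.score_samples(X)
cutoff = np.quantile(log_density, CONTAMINATION)
pseudo_labels = (log_density <= cutoff).astype(int)  # 1 = anomaly

# Review phase: in practice the SMEs correct these tags. Here the
# review is simulated by accepting every pseudo-label as-is.
reviewed_labels = pseudo_labels.copy()

# Supervised phase: train on the reviewed labels and check the
# held-out accuracy against the fixed threshold before deployment.
X_train, X_test, y_train, y_test = train_test_split(
    X, reviewed_labels, test_size=0.3,
    stratify=reviewed_labels, random_state=0,
)
classifier = RandomForestClassifier(n_estimators=100, random_state=0)
classifier.fit(X_train, y_train)
accuracy = accuracy_score(y_test, classifier.predict(X_test))
deploy = accuracy >= ACCURACY_THRESHOLD

print(f"tagged anomalies: {int(pseudo_labels.sum())}")
print(f"held-out accuracy: {accuracy:.3f}, deploy: {deploy}")
```

In this sketch the contamination rate fixes how many observations receive the anomaly tag, while the accuracy threshold acts only as a gate on deployment; both roles are an interpretation of the abstract, not a confirmed detail of the thesis.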

Keywords

Anomaly Detection, Business Solution, Outlier Analysis, Unsupervised Learning


López-Miranda, A. V. (2020). An Anomaly Detection Process for a Business Solution. Master's degree thesis, Master's in Data Science. Tlaquepaque, Jalisco: ITESO.