Unsupervised Text Classification: a Contractual Risk Detection Approach

Date

2020-11

Authors

Villalobos-Ramos, Omar A.

Publisher

ITESO

Abstract

The enterprise contracting process tends to be tedious when there are thousands of active contracts to manage. The aim of this work was to implement an automatic indexing and information retrieval method that classifies the semantic structure within contract documents into two classes, risk and non-risk legal language, on the basis of the terms contained in new documents, hereafter called queries. Each document is transformed into a term-frequency representation, and singular-value decomposition then expresses these transformations in an optimized number of factors. Queries are analyzed as vectors formed from linear combinations of the terms and are compared, by cosine value, with documents of known class to determine the nature of their legal language (risk or non-risk). The results show that class detection is possible with the proposed methodology, with relatively high accuracy.
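The pipeline described in the abstract (term frequency, singular-value decomposition, cosine comparison of queries against documents of known class) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the toy corpus and its labels, the whitespace tokenizer, the choice of k = 2 factors, the standard fold-in formula for queries, and the rule of assigning the class of the single most similar document.

```python
import numpy as np

# Hypothetical toy corpus with hand-assigned labels (illustration only).
docs = [
    ("supplier shall indemnify buyer against all liability and penalty", "risk"),
    ("unlimited liability and penalty apply upon breach and termination", "risk"),
    ("buyer may claim penalty for breach without liability cap", "risk"),
    ("invoice payment is due within thirty days of delivery", "non-risk"),
    ("delivery of goods occurs at the buyer warehouse address", "non-risk"),
    ("payment by invoice follows delivery of goods", "non-risk"),
]

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

vocab = sorted({term for text, _ in docs for term in tokenize(text)})
index = {term: i for i, term in enumerate(vocab)}

def tf_vector(text):
    # Raw term-frequency vector; terms outside the vocabulary are ignored.
    vec = np.zeros(len(vocab))
    for term in tokenize(text):
        if term in index:
            vec[index[term]] += 1.0
    return vec

# Term-by-document matrix: one column per known document.
A = np.column_stack([tf_vector(text) for text, _ in docs])

# Thin SVD, truncated to k factors (k = 2 is an arbitrary choice here).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
U_k, s_k = U[:, :k], s[:k]
doc_coords = Vt[:k, :].T  # row i holds document i in factor space

def fold_in(text):
    # Project a query into factor space: q_hat = q^T U_k S_k^{-1}.
    return tf_vector(text) @ U_k / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(text):
    # Assumed decision rule: take the label of the most similar known document.
    q = fold_in(text)
    sims = [cosine(q, d) for d in doc_coords]
    best = int(np.argmax(sims))
    return docs[best][1], sims[best]

label, score = classify("penalty and liability upon breach")
print(label, round(score, 3))
```

In practice the number of factors would be tuned rather than fixed, and a weighting scheme or a vote over several nearest documents could replace the single-nearest-document rule; the sketch only shows how the three steps fit together.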

Keywords

Natural Language Processing, Contracting, Classification

Citation

Villalobos-Ramos, O. A. (2020). Unsupervised Text Classification: a Contractual Risk Detection Approach. Master's degree project, Maestría en Ciencia de Datos. Tlaquepaque, Jalisco: ITESO.