Towards a general cellular frustration approach to anomaly detection

Gonçalves, Rodrigo Faria

Please use this identifier to cite or link to this item: http://hdl.handle.net/10773/34080

Title:	Towards a general cellular frustration approach to anomaly detection
Other Titles:	Desenvolvimentos para a generalização dos modelos de frustração celular para deteção de anomalias
Author:	Gonçalves, Rodrigo Faria
Advisor:	Abreu, Fernão
Keywords:	Cellular frustration; Machine learning; Unsupervised learning; Anomaly detection; One-class support vector machines; Isolation forest; k-means
Defense Date:	10-Dec-2021
Abstract:	Cellular Frustrated Systems (CFSs) model the interactions between presenter and detector agents in order to detect anomalies in data sets. When the presenters display a normal sample, the two types of agents follow a frustrated (i.e., unstable) dynamic in which each agent continuously changes the agent of the other type it is paired with. When the presenters display an abnormal sample, that is, one with one or more abnormal features, the agents stay paired for longer, and the resulting stable pairs signal a detection. This work improves on previous versions of the model by allowing detectors to see two regions of feature space as abnormal, with a normal region in between. K-means clustering is also used to cluster the data, so that detectors can partition the feature space and each be assigned a region that they see as normal, with the rest seen as abnormal. It is shown that there is no need to train separate populations of detectors in order to make detections in data sets that have abnormal samples in between normal ones. The current version of the model is compared with previous versions of the Cellular Frustration model and with two well-known anomaly detection methods, One-Class Support Vector Machines and Isolation Forest. The results show that its performance is comparable to that of the competing methods and that, relative to the previous versions, it achieves the same results while being more robust and applicable to more situations with less effort. Finally, some ideas for future work are discussed that could further improve the model, which still has issues when certain unfavorable conditions arise in data sets.
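The partitioning idea described in the abstract can be illustrated with a minimal sketch. It assumes a single feature, clusters the normal training values with a plain k-means, and treats each cluster's value range as a normal region, so that a value lying between two normal regions is flagged as abnormal. All names here (`kmeans_1d`, `normal_regions`, `is_abnormal`, the `margin` parameter) and the sample values are hypothetical and chosen for illustration. This is not the thesis's implementation, and it does not model the frustrated pairing dynamics between presenters and detectors.

```python
from typing import List, Tuple


def nearest(value: float, centroids: List[float]) -> int:
    """Return the index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))


def kmeans_1d(values: List[float], k: int, iters: int = 50) -> List[float]:
    """Plain Lloyd's k-means on one feature (illustrative, not the thesis code)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the centroids over the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centroids)].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(g) / len(g) if g else centroids[j] for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def normal_regions(values: List[float], centroids: List[float],
                   margin: float) -> List[Tuple[float, float]]:
    """One interval per non-empty cluster: its value range widened by margin (assumed rule)."""
    regions = []
    for j in range(len(centroids)):
        members = [v for v in values if nearest(v, centroids) == j]
        if members:
            regions.append((min(members) - margin, max(members) + margin))
    return regions


def is_abnormal(value: float, regions: List[Tuple[float, float]]) -> bool:
    """A value is abnormal when it falls outside every normal region."""
    return not any(low <= value <= high for low, high in regions)


# Hypothetical normal samples forming two groups with a gap in between.
normal_samples = [0.9, 1.0, 1.1, 1.2, 4.8, 5.0, 5.1, 5.3]
centroids = kmeans_1d(normal_samples, k=2)
regions = normal_regions(normal_samples, centroids, margin=0.1)
flags = [is_abnormal(x, regions) for x in (1.0, 3.0, 5.0, 9.0)]
print(flags)  # [False, True, False, True]
```

In this sketch the value 3.0 lies between the two normal groups and is flagged, which is the "abnormal samples in between normal ones" situation mentioned in the abstract.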
URI:	http://hdl.handle.net/10773/34080
Appears in Collections:	UA - Dissertações de mestrado; DETI - Dissertações de mestrado; DFis - Dissertações de mestrado; DMat - Dissertações de mestrado

Files in This Item:

File	Description	Size	Format
Documento_Rodrigo_Gonçalves.pdf		18.37 MB	Adobe PDF
