This fraud detection framework is expected to learn various attack types across different kinds of online social networks (OSNs) and then recommend actions based on the attack type detected. It should be able to learn these attacks on any network that is plugged into it.
This component will be responsible for continuously collecting OSN data, either by manual crawling or through an OSN API, and feeding it into the Content Filter.
The Content Filter will parse the OSN data into three separate parts, as shown in Figure 1: network topology information as a graph object, images, and a collection of content sequences. For example, it will search for all images posted by users in the OSN data and feed them to the image-based detection component, which leverages a deep convolutional neural network. The Content Filter will also be responsible for supplying the ground-truth labels used for training and testing each detection technique. These labels can be supplied by humans at the beginning.
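The three-way split above can be sketched as a routing step over raw OSN records. The record shape, field names, and the optional human-supplied label field below are assumptions for illustration, not a fixed schema:

```python
# Hypothetical sketch of the Content Filter's routing step. Field names
# ("user_id", "friends", "images", "posts", "label") are assumed here.
def split_osn_record(record):
    """Split one raw OSN record into the three streams of Figure 1."""
    graph_edges = [(record["user_id"], f) for f in record.get("friends", [])]
    images = record.get("images", [])   # routed to image-based detection
    posts = record.get("posts", [])     # routed to sequence-based detection
    label = record.get("label")         # optional human-supplied ground truth
    return graph_edges, images, posts, label

edges, imgs, seq, y = split_osn_record(
    {"user_id": "u1", "friends": ["u2", "u3"], "posts": ["hello"], "label": 0}
)
# edges == [("u1", "u2"), ("u1", "u3")], seq == ["hello"], y == 0
```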
Static Attack / Fraud Detection using Deep Autoencoder
This component will take a single time snapshot of the OSN graph as input. It will extract spectral features (eigenvalues and eigenvectors) and construct training features from each node's spectral features together with those of its neighborhood. It will then perform representation learning on these training features using a deep autoencoder and feed the result to a supervised classifier, which should be able to classify each node in the graph as either attacking or benign. This is discussed in my Project 14. Previous work has shown that spectral features and neighborhood information can be leveraged to identify attacking nodes, but it has not combined them with the power of a deep autoencoder. Combining these two techniques should make the detection more powerful and automated.
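A minimal sketch of the spectral feature-construction step, assuming an undirected graph given as an adjacency matrix; the choice of top-k eigenvectors and neighborhood averaging is one plausible reading, and the autoencoder itself is omitted:

```python
import numpy as np

def spectral_node_features(A, k=2):
    """Per-node features: entries of the top-k eigenvectors (by eigenvalue
    magnitude) plus the mean of those entries over each node's neighbors."""
    vals, vecs = np.linalg.eigh(A)                 # symmetric -> real spectrum
    top = vecs[:, np.argsort(-np.abs(vals))[:k]]   # n x k node embedding
    deg = A.sum(axis=1, keepdims=True)
    neigh = (A @ top) / np.maximum(deg, 1)         # neighborhood average
    return np.hstack([top, neigh])                 # n x 2k training features

# Tiny 3-node star graph: node 0 connected to nodes 1 and 2.
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
X = spectral_node_features(A, k=2)                 # shape (3, 4)
```

These per-node feature rows would then be fed to the deep autoencoder for representation learning.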
Image-based Attack / Fraud Detection using Deep Convolution Neural Network
This component will take images as input and leverage a deep convolutional neural network to detect offensive images, hoax images, and other kinds of images related to an attack or fraud. For example, we can introduce patterns of offensive images as training data; the component will then learn to recognize similar images in the OSN data and classify them as offensive.
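For illustration only, the basic building block such a detector stacks is a convolution followed by a nonlinearity; the fixed filter below is a stand-in for the many filters a real deep network would learn from training data:

```python
import numpy as np

def conv2d_relu(image, kernel):
    """Single valid-mode 2D convolution layer followed by ReLU."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0)                  # ReLU activation

img = np.arange(16, dtype=float).reshape(4, 4)
edge = np.array([[-1.0, 1.0]])                 # responds to rising intensity
fmap = conv2d_relu(img, edge)                  # shape (4, 3), each entry 1.0
```

A deep CNN chains many such layers (with learned kernels and pooling) before a final classification layer that outputs labels such as "offensive" or "benign".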
Figure 1: The Architecture of the Fraud Detection Framework
Dynamic Attack / Fraud Detection using Deep Recurrent Neural Network
This component will take the collection of users' content-posting sequences as input and leverage a deep recurrent neural network to detect dynamic attacks that occur in multiple phases over time. For example, one data vector will contain the sequence of pages/posts for one particular user. An attacking node that tries to resemble a regular node at the beginning and then launches the actual attack after gaining enough connections should be detected by this technique. Given enough training data for these dynamic attack patterns, the deep recurrent network should be able to identify similar attacking patterns.
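The recurrent step can be sketched as a plain RNN cell run over one user's posting sequence, producing a single attack score per user. The weights below are random for illustration; a real detector would learn them from labeled dynamic-attack sequences:

```python
import numpy as np

def rnn_score(seq, Wx, Wh, Wo):
    """Run a simple RNN over a sequence of post feature vectors and
    return a sigmoid attack probability for the whole sequence."""
    h = np.zeros(Wh.shape[0])
    for x in seq:                          # one step per post in the sequence
        h = np.tanh(Wx @ x + Wh @ h)       # hidden state carries history
    return 1 / (1 + np.exp(-(Wo @ h)))     # attack probability in (0, 1)

rng = np.random.default_rng(0)
Wx = rng.normal(size=(4, 3))               # input-to-hidden weights
Wh = rng.normal(size=(4, 4))               # hidden-to-hidden weights
Wo = rng.normal(size=4)                    # hidden-to-output weights
p = rnn_score([rng.normal(size=3) for _ in range(5)], Wx, Wh, Wo)
```

Because the hidden state accumulates the whole history, a benign-looking prefix followed by attack-phase posts can still push the final score high, which is exactly the multi-phase pattern described above.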
Decision Maker and Actions
All three detection techniques will feed their classification results to the Decision Maker component, which will act as a classifier or mapper that recommends a relevant action based on those results. For example, a bad edit on Wikipedia should be reverted; an account that posted offensive images should be banned, blocked, or suspended from the network. Once a good detection rate is achieved, this framework can be integrated into the social network provider's system as an automatic administrator (effectively a good bot) to minimize the workload of human administrators.
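In its simplest form, the Decision Maker is a mapping from detector verdicts to recommended actions. The verdict labels and actions below are illustrative assumptions, not a fixed policy:

```python
# Hypothetical verdict-to-action table; a learned classifier could
# replace this lookup once enough feedback data is available.
ACTIONS = {
    "bad_edit": "revert edit",
    "offensive_image": "suspend account",
    "dynamic_attack": "flag for review",
}

def recommend(static_label, image_label, dynamic_label):
    """Combine the three detectors' outputs into action recommendations."""
    verdicts = [static_label, image_label, dynamic_label]
    return [ACTIONS[v] for v in verdicts if v in ACTIONS]

acts = recommend("benign", "offensive_image", "dynamic_attack")
# acts == ["suspend account", "flag for review"]
```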
In addition to an initial performance review by humans, all three detection techniques will continuously improve their detection rates by testing on the next stream of OSN data. As the OSN content evolves, the framework will receive feedback ("blame" terms) from testing on that next stream and use it to improve detection; this feedback is analogous to the backpropagation technique.
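One lightweight way to picture this feedback loop, assuming corrected labels arrive for past predictions, is nudging a decision threshold whenever feedback says a prediction was wrong; this is a stand-in for full retraining, not the framework's stated mechanism:

```python
def update_threshold(threshold, score, true_label, lr=0.05):
    """Adjust a decision threshold from one piece of feedback ("blame")."""
    predicted = int(score >= threshold)
    if predicted != true_label:            # feedback says we were wrong
        # raise threshold on false positives, lower it on false negatives
        threshold += lr if true_label == 0 else -lr
    return threshold

t = 0.5
t = update_threshold(t, score=0.7, true_label=0)   # false positive -> 0.55
```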