CROWD ABNORMAL BEHAVIOUR DETECTION USING DEEP LEARNING

Crowd analysis has become a very popular research topic in the field of computer vision. Automated analysis of crowd activities using surveillance videos is a significant issue for public security, since it allows the detection of dangerous crowds and where they are heading. Crowds create many security problems. Terrorists plant bombs in crowded areas, causing many injuries, and thieves operate mostly in crowded areas, where they can easily take advantage of the crowd. In such situations, crowd analysis is very important. This paper presents the design of a deep learning architecture for monitoring crowd behavior, which helps to prevent violence and other crowd-related acts that harm society. We therefore propose a system that detects abnormal crowd behavior using deep learning techniques.


Introduction
A crowd is a large number of individuals gathered in a specific area, often in a scattered or uncontrolled manner. Crowds differ from one setting to another; for example, the crowd in a temple differs from the crowd in a shopping area. The context in which the word 'crowd' is used indicates the kind of gathering in terms of size, duration, composition, motivation, cohesion, and proximity of individuals.
Nowadays, to ensure public security, low-cost video surveillance systems have been widely deployed in public places such as airports, metro stations, traffic intersections, and schools. However, most current video surveillance systems are used only as recording systems: they cannot detect and analyze an abnormal event automatically, and it is impossible for a person to watch the screen continuously [1]. Since monitoring equipment is usually installed in public places, abnormal event detection in crowds has become essential and is a new area of interest for the intelligent surveillance research community; the results can be used for public security purposes [2].
Moreover, traditional methods for individual behavior analysis are not suitable for crowded scenes because of occlusion. Video surveillance in crowded areas is becoming increasingly important for public security. Analyzing crowd behavior is crucial so that a critical situation can be controlled before it becomes more serious. Videos of crowded scenes present challenging problems in computer vision [3]. Video surveillance plays an essential role in today's society in ensuring security in areas where large numbers of people are expected to gather. The rise in security threats has led to the use of surveillance systems in almost every area. For example, CCTV cameras are installed in streets, shopping complexes, temples, stadiums, and similar places to ensure security. Crowd behavior analysis helps in many ways: to better understand crowd dynamics and the related behavior of individuals, to develop surveillance or crowd control systems, to design and organize public spaces, and to improve the computer animation models used in video games or special effects. The structure of society has become more complex than it used to be, and people have become intolerant of the sensitive issues that persist in society [4]. This has led to more instances of public violence. Moreover, weak security measures have made it easier for violence to break out in crowds. The current practice is to manually monitor video feeds from several sources. Video analytics allows the automatic detection of events of interest, but it faces many difficulties because of non-rigid crowd movements and occlusions. Algorithms developed for rigid objects are ineffective in dealing with crowds.
The aim of this work is to develop a system that classifies normal and abnormal crowd behavior, using a real-time video surveillance system and machine learning techniques to monitor crowded urban environments [5]. Such a system helps to rein in crime, because the recorded footage is substantial evidence against a criminal and because crowd analysis helps security agencies to stop criminal crowd activities such as riots.

Review of Literature
Numerous works on crowd behavior analysis using different machine learning techniques have been carried out over several years. This section reviews the existing systems and the techniques they use, covering the datasets, algorithms, and methods used by the authors, the observed results, and the future scope, with the aim of finding efficient methods for abnormal crowd behavior detection in different types of crowded scenes.
Lazaros Lazaridis [1] proposed new techniques for the detection and characterization of abnormal behavior in dense crowds, for example an approach for creating crowd density heat maps and extracting the related optical flow. However, abnormal behavior detection should cover different weather conditions, lighting conditions, and camera angles, and it is difficult to gather real data with abnormal behavior in dense crowds, which limits the accuracy of the results. Vishwa Patel [2] used an object-based behavior recognition method, which relies on information about the individuals who make up the crowd and is challenging in denser situations; it is suitable only for low and moderately crowded scenes. They therefore also used a second, holistic approach, which treats the crowd as a single entity so that tracking is not an issue. This second approach is more suitable for denser scenes, but it cannot detect abnormal behavior effectively.
Krutika Rohit [6] presents a behavior label distribution method that can be used to solve problems of mixed behavior, and uses it to achieve public security and safety. However, this technique involves many steps, such as assigning sequence numbers and labels to every behavior and then analyzing the abnormal behavior. The process is time consuming and the system is not real time, so it is not efficient enough to handle critical situations.
Swathi H Y [7] discussed the various techniques used for recognizing abnormal crowd behavior, namely crowd density estimation, tracking, motion detection, and behavior recognition and understanding. They also examined the issues in crowd analysis and the solutions to these issues. This work is their first step in the study of crowd behavior analysis, and several issues remain regarding the efficiency, effectiveness, and robustness of automated surveillance systems for crowd analysis. Dongping Zhang [8] proposed a new technique for crowd motion behavior detection in which SIFT flow is used to extract motion information from two adjacent frames in the video sequence; the weighted orientation histogram, which serves as a statistical measure of the SIFT flow, is then taken as the input to an HMM.
Marc Van Droogenbroeck [9] presented a universal sample-based background subtraction algorithm, called ViBe, which combines three innovative mechanisms: a classification model, a background pixel model, and a neighboring pixel model.
The authors of [10] used a novel framework for representing video data by a set of general features, which are inferred automatically from long video footage through a deep learning approach. Specifically, a deep neural network composed of a stack of convolutional autoencoders was used to process video frames and capture spatial structures in the data, which, grouped together, compose the video representation. This representation is fed into a stack of convolutional temporal autoencoders to learn the regular temporal patterns. In this neural network, all inputs and outputs are independent of one another.
R. Mehran [11] proposed a new algorithm for the detection and localization of anomalies in videos of crowded scenes that uses the interaction force of the social force model (SFM) in combination with particle swarm optimization. The novelty of this work lies in optimizing the social force and performing particle advection to obtain an improved interaction force according to the underlying optical flow field. The entire anomaly detection and localization process is carried out without any learning phase, which implies that the proposed method is well suited to real-world scenarios.
Weiming Hu [12] reviewed recent developments in visual surveillance within a general processing framework for visual surveillance systems. The state of the art of existing methods for each key issue is described, with emphasis on the following tasks: detection, tracking, understanding and description of behaviors, personal identification for visual surveillance, and interactive surveillance using multiple cameras. The detection of moving objects involves environment modeling, motion segmentation, and object classification. Three techniques for motion segmentation are addressed: background subtraction, temporal differencing, and optical flow.
Nevertheless, there is still a need to detect abnormal crowd activity at an early stage, so that the problems caused by crowds can be prevented. This motivates us to build a system that detects abnormal crowd behavior.

Proposed Methodology
The proposed work consists of designing and implementing a system that addresses the shortcomings of the existing systems. The work also includes a comprehensive evaluation of the system, and aims to provide efficient security to urban areas by building an unsupervised abnormal crowd behavior detection system. We developed a system that classifies normal and abnormal crowd behavior using a real-time video surveillance system and a CNN algorithm to monitor crowded urban environments.

Algorithms
Algorithms used to build the proposed system are described as follows:

ViBe
ViBe is a universal sample-based background subtraction algorithm. Background subtraction is the process of separating foreground objects from the background in a sequence of video frames, and it is a widely used approach for detecting moving objects from static cameras [9]. ViBe differs from approaches based on the classical belief that the oldest values in the background model should be replaced first. Finally, when a pixel is found to be part of the background, its value is propagated into the background model of a neighboring pixel. The authors of ViBe describe the method in full detail and compare it with other background subtraction techniques [13]. Their efficiency figures show that the method outperforms recent and proven state-of-the-art methods in terms of both computation speed and detection rate. They also analyze the performance of a downscaled version of the algorithm, reduced to the absolute minimum of one comparison and one byte of memory per pixel; even this simplified version appears to perform better than mainstream techniques.
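The sample-based idea described above can be illustrated with a minimal sketch. This is not the reference ViBe implementation and not necessarily the configuration used in this work: the class name `ViBeSketch`, the parameters `n_samples`, `radius`, `min_matches`, and `subsample`, their default values, and the noise-based initialisation are all assumptions chosen for illustration, and the sketch handles grayscale frames only.

```python
import numpy as np

class ViBeSketch:
    """Simplified grayscale ViBe-style background model (illustrative only)."""

    def __init__(self, first_frame, n_samples=20, radius=20, min_matches=2,
                 subsample=16, seed=0):
        self.rng = np.random.default_rng(seed)
        self.radius = radius            # max intensity distance for a match
        self.min_matches = min_matches  # matches needed to call a pixel background
        self.subsample = subsample      # a background pixel updates with prob. 1/subsample
        frame = first_frame.astype(np.int16)
        # Assumed initialisation: first frame plus small random noise per sample.
        noise = self.rng.integers(-10, 11, size=(n_samples,) + frame.shape)
        self.samples = np.clip(frame[None] + noise, 0, 255).astype(np.int16)

    def apply(self, frame):
        """Return a foreground mask (255 = foreground) and update the model."""
        frame = frame.astype(np.int16)
        n, h, w = self.samples.shape
        # Classification: count stored samples close to the current value.
        matches = (np.abs(self.samples - frame[None]) < self.radius).sum(axis=0)
        background = matches >= self.min_matches
        # Random replacement: a background pixel overwrites a randomly chosen
        # sample of its own model (not the oldest one).
        update = background & (self.rng.integers(0, self.subsample, size=(h, w)) == 0)
        ys, xs = np.nonzero(update)
        idx = self.rng.integers(0, n, size=ys.size)
        self.samples[idx, ys, xs] = frame[ys, xs]
        # Spatial propagation: a background pixel also writes its value into the
        # model of a random position in its 3x3 neighbourhood (may be itself).
        spread = background & (self.rng.integers(0, self.subsample, size=(h, w)) == 0)
        ys, xs = np.nonzero(spread)
        ny = np.clip(ys + self.rng.integers(-1, 2, size=ys.size), 0, h - 1)
        nx = np.clip(xs + self.rng.integers(-1, 2, size=xs.size), 0, w - 1)
        idx = self.rng.integers(0, n, size=ys.size)
        self.samples[idx, ny, nx] = frame[ys, xs]
        return (~background).astype(np.uint8) * 255
```

In use, the model would be created from the first frame of a video and `apply` called on every following frame; the returned mask marks the moving pixels that later stages can process.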

Convolutional Neural Network (CNN)
Convolutional neural networks are most useful with very large data sets, large numbers of features, and complex classification tasks. A Convolutional Neural Network (CNN) is a type of neural network used in deep learning that comprises one or more convolutional layers. The CNN algorithm is designed to take images as input and train the system to give predictions. It builds features from groups of pixels. CNNs are easier to train and have far fewer parameters than fully connected networks with the same number of hidden units. A convolutional neural network consists of an input layer, an output layer, and hidden layers.
A neural network consists of several different layers: an input layer, at least one hidden layer, and an output layer. CNNs are best used in object detection for recognizing patterns such as edges, shapes, colors, and textures. In this type of network the hidden layers are convolutional layers, which act like filters that first receive input, transform it using a specific pattern or feature, and send it to the next layer. With more convolutional layers, each time new data is passed to the next convolutional layer, it is transformed in a different way. For example, the filter in the first convolutional layer may detect a shape or color in a region, the next one may be able to infer what the object actually is, and the last convolutional layer may classify the object as a dog. In general, the more layers the input passes through, the more sophisticated the patterns that the later layers can detect.
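To make the notion of a convolutional layer acting as a filter concrete, the following sketch applies one hand-written vertical-edge filter to a toy image, followed by a ReLU activation and max pooling. The function names `conv2d`, `relu`, and `max_pool`, the toy image, and the filter weights are illustrative assumptions; in a trained CNN the filter weights are learned from data rather than written by hand.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation, the operation used in CNN layers."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Keep positive responses and set negative ones to zero."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    h, w = x.shape[0] // size * size, x.shape[1] // size * size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Toy 6x6 image: dark left half, bright right half (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written vertical-edge filter (assumed weights, for illustration only).
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

response = conv2d(image, kernel)        # strong response where the edge lies
feature_map = max_pool(relu(response))  # smaller map that keeps the strongest responses
```

The filter responds only in the columns where the intensity changes, which is the sense in which an early layer "detects" an edge; stacking further layers on `feature_map` would combine such responses into more complex patterns.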
For a frame t, the output feature of the 5th convolution layer is a 6×6×256 feature map. In a standard CNN, the high-dimensional convolution features are usually fed to fully connected layers for dimensionality reduction and then input to a classifier. In a fully convolutional network (FCN), however, there are no fully connected layers, and we are not training for a classification task. We therefore discard a heavy classifier, such as an SVM, and use a novel strategy for dimensionality reduction as well as abnormality detection.
We encode the high-dimensional convolution features as a set of binary codes by iterative quantization (ITQ) [13]. ITQ is an efficient unsupervised method for learning binary codes in large-scale image collections. ITQ first reduces the dimensionality of the data by PCA, and then finds a rotation of the zero-centered data that minimizes the quantization error of mapping the data to binary codes. Training ITQ costs less than training a classifier, and the detection task becomes simpler when it is based on the binary codes.
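The two ITQ steps described above (PCA, then a rotation that minimizes quantization error) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this work: the function names `itq_train` and `itq_encode`, the iteration count, and the random orthogonal initialisation are assumptions. The rotation update alternates between fixing the binary codes and solving an orthogonal Procrustes problem with an SVD.

```python
import numpy as np

def itq_train(features, n_bits, n_iter=50, seed=0):
    """Learn the mean, PCA projection, and rotation for ITQ binary codes."""
    rng = np.random.default_rng(seed)
    mean = features.mean(axis=0)
    centered = features - mean                      # zero-centre the data
    # PCA: keep the top n_bits principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pca = vt[:n_bits].T                             # shape (dim, n_bits)
    v = centered @ pca                              # projected data
    # Start from a random orthogonal rotation.
    rotation, _ = np.linalg.qr(rng.standard_normal((n_bits, n_bits)))
    for _ in range(n_iter):
        # Fix the rotation, update the codes: B = sign(V R).
        b = np.where(v @ rotation >= 0, 1.0, -1.0)
        # Fix the codes, update the rotation: minimise ||B - V R||_F
        # over orthogonal R (orthogonal Procrustes problem).
        u, _, wt = np.linalg.svd(v.T @ b)
        rotation = u @ wt
    return mean, pca, rotation

def itq_encode(features, mean, pca, rotation):
    """Map features to binary codes with entries in {0, 1}."""
    return ((features - mean) @ pca @ rotation >= 0).astype(np.uint8)
```

With codes of this form, two feature vectors can be compared by the Hamming distance between their codes, which is much cheaper than comparing the original high-dimensional features.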

K-Means
The K-Means clustering algorithm is an unsupervised algorithm, used here to segment the region of interest from the background. It clusters, or partitions, the given data into K clusters based on K centroids. The algorithm is used when the data is unlabeled (i.e., data without defined classes or groups). The goal is to find groups based on similarity in the data, with the number of groups given by K.
The K-Means clustering algorithm is used for motion detection of the objects detected by the CNN. We create clusters of the object pixels, then compute the difference between the positions of each object in successive frames as a Euclidean distance, and on that basis set a threshold value for detecting abnormal crowd behavior.
KNN is used to calculate the distance between points in two successive image frames in order to track the motion of moving objects.
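The clustering and thresholding steps can be sketched as follows. This is an illustrative reading of the description above rather than the exact procedure used in this work: the function names `kmeans`, `centroid_displacements`, and `is_abnormal`, the nearest-centroid matching between frames, and the rule of flagging a frame pair when the largest displacement exceeds the threshold are all assumptions.

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Plain K-Means on an (n, 2) array of pixel coordinates."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = points[labels == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return centroids, labels

def centroid_displacements(prev_centroids, curr_centroids):
    """Distance from each previous centroid to its nearest current centroid."""
    dists = np.linalg.norm(
        prev_centroids[:, None, :] - curr_centroids[None, :, :], axis=2)
    return dists.min(axis=1)

def is_abnormal(prev_points, curr_points, k, threshold):
    """Flag a frame pair when any cluster moved farther than `threshold` pixels."""
    prev_c, _ = kmeans(prev_points, k)
    curr_c, _ = kmeans(curr_points, k)
    return bool(centroid_displacements(prev_c, curr_c).max() > threshold)
```

Here `prev_points` and `curr_points` stand for the coordinates of object pixels in two successive frames; a suitable `threshold` would have to be chosen empirically for the camera setup and frame rate.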

Results
The success of deep learning network training depends on the availability of large datasets. Building large datasets annotated with ground truth is very expensive because of the human time and effort required. A good dataset for detecting abnormal events in crowded scenes should be diverse, covering every possible aspect of the problem, and finely annotated. It should cover different weather conditions, lighting conditions, camera angles, and crowd densities. Furthermore, because of legal and privacy rules, it is extremely hard to gather real data with abnormal behavior in dense crowds. After analyzing some of the best-known datasets related to abnormal crowd behavior, we conclude the following. UMN dataset [15]: this dataset contains crowd panic behavior, but in an unrealistic way. Novel violent [16]: compared with UMN, this dataset is more realistic, but it lacks crowd density.
The proposed network gives better results than the existing system reported in the literature. Table-I [1] is the confusion matrix of the existing system, which uses density heat maps and optical flow for abnormal behavior detection in crowd scenes. Based on this confusion matrix, the accuracy of the existing system is 77.08%. Table-II is the confusion matrix of the proposed system, which is based on a convolutional neural network and the K-Means algorithm. Based on this confusion matrix, the accuracy of the proposed system is 80.79%.