Design and Implementation of SMS Extraction and Analysis System

As smartphone applications multiply, mobile phones receive more and more text messages, and those messages carry more and more information. How can useful information be dug out of SMS? This paper discusses techniques for SMS extraction and analysis. Taking bank SMS as an example, key information is extracted to notify the user, formatted and stored in the APP database, and then analysed, with the statistical results shown in charts. The APP runs well on Android phones and has practical value. This technology helps to expand the applications of SMS.


Introduction
The "intelligence" of the smart phone greatly extends the functionality of the phone [1]. The phone is no longer just a calling tool; it has become a good assistant and an indispensable tool in people's work, life, entertainment, communication, and transaction processing. Meanwhile, SMS (Short Message Service) is no longer merely a text communication tool. As one of the guarantees of the real-name system, SMS has gradually become a tool for receiving verification information, receiving notifications, and confirming transactions. With the development of fast payment methods such as online banking, PayPal, and WeChat Pay, people are more inclined to complete payments quickly and easily on their phones. SMS plays an important role throughout this process by providing timely notification and preserving transaction records. It is therefore clear that SMS-based processing and statistical analysis are significant and promising.

Business Process
Data is extracted automatically in the background. When the client first starts, the SMS inbox is scanned automatically, and key information, including transaction time, transaction amount, transaction type, and the bank involved, is extracted from the message text by regular expressions and recorded automatically in the APP database. When a new message arrives, its key information is likewise extracted by regular expressions [2], condensed, and presented as a notification. The business process is shown in Figure 1.

User Guide
The main function of this module is to present the main functions and characteristics of the APP on first use. The module aims for an elegant design with natural, smooth transition animations, leaving a good impression on users while hiding the database processing that runs in the background.

Authentication
The main function of this module is to protect personal privacy [3]. Because the APP handles important information such as bank card details, which falls under personal financial privacy, two authentication mechanisms are applied before the APP can be entered: PIN code validation and fingerprint authentication.
PIN code validation: a 4-digit PIN code, set by the user, is used as the authentication information.
Fingerprint authentication: fingerprints are used as the authentication scheme. This is a system-level verification; the key information is accessed within the TrustZone hardware, which guarantees security.
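The PIN code check can be sketched in plain Java as follows. This is a minimal sketch under assumptions that do not come from the system described here: the class name `PinVerifier` is hypothetical, and the PIN is assumed to be kept as a salted SHA-256 hash rather than as plain text. The paper does not state how the PIN is actually stored.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Hypothetical helper: validates and verifies a user-chosen 4-digit PIN.
final class PinVerifier {
    private final byte[] salt = new byte[16];
    private final byte[] storedHash;

    // Called when the user sets the PIN; only the salted hash is kept.
    PinVerifier(String pin) {
        if (!isValidFormat(pin)) {
            throw new IllegalArgumentException("PIN must be exactly 4 digits");
        }
        new SecureRandom().nextBytes(salt);
        this.storedHash = hash(pin, salt);
    }

    // A PIN is well-formed when it consists of exactly four ASCII digits.
    static boolean isValidFormat(String pin) {
        return pin != null && pin.matches("[0-9]{4}");
    }

    // Compares the candidate against the stored hash in constant time.
    boolean verify(String candidate) {
        if (!isValidFormat(candidate)) {
            return false;
        }
        return MessageDigest.isEqual(storedHash, hash(candidate, salt));
    }

    private static byte[] hash(String pin, byte[] salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(salt);
            return md.digest(pin.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
    }
}
```

Because a 4-digit PIN has only 10,000 possible values, hashing alone offers limited protection; an attempt limit or hardware-backed key storage would likely be needed in practice.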

Data Processing
Data extraction occurs at two moments: the first start of the application, and the arrival of a new message after startup. In the first case, the inbox database is scanned. In both cases, data are acquired by regular expression matching, then formatted and stored in the APP database for analysis.
The database processing flow for the first case is shown in Figure 1.

Regular Expression
The main function of this module is to obtain key information using regular expressions. A regular expression [4] (abbreviated as regex, regexp, or RE) is a set of syntax rules, written as a single string, that describes and matches a family of strings. In many text editors, regular expressions are typically used to retrieve or replace text that fits a pattern.
Different types of message require different regular expressions [5]. Some of the regular expressions, their functions, and their matching results are shown in Table 1. The key information differs by message type; bank and express (courier) messages are used as examples.
Bank SMS: the bank name, bank card number, transaction amount, transaction time, and transaction details are required. Some of this information must be formatted before it can be shown in charts.
Express SMS: the courier name, delivery time, and pickup information are the key information needed for notification.
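The extraction of bank SMS fields by regular expression can be illustrated with a short Java sketch. The message layout, the pattern, and the names `BankSmsParser` and `Transaction` are all assumptions made for illustration; they are not the expressions listed in Table 1, and real bank messages vary by bank and language.

```java
import java.math.BigDecimal;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for an assumed message layout such as:
// "[ABC Bank] Card ending 1234 spent 256.80 CNY on 2016-03-15 14:30."
final class BankSmsParser {

    // Key fields extracted from one bank message.
    static final class Transaction {
        final String bank;
        final String cardTail;
        final String type;       // "spent" or "received"
        final BigDecimal amount;
        final String time;

        Transaction(String bank, String cardTail, String type,
                    BigDecimal amount, String time) {
            this.bank = bank;
            this.cardTail = cardTail;
            this.type = type;
            this.amount = amount;
            this.time = time;
        }
    }

    // Assumed pattern; each named group captures one key field.
    private static final Pattern BANK_SMS = Pattern.compile(
        "\\[(?<bank>[^\\]]+)\\]\\s+Card ending (?<card>\\d{4}) "
        + "(?<type>spent|received) (?<amount>\\d+(?:\\.\\d{1,2})?) CNY "
        + "on (?<time>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2})");

    // Returns the extracted fields, or empty when the text does not match.
    static Optional<Transaction> parse(String body) {
        Matcher m = BANK_SMS.matcher(body);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Transaction(
            m.group("bank"),
            m.group("card"),
            m.group("type"),
            new BigDecimal(m.group("amount")),
            m.group("time")));
    }
}
```

Parsing the amount into a `BigDecimal` at extraction time is one way to satisfy the formatting requirement for charting, since monetary sums then avoid floating-point rounding.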

SMS Broadcast Receiver
The main function of this module is to receive the SMS broadcast and analyze the sender and text of each new message the phone receives. If the message is from a bank or an express company, the module calls the regular expression processing module to extract the key information, calls the notification module to notify the user, and at the same time writes the key information into the database for storage. The business flow chart is shown in Figure 3.
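The receive-and-dispatch flow described above can be sketched in plain Java, independent of the Android framework. Everything named here is hypothetical: `SmsDispatcher`, the `Kind` enum, and the keyword-based classification are illustrative assumptions, and the extractor, notifier, and store are injected as functions standing in for the regular expression, notification, and database modules. In an Android app this logic would presumably be called from a `BroadcastReceiver` handling the SMS-received broadcast.

```java
import java.util.Locale;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical dispatcher: classify a new message, then extract, notify, store.
final class SmsDispatcher {
    enum Kind { BANK, EXPRESS, OTHER }

    private final Function<String, String> extractor;   // stands in for the regex module
    private final Consumer<String> notifier;            // stands in for the notification module
    private final BiConsumer<Kind, String> store;       // stands in for the database write

    SmsDispatcher(Function<String, String> extractor,
                  Consumer<String> notifier,
                  BiConsumer<Kind, String> store) {
        this.extractor = extractor;
        this.notifier = notifier;
        this.store = store;
    }

    // Assumed rule: a keyword in the sender or body decides the message kind.
    static Kind classify(String sender, String body) {
        String text = (sender + " " + body).toLowerCase(Locale.ROOT);
        if (text.contains("bank")) {
            return Kind.BANK;
        }
        if (text.contains("express") || text.contains("courier")) {
            return Kind.EXPRESS;
        }
        return Kind.OTHER;
    }

    // Returns true when the message was handled as a bank or express message.
    boolean onMessage(String sender, String body) {
        Kind kind = classify(sender, body);
        if (kind == Kind.OTHER) {
            return false;            // unrelated messages are ignored
        }
        String keyInfo = extractor.apply(body);
        notifier.accept(keyInfo);    // notify the user first
        store.accept(kind, keyInfo); // then persist for later analysis
        return true;
    }
}
```

Injecting the three collaborators keeps the dispatch rule testable without a device, which is one possible design choice rather than the structure used in the APP.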

SMS Summary
The main function of this module is to show the information in the database. Using a CardView and RecyclerView layout, only the key information is shown. Data is loaded incrementally: 20 items are loaded when the interface is initialized, and, together with pull-down refresh and pull-up loading, more data is loaded when the corresponding action is triggered, which reduces page rendering time. Refresh and loading animations are added to keep the operation intuitive.
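The incremental loading strategy can be sketched as a small paging helper in plain Java. The class `PagedLoader` and its method names are hypothetical; the only value taken from the description above is the page size of 20. In the APP, the visible list would presumably back a RecyclerView adapter, with `loadMore` triggered by pull-up loading and `refresh` by pull-down refresh.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical paging helper: exposes the data 20 items at a time.
final class PagedLoader<T> {
    static final int PAGE_SIZE = 20;

    private final List<T> source;                        // all rows, e.g. read from the database
    private final List<T> visible = new ArrayList<>();   // rows currently shown

    PagedLoader(List<T> source) {
        this.source = new ArrayList<>(source);
    }

    // Appends the next page to the visible list; returns how many items were added.
    int loadMore() {
        int from = visible.size();
        int to = Math.min(from + PAGE_SIZE, source.size());
        visible.addAll(source.subList(from, to));
        return to - from;
    }

    // Resets the visible list to the first page.
    void refresh() {
        visible.clear();
        loadMore();
    }

    // Read-only view of the items currently shown.
    List<T> visibleItems() {
        return Collections.unmodifiableList(visible);
    }
}
```

With 45 stored messages, for example, successive calls to `loadMore` add 20, 20, and 5 items, after which further calls add nothing.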

Statistic Chart
Charts visually display the user's income and expenditure. Daily expenses and income are computed from the data stored in the APP database, and the results are presented by month. Expenditure and income are drawn below and above the horizontal axis respectively, so that they can be understood at a glance. The monthly income and expenditure totals are also given as text.
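The monthly aggregation behind the chart can be sketched in plain Java. The names `MonthlyStats`, `Entry`, and `Totals` are hypothetical, and the sign convention is an assumption chosen to mirror the chart: a positive amount is income (drawn above the axis) and a negative amount is expenditure (drawn below it).

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical aggregator: sums income and expenditure per calendar month.
final class MonthlyStats {

    // One stored transaction; the sign of the amount encodes its direction.
    static final class Entry {
        final LocalDate date;
        final BigDecimal signedAmount;

        Entry(LocalDate date, BigDecimal signedAmount) {
            this.date = date;
            this.signedAmount = signedAmount;
        }
    }

    // Totals for one month, both kept as non-negative values.
    static final class Totals {
        BigDecimal income = BigDecimal.ZERO;
        BigDecimal expense = BigDecimal.ZERO;
    }

    // Groups entries by month; the TreeMap keeps months in chronological order.
    static Map<YearMonth, Totals> summarize(List<Entry> entries) {
        Map<YearMonth, Totals> byMonth = new TreeMap<>();
        for (Entry e : entries) {
            Totals t = byMonth.computeIfAbsent(YearMonth.from(e.date), k -> new Totals());
            if (e.signedAmount.signum() >= 0) {
                t.income = t.income.add(e.signedAmount);
            } else {
                t.expense = t.expense.add(e.signedAmount.negate());
            }
        }
        return byMonth;
    }
}
```

Each map entry then supplies one pair of bars for the chart and the figures for the accompanying monthly text summary.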

QR Code Scanning
The main function of this module is to read the courier tracking number. It also provides a reliable interface for adding further extensions to the APP.

Guide Interface
The interface uses a light background color with flat schematic illustrations. It introduces the APP's features while the database scan completes in the background. The running effect is shown in Figure 4.

Authentication Interface
The interface provides two authentication methods, fingerprint verification and PIN code verification, shown in Figure 5. In Figure 5, the left screen shows a fingerprint verification error, the center screen shows successful fingerprint authentication, and the right screen shows the PIN code entry interface.

SMS Summary
The interface uses a card layout, shown in Figure 6: the left screen shows the normal interface, the center shows pull-down refresh, and the right shows pull-up loading with the ToolBar dynamically hidden. The key code is as follows:

Conclusion
This system was developed on the Android [6] platform with the Java language and an MVC [7] architecture. The system implements complete text extraction, storage, analysis, and graphical display of the statistical results. The interface follows Google's official Material Design style, with good visual effect and interactive experience. Using regular expressions to extract text information gives the system high flexibility and extensibility.

Figure 3. Receiving and Processing of SMS

Figure 4. Guide Interface

Table 1. Examples of Regular Expressions