Business Meeting Summary Generation Using NLP

The effort of providing a brief and fluent summary of a business meeting while preserving vital information content and overall meaning is known as business meeting summarization. In the proposed work, we first convert recorded meetings, interviews, speeches, lectures, and other audio streams into text documents. The speaker text is then divided into segments, and the Minutes of the Meeting are generated automatically. We also focus on using the Rev-AI Speech-to-Text API to improve the speech-to-text conversion of a recorded audio file, while ensuring that the summarized text conveys a specific and accurate meaning that is understandable to everyone and covers all of the important points of the recording.


Introduction
Nowadays, most data is distributed through the Internet, which makes summarizing that data essential, and text summarization has become increasingly significant [1]. The earliest work on this topic dates back to 1958. Researchers proposed using word frequency as a numerical measure in this process, which is still valid for certain methods [1]. To save time during lengthy business meetings, it is necessary to summarize them, and text summarization can be used for this purpose. The same techniques can also be used to take users' questions and return answers in the form of results, something Internet service providers have been doing for the past few years. For business meeting summarization, there are essentially two methods:
• Abstractive Summarization: The abstractive approach attempts to produce a summary by interpreting the text with advanced natural language algorithms and building a new, shorter text that conveys the most important information, portions of which may not appear in the original document. This is in contrast to the extractive method, which uses only the sentences found in the original text.
• Extractive Summarization: The extractive approach uses only the sentences found in the original text. It is the approach used in our research.

Motivation
Because of the growing availability of documents, extensive study in the field of text summarization is required. A summary is a text that is created from one or more texts, contains crucial information from the original text, and is no more than half the length of the original text, and frequently much less. The principle of text summarization is followed in "Business Meeting Summarization Using NLP."

* e-mail: ary.jha.rt18@rait.ac.in
** e-mail: sam.tem.rt18@rait.ac.in
*** e-mail: pre.heg.rt18@rait.ac.in
**** e-mail: navin.singhaniya@rait.ac.in

Objective
The task of producing a brief and fluent summary while preserving the important information content and overall meaning is known as business meeting summarization. The goal is to apply summarization to business meetings so that a recorded meeting can be summarized while retaining critical information and ensuring that the summary has the right context and meaning. The specific objectives are to:
• Investigate various business meeting summarization strategies.
• Compare the summaries that are generated.
• Determine the best summary parameters (for example, k in extractive business meeting summarization).
• Identify or make the adjustments needed to scale an algorithm to different languages (if possible).
• Distinguish between the various applications of business meeting summarization.

Organization Of Report
The report is organized as follows:
• Section 1 gives the introduction. It describes the fundamental terms used in this project and motivates the study of the different techniques used in this work. It also outlines the objective and defines the problem addressed by the report. The motivation, objective, and novelty of the project are also covered in this section.
• Section 3 presents the system implementation, in which the algorithms and techniques are explained. The architecture and workflow of the implemented system are discussed in detail, along with a flowchart of the algorithm used in the project.
• Section 4 presents the proposed method for summarization. The flowchart and workflow of the extractive summarization method used in this project are also explained in this section.
• Section 5 lists the applications of the summary generator that we have built.
• Section 6 reports the experimental results of the summary generator that we have built.
• Section 7 contains the output screenshots.
• Section 8 presents the conclusion and the results.

TextRank Algorithm
The TextRank approach is based on PageRank, a popular algorithm for ranking web pages in search results. PageRank builds an m x m matrix in which each cell holds the probability that a user will visit that page, which is 1/(number of unique links in web page wi). The values are then updated iteratively. TextRank works in the same way, which is what makes it useful for summarizing text. It creates an m x m adjacency matrix, where m is the number of sentences in the text. Each sentence mi (where i is an index) is compared to every other sentence mj (where i != j), using cosine similarity or another measure for comparing two sentences. This is how the entire matrix is filled. The entries of each row are then summed to obtain the score for each sentence mi (where i = 1, 2, 3, and so on). A greedy search selects the top k sentences based on this score; these are the sentences needed for the summary. When processing the graph, an adjacency matrix gives a time complexity of O(m^2), whereas an adjacency list reduces it to O(v + e), where v is the number of vertices and e the number of edges [1]. TextRank is effective at identifying the links between sentences, and cosine similarity is simple to apply once the sentences are represented as vectors. An edge in the graph between two sentences indicates that the two are related. As a result, TextRank performs well. The steps involved are:
• ConvertToVector(): Convert each sentence to a vector representation.
• SentencesSimilarities(): Construct the graph with sentences as nodes and compute the edge weights (sentence similarities).
• kTopRanked(): Return the top k sentences for generating the summary.
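
The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes plain word-count vectors and a simple regular-expression tokenizer, it follows the row-sum scoring described in this section rather than an iterative PageRank update, and the function names are snake_case stand-ins for the step names listed above.

```python
import math
import re
from collections import Counter


def convert_to_vector(sentence):
    """Represent a sentence as a word-count vector (assumed representation)."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sentences_similarities(sentences):
    """Fill the m x m matrix of pairwise similarities (diagonal stays 0)."""
    vectors = [convert_to_vector(s) for s in sentences]
    m = len(vectors)
    sim = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            sim[i][j] = sim[j][i] = cosine(vectors[i], vectors[j])
    return sim


def k_top_ranked(sentences, k):
    """Score each sentence by its row sum and return the top k in document order."""
    sim = sentences_similarities(sentences)
    scores = [sum(row) for row in sim]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Returning the selected sentences in their original order is a design choice made here so that the summary reads in the same sequence as the meeting transcript.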

PageRank Algorithm
PageRank is an algorithm used by Google Search to rank web pages in search engine results. It is named after Larry Page, one of Google's founders. PageRank is a metric for determining how important a website's pages are. The algorithm produces a probability distribution that represents the likelihood that a user randomly clicking on links will arrive at a specific page. PageRank can be computed for any large set of documents. Several research publications assume that, at the start of the computation, the probability is divided evenly among all documents in the collection. The computation requires several passes through the collection, referred to as "iterations," to adjust the approximate PageRank values so that they more closely reflect the theoretical true value.
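
The iterative computation described above can be illustrated with a short Python sketch. It is only an illustration: the damping factor of 0.85, the fixed number of iterations, the handling of pages without outgoing links, and the function and parameter names are all assumptions that do not come from this work.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by repeated passes over the collection.

    links maps each page to the list of pages it links to.
    damping and iterations are assumed values chosen for illustration.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    # Start from a distribution divided evenly among all pages.
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = set(links.get(p, []))  # count each unique link once
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank
```

For a small graph such as `{"A": ["B", "C"], "B": ["C"], "C": ["A"]}`, the returned values sum to 1 and page C, which receives links from both A and B, ends up ranked above page B.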

Cosine Distance - Text Similarity Metric
The degree to which two documents are related in terms of context or meaning is measured by text similarity. The cosine similarity metric measures the cosine of the angle between two m-dimensional vectors projected in a multi-dimensional space. For documents represented by non-negative term vectors, the cosine similarity between two documents ranges from 0 to 1 [3]. When the cosine similarity score reaches 1, the two vectors point in the same direction. The lower the value, the less similar the two documents are. The cosine similarity between two non-zero m-dimensional vectors A and B is:
\[ \cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{m} A_i B_i}{\sqrt{\sum_{i=1}^{m} A_i^2} \, \sqrt{\sum_{i=1}^{m} B_i^2}} \]
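
As a small worked example of this metric, the following Python sketch computes the cosine similarity of two numeric vectors. The function name and the choice to return 0 for a zero vector are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention: undefined case treated as no similarity
    return dot / (norm_a * norm_b)
```

For term-count vectors such as `[1, 1, 0]` and `[0, 0, 1]`, which share no terms, the result is 0, while two vectors pointing in the same direction, such as `[1, 2, 3]` and `[2, 4, 6]`, give a value of 1 up to floating-point rounding.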

Extractive Summarization
An extractive summarization method involves extracting key sentences or paragraphs from the original text and compressing them into a shorter text. As the name suggests, extractive summarization focuses on calculating the importance of the sentences in the original text and selecting the most important ones. To understand how a summarization system actually works, we go over three fairly independent tasks that all summarizers must complete:
• Generate an intermediate representation of the input text that expresses the text's key points.
• Score the sentences based on this representation.
• Select a summary composed of a few sentences.

Intermediate Representation
Every summarization system creates an intermediate representation of the text it aims to summarize and uses it to identify essential content. There are two types of approaches, depending on the representation: topic representation and indicator representation. Topic representation techniques convert the text into an intermediate form and interpret the topics discussed in it. In indicator representation approaches, each sentence is described as a list of significant features such as sentence length, position in the document, the presence of certain phrases, and so on.

Sentence Score
After generating the intermediate representation, we assign an importance score to each sentence. In topic representation approaches, the score of a sentence reflects how well the sentence explains some of the most important topics in the text. In most indicator representation methods, the score is computed by aggregating evidence from multiple indicators.
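
To make the indicator-based scoring concrete, the sketch below aggregates three indicators into one score with a weighted sum. Everything specific in it is hypothetical and chosen only for illustration: the feature definitions, the length cap of 20 words, the cue phrases, and the weights are not taken from this work.

```python
def indicator_features(sentence, position, total, cue_phrases):
    """Describe a sentence by a few indicators, each scaled to the range 0..1."""
    text = sentence.lower()
    words = text.split()
    return {
        # Longer sentences score higher, capped at an assumed 20 words.
        "length": min(len(words) / 20.0, 1.0),
        # Earlier sentences score higher (first sentence = 1, last = 0).
        "position": 1.0 - position / max(total - 1, 1),
        # 1 if any assumed cue phrase occurs in the sentence.
        "cue": 1.0 if any(p in text for p in cue_phrases) else 0.0,
    }


def sentence_score(features, weights):
    """Aggregate the evidence from all indicators as a weighted sum."""
    return sum(weights[name] * value for name, value in features.items())
```

A sentence is scored by first calling `indicator_features` with its position in the document and then passing the result to `sentence_score` together with a weight for each indicator; in practice the weights would need to be tuned on data.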

Summary Sentences Selection
To produce a summary, the summarizer algorithm selects the top k most important sentences. Some approaches use greedy algorithms to select important sentences, while others convert sentence selection into an optimization problem in which a set of sentences is chosen so as to maximize overall relevance and coherence while minimizing redundancy. Other factors can also affect the selection. The context in which the summary is generated, for example, can be useful in determining the importance of a sentence. Another factor that may affect sentence selection is the type of document (newspaper article, email, scientific paper, etc.).
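
A greedy selection of this kind can be sketched as follows. This is one possible variant, shown only as an illustration: it picks sentences in descending order of score and skips any sentence that is too similar to one already chosen, as a simple way of limiting redundancy. The function name and the similarity threshold of 0.7 are assumptions.

```python
def greedy_select(scores, similarity, k, max_similarity=0.7):
    """Greedily choose up to k sentence indices, skipping near-duplicates.

    scores[i] is the importance score of sentence i and similarity[i][j]
    is the pairwise similarity between sentences i and j.
    """
    chosen = []
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for i in order:
        if len(chosen) >= k:
            break
        # Keep the sentence only if it is not redundant with earlier picks.
        if all(similarity[i][j] <= max_similarity for j in chosen):
            chosen.append(i)
    return sorted(chosen)  # restore document order
```

With scores `[0.9, 0.85, 0.4, 0.1]` and a similarity of 0.95 between the first two sentences, asking for two sentences returns indices 0 and 2, because the second sentence is treated as redundant with the first.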

Applications
This method is used by well-known social media platforms to generate summaries for posts that are categorized by subject. These subjects are used to keep users engaged on the internet. Instead of only offering links, today's search engines provide immediate answers to the user's query. Voice-based assistants that respond to users' queries use a similar approach.
• One of the most significant marketing channels is video. People now upload videos to professional networks like LinkedIn, in addition to video-focused platforms like YouTube or Vimeo. Scripting may or may not be required, depending on the type of video. When writing a script that integrates research from a variety of sources, summarization can be a valuable ally.
• For example, Google's home feed generates summaries based on users' likes and dislikes [1].
• Many weekly emails begin with an introduction and then feature a carefully curated selection of relevant content. Summarization would allow companies to supplement newsletters with a stream of summaries (rather than a list of links), which is a more mobile-friendly format.

ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
The output of our summary generator correlates highly with human judgment and shows high recall and precision when tested against manual evaluation results. We chose ROUGE (Recall-Oriented Understudy for Gisting Evaluation) precision and recall as the measures for our experimental results. ROUGE is a set of metrics for evaluating automatic summarization, here of business meetings, as well as machine translation. It compares an automatically generated summary or translation to a set of reference summaries (typically human-produced). To obtain a good quantitative value, we compute the precision and recall from the overlap between the two.

Recall
Recall (in the context of ROUGE) refers to how much of the reference summary the system summary recovers or captures. Recall can be measured as:
\[ \text{Recall} = \frac{\text{number\_of\_overlapping\_words}}{\text{total\_words\_in\_reference\_summary}} \]

Precision
Precision determines how much of the system summary was actually relevant or needed. When trying to generate brief summaries, precision becomes extremely important. Precision can be measured as:
\[ \text{Precision} = \frac{\text{number\_of\_overlapping\_words}}{\text{total\_words\_in\_system\_summary}} \]
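
The two ratios above can be computed with a few lines of Python. The sketch below is a simplified illustration of ROUGE-1 and not the evaluation script used in this work: it assumes lowercased, whitespace-separated words as the unit of comparison, and it counts a repeated word as overlapping at most as many times as it occurs in both summaries. The function name is hypothetical.

```python
from collections import Counter


def rouge_1(system_summary, reference_summary):
    """Return (precision, recall) based on overlapping single words."""
    sys_counts = Counter(system_summary.lower().split())
    ref_counts = Counter(reference_summary.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((sys_counts & ref_counts).values())
    precision = overlap / max(sum(sys_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    return precision, recall
```

For the system summary "the board approved the budget" and the reference summary "the board approved the new marketing budget", all five system words appear in the reference, so precision is 5/5, while recall is 5/7 because two reference words are missing from the system summary.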

F-Measure
The F-measure is a metric of a test's accuracy. It combines the test's precision and recall, where precision is the number of true positive results divided by the total number of positive results, including those identified incorrectly, and recall is the number of true positive results divided by the total number of samples that should have been identified as positive [5]. The formula for the F-measure is given by:
\[ F = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]
Here we compare the model-generated summary of our input text with the reference summary, which is a human-made summary of the same text. We used ROUGE-1 values because they help to measure the fluency of the summaries much better than ROUGE-2 and ROUGE-L. We obtain a precision of 70%, a recall of 85.81%, and an F-measure of 75.94%. The data for this experiment comes from the Kaggle BBC News Summary dataset.

Conclusion
We have successfully converted 30 minutes of meeting audio into text using the Rev-AI API and then summarized the text using our model. To evaluate our results we used ROUGE-1 scores (precision, recall, and F-measure) because they help to measure the fluency of the summaries much better than ROUGE-2 and ROUGE-L. On the Kaggle BBC News Summary dataset we obtain a precision of 70%, a recall of 85.81%, and an F-measure of 75.94%. The output of our summary generator correlates highly with human judgment and shows high recall and precision when tested against manual evaluation results. We also focused on improving the speech-to-text conversion of a recorded audio file using the Rev-AI Speech-to-Text API, while ensuring that the summarized text offers a precise and concise meaning that is understandable to everyone and covers all of the important points of the recording.