Automated Text Summarization as A Service

Authors: Ketan Shahapure, Samit Shivadekar, Shivam Vibhute, Milton Halem

Recent advancements in technology have enabled the storage of voluminous data. Because this data is so abundant, there is a need for summaries that capture the relevant details of the original source. Since manual summarization is a taxing process, researchers have long sought to automate it using modern computers that can comprehend and generate natural human language. Automated text summarization has been one of the most researched areas in Natural Language Processing (NLP). Extractive and abstractive summarization are the two most commonly used techniques for generating summaries. In this study, we present a new methodology that takes both of these summarization techniques into consideration and, based on the input, generates a summary that is seemingly better than one produced by a single approach. Further, we provide this methodology as a service that is deployed on the internet and remotely accessible from anywhere; the service is scalable, fully responsive, and configurable. Next, we discuss the evaluation process through which we selected the best model from among many candidate models. Lastly, we conclude by discussing the inferences gained from this study and provide a brief insight into future directions that we could explore.
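A rough sketch of the kind of hybrid pipeline described above is given below; it is illustrative only, not the authors' published implementation. It runs an extractive pass (frequency-based sentence scoring) and then hands the condensed text to a pretrained abstractive model. The choice of spaCy and the Hugging Face transformers pipeline, the sshleifer/distilbart-cnn-12-6 checkpoint, the keep_ratio of 0.4, and the output length limits are all assumptions made for this example.

    # Hypothetical hybrid summarizer: extractive pre-selection followed by
    # abstractive rewriting. Library and model choices are illustrative
    # assumptions, not the paper's published implementation.
    from collections import Counter

    import spacy
    from transformers import pipeline

    nlp = spacy.load("en_core_web_sm")  # assumed small English pipeline (installed separately)
    abstractive = pipeline(
        "summarization",
        model="sshleifer/distilbart-cnn-12-6",  # assumed checkpoint, chosen for illustration
    )

    def extractive_pass(text: str, keep_ratio: float = 0.4) -> str:
        """Keep the highest-scoring fraction of sentences, scored by word frequency."""
        doc = nlp(text)
        freq = Counter(t.text.lower() for t in doc if t.is_alpha and not t.is_stop)
        if not freq:
            return text
        top = freq.most_common(1)[0][1]
        sentences = list(doc.sents)
        scores = {s: sum(freq.get(t.text.lower(), 0) / top for t in s) for s in sentences}
        keep = max(1, int(len(sentences) * keep_ratio))
        best = set(sorted(scores, key=scores.get, reverse=True)[:keep])
        # Re-emit the kept sentences in their original order.
        return " ".join(s.text for s in sentences if s in best)

    def hybrid_summary(text: str) -> str:
        """Extractive condensation first, then abstractive rewriting of the result."""
        condensed = extractive_pass(text)
        out = abstractive(condensed, max_length=130, min_length=30,
                          do_sample=False, truncation=True)
        return out[0]["summary_text"]

    if __name__ == "__main__":
        with open("article.txt") as f:  # any long input document
            print(hybrid_summary(f.read()))

Wrapping such a function in a lightweight web front end (for instance, a Streamlit app) would be one way to expose it as the remotely accessible, configurable service the abstract mentions.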

Authors and Affiliations

Ketan Shahapure
Department of CSEE, University of Maryland Baltimore County, USA
Samit Shivadekar
Department of CSEE, University of Maryland Baltimore County, USA
Shivam Vibhute
San Jose State University, San Jose, California, United States
Milton Halem
University of Maryland Baltimore County, USA

Keywords: Machine Learning, Natural Language Processing, Automated Text Summarization, Language Modeling


Publication Details

Published in : Volume 11 | Issue 1 | January-February 2024
Date of Publication : 2024-02-29
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 101-112
Manuscript Number : IJSRSET12310669
Publisher : Technoscience Academy

Print ISSN : 2395-6011, Online ISSN : 2395-602X

Cite This Article :

Ketan Shahapure, Samit Shivadekar, Shivam Vibhute, Milton Halem, "Automated Text Summarization as A Service", International Journal of Scientific Research in Science and Technology (IJSRST), Print ISSN : 2395-6011, Online ISSN : 2395-602X, Volume 11, Issue 1, pp. 101-112, January-February 2024. Available at doi: https://doi.org/10.32628/IJSRSET12310669
Journal URL : https://ijsrst.com/IJSRSET12310669
