A New Model Based on an Encoder-Decoder Architecture and Attention Mechanism for Automatic Abstractive Text Summarization
Authors: Hassan Aliakbarpour 1, Mohammad Taghi Manzuri-Shalmani 2*, Amir Masoud Rahmani 3
1 - PhD Candidate, Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
2 - Associate Professor, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
3 - Professor, Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
Keywords: deep learning, abstractive summarization, encoder-decoder architecture, auxiliary attention mechanism, linguistic features
Abstract:
With the growth of the Web and the availability of large volumes of textual information, the development of automatic text summarization systems has become an important topic in natural language processing and has attracted considerable research attention. The introduction of deep learning methods to text processing has moved text summarization into a new phase of development, and abstractive summarization in particular has made significant progress in recent years. Nevertheless, the full potential of deep networks has not yet been exploited for this task, and progress that also takes human cognition into account is still needed. In this regard, this paper introduces a sequence-to-sequence model equipped with an auxiliary attention mechanism for abstractive text summarization. The model not only uses a combination of linguistic features and embedding vectors as the input of the learning network, but also, unlike previous studies that commonly apply the attention mechanism in the decoder, employs an auxiliary attention mechanism in the encoder. Inspired by the way the human mind produces a summary, this auxiliary attention encodes only the most important parts of the input text, rather than the whole text, and passes them to the decoder for summary generation. The proposed model also uses a switch with a threshold in the decoder to overcome the rare-word problem. The model was evaluated on the CNN/Daily Mail and DUC-2004 datasets. Based on the experimental results and the ROUGE evaluation metric, the proposed model achieves higher accuracy than existing methods for generating abstractive summaries on both datasets.
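To make the described architecture more concrete, the following is a minimal sketch in Python (PyTorch) of the three ideas highlighted in the abstract: concatenating linguistic-feature vectors with word embeddings at the encoder input, an auxiliary attention that retains only the highest-scoring source positions before encoding, and a decoder-side switch with a threshold that copies a source word instead of generating one. It is an illustrative assumption of how these components could fit together, not the authors' implementation; the class and parameter names, dimensions, top-k selection rule, and the copy placeholder are all hypothetical.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AuxAttentionSummarizer(nn.Module):
        def __init__(self, vocab_size, emb_dim=128, feat_dim=16, hid_dim=256, keep_ratio=0.5):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # Linguistic features (e.g., POS-tag vectors) are assumed to be supplied
            # externally and are concatenated with the word embeddings.
            self.encoder = nn.LSTM(emb_dim + feat_dim, hid_dim, batch_first=True)
            self.aux_score = nn.Linear(emb_dim + feat_dim, 1)  # auxiliary attention scorer
            self.decoder = nn.LSTMCell(emb_dim, hid_dim)
            self.out = nn.Linear(hid_dim, vocab_size)
            self.switch = nn.Linear(hid_dim, 1)                 # copy-vs-generate switch
            self.keep_ratio = keep_ratio

        def encode(self, tokens, feats):
            # tokens: (batch, src_len) word ids; feats: (batch, src_len, feat_dim)
            x = torch.cat([self.embed(tokens), feats], dim=-1)
            # Auxiliary attention: score every source position and keep only the
            # top-scoring ones, so only the important parts of the text are encoded.
            scores = self.aux_score(x).squeeze(-1)              # (batch, src_len)
            k = max(1, int(self.keep_ratio * tokens.size(1)))
            kept_idx = scores.topk(k, dim=1).indices.sort(dim=1).values
            x_kept = torch.gather(x, 1, kept_idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
            enc_out, (h, c) = self.encoder(x_kept)
            return enc_out, (h.squeeze(0), c.squeeze(0)), kept_idx

        def decode_step(self, prev_token, state, src_tokens, kept_idx, threshold=0.5):
            # prev_token: (batch,) previously emitted word ids
            h, c = self.decoder(self.embed(prev_token), state)
            vocab_dist = F.softmax(self.out(h), dim=-1)
            # Switch with a threshold: when the copy probability exceeds it, take a
            # word from the retained source positions instead of generating from the
            # vocabulary, which sidesteps rare/out-of-vocabulary words. Copying the
            # first retained token is only a placeholder for a pointer distribution.
            p_copy = torch.sigmoid(self.switch(h)).squeeze(-1)
            gen_ids = vocab_dist.argmax(dim=-1)
            copy_ids = torch.gather(src_tokens, 1, kept_idx[:, :1]).squeeze(1)
            next_ids = torch.where(p_copy > threshold, copy_ids, gen_ids)
            return next_ids, (h, c)

In a complete model the copy branch would draw from a pointer distribution over the retained source positions rather than the first kept token; the placeholder above only shows where the threshold-based switch intervenes in decoding.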