awesome-text-summarization


A curated list of resources dedicated to text summarization

Corpus

  1.  contains 51 articles. Each article is about a product’s feature, such as iPod’s Battery Life, and is a collection of reviews by customers who purchased that product. Each article in the dataset has 5 manually written “gold” summaries. Usually the 5 gold summaries are different, but they can also be the same text repeated 5 times.
  2. : English Gigaword was produced by Linguistic Data Consortium (LDC).
  3. : This corpus is constructed from the Chinese microblogging website SinaWeibo. It consists of over 2 million real Chinese short texts with short summaries given by the writer of each text.
  4. Ziqiang Cao, Chengyao Chen, Wenjie Li, Sujian Li, Furu Wei, Ming Zhou. . arXiv:1511.08417, 2015.
  5.  contains a release of the scientific document summarization corpus and annotations from the WING NUS group.
  6. Avinesh P.V.S., Maxime Peyrard, Christian M. Meyer. . arXiv:1802.09884, 2018.
  7. Alexander R. Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Wei Tai Ting, Robert Tung, Caitlin Westerfield, Dragomir R. Radev. . arXiv:1805.04617, 2018. The source code is . All the datasets can be found through the . The blog  is an excellent user guide with step-by-step instructions on how to use the search engine.

Text Summarization Software

  1.  implemented in Python is a well-tested, multi-language evaluation framework for text summarization.
  2.  is a simple library and command-line utility for extracting summaries from HTML pages or plain texts. The package also contains a simple evaluation framework for text summaries. Implemented summarization methods are Luhn, Edmundson, LSA, LexRank, TextRank, SumBasic and KL-Sum.
  3.  implements the TextRank algorithm to extract key words/phrases and text summarization in Chinese. It is written in Python.
  4.  is a Python library for processing Chinese text.
  5.  is an integrated toolkit for automatic document summarization. It supports single-document, multi-document and topic-focused multi-document summarization, and a variety of summarization methods have been implemented in the toolkit. It supports Western languages (e.g. English) and Chinese.
  6.  is a toolkit for Chinese natural language processing.

Word Representation

  1. G. E. Hinton, J. L. McClelland, and D. E. Rumelhart. . In D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations, MIT Press, Cambridge, MA, 1986. The related slides are  or .
  2. Yoshua Bengio, Réjean Ducharme, Pascal Vincent and Christian Jauvin. . 2003.
    • They proposed to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.
  3. Christopher Olah. . This post reviews some extremely remarkable results in applying deep neural networks to NLP, where the representation perspective of deep learning is a powerful view that seems to answer why deep neural networks are so effective.
  4. Levy, Omer, and Yoav Goldberg. . NIPS. 2014.
  5. A series of blogs/papers about word embeddings:
    • The blog  is a very good overview about word embedding.
    • The blog  introduces the main result about , which answers three interesting questions: 1. Why do low-dimensional embeddings capture huge statistical information? 2. Why do low dimensional embeddings work better than high-dimensional ones? 3. Why do Semantic Relations correspond to Directions?
    • The blog  introduces the main result about , which shows that word senses are easily accessible in many current word embeddings.
  6. : This is a post with links to and descriptions of word2vec tutorials, papers, and implementations.
  7.  is an unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus.
  8. Li, Yitan, et al. . IJCAI. 2015.
  9. O. Levy, Y. Goldberg, and I. Dagan. . Trans. Assoc. Comput. Linguist., 2015.
  10. Eric Nalisnick, Sachin Ravi. . arXiv:1511.05392, 2015.
    • They describe a method for learning word embeddings with data-dependent dimensionality. Their Stochastic Dimensionality Skip-Gram (SD-SG) and Stochastic Dimensionality Continuous Bag-of-Words (SD-CBOW) are nonparametric analogs of Mikolov et al.'s (2013) well-known 'word2vec' model.
  11. William L. Hamilton, Jure Leskovec, Dan Jurafsky. .
    • Hamilton et al. model changes in word meaning by fitting word embeddings on consecutive corpora of historical language. They compare several ways of quantifying meaning (co-occurrence vectors weighted by PPMI, SVD embeddings and word2vec embeddings), and align historical embeddings from different corpora by finding the optimal rotational alignment that preserves the cosine similarities as much as possible. (A minimal sketch of this alignment step appears after this list.)
  12. Zijun Yao, Yifan Sun, Weicong Ding, Nikhil Rao, Hui Xiong. . arXiv:1703.00607v2, International Conference on Web Search and Data Mining (WSDM 2018).
  13. Yang, Wei and Lu, Wei and Zheng, Vincent. . ACL, 2017. The source code in C is .
    • This paper presents a simple yet effective method for learning word embeddings based on text from different domains.
  14. Sebastian Ruder. 
  15. Bryan McCann, James Bradbury, Caiming Xiong and Richard Socher. . For a high-level overview of why CoVe are great, check out the .
    • A Keras/TensorFlow implementation of the MT-LSTM/CoVe is .
    • A PyTorch implementation of the MT-LSTM/CoVe is .
  16. Maria Pelevina, Nikolay Arefyev, Chris Biemann, Alexander Panchenko. . arXiv:1708.03390, 2017. The source code written in Python is .
    • Making sense embedding out of word embeddings using graph-based word sense induction.
  17. Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov. . arXiv:1607.04606, 2017. The source code in C++11 is , which is a library for efficient learning of word representations and sentence classification.
  18. Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer and Hervé Jégou. . arXiv:1710.04087, 2017. The source code in Python is , which is a library for multilingual unsupervised or supervised word embeddings.
  19. Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch and Armand Joulin. . arXiv:1712.09405, 2017.
  20. Gabriel Grand, Idan Asher Blank, Francisco Pereira, Evelina Fedorenko. . arXiv:1802.01241, 2018.
    • Could context-dependent relationships be recovered from word embeddings? To address this issue, they introduce a powerful, domain-general solution: "semantic projection" of word vectors onto lines that represent various object features, like size (the line extending from the word "small" to "big"), intelligence (from "dumb" to "smart"), or danger (from "safe" to "dangerous"). (A small NumPy sketch of this projection appears after this list.)
  21. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer. . arXiv:1802.05365, NAACL 2018. The code is .
  22. Edouard Grave, Piotr Bojanowski, Prakhar Gupta, Armand Joulin, Tomas Mikolov. . arXiv:1802.06893v2, Proceedings of LREC, 2018.
  23. Douwe Kiela, Changhan Wang and Kyunghyun Cho. . arXiv:1804.07983, 2018.
    • While one of the first steps in many NLP systems is selecting what embeddings to use, they argue that such a step is better left for neural networks to figure out by themselves. To that end, they introduce a novel, straightforward yet highly effective method for combining multiple types of word embeddings in a single model, leading to state-of-the-art performance within the same model class on a variety of tasks.
  24. Laura Wendlandt, Jonathan K. Kummerfeld, Rada Mihalcea. . arXiv:1804.09692, NAACL HLT 2018.
    • They provide empirical evidence for how various factors contribute to the stability of word embeddings, and analyze the effects of stability on downstream tasks.
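
The alignment step described in the Hamilton et al. entry above can be illustrated with orthogonal Procrustes: find the rotation that maps one embedding space onto the other while preserving cosine similarities. The following is a minimal sketch assuming the two matrices share a vocabulary in the same row order; it is not the authors' released code.

```python
import numpy as np

def normalize_rows(M):
    """Mean-center the matrix, then length-normalize each row."""
    M = M - M.mean(axis=0)
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def procrustes_align(X, Y):
    """Find the orthogonal rotation R minimizing ||Y R - X||_F and apply it to Y.
    Rows of X and Y are assumed to correspond to the same words, in the same order."""
    X, Y = normalize_rows(X), normalize_rows(Y)
    U, _, Vt = np.linalg.svd(Y.T @ X)
    return X, Y @ (U @ Vt)

# Toy check: a randomly rotated copy of X should align back almost exactly.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
Q, _ = np.linalg.qr(rng.normal(size=(50, 50)))      # random orthogonal matrix
X_n, Y_aligned = procrustes_align(X, X @ Q)
print(np.abs(X_n - Y_aligned).max())                # ~1e-15: rotation recovered

# Semantic drift of word i between the two corpora:
# drift_i = 1 - X_n[i] @ Y_aligned[i]   (cosine distance after alignment)
```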
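
The "semantic projection" described in the Grand et al. entry above amounts to projecting word vectors onto the direction between two anchor words. Below is a small NumPy sketch using made-up toy vectors; real usage would look the words up in a pre-trained embedding.

```python
import numpy as np

def semantic_projection(vectors, words, neg="small", pos="big"):
    """Score `words` along the feature line running from `neg` to `pos`.
    `vectors` maps a word to a 1-D NumPy array; a higher score means the
    word sits closer to the `pos` end of the line."""
    direction = vectors[pos] - vectors[neg]              # e.g. the "size" line
    direction = direction / np.linalg.norm(direction)
    return {w: float(vectors[w] @ direction) for w in words}

# Toy 2-D vectors purely for illustration; real usage would take rows of a
# pre-trained embedding such as word2vec or GloVe.
vectors = {
    "small":    np.array([0.0, 1.0]),
    "big":      np.array([4.0, 1.2]),
    "mouse":    np.array([0.5, 0.9]),
    "elephant": np.array([3.8, 1.1]),
}
print(semantic_projection(vectors, ["mouse", "elephant"]))
# elephant projects much further toward the "big" end than mouse does
```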

Sentence Representation

  1. Kalchbrenner, Nal, Edward Grefenstette, and Phil Blunsom. . arXiv:1404.2188, 2014.
  2. Quoc Le and Tomas Mikolov. . arXiv:1405.4053, 2014.
  3. Yoon Kim. . arXiv:1408.5882, EMNLP 2014.
  4. Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun and Sanja Fidler. . arXiv:1506.06726, 2015. The source code in Python is . The TensorFlow implementation of Skip-Thought Vectors is 
  5. John Wieting and Mohit Bansal and Kevin Gimpel and Karen Livescu. . arXiv:1511.08198, ICLR 2016. The source code written in Python is .
  6. Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, Lawrence Carin. . arXiv:1611.07897, EMNLP 2017. The training code written in Python is .
  7. Matteo Pagliardini, Prakhar Gupta, Martin Jaggi. . arXiv:1703.02507, NAACL 2018. The source code in Python is .
  8. Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, Jason Weston. . arXiv:1709.03856, 2017. The source code in C++11 is .
  9. Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, Antoine Bordes. . arXiv:1705.02364v4, EMNLP 2017. The source code in Python is .
  10. Sanjeev Arora, Yingyu Liang, Tengyu Ma. . ICLR 2017. The source code written in Python is .  is a minimum example for the sentence embedding algorithm.
  11. Yixin Nie, Mohit Bansal. . arXiv:1708.02312, EMNLP 2017. The source code in Python is . The new repo  is for Residual-connected sentence encoder for NLI.
  12. Lajanugen Logeswaran, Honglak Lee. . arXiv:1803.02893, ICLR 2018. The open review comments are listed .
  13. Eric Zelikman. . arXiv:1803.08493, 2018.
  14. Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil. . arXiv:1803.11175v2, 2018.

Extractive Text Summarization

  1. H. P. Luhn. . IBM Journal of Research and Development, 1958. Luhn's method is as follows:
    1. Ignore Stopwords: Common words (known as stopwords) are ignored.
    2. Determine Top Words: The most frequently occurring words in the document are counted.
    3. Select Top Words: A small number of the top words are selected to be used for scoring.
    4. Select Top Sentences: Sentences are scored according to how many of the top words they contain. The top four sentences are selected for the summary. (A rough Python sketch of this scoring scheme appears after this list.)
  2. H. P. Edmundson. . Journal of the Association for Computing Machinery, 1969.
  3. David M. Blei, Andrew Y. Ng and Michael I. Jordan. . Journal of Machine Learning Research, 2003. The source code in Python is . This approach reimplements Luhn's algorithm, but with topics instead of words, and applies it to several documents instead of one:
    1. Train LDA on all products of a certain type (e.g. all the books)
    2. Treat all the reviews of a particular product as one document, and infer their topic distribution
    3. Infer the topic distribution for each sentence
    4. For each topic that dominates the reviews of a product, pick some sentences that are themselves dominated by that topic.
  4. David M. Blei. . Communications of the ACM, 2012.
  5. Rada Mihalcea and Paul Tarau. . ACL, 2004. The source code in Python is . pytextrank works in four stages, each feeding its output to the next:
    • Part-of-Speech Tagging and lemmatization are performed for every sentence in the document.
    • Key phrases are extracted along with their counts, and are normalized.
    • A score is calculated for each sentence by approximating the Jaccard distance between the sentence and the key phrases.
    • The document is summarized based on the most significant sentences and key phrases.
  6. Federico Barrios, Federico López, Luis Argerich and Rosa Wachenchauzer. . arXiv:1602.03606, 2016. The source code in Python is . Gensim's summarization only works for English for now, because the text is pre-processed so that stop words are removed and the words are stemmed, and these processes are language-dependent. TextRank works as follows:
    • Pre-process the text: remove stop words and stem the remaining words.
    • Create a graph where vertices are sentences.
    • Connect every sentence to every other sentence by an edge. The weight of the edge is how similar the two sentences are.
    • Run the PageRank algorithm on the graph.
    • Pick the vertices (sentences) with the highest PageRank scores. (A short Gensim usage sketch appears after this list.)
  7.  uses basic summarization features and builds on them. Those features are:
    • The title feature scores a sentence with respect to the document title. It is calculated as the count of words that are common to the title of the document and the sentence.
    • Sentence length is scored depending on how many words are in the sentence. TextTeaser defines a constant “ideal” (with value 20), which represents the ideal length of the summary in terms of number of words. Sentence length is calculated as a normalized distance from this value.
    • Sentence position is where the sentence is located. Sentences in the introduction and conclusion receive a higher score for this feature.
    • Keyword frequency is just the frequency of the words used in the whole text in the bag-of-words model (after removing stop words).
  8. Güneş Erkan and Dragomir R. Radev. . 2004.
    • LexRank uses IDF-modified cosine as the similarity measure between two sentences. This similarity is used as the weight of the graph edge between two sentences. LexRank also incorporates an intelligent post-processing step which makes sure that the top sentences chosen for the summary are not too similar to each other. (A sketch of the IDF-modified cosine appears after this list.)
  9. .
  10. Josef Steinberger and Karel Jezek. . Proc. ISIM’04, 2004.
  11. Josef Steinberger and Karel Ježek. . International Conference on Advances in Information Systems, 2004.
  12. Josef Steinberger, Massimo Poesio, Mijail A Kabadjov and Karel Ježek. . Information Processing & Management, 2007.
  13. James Clarke and Mirella Lapata. . EMNLP-CoNLL, 2007.
  14. Dan Gillick and Benoit Favre. . ACL, 2009.
  15. Ani Nenkova and Kathleen McKeown. . Foundations and Trends in Information Retrieval, 2011.  are also available.
  16. Vahed Qazvinian, Dragomir R. Radev, Saif M. Mohammad, Bonnie Dorr, David Zajic, Michael Whidby, Taesun Moon. . arXiv:1402.0556, 2014.
  17. Kågebäck, Mikael, et al. . Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC)@ EACL. 2014.
  18. Ramesh Nallapati, Bowen Zhou, Mingbo Ma. . arXiv:1611.04244. 2016.
  19. Ramesh Nallapati, Feifei Zhai, Bowen Zhou. . arXiv:1611.04230, AAAI, 2017.
  20. Shashi Narayan, Nikos Papasarantopoulos, Mirella Lapata, Shay B. Cohen. . arXiv:1704.04530, 2017.
  21. Rakesh Verma, Daniel Lee. . arXiv:1704.05550, 2017.
  22. Ed Collins, Isabelle Augenstein, Sebastian Riedel. . arXiv:1706.03946, 2017.
  23. Sukriti Verma, Vagisha Nidhi. . arXiv:1708.04439, 2017.
  24. Parth Mehta, Gaurav Arora, Prasenjit Majumder. . arXiv:1802.04675, 2018.
  25. Shashi Narayan, Shay B. Cohen, Mirella Lapata. . arXiv:1802.08636, NAACL, 2018.
  26. Aakash Sinha, Abhishek Yadav, Akshay Gahlot. . arXiv:1802.10137, 2018.
  27. Yuxiang Wu, Baotian Hu. . arXiv:1804.07036, AAAI, 2018.
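
A rough Python sketch of the Luhn-style scoring described in the first item of this list: score sentences by how many of the document's most frequent non-stopwords they contain. The tiny stopword list and regex tokenization are simplifications, not Luhn's original implementation.

```python
import re
from collections import Counter

# A tiny stand-in stopword list; a real implementation would use a full one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
             "was", "for", "on", "that", "this", "it", "with", "as", "by"}

def luhn_summary(text, num_top_words=10, num_sentences=4):
    """Score sentences by how many of the document's most frequent
    non-stopwords they contain, and keep the top-scoring ones."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    top_words = {w for w, _ in Counter(words).most_common(num_top_words)}

    def score(sentence):
        return sum(w in top_words for w in re.findall(r"[a-z']+", sentence.lower()))

    best = sorted(sentences, key=score, reverse=True)[:num_sentences]
    best.sort(key=sentences.index)          # restore original document order
    return " ".join(best)

# print(luhn_summary(open("article.txt").read()))
```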
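
The Gensim TextRank summarization mentioned above can be called in a couple of lines. Note this is a sketch for older Gensim releases: the `gensim.summarization` package existed up to Gensim 3.x and was removed in 4.0, and `article.txt` is just a placeholder for any English plain-text document.

```python
# Requires Gensim < 4.0; gensim.summarization was removed in Gensim 4.x.
from gensim.summarization import summarize, keywords

text = open("article.txt").read()    # placeholder: any English plain-text document

print(summarize(text, ratio=0.2))    # keep roughly 20% of the sentences
print(keywords(text, words=10))      # top TextRank keywords, one per line
```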
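
The IDF-modified cosine that LexRank uses as the edge weight between two sentences can be sketched as follows; the toy `idf` table stands in for IDF values that would normally be estimated from a background corpus.

```python
import math
from collections import Counter

def idf_modified_cosine(sent_x, sent_y, idf):
    """IDF-modified cosine similarity between two tokenized sentences,
    used by LexRank as the weight of the edge between them."""
    tf_x, tf_y = Counter(sent_x), Counter(sent_y)
    common = set(tf_x) & set(tf_y)
    num = sum(tf_x[w] * tf_y[w] * idf.get(w, 1.0) ** 2 for w in common)
    den_x = math.sqrt(sum((tf_x[w] * idf.get(w, 1.0)) ** 2 for w in tf_x))
    den_y = math.sqrt(sum((tf_y[w] * idf.get(w, 1.0)) ** 2 for w in tf_y))
    return num / (den_x * den_y) if den_x and den_y else 0.0

# Toy example; real IDF values come from document frequencies in a corpus.
idf = {"cat": 2.0, "sat": 1.5, "mat": 2.0, "dog": 2.0, "the": 0.1, "on": 0.1}
s1 = ["the", "cat", "sat", "on", "the", "mat"]
s2 = ["the", "dog", "sat", "on", "the", "mat"]
print(idf_modified_cosine(s1, s2, idf))
```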

Abstractive Text Summarization

  1. Alexander M. Rush, Sumit Chopra, Jason Weston. . EMNLP, 2015. The source code in LUA Torch7 is .
    • They use sequence-to-sequence encoder-decoder LSTM with attention.
    • They use the first sentence of a document. The source document is quite small (about 1 paragraph or ~500 words in the training dataset of Gigaword) and the produced output is also very short (about 75 characters). It remains an open challenge to scale up these limits and produce longer summaries over multi-paragraph text input, since even good LSTM models with attention fall victim to vanishing gradients when the input sequences become longer than a few hundred items.
    • The evaluation method traditionally used for automatic summarization is the ROUGE metric, which has been shown to correlate well with human judgment of summary quality, but which also has a known tendency to encourage "extractive" summarization, so that using ROUGE as the target metric to optimize will lead a summarizer towards copy-paste behavior on the input instead of the hoped-for reformulation type of summaries.
  2. Peter Liu and Xin Pan. . 2016. The source code in Python is .
    • They use sequence-to-sequence encoder-decoder LSTM with attention and bidirectional neural net.
    • They use the first 2 sentences of a document with a limit at 120 words.
    • The scores achieved by Google’s textsum are 42.57 ROUGE-1 and 23.13 ROUGE-2.
  3. Ramesh Nallapati, Bowen Zhou, Cicero Nogueira dos santos, Caglar Gulcehre, Bing Xiang. . arXiv:1602.06023, 2016.
    • They use GRU with attention and bidirectional neural net.
    • They use the first 2 sentences of a document with a limit at 120 words.
    • They use the  of Jean et al. 2014, which means that when decoding, only words that appear in the source are used; this reduces perplexity, but you lose the ability to produce an "abstractive" summary. So they do "vocabulary expansion" by adding a layer of "word2vec nearest neighbors" to the words in the input.
    • Feature-rich encoding: they concatenate TF-IDF and named-entity-type features to the word embeddings, adding encoding dimensions that reflect the "importance" of the words.
    • The most interesting of all is what they call the "Switching Generator/Pointer" layer. In the decoder, they add a layer that decides either to generate a new word based on the context / previously generated word (the usual decoder) or to copy a word from the input (that is, add a pointer to the input). They learn when to Generate vs. Point and, when it is a Pointer, which word of the input to point to. (A minimal sketch of this generator/pointer mixture appears after this list.)
  4. Konstantin Lopyrev. . arXiv:1512.01712, 2015. The source code in Python is .
  5. Jiwei Li, Minh-Thang Luong and Dan Jurafsky. . arXiv:1506.01057, 2015. The source code in Matlab is .
  6. Sumit Chopra, Alexander M. Rush and Michael Auli. . NAACL, 2016.
  7. Jianpeng Cheng, Mirella Lapata. . arXiv:1603.07252, 2016.
    • This paper uses attention as a mechanism for identifying the best sentences to extract, and then goes beyond that to generate an abstractive summary.
  8. Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama. . arXiv:1609.07033, Proceedings of the 2015 ACM Symposium on Document Engineering, DocEng' 2015.
  9. Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama. . arXiv:1609.07034, 2016.
  10. Suzuki, Jun, and Masaaki Nagata. . EACL 2017 (2017): 291.
  11. Jiwei Tan and Xiaojun Wan. . ACL, 2017.
  12. Preksha Nema, Mitesh M. Khapra, Balaraman Ravindran and Anirban Laha. . ACL, 2017.
  13. Romain Paulus, Caiming Xiong, Richard Socher. . arXiv:1705.04304, 2017. The related blog is .
    • Their model is trained with teacher forcing and reinforcement learning at the same time, being able to make use of both word-level and whole-summary-level supervision to make it more coherent and readable.
  14. Shibhansh Dohare, Harish Karnick. . arXiv:1706.01678, 2017.
  15. Piji Li, Wai Lam, Lidong Bing, Zihao Wang. . arXiv:1708.00625, 2017.
  16. Xinyu Hua, Lu Wang. . arXiv:1707.07062, 2017.
  17. Angela Fan, David Grangier, Michael Auli. . arXiv:1711.05217, 2017.
  18. Linqing Liu, Yao Lu, Min Yang, Qiang Qu, Jia Zhu, Hongyan Li. . arXiv:1711.09357, 2017.
  19. Johan Hasselqvist, Niklas Helmertz, Mikael Kågebäck. . arXiv:1712.06100, 2017.
  20. Tal Baumel, Matan Eyal, Michael Elhadad. . arXiv:1801.07704, 2018.
  21. André Cibils, Claudiu Musat, Andreea Hossman, Michael Baeriswyl. . arXiv:1802.01457, 2018.
  22. Chieh-Teng Chang, Chi-Chia Huang, Jane Yung-Jen Hsu. . arXiv:1802.09968, 2018.
  23. Asli Celikyilmaz, Antoine Bosselut, Xiaodong He, Yejin Choi. . arXiv:1803.10357, 2018.
  24. Piji Li, Lidong Bing, Wai Lam. . arXiv:1803.11070, 2018.
  25. Paul Azunre, Craig Corcoran, David Sullivan, Garrett Honke, Rebecca Ruppel, Sandeep Verma, Jonathon Morgan. . arXiv:1804.01503, 2018.
  26. Arman Cohan, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Seokhwan Kim, Walter Chang, Nazli Goharian. . arXiv:1804.05685, 2018.
  27. Ramakanth Pasunuru, Mohit Bansal. . arXiv:1804.06451, 2018.
  28. Shuming Ma, Xu Sun, Junyang Lin, Xuancheng Ren. . arXiv:1805.01089, IJCAI 2018.
  29. Li Wang, Junlin Yao, Yunzhe Tao, Li Zhong, Wei Liu, Qiang Du. . arXiv:1805.03616, International Joint Conference on Artificial Intelligence and European Conference on Artificial Intelligence (IJCAI-ECAI), 2018.
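
The "Switching Generator/Pointer" idea described under Nallapati et al. above, and the closely related copy/pointer mechanisms (e.g. CopyNet, pointer-generator networks), come down to mixing the decoder's vocabulary distribution with an attention-based copy distribution over source tokens. The PyTorch sketch below shows only that mixture; the tensor names, shapes and the soft mixing (rather than a hard, supervised switch) are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn.functional as F

def generator_pointer_mix(vocab_logits, attention, src_token_ids, p_gen_logit):
    """Mix a generation distribution with a copy distribution over source tokens.

    vocab_logits:  (batch, vocab_size)  decoder scores over the output vocabulary
    attention:     (batch, src_len)     attention weights over the source tokens
    src_token_ids: (batch, src_len)     vocabulary ids of the source tokens
    p_gen_logit:   (batch, 1)           scalar logit of the "generate" switch
    """
    p_gen = torch.sigmoid(p_gen_logit)                    # probability of generating
    gen_dist = F.softmax(vocab_logits, dim=-1)            # usual decoder distribution
    copy_dist = torch.zeros_like(gen_dist)
    copy_dist.scatter_add_(1, src_token_ids, attention)   # pile attention mass onto source ids
    return p_gen * gen_dist + (1.0 - p_gen) * copy_dist   # final word distribution

# Toy shapes: batch of 2, vocabulary of 10, source length of 5.
vocab_logits  = torch.randn(2, 10)
attention     = F.softmax(torch.randn(2, 5), dim=-1)
src_token_ids = torch.randint(0, 10, (2, 5))
p_gen_logit   = torch.randn(2, 1)
dist = generator_pointer_mix(vocab_logits, attention, src_token_ids, p_gen_logit)
print(dist.sum(dim=-1))   # each row sums to 1
```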

Text Summarization

  1. Eduard Hovy and Chin-Yew Lin. . In Proceedings of a Workshop on Held at Baltimore, Maryland, ACL, 1998.
  2. Eduard Hovy and Chin-Yew Lin. . In Advances in Automatic Text Summarization, 1999.
  3. Dipanjan Das and Andre F.T. Martins. . Technical report, CMU, 2007
  4. J. Leskovec, L. Backstrom, J. Kleinberg. . ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2009.
  5. Ryang, Seonggi, and Takeshi Abekawa. "." In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 256-265. Association for Computational Linguistics, 2012. [not neural-based methods]
  6. King, Ben, Rahul Jha, Tyler Johnson, Vaishnavi Sundararajan, and Clayton Scott. "." Machine Learning (2011).
  7. Liu, Yan, Sheng-hua Zhong, and Wenjie Li. "." AAAI. 2012.
  8. He, Zhanying, Chun Chen, Jiajun Bu, Can Wang, Lijun Zhang, Deng Cai, and Xiaofei He. "." In AAAI. 2012.
  9. Mohsen Pourvali, Mohammad Saniee Abadeh. . arXiv:1203.3586, 2012.
  10. PadmaPriya, G., and K. Duraiswamy. . Journal of Computer Science 10, no. 1 (2013): 1-9.
  11. Rushdi Shams, M.M.A. Hashem, Afrina Hossain, Suraiya Rumana Akter, Monika Gope. . arXiv:1304.2476, Procs. of the IEEE International Conference on Computer and Communication Engineering (ICCCE10), pp. 115-120, Kuala Lumpur, Malaysia, May 11-13, (2010).
  12. Juan-Manuel Torres-Moreno. . arXiv:1209.3126, 2012.
  13. Rioux, Cody, Sadid A. Hasan, and Yllias Chali. . In EMNLP, pp. 681-690. 2014.[not neural-based methods]
  14. Fatma El-Ghannam, Tarek El-Shishtawy. . arXiv:1401.0640, 2014.
  15. Denil, Misha, Alban Demiraj, and Nando de Freitas. . arXiv:1412.6815, 2014.
  16. Denil, Misha, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, and Nando de Freitas.. arXiv:1406.3830, 2014.
  17. Cao, Ziqiang, Furu Wei, Li Dong, Sujian Li, and Ming Zhou. . AAAI, 2015.
  18. Fei Liu, Jeffrey Flanigan, Sam Thomson, Norman Sadeh, and Noah A. Smith. . NAACL, 2015.
  19. Wenpeng Yin, Yulong Pei. Optimizing Sentence Modeling and Selection for Document Summarization. IJCAI, 2015.
  20. Liu, He, Hongliang Yu, and Zhi-Hong Deng. . In Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.
  21. Jin-ge Yao, Xiaojun Wan and Jianguo Xiao. . IJCAI, 2015.
  22. Piji Li, Lidong Bing, Wai Lam, Hang Li, and Yi Liao. . arXiv:1504.07324, IJCAI, 2015.
  23. Marta Aparício, Paulo Figueiredo, Francisco Raposo, David Martins de Matos, Ricardo Ribeiro, Luís Marujo. . arXiv:1506.01273, 2015.
  24. Luís Marujo, Ricardo Ribeiro, David Martins de Matos, João P. Neto, Anatole Gershman, Jaime Carbonell. . arXiv:1507.02907, 2015.
  25. Xiaojun Wan, Yansong Feng and Weiwei Sun. . Book Chapter in CCF 2014-2015 Annual Report on Computer Science and Technology in China (in Chinese), 2015. See also: Xiaojun Wan, Ziqiang Cao, Furu Wei, Sujian Li, Ming Zhou. . arXiv:1507.02062, 2015.
  26. Gulcehre, Caglar, Sungjin Ahn, Ramesh Nallapati, Bowen Zhou, and Yoshua Bengio. . arXiv:1603.08148, 2016.
  27. Jiatao Gu, Zhengdong Lu, Hang Li, Victor O.K. Li. . arXiv:1603.06393, ACL, 2016.
    • They addressed an important problem in sequence-to-sequence (Seq2Seq) learning referred to as copying, in which certain segments in the input sequence are selectively replicated in the output sequence. In this paper, they incorporated copying into neural network-based Seq2Seq learning and proposed a new model called CopyNet with an encoder-decoder structure. CopyNet nicely integrates the regular way of word generation in the decoder with a copying mechanism that can choose sub-sequences of the input sequence and put them at proper places in the output sequence.
  28. Jianmin Zhang, Jin-ge Yao and Xiaojun Wan. . In Proceedings of ACL, 2016.
  29. Ziqiang Cao, Wenjie Li, Sujian Li, Furu Wei. "". arXiv:1604.00125, 2016
  30. Ayana, Shiqi Shen, Yu Zhao, Zhiyuan Liu and Maosong Sun. . arXiv:1604.01904, 2016.
  31. Ayana, Shiqi Shen, Zhiyuan Liu and Maosong Sun. . 2016.
  32. Lu Wang, Hema Raghavan, Vittorio Castelli, Radu Florian, Claire Cardie. . arXiv:1606.07548, 2016.
  33. Milad Moradi, Nasser Ghadiri. . arXiv:1605.02948, 2016.
  34. Kikuchi, Yuta, Graham Neubig, Ryohei Sasano, Hiroya Takamura, and Manabu Okumura. . arXiv:1609.09552, 2016.
  35. Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei and Hui Jiang. . arXiv:1610.08462, IJCAI, 2016.
  36. Wang, Lu, and Wang Ling. . NAACL, 2016.
  37. Yishu Miao, Phil Blunsom. . EMNLP, 2016.
  38. Takase, Sho, Jun Suzuki, Naoaki Okazaki, Tsutomu Hirao, and Masaaki Nagata. . EMNLP, 1054-1059, 2016.
  39. Wenyuan Zeng, Wenjie Luo, Sanja Fidler, Raquel Urtasun. . arXiv:1611.03382, 2016.
  40. Ziqiang Cao, Wenjie Li, Sujian Li, Furu Wei. . arXiv:1611.09238, 2016.
  41. Hongya Song, Zhaochun Ren, Piji Li, Shangsong Liang, Jun Ma, and Maarten de Rijke. . In WSDM 2017: The 10th International Conference on Web Search and Data Mining, 2017.
  42. Piji Li, Zihao Wang, Wai Lam, Zhaochun Ren, Lidong Bing. . In AAAI, 2017.
  43. Yinfei Yang, Forrest Sheng Bao, Ani Nenkova. . arXiv:1702.07998, 2017.
  44. Rui Meng, Sanqiang Zhao, Shuguang Han, Daqing He, Peter Brusilovsky, Yu Chi. . arXiv:1704.06879, 2017. The source code written in Python is .
  45. Abigail See, Peter J. Liu and Christopher D. Manning. . ACL, 2017.
  46. Qingyu Zhou, Nan Yang, Furu Wei and Ming Zhou. . arXiv:1704.07073, ACL, 2017.
  47. Maxime Peyrard and Judith Eckle-Kohler. . ACL, 2017.
  48. Jin-ge Yao, Xiaojun Wan and Jianguo Xiao. . KAIS, survey paper, 2017.
  49. Pranay Mathur, Aman Gill and Aayush Yadav. . 2017.
    • They compared modern extractive methods like LexRank, LSA, Luhn and Gensim's existing TextRank summarization module on the  of 51 (article, summary) pairs. They also tried an abstractive technique using TensorFlow's algorithm , but did not obtain good results due to its extremely high hardware demands (7000 GPU hours).
  50. Arman Cohan, Nazli Goharian. . arXiv:1704.06619, EMNLP, 2015.
  51. Arman Cohan, Nazli Goharian. . arXiv:1706.03449, 2017.
  52. Michihiro Yasunaga, Rui Zhang, Kshitijh Meelu, Ayush Pareek, Krishnan Srinivasan, Dragomir Radev. . arXiv:1706.06681, CoNLL, 2017.
  53. Abeed Sarker, Diego Molla, Cecile Paris. . arXiv:1706.08162, 2017.
  54. Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, Saeid Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, Krys Kochut. . arXiv:1707.02268, 2017. See also: Demian Gholipour Ghalandari. . arXiv:1708.07690, EMNLP, 2017.
  55. Shuming Ma, Xu Sun. . arXiv:1710.02318, 2017.
  56. Kaustubh Mani, Ishan Verma, Lipika Dey. . arXiv:1710.02745, 2017.
  57. Liqun Shao, Hao Zhang, Ming Jia, Jie Wang. . arXiv:1710.00284, KDIR, 2017.
  58. Mohammad Ebrahim Khademi, Mohammad Fakhredanesh, Seyed Mojtaba Hoseini. . arXiv:1710.10994, 2017.
  59. Jingjing Xu. . arXiv:1710.11332, 2017.
  60. Peter J. Liu, Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Lukasz Kaiser, Noam Shazeer. . arXiv:1801.10198, 2018.
  61. Parth Mehta, Prasenjit Majumder. . arXiv:1802.00946, 2018.
  62. Mayank Chaudhari, Aakash Nelson Mattukoyya. . arXiv:1802.09426, 2018.

Chinese Text Summarization

  1. Maosong Sun. . Journal of Chinese Information Processing, 2011.
  2. Baotian Hu, Qingcai Chen and Fangze Zhu. . 2015.
    • They constructed a large-scale Chinese short text summarization dataset from the Chinese microblogging website Sina Weibo, which is released to . Then they applied a GRU-based encoder-decoder method to it to generate summaries. They took the whole short text as one sequence, which may not be very reasonable, because most short texts contain several sentences.
    • LCSTS contains 2,400,591 (short text, summary) pairs as the training set and 1,106 pairs as the test set.
    • All the models are trained on Tesla M2090 GPUs for about one week.
    • The results show that the RNN with context outperforms RNN without context on both character and word based input.
    • Moreover, the character-based input outperforms the word-based input.

Evaluation Metrics

  1. Chin-Yew Lin and Eduard Hovy. . In Proceedings of the Human Technology Conference 2003 (HLT-NAACL-2003).
  2. Chin-Yew Lin. . Workshop on Text Summarization Branches Out, Post-Conference Workshop of ACL 2004.
  3. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. .
  4. Arman Cohan, Nazli Goharian. . arXiv:1604.00400, LREC, 2016.
  5. Maxime Peyrard. . arXiv:1801.08991, 2018.
  6. Kavita Ganesan. . arXiv:1803.01937, 2018. ROUGE works by comparing an automatically produced summary or translation against a set of reference summaries (typically human-produced), and is one of the standard ways to compute the effectiveness of automatically generated summaries. The evaluation toolkit  is easy to use for automatic summarization tasks. (A minimal ROUGE-N sketch appears after this list.)
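
As a concrete illustration of the n-gram overlap behind ROUGE, here is a minimal ROUGE-N sketch. Whitespace tokenization and a single reference are simplifying assumptions; the toolkits cited above additionally handle stemming, stopword removal, multiple references, ROUGE-L, and so on.

```python
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=1):
    """Clipped n-gram overlap between a candidate and a single reference summary.
    Returns (precision, recall, f1). Tokenization is plain whitespace here."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())        # clipped counts of matching n-grams
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print(rouge_n("the cat sat on the mat", "the cat was sitting on the mat", n=1))
print(rouge_n("the cat sat on the mat", "the cat was sitting on the mat", n=2))
```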

Opinion Summarization

  1. Kavita Ganesan, ChengXiang Zhai and Jiawei Han. . Proceedings of COLING '10, 2010.
  2. Kavita Ganesan, ChengXiang Zhai and Evelyne Viegas. . WWW'12, 2012.
  3. Kavita Ganesan. . PhD Thesis, University of Illinois at Urbana-Champaign, 2013.
  4. Ozan Irsoy and Claire Cardie. . In EMNLP, 2014.
  5. Ahmad Kamal. . arXiv:1504.03068, 2015.
  6. Haibing Wu, Yiwei Gu, Shangdi Sun and Xiaodong Gu. . 2015.
  7. Lu Wang, Hema Raghavan, Claire Cardie, Vittorio Castelli. . arXiv:1606.05702, 2016.
