
Text Data Mining (English Edition)

Publisher: Tsinghua University Press; Publication date: 2021-10-01
Format: Other; Pages: 372
Zhongtu price: ¥66.4 (56% of list price); list price: ¥119.0

Text Data Mining (English Edition): Book Features

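Text Data Mining (English Edition) addresses the practical needs of text mining tasks. Using worked examples, it explains the theory and the algorithms behind the relevant techniques from first principles. The writing aims to be concise and accessible without going too far into implementation details, so that readers can learn how to build application systems on the basis of a solid understanding of the underlying principles. It is suitable for students, researchers, and practitioners interested in text data mining, both as a learning text and as a reference book. Professors can readily use it for classes on text data mining or NLP.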

Text Data Mining (English Edition): Synopsis

Text Data Mining offers a thorough and detailed introduction to the fundamental theories and methods of text data mining, ranging from preprocessing (for both Chinese and English texts), text representation, and feature selection to text classification and text clustering. It also presents the predominant applications of text data mining, for example, topic models, sentiment analysis and opinion mining, topic detection and tracking, information extraction, and automatic text summarization.

Text Data Mining (English Edition): Table of Contents

1 Introduction
1.1 The Basic Concepts
1.2 Main Tasks of Text Data Mining
1.3 Existing Challenges in Text Data Mining
1.4 Overview and Organization of This Book
1.5 Further Reading

2 Data Annotation and Preprocessing
2.1 Data Acquisition
2.2 Data Preprocessing
2.3 Data Annotation
2.4 Basic Tools of NLP
2.4.1 Tokenization and POS Tagging
2.4.2 Syntactic Parser
2.4.3 N-gram Language Model
2.5 Further Reading

3 Text Representation
3.1 Vector Space Model
3.1.1 Basic Concepts
3.1.2 Vector Space Construction
3.1.3 Text Length Normalization
3.1.4 Feature Engineering
3.1.5 Other Text Representation Methods
3.2 Distributed Representation of Words
3.2.1 Neural Network Language Model
3.2.2 C&W Model
3.2.3 CBOW and Skip-Gram Model
3.2.4 Noise Contrastive Estimation and Negative Sampling
3.2.5 Distributed Representation Based on the Hybrid Character-Word Method
3.3 Distributed Representation of Phrases
3.3.1 Distributed Representation Based on the Bag-of-Words Model
3.3.2 Distributed Representation Based on Autoencoder
3.4 Distributed Representation of Sentences
3.4.1 General Sentence Representation
3.4.2 Task-Oriented Sentence Representation
3.5 Distributed Representation of Documents
3.5.1 General Distributed Representation of Documents
3.5.2 Task-Oriented Distributed Representation of Documents
3.6 Further Reading

4 Text Representation with Pretraining and Fine-Tuning
4.1 ELMo: Embeddings from Language Models
4.1.1 Pretraining Bidirectional LSTM Language Models
4.1.2 Contextualized ELMo Embeddings for Downstream Tasks
4.2 GPT: Generative Pretraining
4.2.1 Transformer
4.2.2 Pretraining the Transformer Decoder
4.2.3 Fine-Tuning the Transformer Decoder
4.3 BERT: Bidirectional Encoder Representations from Transformer
4.3.1 BERT: Pretraining
4.3.2 BERT: Fine-Tuning
4.3.3 XLNet: Generalized Autoregressive Pretraining
4.3.4 UniLM
4.4 Further Reading

5 Text Classification
5.1 The Traditional Framework of Text Classification
5.2 Feature Selection
5.2.1 Mutual Information
5.2.2 Information Gain
5.2.3 The Chi-Squared Test Method
5.2.4 Other Methods
5.3 Traditional Machine Learning Algorithms for Text Classification
5.3.1 Naïve Bayes
5.3.2 Logistic/Softmax and Maximum Entropy
5.3.3 Support Vector Machine
5.3.4 Ensemble Methods
5.4 Deep Learning Methods
5.4.1 Multilayer Feed-Forward Neural Network
5.4.2 Convolutional Neural Network
5.4.3 Recurrent Neural Network
5.5 Evaluation of Text Classification
5.6 Further Reading

6 Text Clustering
6.1 Text Similarity Measures
6.1.1 The Similarity Between Documents
6.1.2 The Similarity Between Clusters
6.2 Text Clustering Algorithms
6.2.1 K-Means Clustering
6.2.2 Single-Pass Clustering
6.2.3 Hierarchical Clustering
6.2.4 Density-Based Clustering
6.3 Evaluation of Clustering
6.3.1 External Criteria
6.3.2 Internal Criteria
6.4 Further Reading

7 Topic Model
7.1 The History of Topic Modeling
7.2 Latent Semantic Analysis
7.2.1 Singular Value Decomposition of the Term-by-Document Matrix
7.2.2 Conceptual Representation and Similarity Computation
7.3 Probabilistic Latent Semantic Analysis
7.3.1 Model Hypothesis
7.3.2 Parameter Learning
7.4 Latent Dirichlet Allocation
7.4.1 Model Hypothesis
7.4.2 Joint Probability
7.4.3 Inference in LDA
7.4.4 Inference for New Documents
7.5 Further Reading

8 Sentiment Analysis and Opinion Mining
8.1 History of Sentiment Analysis and Opinion Mining
8.2 Categorization of Sentiment Analysis Tasks
8.2.1 Categorization According to Task Output
8.2.2 According to Analysis Granularity
8.3 Methods for Document/Sentence-Level Sentiment Analysis
8.3.1 Lexicon- and Rule-Based Methods
8.3.2 Traditional Machine Learning Methods
8.3.3 Deep Learning Methods
8.4 Word-Level Sentiment Analysis and Sentiment Lexicon Construction
8.4.1 Knowledgebase-Based Methods
8.4.2 Corpus-Based Methods
8.4.3 Evaluation of Sentiment Lexicons
8.5 Aspect-Level Sentiment Analysis
8.5.1 Aspect Term Extraction
8.5.2 Aspect-Level Sentiment Classification
8.5.3 Generative Modeling of Topics and Sentiments
8.6 Special Issues in Sentiment Analysis
8.6.1 Sentiment Polarity Shift
8.6.2 Domain Adaptation
8.7 Further Reading

9 Topic Detection and Tracking
9.1 History of Topic Detection and Tracking
9.2 Terminology and Task Definition
9.2.1 Terminology
9.2.2 Task
9.3 Story/Topic Representation and Similarity Computation
9.4 Topic Detection
9.4.1 Online Topic Detection
9.4.2 Retrospective Topic Detection
9.5 Topic Tracking
9.6 Evaluation
9.7 Social Media Topic Detection and Tracking
9.7.1 Social Media Topic Detection
9.7.2 Social Media Topic Tracking
9.8 Bursty Topic Detection
9.8.1 Burst State Detection
9.8.2 Document-Pivot Methods
9.8.3 Feature-Pivot Methods
9.9 Further Reading

10 Information Extraction
10.1 Concepts and History
10.2 Named Entity Recognition
10.2.1 Rule-Based Named Entity Recognition
10.2.2 Supervised Named Entity Recognition Method
10.2.3 Semisupervised Named Entity Recognition Method
10.2.4 Evaluation of Named Entity Recognition Methods
10.3 Entity Disambiguation
10.3.1 Clustering-Based Entity Disambiguation Method
10.3.2 Linking-Based Entity Disambiguation
10.3.3 Evaluation of Entity Disambiguation
10.4 Relation Extraction
10.4.1 Relation Classification Using Discrete Features
10.4.2 Relation Classification Using Distributed Features
10.4.3 Relation Classification Based on Distant Supervision
10.4.4 Evaluation of Relation Classification
10.5 Event Extraction
10.5.1 Event Description Template
10.5.2 Event Extraction Method
10.5.3 Evaluation of Event Extraction
10.6 Further Reading

11 Automatic Text Summarization
11.1 Main Tasks in Text Summarization
11.2 Extraction-Based Summarization
11.2.1 Sentence Importance Estimation
11.2.2 Constraint-Based Summarization Algorithms
11.3 Compression-Based Automatic Summarization
11.3.1 Sentence Compression Method
11.3.2 Automatic Summarization Based on Sentence Compression
11.4 Abstractive Automatic Summarization
11.4.1 Abstractive Summarization Based on Information Fusion
11.4.2 Abstractive Summarization Based on the Encoder-Decoder Framework
11.5 Query-Based Automatic Summarization
11.5.1 Relevance Calculation Based on the Language Model
11.5.2 Relevance Calculation Based on Keyword Co-occurrence
11.5.3 Graph-Based Relevance Calculation Method
11.6 Crosslingual and Multilingual Automatic Summarization
11.6.1 Crosslingual Automatic Summarization
11.6.2 Multilingual Automatic Summarization
11.7 Summary Quality Evaluation and Evaluation Workshops
11.7.1 Summary Quality Evaluation Methods
11.7.2 Evaluation Workshops
11.8 Further Reading

References

Text Data Mining (English Edition): About the Author

Chengqing Zong is a professor at the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences. He has served as a chair for many prestigious conferences, such as ACL-IJCNLP, IJCAI, IJCAI-ECAI, AAAI, and COLING, and as an associate editor for prestigious journals such as TALLIP and Machine Translation. He is the President of the Asian Federation of Natural Language Processing and a member of the International Committee on Computational Linguistics.
