Text Data Mining (English Edition)

Publisher: Tsinghua University Press
Publication date: 2021-10-01
Format: other
Pages: 372
Zhongtu price: ¥66.4 (56% of list price)
List price: ¥119.0

Text Data Mining (English Edition) Highlights

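Text Data Mining (English Edition) addresses the practical needs of text mining tasks. It uses examples to explain, from first principles, the theory and the implementation algorithms behind the relevant techniques. The writing aims to be concise and accessible without going too far into implementation details, so that readers who fully understand the basic principles can go on to master how application systems are implemented. It is suitable for students, researchers, and practitioners interested in text data mining, both as a learning text and as a reference book. Professors can readily use it for classes on text data mining or NLP.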

Text Data Mining (English Edition) Synopsis

Text Data Mining offers a thorough and detailed introduction to the fundamental theories and methods of text data mining, ranging from preprocessing (for both Chinese and English texts), text representation, and feature selection to text classification and text clustering. It also presents the predominant applications of text data mining, such as topic modeling, sentiment analysis and opinion mining, topic detection and tracking, information extraction, and automatic text summarization.

Text Data Mining (English Edition) Table of Contents

1 Introduction
  1.1 The Basic Concepts
  1.2 Main Tasks of Text Data Mining
  1.3 Existing Challenges in Text Data Mining
  1.4 Overview and Organization of This Book
  1.5 Further Reading
2 Data Annotation and Preprocessing
  2.1 Data Acquisition
  2.2 Data Preprocessing
  2.3 Data Annotation
  2.4 Basic Tools of NLP
    2.4.1 Tokenization and POS Tagging
    2.4.2 Syntactic Parser
    2.4.3 N-gram Language Model
  2.5 Further Reading
3 Text Representation
  3.1 Vector Space Model
    3.1.1 Basic Concepts
    3.1.2 Vector Space Construction
    3.1.3 Text Length Normalization
    3.1.4 Feature Engineering
    3.1.5 Other Text Representation Methods
  3.2 Distributed Representation of Words
    3.2.1 Neural Network Language Model
    3.2.2 C&W Model
    3.2.3 CBOW and Skip-Gram Model
    3.2.4 Noise Contrastive Estimation and Negative Sampling
    3.2.5 Distributed Representation Based on the Hybrid Character-Word Method
  3.3 Distributed Representation of Phrases
    3.3.1 Distributed Representation Based on the Bag-of-Words Model
    3.3.2 Distributed Representation Based on Autoencoder
  3.4 Distributed Representation of Sentences
    3.4.1 General Sentence Representation
    3.4.2 Task-Oriented Sentence Representation
  3.5 Distributed Representation of Documents
    3.5.1 General Distributed Representation of Documents
    3.5.2 Task-Oriented Distributed Representation of Documents
  3.6 Further Reading
4 Text Representation with Pretraining and Fine-Tuning
  4.1 ELMo: Embeddings from Language Models
    4.1.1 Pretraining Bidirectional LSTM Language Models
    4.1.2 Contextualized ELMo Embeddings for Downstream Tasks
  4.2 GPT: Generative Pretraining
    4.2.1 Transformer
    4.2.2 Pretraining the Transformer Decoder
    4.2.3 Fine-Tuning the Transformer Decoder
  4.3 BERT: Bidirectional Encoder Representations from Transformer
    4.3.1 BERT: Pretraining
    4.3.2 BERT: Fine-Tuning
    4.3.3 XLNet: Generalized Autoregressive Pretraining
    4.3.4 UniLM
  4.4 Further Reading
5 Text Classification
  5.1 The Traditional Framework of Text Classification
  5.2 Feature Selection
    5.2.1 Mutual Information
    5.2.2 Information Gain
    5.2.3 The Chi-Squared Test Method
    5.2.4 Other Methods
  5.3 Traditional Machine Learning Algorithms for Text Classification
    5.3.1 Naïve Bayes
    5.3.2 Logistic/Softmax and Maximum Entropy
    5.3.3 Support Vector Machine
    5.3.4 Ensemble Methods
  5.4 Deep Learning Methods
    5.4.1 Multilayer Feed-Forward Neural Network
    5.4.2 Convolutional Neural Network
    5.4.3 Recurrent Neural Network
  5.5 Evaluation of Text Classification
  5.6 Further Reading
6 Text Clustering
  6.1 Text Similarity Measures
    6.1.1 The Similarity Between Documents
    6.1.2 The Similarity Between Clusters
  6.2 Text Clustering Algorithms
    6.2.1 K-Means Clustering
    6.2.2 Single-Pass Clustering
    6.2.3 Hierarchical Clustering
    6.2.4 Density-Based Clustering
  6.3 Evaluation of Clustering
    6.3.1 External Criteria
    6.3.2 Internal Criteria
  6.4 Further Reading
7 Topic Model
  7.1 The History of Topic Modeling
  7.2 Latent Semantic Analysis
    7.2.1 Singular Value Decomposition of the Term-by-Document Matrix
    7.2.2 Conceptual Representation and Similarity Computation
  7.3 Probabilistic Latent Semantic Analysis
    7.3.1 Model Hypothesis
    7.3.2 Parameter Learning
  7.4 Latent Dirichlet Allocation
    7.4.1 Model Hypothesis
    7.4.2 Joint Probability
    7.4.3 Inference in LDA
    7.4.4 Inference for New Documents
  7.5 Further Reading
8 Sentiment Analysis and Opinion Mining
  8.1 History of Sentiment Analysis and Opinion Mining
  8.2 Categorization of Sentiment Analysis Tasks
    8.2.1 Categorization According to Task Output
    8.2.2 According to Analysis Granularity
  8.3 Methods for Document/Sentence-Level Sentiment Analysis
    8.3.1 Lexicon- and Rule-Based Methods
    8.3.2 Traditional Machine Learning Methods
    8.3.3 Deep Learning Methods
  8.4 Word-Level Sentiment Analysis and Sentiment Lexicon Construction
    8.4.1 Knowledgebase-Based Methods
    8.4.2 Corpus-Based Methods
    8.4.3 Evaluation of Sentiment Lexicons
  8.5 Aspect-Level Sentiment Analysis
    8.5.1 Aspect Term Extraction
    8.5.2 Aspect-Level Sentiment Classification
    8.5.3 Generative Modeling of Topics and Sentiments
  8.6 Special Issues in Sentiment Analysis
    8.6.1 Sentiment Polarity Shift
    8.6.2 Domain Adaptation
  8.7 Further Reading
9 Topic Detection and Tracking
  9.1 History of Topic Detection and Tracking
  9.2 Terminology and Task Definition
    9.2.1 Terminology
    9.2.2 Task
  9.3 Story/Topic Representation and Similarity Computation
  9.4 Topic Detection
    9.4.1 Online Topic Detection
    9.4.2 Retrospective Topic Detection
  9.5 Topic Tracking
  9.6 Evaluation
  9.7 Social Media Topic Detection and Tracking
    9.7.1 Social Media Topic Detection
    9.7.2 Social Media Topic Tracking
  9.8 Bursty Topic Detection
    9.8.1 Burst State Detection
    9.8.2 Document-Pivot Methods
    9.8.3 Feature-Pivot Methods
  9.9 Further Reading
10 Information Extraction
  10.1 Concepts and History
  10.2 Named Entity Recognition
    10.2.1 Rule-based Named Entity Recognition
    10.2.2 Supervised Named Entity Recognition Method
    10.2.3 Semisupervised Named Entity Recognition Method
    10.2.4 Evaluation of Named Entity Recognition Methods
  10.3 Entity Disambiguation
    10.3.1 Clustering-Based Entity Disambiguation Method
    10.3.2 Linking-Based Entity Disambiguation
    10.3.3 Evaluation of Entity Disambiguation
  10.4 Relation Extraction
    10.4.1 Relation Classification Using Discrete Features
    10.4.2 Relation Classification Using Distributed Features
    10.4.3 Relation Classification Based on Distant Supervision
    10.4.4 Evaluation of Relation Classification
  10.5 Event Extraction
    10.5.1 Event Description Template
    10.5.2 Event Extraction Method
    10.5.3 Evaluation of Event Extraction
  10.6 Further Reading
11 Automatic Text Summarization
  11.1 Main Tasks in Text Summarization
  11.2 Extraction-Based Summarization
    11.2.1 Sentence Importance Estimation
    11.2.2 Constraint-Based Summarization Algorithms
  11.3 Compression-Based Automatic Summarization
    11.3.1 Sentence Compression Method
    11.3.2 Automatic Summarization Based on Sentence Compression
  11.4 Abstractive Automatic Summarization
    11.4.1 Abstractive Summarization Based on Information Fusion
    11.4.2 Abstractive Summarization Based on the Encoder-Decoder Framework
  11.5 Query-Based Automatic Summarization
    11.5.1 Relevance Calculation Based on the Language Model
    11.5.2 Relevance Calculation Based on Keyword Co-occurrence
    11.5.3 Graph-Based Relevance Calculation Method
  11.6 Crosslingual and Multilingual Automatic Summarization
    11.6.1 Crosslingual Automatic Summarization
    11.6.2 Multilingual Automatic Summarization
  11.7 Summary Quality Evaluation and Evaluation Workshops
    11.7.1 Summary Quality Evaluation Methods
    11.7.2 Evaluation Workshops
  11.8 Further Reading
References

Text Data Mining (English Edition) About the Author

Chengqing Zong is a professor at the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences. He has served as a chair for many prestigious conferences, such as ACL-IJCNLP, IJCAI, IJCAI-ECAI, AAAI, and COLING, and as an associate editor for prestigious journals such as TALLIP and Machine Translation. He is the President of the Asian Federation of Natural Language Processing and a member of the International Committee on Computational Linguistics.
