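中图网 (Zhongtu Wang, formerly China Book Network): an online bookstore specializing in remainder stock, with 300,000 discounted titles priced from as low as 20% of list price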

Text Data Mining (English Edition)

Publisher: Tsinghua University Press
Publication date: 2021-10-01
Format: Other
Pages: 372
Zhongtu price: ¥77.4 (65% of list price)
List price: ¥119.0

Text Data Mining (English Edition): Book Features

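Text Data Mining (English Edition) addresses the practical needs of text mining tasks. It uses examples to explain, from first principles, the theoretical methods and implementation algorithms behind the relevant techniques. The writing aims to be concise and accessible without dwelling on implementation details, so that readers can learn how application systems are built on the basis of a solid understanding of the underlying principles. It is suitable for students, researchers, and practitioners interested in text data mining, both as a learning text and as a reference book. Professors can readily use it for classes on text data mining or NLP.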

Text Data Mining (English Edition): Synopsis

Text Data Mining offers a thorough and detailed introduction to the fundamental theories and methods of text data mining, ranging from preprocessing (for both Chinese and English texts), text representation, and feature selection to text classification and text clustering. It also presents the predominant applications of text data mining, such as topic models, sentiment analysis and opinion mining, topic detection and tracking, information extraction, and automatic text summarization.

Text Data Mining (English Edition): Table of Contents

1 Introduction
  1.1 The Basic Concepts
  1.2 Main Tasks of Text Data Mining
  1.3 Existing Challenges in Text Data Mining
  1.4 Overview and Organization of This Book
  1.5 Further Reading
2 Data Annotation and Preprocessing
  2.1 Data Acquisition
  2.2 Data Preprocessing
  2.3 Data Annotation
  2.4 Basic Tools of NLP
    2.4.1 Tokenization and POS Tagging
    2.4.2 Syntactic Parser
    2.4.3 N-gram Language Model
  2.5 Further Reading
3 Text Representation
  3.1 Vector Space Model
    3.1.1 Basic Concepts
    3.1.2 Vector Space Construction
    3.1.3 Text Length Normalization
    3.1.4 Feature Engineering
    3.1.5 Other Text Representation Methods
  3.2 Distributed Representation of Words
    3.2.1 Neural Network Language Model
    3.2.2 C&W Model
    3.2.3 CBOW and Skip-Gram Model
    3.2.4 Noise Contrastive Estimation and Negative Sampling
    3.2.5 Distributed Representation Based on the Hybrid Character-Word Method
  3.3 Distributed Representation of Phrases
    3.3.1 Distributed Representation Based on the Bag-of-Words Model
    3.3.2 Distributed Representation Based on Autoencoder
  3.4 Distributed Representation of Sentences
    3.4.1 General Sentence Representation
    3.4.2 Task-Oriented Sentence Representation
  3.5 Distributed Representation of Documents
    3.5.1 General Distributed Representation of Documents
    3.5.2 Task-Oriented Distributed Representation of Documents
  3.6 Further Reading
4 Text Representation with Pretraining and Fine-Tuning
  4.1 ELMo: Embeddings from Language Models
    4.1.1 Pretraining Bidirectional LSTM Language Models
    4.1.2 Contextualized ELMo Embeddings for Downstream Tasks
  4.2 GPT: Generative Pretraining
    4.2.1 Transformer
    4.2.2 Pretraining the Transformer Decoder
    4.2.3 Fine-Tuning the Transformer Decoder
  4.3 BERT: Bidirectional Encoder Representations from Transformer
    4.3.1 BERT: Pretraining
    4.3.2 BERT: Fine-Tuning
    4.3.3 XLNet: Generalized Autoregressive Pretraining
    4.3.4 UniLM
  4.4 Further Reading
5 Text Classification
  5.1 The Traditional Framework of Text Classification
  5.2 Feature Selection
    5.2.1 Mutual Information
    5.2.2 Information Gain
    5.2.3 The Chi-Squared Test Method
    5.2.4 Other Methods
  5.3 Traditional Machine Learning Algorithms for Text Classification
    5.3.1 Naïve Bayes
    5.3.2 Logistic/Softmax and Maximum Entropy
    5.3.3 Support Vector Machine
    5.3.4 Ensemble Methods
  5.4 Deep Learning Methods
    5.4.1 Multilayer Feed-Forward Neural Network
    5.4.2 Convolutional Neural Network
    5.4.3 Recurrent Neural Network
  5.5 Evaluation of Text Classification
  5.6 Further Reading
6 Text Clustering
  6.1 Text Similarity Measures
    6.1.1 The Similarity Between Documents
    6.1.2 The Similarity Between Clusters
  6.2 Text Clustering Algorithms
    6.2.1 K-Means Clustering
    6.2.2 Single-Pass Clustering
    6.2.3 Hierarchical Clustering
    6.2.4 Density-Based Clustering
  6.3 Evaluation of Clustering
    6.3.1 External Criteria
    6.3.2 Internal Criteria
  6.4 Further Reading
7 Topic Model
  7.1 The History of Topic Modeling
  7.2 Latent Semantic Analysis
    7.2.1 Singular Value Decomposition of the Term-by-Document Matrix
    7.2.2 Conceptual Representation and Similarity Computation
  7.3 Probabilistic Latent Semantic Analysis
    7.3.1 Model Hypothesis
    7.3.2 Parameter Learning
  7.4 Latent Dirichlet Allocation
    7.4.1 Model Hypothesis
    7.4.2 Joint Probability
    7.4.3 Inference in LDA
    7.4.4 Inference for New Documents
  7.5 Further Reading
8 Sentiment Analysis and Opinion Mining
  8.1 History of Sentiment Analysis and Opinion Mining
  8.2 Categorization of Sentiment Analysis Tasks
    8.2.1 Categorization According to Task Output
    8.2.2 According to Analysis Granularity
  8.3 Methods for Document/Sentence-Level Sentiment Analysis
    8.3.1 Lexicon- and Rule-Based Methods
    8.3.2 Traditional Machine Learning Methods
    8.3.3 Deep Learning Methods
  8.4 Word-Level Sentiment Analysis and Sentiment Lexicon Construction
    8.4.1 Knowledgebase-Based Methods
    8.4.2 Corpus-Based Methods
    8.4.3 Evaluation of Sentiment Lexicons
  8.5 Aspect-Level Sentiment Analysis
    8.5.1 Aspect Term Extraction
    8.5.2 Aspect-Level Sentiment Classification
    8.5.3 Generative Modeling of Topics and Sentiments
  8.6 Special Issues in Sentiment Analysis
    8.6.1 Sentiment Polarity Shift
    8.6.2 Domain Adaptation
  8.7 Further Reading
9 Topic Detection and Tracking
  9.1 History of Topic Detection and Tracking
  9.2 Terminology and Task Definition
    9.2.1 Terminology
    9.2.2 Task
  9.3 Story/Topic Representation and Similarity Computation
  9.4 Topic Detection
    9.4.1 Online Topic Detection
    9.4.2 Retrospective Topic Detection
  9.5 Topic Tracking
  9.6 Evaluation
  9.7 Social Media Topic Detection and Tracking
    9.7.1 Social Media Topic Detection
    9.7.2 Social Media Topic Tracking
  9.8 Bursty Topic Detection
    9.8.1 Burst State Detection
    9.8.2 Document-Pivot Methods
    9.8.3 Feature-Pivot Methods
  9.9 Further Reading
10 Information Extraction
  10.1 Concepts and History
  10.2 Named Entity Recognition
    10.2.1 Rule-based Named Entity Recognition
    10.2.2 Supervised Named Entity Recognition Method
    10.2.3 Semisupervised Named Entity Recognition Method
    10.2.4 Evaluation of Named Entity Recognition Methods
  10.3 Entity Disambiguation
    10.3.1 Clustering-Based Entity Disambiguation Method
    10.3.2 Linking-Based Entity Disambiguation
    10.3.3 Evaluation of Entity Disambiguation
  10.4 Relation Extraction
    10.4.1 Relation Classification Using Discrete Features
    10.4.2 Relation Classification Using Distributed Features
    10.4.3 Relation Classification Based on Distant Supervision
    10.4.4 Evaluation of Relation Classification
  10.5 Event Extraction
    10.5.1 Event Description Template
    10.5.2 Event Extraction Method
    10.5.3 Evaluation of Event Extraction
  10.6 Further Reading
11 Automatic Text Summarization
  11.1 Main Tasks in Text Summarization
  11.2 Extraction-Based Summarization
    11.2.1 Sentence Importance Estimation
    11.2.2 Constraint-Based Summarization Algorithms
  11.3 Compression-Based Automatic Summarization
    11.3.1 Sentence Compression Method
    11.3.2 Automatic Summarization Based on Sentence Compression
  11.4 Abstractive Automatic Summarization
    11.4.1 Abstractive Summarization Based on Information Fusion
    11.4.2 Abstractive Summarization Based on the Encoder-Decoder Framework
  11.5 Query-Based Automatic Summarization
    11.5.1 Relevance Calculation Based on the Language Model
    11.5.2 Relevance Calculation Based on Keyword Co-occurrence
    11.5.3 Graph-Based Relevance Calculation Method
  11.6 Crosslingual and Multilingual Automatic Summarization
    11.6.1 Crosslingual Automatic Summarization
    11.6.2 Multilingual Automatic Summarization
  11.7 Summary Quality Evaluation and Evaluation Workshops
    11.7.1 Summary Quality Evaluation Methods
    11.7.2 Evaluation Workshops
  11.8 Further Reading
References

Text Data Mining (English Edition): About the Author

Chengqing Zong is a professor at the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences. He has served as a chair for many prestigious conferences, such as ACL-IJCNLP, IJCAI, IJCAI-ECAI, AAAI, and COLING, and as an associate editor for prestigious journals such as TALLIP and Machine Translation. He is the President of the Asian Federation of Natural Language Processing and a member of the International Committee on Computational Linguistics.
