
A Study of Sentiment Analysis Methods for Chinese Microblog Product Reviews

Time: 2017-12-03 11:05:16 Editor: 知网查重入口 www.cnkiid.cn

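Consumer satisfaction can be gauged through questionnaires and telephone follow-ups, but such sample surveys cannot represent the satisfaction of all consumers, because the data collected are not themselves representative. Social networking platforms have large user bases and store a great deal of information about user experience, and microblog posts cut across limits of time, place, age, and occupation, so they reflect users' interests and feelings comparatively well. Extracting users' sentiment and preferences from their microblog reviews and identifying their points of interest can therefore give merchants a basis for discovering potential customer groups, which makes product marketing and promotion easier to target.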

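Microblog text differs considerably from traditional text: it is short, terse, and colloquial, it follows no strict grammar, and it may contain ill-formed sentences. With these characteristics in mind, this thesis adapts traditional text analysis methods according to the similarities and differences between the two kinds of text, and studies sentiment analysis methods suited to Chinese microblog product reviews in order to improve recognition accuracy.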

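The work in this thesis covers sentiment polarity classification, classification algorithms, and the retrieval of sentiment information. The main research contributions are as follows: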

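A sentiment analysis model for microblog product reviews is built on syntactic dependency relations and text classification. The performance of the random forest and support vector machine classifiers is compared, a high-confidence training sample set is selected, and microblog emoticons are chosen and used to label the training samples;

A sentiment analysis algorithm that combines syntactic dependency relations with text classification is proposed; in particular, the random forest algorithm is applied to text classification. Experiments cover preprocessing, extraction of opinion words and opinion targets, and sentiment polarity determination, so as to analyze the sentiment contained in microblog product review text;

A sentiment analysis system for Chinese microblog product reviews is designed. The thesis describes its design and development, including the design and implementation of the data collection, data processing, and data presentation subsystems. Finally, the system is tested, covering the construction of the test environment and a comparative analysis of the test results, to determine which sentiment analysis model for microblog product reviews is actually effective.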
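The emoticon-based labeling of training samples described in the first item might look roughly like the sketch below. It is only an illustration under stated assumptions, not the thesis's implementation: the emoticon lists, the `min_margin` threshold, and the function names are all hypothetical, since the abstract does not say which emoticons were used or how confidence was measured. It also assumes emoticons appear in the raw text as bracketed tokens such as `[赞]`.

```python
# Hypothetical emoticon lexicons; the thesis does not list the ones it used.
POSITIVE_EMOTICONS = {"[哈哈]", "[赞]", "[爱你]"}
NEGATIVE_EMOTICONS = {"[怒]", "[泪]", "[衰]"}


def label_by_emoticons(text, min_margin=1):
    """Return 'positive', 'negative', or None when the emoticon signal is weak or mixed."""
    pos = sum(text.count(e) for e in POSITIVE_EMOTICONS)
    neg = sum(text.count(e) for e in NEGATIVE_EMOTICONS)
    if pos - neg >= min_margin:
        return "positive"
    if neg - pos >= min_margin:
        return "negative"
    return None  # no emoticons, or conflicting ones: too uncertain to use


def build_training_set(posts, min_margin=1):
    """Keep only the posts whose emoticons yield a label (assumed 'high confidence')."""
    labeled = []
    for post in posts:
        label = label_by_emoticons(post, min_margin)
        if label is not None:
            labeled.append((post, label))
    return labeled


if __name__ == "__main__":
    sample_posts = [
        "这手机太好用了[赞][哈哈]",
        "屏幕碎了[泪]",
        "刚买了新耳机",      # no emoticon, dropped
        "还行吧[赞][怒]",    # mixed signal, dropped
    ]
    for post, label in build_training_set(sample_posts):
        print(label, post)
```

Raising `min_margin` trades training-set size for label confidence; posts with no emoticons or with conflicting ones are simply left out of the training set.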

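The microblog sentiment analysis algorithm discussed in this thesis is built on Chinese lexical and syntactic analysis combined with text classification. In testing, the resulting sentiment analysis system reached 88% accuracy in distinguishing subjective from objective sentences and 71% accuracy in identifying sentiment polarity.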

 

 

Keywords: microblog, product reviews, sentiment analysis

 

A STUDY OF SENTIMENT ANALYSIS METHODS BASED ON CHINESE MICROBLOGGING PRODUCT REVIEWS

 

ABSTRACT

 

Customer satisfaction can be measured through questionnaires and telephone follow-ups; however, such sample surveys cannot represent the satisfaction of all customers, because the sampled data are not representative. Social platforms have large user bases and store large quantities of user experience information. Moreover, microblog content cuts across limits of time, place, age, and occupation, and so reflects users' interests, hobbies, and feelings comparatively well. Identifying users' interests by extracting their sentiment and preferences from microblog comments therefore gives merchants a basis for discovering potential customer groups, which helps them target marketing and promotion.

Microblog text differs considerably from traditional text: it is short, terse, and colloquial. It also follows no strict grammar rules, so it may contain ill-formed sentences. Given these peculiarities, this thesis adapts traditional text analysis methods and studies sentiment analysis methods suited to Chinese microblog product reviews, in order to raise recognition accuracy.

The thesis covers sentiment polarity classification, classification algorithms, and the retrieval of sentiment information. The main work includes the following aspects:

A sentiment analysis model for microblog product reviews is built on syntactic dependency and text classification, and the performance of the random forest and support vector machine (SVM) classification algorithms is compared. High-confidence training samples are selected, and microblog emoticons are chosen and used to label the training samples.

A sentiment analysis algorithm combining syntactic dependency and text classification is proposed. In particular, the random forest algorithm is applied to text classification, and experiments are conducted covering preprocessing, extraction of opinion words and opinion targets, and determination of sentiment polarity, so as to analyze the sentiment in microblog product reviews.

A sentiment analysis system for Chinese microblog product reviews is designed. The thesis describes the design and development of the system, including the design and implementation of the data collection, data processing, and data presentation subsystems. Finally, the system is tested, which includes constructing the test environment and comparing the test results, to identify the sentiment analysis model that is actually effective for microblog product reviews.
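As a rough illustration of how opinion (comment) words and their targets could be pulled from a dependency parse, the sketch below pairs nouns with adjectives along two dependency relations and flips polarity under negation. Everything specific here is an assumption rather than the thesis's method: the parse is taken as already available and hand-written in the example, the relation labels `SBV`, `ATT`, and `ADV` follow the LTP-style convention, and the small polarity lexicon, negator list, and function names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    word: str
    pos: str   # coarse POS tag: "n" noun, "a" adjective, "d" adverb
    head: int  # index of the head token, -1 for the root
    rel: str   # dependency relation to the head


# Hypothetical toy lexicon and negators, for illustration only.
POLARITY = {"清晰": 1, "好": 1, "差": -1, "卡": -1}
NEGATORS = {"不", "没"}


def extract_pairs(tokens: List[Token]) -> List[Tuple[int, int]]:
    """Return (target_index, opinion_index) pairs found via SBV and ATT relations."""
    pairs = []
    for i, tok in enumerate(tokens):
        if tok.head < 0:
            continue
        head = tokens[tok.head]
        # Noun subject of an adjectival predicate, e.g. 屏幕 (SBV) 清晰
        if tok.rel == "SBV" and tok.pos == "n" and head.pos == "a":
            pairs.append((i, tok.head))
        # Adjective modifying a noun, e.g. 好 (ATT) 手机
        elif tok.rel == "ATT" and tok.pos == "a" and head.pos == "n":
            pairs.append((tok.head, i))
    return pairs


def score_opinion(tokens: List[Token], opinion_idx: int) -> int:
    """Lexicon polarity of the opinion word, flipped once per negating adverb child."""
    score = POLARITY.get(tokens[opinion_idx].word, 0)
    for tok in tokens:
        if tok.head == opinion_idx and tok.rel == "ADV" and tok.word in NEGATORS:
            score = -score
    return score


def analyze(tokens: List[Token]) -> List[Tuple[str, str, int]]:
    """Return (target, opinion word, polarity) triples for one parsed sentence."""
    return [
        (tokens[t].word, tokens[o].word, score_opinion(tokens, o))
        for t, o in extract_pairs(tokens)
    ]


if __name__ == "__main__":
    # Hand-written parse of "屏幕 不 清晰" ("the screen is not clear").
    sentence = [
        Token("屏幕", "n", 2, "SBV"),
        Token("不", "d", 2, "ADV"),
        Token("清晰", "a", -1, "HED"),
    ]
    print(analyze(sentence))
```

A rule-based step like this yields target-level polarity, which could then be combined with a trained text classifier, as the abstract describes at a high level.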

The microblog sentiment analysis algorithm in this thesis is built on Chinese lexical and syntactic analysis combined with text classification. In testing, the final sentiment analysis system reached 88% accuracy in distinguishing subjective from objective sentences and 71% accuracy in identifying sentiment polarity.

 

Keywords: Weibo, product reviews, sentiment analysis
