|
Topic Word Extraction Method Based on the LDA Model
Abstract
|
A Latent Dirichlet Allocation (LDA) model is used to represent the probability distribution of the words in a text, and keywords that reflect its topics are extracted according to their Shannon information. Clustering of background words and topic word association extend the topic words beyond the text under analysis, in an attempt to uncover the text's topical meaning. The model is fitted with a fast Gibbs sampling algorithm. Experimental results show that the fast Gibbs algorithm is about 5 times faster than the conventional Gibbs algorithm, and that both accuracy and extraction efficiency are high. |
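The pipeline summarized in the abstract (fit LDA by Gibbs sampling, then rank words by Shannon information) can be illustrated with a minimal sketch. This is not the paper's implementation: it uses the standard collapsed Gibbs sampler rather than the fast variant, and the function names `lda_gibbs` and `shannon_keywords`, the hyperparameter defaults, and the scoring rule (word count times self-information under the fitted model) are all assumptions made for illustration. The background-word clustering and topic word association steps are not covered.

```python
import math
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with standard collapsed Gibbs sampling (not the fast variant).

    docs is a list of token lists. Returns (vocab, phi, theta), where
    phi[k][v] estimates p(word v | topic k) and theta[d][k] estimates
    p(topic k | document d). Hyperparameter defaults are assumptions.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w2id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ndk = [[0] * n_topics for _ in docs]      # topic counts per document
    nkw = [[0] * V for _ in range(n_topics)]  # word counts per topic
    nk = [0] * n_topics                       # total tokens per topic
    z = []                                    # topic assignment of each token
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)       # random initial assignment
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][w2id[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = w2id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                # Full conditional p(z = t | all other assignments).
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][v] + beta) / (nk[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    phi = [[(nkw[k][v] + beta) / (nk[k] + V * beta) for v in range(V)]
           for k in range(n_topics)]
    theta = [[(ndk[d][k] + alpha) / (len(doc) + n_topics * alpha)
              for k in range(n_topics)]
             for d, doc in enumerate(docs)]
    return vocab, phi, theta


def shannon_keywords(d, docs, vocab, phi, theta, top_n=3):
    """Rank the words of document d by count * self-information.

    The score n(w, d) * -log2 p(w | d) is an assumed scoring rule,
    not necessarily the one used in the paper.
    """
    w2id = {w: i for i, w in enumerate(vocab)}
    scores = {}
    for w, n in Counter(docs[d]).items():
        # p(w | d) under the fitted model: mixture over topics.
        p = sum(theta[d][k] * phi[k][w2id[w]] for k in range(len(phi)))
        scores[w] = n * -math.log2(p)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

A call such as `vocab, phi, theta = lda_gibbs(docs, n_topics=2)` followed by `shannon_keywords(0, docs, vocab, phi, theta)` returns the highest-scoring words of the first document. The sampler is deterministic for a fixed `seed`, but which topic index captures which theme is arbitrary.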
Source
|
Computer Engineering (计算机工程), 2010, 36(19): 81-83 [Core Collection]
|
Keywords
|
LDA model; Gibbs sampling; topic word extraction
|
Address
|
School of Computer Science and Engineering, Changchun University of Technology, Changchun, 130012
|
Language
|
Chinese |
Document type
|
Research article |
ISSN
|
1000-3428 |
Subject
|
Automation technology, computer technology |
Funding
|
Supported by the Doctoral Fund of Changchun University of Technology
|
Accession number
|
CSCD:4035384
|
References (9 in total)
|
1.
Blei D M. Latent Dirichlet Allocation.
Journal of Machine Learning Research, 2003, 3: 993-1022
|
Cited in CSCD: 1337 times
|
2.
Li Wenbo. Text Classification Based on Labeled-LDA Model.
Chinese Journal of Computers, 2008, 31(4): 620-627
|
Cited in CSCD: 5 times
|
3.
Cao J. LDA-based Retrieval Framework for Semantic News Video Retrieval.
Proc. of Conf. on Semantic Computing, 2007
|
Cited in CSCD: 1 time
|
4.
Steyvers M. Probabilistic Topic Models.
Latent Semantic Analysis: A Road to Meaning, 2006
|
Cited in CSCD: 4 times
|
5.
Griffiths T. Finding Scientific Topics.
Proceedings of the National Academy of Sciences, 2004, 101(Suppl. 1): 5228-5235
|
Cited in CSCD: 225 times
|
6.
Shi Jing. Text Segmentation Based on Model LDA.
Chinese Journal of Computers, 2008, 31(10): 1865-1873
|
Cited in CSCD: 1 time
|
7.
Nevada L V. Fast Collapsed Gibbs Sampling for Latent Dirichlet Allocation.
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008: 569-577
|
Cited in CSCD: 1 time
|
8.
Li Hang. Topic Analysis Using a Finite Mixture Model.
Information Processing & Management, 2003, 39(4): 521-541
|
Cited in CSCD: 10 times
|
9.
Liu Ying. Comparison of Two Schemes for Automatic Keyword Extraction from MEDLINE for Functional Gene Clustering.
Proc. of IEEE Computational Systems Bioinformatics Conference, 2004: 394-404
|
Cited in CSCD: 1 time
|