INTERNATIONAL JOURNAL OF SCIENTIFIC DEVELOPMENT AND RESEARCH International Peer Reviewed & Refereed Journals, Open Access Journal ISSN Approved Journal No: 2455-2631 | Impact factor: 8.15 | ESTD Year: 2016
Web Document Clustering Approach Base on Improvise Fuzzy Clustering using Cosine Similarity and Name Entity Recognition Method
Authors Name:
Kalyani Ramesh Pole, Vishakha R. Mote
Unique Id:
IJSDR1712014
Published In:
Volume 2 Issue 12, December-2017
Abstract:
Recent advances in computing have produced a rapidly growing body of documents, creating a need to classify document collections by type. Grouping related documents together makes decision-making easier. Researchers conducting interdisciplinary research accumulate repositories on different topics, and classifying these repositories by theme is necessary for analyzing the research work. The experiments are run on different sets of real and artificial data, such as NEWS 20, Reuters, emails, and research on different topics. The term frequency-inverse document frequency (TF-IDF) algorithm is used together with the fuzzy hierarchical algorithm and K-means. Initially, the experiments were carried out on small data sets and cluster analysis was performed; the best-performing algorithm was then applied to the extended data set. Together with the different groups of related documents, the resulting coefficient and the trend of the F-measure are presented to show the behavior of the algorithm for each data set. Our model combines two components: a mixture component used to discover latent groups in the document collection, and a topic model component used to mine multi-grain topics, including cluster-specific local topics and global topics shared between clusters. We use variational inference to approximate the posterior of the hidden variables and to learn the model parameters. Experiments on two data sets demonstrate the effectiveness of our model.
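As an illustration of the TF-IDF weighting and the cosine similarity named in the title, the following is a minimal sketch, not the authors' implementation. The whitespace tokenizer, the unsmoothed IDF formula, the function names `tfidf_vectors` and `cosine_similarity`, and the three toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document.

    Assumptions: lowercase whitespace tokenization and the plain
    idf = log(N / df) formula, with no smoothing.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Toy corpus (hypothetical): two finance documents and one sports document.
docs = [
    "stock market prices rise",
    "stock market shares fall",
    "football team wins match",
]
vecs = tfidf_vectors(docs)
sim_finance = cosine_similarity(vecs[0], vecs[1])
sim_mixed = cosine_similarity(vecs[0], vecs[2])
```

In this toy corpus the two finance documents share terms and so receive a positive similarity, while the finance and sports documents share none and receive zero. A clustering step such as K-means or a fuzzy method would then group documents using these pairwise scores or the vectors themselves.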
Keywords:
collaborative filtering and information filtering; Web content; linguistic topological space; Named Entity Recognition; Natural Language Processing
Cite Article:
"Web Document Clustering Approach Base on Improvise Fuzzy Clustering using Cosine Similarity and Name Entity Recognition Method", International Journal of Science & Engineering Development Research (www.ijsdr.org), ISSN:2455-2631, Vol.2, Issue 12, page no.93 - 101, December-2017, Available :http://www.ijsdr.org/papers/IJSDR1712014.pdf
Publication Details:
Published Paper ID: IJSDR1712014
Registration ID:170875
Published In: Volume 2 Issue 12, December-2017
Page No: 93 - 101
Publisher: IJSDR | www.ijsdr.org
ISSN Number: 2455-2631