Domain Based Ontology and Automated Text Categorization Based on Improved Term Frequency – Inverse Document Frequency

Sukanya Ray, Nidhi Chandra

Index Terms

Term Frequency – Inverse Document Frequency, Ontology, Dependency Graph, Text Categorization


In recent years there has been massive growth in textual information, especially on the internet. People now tend to read more e-books than hard copies of books. When searching the internet for a topic, especially a new one, it is easier if the reader knows the prerequisites and post-requisites of that topic. Documents on a topic are often found without a proper title, and it later becomes difficult to tell which document covers which topic. A text categorization method can provide a solution to this problem. In this paper, a domain-based ontology is created so that users can relate the different topics of a domain, and an automated text categorization technique is proposed to categorize the uncategorized documents. The proposed idea is based on the Term Frequency – Inverse Document Frequency (tf-idf) method, and a dependency graph is also provided in the domain-based ontology so that users can visualize the relations among the terms.
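
For readers unfamiliar with tf-idf, the following is a minimal sketch of the classic weighting (term frequency multiplied by the logarithm of inverse document frequency), followed by a simple scoring step that assigns a document to a category. It is not the improved variant proposed in the paper, whose details are not given in this abstract. The function names `tf_idf` and `categorize`, the category term sets, and the sample documents are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (classic tf-idf)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # tf = relative frequency in this document; idf = log(N / df).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def categorize(doc_weights, category_terms):
    """Pick the category whose (hypothetical) term set has the largest summed weight."""
    scores = {
        category: sum(doc_weights.get(term, 0.0) for term in terms)
        for category, terms in category_terms.items()
    }
    return max(scores, key=scores.get)


# Hypothetical, already tokenized documents from a data-structures domain.
docs = [
    ["stack", "queue", "stack"],
    ["tree", "graph", "tree"],
    ["stack", "tree"],
]
# Hypothetical category vocabularies; a real system would derive these from the ontology.
category_terms = {
    "linear": {"stack", "queue"},
    "nonlinear": {"tree", "graph"},
}

weights = tf_idf(docs)
label = categorize(weights[0], category_terms)
```

In this sketch a term that occurs in every document receives a weight of zero, which is the usual reason tf-idf favours terms that distinguish one document from the rest of the collection.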

Sukanya Ray, Nidhi Chandra, "Domain Based Ontology and Automated Text Categorization Based on Improved Term Frequency – Inverse Document Frequency", IJMECS, vol. 4, no. 4, pp. 28-35, 2012.


