Automatic Acquisition and Semantic Annotation of Web Tourism Information
Abstract
Data collection and semantic annotation are often the basis of information processing tasks such as semantic relation analysis, big data mining, and semantic information search. This paper proposes a method that automatically collects data from tourism web sites and annotates the data with semantic tags. We first introduce the crawler that collects data from web sites automatically, and then the Chinese word segmentation tool and the classic keyword extraction algorithm TF/IDF. Using the crawler, we collected tourism information on 247 scenic spots in Beijing and 4198 scenic spots in other areas of China from the web sites of elong and ctrip. Using ICTCLAS and TF/IDF, we then extracted keywords from this information to serve as semantic tags for annotating the scenic spots.
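The keyword extraction step can be illustrated with a minimal sketch, which is not the implementation used in the paper. It assumes each scenic spot's description has already been segmented into tokens (here by plain whitespace splitting, as a stand-in for ICTCLAS), uses the common weighting tf = count / document length and idf = log(N / df), and runs on invented English sample data. The function name `tfidf_keywords` and the spot names are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k TF/IDF-ranked tokens for each document.

    docs maps a document name to its list of tokens (already segmented).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each token appears.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    result = {}
    for name, tokens in docs.items():
        if not tokens:
            result[name] = []
            continue
        tf = Counter(tokens)
        # Score = relative term frequency * log inverse document frequency.
        scores = {
            t: (c / len(tokens)) * math.log(n_docs / df[t])
            for t, c in tf.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result[name] = ranked[:top_k]
    return result


# Invented sample data; whitespace splitting stands in for word segmentation.
docs = {
    "spot_a": "palace emperor palace museum history".split(),
    "spot_b": "wall mountain wall hiking history".split(),
    "spot_c": "lake temple boat lake history".split(),
}
tags = tfidf_keywords(docs, top_k=1)
print(tags)
```

In this toy corpus the token "history" appears in every description, so its idf is zero and it is never selected as a tag, while a token repeated within a single description ranks first for that spot.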
Keywords
Tourism information, Word segmentation, Keywords, Semantics, Semantic annotation
DOI
10.12783/dtcse/cscbd2019/30026