Semantic analysis of Twitter content
thesisposted on 23.05.2021, 17:07 by Yue Feng
Semantic analysis is the process of shifting the understanding of text from the levels of phrases, clauses, sentences to the level of semantic meanings. Two of the most important semantic analysis tasks include 1) semantic relatedness measurement and 2) entity linking. The semantic relatedness measurement task aims to quantitatively identify the relationships between two words or concepts based on the similarity or closeness of their semantic meaning whereas the entity linking task focuses on linking plain text to structured knowledge resources, e.g. Wikipedia to provide semantic annotation of texts. A limitation of current semantic analysis approaches is that they are built upon traditional documents which are well structured in formal English, e.g. news; however, with the emergence of social networks, enormous volumes of information can be extracted from the posts on social networks, which are short, grammatically incorrect and can contain special characters or newly invented words, e.g. LOL, BRB. Therefore, traditional semantic analysis approaches may not perform well for analysing social network posts. In this thesis, we build semantic analysis techniques particularly for Twitter content. We build a semantic relatedness model to calculate semantic relatedness between any two words obtained from tweets and by using the proposed semantic relatedness model, we semantically annotate tweets by linking them to Wikipedia entries. We compare our work with state-of-the-art semantic relatedness and entity linking methods that show promising results.