Multiple evidence combination for web site search using server log analysis
Thesis posted on 23.05.2021, 18:56 by Jin Zhou
This thesis proposes a novel method that uses web server logs to improve retrieval performance in web site search. Log entries are grouped into sessions; for each page in a session, terms are extracted and their weights are calculated. Once the entire log has been processed, this yields a new representation of each web page from the user's perspective. The new representation and the anchor-based representation are combined with the original text-based representation. Two combination methods are investigated: combination of document representations and combination of ranking scores. In the experiments, three measures are used to evaluate performance. The results show that the highest improvement in top-10 precision is around 38% for the Cosine Similarity model; the highest improvements for the Okapi, TFIDF, and Indri models are around 13%, 48%, and 17%, respectively.
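As an illustration of the second combination method (combination of ranking scores), the following is a minimal sketch, not the thesis's actual implementation. It assumes that each representation (text-based, log-based, anchor-based) has already produced a score per page for a query, that scores are min-max normalized before mixing, and that they are combined as a weighted sum. The function names, the normalization step, and the weights are all hypothetical choices made for this example; the abstract does not specify them.

```python
from typing import Dict


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 0.0 (assumed convention)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine_scores(
    score_sets: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Weighted sum of normalized scores from several representations.

    A page missing from one representation simply contributes nothing
    for that representation (an assumption of this sketch).
    """
    combined: Dict[str, float] = {}
    for name, scores in score_sets.items():
        w = weights.get(name, 0.0)
        for doc, s in min_max_normalize(scores).items():
            combined[doc] = combined.get(doc, 0.0) + w * s
    return combined


# Hypothetical scores for one query under three representations.
score_sets = {
    "text": {"a.html": 2.0, "b.html": 1.0, "c.html": 0.0},
    "log": {"a.html": 0.0, "b.html": 4.0},
    "anchor": {"a.html": 0.0, "c.html": 1.0},
}
weights = {"text": 0.5, "log": 0.3, "anchor": 0.2}  # illustrative only

combined = combine_scores(score_sets, weights)
ranking = sorted(combined, key=combined.get, reverse=True)
```

In this toy example the log-based evidence promotes `b.html` above `a.html`, even though `a.html` has the highest text-based score. In practice the weights would have to be tuned per retrieval model.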