GENERATING ARTIFICIAL DATA FOR SCALABILITY TEST OF A FOLLOWEE TWITTER RECOMMENDER SYSTEM
Thesis posted on 23.05.2021, 13:03 by Jason Li
Current methods for evaluating the scalability of recommender systems measure the whole application after it has been deployed on a cloud, by recording the application's running time as the number of nodes increases. This approach requires the complete application to be developed and implemented first. Testing the scalability and accuracy of a recommender system during the development phase is hindered mainly by the need to collect real data (i.e., social data), which is time-consuming and sometimes impossible because of privacy concerns. This thesis proposes measuring the scalability of Twitter recommender systems by simulating the software as it processes a large number of artificial tweets. A method for producing artificial tweets to test a recommender system is introduced and validated. The method is based on analytical modeling, tf-idf, and the bag-of-words model. A simulator is developed to test the scalability of a recommender system and its underlying distributed environment.
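As a rough illustration of how tf-idf and a bag-of-words representation could drive artificial tweet generation, the sketch below samples words from a seed document in proportion to their tf-idf weights. It is not the method developed in the thesis, whose details are not given in this abstract. The seed corpus, the function names `tfidf_weights` and `generate_tweet`, the smoothed idf formula, and the tweet length are all assumptions chosen for the example.

```python
import math
import random
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: tf-idf weight} dict per tokenized document.

    Assumption: a smoothed idf, log((1 + n) / (1 + df)) + 1, is used so that
    every word keeps a strictly positive weight.
    """
    n = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)  # bag-of-words: word order is discarded
        total = len(doc)
        weights.append({
            word: (count / total) * (math.log((1 + n) / (1 + df[word])) + 1.0)
            for word, count in counts.items()
        })
    return weights


def generate_tweet(weights, length, rng):
    """Sample `length` words (with replacement) in proportion to their weights."""
    words = sorted(weights)  # fixed order keeps the sampling reproducible
    return " ".join(rng.choices(words, weights=[weights[w] for w in words], k=length))


# Hypothetical seed corpus standing in for a handful of real tweets.
seed_corpus = [
    "cloud nodes scale recommender".split(),
    "tweets followee recommender twitter".split(),
    "privacy social data twitter".split(),
]

rng = random.Random(0)  # seeded so repeated runs produce the same tweets
per_doc = tfidf_weights(seed_corpus)
artificial = [
    generate_tweet(per_doc[i % len(per_doc)], 5, rng)
    for i in range(6)
]
```

Increasing the range in the final comprehension yields an arbitrarily large set of artificial tweets, which is the kind of input a scalability simulation would consume. Whether such tweets are realistic enough to stand in for real data is exactly what a validation step would need to establish.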