In this assignment, we have the information of UNCC’s followers on Twitter. We want to categorize users based on their number of followers and status counts. Here is the description of the variables in this data.
Variable                 Type         Description
user_id                  Numeric      User id, anonymized
location                 Text         Location of the user indicated in the profile
protected                Indicator    Is the account protected vs. open to the public
followers_count          Numeric      Number of followers
friends_count            Numeric      Number of friends (followed by the user)
listed_count             Numeric      Number of Twitter lists
statuses_count           Numeric      Number of tweets posted
favourites_count         Numeric      Number of other users’ tweets favorited by the user
verified                 Indicator    Is the account verified
account_created_at_year  Categorical  The year the account was opened
device                   Categorical  The device the user uses to access the account
has_website              Indicator    Does the profile description include a website
Step 2: Perform k-means cluster analysis with four clusters, using two variables: the number of followers (followers_count) and the number of statuses (statuses_count).
After performing the cluster analysis, store each user's cluster membership back in the original data.
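One way to approach this step is sketched below in Python with pandas and scikit-learn. The assignment does not prescribe a tool, so the library choice is an assumption, and the twelve rows are made-up toy values standing in for the real follower data. Only the column names followers_count and statuses_count come from the variable table; the new column name cluster is also an assumption.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy rows (NOT the real UNCC data): four loose groups of three users.
df = pd.DataFrame({
    "user_id": range(1, 13),
    "followers_count": [10, 25, 40, 900, 1100, 950,
                        50000, 52000, 51000, 30, 20, 50],
    "statuses_count": [5, 12, 30, 4000, 4500, 4200,
                       300, 800, 500, 15000, 16000, 15500],
})

# The two clustering variables named in the assignment.
features = df[["followers_count", "statuses_count"]]

# Four clusters, as the assignment asks. random_state fixes the seed so
# repeated runs give the same partition; n_init=10 restarts k-means ten
# times and keeps the solution with the lowest within-cluster sum of squares.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)

# Store the cluster membership back on the original data frame.
df["cluster"] = kmeans.fit_predict(features)

# Cluster sizes and centers, for a first look at the result.
print(df["cluster"].value_counts().sort_index())
print(pd.DataFrame(kmeans.cluster_centers_, columns=features.columns))
```

The cluster numbers themselves are arbitrary labels, so only the grouping is meaningful. K-means is also sensitive to scale: on real count data, which is usually heavily skewed, a few very large accounts can dominate the distances. Standardizing or log-transforming the two variables before clustering is a common option, but it is a modeling choice rather than something the assignment specifies.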
Based on the cluster membership, answer the following questions:
Plot the distribution of the account creation year across the clusters. Is any cluster older than the others?
How do the clusters differ in the number of favorites they give to other users’ tweets? You may use either visual or tabular summarization.
How do the clusters differ in verified status and in the device used to access Twitter? You may use either visual or tabular summarization.
How can we use the findings from this cluster analysis to inform social media strategy? Be as specific as you can.
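The tabular summaries asked for in the first three questions could be built along the following lines. This is a minimal sketch in Python with pandas, again an assumed tool choice. The eight rows, the device labels, and the cluster column are hypothetical placeholders; in the actual assignment, cluster would hold the memberships stored after the k-means step.

```python
import pandas as pd

# Hypothetical toy data (NOT the real UNCC data), two users per cluster.
df = pd.DataFrame({
    "cluster": [0, 0, 1, 1, 2, 2, 3, 3],
    "account_created_at_year": [2009, 2010, 2014, 2015, 2012, 2012, 2016, 2017],
    "favourites_count": [120, 80, 15, 5, 900, 1100, 40, 60],
    "verified": [True, False, False, False, True, True, False, False],
    "device": ["iPhone", "Web", "Android", "Android",
               "Web", "iPhone", "iPhone", "Android"],
})

# Account creation year: share of each year within each cluster
# (normalize="index" makes every row sum to 1), plus the median year
# as a quick indicator of which cluster is oldest.
year_dist = pd.crosstab(df["cluster"], df["account_created_at_year"],
                        normalize="index")
median_year = df.groupby("cluster")["account_created_at_year"].median()

# Favorites given to other users' tweets, summarized per cluster.
fav_summary = df.groupby("cluster")["favourites_count"].agg(
    ["count", "mean", "median"])

# Verified status: the mean of a boolean column is the share of True values.
verified_share = df.groupby("cluster")["verified"].mean()

# Device: share of each device within each cluster.
device_dist = pd.crosstab(df["cluster"], df["device"], normalize="index")

for name, table in [("year", year_dist), ("median year", median_year),
                    ("favorites", fav_summary), ("verified", verified_share),
                    ("device", device_dist)]:
    print(f"\n{name}\n{table}")
```

For the visual version of these summaries, a stacked bar chart of year_dist or device_dist (for example via the pandas plot method, which requires matplotlib) is one option. Because favorite counts are typically skewed, the median is often a more representative summary than the mean.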