Consider looking for related projects for help, or ask at the Teahouse. If you are not currently a project participant and wish to help, you may still participate in the project. This status should be changed if collaborative activity resumes.
The Vandalism Studies project is the part of the Counter-Vandalism Unit designated to conduct research on unconstructive edits on Wikipedia. The project's scope covers all vandalism on Wikipedia. If you'd like to get involved, please add your name to the Members list below!
CHCSPrefect (talk·contribs) I made this account to stop vandalism coming from my school's IP address; something like this is where I belong!
Rushbugled13 (talk·contribs) I wish to help maintain the reliability of Wikipedia as a resource, and vandalism is a large problem with respect to reliability.
These are some preliminary questions that may stimulate future studies. Not all questions may be answerable, so think of it more as a brainstorming section.
Analysis of vandalism
Who is responsible for vandalism? What do vandals want? What are the demographics of the vandal population?
What proportion of vandals are on dynamic IP addresses, and hence very hard to block?
Are IP edits ever responsible for improving a featured article while on the Main Page? (See also essay IPs are human too.)
What motivates people to vandalize articles? How can we minimize the satisfaction they get from doing it? (See: The motivation of a vandal)
Do vandals just choose another article to edit instead if an article is semi-protected? How can we test this?
Why do certain articles attract more vandalism than others?
What types of vandalism are there? What messages are vandals trying to get across? Why do vandals not fully realise that their actions are futile?
What sort of financial gains can be made from using Wikipedia to advertise – are spammers just wasting their time, or can it actually be profitable? Are our anti-spam measures adequate?
What is the overall contribution from schools and universities? Are they worth having? Do universities contribute less vandalism than schools or are all ages equally immature?
How does the rate of vandalism vary throughout the day?
Would there still be problems with vandalism if unregistered editing was blocked? How can we test this hypothesis? Certain categories could be experimentally altered to block unregistered editors, but then vandals could just choose an article that wasn't protected. We would have to block all IP editing, which would certainly be controversial, even just to gather a small sample of data. The blocks would also have to allow newly registered users to edit; during a short trial there would not be time for new editors to create an account and then wait the four days required for autoconfirmation. Perhaps we could use a comparative method by running the experiments on another wiki instead?
Quantitatively, how are levels of vandalism affected (both as a percentage of edits and as a number of edits) when external attention is drawn to an article (e.g. by Slashdot or The Colbert Report)? Do levels of vandalism return to normal (e.g. in elephant) in all cases? How quickly?
How much of vandalism is self-reverted?
How do the levels of reverted edits compare between articles of different quality (e.g. GA vs. start class)?
How often are good faith edits labeled as vandalism, either a) mistakenly and through misinterpretation of policy or b) maliciously?
Are editors any more likely to continue or desist vandalizing if warned by a bot instead of a person?
How often are vandals warned on their talk page after committing an offense?
What are the costs and benefits, and hence overall utility, of warning users? How do users respond to warnings?
Who is responsible for reverting vandalism?
What effects does semi-protection have on the level of vandalism of protected articles?
What strategies can we employ to catch vandalism quickly?
How can we catch most of it at recent changes?
How can we establish a situation where almost every article has someone responsible for maintaining it? Is this even a good idea? (See: Ownership of articles)
How good are editors at reverting vandalism? That is, is it reverted properly, or is it often dealt with poorly, e.g. by removing a whole paragraph whose meaning the vandal has simply altered?
What happens to vandalism levels when edits do not show up in the current version of the article? A trial of something like stable versions, where the vandal cannot vandalize the actual article people see, or something functionally similar, is needed. Perhaps a small section (e.g. all articles in a certain category) could be tested.
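Several of the questions above (how the rate of vandalism varies throughout the day, and what share of edits is reverted) could be operationalized with a small script. The sketch below is a hypothetical illustration on invented edit records; real data could be fetched from the MediaWiki API (e.g. action=query&list=recentchanges), which returns edit timestamps and revert tags.

```python
# Hypothetical sketch: bucket edits by UTC hour of day and compute the
# fraction of each hour's edits that were later reverted.
# The sample records below are invented for illustration only; a real
# study would pull them from the MediaWiki recentchanges API.
from collections import Counter
from datetime import datetime

sample_edits = [
    # (UTC timestamp, was this edit later reverted?)
    ("2024-05-01T03:14:00Z", False),
    ("2024-05-01T15:02:00Z", True),
    ("2024-05-01T15:47:00Z", True),
    ("2024-05-01T15:59:00Z", False),
    ("2024-05-01T21:30:00Z", True),
]

def reverted_share_by_hour(edits):
    """Return {hour: fraction of that hour's edits later reverted}."""
    totals, reverted = Counter(), Counter()
    for ts, was_reverted in edits:
        hour = datetime.fromisoformat(ts.replace("Z", "+00:00")).hour
        totals[hour] += 1
        if was_reverted:
            reverted[hour] += 1
    return {h: reverted[h] / totals[h] for h in totals}

print(reverted_share_by_hour(sample_edits))
```

With a large enough sample this would give an hourly profile of reverted-edit rates; note that reverts are only a rough proxy for vandalism, since good-faith edits are also sometimes reverted (see the question above on mislabeled good-faith edits).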
Wikipedia vandalism studies outside of this project
This list is incomplete; you can help by adding missing items.
Published
Carter, Jacobi (2 June 2010). "ClueBot and Vandalism on Wikipedia" (PDF). Archived from the original (PDF) on 2010-06-02. Retrieved 5 October 2020.
"U of M researchers reveal new findings about Wikipedia authorship and vandalism" (Press release). University of Minnesota – Department of Computer Science and Engineering. 2007-11-06. Archived from the original on 2012-09-20.
Buriol, Luciana S.; Carlos Castillo; Debora Donato; Stefano Leonardi; Stefano Millozzi (2006). "Temporal Analysis of the Wikigraph" (PDF). Sapienza University of Rome.
GroupLens Research (November 4–7, 2007). "Creating, Destroying, and Restoring Value in Wikipedia". Proceedings of the 2007 International ACM Conference on Supporting Group Work (GROUP '07). Sanibel Island, Florida, USA: University of Minnesota – Department of Computer Science and Engineering. p. 259. doi:10.1145/1316624.1316663. ISBN 9781595938459.
MIT Media Lab; IBM Research (April 24–29, 2004). "Studying Cooperation and Conflict between Authors with history flow Visualizations" (PDF). Vienna: Massachusetts Institute of Technology.
Moore, Rick (2007-11-16). "New information on Wikipedia". University of Minnesota.
Smets, Koen; Bart Goethals; Brigitte Verdonk (2008). "Automatic Vandalism Detection in Wikipedia: Towards a Machine Learning Approach" (PDF). University of Antwerp – Department of Mathematics and Computer Science.
Wang, William Yang; McKeown, Kathleen R. (2010). "Got You!: Automatic Vandalism Detection in Wikipedia with Web-based Shallow Syntactic-Semantic Modeling" (PDF). Proceedings of the 23rd International Conference on Computational Linguistics.
Belani, Amit (2009-11-11). "Vandalism Detection in Wikipedia: a Bag-of-Words Classifier Approach". arXiv:1001.0700 [cs.LG].
West, Andrew G.; Sampath Kannan; Insup Lee (2010). "Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata". pp. 22–28. doi:10.1145/1752046.1752050. ISBN 9781450300599. S2CID 215753727.
Adler, B. Thomas; Luca de Alfaro; Santiago Mola-Velasco; Paolo Rosso; Andrew G. West (2011). "Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features". Computational Linguistics and Intelligent Text Processing. Lecture Notes in Computer Science. Vol. 6609. pp. 277–288. doi:10.1007/978-3-642-19437-5_23. hdl:10251/36621. ISBN 978-3-642-19436-8.
West, Andrew G.; Insup Lee (2011). "Multilingual Vandalism Detection using Language-Independent & Ex Post Facto Evidence". PAN-CLEF '11: Notebook Papers on Uncovering Plagiarism, Authorship, and Social Software Misuse.