Latest live swearing on UK Twitter

 

Richard Stephens

 

School of Psychology, Keele University, United Kingdom

 

 

Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Word Count: XX (Not including reference list)

30 December 2020

 

 

Author Note: Correspondence concerning this article should be addressed to Dr Richard Stephens, School of Psychology, Keele University, UK, ST5 5BG. E-mail: r.stephens@keele.ac.uk.

 

Introduction

            Offensive or obscene language is known as swearing in the UK and cursing in the US (Soanes, 2002). That most languages include swear words (Van Lancker & Cummings, 1999) suggests they fulfil one or more useful functions, and researchers have begun to explore what some of the benefits of swearing might be. Repeating a swear word has been found to alleviate the physical pain of immersing one’s hand in ice-cold water (Robertson, Robinson & Stephens, 2017; Stephens, Atkins & Kingston, 2009; Stephens & Umland, 2011; Stephens & Robertson, 2020) and the social pain of being ostracised (Philipp & Lombardo, 2017). Swearing also benefits physical tasks that rely on strength and power (Stephens, Spierer & Katehis, 2018), and it augments persuasiveness (Scherer & Sagarin, 2006) and credibility (Rassin & Heijden, 2005).

            This short paper presents the methodology for capturing, disseminating and explaining swearing behaviour on UK Twitter.


Methods

Ethics

Research ethics clearance was not required on the basis that Twitter data are freely available and in the public domain. This is in keeping with previous peer-reviewed studies using Twitter scraping methodology (Chehal, Gupta & Gulati, 2020; Dodds, Harris, Kloumann, Bliss & Danforth, 2011; Dzogang, Lightman & Cristianini, 2018; Jones & Silver, 2020; Mitchell, Frank, Harris, Dodds & Danforth, 2013; North, Piwek & Joinson, 2020).

Design

           

Code was written in R using the rtweet library for Twitter scraping. The code accesses Twitter via its API version 1.1 and performs a standard search, which returns a sample of recent tweets published in the past 7 days (https://developer.twitter.com/en/docs/twitter-api/v1/tweets/search/overview). The search was set to retrieve 18000 cases matching the search criteria, which is the maximum number of cases that can be retrieved in a single search using the free access API. Typically this search returns data for tweets sent over a period ranging from approximately five minutes at peak periods to thirty minutes at low-use periods. Here a “case” consists of a single row of data resulting from the Twitter API v1.1 standard search, which returns 90 pieces of data per tweet including the tweet message. The search criteria were that the tweet was sent from a UK location, defined via the command “geocode='54.20,-2,700km'”, and that retweets were excluded.
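The geocode criterion describes a circle of radius 700 km centred on latitude 54.20, longitude -2. As a rough illustration of what that circle covers, the following Python sketch tests whether a point falls inside it using the haversine formula. This is not the R code used in the study; the function names and the Earth radius value are assumptions introduced here, and Twitter's own geographic matching may differ in detail.

```python
import math

# Centre and radius taken from the geocode string quoted in the text.
UK_CENTRE = (54.20, -2.0)   # latitude, longitude in degrees
RADIUS_KM = 700.0
EARTH_RADIUS_KM = 6371.0    # mean Earth radius (assumed value)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_search_area(lat, lon):
    """True if the point lies within the geocode circle (hypothetical helper)."""
    return haversine_km(UK_CENTRE[0], UK_CENTRE[1], lat, lon) <= RADIUS_KM
```

For example, in_search_area(51.5074, -0.1278) (central London) is True, whereas in_search_area(40.7128, -74.0060) (New York) is False.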

            Next the code filtered out all variables apart from: created_at, text, source, is_quote, is_retweet, favorite_count, retweet_count, quote_count, reply_count, status_url, place_name, place_full_name, place_type, country, country_code, geo_coords, coords_coords, box_coords, location, followers_count, friends_count, listed_count, statuses_count, favourites_count, account_created_at, verified, profile_image_url.

Next the code counted all instances of swear words in the tweet message text. The swear words were: cunt, motherfucker, fuck, wank, bastard, prick, bollocks, arsehole, twat, piss, shit, dickhead, bugger, sodding, crap and bloody. This set was based on Millwood-Hargrave’s (2000, p. 9) list of swear words in her report commissioned by the Advertising Standards Authority, British Broadcasting Corporation, Broadcasting Standards Commission and the Independent Television Commission. Cases for which messages did not contain one or more of the listed swear words were then filtered out. Typically, this reduced the sample to approximately 600-1000 cases. For each search, a single line of data was appended to a file for analysis consisting of: start date and time, end date and time, a count of the frequency with which each swear word appeared in that search, a count of the total number of swear words, the mean number of followers, statuses and favourites associated with each case, and the number of verified users that had used at least one swear word. Swearing tweets and associated data were saved in an archive.
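The counting and filtering step can be sketched as follows. This is an illustrative Python sketch rather than the study's R code. It assumes case-insensitive substring matching, which the text does not specify; under that assumption inflected forms are counted, a tweet containing "motherfucker" also increments the count for "fuck", and innocuous words that happen to contain a listed string would be counted too. A stricter matching rule could be substituted. The function names are hypothetical.

```python
from collections import Counter

# Word list as given in the text (Millwood-Hargrave, 2000).
SWEAR_WORDS = [
    "cunt", "motherfucker", "fuck", "wank", "bastard", "prick", "bollocks",
    "arsehole", "twat", "piss", "shit", "dickhead", "bugger", "sodding",
    "crap", "bloody",
]


def count_swear_words(text):
    """Count case-insensitive substring occurrences of each listed word."""
    lowered = text.lower()
    return Counter({w: lowered.count(w) for w in SWEAR_WORDS if w in lowered})


def summarise(tweets):
    """Keep tweets containing at least one listed word and total the counts."""
    kept = []
    totals = Counter()
    for text in tweets:
        counts = count_swear_words(text)
        if counts:  # an empty Counter means no listed word was found
            kept.append(text)
            totals.update(counts)
    return kept, totals
```

Calling summarise on a list of tweet texts returns the retained tweets and a per-word tally; sum(totals.values()) gives the total number of swear word occurrences for that search.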

Finally, code was written using the R package rtweet that returns the UK trends on each occasion that a full search is run. The search for UK trends returns the top 50 trending items on Twitter at the time of the search. Each trend item is accompanied by a “tweet volume” figure, that is, the number of tweets including each term, although for around half of trending items this datum is missing for Twitter API v1.1 (reference). Twitter trend data were compiled into a wordcloud using the wordcloud R package. Only trends whose relative influence could be characterised by a “tweet volume” datum were included in the wordcloud, and this datum was used to determine the size of items in the wordcloud.
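The selection of trends for the wordcloud can be illustrated with a short Python sketch. It drops trends with a missing tweet volume and scales the remainder relative to the largest volume. The scaling to a 0-1 weight is an assumption made here for illustration; the wordcloud R package used in the study accepts frequencies and handles sizing itself. The function name and the example trend names are hypothetical.

```python
def wordcloud_weights(trends):
    """Return {trend: weight in (0, 1]} for trends that report a tweet volume.

    `trends` is a list of (name, tweet_volume) pairs, where tweet_volume
    is None when the API did not supply a figure.
    """
    with_volume = {
        name: vol for name, vol in trends if vol is not None and vol > 0
    }
    if not with_volume:
        return {}
    largest = max(with_volume.values())
    return {name: vol / largest for name, vol in with_volume.items()}
```

For instance, wordcloud_weights([("#ExampleA", 20000), ("#ExampleB", None), ("Example C", 5000)]) keeps only the first and third items, with weights 1.0 and 0.25.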

The resulting code was saved as an R command programme which is currently being activated automatically every hour, on the hour, using the Z-Cron 5.6 free software package (Bauman, 2018). 

A second set of code was written in R which plots data in time series using the ggplot2 R package. At the time of writing a single plot has been designed showing the current UK swear rate (number of swear word occurrences per 30 seconds) and tweet rate (number of UK tweets per second). These rates were chosen based on trial and error such that for most of the time the lines overlap or are spaced closely together. This illustrates that on UK Twitter the swearing rate is about one thirtieth that of the tweeting rate, or in other words, that you can expect to see one swear word in every 30 tweets. However, spikes in the red line (swear rate) relative to the blue line (tweet rate) indicate higher than usual levels of swearing. Please see Figure 1 for an example, with a swearing spike apparent at 16:30 on 03.01.2021.
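The two plotted rates can be derived from the start time, end time and counts recorded for each search. The following Python sketch shows the arithmetic; it is an illustration under stated assumptions, not the study's R code. The function name is hypothetical and the figures in the usage note are made up for the example.

```python
from datetime import datetime


def rates(start, end, n_tweets, n_swear_words):
    """Return (swear rate per 30 s, tweet rate per s) for one search."""
    seconds = (end - start).total_seconds()
    if seconds <= 0:
        raise ValueError("end must be later than start")
    tweet_rate = n_tweets / seconds            # tweets per second
    swear_rate = n_swear_words / seconds * 30  # swear words per 30 seconds
    return swear_rate, tweet_rate
```

As a worked example with hypothetical figures, a search returning 18000 tweets sent over ten minutes and containing 600 swear word occurrences gives rates(datetime(2021, 1, 3, 16, 0), datetime(2021, 1, 3, 16, 10), 18000, 600) equal to (30.0, 30.0). The two values coincide because there is exactly one swear word per 30 tweets, which is the situation in which the two plotted lines overlap.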

 


 

Figure 1: Example time series plot of the UK swear rate (swear words per 30 sec) and tweet rate (tweets per sec).

 Further code was written that tweets the plot along with a label describing the swear rate based on the normalised difference between the swear rate and the tweet rate. Normalisation is achieved by dividing the difference scores by the pooled standard deviation of these two variables. The labels applied are shown in Table 1.

Some additional code was written that posts the wordcloud to the Twitter account on occasions when the UK swearing rate is above the usual level, according to the criteria in rows 1-3 of Table 1. The appropriate label from Table 1 is tweeted together with the message: “This wordcloud of current Twitter trends indicates why.” This plotting and wordcloud code was compiled into an R command programme that is currently activated automatically each day at 12.15am, 7.15am, 9.15am, 11.15am, 1.15pm, 3.15pm, 5.15pm, 6.15pm, 7.15pm, 8.15pm, 9.15pm, 10.15pm and 11.15pm. The Twitter account created for this purpose is called Live UK Swearing, with the Twitter name @SwearingUk.

 

Table 1: Labels applied to the normalised difference between the swear rate and the tweet rate, the criteria based on the range of normalised difference scores, and the expected percentage of scores falling in each category, assuming a normal distribution

Label                                             Criteria            Expected %

"Swearing on UK Twitter is extremely high"        Greater than 3.0    0.1

"Swearing on UK Twitter is very high"             1.5 to 3.0          6.6

"Swearing on UK Twitter is higher than usual"     0.5 to 1.5          24.2

"Swearing on UK Twitter is at the usual level"    -0.5 to 0.5         38.2

"Swearing on UK Twitter is lower than usual"      -1.5 to -0.5        24.2

"Swearing on UK Twitter is very low"              -3.0 to -1.5        6.6

"Swearing on UK Twitter is extremely low"         Less than -3.0      0.1
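The normalisation and labelling described above can be sketched in Python as follows. This is an illustration, not the study's R code. Several details are assumptions made here: the pooled standard deviation is taken as the square root of the mean of the two sample variances (the equal-sample-size form), the bands are treated as symmetric about zero (so "lower than usual" runs from -1.5 to -0.5, which is what the listed expected percentages imply), and the handling of values that fall exactly on a boundary is a guess apart from 0.5, which the text treats as "higher than usual". All function names are hypothetical.

```python
import math
import statistics


def pooled_sd(swear_rates, tweet_rates):
    """Pooled SD of the two rate series (equal-n form, an assumption)."""
    var_swear = statistics.variance(swear_rates)
    var_tweet = statistics.variance(tweet_rates)
    return math.sqrt((var_swear + var_tweet) / 2)


def normalised_difference(swear_rate, tweet_rate, sd):
    """Swear rate (per 30 s) minus tweet rate (per s), divided by the pooled SD."""
    return (swear_rate - tweet_rate) / sd


def label(z):
    """Map a normalised difference score to the wording used in Table 1."""
    if z > 3.0:
        level = "extremely high"
    elif z >= 1.5:
        level = "very high"
    elif z >= 0.5:
        level = "higher than usual"
    elif z > -0.5:
        level = "at the usual level"
    elif z > -1.5:
        level = "lower than usual"
    elif z >= -3.0:
        level = "very low"
    else:
        level = "extremely low"
    return f"Swearing on UK Twitter is {level}"
```

With hypothetical histories of swear rates [28, 30, 32] and tweet rates [29, 30, 31], the pooled standard deviation is about 1.58, so a current swear rate of 33 against a tweet rate of 30 gives a normalised difference of about 1.9 and the label "Swearing on UK Twitter is very high".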

 

A third programme was written in R which posts a wordcloud to the Twitter account on occasions when UK swearing is above the usual level, according to the criteria in rows 1-3 of Table 1. The appropriate label from Table 1 is tweeted together with the message: “This wordcloud illustrates the latest sample of swearing-infused tweets.” This programme is currently activated automatically at half past every hour, 24 hours a day, but only results in a Twitter post when the normalised difference between the swear rate (per 30 s) and the tweet rate (per s) is 0.5 or greater.

A fourth programme was compiled in R that tweets the following explanatory thread:
(1) Explanatory thread: The graph with the grey background that you see in this timeline shows the current rate of swearing on UK Twitter (orange line) and the current rate of general UK tweeting (blue line). 1/6.
(2) The swear rate is adjusted to swear word occurrences every 30 seconds, while the general tweet rate is adjusted to UK tweets per second. These rates were chosen based on trial and error such that for most of the time the lines are spaced closely together. 2/6.
(3) This illustrates that on UK Twitter the swearing rate is about one thirtieth that of the tweeting rate, or in other words, that you can expect to see one swear word in every 30 tweets. 3/6.
(4) Spikes in the red line, relative to blue, show higher than usual levels of swearing. I characterise the size of spike based on a normal distribution of difference scores. You will see labels like 'higher than usual' or 'very high' for the current swearing rate. 4/6.
(5) A higher than usual swearing level prompts tweeting of a wordcloud showing current UK Twitter trends. The words give some indication of what topics prompt outbursts of swearing on UK Twitter. 5/6.
(6) Thanks for reading. Further information is available here: https://liveukswearing.blogspot.com/2021/01/v-behaviorurldefaultvmlo.html. 6/6.
This programme is scheduled to run daily at 08:45 and 20:45.


 

References

Bauman, A. (2018). Z-Cron software package version 5.6, Build 04. Downloaded from: https://www.z-cron.com/

Chehal, D., Gupta, P., & Gulati, P. (2020). COVID-19 pandemic lockdown: An emotional health perspective of Indians on Twitter. International Journal of Social Psychiatry, 002076402094074. https://doi.org/10.1177/0020764020940741

Dodds, P., Harris, K., Kloumann, I., Bliss, C., & Danforth, C. (2011). Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter. PLOS ONE, 6(12), e26752. https://doi.org/10.1371/journal.pone.0026752

Dzogang, F., Lightman, S., & Cristianini, N. (2018). Diurnal variations of psychometric indicators in Twitter content. PLOS ONE, 13(6), e0197002. https://doi.org/10.1371/journal.pone.0197002

Jones, N., & Silver, R. (2020). This is not a drill: Anxiety on Twitter following the 2018 Hawaii false missile alert. American Psychologist, 75(5), 683-693. https://doi.org/10.1037/amp0000495

Millwood-Hargrave, A. (2000). Delete expletives? Report for the Advertising Standards Authority, British Broadcasting Corporation, Broadcasting Standards Commission and the Independent Television Commission. Available from: http://jastoy.plus.com/pdf/delete_expletives.pdf

Mitchell, L., Frank, M., Harris, K., Dodds, P., & Danforth, C. (2013). The Geography of Happiness: Connecting Twitter Sentiment and Expression, Demographics, and Objective Characteristics of Place. PLOS ONE, 8(5), e64417. https://doi.org/10.1371/journal.pone.0064417

North, S., Piwek, L., & Joinson, A. (2020). Battle for Britain: Analyzing Events as Drivers of Political Tribalism in Twitter Discussions of Brexit. Policy & Internet. https://doi.org/10.1002/poi3.247

Philipp, M.C. & Lombardo, L. (2017). Hurt feelings and four letter words: Swearing alleviates the pain of social distress. European Journal of Social Psychology 47, 517-523. doi: 10.1002/ejsp.2264

Rassin, E. & Heijden, S.V.D. (2005). Appearing credible? Swearing helps! Journal of Psychology, Crime & Law 11, 177-182. doi: 10.1080/106831605160512331329952

Robertson, O.S., Robinson, S.J. & Stephens, R. (2017). A cross-cultural comparison of the effects of swearing on pain perception in a British and Japanese population. Scandinavian Journal of Pain 17, 267-272. doi: 10.1016/j.sjpain.2017.07.014

Scherer, C.R. & Sagarin, B.J. (2006). Indecent influence: The positive effects of obscenity on persuasion. Social Influence 1, 138–146. doi: 10.1080/15534510600747597

Soanes, C. (2002). Pocket Oxford English Dictionary. Oxford: Oxford University Press.

Stephens, R. & Umland, C. (2011). Swearing as a response to pain – effect of daily swearing frequency. Journal of Pain 12, 1274-1281. doi: 10.1016/j.jpain.2011.09.004

Stephens, R., Atkins, J. & Kingston, A. (2009). Swearing as a response to pain. NeuroReport 20, 1056-1060. doi: 10.1097/WNR.0b013e32832e64b1

Stephens, R. & Robertson, O.S. (2020). Swearing as a response to pain: Assessing hypoalgesic effects of novel “swear” words. Frontiers in Psychology. https://www.frontiersin.org/articles/10.3389/fpsyg.2020.00723/full

Stephens, R., Spierer, D.K. & Katehis, E. (2018). Effect of swearing on strength and power performance. Psychology of Sport and Exercise 35, 111-117. https://doi.org/10.1016/j.psychsport.2017.11.014

Van Lancker, D. & Cummings, J.L. (1999). Expletives: neurolinguistics and neurobehavioral perspectives on swearing. Brain Research Reviews 31, 83-104. doi: 10.1016/S0165-0173(99)00060-0

 

 

 
