Given a digital text document (DTD),
this function computes the overall opinion score based on the
proportion of text records classified as expressing positive,
negative, or neutral sentiment.
The function first transforms
the text document into a tidy-format dataframe, described as the
observed sentiment document (OSD)
(Adepeju and Jimoh, 2021),
in which each text record is assigned a sentiment class based
on the summation of all sentiment scores expressed by the words in
the text record.
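The classification step described above can be sketched in base R. This is a minimal illustration and not the package's internal code: the `toy_lexicon` values, the three `records`, the helper `record_score()`, and the rule that a summed score above zero is positive, below zero is negative, and exactly zero is neutral are all assumptions made for the example.

```r
# Toy lexicon (hypothetical values, NOT the real AFINN word list)
toy_lexicon <- c(good = 3, helpful = 2, bad = -3, slow = -1)

# Three made-up text records
records <- c("good and helpful officers",
             "bad and slow response",
             "officers on patrol")

# Sum the lexicon scores of the words found in one text record
record_score <- function(text, lexicon) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  scores <- lexicon[words]        # NA for words missing from the lexicon
  sum(scores, na.rm = TRUE)       # unmatched words contribute nothing
}

totals <- vapply(records, record_score, numeric(1),
                 lexicon = toy_lexicon, USE.NAMES = FALSE)

# Assumed rule: the sign of the summed score decides the sentiment class
sentiment <- ifelse(totals > 0, "positive",
                    ifelse(totals < 0, "negative", "neutral"))

# One row per text record, mirroring the ID/sentiment layout of an OSD
osd <- data.frame(ID = seq_along(records), sentiment = sentiment)
print(osd)
```

The resulting data frame has one row per text record, which is the shape that the `$OSD` element takes in the examples further down.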
opi_score(textdoc, metric = 1, fun = NULL)
Argument | Description |
---|---|
textdoc | An |
metric | (an integer) Specify the metric to utilize for the calculation of opinion score. Valid values include |
fun | A user-defined function given that |
Returns an opi_object
containing details of the
opinion measures from the text document.
An opinion score is derived from all the sentiments
(i.e. positive, negative, and neutral) expressed within a
text document. We deploy a lexicon-based approach
(Taboada et al., 2011) using the AFINN lexicon
(Nielsen, 2011).
(1) Adepeju, M. and Jimoh, F. (2021). An Analytical Framework for Measuring Inequality in the Public Opinions on Policing – Assessing the impacts of COVID-19 Pandemic using Twitter Data. https://doi.org/10.31235/osf.io/c32qh
(2) Malshe, A. (2019). Data Analytics Applications. Online book available at: https://ashgreat.github.io/analyticsAppBook/index.html. Date accessed: 15th December 2020.
(3) Taboada, M. et al. (2011). Lexicon-based methods for sentiment analysis. Computational Linguistics, 37(2), pp. 267-307.
(4) Lowe, W. et al. (2011). Scaling policy preferences from coded political texts. Legislative Studies Quarterly, 36(1), pp. 123-155.
(5) Razorfish (2009). Fluent: The Razorfish Social Influence Marketing Report. Accessed: 24th February 2021.
(6) Nielsen, F. A. (2011). "A new ANEW: Evaluation of a word list for sentiment analysis in microblogs". Proceedings of the ESWC2011 Workshop on 'Making Sense of Microposts': Big things come in small packages, pp. 93-98.
# Use police/pandemic posts on Twitter
# Experiment with a standard metric (e.g. metric 1)
score <- opi_score(textdoc = policing_dtd, metric = 1, fun = NULL)

# Print result
print(score)
#> $sentiments
#>
#>
#> |sentiment | No_of_text_records|
#> |:---------|------------------:|
#> |negative  |                 45|
#> |neutral   |                  1|
#> |positive  |                 40|
#>
#> $opiscore
#> [1] "-5.88%"
#>
#> $metric
#> [1] "Polarity (Percentage Difference)"
#>
#> $equation
#> [1] "((#Positive - #Negative)/(#Positive + #Negative))*100%"
#>
#> $OSD
#> Warning: `...` is not empty.
#>
#> We detected these problematic arguments:
#> * `needs_dots`
#>
#> These dots only exist to allow future extensions and should be empty.
#> Did you misspecify an argument?
#> # A tibble: 86 x 2
#>       ID sentiment
#>    <int> <chr>
#>  1     1 positive
#>  2     2 positive
#>  3     3 negative
#>  4     4 negative
#>  5     5 positive
#>  6     6 positive
#>  7    10 positive
#>  8    11 positive
#>  9    12 positive
#> 10    13 negative
#> # ... with 76 more rows
#>

# Example using a user-defined opinion score:
# a demonstration with a component of the SIM opinion
# score function (Razorfish, 2009). The opinion
# function can be expressed as:
myfun <- function(P, N, O){
  score <- (P + O - N)/(P + O + N)
  return(score)
}

# Run analysis
score <- opi_score(textdoc = policing_dtd, metric = 5, fun = myfun)

# Print results
print(score)
#> $sentiments
#>
#>
#> |sentiment | No_of_text_records|
#> |:---------|------------------:|
#> |negative  |                 45|
#> |neutral   |                  1|
#> |positive  |                 40|
#>
#> $opiscore
#> [1] -0.04651163
#>
#> $metric
#> [1] "User-defined"
#>
#> $equation
#> function(P, N, O){
#>   score <- (P + O - N)/(P + O + N)
#>   return(score)
#> }
#> <environment: 0x00000000476af450>
#>
#> $OSD
#> Warning: `...` is not empty.
#>
#> We detected these problematic arguments:
#> * `needs_dots`
#>
#> These dots only exist to allow future extensions and should be empty.
#> Did you misspecify an argument?
#> # A tibble: 86 x 2
#>       ID sentiment
#>    <int> <chr>
#>  1     1 positive
#>  2     2 positive
#>  3     3 negative
#>  4     4 negative
#>  5     5 positive
#>  6     6 positive
#>  7    10 positive
#>  8    11 positive
#>  9    12 positive
#> 10    13 negative
#> # ... with 76 more rows
#>
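As a sanity check, both `$opiscore` values printed above can be reproduced by hand from the record counts in the `$sentiments` tables. The sketch below uses only base R; the rounding and the `paste0()` formatting of the percentage are assumptions about presentation, not a description of how the package builds its output.

```r
# Record counts copied from the $sentiments tables in the examples
P <- 40  # positive
N <- 45  # negative
O <- 1   # neutral

# Metric 1: polarity (percentage difference), following the printed $equation
polarity <- (P - N) / (P + N) * 100
polarity_label <- paste0(round(polarity, 2), "%")  # assumed formatting

# User-defined score: same function as in the second example
myfun <- function(P, N, O) {
  (P + O - N) / (P + O + N)
}
user_score <- myfun(P, N, O)

print(polarity_label)
print(user_score)
```

Note that the polarity metric ignores the neutral count, whereas the user-defined function adds it to both the numerator and the denominator, which is why the two scores differ slightly in magnitude.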