Sentiment Lexicon
Sentiment lexicon wrapper and generator.
Installation
pip install sentiment-lexicon
Usage
The module provides a single class, Lexicon, that can be used as a simple wrapper around sentiment lexicon data. The sentiment value of a given word can be accessed via the value instance method.
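from sentiment_lexicon import Lexicon
# Sample data for illustration: one sentiment value per word.
words = ['good', 'bad']
values = [1, -1]
lexicon = Lexicon(words, values)
lexicon.value('good') # => 1.0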
The class can also generate a sentiment lexicon based on positive and negative input documents.
lexicon = Lexicon.from_labelled_text(positive_documents, negative_documents)
More information is available in the documentation.
class sentiment_lexicon.Lexicon(words, values, default=None, normalize=False)
A class wrapping a sentiment lexicon.
Provides a value() method for finding the sentiment value of a word.
Parameters
words (List[str]) – List of words.
values (List[float]) – List of values.
default (Optional[float]) – The value given to words that are not in the lexicon.
normalize (Optional[bool]) – Determines if the values are normalized to [-1, 1].
default – The default value.
data – The lexicon data.
range – The range spanned by the sentiment values.
Raises
ValueError – If words and values do not have the same length.
Examples
Creating a lexicon requires one sentiment value per word
>>> Lexicon(['good'], [1])
<sentiment_lexicon.lexicon.Lexicon object at ...>
otherwise an exception is raised.
>>> Lexicon(['good'], [])
Traceback (most recent call last):
    ...
ValueError: words and values must have the same length
Normalizing the sentiment values is done by passing True as the normalize parameter.
>>> lexicon = Lexicon(['good', 'bad', 'maybe'], [10, -8, 2], normalize=True)
>>> lexicon.data
good     1.0
bad     -0.8
maybe    0.2
dtype: float64
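The normalized values in the example above are consistent with dividing every value by the largest absolute value. The following is a minimal sketch of that scaling in plain Python; the helper name normalize_values is hypothetical, and the library's own normalization may be implemented differently.

```python
from typing import List


def normalize_values(values: List[float]) -> List[float]:
    """Scale values into [-1, 1] by dividing by the largest absolute value.

    Assumption: this is one scaling that reproduces the documented example;
    it is not taken from the library's source.
    """
    largest = max((abs(v) for v in values), default=0.0)
    if largest == 0:
        # An empty list or all zeros has nothing to scale.
        return [float(v) for v in values]
    return [v / largest for v in values]


print(normalize_values([10, -8, 2]))  # [1.0, -0.8, 0.2]
```

With this scaling the value of largest magnitude maps to 1 or -1, and the others keep their relative sizes.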
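value(word, default=None)
Returns the sentiment value of a given word.
Parameters
word – The word to look up.
default – The value returned when word is not in the lexicon; it takes precedence over the lexicon's own default.
Raises
KeyError – If word is not found in the lexicon and a default is not provided.
Return type
float
Returns
The sentiment value for the word.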
Examples
>>> lexicon = Lexicon(['good'], [1])
>>> lexicon.value('good')
1.0
>>> lexicon.value('bad')
Traceback (most recent call last):
    ...
KeyError: 'bad not present in the lexicon'
>>> lexicon = Lexicon(['good'], [1], default=0)
>>> lexicon.value('bad')
0
>>> lexicon.value('bad', default=-1)
-1
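The lookup rules shown in these examples can be summarized as: a known word returns its value, a per-call default takes precedence over the lexicon's default, and a KeyError is raised when neither is available. The sketch below mirrors those rules with a plain dictionary. TinyLexicon is a hypothetical stand-in written for illustration, not the library's class.

```python
from typing import Dict, List, Optional


class TinyLexicon:
    """Hypothetical stand-in used only to illustrate the lookup rules."""

    def __init__(self, words: List[str], values: List[float],
                 default: Optional[float] = None) -> None:
        if len(words) != len(values):
            raise ValueError("words and values must have the same length")
        self.data: Dict[str, float] = {
            w: float(v) for w, v in zip(words, values)
        }
        self.default = default

    def value(self, word: str, default: Optional[float] = None) -> float:
        if word in self.data:
            return self.data[word]
        # A per-call default takes precedence over the instance default.
        fallback = default if default is not None else self.default
        if fallback is None:
            raise KeyError(f"{word} not present in the lexicon")
        return fallback


lexicon = TinyLexicon(["good"], [1], default=0)
print(lexicon.value("good"))             # 1.0
print(lexicon.value("bad"))              # 0
print(lexicon.value("bad", default=-1))  # -1
```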
static from_labelled_text(positive, negative, min_df=0.0, alpha=0.5, ignore_case=True, **kwargs)
Generate a Lexicon based on positive and negative documents using pointwise mutual information.
Parameters
positive (List[str]) – List of positive documents.
negative (List[str]) – List of negative documents.
min_df (Optional[float]) – The number of documents that a word must occur in before it is added to the lexicon.
alpha (Optional[float]) – Determines how much the PMI values of the two classes affect each other when computing the sentiment values.
ignore_case (Optional[bool]) – Determines if the case of the words in the documents is ignored. If True, the words good and Good will be treated as the same.
kwargs – Parameters passed to the Lexicon constructor.
Examples
>>> Lexicon.from_labelled_text(['This is good'], ['This is bad'])
<sentiment_lexicon.lexicon.Lexicon object at ...>
>>> Lexicon.from_labelled_text(['This is good'], [])
Traceback (most recent call last):
    ...
ValueError: there must be at least one positive and one negative document
Return type
Lexicon
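To make the pointwise mutual information idea concrete, here is a minimal sketch that scores each word as PMI(word, positive) minus PMI(word, negative), computed from document frequencies. It is an illustration under stated assumptions, not the library's algorithm: the function name pmi_scores, the whitespace tokenization, and the add-one smoothing are all assumptions, and the alpha and min_df parameters are not modelled.

```python
import math
from collections import Counter
from typing import Dict, List


def pmi_scores(positive: List[str], negative: List[str],
               ignore_case: bool = True) -> Dict[str, float]:
    """Score words by PMI(word, positive) - PMI(word, negative)."""
    if not positive or not negative:
        raise ValueError(
            "there must be at least one positive and one negative document")

    def doc_freq(documents: List[str]) -> Counter:
        counts: Counter = Counter()
        for doc in documents:
            # Whitespace tokenization is an assumption made for brevity.
            tokens = doc.lower().split() if ignore_case else doc.split()
            counts.update(set(tokens))  # count each word once per document
        return counts

    df_pos, df_neg = doc_freq(positive), doc_freq(negative)
    scores: Dict[str, float] = {}
    for word in set(df_pos) | set(df_neg):
        # Add-one smoothing (an assumption) keeps the logarithm finite.
        p_pos = (df_pos[word] + 1) / (len(positive) + 2)
        p_neg = (df_neg[word] + 1) / (len(negative) + 2)
        # P(word) cancels in the PMI difference, leaving a log-ratio.
        scores[word] = math.log2(p_pos / p_neg)
    return scores


scores = pmi_scores(["This is good"], ["This is bad"])
print(scores["good"], scores["bad"], scores["this"])  # 1.0 -1.0 0.0
```

Words that appear equally often in both classes score zero, while words concentrated in one class move toward positive or negative values.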