Sentiment Lexicon

Sentiment lexicon wrapper and generator.

Installation

pip install sentiment-lexicon

Usage

The module provides a single class, Lexicon, that can be used as a simple wrapper around sentiment lexicon data. The sentiment value of a given word can be accessed via the value instance method.

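from sentiment_lexicon import Lexicon

# One sentiment value per word; both lists must have the same length.
words = ['good', 'bad']
values = [1, -1]
lexicon = Lexicon(words, values)

lexicon.value('good') # => 1.0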

The class can also generate a sentiment lexicon based on positive and negative input documents.

lexicon = Lexicon.from_labelled_text(positive_documents, negative_documents)

More information is available in the documentation.

class sentiment_lexicon.Lexicon(words, values, default=None, normalize=False)

A class wrapping a sentiment lexicon.

Provides a value() method for finding the sentiment value of a word.

Parameters

words (List[str]) – List of words.
values (List[float]) – List of values.
default (Optional[float]) – The value given to words that are not in the lexicon.
normalize (Optional[bool]) – Determines if the values are normalized to [-1, 1].

default: The default value.

data: The lexicon data.

range: The range spanned by the sentiment values.

Raises: ValueError – If words and values do not have the same length.

Examples

Creating a lexicon requires one sentiment value per word

>>> Lexicon(['good'], [1])
<sentiment_lexicon.lexicon.Lexicon object at ...>

otherwise an exception is raised.

>>> Lexicon(['good'], [])
Traceback (most recent call last):
...
ValueError: words and values must have the same length

Normalizing the sentiment values is done by passing True as the normalize parameter.

>>> lexicon = Lexicon(['good', 'bad', 'maybe'], [10, -8, 2], normalize=True)
>>> lexicon.data
good     1.0
bad     -0.8
maybe    0.2
dtype: float64
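The output above is consistent with dividing every value by the largest absolute value in the list. The following is a minimal standalone sketch of that idea, not the library's implementation; the helper name normalize_values is hypothetical, and the exact scaling rule used by Lexicon is an assumption inferred from the example.

```python
def normalize_values(values):
    """Scale values so the largest magnitude becomes 1.0.

    Hypothetical helper: assumes normalization means dividing by the
    maximum absolute value, which matches the documented example.
    """
    if not values:
        return []
    largest = max(abs(v) for v in values)
    if largest == 0:
        # All values are zero; nothing to scale.
        return [0.0 for _ in values]
    return [v / largest for v in values]


scaled = normalize_values([10, -8, 2])
print(scaled)  # [1.0, -0.8, 0.2]
```

Under this assumption the signs and relative magnitudes of the values are preserved, and the result always lies within [-1, 1].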

value(word, default=None)

Returns the sentiment value of a given word.

Parameters

word (str) – The word to find sentiment value for.
default (Optional[float]) – The value to return if no value is found for word. Takes precedence over the default attribute of Lexicon.

Raises

KeyError – If word is not found in the lexicon and a default is not provided.

Return type

float

Returns

The sentiment value for the word.

Examples

>>> lexicon = Lexicon(['good'], [1])
>>> lexicon.value('good')
1.0

>>> lexicon.value('bad')
Traceback (most recent call last):
...
KeyError: 'bad not present in the lexicon'

>>> lexicon = Lexicon(['good'], [1], default=0)
>>> lexicon.value('bad')
0

>>> lexicon.value('bad', default=-1)
-1
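The lookup order shown in these examples can be sketched with a plain dictionary. This is an illustration of the documented precedence only, not the library's code; the function lookup and its parameters call_default and lexicon_default are hypothetical names.

```python
def lookup(data, word, call_default=None, lexicon_default=None):
    """Return the value for word, following the documented precedence.

    Hypothetical sketch: call_default stands in for the default argument
    of value(), lexicon_default for the default attribute of Lexicon.
    """
    if word in data:
        return data[word]
    # A default passed with the call wins over the lexicon-wide default.
    if call_default is not None:
        return call_default
    if lexicon_default is not None:
        return lexicon_default
    raise KeyError(f"{word} not present in the lexicon")


data = {'good': 1.0}
known = lookup(data, 'good')
from_lexicon = lookup(data, 'bad', lexicon_default=0)
from_call = lookup(data, 'bad', call_default=-1, lexicon_default=0)
```

Comparing against None rather than testing truthiness matters here, because 0 is a legitimate default value.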

static from_labelled_text(positive, negative, min_df=0.0, alpha=0.5, ignore_case=True, **kwargs)

Generate a Lexicon based on positive and negative documents using pointwise mutual information (PMI).

Parameters

positive (List[str]) – List of positive documents.
negative (List[str]) – List of negative documents.
min_df (Optional[float]) – The minimum number of documents that a word must occur in before it is added to the lexicon.
alpha (Optional[float]) – Determines how much the PMI values of the two classes affect each other when computing the sentiment values.
ignore_case (Optional[bool]) – Determines if the case of the words in the documents is ignored. If True, the words good and Good will be treated as the same.
kwargs – Parameters passed to the Lexicon constructor.

Examples

>>> Lexicon.from_labelled_text(['This is good'], ['This is bad'])
<sentiment_lexicon.lexicon.Lexicon object at ...>

>>> Lexicon.from_labelled_text(['This is good'], [])
Traceback (most recent call last):
...
ValueError: there must be at least one positive and one negative document

Return type: Lexicon
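To give a rough idea of how PMI can turn labelled documents into sentiment scores, here is a minimal standalone sketch. It is not the library's algorithm: it scores each word as PMI(word, positive) minus PMI(word, negative), tokenizes on whitespace, always lowercases, and uses a hypothetical smoothing parameter to avoid taking the logarithm of zero. It does not model the min_df or alpha parameters described above.

```python
import math
from collections import Counter


def pmi_scores(positive, negative, smoothing=1.0):
    """Score words by PMI(word, positive) - PMI(word, negative).

    Hypothetical sketch; smoothing is an assumed additive constant and
    is not a parameter of the library.
    """
    if not positive or not negative:
        raise ValueError(
            "there must be at least one positive and one negative document")

    # Document frequency per class: count each word once per document.
    pos_df = Counter(w for doc in positive for w in set(doc.lower().split()))
    neg_df = Counter(w for doc in negative for w in set(doc.lower().split()))
    n_pos, n_neg = len(positive), len(negative)

    scores = {}
    for word in set(pos_df) | set(neg_df):
        # Smoothed estimates of P(word | class).
        p_pos = (pos_df[word] + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (neg_df[word] + smoothing) / (n_neg + 2 * smoothing)
        # PMI(w, c) = log2(P(w | c) / P(w)); P(w) cancels in the difference.
        scores[word] = math.log2(p_pos / p_neg)
    return scores


scores = pmi_scores(['This is good'], ['This is bad'])
```

In this sketch, words that appear equally often in both classes (such as this and is) score zero, while good scores positive and bad scores negative.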
