“I love deadlines. I
love the whooshing noise they make as they go by.”
Most humans would be able to detect the sarcasm in the quote
above, even if it takes them a moment or two. But imagine making a computer
understand the sentiment expressed in the above sentence.
That is the sort of challenge Dr. Erik Cambria (Assistant
Professor at the School of Computer Science and Engineering at Nanyang
Technological University) and his team at SenticNet
are trying to tackle. They are
dealing with the fundamental problems of natural language processing (NLP) for
sentiment analysis. Natural language, which is the language we use for
communicating with each other, is rather different from the way we communicate
with computers. Natural language is ambiguous, complex, chaotic. Constructed
languages, such as programming languages, adhere to strict rules and logic.
Wikipedia defines sentiment analysis as the use of NLP, text
analysis, computational linguistics, and biometrics to systematically identify,
extract, quantify, and study affective states and subjective information.
Applications include analysing the positive, negative and neutral
sentiments in online customer reviews, surveys, feedback and social media postings,
which is useful in a range of fields, from marketing to finance and healthcare.
The problem
is much more complicated than it seems. For instance, if a statement is
sarcastic, as the one above, something which looks positive is actually
negative (love is hate). Understanding this polarity (whether a sentiment is
positive or negative) is a core aspect of sentiment analysis. It involves the
use of deep learning, psychology, and also linguistics, demonstrating the
multi-disciplinary nature of the field.
Deep
learning helps detect some patterns, such as the usual occurrence of a big
shift in polarity in a sarcastic comment (positive followed by negative). Linguistics
provides insights into sentence structure, while psychology is important because
whether a statement is sarcastic or not can be dependent on the personality of
the individual.
To take
another example, saying “This phone is expensive but nice” is not the same as
saying “This phone is nice but expensive.” In fact, the sentiments expressed
are polar opposites, though the words used are the same. Here, the
understanding of sentence structure based on linguistics is key. When the ‘but’
conjunction is used, positive followed by negative yields negative, whereas negative
followed by positive yields positive.
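The ‘but’ rule described above can be sketched in a few lines of Python. This is a minimal illustration, not SenticNet's implementation: the two-word lexicon, the function names clause_polarity and but_polarity, and the naive split on the word ‘but’ are all assumptions made for the example.

```python
# Toy polarity lexicon (hypothetical values, for illustration only).
LEXICON = {"nice": 1, "expensive": -1}


def clause_polarity(clause: str) -> int:
    """Sum the polarity of known words in a clause and return its sign."""
    words = clause.lower().replace(".", " ").replace(",", " ").split()
    score = sum(LEXICON.get(word, 0) for word in words)
    return (score > 0) - (score < 0)


def but_polarity(sentence: str) -> int:
    """Apply the 'but' rule: the clause after 'but' decides the polarity."""
    before, sep, after = sentence.lower().partition(" but ")
    if not sep:
        # No 'but' in the sentence: fall back to the whole-sentence polarity.
        return clause_polarity(sentence)
    after_polarity = clause_polarity(after)
    # If the second clause is neutral, fall back to the first clause.
    return after_polarity if after_polarity != 0 else clause_polarity(before)


print(but_polarity("This phone is expensive but nice"))   # 1
print(but_polarity("This phone is nice but expensive."))  # -1
```

A plain word count would score both sentences as neutral, since each contains one positive and one negative word. Letting the clause after ‘but’ decide is what separates the two.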
To
understand the approach of SenticNet to dealing with such challenges and
improving sentiment analysis, we need to look at its origins.
Origins from a commonsense knowledge base
SenticNet
started as a project at the MIT Media Lab in 2009.
“They had
this big knowledge base of commonsense and I thought, why don’t we use it
for sentiment analysis,” Dr. Cambria said. “Back then, sentiment analysis was
not very popular but in the past few years, its popularity has increased
dramatically, because of the research challenges and also because of
the business opportunities. For instance, so many companies want to know
what their customers like about their products.”
In AI
research, commonsense knowledge is the collection of facts and information that
an ordinary person is expected to know. These are facts so obvious, so trivial, that no
one would think of mentioning them explicitly, such as that a chair is for sitting
on, or that we drink water to quench our thirst.
Natural language is used only
to communicate knowledge that is not already shared through common experience. The
challenge is to represent this general knowledge, which most people possess,
in a form that is available to AI programs.
A knowledge
base here refers to a semantic network with millions of nodes, connected by
links that encode pieces of commonsense information. For example, ‘beer’ and ‘drink’
could be two nodes, and the link connecting them would represent the taken-for-granted
information that “beer is a drink”.
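A semantic network of this kind can be pictured as a list of labelled links between concepts. The sketch below is only an illustration of the data structure: the relation labels (IsA, UsedFor), the sample facts, and the helper names build_graph and facts_about are assumptions for the example and are not taken from the actual knowledge base.

```python
from collections import defaultdict

# Each edge is (source concept, relation, target concept).
# The relation names are hypothetical labels chosen for this sketch.
EDGES = [
    ("beer", "IsA", "drink"),
    ("water", "IsA", "drink"),
    ("chair", "UsedFor", "sitting"),
    ("bed", "UsedFor", "sleeping"),
]


def build_graph(edges):
    """Index edges by source node so a concept's facts can be looked up."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return graph


def facts_about(graph, concept):
    """Render every outgoing link of a concept as a readable triple."""
    return [f"{concept} {relation} {target}" for relation, target in graph.get(concept, [])]


graph = build_graph(EDGES)
print(facts_about(graph, "beer"))  # ['beer IsA drink']
```

A real knowledge base holds millions of such nodes and links, but the principle is the same: each link stores one taken-for-granted fact.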
The MIT Media Lab has a portal called Open Mind Common Sense
(OMCS), which collects pieces of knowledge from volunteers on the Internet by
enabling them to enter commonsense into the system with no special training or
knowledge of computer science.
Volunteers on
the web would answer questions such as “what is a bed used for?”, “what is a
beer for?” and “where do you usually find the knife?”. Only answers
that occurred more than a few times were inserted into the semantic graph.
“If many people said that the bed is for sleeping, you take that as a good
piece of commonsense,” Dr. Cambria said.
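The filtering step can be sketched as a simple frequency count over volunteer answers. The sample answers, the accepted_answers helper, and the threshold of three votes are assumptions for illustration. The article only says an answer had to occur “more than a few times”.

```python
from collections import Counter

# Hypothetical volunteer answers as (question, answer) pairs.
ANSWERS = [
    ("what is a bed used for?", "sleeping"),
    ("what is a bed used for?", "sleeping"),
    ("what is a bed used for?", "sleeping"),
    ("what is a bed used for?", "jumping"),
    ("what is a beer for?", "drinking"),
    ("what is a beer for?", "drinking"),
    ("what is a beer for?", "drinking"),
]


def accepted_answers(answers, min_count=3):
    """Keep only (question, answer) pairs given at least min_count times."""
    counts = Counter(answers)
    return {pair: n for pair, n in counts.items() if n >= min_count}


for (question, answer), n in accepted_answers(ANSWERS).items():
    print(f"{question} -> {answer} ({n} votes)")
```

With this toy data, the one-off answer “jumping” is dropped, while “sleeping” and “drinking” pass the threshold.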
ConceptNet is a semantic
network based on the information in the OMCS database. SenticNet was built
based on ConceptNet, focusing on concepts that are either positive or negative,
because the eventual objective of SenticNet is to conduct sentiment analysis.
“We started
as just a knowledge base, then from there we went on into the fundamental
problems of natural language processing for sentiment analysis. While
before we were just focusing on knowledge representation, later we got more and
more interested in commonsense reasoning and linguistics. We went from having
just SenticNet to having Sentic patterns and other
reasoning techniques like AffectiveSpace and things that altogether allow us to
do sentiment analysis in a human-like way,” Dr. Cambria said, describing
the evolution of SenticNet.
Machine learning is not enough
Dr. Cambria
said, “We try to take inspiration from how the human brain actually understands
things, which is a very different approach from pure machine learning.”
The big
difference between Sentic computing and other techniques is that Sentic
computing is a hybrid approach that uses machine learning alongside
knowledge representation, reasoning and linguistics.
With recent developments in machine learning methods like deep
networks, most researchers are pinning their hopes on feeding massive volumes
of data to algorithms. Dr.
Cambria, however, believes that commonsense is key to improving AI. Relying only
on statistics, probabilities and co-occurrence frequencies is not enough.
He went on to highlight three big issues with machine
learning. The first is ‘Dependency’, as machine learning requires a lot of
training data and is domain-dependent.
The second issue is ‘Consistency’, as changes or tweaks in
the learning model may lead to different results. The third is ‘Transparency’,
that is, the way machine learning performs decision-making is a black box. We
do not know why the algorithms arrived at the conclusions they did. In fact,
this very property makes machine learning a powerful tool: researchers don’t
need to understand the data. They
can just feed data to a neural network or whatever learning algorithm they are
using, which learns the features automatically and then makes decisions.
But we never know why the algorithm makes those decisions. This lack of
transparency can be a major problem if we are using AI to perform activities
that involve ethics, such as selecting candidates for a job opening.
In the context of NLP, Dr. Cambria said that these issues
are crucial because, unlike in other fields, they prevent AI from achieving
human-like performance. AI researchers need to bridge the gap between
statistical NLP and many other disciplines that are necessary for understanding
human language, such as linguistics, commonsense reasoning, and affective
computing (affective computing is the study and development of systems and
devices that can recognise, interpret, and process human affects or emotions).
Coupling top-down and bottom-up AI
For the reasons discussed above, Dr. Cambria
advocates a combination
of symbolic and sub-symbolic AI. Symbolic models, such as semantic networks,
represent a top-down approach to encode meaning. Sub-symbolic methods, such as
neural networks, represent a bottom-up approach to infer syntactic patterns
from data (syntax is the set of rules, principles, and processes that govern
word order and sentence structure). The top-down approach helps gain
transparency, while data-driven deep learning enables the automatic detection
of patterns.
In a paper titled “SenticNet 5: Discovering Conceptual
Primitives for Sentiment Analysis by Means of Context Embeddings”, Dr.
Cambria along with his co-authors explores how the two approaches might
complement each other. The paper discusses the use of the bag-of-concepts
model (as opposed to bag-of-words, in which a text is represented as a bag or
set of its constituent words) for sentiment analysis. The bag-of-concepts model has
the advantage over bag-of-words of being able to handle multiword
expressions like ‘pretty ugly’ or ‘sad smile’, which would be split up in the
latter model and hence lose their polarity, i.e., their positive or negative
meaning (for example, ‘pretty’ would be read as an adjective rather than an adverb). It
also avoids the blind use of keywords and word co-occurrence counts.
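The difference between the two representations can be shown with a toy example. This is a simplified sketch, not the method used in the paper: the hand-picked list of multiword concepts, the greedy two-word matching, and the function names are assumptions for illustration, and word order is kept only for readability.

```python
# Hypothetical list of known multiword concepts.
MULTIWORD_CONCEPTS = {("pretty", "ugly"), ("sad", "smile")}


def bag_of_words(text):
    """Split text into individual lowercase words."""
    return text.lower().split()


def bag_of_concepts(text):
    """Greedily merge adjacent word pairs that form a known concept."""
    words = bag_of_words(text)
    concepts = []
    i = 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in MULTIWORD_CONCEPTS:
            # Keep the expression together as one unit.
            concepts.append("_".join(pair))
            i += 2
        else:
            concepts.append(words[i])
            i += 1
    return concepts


print(bag_of_words("that sweater is pretty ugly"))
# ['that', 'sweater', 'is', 'pretty', 'ugly']
print(bag_of_concepts("that sweater is pretty ugly"))
# ['that', 'sweater', 'is', 'pretty_ugly']
```

In the first output, ‘pretty’ stands alone and would look positive to a word-level lexicon. In the second, ‘pretty_ugly’ survives as a single negative concept.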
But now the problem is that the bag-of-concepts model cannot
achieve comprehensive coverage of meaningful concepts, i.e., a full list of
multiword expressions that actually make sense. Models could be used to extract
concepts from raw data, but such approaches are prone to errors due to the richness
and ambiguity of natural language. The paper's answer is based on the idea that there is a
finite set of mental primitives for affect-bearing concepts and a finite set of
principles of mental combination governing their interaction.
The paper goes on to propose the generalisation of concepts
with related meaning, such as ‘munch toast’ and ‘slurp noodles’, into the
conceptual primitive ‘EAT FOOD’. Sub-symbolic AI could now be used to automatically
discover the conceptual primitives that can better generalise SenticNet’s
commonsense knowledge.
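The idea of generalising concepts into a primitive can be illustrated with a simple lookup. In the paper the groupings are discovered automatically by sub-symbolic methods. The hand-written dictionaries below are only a stand-in for that step, and every name in the sketch (VERB_PRIMITIVES, NOUN_PRIMITIVES, to_primitive) is hypothetical.

```python
# Hand-written, hypothetical mappings from words to primitives.
VERB_PRIMITIVES = {"munch": "EAT", "slurp": "EAT", "devour": "EAT"}
NOUN_PRIMITIVES = {"toast": "FOOD", "noodles": "FOOD", "pizza": "FOOD"}


def to_primitive(concept):
    """Map a 'verb noun' concept to its primitive, or None if unknown."""
    parts = concept.lower().split()
    if len(parts) != 2:
        return None
    verb, noun = parts
    verb_primitive = VERB_PRIMITIVES.get(verb)
    noun_primitive = NOUN_PRIMITIVES.get(noun)
    if verb_primitive is None or noun_primitive is None:
        return None
    return f"{verb_primitive} {noun_primitive}"


print(to_primitive("munch toast"))    # EAT FOOD
print(to_primitive("slurp noodles"))  # EAT FOOD
```

Once both expressions resolve to the same primitive, anything known about ‘EAT FOOD’ can be applied to each of them, including combinations such as ‘devour pizza’ that were never listed explicitly.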
This approach would also help in tackling the symbol
grounding problem. Our understanding of language is grounded in the physical
world, in sensations, in memory. A computer does not learn meaning like that. The
meaning of a word on a page or computer screen is ungrounded, and looking it up
in a dictionary would not help.
One article on the subject
explains the problem like this: “If I tried to look up the meaning of a word I
did not understand in a (unilingual) dictionary of a language I did not already
understand, I would just cycle endlessly from one meaningless definition to
another. My search for meaning would be ungrounded. In contrast, the meaning of
the words in my head — the ones I do understand — are
"grounded" (by a means that cognitive neuroscience will eventually
reveal to us). And that grounding of the meanings of the words in my head
mediates between the words on any external page I read (and understand) and the
external objects to which those words refer.”
In the approach presented in the paper, several adjectives
and verbs are defined in terms of a single ‘primitive’ item, thereby
grounding those meanings in that one primitive. This does not solve the symbol
grounding problem, but it reduces it.
Current applications
SenticNet’s
research is being applied in several projects, ranging from
fundamental knowledge representation problems to applications of commonsense
reasoning in contexts such as big social data analysis and human-computer
interaction.
For
instance, a project in collaboration with Prof. Roy Welsch from MIT
Sloan School of Management focuses on natural
language-based financial forecasting (NLFF). Markets are driven by
sentiments, and understanding those sentiments from data can help predict
market movements.
SenticNet
is also developing tools that allow patients to easily and efficiently measure
their health-related quality of life, and is working to improve human-computer interaction (HCI) by developing dialogue systems
with commonsense.
Another
project, called PONdER (Public Opinion of Nuclear Energy), aims to
collect, aggregate, and analyse opinions towards nuclear energy in different
languages and across Singapore, Malaysia, Indonesia, Thailand, and Vietnam.
Understanding how the public perceives nuclear energy in the region enables
policymakers to make informed national policies and decisions pertaining to
nuclear energy, as well as shape communication strategies to inform the public
about nuclear energy.
Dr. Cambria
said that, personally, he is more interested in the fundamental problems of
AI and sentiment analysis, such as solving the symbol grounding
problem or building machines that can really understand language (IQ),
emotions (EQ), and culture (CQ).
“Today, we
still don’t have machines that really understand natural
language. Siri does not understand natural language, Watson is an amazing answering
machine but it does not understand language. At SenticNet, we want to go beyond
rule-based and stats-based systems. What we are working on is
not really NLP research anymore; it is natural language understanding.”