Difference between revisions of "NLTK: Sentiment Strength Detection in Bahasa Indonesia"

From OnnoWiki

Revision as of 10:36, 25 February 2017

SentiStrengthID

Sentiment Strength Detection in Bahasa Indonesia. This is an unsupervised version of SentiStrength (http://sentistrength.wlv.ac.uk/) for Bahasa Indonesia. Core features:

  • Sentiment Lookup
  • Negation Word Lookup
  • Booster Word Lookup
  • Emoticon Lookup
  • Idiom Lookup
  • Question Word Lookup
  • Slang Word Lookup
  • Spelling Correction (optional) using Peter Norvig's spelling corrector (http://norvig.com/spell-correct.html)
  • Negative emotion is ignored in questions
  • Exclamation marks count as +2
  • Repeated Punctuation boosts sentiment
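
The lookup rules listed above can be illustrated with a minimal sketch. It is not the SentiStrengthID implementation: the word lists, their scores, and the function name score_sentence are hypothetical, and the handling of exclamation marks (raising the positive score to at least +2) is only one possible reading of that rule. It assumes the SentiStrength convention of reporting a positive score from 1 to 5 and a negative score from -1 to -5.

```python
# Hypothetical mini-lexicons; the real lists and scores are in SentiStrengthID.
SENTIMENT = {"senang": 3, "bagus": 3, "buruk": -3, "benci": -4}
NEGATION = {"tidak", "bukan"}
BOOSTER = {"sangat": 1, "agak": -1}


def score_sentence(text):
    """Return a (positive, negative) pair for one sentence."""
    tokens = text.lower().replace("!", " ! ").split()
    pos, neg = 1, -1  # neutral baseline
    for i, tok in enumerate(tokens):
        if tok not in SENTIMENT:
            continue
        strength = SENTIMENT[tok]
        j = i - 1
        if j >= 0 and tokens[j] in BOOSTER:
            # A booster changes the magnitude and keeps the sign.
            delta = BOOSTER[tokens[j]]
            strength += delta if strength > 0 else -delta
            j -= 1
        if j >= 0 and tokens[j] in NEGATION:
            # A negation word before the term flips its polarity.
            strength = -strength
        pos = max(pos, strength)
        neg = min(neg, strength)
    if "!" in tokens:
        # Assumed reading of "exclamation marks count as +2".
        pos = max(pos, 2)
    # Keep both scores inside the 1..5 and -1..-5 ranges.
    return min(pos, 5), max(neg, -5)


if __name__ == "__main__":
    for sentence in ["film ini sangat bagus", "saya tidak senang", "halo!"]:
        print(sentence, score_sentence(sentence))
```

Keeping the strongest positive and the strongest negative term separately, rather than summing them, follows the dual-score output that SentiStrength uses.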

Ignored rules:

   Letters repeated more than twice boost the sentiment score. This rule is not applied because my own pre-processing step removes extra characters from words.
   A score of +2, -2 for the word "miss". This does not apply to Bahasa Indonesia.
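
A minimal sketch of the kind of pre-processing step that makes the repeated-letter rule unnecessary is shown here. The function name collapse_repeats and the exact rule (collapsing any run of three or more identical letters to a single letter) are assumptions; the pre-processing actually used in SentiStrengthID may differ.

```python
import re

# Three or more of the same ASCII letter in a row (hypothetical threshold).
_REPEATS = re.compile(r"([A-Za-z])\1{2,}")


def collapse_repeats(text):
    """Collapse runs of 3+ identical letters to a single letter."""
    return _REPEATS.sub(r"\1", text)


if __name__ == "__main__":
    print(collapse_repeats("baguuuus"))  # elongated form of "bagus"
    print(collapse_repeats("maaf"))      # legitimate double letter is kept
```

Once elongated words are normalised this way, no word reaching the scorer contains a run of more than two letters, so the boost rule would never fire.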

Warning!

This is a work in progress: experimental code for my Master's thesis.


Modify the Source Code

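import argparse


def parse_args():
    # Read the input filename from the command line (-i / --infile).
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--infile', default='', help='input filename')
    return parser.parse_args()


def main():
    args = parse_args()

    # Read the whole input file; the context manager closes it afterwards.
    with open(args.infile, 'r') as infile:
        fcontent = infile.read()

    # sentiStrength and spellCheck are expected to be defined in the
    # SentiStrengthID source that this code is added to.
    ss = sentiStrength()
    sc = spellCheck()

    # Score one line of text at a time. Iterating over the string itself
    # would pass single characters to ss.main().
    for t in fcontent.splitlines():
        print(ss.main(t))
    print("=====================")
    print(ss.getSentimenScore())


if __name__ == '__main__':
    main()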


References