Twitter Natural Language Processing (cmu.edu)
11 points by elyase 3467 days ago | 1 comment


5 points by pmlandwehr 3467 days ago

So, an important follow-up to this if you don't want to deal with Java: Myle Ott made a Python port of what I believe is the latest version. Its output is almost, but not quite, identical to the Java version's: 83 tweets differed in a text corpus of 1,000,000. You can get it on GitHub at https://github.com/myleott/ark-twokenize-py
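
If you want to run that kind of comparison on your own corpus, here is a rough sketch. It assumes you have already dumped each tokenizer's output to one line per tweet with tokens separated by whitespace. The helper names (`count_differences`, `mismatch_rate`) are made up for illustration and are not part of either tokenizer.

```python
from itertools import zip_longest
from typing import Iterable, List, Tuple


def count_differences(java_lines: Iterable[str],
                      python_lines: Iterable[str]) -> Tuple[int, List[int]]:
    """Compare two tokenizer outputs line by line.

    Assumption: each line is one tweet, tokens separated by whitespace.
    Returns (number of tweets compared, indices of tweets whose tokens differ).
    """
    total = 0
    differing = []
    # zip_longest pads the shorter input with "", so an extra non-empty
    # line in one output is counted as a difference rather than dropped.
    for i, (a, b) in enumerate(zip_longest(java_lines, python_lines, fillvalue="")):
        total += 1
        if a.split() != b.split():
            differing.append(i)
    return total, differing


def mismatch_rate(differing: int, total: int) -> float:
    """Fraction of tweets that were tokenized differently."""
    return differing / total if total else 0.0


if __name__ == "__main__":
    # Tiny made-up sample, not real tokenizer output.
    java_out = ["@user hello :)", "so cool !!!"]
    python_out = ["@user hello :)", "so cool ! ! !"]
    total, diff = count_differences(java_out, python_out)
    print(total, diff, f"{mismatch_rate(len(diff), total):.2%}")
```

For the numbers above, `mismatch_rate(83, 1_000_000)` works out to 0.0083% of tweets.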
