1 point by thegoz 2811 days ago | link | parent

May I know what preprocessing steps you applied to the strings of the books? Good job. Really useful.


3 points by jeradf 2811 days ago | link

For each book, I combined its user reviews up to a max length of 10,0000 words. Any book with fewer than 500 total words was dropped. All punctuation and stop words were removed, and training was done using the doc2vec PV-DBOW method (`dm=0, dbow_words=True` in Gensim).
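
A rough sketch of what those steps could look like in Python, not the actual code used. The stop-word list, the lowercasing, counting words before cleaning, and the names `build_corpus`, `train_dbow`, `vector_size` and `epochs` are all assumptions for illustration; only the 500-word minimum and the `dm=0, dbow_words` flags come from the description above. The word cap is left as a required argument because the exact figure is not clear from the comment.

```python
import string

# Tiny illustrative stop list (an assumption; the real list is not stated).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is", "it", "in", "was", "i", "this"}
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def build_corpus(reviews_by_book, max_words, min_words=500):
    """Return {book_id: tokens} from {book_id: [review, ...]}."""
    corpus = {}
    for book_id, reviews in reviews_by_book.items():
        # Combine all reviews for the book and cap the total length.
        words = " ".join(reviews).split()[:max_words]
        # Drop books with too little review text
        # (counted before cleaning, which is an assumption).
        if len(words) < min_words:
            continue
        tokens = []
        for word in words:
            # Lowercasing is an assumption; punctuation is stripped per the description.
            word = word.lower().translate(_PUNCT_TABLE)
            if word and word not in STOP_WORDS:
                tokens.append(word)
        corpus[book_id] = tokens
    return corpus


def train_dbow(corpus, vector_size=100, epochs=20):
    """Train PV-DBOW with word vectors; hyperparameter values are placeholders."""
    # Imported here so the preprocessing above works without Gensim installed.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    docs = [TaggedDocument(words=tokens, tags=[book_id])
            for book_id, tokens in corpus.items()]
    # dm=0 selects PV-DBOW; dbow_words=1 also trains word vectors.
    return Doc2Vec(docs, dm=0, dbow_words=1,
                   vector_size=vector_size, epochs=epochs)
```

Calling `train_dbow(build_corpus(reviews_by_book, max_words=...))` would then give a model whose `model.dv` holds one vector per book ID (Gensim 4.x naming).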

-----

1 point by thegoz 2810 days ago | link

You used only the reviews, without using the text from the books themselves? If that is true, I am really impressed by how well it works.

-----



