1 point by syllogism 2756 days ago

(Author here)

spaCy's definitely the fastest NLP system you can easily use.

I'm told one of Google's internal systems is >10x faster, which is intriguing, and I guess unsurprising.

A research system was published this year that's faster than spaCy, thanks to running on the GPU. I think their code might be available, but I doubt it's production quality.

Here's the Lewis et al. paper, which compares against spaCy and other systems on efficiency:

http://homes.cs.washington.edu/~lsz/papers/llz-naacl16.pdf

Another paper from 2015 also concluded spaCy was the fastest.

http://aclweb.org/anthology/P/P15/P15-1038.pdf (I also wrote Redshift, the other efficiency outlier in their table.)

spaCy's parsing algorithm is described in this too-short paper:

https://aclweb.org/anthology/D/D15/D15-1162.pdf

All of these papers are linked in the benchmarks section of the spaCy homepage: https://spacy.io

As for Keras being the most popular deep learning library for Python... I thought this was absolutely obvious? It was close in 2015, but now it's not. I guess if you count Keras users as TensorFlow users, TensorFlow might come out on top.





