SpaCy v1.0: Deep Learning with custom pipelines and Keras (explosion.ai)
15 points by jazzydag 2745 days ago | 3 comments


1 point by tarlen 2745 days ago | link

There are a couple of very strong assertions in the opening paragraphs here:

1) spaCy is the "fastest NLP library in the world" 2) Keras is "the most popular deep learning library for Python"

Are these correct? Are there numbers to back up either statement anywhere else they've written?

-----

1 point by elyase 2742 days ago | link

Regarding Keras: the author compiles monthly statistics; this is the latest one:

https://twitter.com/fchollet/status/787881948274266112

-----

1 point by syllogism 2745 days ago | link

(Author here)

spaCy's definitely the fastest NLP system you can easily use.

I'm told one of Google's internal systems is >10x faster, which is intriguing, and I guess unsurprising.

A research system was published this year that gets faster than spaCy by using the GPU. I think their code might be available, but I doubt it's production quality.

Here's the Lewis et al. paper, which compares against spaCy and other systems for efficiency:

http://homes.cs.washington.edu/~lsz/papers/llz-naacl16.pdf

Another paper from 2015 also concluded spaCy was the fastest.

http://aclweb.org/anthology/P/P15/P15-1038.pdf (I also wrote the other efficiency-outlier in their table, Redshift.)

spaCy's parsing algorithm is described in this too-short paper:

https://aclweb.org/anthology/D/D15/D15-1162.pdf

All of these papers are linked in the benchmarks section of the spaCy homepage: https://spacy.io

As for Keras being the most popular deep learning library for Python... I thought this was absolutely obvious? It was close in 2015, but now it's not. I guess if you count Keras users as TensorFlow users, TensorFlow might come out on top.

-----



