A Keras multithreaded DataFrame generator for millions of image files (appnexus.com)
5 points by timehaven 16 days ago | 1 comment




1 point by timehaven 15 days ago

This tutorial-style post offers an alternative way to handle large sets of training data without copying or moving files into hard-coded directories named `train`, `validation` and `test`. Instead, you keep the files where they "naturally" reside on your system, track their locations in a Pandas DataFrame, and feed their names to the Keras generator. The approach scales well to millions of image files and hundreds of gigabytes of data.
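
For anyone who wants the gist before reading the post, here is a rough sketch of the idea, not the code from the linked article. It assumes a DataFrame with hypothetical `path`, `label` and `split` columns, and a caller-supplied `load_image` function. The function name `dataframe_batches`, the file paths, and the stand-in loader are all made up for illustration. A thread pool loads the files of each batch concurrently, which is one simple way to get multithreaded reads.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd


def dataframe_batches(df, load_image, batch_size=32, path_col="path",
                      label_col="label", shuffle=True, workers=4, seed=0):
    """Yield (images, labels) batches forever from files listed in a DataFrame.

    Hypothetical helper: files stay wherever they already live on disk and
    only their paths are tracked in `df`.
    """
    rng = np.random.default_rng(seed)
    n = len(df)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:  # Keras-style generators loop indefinitely
            order = rng.permutation(n) if shuffle else np.arange(n)
            for start in range(0, n, batch_size):
                rows = df.iloc[order[start:start + batch_size]]
                # pool.map preserves input order, so images stay aligned with labels
                images = np.stack(list(pool.map(load_image, rows[path_col])))
                labels = rows[label_col].to_numpy()
                yield images, labels


# Illustrative table: the split is a column, not a directory name.
files = pd.DataFrame({
    "path": [f"/data/site_{i % 3}/img_{i}.jpg" for i in range(10)],
    "label": [i % 2 for i in range(10)],
    "split": ["train"] * 8 + ["validation"] * 2,
})


def fake_load(path):
    # Stand-in loader so the sketch runs without real image files;
    # a real one would read and resize the image at `path`.
    return np.zeros((8, 8, 3), dtype=np.float32)


train_gen = dataframe_batches(files[files["split"] == "train"], fake_load,
                              batch_size=4)
images, labels = next(train_gen)
print(images.shape, labels.shape)  # (4, 8, 8, 3) (4,)
```

Selecting a different split is just a different row filter on the same DataFrame, so no files need to be copied or moved.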
