A Keras multithreaded DataFrame generator for millions of image files (appnexus.com)
5 points by timehaven 2431 days ago | 1 comment


1 point by timehaven 2430 days ago | link

This is a tutorial-style post that offers an alternative approach to handling large sets of training data without copying or moving files into hard-coded directories named `train`, `validation` and `test`. Instead, you keep the files where they "naturally" reside on your system, track their locations with a Pandas DataFrame, and feed their names to the Keras generator. The approach scales well to millions of image files and hundreds of gigabytes of data.
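A rough sketch of the idea, not the code from the linked post: a thread-safe iterator that reads file paths and labels from a DataFrame and yields `(X, y)` batches. The class name `DataFrameBatchGenerator`, the `load_fn` callable, and the column names `path` and `label` are all made up for illustration. It assumes every loaded image has the same shape, and it leaves the actual decoding to whatever `load_fn` you pass in.

```python
import threading

import numpy as np
import pandas as pd


class DataFrameBatchGenerator:
    """Hypothetical thread-safe iterator over file paths listed in a DataFrame."""

    def __init__(self, df, load_fn, batch_size=32,
                 path_col="path", label_col="label", shuffle=True, seed=0):
        if len(df) == 0:
            raise ValueError("DataFrame has no rows to iterate over")
        self.df = df.reset_index(drop=True)
        self.load_fn = load_fn          # callable: path -> np.ndarray (assumed)
        self.batch_size = batch_size
        self.path_col = path_col
        self.label_col = label_col
        self.shuffle = shuffle
        self.rng = np.random.default_rng(seed)
        self.lock = threading.Lock()
        self.order = self._new_order()
        self.pos = 0

    def _new_order(self):
        # Row positions for one pass over the data, optionally shuffled.
        idx = np.arange(len(self.df))
        if self.shuffle:
            self.rng.shuffle(idx)
        return idx

    def __iter__(self):
        return self

    def __next__(self):
        # Only the index bookkeeping is locked, so several worker threads
        # can load files concurrently once they have claimed their rows.
        with self.lock:
            if self.pos >= len(self.order):
                self.order = self._new_order()
                self.pos = 0
            batch_idx = self.order[self.pos:self.pos + self.batch_size]
            self.pos += self.batch_size
        rows = self.df.iloc[batch_idx]
        X = np.stack([self.load_fn(p) for p in rows[self.path_col]])
        y = rows[self.label_col].to_numpy()
        return X, y
```

The files never move: train, validation and test splits are just different row subsets of the same DataFrame, each wrapped in its own generator. The iterator loops forever, so the caller decides how many batches make up an epoch; the last batch of each pass may be smaller than `batch_size`.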
