A Keras multithreaded DataFrame generator for millions of image files
5 points by timehaven 355 days ago | 1 comment

1 point by timehaven 354 days ago | link

This is a tutorial-style post that offers an alternative approach to dealing with large sets of training data, without resorting to copying and moving files to hard-coded directories named `train`, `validation` and `test`. Instead, you keep the files where they "naturally" reside on your system and track their locations with a Pandas DataFrame, feeding their names to the Keras generator. It scales well when dealing with millions of image files and hundreds of gigabytes of data.
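To make the idea concrete, here is a minimal sketch of a DataFrame-driven batch generator. It is not the code from the post: the column names `path` and `label`, the function `dataframe_batches`, the stand-in loader `fake_load`, and the use of a thread pool to read each batch's files concurrently are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd


def dataframe_batches(df, load_fn, batch_size=32, workers=4, seed=0):
    """Yield (images, labels) batches forever from a DataFrame of file paths.

    Assumes hypothetical columns `path` and `label`; `load_fn` maps one path
    to a fixed-shape NumPy array.
    """
    rng = np.random.default_rng(seed)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            order = rng.permutation(len(df))  # reshuffle at every epoch
            for start in range(0, len(df), batch_size):
                rows = df.iloc[order[start:start + batch_size]]
                # Load this batch's files concurrently, keeping row order.
                images = list(pool.map(load_fn, rows["path"]))
                yield np.stack(images), rows["label"].to_numpy()


def fake_load(path):
    # Stand-in loader so the sketch runs without real image files;
    # a real loader would read and resize the image at `path`.
    return np.zeros((8, 8, 3), dtype=np.float32)


# The files stay wherever they live; the DataFrame only records where.
df = pd.DataFrame({
    "path": [f"/data/site_{i % 3}/img_{i}.jpg" for i in range(10)],
    "label": [i % 2 for i in range(10)],
})
batches = dataframe_batches(df, fake_load, batch_size=4)
x, y = next(batches)
```

Train, validation and test splits then become row selections on the DataFrame rather than directories on disk. A generator of this shape can usually be handed to Keras training (`fit_generator` in the Keras versions of that time), though the exact call depends on the Keras version in use.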

