Commit d70ee60d authored by DocGarbanzo, committed by GitHub

Change training pipeline from tf.Sequence to tf.data (#701)

* Improve pipeline use: instead of building a list of pipelines with a single transform each, build a single pipeline with a list of transforms (in practice, just looping through functions to go from TubRecord -> image -> augment -> normalise -> x and TubRecord -> y).
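
The single-pipeline idea above can be sketched roughly as follows. This is only an illustration, not the code in this commit: `compose`, `make_pipeline`, and the dict-based records are hypothetical stand-ins for the real TubRecord handling.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple

# Hypothetical stand-in: a record is a plain dict here, not a real TubRecord.
Record = dict


def compose(transforms: List[Callable[[Any], Any]]) -> Callable[[Any], Any]:
    """Chain a list of transforms into one function, applied left to right."""
    def apply(value: Any) -> Any:
        for transform in transforms:
            value = transform(value)
        return value
    return apply


def make_pipeline(records: Iterable[Record],
                  x_transforms: List[Callable[[Any], Any]],
                  y_transform: Callable[[Record], Any]) -> Iterator[Tuple[Any, Any]]:
    """Lazily yield (x, y) pairs; nothing is rolled out into a list."""
    to_x = compose(x_transforms)
    for record in records:
        yield to_x(record), y_transform(record)


# Assumed toy transforms mirroring record -> image -> augment -> normalise.
def load_image(record: Record) -> List[int]:
    return record["image"]


def augment(image: List[int]) -> List[int]:
    return image  # identity placeholder for a real augmentation


def normalise(image: List[int]) -> List[float]:
    return [pixel / 255 for pixel in image]


records = [{"image": [0, 255], "angle": 0.5}]
pairs = list(make_pipeline(records,
                           [load_image, augment, normalise],
                           lambda r: r["angle"]))
```

Because `make_pipeline` is a generator, each record is transformed only when the consumer asks for it.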

Fixed TfmIterators and TfmIterables.
* Iterables are the containers and are sized - these are the user objects
* Iterators are protocol objects that allow iteration; they have no logic and are local to the Iterables
* build/map_pipeline both return sized Iterables
* removed all batch logic, which is not required
* left the code that uses the generator-based pipeline in place, still commented out, because it is simpler code
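
A minimal sketch of the Iterable/Iterator split described above, under the assumption that the container owns the transform and its length while the iterator only tracks position. The class names `SizedTransformIterable` and `_TransformIterator` are hypothetical and are not the TfmIterable/TfmIterator implementation from this commit.

```python
from typing import Any, Callable


class _TransformIterator:
    """Protocol object: tracks position only and delegates to its parent."""

    def __init__(self, parent: "SizedTransformIterable") -> None:
        self._parent = parent
        self._source_iter = iter(parent.source)

    def __iter__(self) -> "_TransformIterator":
        return self

    def __next__(self) -> Any:
        # StopIteration from the source propagates and ends the iteration.
        return self._parent.transform(next(self._source_iter))


class SizedTransformIterable:
    """User-facing container: sized, re-iterable, and lazily transformed."""

    def __init__(self, source: Any, transform: Callable[[Any], Any]) -> None:
        self.source = source        # any sized, iterable object
        self.transform = transform  # applied per item on iteration

    def __len__(self) -> int:
        return len(self.source)

    def __iter__(self) -> _TransformIterator:
        return _TransformIterator(self)

    def map(self, fn: Callable[[Any], Any]) -> "SizedTransformIterable":
        """Return a new sized iterable that applies fn on top of this one."""
        return SizedTransformIterable(self, fn)


base = SizedTransformIterable([1, 2, 3], lambda v: v * 2)
mapped = base.map(lambda v: v + 1)
```

Since each call to `iter()` creates a fresh iterator, the container can be traversed more than once, and `len()` stays available after mapping.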

Using new small temporary pipeline generator
* this keeps the TubSequence lazy and avoids rolling out the pipeline into a list
* added a test to check consistency of the pipeline
* removed the augmentation file, which was empty after its contents were moved
* removed augmentation from old tub (as it's not needed and we removed the old augmentation)

New pipeline changes:
* moved augmentation into its own class that is used above and can be used as a threaded or non-threaded part
* moved train functionality out of the template and added 'donkey train'; train.py is now just a simple dummy script for backward compatibility

* Address code reviews:
* Re-base on current dev to use un-altered sequence.py
* Add iterator consistency test to pipeline tests
* Undo changes in fast_stretch.py
* better tf shape manipulation
* small code improvements in training.py
* remove sleep in augment part

* Address code reviews:
* Add clearing of tubrecord list and minor renamings
parent db22d316