Roboflow Review 2
A pretrained model I was using was underperforming because it could not detect a certain breed of dog. It worked great on long-nosed dogs, but for some reason it just could not detect my cousin's pug. So I did what any other sensible person would do: I decided to train my own model from scratch.
I had never done this before, so I assumed it would be easy. I pulled together 3 different datasets of animals, each in a different annotation format. I could just combine them into one folder and everything would work out, right?
I ended up spending around 30 hours over the next week arranging, shuffling, combining, filtering, and deleting annotations for that dataset of 12,000 images. It was meticulous, repetitive, and exhausting. Every time I thought I had it, I didn't, and I had to start all over again. Eventually I had a cleaned and annotated dataset, but then I got errors when trying to convert it to TFRecords for training. That is when I gave up and decided to use Roboflow for that project.
Roboflow.com is kind of like a cheat code. It lets you skip past all the boring stuff and get straight to the good stuff. I didn't have to process my annotations myself: it took them in from all 3 formats I was using and standardized them. It told me which annotations were likely mistakes and which images to correct. That 30-hour project was completed in 30 minutes using Roboflow, and I wish I had used it from the beginning rather than waste so much of my precious, delicious time. It takes all the boring, tedious, draining work out of ML: the dataset cleaning, the dataset health checks, and so on. I can even use it just to check whether my dataset is labeled correctly. It is a bit expensive if you are using very large datasets, but if you factor in how much your time is worth and how much time you are saving, it is definitely worth the cost.
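For anyone curious what "likely mistakes" can mean in practice, here is a minimal sketch of the kind of sanity check I was doing by hand. This is not how Roboflow does it (I have no idea what they run internally). The function name, the data layout (a dict of image name to a list of pixel boxes), and the two rules are all my own assumptions, just to illustrate the idea.

```python
def find_suspect_boxes(annotations, image_sizes):
    """Flag boxes that are degenerate or fall outside their image.

    annotations: {image_name: [(xmin, ymin, xmax, ymax), ...]} in pixels
                 (hypothetical layout, chosen for this example)
    image_sizes: {image_name: (width, height)} in pixels
    Returns a list of (image_name, box, reason) tuples.
    """
    suspects = []
    for image_name, boxes in annotations.items():
        width, height = image_sizes[image_name]
        for box in boxes:
            xmin, ymin, xmax, ymax = box
            if xmax <= xmin or ymax <= ymin:
                # The box has no area, so it cannot contain an object.
                suspects.append((image_name, box, "zero or negative area"))
            elif xmin < 0 or ymin < 0 or xmax > width or ymax > height:
                # Part of the box lies beyond the image edges.
                suspects.append((image_name, box, "outside image bounds"))
    return suspects


if __name__ == "__main__":
    annotations = {"pug.jpg": [(10, 10, 50, 50), (60, 60, 60, 80), (90, 90, 120, 110)]}
    image_sizes = {"pug.jpg": (100, 100)}
    for image_name, box, reason in find_suspect_boxes(annotations, image_sizes):
        print(image_name, box, reason)
```

A check like this only catches the obvious problems. It will not tell you that a box is on the wrong dog.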
So after uploading to Roboflow I was able to export all my data as TFRecords easily, but it turns out I didn't have to.
I noticed a button that said "Google AutoML" on the Roboflow interface. From what I understand, Google AutoML gives great results, and I have heard from others that it is the best object detection model training software out there. However, Google's own documentation for the service is ridiculously bad. The process to get AutoML up and running is long and complicated: you have to connect via the command line, create an IAM role, create a storage bucket, upload the images, create a .csv file in their proprietary format (which I could not find clear instructions for), and then somehow figure out how to link that CSV file to their AutoML service. Google suffers from the problem of only employing geniuses with 10+ years of coding experience: their geniuses are completely unable to communicate with beginner or intermediate coders.
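To give a sense of what that CSV step involves, here is a rough sketch of building rows for an object detection import file. The column layout (split, image URI, label, then normalized corner coordinates with empty fields in between) is how I understand the AutoML Vision format, so treat it as an assumption and check Google's current docs before relying on it. The helper names and the `gs://my-bucket/...` path are made up for this example.

```python
import csv
import io


def automl_row(split, gcs_uri, label, xmin, ymin, xmax, ymax):
    """Build one CSV row for a single bounding box.

    Coordinates are assumed to be normalized to the 0..1 range.
    The empty strings stand in for the two corners that are left out
    when only the top-left and bottom-right corners are given.
    """
    return [
        split, gcs_uri, label,
        f"{xmin:.4f}", f"{ymin:.4f}", "", "",
        f"{xmax:.4f}", f"{ymax:.4f}", "", "",
    ]


def rows_to_csv(rows):
    """Serialize rows to CSV text (no header line)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical bucket and file name, for illustration only.
    row = automl_row("TRAIN", "gs://my-bucket/pug.jpg", "dog", 0.1, 0.2, 0.5, 0.6)
    print(rows_to_csv([row]), end="")
```

Multiply that by 12,000 images and three source formats and you can see why a one-click export felt like magic.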
I was particularly impressed with the way Roboflow handled Google AutoML, because I didn't have to do any of those things. I literally just clicked once to get the CSV link, and then copy-pasted it into AutoML. Easy peasy. Even better, Roboflow hosted my dataset on their own Google Cloud account, so I didn't have to pay for the data storage. I loved it.
In the end, Google AutoML gave me terrible results, and they charged me $160 for a model that is only 73% accurate. I didn't even download it because it isn't accurate enough for my purposes, AND Google wanted to charge me an additional fee to transfer the model to cloud storage before downloading it. They would not let me download the trained model directly, because that would be too easy: instead of clicking a download link, you must transfer the trained model to a Google Cloud Storage account (which you must set up and pay for) and THEN download the trained model from that storage account using the command line. What a ridiculous way to go. Talk about user-unfriendly.
So anyway, I decided to train my model elsewhere using Colab notebooks. Roboflow came to the rescue again. I was able to take the same dataset and try it out on YOLOv5. Normally it would have taken me hours to convert my 12,000 images into YOLO format, but Roboflow did it in seconds and with one click. I was able to train on YOLOv5, and I discovered that one of my component datasets (remember how at the beginning I said I combined 3 datasets) was causing all the trouble. After I eliminated it and reuploaded my other datasets to Roboflow, I was again able to convert all the annotations to YOLOv5 format and train again. I ended up getting a custom-trained model in the 90% accuracy range.
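If you are wondering what "converting to YOLO format" actually means, here is a small sketch for a single box. YOLO label files use one line per object: a class index followed by the box center, width, and height, all divided by the image size. The function names are my own, and I am assuming the source boxes are pixel corners (xmin, ymin, xmax, ymax), as in Pascal VOC style annotations.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized YOLO values.

    Returns (x_center, y_center, width, height), each in the 0..1 range
    as long as the box lies inside the image.
    """
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_line(class_id, box, img_w, img_h):
    """Format one object as a line of a YOLO label file."""
    x_center, y_center, width, height = voc_to_yolo(*box, img_w, img_h)
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200x200 pixel box in a 1000x800 image, class index 0.
    print(yolo_line(0, (100, 200, 300, 400), 1000, 800))
```

The math for one box is simple. The pain is doing it consistently across thousands of files from different sources, each with its own class names and quirks.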
I highly recommend www.Roboflow.com because of the amount of time it saves me on dataset preparation, analysis, and annotation conversion.