In this post, I’ll give you some tips on creating a building detector using Label Maker and the TensorFlow Object Detection API. We have a full walkthrough on GitHub if you want to follow along step by step.
Label Maker prepares satellite imagery as training data for machine learning algorithms. It does a lot of the heavy lifting so that you can focus on creating great Artificial Intelligence (AI). Label Maker works with a range of frameworks such as TensorFlow, MXNet and Keras. We’ve demonstrated training an image classifier with Keras and AWS and created an MXNet building classifier with Amazon SageMaker. Preparing data for TensorFlow presents some unique challenges, so we’ve created utility scripts that plug your prepared training data from Label Maker into the TensorFlow Object Detection API.
Our goal is to detect buildings in Mapbox satellite imagery over Mexico City. Building counts are a good indicator for urban management and economic development, and they are also helpful in natural disaster response after earthquakes, wildfires, or floods.
Label Maker wasn’t built only to create training data for TensorFlow; it also supports AI models built on top of Keras, MXNet or Theano. To prepare training data for TensorFlow Object Detection, use the standard Label Maker configuration and command line steps, but skip the final label-maker package step and use the included script instead:
python tf_records_generation.py --label_input=labels.npz --train_rd_path=data/train_buildings.record --test_rd_path=data/test_buildings.record
This will output the required TFRecord files. The TFRecord file format is a simple record-oriented binary format that many TensorFlow applications use. Once you have these files, you’re ready to set up the model and start training.
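If you want a quick sanity check on the generated files before training, here is a minimal sketch that counts the records in a TFRecord file using only the Python standard library. It relies on the TFRecord framing (an 8-byte little-endian length, a 4-byte checksum of the length, the payload, then a 4-byte checksum of the payload). The function name count_tfrecords is my own, and the sketch skips checksum verification, so treat it as a rough check rather than a validator.

```python
import struct


def count_tfrecords(path):
    """Count records by walking the TFRecord framing.

    Assumption: checksums are skipped, not verified, so this only
    confirms that the file's framing is complete.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            # Header: 8-byte little-endian length + 4-byte CRC of the length.
            header = f.read(12)
            if not header:
                break  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            # Body: the payload followed by a 4-byte CRC of the payload.
            body = f.read(length + 4)
            if len(body) < length + 4:
                raise ValueError("truncated record payload")
            count += 1
    return count
```

For example, count_tfrecords("data/train_buildings.record") should return the number of training examples the script wrote.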
To draw predicted bounding boxes over the image tiles after the training, use our utility script:
python tf_od_predict.py --model_name=building_od_ssd --path_to_label=data/building_od.pbtxt --test_image_path=images/test
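To give a feel for what drawing boxes involves, here is a small sketch of the coordinate step. The TensorFlow Object Detection API returns each box as (ymin, xmin, ymax, xmax) normalized to the 0 to 1 range, so the boxes have to be scaled to pixels before they can be drawn on a tile. The function name, the 256-pixel tile size, and the 0.5 score cutoff are assumptions for illustration and are not necessarily what tf_od_predict.py does.

```python
def boxes_to_pixels(boxes, scores, tile_size=256, min_score=0.5):
    """Keep confident detections and scale them to pixel coordinates.

    boxes: iterable of (ymin, xmin, ymax, xmax), each value in [0, 1].
    scores: iterable of confidence scores, one per box.
    Returns a list of (xmin, ymin, xmax, ymax) integer pixel tuples.
    tile_size and min_score are illustrative assumptions.
    """
    kept = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue  # drop low-confidence detections
        kept.append((
            round(xmin * tile_size),
            round(ymin * tile_size),
            round(xmax * tile_size),
            round(ymax * tile_size),
        ))
    return kept
```

The returned tuples are in the (left, top, right, bottom) order that most drawing libraries expect for rectangles.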
You will then be able to view your model’s predictions drawn over the tiles.
We also prepared a script to evaluate the model performance using intersection over union (IOU, also known as Jaccard index):
python tf_iou.py --model_name=building_od_ssd --path_to_label=data/building_od.pbtxt --test_image_path=images/test
I ran the model from our full walkthrough over a weekend (about 50 hours) on a local CPU, for about 19,000 training steps. We consider a building correctly predicted when the IOU between the predicted and labeled bounding boxes is higher than 0.5. By that measure, the model correctly predicted 191 of the 227 buildings in the test dataset (84%).
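For readers who haven’t worked with IOU before, here is a minimal sketch of the metric and the 0.5 threshold for axis-aligned boxes. It is a simplified illustration, not the code in tf_iou.py: the (xmin, ymin, xmax, ymax) box layout and the function names are assumptions, and the matching rule lets one prediction count for more than one labeled building.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_correct(truth_boxes, predicted_boxes, threshold=0.5):
    """Count labeled boxes that some prediction overlaps with IOU > threshold."""
    return sum(
        1
        for truth in truth_boxes
        if any(iou(truth, pred) > threshold for pred in predicted_boxes)
    )


# The share reported above: 191 correct out of 227 buildings.
accuracy_percent = round(191 / 227 * 100)
```

A stricter evaluation would match each prediction to at most one labeled building, which can only lower the count.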
Our Ayacucho Data Team, together with 372 OpenStreetMap mappers, contributed heavily to mapping buildings in this area after a powerful earthquake jolted central Mexico City in fall 2017. Because the buildings here are well labeled, we were able to get more accurate predictions.
If you want to run this example, don’t forget to check out our detailed walkthrough of TensorFlow Object Detection. Let me know (@geonanayi) how it works and if you’d like to see any new features or framework integrations in Label Maker.