My journey from the Appalachian Trail to Metis to DigitalGlobe

Alan Schoen, DigitalGlobe Data Scientist
Jun 23 2017

My name is Alan, and I’m a data scientist at DigitalGlobe. I got here by way of a twisting trail that led me over mountains, to a city, and finally to the most intense experience of my life—12 weeks at a coding bootcamp.

This story starts in Montreal in early 2015. After completing graduate school and working abroad for 10 years, I decided it was time to move back home to the U.S. Instead of jumping right into a job search, I first crossed a big item off my to-do list: hike the Appalachian Trail. Commonly referred to as “the AT,” the trail runs almost 2,200 miles over the tallest hills and mountains from Springer Mountain, Georgia to Mount Katahdin, Maine. I hiked for six months to complete the AT in what is known as the “flip-flop” pattern: I started in the middle, hiked to the northern terminus in Maine, then got a ride back to the middle and hiked south to finish the trail at the southern terminus in Georgia.

Getting my hair cut in Lebanon, New Hampshire after hiking for months.

Things can get pretty uncomfortable on the trail. My gear got soaked by rain. I hiked through scorching hot days, only to be uncomfortably cold at night. And I learned how to deal with all kinds of insects and lots of bears. But I saw many places I would not have been able to see any other way, and I met great people in the big community of hikers and people who help hikers.

Here I am at the southern terminus of the AT on Springer Mountain, Georgia, shortly after I finished the entire trail.

At the end of my AT journey, I moved to Washington, D.C. to look for a job. That trail was tough for different reasons. I knew a six-month gap in my work history would make finding a job hard, but I underestimated how challenging it would be. Because my professional connections were in Canada and Germany, my network was not particularly helpful. After applying to about 300 data scientist jobs in D.C. and New York, I concluded that I lacked the necessary experience with modern programming languages. I had previously worked with research and proofs of concept, but not with production-level solutions.

To become an attractive job candidate, I needed new skills. “Coding bootcamps,” while expensive, cover a lot of material quickly and typically have good graduate job placement. I researched different curriculums, online reviews and graduate project presentations. I chose Metis because it offered the training I wanted, in particular the Python programming language. I knew Python had a lot of the same capabilities as MATLAB (which I had previously used extensively), but it was also free, flexible and open source, which means many tools are available to help you use Python for nearly every purpose. That is a huge advantage.

The 12 weeks I spent at Metis were intense. Students complete five projects requiring the implementation of a data science product—from gathering and processing data, to building a model and presenting results. The bootcamp also provided career counseling, including resume and LinkedIn reviews, interview preparation and connections with employers. The program concludes with a three-week “passion project” where students choose their own topic and create a data product using the skills they developed during the course. Each student prepares a five-minute presentation to be given on Metis Career Day, an event where employers who are interested in hiring Metis graduates gather to watch.

I got interested in geospatial data on my AT hike. I learned firsthand that you miss a lot of information when you average data over large spatial areas. My interest in geospatial information led me to Adam Estrada (now my boss at DigitalGlobe), whom I met at the Metis Career Day. About two weeks after we met, I got an offer to be a data scientist at DigitalGlobe, and I accepted it. Now I work on neural network models for our cloud platform, and I also help with projects like an immersive virtual reality experience with machine learning. At Metis I was exposed to a broad range of data science topics that would have taken years to learn on my own. That training has been a huge help in my job.

I am so proud that DigitalGlobe and Metis have teamed up to create more awareness about geospatial information in the data scientist community with the Metis/DigitalGlobe Data Challenge. DigitalGlobe provides satellite imagery, data and some technical help to Metis students for their passion projects. The students design their own projects to get information from satellite images using deep learning. DigitalGlobe’s constellation of satellites constantly produces new imagery, so we can receive a lot of valuable, timely information by analyzing each new image as it comes in. To extract that information, we train neural networks to interpret satellite images and turn them into geospatial data.

The Metis/DigitalGlobe Data Challenge is designed to help students come up with new projects. Metis students do not typically have the resources to gather their own ground truth data, so we are using readily available public data. I created a project portfolio of resources to help the students, which includes:

  • A primer on training neural networks on satellite data
  • Satellite images provided by SpaceNet, a public collection of DigitalGlobe satellite images published by CosmiQ Works, DigitalGlobe and NVIDIA
  • OpenStreetMap, an editable open-source map of the world

There are a lot of unique challenges when working with geospatial data. My role is to provide students with the extra help necessary to complete their projects in the three-week timeframe. The first students to participate in the Metis/DigitalGlobe Data Challenge completed their presentations yesterday. They set out to train a segmentation neural net to recognize different kinds of terrain and land use. Below are some visualizations of their work.

If you’d like to learn more about the technical aspects of the Metis/DigitalGlobe Data Challenge or if you want to try training neural networks on satellite imagery, check out my post on the DeepCore blog.

Metis/DigitalGlobe Data Challenge Visualization Results:

Above is a training example that Chris Franzini used to train his water-detection model. On the left is the original satellite image. The middle shows the ground truth superimposed on the satellite image. The image on the right is the raw ground truth image used as a target in training.
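For readers who want to build a similar three-panel view, here is a minimal sketch of the middle panel: blending a binary ground truth mask onto an RGB image with NumPy. This is an illustration only, not Chris's actual code. The function name `overlay_mask`, the tint color, and the tiny 4×4 arrays are all assumptions made up for the example.

```python
import numpy as np

def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.5):
    """Blend a solid color onto the pixels of an RGB image selected by a mask.

    image: (H, W, 3) uint8 array (hypothetical satellite tile)
    mask:  (H, W) boolean array, True where the ground truth class is present
    """
    image = np.asarray(image, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    tint = np.array(color, dtype=np.float64)

    out = image.copy()
    # Only masked pixels are blended; everything else keeps its original value.
    out[mask] = (1.0 - alpha) * image[mask] + alpha * tint
    return out.round().astype(np.uint8)

# Toy stand-ins for a real tile and its water mask (assumed data).
image = np.full((4, 4, 3), 100, dtype=np.uint8)   # uniform gray "satellite image"
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                              # a 2x2 patch of "water"

overlay = overlay_mask(image, mask, color=(0, 0, 200), alpha=0.5)
```

The raw mask on its own (the right-hand panel) is what a segmentation model would be trained to predict; the overlay is just a convenient way for a person to check that the mask lines up with the image.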

Chris also created this video of his neural network learning over time. As training progresses, the model learns to recognize which parts of the image contain water.

Victoria Aston trained a neural network to find solar panels on top of buildings. To increase the model accuracy, she first built a model to detect buildings. There are many public sources of information about buildings, but not solar panels. She searched through images and labeled solar panels to create training examples like this:

Below is an example where her model searched a satellite image for solar panels. Solar panels are hard to find because they are not very visually distinct, but Victoria’s model successfully identifies them. It still produces some false detections, but that should improve as the model learns more.
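One common way to put a number on "some false detections" is to compare a predicted mask against a labeled mask pixel by pixel and compute precision and recall. The sketch below is a generic illustration, not the evaluation Victoria used. The function name `detection_scores` and the small arrays are hypothetical.

```python
import numpy as np

def detection_scores(predicted, truth):
    """Count pixel-level hits and misses between two binary masks."""
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = int(np.sum(predicted & truth))    # panel pixels the model found
    fp = int(np.sum(predicted & ~truth))   # false detections
    fn = int(np.sum(~predicted & truth))   # panel pixels the model missed

    # Guard against division by zero when a mask is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}

# Made-up masks: 1 marks a solar-panel pixel.
truth = np.array([[1, 1, 0, 0],
                  [0, 0, 0, 0]])
predicted = np.array([[1, 0, 1, 0],
                      [0, 0, 0, 1]])

scores = detection_scores(predicted, truth)
```

Low precision means many false detections, and low recall means many missed panels. Tracking both as training continues shows whether the false detections are actually going down.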

Matt Maresca trained a neural network to classify several different types of land, including farms. This example shows an instance where the model correctly identified farmland.

Want to learn more about the Metis/DigitalGlobe Data Challenge? It might be the start of an amazing journey of your own. The walking 2,000 miles part is completely optional.