Feature Importance
What is the above image? It is a graph showing which features in the final model from my capstone project were most important in predicting whether a census tract is a food desert, meaning a census tract with low access to fresh, healthy food. In this post, I will focus on the two most important features.
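I have not described here how the importances in the graph were computed, so the following is only a rough sketch of one common, model-agnostic way to get this kind of ranking, called permutation importance: shuffle one feature's values and measure how much the model's accuracy drops. The data, the `Noise` feature, and the `toy_model` rule are all made up for illustration; only the names TractHUNV and TractKids come from the project.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows the model classifies correctly."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, features, n_repeats=20, seed=0):
    """Mean drop in accuracy when a single feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = {}
    for name in features:
        drops = []
        for _ in range(n_repeats):
            # Shuffle only this feature, leaving the others untouched.
            column = [r[name] for r in rows]
            rng.shuffle(column)
            shuffled = [{**r, name: v} for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances[name] = sum(drops) / n_repeats
    return importances


# Hypothetical toy data, NOT the capstone data set.
data_rng = random.Random(42)
rows = [
    {
        "TractHUNV": data_rng.randint(0, 500),
        "TractKids": data_rng.randint(0, 2000),
        "Noise": data_rng.random(),  # deliberately irrelevant feature
    }
    for _ in range(200)
]


def toy_model(row):
    """Stand-in for a trained classifier (an assumption, not the real model)."""
    return int(row["TractHUNV"] > 250 or row["TractKids"] > 1800)


# Labels follow the same made-up rule, so the toy model is perfectly accurate.
labels = [toy_model(r) for r in rows]

importances = permutation_importance(
    toy_model, rows, labels, ["TractHUNV", "TractKids", "Noise"]
)
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

A feature the model never uses, like `Noise` here, gets an importance of zero, while features the model leans on heavily show a large drop in accuracy when shuffled.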
Low Vehicle Access
The TractHUNV feature represents the number of housing units in a census tract without access to a vehicle. Interestingly, this tracked with an observation I made during exploratory data analysis, before I started modeling:
As you can see here, having low vehicle access in an urban environment increases your chances of living in a food desert compared with living in an urban environment alone.
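A comparison like this boils down to two conditional rates: the share of food deserts among all urban tracts, and the share among urban tracts that also have low vehicle access. Here is a minimal sketch of that calculation. The column names (`urban`, `low_vehicle_access`, `food_desert`) and the handful of rows are hypothetical and chosen only to show the mechanics; they are not the project's data or results.

```python
def food_desert_rate(tracts, **conditions):
    """Share of tracts matching every condition that are flagged as food deserts."""
    subset = [t for t in tracts if all(t[k] == v for k, v in conditions.items())]
    if not subset:
        return None  # no tracts match, so the rate is undefined
    return sum(t["food_desert"] for t in subset) / len(subset)


# Made-up rows for illustration only.
tracts = [
    {"urban": 1, "low_vehicle_access": 1, "food_desert": 1},
    {"urban": 1, "low_vehicle_access": 1, "food_desert": 1},
    {"urban": 1, "low_vehicle_access": 1, "food_desert": 0},
    {"urban": 1, "low_vehicle_access": 0, "food_desert": 1},
    {"urban": 1, "low_vehicle_access": 0, "food_desert": 0},
    {"urban": 1, "low_vehicle_access": 0, "food_desert": 0},
    {"urban": 0, "low_vehicle_access": 0, "food_desert": 0},
    {"urban": 0, "low_vehicle_access": 1, "food_desert": 1},
]

urban_rate = food_desert_rate(tracts, urban=1)
urban_low_vehicle_rate = food_desert_rate(tracts, urban=1, low_vehicle_access=1)

print(f"Urban: {urban_rate:.2f}")
print(f"Urban + low vehicle access: {urban_low_vehicle_rate:.2f}")
```

If the second rate is noticeably higher than the first on the real data, that matches the pattern described above.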
Why would this be? My initial theory was that public transportation in the United States is inadequate enough that having access to a vehicle, even in an urban environment, significantly improves your ability to obtain healthy food.
This theory was further strengthened when TractHUNV turned out to be the most important feature in my final model. If I were making recommendations to the federal government, I would strongly urge it to invest in expanding access to public transportation, especially in urban environments.
Youth Population
The second most important feature in my final model was TractKids, which represents the number of people in a census tract under the age of 18. This suggests that census tracts with a large number of youths were more likely to have low access to healthy food.
My theory is that grocery store corporations have concluded that areas with high youth populations are less profitable and are therefore less likely to build grocery stores there.
To counter this, the federal government could offer these corporations partial subsidies to build grocery stores in these types of food deserts.
Conclusion
As you can see, feature importance is an excellent way to gain insight into your model by showing which features were most useful in making predictions.