In this post, I take a few statistical/ML models and examine whether they learn an interaction effect, using a toy example in which the data provide no evidence either way about whether an interaction exists.
I found this a nice way to understand the algorithms better, and I also use it to make a case for algorithms that explicitly capture uncertainty.