Learning ML 1

2023/06/06

I’ve decided to get a lot better at ML, and I enjoyed my Rust learning log so much that I’m going to do something similar for this. However, I think there will be a lot of differences between learning ML and learning Rust. Rust is an easier learning environment in some important ways: experiments are fast and cheap to run, feedback is immediate, and the whole system is causal enough that you can reason your way from an error to its fix.

I expect that learning ML will be a lot like learning to be a mason or herbalist used to be: lots of arcane tricks without much grounding in a causal system, with expensive experimentation getting in the way of rapid improvement. Learning masonry or herbalism(?) on your own is not the move: you need to apprentice yourself to someone and watch how they do it. Absent that, here are some of my ideas:

Online Courses

I’ve taken online ML courses before and consider them more or less worthless, as they focus way more on theory than on practice. If we continue the masonry metaphor, it appears to me that online ML courses consider it extremely important that all masons should have a deep background in geology. Sure, rocks are important to masonry, and knowing which ones to use is probably really helpful. But at some point those rocks have to be assembled into a building, and I’m not convinced that most college professors even know how to do that. It would be very fair to point out that sooner or later, masonry gets revolutionized by stronger theoretical understanding (like trigonometry), but I’m not trying to invest time in that at the moment.

I’d be delighted to be wrong here: a good online course would be super convenient, and I’ll keep trying to find one.

Reading Kaggle

Kaggle seems like a great resource, though it maybe (understandably) underemphasizes robotics applications. I expect one of the best ways to improve will be to compare the top three submissions in a contest against entries around the 50th percentile, to get an idea of what the best people in the world do differently from the middle of the pack (I assume the bottom 50% aren’t really trying, which isn’t a very interesting way to fail).
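As a sketch of what that comparison might look like in practice: this assumes a leaderboard already downloaded as a CSV (I believe the kaggle CLI can do this with `kaggle competitions leaderboard -c <name> --download`), and it assumes a “Score” column where higher is better, which is a guess about the file’s layout rather than a Kaggle fact.

```python
# Sketch: measure how far the very best entries sit above the middle
# of the pack. The filename and the "Score" column are assumptions.
import pandas as pd

lb = pd.read_csv("leaderboard.csv")  # hypothetical local file
scores = lb["Score"].sort_values(ascending=False).reset_index(drop=True)

top3_mean = scores.head(3).mean()  # the best of the best
median = scores.median()           # roughly the 50th percentile
print(f"top-3 mean score: {top3_mean:.4f}")
print(f"median score:     {median:.4f}")
print(f"gap:              {top3_mean - median:.4f}")
```

The size of that gap, tracked across a few contests, would at least tell me whether the top entries are doing something qualitatively different or just grinding out marginal improvements.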

Small experiments (maybe competing in Kaggle?)

My computer is a 2016 Dell laptop, and I absolutely love it. I plan to get another one just like it if it ever breaks for good, but it’s not exactly an ML powerhouse, so I can’t train big neural nets unless I pay for time on a cluster, which I currently can’t afford. But that doesn’t stop me from training small neural nets. I’m not sure how well I should expect this to work, though. If the last year or two in ML has made anything clear, it’s that large NNs have fundamentally different capabilities from small ones. Shouldn’t we expect that they need different approaches, too?
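To give a sense of the scale I mean, here’s a minimal sketch of a laptop-sized experiment: a tiny MLP trained on synthetic data, CPU only. The dataset, architecture, and hyperparameters are all placeholders I made up for illustration, not a recipe.

```python
# A laptop-scale experiment: tiny MLP, synthetic data, CPU only.
# Every number here (sizes, learning rate, epochs) is an illustrative guess.
import torch
from torch import nn

torch.manual_seed(0)

# 1,000 points, 20 features; the label depends nonlinearly on the first two.
X = torch.randn(1000, 20)
y = (X[:, 0] * X[:, 1] > 0).float().unsqueeze(1)

model = nn.Sequential(
    nn.Linear(20, 32),
    nn.ReLU(),
    nn.Linear(32, 1),  # outputs a single logit
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for epoch in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

with torch.no_grad():
    acc = ((model(X) > 0).float() == y).float().mean()
print(f"final loss {loss.item():.3f}, train accuracy {acc.item():.2f}")
```

Something like this runs in seconds on an old CPU, which is exactly the kind of tight experiment loop I want; the open question is how much of what I learn at this scale transfers upward.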

Reading Reddit

There seem to be two relevant Reddit communities if you want to learn ML: r/learnmachinelearning and r/MachineLearning. I don’t expect that reading Reddit alone would be a good way to learn, but maybe it will be a good way to track the field and see what other people are doing.