Demystifying machine learning

19th of December, 2018

Classical software is primarily based around the notion of decision trees. Any programmer who has coded an if-else-then statement can understand the basics of data science.

To demonstrate this, let’s take a domain that we’re all familiar with: music.

Predicting musical tastes

Imagine we want to build a music streaming service that gets better with time, continuously understanding our users and slowly building out song preferences based on their listening habits (personal preference) combined with the listening habits of everyone else using the service (market trends).

The first step towards gaining a deep understanding of our users is to capture their behaviour. We can achieve this using events, which are simply interesting things that have happened in the past, such as someone listening to a song.

In the music domain two events may be as follows:

(These two events already tell us a lot about someone.)
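As a rough illustration, an event can be modelled as a small immutable record. Everything in this sketch is hypothetical: the `ListeningEvent` class, its field names, and the sample values are assumptions made up for the example, not data from a real service.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ListeningEvent:
    """One thing that happened in the past (all field names are hypothetical)."""
    user_id: str
    song_id: str
    action: str  # e.g. "played", "skipped", "replayed"
    timestamp: datetime


# Two made-up events for a single user.
events = [
    ListeningEvent("user-1", "song-a", "played",
                   datetime(2018, 12, 19, 9, 0, tzinfo=timezone.utc)),
    ListeningEvent("user-1", "song-b", "skipped",
                   datetime(2018, 12, 19, 9, 4, tzinfo=timezone.utc)),
]

# Events are facts about the past, so we only ever read or append them.
played = [event.song_id for event in events if event.action == "played"]
```

The record is frozen because an event describes something that has already happened; new behaviour produces new events rather than edits to old ones.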

With a stream of events we can begin to predict the future. In our case, the main question may be as follows: “out of all the songs in our database, which three songs are the most likely to resonate with our user and lead them to continue subscribing to our service?”

Or better yet, can our predictions become so valuable to users that they recommend the service to their friends? “You really have to try out this new site. I’ve discovered so much new music that I wouldn’t have heard otherwise!”

Based on this goal, we decide to create a “music affinity profile” for users. This allows our system to determine the next cohort of recommendations for a user based on personal listening preferences along with how closely recommended songs relate to their favourite songs. Affinity gives us a model to work with in order to provide solid recommendations for someone’s next sonic adventure.
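One way to picture an affinity profile is as a score per genre, built up from a user's events. The sketch below is only an assumption about how such a profile could look: the `ACTION_WEIGHTS` values, the `(genre, action)` event shape, and the `build_affinity_profile` function are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each action signals like or dislike.
ACTION_WEIGHTS = {"played": 1.0, "replayed": 2.0, "skipped": -1.0}


def build_affinity_profile(events):
    """Sum action weights per genre, then scale by the largest absolute score.

    `events` is an iterable of (genre, action) pairs.
    """
    scores = defaultdict(float)
    for genre, action in events:
        # Unknown actions contribute nothing.
        scores[genre] += ACTION_WEIGHTS.get(action, 0.0)

    largest = max((abs(score) for score in scores.values()), default=0.0)
    if largest == 0.0:
        return dict(scores)
    return {genre: score / largest for genre, score in scores.items()}


profile = build_affinity_profile([
    ("classic rock", "played"),
    ("classic rock", "replayed"),
    ("mumble rap", "skipped"),
])
```

Scaling by the largest absolute score keeps every value between -1 and 1, which makes profiles from heavy and light listeners comparable.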

There are a variety of well-known ways to approach creating profiles like this, including user-based and item-based collaborative filtering (and hybrids of the two).
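To make item-based collaborative filtering a little more concrete, here is a minimal sketch in plain Python. The play counts, user names, and function names are all made up, and cosine similarity is just one common choice of similarity measure; a production system would look quite different.

```python
from math import sqrt

# Made-up play counts: user -> {song: number of plays}.
plays = {
    "alice": {"song-a": 5, "song-b": 3},
    "bob": {"song-a": 4, "song-b": 2, "song-c": 1},
    "carol": {"song-c": 5, "song-d": 4},
}


def item_vector(song):
    """The song as a sparse vector of play counts, keyed by user."""
    return {user: counts[song] for user, counts in plays.items() if song in counts}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(user, top_n=3):
    """Rank unheard songs by their similarity to songs the user already plays."""
    heard = plays[user]
    all_songs = {song for counts in plays.values() for song in counts}
    scores = {}
    for candidate in all_songs - set(heard):
        scores[candidate] = sum(
            cosine(item_vector(candidate), item_vector(song)) * count
            for song, count in heard.items()
        )
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda song: (-scores[song], song))[:top_n]
```

Here `recommend("alice")` ranks `song-c` ahead of `song-d`, because Bob plays `song-c` alongside the same songs Alice likes, while nobody who shares Alice's songs has played `song-d`.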

Now that we have an idea of what we want to accomplish from a business perspective — provide spot-on listening recommendations to users so they continue to subscribe to our service and recommend our service to friends — we need to decide on the implementation approach.

The easiest way to implement a recommendation engine is by coding a classical decision tree in the form of if-else-then statements, likely baking our own preferences directly into the code.
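A hard-coded recommender of this kind might look something like the sketch below. The function name and the genre strings are hypothetical, and the picks simply reflect whatever the developer happened to type in.

```python
def recommend_hard_coded(favourite_genre):
    """A classical decision tree: the developer's own taste baked into code."""
    if favourite_genre == "classic rock":
        return ["Pink Floyd"]
    elif favourite_genre == "mumble rap":
        return ["Lil Pump", "Tekashi 6ix9ine"]
    else:
        # Every genre nobody thought to hard-code gets nothing.
        return []
```

Changing a recommendation here means changing, testing, and redeploying the application.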

Another option is to hire a “domain expert” — perhaps someone with deep music industry knowledge — to create a set of recommendation rules in a system like Operational Decision Maker by IBM. This approach is similar to if-else-then statements but removes logic from our application code. This is a common approach in all sorts of web applications, particularly online shopping. It helps create a boundary between application code and business rules, enabling domain experts to update decision logic without costly code changes.
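The general idea of separating rules from application code can be sketched in a few lines. To be clear, this is not how IBM's product works and it does not use its API; it is a plain Python illustration in which the rule format (`if_genre`, `recommend`) and the function names are invented.

```python
import json

# Hypothetical rules a domain expert could maintain outside the codebase,
# for example in a JSON file or a database table.
RULES_JSON = """
[
  {"if_genre": "classic rock", "recommend": ["Pink Floyd"]},
  {"if_genre": "mumble rap", "recommend": ["Lil Pump"]}
]
"""


def load_rules(raw):
    """Parse the expert-maintained rules into a genre -> artists lookup."""
    return {rule["if_genre"]: rule["recommend"] for rule in json.loads(raw)}


def recommend_from_rules(rules, favourite_genre):
    """Application code only evaluates rules; it holds no music knowledge."""
    return rules.get(favourite_genre, [])


rules = load_rules(RULES_JSON)
```

The decision logic is the same as an if-else-then chain, but updating it means editing data rather than code.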

Hard-coded decision making is a low-cost, low-effort option that works well for UI and other low-level concerns, but not so well for sophisticated decisions like those a recommendation engine has to make.

Both of the above options will work, but they may not be ideal. There are four main issues:

  1. Either a developer or a domain expert has to impart a lot of their own judgement on the listening habits of others. The quality of predictions will be 100% dependent on the quality of your domain expert(s). Recommendations are based on intuition rather than mathematical reasoning.
  2. Musical tastes change rapidly. Static rules are unlikely to be accurate in perpetuity and the quality will degrade over time. Tekashi 6ix9ine may be hot today, but what about later this afternoon or tomorrow? We always want to recommend the hottest mumble rapper of the moment to anyone with a strong affinity towards mumble rap. However, tastes in classic rock may change a little more slowly, so predictions stay relevant for longer. This may lead to skewed business results like your streaming service staying popular with Pink Floyd fans but losing favour with Lil Pump fans.
  3. We lose real-time insights. A system such as a music streaming service is literally a data generating machine. Skips, pauses, replays, and searches are all extremely valuable events that we can leverage to learn about user behaviour and therefore make better predictions. Hard-coded business rules will not adapt to this type of real-time feedback; the best we can hope for from a static report is advice on how to perform a costly round of code changes or updates to our rules engine. Totally inefficient.
  4. Including every song ever recorded in a set of business rules would be extremely impractical. The best possible outcome of such an implementation would be a radio station that recommends a subset of music, whether popular or niche, depending on the nature and branding of the streaming service (e.g., “Rock95” vs “HipHop97”).
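Points 2 and 3 hint at what a data-driven approach can do that static rules cannot: let recent behaviour count for more than old behaviour. One simple way to express that is an exponentially decaying score. This is only a sketch under assumptions; the `decayed_score` function and the 30-day half-life are made up for illustration.

```python
def decayed_score(event_ages_in_days, half_life_days=30.0):
    """Sum event weights, where each event loses half its weight per half-life."""
    return sum(0.5 ** (age / half_life_days) for age in event_ages_in_days)


# Three plays of the same artist: today, 30 days ago, and 60 days ago.
recent_interest = decayed_score([0, 30, 60])
```

A shorter half-life would suit fast-moving genres, while a longer one would suit genres where tastes shift slowly.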

Conclusion

To recap, we have three options for building our song recommendation service:

  1. Hard-coded logic
  2. Business logic engine
  3. Machine learning

Before embarking on a data science project, one of the first decisions to make is whether expert-driven decision making is a good match for your business needs, or whether you need to explore automated decision making with machine learning.

While predictive analytics and machine learning will generate better predictions over the long run, there’s a cost involved, so only embark on such a project if you can back it up with a business reason.

Many systems of tomorrow, such as computer vision and natural language processing, will rely completely on automated decision making, so learning the ins and outs early is a high-value skill that will pay dividends over the years to come.

I hope this is about the simplest explanation of machine learning that you’ll ever read.

Next

In a future blog post, I’ll describe the roles involved in a machine learning project, and why there are many opportunities for developers from all backgrounds. I hope to show that not every developer needs an advanced mathematics degree to be productive in this emerging and important discipline.