Demystifying machine learning

19th of December, 2018

Classical software is primarily based around the notion of decision trees. Any programmer who has coded an if-then-else statement can understand the basics of data science.

To demonstrate this, let’s take a domain that we’re all familiar with: music.

Predicting musical tastes

Imagine we want to build a music streaming service that gets better with time, continuously understanding our users and slowly building out song preferences based on their listening habits (personal preference) combined with the listening habits of everyone else using the service (market trends).

The first step towards gaining a deep understanding of our users is to capture their behaviour. We can achieve this using events, which are simply interesting things that have happened in the past, such as someone listening to a song.

In the music domain two events may be as follows:

(These two events already tell us a lot about someone.)
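To make the idea concrete, here is a minimal sketch of how two such events might be recorded. The class name, field names, and values are all invented for illustration; the only detail borrowed from this post is the user id used in the affinity profile further down.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ListeningEvent:
    """One thing that happened in the past (all field names are hypothetical)."""
    user_id: str
    event_type: str      # e.g. "song_played" or "song_skipped"
    song_id: str
    genre: str
    occurred_at: datetime


events = [
    ListeningEvent("user_19234", "song_played", "song_001", "rock",
                   datetime(2018, 12, 1, 9, 30, tzinfo=timezone.utc)),
    ListeningEvent("user_19234", "song_skipped", "song_002", "pop",
                   datetime(2018, 12, 1, 9, 34, tzinfo=timezone.utc)),
]

# Events are facts about the past, so we only ever append to the list;
# questions about behaviour are answered by reading the stream.
played = [e.song_id for e in events if e.event_type == "song_played"]
```

Marking the dataclass as frozen is a deliberate choice in this sketch: an event describes something that has already happened, so it should not change after it is recorded.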

With a stream of events we can begin to predict the future. In our case, the main question may be as follows: “out of all the songs in our database, which three are the most likely to resonate with our user and lead them to continue subscribing to our service?”

Or better yet, can our predictions become so valuable to users that they recommend the service to their friends? “You really have to try out this new site. I’ve discovered so much new music that I wouldn’t have heard otherwise!”

Intelligent computing with collaborative filtering

Based on this goal, we decide to create a “music affinity profile” for users. This allows our system to determine the next cohort of recommendations for a user based on personal listening preferences along with how closely recommended songs relate to their favourite songs. Affinity gives us a model to work with in order to provide solid recommendations for someone’s next sonic adventure. Below is a very simple example of an affinity profile of a music listener.

{
    "user_19234": {
        "genre": {
            "rock": {
                "affinity": 0.89,
                "subgenres": {
                    "classic": { ... },
                    "progressive": { ... }
                }
            }
        }
    }
}

There are several well-known approaches to building profiles like this, including user-based and item-based collaborative filtering, as well as hybrids of the two.
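As a rough illustration of the item-based flavour, the sketch below scores songs a user has not heard by how similar they are to songs the user already plays, using cosine similarity over play counts. The users, songs, and counts are made up, and a production system would likely use a dedicated library and far more data; this is only meant to show the shape of the idea.

```python
from math import sqrt

# Hypothetical play counts: user -> {song: number of plays}.
plays = {
    "alice": {"song_a": 5, "song_b": 3},
    "bob":   {"song_a": 4, "song_b": 4, "song_c": 5},
    "carol": {"song_b": 1, "song_c": 4, "song_d": 5},
}


def song_vector(song):
    """Represent a song by how often each user played it."""
    return {user: counts[song] for user, counts in plays.items() if song in counts}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user, top_n=3):
    """Rank unheard songs by similarity to the songs the user already plays."""
    heard = plays[user]
    all_songs = {s for counts in plays.values() for s in counts}
    scores = {}
    for candidate in all_songs - set(heard):
        cand_vec = song_vector(candidate)
        # Weight each similarity by how often the user played the known song.
        scores[candidate] = sum(
            cosine(cand_vec, song_vector(s)) * n for s, n in heard.items()
        )
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [song for song, _ in ranked[:top_n]]
```

Calling `recommend("alice")` ranks `song_c` ahead of `song_d`, because the people who play Alice's songs also play `song_c` heavily. Nobody wrote a rule saying so; the ranking falls out of the listening data.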

Now that we have an idea of what we want to accomplish from a business perspective — provide spot-on listening recommendations to users so they continue to subscribe to our service and recommend our service to friends — we need to decide on the implementation approach.

Classical computing with if-then-else statements

The easiest way to implement a recommendation engine is to code a classical decision tree in the form of if-then-else statements, likely baking our own preferences directly into the code.
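A hard-coded recommender of this kind might look something like the sketch below. The function name and the genre-to-artist mapping are assumptions chosen purely for illustration.

```python
def recommend_artist(favourite_genre: str) -> str:
    """Hard-coded decision tree: every branch encodes someone's opinion."""
    if favourite_genre == "rock":
        return "Pink Floyd"
    elif favourite_genre == "hip hop":
        return "Lil Pump"
    elif favourite_genre == "classical":
        return "Mozart"
    else:
        # The developer's own taste leaks in as the fallback.
        return "Napalm Death"
```

Every new genre, artist, or change in taste means editing and redeploying this function.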

Another option is to hire a “domain expert”, perhaps someone with deep music industry knowledge, to create a set of recommendation rules in a system like IBM Operational Decision Manager. This approach is similar to if-then-else statements but moves the logic out of our application code. It is common in all sorts of web applications, particularly online shopping, because it creates a boundary between application code and business rules, enabling domain experts to update decision logic without costly code changes.
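The sketch below shows the general idea of keeping rules as data rather than code. It is not how any particular rules engine product works; the rule format, names, and thresholds are invented for illustration.

```python
# Rules live in data (here a list; in practice perhaps a file or a database)
# so a domain expert can edit them without touching the application code.
# The rule format is hypothetical.
RULES = [
    {"when": {"genre": "rock", "min_affinity": 0.8}, "recommend": "progressive rock"},
    {"when": {"genre": "rock", "min_affinity": 0.0}, "recommend": "classic rock"},
]
DEFAULT = "top charts"


def evaluate(rules, genre, affinity):
    """Return the recommendation of the first rule whose conditions match."""
    for rule in rules:
        cond = rule["when"]
        if cond["genre"] == genre and affinity >= cond["min_affinity"]:
            return rule["recommend"]
    return DEFAULT
```

With the rules above, `evaluate(RULES, "rock", 0.89)` returns `"progressive rock"`. The application code only knows how to evaluate rules; what the rules say is still an expert's opinion, just stored somewhere easier to change.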

Hard-coded decision making is a low-cost, low-effort option that works well for UI and other low-level concerns, but not so well for sophisticated decisions like those a recommendation engine has to make.

The drawbacks of decision trees

Both of the above options will work, but they may not be ideal. There are four main issues with classical computing using if-then-else statements or rules engines, specifically when building recommendation systems:

  1. Either a developer or a domain expert has to hard-code a lot of their own preferences into the recommendation logic. When building recommendation systems using decision trees, the quality of recommendations is 100% dependent on the quality of your domain expert(s). Recommendations become based on intuition rather than on modelling and reasoning through machine learning. In other words, as a developer, if I enjoy listening to Napalm Death, you can be sure it will be recommended to users.
  2. Musical tastes change rapidly. Static rules are unlikely to be accurate in perpetuity and the quality of recommendations will degrade over time. Tekashi 6ix9ine may be hot today, but what about later this afternoon or tomorrow? We always want to recommend the right song at the right time. Tastes in mumble rap evolve quickly, whereas tastes in classical music change slowly. Mozart will always be relevant to someone, but 6ix9ine may eventually be completely irrelevant. Without predictive analytics, failing to account for changing tastes may lead to skewed business results, like your streaming service staying popular with Pink Floyd fans but losing favour with Lil Pump fans.
  3. We lose real-time global insights if we do not use events to train our recommendation models. A system such as a music streaming service is literally a data generating machine. Skips, pauses, replays, and searches are all extremely valuable events that we can leverage to learn about user behaviour and therefore make better predictions. Hard-coded business rules will not adapt to this type of real-time feedback; the best we can hope for from a static report is advice on how to perform a costly round of code changes or updates to our rules engine.
  4. Including every song ever recorded in a set of business rules is extremely impractical. The best possible outcome of such an implementation would be a radio station that recommends a subset of music, whether popular or niche, depending on the nature and branding of the streaming service (e.g., “Rock95” vs “HipHop97”).
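By contrast, an event-driven system can adjust a user's affinity every time something happens. The sketch below uses an exponential moving average so that recent behaviour gradually displaces older behaviour. The event names, signal values, and learning rate are assumptions for illustration; a real system would learn or tune them rather than fix them by hand.

```python
# Hypothetical signal strength for each event type (1.0 = strong interest).
EVENT_SIGNAL = {"replay": 1.0, "play": 0.8, "pause": 0.5, "skip": 0.0}


def update_affinity(current, event_type, learning_rate=0.1):
    """Exponential moving average: each new event nudges the affinity score."""
    signal = EVENT_SIGNAL[event_type]
    return (1 - learning_rate) * current + learning_rate * signal


# Start from the rock affinity in the example profile and apply three skips.
affinity = {"rock": 0.89}
for event_type in ["skip", "skip", "skip"]:
    affinity["rock"] = update_affinity(affinity["rock"], event_type)
```

After three skipped rock songs the score drifts from 0.89 down to roughly 0.65, with no code change or rules update required.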

Conclusion

To recap, we have three options for building our song recommendation service:

  1. Hard-coded logic (if-then-else statements)
  2. Business logic engine (sometimes called a “rules engine”)
  3. Machine learning (collaborative filtering or other learning algorithms)

Before embarking on a data science project, one of the first decisions to make is whether expert-driven decision making is a good match for your business needs, or whether you need to explore automated decision making with machine learning.

While predictive analytics and machine learning will generate better predictions over the long run, there’s a cost involved, so only embark on such a project if you can back it up with a business reason.

Many systems of tomorrow, such as computer vision and natural language processing, will rely completely on automated decision making, so learning the ins and outs early is a high-value skill that will pay dividends over the years to come.

I hope this is about the simplest explanation of machine learning that you’ll ever read.

Next

In a future blog post, I’ll describe the roles involved in a machine learning project, and why there are many opportunities for developers from all backgrounds. I hope to show that not every developer needs an advanced mathematics degree to be productive in this emerging and important discipline.

This work by Kevin Webber is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.