(This post was originally sent out to the Digital Circle Mailing List on 26th November but is included here for completeness).
Timely to have this in the week of AICon – the inaugural convention on artificial intelligence and machine learning in Northern Ireland. But it’s too late to get a ticket as it’s completely sold out!
When you hear the term “machine learning” or “artificial intelligence” what immediately comes to mind? Is it self-driving cars, robot overlords or the impending doom that an algorithm might, just might, be after your job eventually? The media want to give you the big stories, but the reality of machine learning is really simple.
Let’s bring things back to basics. The definition of machine learning is this:
Machine learning is a branch of artificial intelligence. Using computing, we design systems that can learn from data in a manner of being trained. The systems might learn and improve with experience and, with time, refine a model that can be used to predict outcomes of questions based on the previous learning.
There are two approaches that can be used to gain that learning. The first is supervised learning, where you define what the data is up front. For example, if I have a big list of text from Twitter that I want to classify into categories, I’d take the raw Twitter data:
Really loving the album Healing Is Difficult by Sia!
#fashion I'm selling my Louboutins! Who's interested? #louboutins
I've got my Kafka cluster working on a load of data. #data
Before any machine learning can take place, I would have to label the data with its categories. The tweets above would then look like this:
MUSIC: Really loving the album Healing Is Difficult by Sia!
SHOES: #fashion I'm selling my Louboutins! Who's interested? #louboutins
BIGDATA: I've got my Kafka cluster working on a load of data. #data
Once the training is done and I feed in new tweets, the model should tell me which category each tweet belongs to, based on my previous labelling. Alternatively, unsupervised learning lets the machine learning algorithm learn from the data without any hints; it figures out patterns during its training.
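To make the supervised idea concrete, here is a minimal sketch using the three labelled tweets above. It is a deliberately naive word-overlap classifier written for illustration, not a production technique, and the function names (tokenize, train, classify) and the test sentence are my own assumptions. A real project would use far more data and a proper library.

```python
from collections import Counter, defaultdict
import re

# Labelled training data: the three example tweets from above.
TRAINING = [
    ("MUSIC", "Really loving the album Healing Is Difficult by Sia!"),
    ("SHOES", "#fashion I'm selling my Louboutins! Who's interested? #louboutins"),
    ("BIGDATA", "I've got my Kafka cluster working on a load of data. #data"),
]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def train(examples):
    """Count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for label, text in examples:
        counts[label].update(tokenize(text))
    return counts

def classify(model, text):
    """Pick the label whose training words overlap most with the new text."""
    tokens = tokenize(text)
    scores = {label: sum(words[t] for t in tokens) for label, words in model.items()}
    return max(scores, key=scores.get)

model = train(TRAINING)
# A new, unlabelled tweet (made up for this example).
prediction = classify(model, "Kafka is handling all of my streaming data")
print(prediction)
```

With only three training examples the model knows very little, which is exactly why the labelling effort matters so much in supervised learning.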
Do I Need Machine Learning?
The simple answer here is, it depends.
If you want to predict a future transaction, or classify a customer into a segment based on the links they clicked, then yes, machine learning is going to help you. Not every company needs machine learning, and don’t let anyone tell you that you do. Some of these techniques take time and money to set up and don’t always deliver the accuracy we hope for. While the large cloud providers offer services to help you, they still take time to learn and money to run. Always run a discovery phase during a machine learning exploration and set the timescale and budget tightly: a few days to explore the possibilities.
Start with the most basic question: “What is the question I’m trying to answer here?” This is where you talk about what you are trying to achieve. I’ve seen more projects fail by not answering this one question than for any other reason. Throwing data at a problem without knowing what the problem is? That’s just plain foolish, and surprisingly common.
A Basic Project
Let’s work through a basic scenario. “Can I predict who is going to buy this product?” is a good example of a question to answer. Once the question is set, the rest is putting the pieces of the puzzle together.
The project process for a machine learning system is pretty simple:
- Collate the data.
- Clean the data and check its quality.
- Run machine learning tools.
- Present the results.
Collate the Data
The next stage is to confirm you have data; in this case it would be transaction data. If it were retail, it may be the data taken from a point of sale system or e-commerce store and cross referenced with data taken from a customer relationship management (CRM) system.
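As a rough sketch of what that cross-referencing might look like, the snippet below joins some made-up point of sale transactions to made-up CRM records. The field names (customer_id, product, amount, segment) and the collate function are assumptions for illustration only; your own systems will have their own layouts.

```python
# Hypothetical point of sale transactions and CRM records. All field
# names and values here are invented for the example.
transactions = [
    {"customer_id": 1, "product": "album", "amount": 9.99},
    {"customer_id": 2, "product": "shoes", "amount": 450.00},
    {"customer_id": 3, "product": "album", "amount": 9.99},
]
crm = {
    1: {"name": "Alice", "segment": "music"},
    2: {"name": "Bob", "segment": "fashion"},
}

def collate(transactions, crm):
    """Attach CRM details to each transaction; keep unmatched ones separately."""
    matched, unmatched = [], []
    for tx in transactions:
        customer = crm.get(tx["customer_id"])
        if customer is None:
            unmatched.append(tx)   # no CRM record: a data quality issue to chase
        else:
            matched.append({**tx, **customer})
    return matched, unmatched

matched, unmatched = collate(transactions, crm)
print(len(matched), "matched,", len(unmatched), "unmatched")
```

Keeping the unmatched rows visible, rather than silently dropping them, gives you an early feel for how clean the data really is.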
The data may need some cleaning and preparation before it can be used for any type of machine learning training. Data cleaning is usually the most time-consuming step and can chew up a large amount of the budget. At this point it’s a good idea to inspect the data you are going to use for training: is it balanced, and does it fairly represent the question you are trying to answer? If the data leans too much towards one area then you’ll get biased results.
Don’t use everything you have. Take random samples of the data for training and see how the model looks once it’s been trained. Machine learning is an iterative process, not a one-off job, and it takes time to develop.
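A quick way to check for balance is to count how often each outcome appears. The sketch below assumes a made-up list of “did they buy?” labels, and the 20% cut-off is an arbitrary number I picked for the example, not a rule.

```python
from collections import Counter

def label_balance(labels):
    """Return each label's share of the data as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Hypothetical outcome labels for 10 customers: did they buy the product?
labels = ["no"] * 9 + ["yes"]
shares = label_balance(labels)
print(shares)

# The 0.2 threshold is an arbitrary assumption for illustration.
underrepresented = [label for label, share in shares.items() if share < 0.2]
print("Under-represented labels:", underrepresented)
```

If one outcome makes up nearly all of the data, a model can look accurate simply by always predicting that outcome, which tells you nothing useful.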
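Here is one way that random sampling could be sketched with nothing more than the standard library. The split_sample function, the 80/20 split and the seed value are all assumptions for illustration; the list of numbers simply stands in for real transaction records.

```python
import random

def split_sample(rows, sample_size, train_fraction=0.8, seed=42):
    """Draw a random sample, then split it into training and evaluation rows."""
    rng = random.Random(seed)            # fixed seed so a run can be repeated
    sample = rng.sample(rows, sample_size)
    cut = int(len(sample) * train_fraction)
    return sample[:cut], sample[cut:]

rows = list(range(1000))                 # stand-in for 1,000 transaction records
train_rows, eval_rows = split_sample(rows, sample_size=200)
print(len(train_rows), len(eval_rows))
```

Changing the seed gives you a different random sample, so it is easy to train on several samples and compare how the model behaves on each.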
Run Machine Learning Tools
There are loads of tools out there. Some are in the cloud and some are development tools that you’ll need a good programmer for. Not everyone knows what they are doing, so ask questions up front and, if you are bringing someone in, look at previous work they have done for other companies. Ultimately, anyone can run a machine learning model. The question is whether they can do it and add value to the business. If they can’t, move on to someone who can.
Present the Results
Machine learning models will usually give you some form of accuracy score. Training happens with a certain percentage of the data and then the remainder is used for evaluation. These accuracy results are important. If it’s 100% accurate then there’s a good chance that’s not to be believed: the model has probably fitted the training data too closely rather than learning a pattern that generalises, which is known as overfitting. From there the model needs to be implemented in such a way that it can be used in your day-to-day working.
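The accuracy score itself is simple arithmetic: the number of correct predictions divided by the total. The sketch below uses invented predictions and outcomes for eight held-back customers, so the figures mean nothing beyond showing the calculation.

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that match the known outcomes."""
    if not actuals or len(predictions) != len(actuals):
        raise ValueError("need two non-empty lists of the same length")
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)

# Hypothetical results on evaluation data the model did not see in training.
actuals     = ["yes", "no", "no",  "yes", "no", "no", "yes", "no"]
predictions = ["yes", "no", "yes", "yes", "no", "no", "no",  "no"]

score = accuracy(predictions, actuals)
print(f"Accuracy: {score:.0%}")

if score == 1.0:
    # A perfect score deserves suspicion rather than celebration.
    print("Perfect score: check for overfitting before trusting it.")
```

The important part is that the score is measured on data held back from training. Scoring a model on the same data it learned from will flatter it every time.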
Using Someone Else’s Model?
For some things it makes no sense to go training an algorithm from scratch. Images are a good example of this: the training for even a small image set can take over a week with something like a convolutional neural network. It’s possible to use existing models instead. The technique is called transfer learning, and it means you can save a lot of time and money by using the work of someone else and then extracting the features or slightly modifying the algorithm in its final stages. It doesn’t work for everything, but for things like images and video it could pay dividends in the long run.
The Creepy Line….
With machine learning comes great responsibility. Some algorithms, like neural networks, are essentially black boxes, and it’s often very hard to explain a prediction they have made. That’s fine when you get started, but it’s a PR nightmare if your black box makes a bad prediction and the customer goes public about it.
Introducing machine learning into an organisation is not just a technical matter. It brings in all departments, as some predictions have the potential to affect the day-to-day operations of the company. The key points to evaluate are that you actually have a business case to answer, that you have data of decent quality to run the training on, and that you are in a position to support the predictions in a wider business sense.
By Jase Bell
Jase is a Data Engineer, specialising in high volume data streams and BigData. He is also the author of Machine Learning: Hands On for Developers and Technical Professionals; the 2nd edition is in production for early 2020. A well regarded conference speaker, Jase is also on the programme committee for O’Reilly’s Strata Data Conference for San Jose and London.