This year I attended the 2 days of conference at Devoxx UK, which took place in London on 10-11th May. This article is a summary of the notes I took during the second day. You can read the previous article about my first day. If you’re interested in more details about a talk, you can watch the associated video.
Deep Learning: The Future of Artificial Intelligence, with Matthew Renze
In the past, we had to explicitly program a computer step by step to get it to solve a problem (using if-then statements, for loops and other logical operations). In the future, machines are going to teach themselves how to solve a problem on their own; we just have to provide the data.
What is Deep Learning?
Deep Learning is a form of Artificial Intelligence (AI) that uses a type of Machine Learning (ML) called an Artificial Neural Network with multiple hidden layers, in an attempt to learn hierarchical representations of the underlying data in order to make predictions given new data.
Machine Learning in essence is the application of Statistics to the problems of Artificial Intelligence. We are teaching machines how to solve problems by identifying statistical patterns in data.
Use (existing) Data -> to learn a Function -> in order to do a Prediction (on new data)
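As a minimal sketch of this data -> function -> prediction flow, the snippet below learns a straight line from a handful of points using ordinary least squares. The house surfaces and prices are made-up numbers for illustration, and the fit_line helper is a hypothetical name, not something shown in the talk.

```python
def fit_line(xs, ys):
    """Learn a function y = slope * x + intercept from existing data
    using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    # The learned function is returned so it can be applied to new data.
    return lambda x: slope * x + intercept


# Existing data (made-up): surface in square metres -> price in thousands.
surfaces = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

predict = fit_line(surfaces, prices)  # learn a function
print(predict(80))                    # predict on new data
```

A neural network follows the same three steps; it simply learns a far more flexible function than a straight line.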
A neural network is an ML algorithm based on a very crude approximation of the way we used to believe the brain and its neurons work (the brain is still a black box for scientists).
An artificial neuron takes a set of inputs and applies a function to them to produce an output. We represent this neuron mathematically (i.e. inputs and outputs are numbers) so we can use it in a computational model.
A neural network is composed of several neurons organized into different layers: the input layer (the data we feed), 1 or more hidden layers and the output layer (the prediction). A deep neural network contains more than one hidden layer. Adding more than one hidden layer essentially allows us to model much more complex functions than a single hidden layer can.
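To make the layering concrete, here is a small sketch of a forward pass through a network with two hidden layers. It assumes the common textbook form of a neuron (a weighted sum of the inputs plus a bias, passed through a sigmoid); the talk summary does not say which function was used, and the weights below are arbitrary hand-picked values rather than trained ones.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs, then activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Feed the data through each layer in turn; the last output is the prediction."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


# Arbitrary, untrained weights: 2 inputs -> 3 neurons -> 2 neurons -> 1 output.
network = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer 1
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.4]], [0.05, -0.05]),       # hidden layer 2
    ([[1.0, -1.0]], [0.0]),                                      # output layer
]

prediction = forward([1.0, 2.0], network)
print(prediction)
```

Training, which is not shown here, is the process of adjusting those weights and biases so that the predictions match the existing data.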
An example of Deep Learning
An example of a deep neural network is a model that recognizes a person from a picture. The lower hidden layers represent simple shapes such as geometric primitives (e.g. horizontal and vertical lines), the intermediate layers represent more complex features such as specific parts of the face (mouth, nose, eye, etc.), and the highest hidden layer represents the whole face and is ultimately able to identify the person. The representation becomes more abstract and more specific as we approach the last layer, which makes the prediction.
Why are we talking about Deep Learning only now?
After all, one of the first ML algorithms, the Perceptron, was created in 1957, more than 60 years ago!
- We live in the era of Big Data. In the past 2 years, we have created more data than in the entire rest of human history. We have never had so much data available, and this data is essential to train complex models.
- Computers have never been so powerful: faster CPUs, more memory, solid-state drives. We can leverage the power of GPUs, since the matrix operations needed for video game graphics are also needed for ML. We also have access to distributed computing technologies that spread the data processing across multiple machines.
What can we do with Machine Learning?
- Classification. We want to predict a discrete variable that can only take on a certain number of values. Is it a cat or a dog? Is this email spam or not? Does this person have cancer or not? What is the category of this article?
- Regression. We want to predict a continuous variable that has an infinite number of possible values. At what price should I sell this house? What is the credit score of this person?
- Text generation. Generate the title of an article, the description of an image based on its content, convert voice to text for automatic subtitles.
- Image generation. Simulate face aging, paint a new Rembrandt that even experts could not identify as a fake, create celebrities that look familiar but don’t exist, create an image based on a description.
- Audio generation. Make an algorithm talk to a human on a phone call (Google Duplex), change your voice based on a text (voice editing) to avoid recording your voice again in case you got it wrong.
- Video generation. Based on rushes of videos and audios from Barack Obama, create a new video with an AI-powered lip sync.
If you have browsed some of the examples above, you will probably find it very challenging to differentiate between real and computer-generated content. That is quite scary when you think about it, and we can barely imagine how it will be in 10 years!
How do I get started if I want to do ML in my company?
Option 1 – Deep Learning as a Service (Google Cloud, AWS, Microsoft Cognitive Services, IBM Watson…)
It involves a 3rd party company that owns the model and the data. Basically, you can query their API with your new data in order to make a prediction.
This option is good for narrow use cases, when you don’t want to reinvent the wheel and are happy to use an already trained model.
Pros: Simple, quick, inexpensive
Cons: Narrow, remote (e.g. you are not protected against network failures or latency), pay-per-use (if your usage increases, so will the cost)
Option 2 – Deep Learning platform (Microsoft Azure, Cognitive Services…)
This option is good for custom use cases. You upload your data to train the model (transfer learning) then you can query an API to get your predictions.
Pros: Simple, quick, inexpensive
Cons: You need training data to train your model, remote, pay-per-use (you pay per transaction for the predictions but also for the training)
Option 3 – Do it yourself (TensorFlow, Torch…)
Use this if options 1 and 2 fail. You create your own algorithm from scratch, provide the data and host it yourself.
Pros: Custom (you can tune it as you like), local, private (in case you are dealing with sensitive data)
Cons: Complex, labor-intensive, expensive
Teaching kids about machine learning, with Dale Lane
Dale has created a website to teach kids about Machine Learning using the visual programming language platform Scratch. Scratch is a way to introduce programming to kids with a simple interface where you can drag and drop blocks representing programming operations (for loops, if-then statements…) and assemble them to create more advanced logic. Dale is using ScratchX (experimental extensions for Scratch) and has created new blocks related to Machine Learning (using IBM Watson under the hood).
He has invented many exercises so that kids can grasp the different concepts of ML through concrete and fun examples. The exercises mix text, number and image recognition. Each exercise usually starts with a phase where you train the model from the web application; then you can evaluate the predictions with Scratch. You can also refine the training set and see the effects on the model predictions.
- Create a chatbot, e.g. an animal that can answer questions about its species.
- Predict the newspaper an article was extracted from based on its title.
- Make Pacman avoid ghosts by playing the game multiple times. The more you play and train the model, the better Pacman gets.
- Categorize images. Upload images of cups and cars to train the model so it can classify a new, unknown image into one of these 2 categories. Or upload images of book covers for different categories (sci-fi, romance, thriller…) so the model can predict the category of a new cover. To make it more interactive, you can also upload images from the webcam, for example pictures of your hand for the game rock / paper / scissors, and play against the computer. The algorithm will recognize the shape you are making with your hand if you train it enough.
- Recognize handwritten postcodes to figure out which city the mail should be delivered to. That’s a good real-life example.
- Play Where’s Wally-like games. The computer automatically detects where Scratch the cat is located in a picture.
- And many other exercises.
Dale also mentioned the problems of AI:
- Issues with backgrounds and weather conditions. Even when something is obvious to a human, a computer can make silly mistakes if the training set is incomplete.
- The Russian tank story, where the model was trained with high-resolution pictures of American tanks but low-resolution, blurry pictures of Russian tanks. It performed really poorly at identifying tanks in reality.
- Google Photos can automatically add text describing a picture; some black people were categorized as gorillas…
- A model that can recommend medicine but is sponsored by a pharmaceutical company. What if the company asks to add more references to its products in the training set? Is that right?
There are a lot of important notions around ethics, bias, model overfitting and the quality of the training set. It’s important that kids come to understand these by themselves, and they usually do!
At the end, Dale gave us some links to go further:
- Machine Learning for Kids, Dale’s website, where you can find all the exercises.
- Teachable Machine from Google to train a model with your webcam.
- Quick, Draw! from Google to recognize a drawing.
- Moral Machine from MIT to judge different scenarios involving a self-driving car.
Easy Microservices with JHipster, with Sendil Kumar N
This presentation was a live-coding session. I had heard about JHipster a while ago but never had the opportunity to see it in action, and the fact that the project was initiated by a French developer was another reason to go.
JHipster makes the creation of new applications faster by generating the boilerplate code and dealing with most of the configuration (for example the configuration for Kubernetes, Maven POM files, etc.). It is a command-line tool where you can choose to create a monolith application, a microservice or a UAA (User Accounting and Authorizing service for securing your app using OAuth2). Once you have chosen your type of application, there are many options you can choose for your tech stack: frontend, backend, data source, build, logging, deployment, CI/CD, service registry, documentation (e.g. Swagger), testing frameworks.
For the demo, Sendil has created a gateway application that was querying a microservice application. He ran the app locally and then deployed it on Google Cloud Platform (GCP) via Kubernetes, without having to type a single line of code.
Deciding on the frameworks and technologies you want to use will probably be the tough part, as there are so many options. You also have to keep in mind that you will need to fine-tune the configuration (e.g. the default config provided for Kubernetes may not suit your needs).
Troubleshooting & Debugging Production Microservices in Kubernetes, with Ray Tsang
Ray showed us a small guestbook application where a user can post a message with a name (Guestbook service) and get a greeting after posting (Hello service). It is a Spring Boot application deployed on Kubernetes, made of a UI and 2 microservices (each with multiple instances). When he tried to access it, a 500 error page showed up, mentioning null and a 404. This time it was not a demo effect or an illustration of Murphy’s Law but part of the presentation. He went through the process of debugging the issue step by step using a variety of tools provided by Google Cloud Platform.
Looking at the logs, he identified 2 instances of the UI service that had a lot of errors compared to the other instances. Killing or restarting them is not a solution, as it won’t prevent the error from happening again. So he took one of the faulty instances out of the load balancer with a single kubectl command (kubectl is the command-line interface for running commands against Kubernetes clusters), changing the serving flag to false. The idea is to isolate the instance from production traffic so that he is free to debug it. Note that when the instance is disabled, Kubernetes automatically spins up a new instance of the service. Then he configured port forwarding on this pod to be able to query this specific instance locally. With Stackdriver Trace, he could see the call traces. Filtering on 5xx errors and on the particular instance, he realized the issue occurred at the Hello service level: the hello endpoint was called but responded with a 404. This seemed to happen when the name was missing. To confirm the behaviour, he added some logging on the fly to display the name, and indeed it was empty and the app was failing. Basically, the form was missing validation to make the name mandatory.
To recap, we can mention the 4 main debugging tools provided with GCP:
- Browsing and querying the logs with Stackdriver Logging (a Splunk-like tool)
- Tracing all the calls with Stackdriver Trace (a Zipkin-like tool)
- Adding debug logs and breakpoints on the fly to a prod instance with Stackdriver Debugger
- Having the history of errors in the logs plus various metrics on the services (response time…) with Stackdriver Monitoring
It was a nice presentation, but I regret that the title did not mention that the demo would mainly use Google Cloud commercial tools; I thought it would be more focused on Kubernetes. Still, it was worth it to know what Google provides. I have to admit that playing with breakpoints and adding logs to the code of a remote prod instance, all from a web interface, was quite impressive.
Fully serverless, a case study, with Stephen Colebourne & Chris Kent
At OpenGamma, they have created a new financial platform built on AWS and decided to use serverless technologies (AWS Lambda).
What is serverless and Lambda?
Serverless basically means that the infrastructure is invisible: you don’t know where the code runs and you don’t have control over it.
An AWS Lambda is basically made of 2 classes: a simple interface and its implementation, where you implement the handleRequest method. The Lambda terminates once the method completes.
You package the code in a JAR (that includes all the needed dependencies) and upload it to AWS to be run.
How can you trigger a Lambda?
- You can use CloudWatch (CRON-like).
- You can use an event. It can be a REST API call, a file saved in AWS S3, a message arriving in a queue, a row added to a table or a direct call from another Lambda.
Limits, limits everywhere!
When working with Lambdas, you have to be very careful with the AWS Lambda Limits.
- A maximum execution time: 5 minutes
- A maximum amount of memory you can use: 3 GB
- A maximum size for a JAR file that you can upload: 50 MB
- A maximum amount of disk space you can use: 512 MB
For some specific heavy processing, they hit most of these limits and they had to find different ways to handle it:
- Use AWS Batch instead, which has serverless features too. The only downside is that the process cannot be triggered instantly: when you submit a batch job, it will be processed at some point, but you cannot assume it will start immediately (it can be in 2 or even 10 minutes). This limitation was fine for them, but obviously it won’t work for everyone.
- Split and pre-process the data as much as possible before passing it to the Lambda. The Lambda should not do any of this time- and memory-consuming pre-processing. As the data is then loaded on demand, this introduces some latency, but it was acceptable for them.
- Instead of passing a copy of the data between Lambdas, which can be memory- and network-consuming and costly, they save the data in S3 and then pass a reference to it (metadata).
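The last workaround, passing a reference rather than the data itself, can be sketched as follows. This is an illustration under assumptions, not OpenGamma’s code: their Lambdas are Java, whereas the sketch uses Python for brevity, and a plain dictionary (FAKE_BUCKET) stands in for S3 so the example runs locally. The handler names and the event fields are hypothetical.

```python
import json
import uuid

# Stand-in for an S3 bucket so the sketch runs locally (assumption: real code
# would call an S3 client here instead of reading and writing a dict).
FAKE_BUCKET = {}


def put_object(key, body):
    FAKE_BUCKET[key] = body


def get_object(key):
    return FAKE_BUCKET[key]


def producer_handler(event, context=None):
    """First function: store the large result and return only a reference."""
    rows = [{"id": i, "value": i * i} for i in range(event["row_count"])]
    key = f"results/{uuid.uuid4()}.json"
    put_object(key, json.dumps(rows))
    # Only this small piece of metadata travels to the next function.
    return {"result_key": key, "row_count": len(rows)}


def consumer_handler(event, context=None):
    """Second function: load the data on demand using the reference."""
    rows = json.loads(get_object(event["result_key"]))
    return {"total": sum(row["value"] for row in rows)}


reference = producer_handler({"row_count": 1000})
summary = consumer_handler(reference)
print(len(json.dumps(reference)), "bytes passed between the two functions")
print(summary)
```

The payload exchanged between the two functions stays small whatever the size of the stored data, at the price of an extra read from the store when the consumer starts.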
They ended up with 4 services, made of a total of 23 Lambdas and 2 Batch jobs.
Automatic scaling is the main benefit of Lambda. It is totally transparent and managed by AWS. When there are a lot of requests, Lambdas are created and share the load. When all requests have been processed, the Lambdas scale down (to 0 when idle). However, it’s not a silver bullet, because you can have a bottleneck at another level of your system (e.g. your data store is too slow), in which case you will not be able to fully leverage Lambda scalability.
A downside of Lambdas is the cold start: a Lambda takes time to start. In fact, a Lambda is not really destroyed after it completes, as it can be re-used for the next request. AWS has a keep-alive mechanism, which means Lambdas typically live for a few minutes after they execute. They decided to call the Lambda with a bit of code every few minutes to keep it available and “warm”. Obviously, it’s a hack and it’s not guaranteed that it will still work in the future, but Chris told us it was quite common…
Debugging is not easy: you need to deploy to AWS to test. Nowadays there are libraries and frameworks to help you with that.
The logs from the Lambdas go into AWS CloudWatch Logs. This tool is not great for usability, so they created a Lambda to copy the logs from AWS to a dedicated log aggregator: Sumo Logic, a Splunk-like tool. For alerts and monitoring, they use CloudWatch Metrics, and they have also set up alerts in Sumo Logic, which makes it easier to find the related logs.
Building from small, simple pieces pushes the complexity elsewhere: to the infrastructure level (at build time) and to the interactions between the components (at run time). Monitoring and alerting are also more complex, as the system becomes even more fragmented. It is important to keep in mind that a Lambda has very restrictive limits: go above one and your Lambda will be killed. Note that the limits are evolving and are being raised over time. The technology is still young, and the tooling and frameworks available are quite limited, but given the level of excitement around Lambda, it’s clear that this is going to improve.
On the other hand, you don’t need to think about servers and maintaining them. The scaling is transparent and automatically handles the load. You have large potential cost savings using Lambda: for example, if your system is country-specific, you know that traffic will be very limited at night. Finally, the programming model is very simple: just a JAR file, just a function.
This second day at Devoxx was great, especially the talks in the morning about Machine Learning. I came out of these 2 days with a lot of innovative ideas. Thanks Devoxx UK and maybe see you next year again!