Deep Beer Designer

Posted by Johanna Pingel, October 18, 2018


This post is from Ieuan Evans, who has created a unique example combining deep learning with LSTM and beer. (Please drink responsibly!)

I love craft beer. Nowadays, there are so many choices that it can be overwhelming, which is a great problem to have! Lately I have found myself becoming lazy when it comes to carefully selecting beers in a bar, and I tend to just go for the beer with the best-sounding name.

I started to wonder: Could MATLAB automatically analyze a list of names and select a beer for me? Why stop there? Could I get MATLAB to design a unique beer just for me?

In this example, I will show how to classify beer styles given the name, how to generate new beer names, and even automatically generate some tasting notes too.

Happe Hill Hefeweizen

The nice blur effect is from Image Processing Toolbox.

"A rich and fruity traditional Marthe Belgian yeast. Full-bodied with nutty undertones and a slightly sweet fruit flavor."

(MATLAB-generated name and tasting notes. Not bad!)

Import Data

There are two data sources available for this example:

A dataset of craft beers from Kaggle: https://www.kaggle.com/nickhould/craft-cans
A list of beers from the Cambridge Beer Festival in the UK: https://www.cambridgebeerfestival.com/products/cbf45-beer

Load the craft beers data from Kaggle.

rng(0)
filename = "beers.csv";
dataKaggle = readtable(filename,'TextType','string','Encoding','UTF-8');

View a random sample of the data.

idx = randperm(size(dataKaggle,1),10);
disp(dataKaggle(idx,["name" "style"]))

Name	Style
_______________________________________	_______________________________________
"Walloon (2014)"	"Saison / Farmhouse Ale"
"Yoshi's Nectar"	"California Common / Steam Beer"
"1327 Pod's ESB"	"Extra Special / Strong Bitter (ESB)"
"Parade Ground Coffee Porter"	"American Porter"
"Perpetual Darkness"	"Belgian Strong Dark Ale"
"La Frontera Premium IPA"	"American IPA"
"Canyon Cream Ale"	"Cream Ale"
"Pace Setter Belgian Style Wit"	"Witbier"
"Squatters Hop Rising Double IPA"	"American Double / Imperial IPA"
"Good Vibes IPA"	"American IPA"

Load the data from the Cambridge Beer Festival, which, in addition to names and styles, also contains tasting notes. Extract the data using the HTML parsing tools from Text Analytics Toolbox.

url = "https://www.cambridgebeerfestival.com/products/cbf44-beer";
code = webread(url);
tree = htmlTree(code);

Extract the beer names.

subtrees = findElement(tree,"span[class=""productname""]");
name = extractHTMLText(subtrees);

Extract the tasting notes.

subtrees = findElement(tree,"span[class=""tasting""]");
notes = extractHTMLText(subtrees);
dataCambridge = table(name,notes);

Visualize the tasting notes in a word cloud. The wordcloud function in Text Analytics Toolbox creates word clouds directly from string data.

figure
wordcloud(notes);
title("Tasting Notes")

Classify Beer Style

First, using the Kaggle data, create a long short-term memory (LSTM) deep learning model to classify the beer style given the name. Visualize the distribution of the beer styles using a word cloud.

textData = dataKaggle.name;
labels = categorical(dataKaggle.style);
figure
wordcloud(labels);
title("Beer Styles")

As you can see in the word cloud, the styles are very imbalanced, with some styles containing only a few instances. To improve the model, remove the styles with fewer than 5 instances, and then split the data into 90% training and 10% testing partitions.

(The details of the data preparation can be found in the full example file.)
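As a rough idea of what that preparation could look like, here is a minimal sketch on toy data. It is not the code from the full example file: the toy names and labels, the minCount threshold variable, the textDataTrain name, and the plain random (unstratified) split are all assumptions made for illustration.

```matlab
% Toy stand-in data; in the post these would come from dataKaggle.
textData = "Beer " + (1:23)';
labels = categorical([repmat("American IPA",12,1); ...
    repmat("Cream Ale",8,1); repmat("Altbier",3,1)]);

% Drop styles with fewer than 5 instances.
minCount = 5;                                   % assumed threshold variable
styleNames = categories(labels);
rareStyles = styleNames(countcats(labels) < minCount);
idxKeep = ~ismember(labels,rareStyles);
textData = textData(idxKeep);
labels = removecats(labels(idxKeep));           % forget the now-empty categories

% Hold out a random 10% of the remaining names for testing.
rng(0)
numObservations = numel(labels);
idxShuffled = randperm(numObservations);
numTest = round(0.1*numObservations);
idxTest = idxShuffled(1:numTest);
idxTrain = idxShuffled(numTest+1:end);

textDataTrain = textData(idxTrain);
labelsTrain = labels(idxTrain);
textDataTest = textData(idxTest);
labelsTest = labels(idxTest);
```

Calling removecats after filtering matters because the number of categories later determines the size of the final fully connected layer.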

Convert each beer name to a sequence of integers, where each integer represents a character. The responses are the beer styles.

YTrain = labelsTrain;
YTest = labelsTest;
YTrain(1:6)

ans = 6x1 string array

American Pale Lager

American IPA

American Double / Imperial IPA

American IPA

Oatmeal Stout
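The conversion of names to integer sequences is not shown above. One plausible way to do it, sketched here under the assumption that each character is represented by its Unicode code point, is to convert every name to a character vector and then to double. The two names are taken from the sample printed earlier, and the variable name textDataTrain is an assumption.

```matlab
% Two example names from the Kaggle sample shown earlier in this post.
textDataTrain = ["Canyon Cream Ale"; "Good Vibes IPA"];

% One row vector of character codes per name. The char conversion is
% needed because double applied directly to a string tries to parse
% the text as a number.
XTrain = arrayfun(@(s) double(char(s)),textDataTrain,'UniformOutput',false);

% Largest code seen, which sets the vocabulary size of the embedding.
numCharacters = max([XTrain{:}]);
```

With this encoding, the embedding layer needs one entry per code up to the largest code that appears, which is why numCharacters is computed as a maximum rather than as a count of distinct characters.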

Next create the deep learning network architecture. Use a word embedding layer to learn an embedding of characters and map the integers to vectors. Use a bidirectional LSTM (BiLSTM) layer to learn bidirectional long-term dependencies between the characters in the beer names.

To learn stronger interactions between the hidden units of the BiLSTM layer, include an extra fully connected layer of size 50. Use dropout layers to help prevent the network from overfitting.

numFeatures = 1;
embeddingDimension = 100;
numCharacters = max([XTrain{:}]);
numClasses = numel(categories(YTrain));
layers = [
    sequenceInputLayer(numFeatures)
    wordEmbeddingLayer(embeddingDimension,numCharacters)
    bilstmLayer(200,'OutputMode','last')
    dropoutLayer(0.5)
    fullyConnectedLayer(50)
    dropoutLayer(0.5)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

Specify the training options.

options = trainingOptions('adam', ...
    'MaxEpochs',100, ...
    'InitialLearnRate',0.01, ...
    'GradientThreshold',2, ...
    'Shuffle','every-epoch', ...
    'ValidationData',{XTest,YTest}, ...
    'ValidationFrequency',80, ...
    'Plots','training-progress', ...
    'Verbose',false);

Train the network.

beerStyleNet = trainNetwork(XTrain,YTrain,layers,options);

Here, we can see that the model overfits. The model has effectively memorized the training data, but not generalized well enough to get as high accuracy on the test data.

This is perhaps expected: lots of beer names don't give that much away when it comes to the style, so the network has little to work with. Some are easy to classify since they contain the style of the beer in the name.

For example, what style of beer do you think the following are? Can you beat the classifier?

idx = [1 4 5 8 9 10 12 14 15 17];
textDataTest(idx)

ans = 10x1 string array

"Sophomoric Saison"

"Divided Sky"

"Honey Kolsch"

"Alaskan Amber"

"California Lager"

"Brotherhood Steam"

"Angry Orchard Apple Ginger"

"Long Leaf"

"This Season's Blonde"

"Raja"

Compare your guesses with the predictions made by the network and with the correct labels.

YPred = classify(beerStyleNet,XTest);
disp(table(textDataTest(idx),YPred(idx),YTest(idx),'VariableNames',["Name" "Prediction" "True"]))

Name	Prediction	True
______________________________	______________________________	______________________________
"Sophomoric Saison"	Saison / Farmhouse Ale	Saison / Farmhouse Ale
"Divided Sky"	American Amber / Red Ale	American IPA
"Honey Kolsch"	Kölsch	Kölsch
"Alaskan Amber"	American Amber / Red Ale	Altbier
"California Lager"	American Amber / Red Lager	American Amber / Red Lager
"Brotherhood Steam"	American Pale Wheat Ale	California Common / Steam Beer
"Angry Orchard Apple Ginger"	Cider	Cider
"Long Leaf"	Munich Helles Lager	American IPA
"This Season's Blonde"	Cream Ale	American Blonde Ale
"Raja"	Fruit / Vegetable Beer	American Double / Imperial IPA

So, can I use this network to select a beer for me? Suppose the test set contains all the beers available at a bar. I tend to go for some kind of IPA. Let's see which of these beers are classified as an IPA. This could be any of the class labels containing "IPA".

classNames = string(beerStyleNet.Layers(end).Classes);
idx = contains(classNames,"IPA");
classNamesIPA = classNames(idx)

ans = 5x1 string array

"American IPA"

"American White IPA"

"Belgian IPA"

"English India Pale Ale (IPA)"

[YPred,scores] = classify(beerStyleNet,XTest);
idx = contains(string(YPred),"IPA");
selection = textDataTest(idx);

Let's see what proportion of these actually are labelled as some kind of IPA.

accuracyIPA = mean(contains(string(YTest(idx)),"IPA"))

accuracyIPA = 0.7241

View the top 10 predictions sorted by classification score. To make it even more exciting, let's exclude any beers with "IPA" in the name.

topScores = max(scores(idx,:),[],2);
[~,idxSorted] = sort(topScores,'descend');
selectionSorted = selection(idxSorted);
% Remove beers with "IPA" or "India Pale Ale" in the name
idx = contains(selectionSorted,["IPA" "India Pale Ale"]);
selectionSorted(idx) = [];
selectionSorted(1:10)

ans = 10x1 string array

"American Idiot Ale (2012)"

"Citra Faced"

"Hopped on the High Seas (Calypso)"

"Bengali Tiger"

"The Sword Iron Swan Ale"

"The 26th"

"Isis"

"En Parfaite Harmonie"

"Sanctified"

"Sockeye Maibock"

Looks like some good suggestions!

Generate New Beer Names

We have created a deep network that does a reasonable job of finding a beer for me. My next desire is for MATLAB to design a beer for me. First it needs a name. To do this, I'll use an LSTM network for sequence forecasting which predicts the next character of a sequence. To improve the model, I'll also include the beer names from the Cambridge Beer Festival in the UK. Validation data is not helpful here, so we will train on all the data.

textData = [dataKaggle.name; dataCambridge.name];

To help with the generation, replace all the space characters with a "·" (middle dot) character, insert a start of text character at the beginning, and an end of text character at the end.

startOfTextCharacter = compose("\x0002");
whitespaceCharacter = compose("\x00B7");
endOfTextCharacter = compose("\x2403");

For the predictors, insert the start of text character before the beer names. For the responses, append the end of text character after the beer names. Here, the responses are the same as the predictors, shifted by one time step.

textDataPredictors = startOfTextCharacter + replace(textData," ",whitespaceCharacter);
textDataResponses = replace(textData," ",whitespaceCharacter) + endOfTextCharacter;

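% The inputs are string arrays, so loop with arrayfun and convert each
% name to a character vector before taking character codes.
XTrain = arrayfun(@(s) double(char(s)),textDataPredictors,'UniformOutput',false);
YTrain = arrayfun(@(s) categorical(cellstr(char(s)')'),textDataResponses,'UniformOutput',false);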

View the first sequence of predictors and responses.

XTrain{1}

ans = 1x9

2 80 117 98 183 66 101 101 114

YTrain{1}

ans = 1x9 categorical

P u b · B e e r ␃

Construct the network architecture.

numFeatures = 1;
numClasses = numel(categories([YTrain{:}]));
numCharacters = max([XTrain{:}]);

layers = [
    sequenceInputLayer(numFeatures)
    wordEmbeddingLayer(200,numCharacters)
    lstmLayer(400)
    dropoutLayer(0.5)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

Specify the training options.

options = trainingOptions('adam', ...
    'InitialLearnRate',0.01, ...
    'GradientThreshold',2, ...
    'Shuffle','every-epoch', ...
    'Plots','training-progress', ...
    'Verbose',false);

Train the network.

beerNameNet = trainNetwork(XTrain,YTrain,layers,options);

Here, the network might look like it is not doing particularly well. Again, this might be expected. To get high accuracy, the network must generate the training data exactly. We don't want the network to overfit too much because the network will simply generate the training data.

Generate some beer names using the generateText function, which is included in the full example file at the end of the post.

numBeers = 30;
generatedBeers = strings(numBeers,1);
for i = 1:numBeers
    generatedBeers(i) = generateText(beerNameNet,startOfTextCharacter,whitespaceCharacter,endOfTextCharacter);
end
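The generateText function itself lives in the full example file, so the following is only a sketch of the kind of loop such a function might run: start from the start of text character, sample the next character from a score vector, feed it back in, and stop at the end of text character. To keep it runnable without a trained network, a hypothetical fixed transition table stands in for the model. With the real network, the scores would presumably come from a call such as predictAndUpdateState on beerNameNet. The symbols list, the table T, and maxLength are all assumptions.

```matlab
startOfTextCharacter = compose("\x0002");
whitespaceCharacter = compose("\x00B7");
endOfTextCharacter = compose("\x2403");

% Hypothetical stand-in for the network: row i of T holds the
% probability of each next symbol given that the current symbol is i.
symbols = [startOfTextCharacter "H" "o" "p" whitespaceCharacter ...
    "I" "P" "A" endOfTextCharacter];
T = circshift(eye(numel(symbols)),1,2);   % each symbol leads to the next one

maxLength = 50;                           % assumed safety limit on name length
current = startOfTextCharacter;
generated = "";
for k = 1:maxLength
    scores = T(symbols == current,:);     % scores for the next character
    % Sample an index in proportion to the scores (inverse CDF draw).
    idx = find(rand <= cumsum(scores),1,'first');
    current = symbols(idx);
    if current == endOfTextCharacter
        break
    end
    generated = generated + current;
end

% Turn the middle dots back into spaces.
generated = replace(generated,whitespaceCharacter," ")
```

Sampling from the scores, rather than always taking the most likely character, is what lets repeated calls produce different names from the same network.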

Sometimes, the network might simply predict beer names from the training data. Remove them.

idx = ismember(generatedBeers,textData);
generatedBeers(idx) = [];

View the generated beers.

generatedBeers

generatedBeers =

"Firis Amber"

"Sprecian Claisper"

"Worther Pale Ale"

"Ma's Canido Winter Ale"

"Hop Roust"

"Honey Fuddel Pilsner"

"Slowneck Lager"

"CuDas Colora Lager"

"No Ryer Pilsner"

"Dark Light IPA"

Generate Tasting Notes

We have our beer names; now we need some tasting notes. Similar to the name generator, create a tasting note generator from the Cambridge Beer Festival notes.

textData = dataCambridge.notes;

As before, to help with the generation, replace all the space characters with a "·" (middle dot) character, insert a start of text character at the beginning, and an end of text character at the end.

Once again, define the network architecture, specify the training options, and train the network. (Details can be found in the main example file, linked at the very end of this post.)
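For reference, the preprocessing step for the notes could mirror the one used for the names. The sketch below uses two short made-up notes in place of dataCambridge.notes, and it uses arrayfun with an explicit char conversion because the inputs are string arrays. It is an illustration under those assumptions, not the code from the main example file.

```matlab
startOfTextCharacter = compose("\x0002");
whitespaceCharacter = compose("\x00B7");
endOfTextCharacter = compose("\x2403");

% Hypothetical stand-in for dataCambridge.notes.
textData = ["Light copper bitter."; "Stout. Unfined."];

% Predictors start with the start of text character; responses end
% with the end of text character.
textDataPredictors = startOfTextCharacter + replace(textData," ",whitespaceCharacter);
textDataResponses = replace(textData," ",whitespaceCharacter) + endOfTextCharacter;

% Character codes as predictors, one categorical per character as responses.
XTrain = arrayfun(@(s) double(char(s)),textDataPredictors,'UniformOutput',false);
YTrain = arrayfun(@(s) categorical(cellstr(char(s)')'),textDataResponses,'UniformOutput',false);

% Predictors and responses have the same length, offset by one time step.
sameLength = numel(XTrain{1}) == numel(YTrain{1});
```

Because the responses are the predictors shifted by one step, the network is trained to predict each character of a note from the characters that precede it.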

Generate some tasting notes using the generateText function, listed at the end of the example.

numBeers = 5;
for i = 1:numBeers
    generatedNotes = generateText(beerNotesNet,startOfTextCharacter,whitespaceCharacter,endOfTextCharacter)
end

"This pale ale has a good assertive pale and full-bodied and lagerong aftertaste."

"A full-bodied Imperial stout with flavour with a slight but fuity bite from The Fussion of roasted, malty flavours and a delicate character that is also present in the aftertaste with a silk stout. Unfined."

"Light copper traditional bitter with good malt flavours. Brewed with the finest English Maris Otter taste and a rowner fruit and bitter sweet finish."

"Stout brewed with a variety of flavoursomen. Unfined."

"Mixed malt and fruit start thise in the boil."

Perfect! I can now get started on brewing my own perfect beer. You can run the code many times to generate more names and tasting notes. My favorite design that I have seen so far is:

Hopky Wolf IPA

"This Double IPA has a big malt backbone and flavours of grapefruit, orange and lemon with an underlying floral quality and tent complex. Well balanced aroma reflects its taste. It's hopped with a blend of Fuggle and Golding hops."

Now I just need MATLAB to automate the brewing process...