
Load the data from the Cambridge Beer Festival, which, in addition to names and styles, also contains tasting notes. Extract the data using the HTML parsing tools from Text Analytics Toolbox.

    tree = htmlTree(code)

Extract the beer names.

    name = extractHTMLText(subtrees)

Extract the tasting notes.

    subtrees = findElement(tree,"span")
    dataCambridge = table(name,notes)

Visualize the tasting notes in a word cloud. The wordcloud function in Text Analytics Toolbox creates word clouds directly from string data.

    figure

Classify Beer Style

First, using the Kaggle data, create a long short-term memory (LSTM) deep learning model to classify the beer style given the name. Visualize the distribution of the beer styles using a word cloud.

    textData = dataKaggle.name

As you can see in the word cloud, the styles are very imbalanced, with some styles containing only a few instances. To improve the model, remove the styles with fewer than 5 instances, and then split the data into 90% training and 10% testing partitions.
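The filtering and hold-out step described in this section (drop styles with fewer than 5 instances, then split 90%/10%) might be sketched as follows. This is only an illustration under stated assumptions: the toy table stands in for the real Kaggle data, the variable names labels, rareStyles, dataTrain, and dataTest are hypothetical, and the split assumes cvpartition from Statistics and Machine Learning Toolbox is available.

```matlab
% Toy stand-in for the Kaggle table; these names and styles are made up.
style = [repmat("IPA",20,1); repmat("Stout",10,1); repmat("Gose",2,1)];
name = "Beer " + (1:numel(style))';
dataKaggle = table(name,style);

% Count the instances of each style and drop styles with fewer than 5.
labels = categorical(dataKaggle.style);
styleNames = categories(labels);
rareStyles = styleNames(countcats(labels) < 5);
keep = ~ismember(labels,rareStyles);
dataKaggle = dataKaggle(keep,:);
labels = removecats(labels(keep));   % forget the now-empty categories

% Hold out 10% of the rows for testing, stratified by style.
cvp = cvpartition(labels,'Holdout',0.1);
dataTrain = dataKaggle(training(cvp),:);
dataTest = dataKaggle(test(cvp),:);
```

Passing the labels to cvpartition keeps the style proportions roughly equal in both partitions, which matters when the classes are as imbalanced as the word cloud suggests.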
