Yesterday I published a post about how I used Spell, a machine learning framework that allows you to train your models faster without the pain of dependency management on your machine, to implement style transfer. I've continued to use Spell, and have since branched out into training other kinds of models.
Today I wanted to write about making your own text bot with Spell. I've always been a huge fan of Emily Dickinson's poetry. I know some find her work a bit morbid, but her poetry has spoken to me throughout many years and I continue to marvel at how someone who rarely left her home could have such incredible insight into the human condition, the natural world, and the realities of life and death.
Since Spell conveniently had a video tutorial for creating a bot trained on lyrics from DOOM, I figured I could apply this quite easily to Emily Dickinson's poetry.
To my surprise, the hardest part was not the training but the web scraping. I started out with the Node.js tutorial for web scraping (https://medium.freecodecamp.org/the-ultimate-guide-to-web-scraping-with-node-js-daa2027dcd3) and then poked around the documentation for puppeteer and cheerio to figure out how to get the scraper to press the "next" button when it reached the end of the results on a page.
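The "keep pressing next until there are no more pages" logic can be sketched without tying it to any particular site. This is only an illustration, not the code from the project: `scrapeAllPages`, `loadPage`, and the `{ poems, nextUrl }` shape are hypothetical names, and in a real scraper `loadPage` would wrap whatever browser automation and HTML parsing you use. A fake two-page site stands in for the real one so the loop can run on its own.

```javascript
// Sketch of a pagination loop: keep scraping until no "next" page remains.
// `loadPage` is a hypothetical function supplied by the caller; in a real
// scraper it would load the page and parse out the poems and the next link.
async function scrapeAllPages(startUrl, loadPage, maxPages = 100) {
  const poems = [];
  let url = startUrl;
  let pagesVisited = 0;

  // maxPages is a safety cap so a bad "next" link cannot loop forever.
  while (url && pagesVisited < maxPages) {
    // Assumed shape: { poems: string[], nextUrl: string | null }
    const page = await loadPage(url);
    poems.push(...page.poems);
    url = page.nextUrl; // null on the last page ends the loop
    pagesVisited += 1;
  }
  return poems;
}

// Stand-in for a real site: two fake pages linked by a "next" URL.
const fakeSite = {
  '/page/1': { poems: ['Poem A', 'Poem B'], nextUrl: '/page/2' },
  '/page/2': { poems: ['Poem C'], nextUrl: null },
};
const fakeLoadPage = async (url) => fakeSite[url];

scrapeAllPages('/page/1', fakeLoadPage).then((all) => {
  console.log(`Collected ${all.length} poems`);
});
```

Passing the page loader in as a function keeps the loop itself easy to test, and the page cap is a cheap guard against a site whose last page still shows a "next" button.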
At first I decided to scrape poets.org, which only had 5 pages of poetry by Emily Dickinson. It seemed like a paltry amount, and I knew Emily Dickinson had been very prolific, so I did some searching online to find a better source. Although a bit more difficult to scrape, poemhunter.com ended up being a better source with 31 pages of poetry by Emily Dickinson.
I did run into some issues at this point: lots of errors about the socket hanging up, which I thought were crashing the scraper. I wasn't able to pin down the exact cause, but once I changed my file-writing function to append each poem to the file one at a time, instead of adding each poem to an increasingly long string and writing the file at the end, I got back the data I was expecting.
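Writing poem by poem might look roughly like the sketch below. It is a minimal illustration using Node's built-in `fs` module, not the project's actual code: the `appendPoem` helper, the blank-line separator, and the placeholder strings are all assumptions. The appeal of this approach is that whatever has been scraped so far is already on disk if a later request fails.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Append one poem to the corpus file as soon as it has been scraped.
// A blank line separates poems; that separator is an arbitrary choice.
function appendPoem(filePath, poemText) {
  fs.appendFileSync(filePath, poemText.trim() + '\n\n', 'utf8');
}

// Usage with placeholder strings standing in for scraped poems.
// A temporary directory keeps the example from touching real files.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'poems-'));
const corpusPath = path.join(tmpDir, 'corpus.txt');

for (const poem of ['First placeholder poem', '  Second placeholder poem  ']) {
  appendPoem(corpusPath, poem);
}

// Read the file back to confirm both poems were written.
const corpus = fs.readFileSync(corpusPath, 'utf8');
console.log(corpus);
```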
All of Emily Dickinson's poems were added to the same text file to make it easy for the LSTM training model to process the data. Once the model was created, I downloaded it back to my machine and created a small web app with ML5 that uses the model to generate new poetry. The app has an adjustable setting called "temperature" that changes the amount of randomness in the generated text: lower values keep the output closer to the patterns the model learned from the poetry data, while higher values make it more unpredictable.
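To make "temperature" a little more concrete, here is a generic sketch of how it is commonly applied when sampling the next character from a model's predicted probabilities. This is not ML5's internal code; `applyTemperature`, `sampleIndex`, and the example probabilities are made up for illustration. Each probability is raised to the power 1/temperature and the results are renormalized, so values below 1 sharpen the distribution and values above 1 flatten it.

```javascript
// Rescale a probability distribution by a temperature value.
// probs: probabilities that sum to 1; temperature: a number greater than 0.
function applyTemperature(probs, temperature) {
  if (!(temperature > 0)) {
    throw new RangeError('temperature must be greater than 0');
  }
  // Work in log space: p^(1/T) is proportional to exp(log(p) / T).
  const logits = probs.map((p) => Math.log(p) / temperature);
  const maxLogit = Math.max(...logits); // subtracted for numerical stability
  const exps = logits.map((l) => Math.exp(l - maxLogit));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// Draw an index from a distribution; `rand` is injectable for repeatable runs.
function sampleIndex(probs, rand = Math.random) {
  const r = rand();
  let cumulative = 0;
  for (let i = 0; i < probs.length; i++) {
    cumulative += probs[i];
    if (r < cumulative) return i;
  }
  return probs.length - 1; // guard against floating-point rounding
}

// Hypothetical next-character probabilities from a model.
const base = [0.7, 0.2, 0.1];
const cool = applyTemperature(base, 0.5); // favors the likeliest option more
const warm = applyTemperature(base, 2.0); // spreads probability more evenly

console.log(cool, warm, sampleIndex(warm));
```

At a low temperature the likeliest character wins almost every time, which tends to produce safer, more repetitive text; at a high temperature unlikely characters get picked more often, which is where the stranger lines come from.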
Here is a brief sample of output from the bot, for your reading pleasure:
Birds at the Corn— The Sun— With Men Orchard with Me— The Bone at House— The Bonnet Day— The Milly strain of He can the March— And then I star me— And Death— A Dread His Eyes the Sun— And then the Sun too Day the Birds— The Room To only still— And I too see the Soul That Desert face of Sun— The Man As House He stand— And stand to the Soul a Bee— And then the Soul as Beauty
If you're interested in creating your own text bot with Spell, check out this video tutorial which I used:
Of course, if you're interested in seeing my code, I have it up on GitHub here:
Neural network trained on Emily Dickinson poetry
Run npm install
Run npm start to start the http-server
Navigate to the URL provided in http-server message (probably localhost:8081) and add /ml5/index.html