NLP models that produce fluent-sounding text are coming into vogue again. (I say again because systems like ELIZA and Markov chain text generators have been around for decades.) A set of new systems trained using the transformer deep learning architecture, including BERT and GPT-2, have been setting new high-water marks across various NLP leaderboards. It's an exciting time!
The problem is that, along with that excitement, we see an increasing desire to assign human-like cognition to text generated by NLP systems. Take this tweet for example:
Tyler Roost (@tylerroost): "@huggingface The last answer was not left blank by me, rather the only suggestions from the transformer I was getting were spaces. This may be too much of an anthropomorphism of the model, but I believe this to be an indication of consciousness." — 16:48, 11 Nov 2019
First I want to make it very clear that I'm not trying to dunk on Tyler here. I've seen similar questions asked by lots of very smart folks and I think it's a perfectly reasonable thing to wonder about.
I genuinely understand the desire to ascribe consciousness to ML systems. After all, folks have been hollering about AGI and the singularity for years. And humans have a deep-seated desire to see human qualities in non-human things.
That said, this very natural tendency, compounded by the fever-pitch hype cycle and cherry-picked examples, could lead a casual observer of the field to genuinely start wondering: are these systems actually showing patterns of humanlike thought?
Short answer: no.
Systems like BERT and GPT-2 do not have consciousness. They don't understand language in a grounded way. They don't keep track of information between different generated utterances. They don't "know" that down is the opposite of up, or that three is more than two, or that a child is a kind of human.
What they do have is highly, highly optimized models of (usually English) words that humans tend to use together in specific orders. In other words, they're very good statistical approximations of patterns of language use. This ACL paper has some good experimental results that provide evidence for this as well as some of the accompanying drawbacks.
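To make the "statistical approximation" point concrete, here's a toy sketch of the same basic idea in its simplest form: a bigram model that just counts which words tend to follow which. This is emphatically not how transformers work internally (they learn continuous representations over huge corpora, not raw counts), but it illustrates what "a model of words humans tend to use together in specific orders" means — and why such a model can produce fluent-looking output with zero understanding. All names and the tiny corpus below are invented for illustration.

```python
import random
from collections import defaultdict, Counter

def train_bigram_model(corpus):
    """For each word, count which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts

def generate(model, start, max_len=8, seed=0):
    """Sample a continuation, weighting next words by observed frequency."""
    rng = random.Random(seed)
    word, out = start, [start]
    for _ in range(max_len - 1):
        if word not in model:
            break  # dead end: we never saw this word mid-sentence
        choices, weights = zip(*model[word].items())
        word = rng.choices(choices, weights=weights)[0]
        out.append(word)
    return " ".join(out)

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]
model = train_bigram_model(corpus)
print(generate(model, "the"))
```

The output is grammatical-looking English ("the cat sat on the mat", say), yet the model has no concept of cats, mats, or sitting — only co-occurrence statistics. Transformers are this idea scaled up enormously and made far more expressive, but the fluency-without-grounding property carries over.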
Why is this important?
On the one hand, it's not! BERT, GPT-2 et al aren't designed to be grounded language models or include knowledge about relationships between entities. There's absolutely nothing in the algorithm design or training data to ensure that text generated by these models is factual. This isn't a drawback of the models: it's just not in scope.
On the other hand, it's very important that people working with these models understand that this is the case. These are language models and, like all language models, they're designed to be components in larger NLP systems rather than entire systems in themselves.
So, while it's definitely fun to play around with text generated by these models, it's akin to interacting with a parrot that's been taught to mimic your ringtone. It may sound like a phone, but it has none of the other features that make it one.