Can curve-fitting make for good conversation?
…Google’s “Meena” chatbot suggests it can…

Google researchers have trained a chatbot with uncannily good conversational skills. The bot, named Meena, is a 2.6 billion parameter language model trained on 341GB of text data, filtered from public domain social media conversations. Meena uses a seq2seq model (the same sort of technology that powers Google’s “Smart Compose” feature in Gmail), paired with an Evolved Transformer encoder and decoder; it’s interesting to see something like this depend so much on a component developed via neural architecture search.

Can it talk? Meena is a pretty good conversationalist, judging by transcripts uploaded to GitHub by Google. It also seems able to invent jokes (e.g., Human: do horses go to Harvard? Meena: Horses go to Hayvard. Human: that’s a pretty good joke, I feel like you led me into it. Meena: You were trying to steer it elsewhere, I can see it.)

A metric for good conversation: Google developed the ‘Sensibleness and Specificity Average’ (SSA) measure, which it uses to evaluate how good Meena is in conversation. This metric evaluates the outputs of language models for two traits: is the response sensible, and is the response specifically tied to what is currently being discussed? To calculate the SSA for a given chatbot, the researchers have a team of crowd workers evaluate some of the model’s outputs, then use those judgments to create an SSA score.

Humans vs Machines: The best-performing version of Meena gets an SSA of 79%, compared to 86% for an average human. By comparison, other state-of-the-art systems such as DialoGPT (51%) and Cleverbot (44%) do much worse.

Different release strategy: Along with their capabilities, modern neural language models have also been notable for the different release strategies adopted by the organizations that build them. OpenAI announced GPT-2 but didn’t release it all at once, releasing the model over several months along with research into its potential for misinformation and its tendency toward bias. Microsoft announced DialoGPT but didn’t provide a sampling interface in an attempt to minimize opportunistic misuse, and other companies like NVIDIA have alluded to larger language models (e.g., Megatron), but not released any parts of them. With Meena, Google is also adopting a different release strategy. “Tackling safety and bias in the models is a key focus area for us, and given the challenges related to this, we are not currently releasing an external research demo,” they write. “We are evaluating the risks and benefits associated with externalizing the model checkpoint, however”.

Why this matters: How close can massively-scaled function approximation get us to human-grade conversation? Can it get us there at all? Research like this pushes the limits of a certain kind of deliberately naive approach to learning language, and it’s curious that we’re developing more and more superficially capable systems, despite the lack of domain knowledge and handwritten systems inherent to these approaches.
Read more: Towards a Human-like Open-Domain Chatbot (arXiv).
Read more: Towards a Conversational Agent that Can Chat About… Anything (Google AI Blog).

Chinese government uses drones to remotely police people in coronavirus-hit areas: Chinese security officials are using drones to remotely surveil and talk to people in coronavirus-hit areas of the country.
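For readers who want to see how the SSA measure used to evaluate Meena might be computed from crowd-worker labels, here is a minimal Python sketch. It assumes, as the metric's name suggests, that SSA is the simple average of the share of responses judged sensible and the share judged specific, and that a response judged not sensible cannot count as specific. The `Label` class, the `ssa` function, and the sample labels are hypothetical illustrations, not Google's evaluation code or data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Label:
    """One crowd-worker judgment of a single chatbot response (hypothetical schema)."""
    sensible: bool
    specific: bool


def ssa(labels: Iterable[Label]) -> float:
    """Average of the sensibleness rate and the specificity rate, in [0, 1]."""
    labels = list(labels)
    if not labels:
        raise ValueError("need at least one label")
    n = len(labels)
    sensibleness = sum(label.sensible for label in labels) / n
    # Assumption: a response that is not sensible is never counted as specific.
    specificity = sum(label.sensible and label.specific for label in labels) / n
    return (sensibleness + specificity) / 2


# Made-up labels for four responses, for illustration only.
labels = [
    Label(sensible=True, specific=True),
    Label(sensible=True, specific=False),
    Label(sensible=True, specific=True),
    Label(sensible=False, specific=False),
]
print(f"SSA = {ssa(labels):.1%}")  # prints "SSA = 62.5%"
```

In this toy sample three of four responses are sensible (75%) and two of four are specific (50%), so the average is 62.5%. A vague but safe reply such as "I don't know" would raise sensibleness without raising specificity, which is why the metric tracks both traits.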