🚀🇰🇵 Da Fight fo’ Make A.I. Smalla (an’ Smarta) 📰

Teachin’ da big kine language models fewer words might help ’em sound mo’ human-like. 💡🤖

Wen we talk ’bout dem artificial intelligence chatbots, mo’ of everyting usually stay bettah. Large language models like ChatGPT an’ Bard, dey stay produce original conversational text dat improve when you feed ’em more data. Every single day, bloggers stay jump on da internet fo’ explain how da newest advancements — like dis cool app dat summarize articles, A.I.-generated podcasts, or one fancy model dat can answer any question ’bout professional basketball — going “change everyting.” 📚🎙️🏀💻

But fo’ make da A.I. biggah an’ mo’ capable, you gotta have da processing power dat only few companies get. An’ now, people starting to worry dat dis small group, includin’ Google, Meta, OpenAI, an’ Microsoft, going end up wit complete control ova da technology. 😱💻

Plus, da biggah da language models, da hahdah dey stay undahstand. People even call ’em “black boxes,” an’ dat’s not jus’ any random peeps, but even da designers themselves. Da top folks in dis field all say dey get dis uneasy feelin’ dat da goals of A.I. might not line up wit our own goals. So, while biggah might seem bettah, it also stay mo’ mysterious an’ exclusive. 😬📦💡

Den in January, dis group of young brainiacs workin’ in natural language processing, which be da branch of A.I. focused on undahstandin’ language, wen throw one challenge out deyah. Dey wen call fo’ teams fo’ create language models dat still stay effective but use datasets way smaller than da ones used by da big kine models. Da goal stay fo’ make mini-models dat can do almost as much as da high-end models, but be smalla, mo’ accessible, an’ mo’ compatible wit humans. Dey wen give ’em one fancy name too — da BabyLM Challenge. 😎💪🔬

“We stay challengin’ people fo’ tink small an’ focus on buildin’ efficient systems dat can serve way mo’ people,” said Aaron Mueller, one computer scientist from Johns Hopkins University an’ one of da BabyLM organizers. Anothah organizer, Alex Warstadt from ETH Zurich, added, “Da challenge stay puttin’ da spotlight on how humans learn language, instead of jus’ askin’, ‘How big we can make our models?’” 🧠🔍💻

Da large language models stay be neural networks dat predict da next word in one sentence or phrase. Dey stay train fo’ dis job by usin’ a whole bunch of words from transcripts, websites, novels, an’ newspapers. Da typical model going make guesses based on example phrases, an’ as it get closah to da right answer, it going adjust itself. 🤔📖💡

By repeatin’ dis process again an’ again, da model go create maps dat show how words all connect an’ relate to each oddah. In general, da mo’ words a model get trained on, da bettah it going become; every phrase give da model some context, an’ da mo’ context it get, da mo’ detailed impression it going have ’bout da meanin’ of each word. OpenAI’s GPT-3, dat came out in 2020, wen get trained on 200 billion words; DeepMind’s Chinchilla, dat came out in 2022, wen get trained on one trillion words. 🌐📚💡

To Ethan Wilcox, one linguist from ETH Zurich, da idea dat somethin’ not even human can generate language be real exciting. He wen start askin’, “Eh, can we use A.I. language models fo’ study how humans learn language?” Fo’ example, da theory of nativism, dat traced back to Noam Chomsky’s early work, say humans learn language fast an’ efficient ’cause we stay born wit some natural undahstandin’ of how language work. But guess wat? Language models also learn language quick, an’ dey no get any natural undahstandin’ of how language work — so maybe dis nativism theory no stay solid. 🤔🗣️📚

Da big challenge we get heah be dat language models learn real different from humans. Humans get bodies, social lives, an’ plenty sensations. We can smell stuff, feel stuff, bump into stuff, an’ even taste stuff. From da get-go, we get exposed to simple spoken words an’ syntax dat hardly show up in written form. So Dr. Wilcox wen conclude dat one computer, even aftah get trained on gazillions of written words, can only tell us so much ’bout our own language learnin’ process. 🤷‍♂️👃👅🧠

But suppose one language model only get exposed to da same words as one young human, maybe it going interact wit language in ways dat can help answer some questions we get ’bout our own language skills. 🤔🗣️📚

So, wit da help of ’bout six oddah colleagues, Dr. Wilcox, Dr. Mueller, an’ Dr. Warstadt came up wit dis BabyLM Challenge fo’ try bring language models little bit closah to how humans undahstand language. Back in January, dey wen send out dis announcement fo’ teams fo’ train language models on da same amount of words dat one 13-year-old human would come across — ’bout 100 million words. Dey going test da models fo’ see how good dey can generate language an’ undahstand da subtle nuances, an’ den crown one winnah. 🎉🧠🔍🔬

Da day da challenge was announced, Eva Portelance, one linguist from McGill University, wen come across it. Her research stay in da space where computer science an’ linguistics meet. Way back in da 1950s, da early A.I. tinkahs wen aim fo’ model human thinkin’ inside computers. Da basic unit of information processing in A.I. be da “neuron,” an’ da early language models in da ’80s an’ ’90s wen get direct inspiration from da human brain. But den, as dem processors wen get mo’ power an’ companies wen start focus on marketable products, da computer scientists wen figure out dat trainin’ language models on plenny data was often easier than tryin’ to fit ’em into structures informed by human psychology. So, as Dr. Portelance wen say, “Dey give us text dat sound humanlike, but no get no real connection between us an’ how dey work.” 🖥️📚🧠

Fo’ da scientists who stay interested in understandin’ how da human mind work, dese big models no stay give ’em much insight. An’ fo’ make mattahs worse, only few researchers get access to ’em ’cause dey need plenty power. As Dr. Wilcox wen say, “Only few industry labs wit lotta resources can afford fo’ train models wit billions of parameters on trillions of words.” 😕💻💡

“An’ even loadin’ ’em stay tough,” Dr. Mueller add. “So now, research in da field stay feelin’ less fair.” 😬💡

But dis BabyLM Challenge, Dr. Portelance say, can be seen as one move away from da competition fo’ make bigger language models, an’ one move toward A.I. dat stay more accessible an’ mo’ intuitive. Even da big industry labs starting fo’ see da potential in dis research. Sam Altman, da top guy at OpenAI, wen recently say dat makin’ da models biggah no go make ’em bettah like how we see improvements in da past. An’ companies like Google an’ Meta stay investin’ in research fo’ make more efficient language models dat get inspiration from how humans undahstand language. ‘Cause suppose one model can generate language wit less data, den maybe dey can make ’em even biggah too. 💡💻📚

But fo’ da folks behind dis BabyLM Challenge, da main goal no really ’bout da money. “Only pride,” Dr. Wilcox say. 👏💪💡


NOW IN ENGLISH

🚀🇰🇵 The Fight to Make A.I. Smaller (and Smarter) 📰

Teaching the big language models fewer words might help them sound more human-like. 💡🤖

When it comes to artificial intelligence chatbots, more of everything usually seems better. Large language models like ChatGPT and Bard generate original conversational text that improves as they are fed more data. Every single day, bloggers take to the internet to explain how the newest advancements — like a cool app that summarizes articles, A.I.-generated podcasts, or a fancy model that can answer any question about professional basketball — will “change everything.” 📚🎙️🏀💻

But to make the A.I. bigger and more capable, you need the processing power that only a few companies have. And now, people are starting to worry that this small group, including Google, Meta, OpenAI, and Microsoft, will end up with complete control over the technology. 😱💻

Plus, the bigger the language models get, the harder they are to understand. People even call them “black boxes,” and not just casual observers; even the designers themselves do. Top figures in the field admit to an uneasy feeling that the goals of A.I. might not align with our own. So while bigger might seem better, it also becomes more mysterious and exclusive. 😬📦💡

Then in January, a group of young geniuses working in natural language processing, which is the branch of A.I. focused on understanding language, threw out a challenge. They called for teams to create language models that are still effective but use datasets much smaller than the ones used by the larger models. The goal is to create mini-models that can do almost as much as the high-end models but are smaller, more accessible, and more compatible with humans. They even gave it a fancy name — the BabyLM Challenge. 😎💪🔬

“We’re challenging people to think small and focus on building efficient systems that can serve way more people,” said Aaron Mueller, a computer scientist from Johns Hopkins University and one of the organizers of BabyLM. Another organizer, Alex Warstadt from ETH Zurich, added, “The challenge puts the spotlight on how humans learn language, instead of just asking, ‘How big can we make our models?’” 🧠🔍💻

Large language models are neural networks designed to predict the next word in a sentence or phrase. They are trained for this task on vast numbers of words from transcripts, websites, novels, and newspapers. A typical model makes guesses based on example phrases, and as it gets closer to the right answer, it adjusts itself. 🤔📖💡

By repeating this process again and again, the model creates maps that show how words connect and relate to each other. In general, the more words a model is trained on, the better it becomes; every phrase gives the model some context, and the more context it gets, the more detailed an impression it forms of each word’s meaning. OpenAI’s GPT-3, released in 2020, was trained on 200 billion words; DeepMind’s Chinchilla, released in 2022, was trained on a trillion. 🌐📚💡
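To make the training idea concrete, here is a toy sketch in Python. It is nothing like how GPT-3 or Chinchilla are actually built (those are neural networks with billions of parameters); it is just a minimal illustration of the same objective: tally which word follows which in a corpus, then guess the most likely next word.

```python
from collections import Counter, defaultdict

# Toy next-word predictor. Real large language models use neural
# networks, but the training objective is the same idea: predict
# the next word, adjusting the model with every example it sees.

corpus = "the cat sat on the mat and the cat ate the fish".split()

# Count which word follows which; every observed phrase adds context.
follows = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    follows[word][nxt] += 1

def predict_next(word):
    """Return the successor seen most often after `word`, if any."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # 'cat': the word seen most often after 'the'
```

The same mechanism hints at why more data helps: with more observed phrases, the counts (or, in a real model, the learned weights) trace a finer-grained map of how words relate.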

To Ethan Wilcox, a linguist from ETH Zurich, the idea that something nonhuman can generate language is really exciting. He started asking: can we use A.I. language models to study how humans learn language? For example, the theory of nativism, which traces back to Noam Chomsky’s early work, claims that humans learn language quickly and efficiently because we are born with a natural understanding of how language works. But guess what? Language models also learn language quickly, and they don’t have any natural understanding of how language works — so maybe the nativism theory isn’t so solid after all. 🤔🗣️📚

The challenge here is that language models learn very differently from humans. Humans have bodies, social lives, and a wealth of sensations. We can smell things, feel things, bump into things, and even taste things. From the very beginning, we are exposed to simple spoken words and syntax that hardly show up in written form. So Dr. Wilcox concluded that a computer, even after being trained on a vast number of written words, can only tell us so much about our own language learning process. 🤷‍♂️👃👅🧠

But what if a language model is only exposed to the same words as a young human? Maybe it would interact with language in ways that could help answer some questions we have about our own language skills. 🤔🗣️📚

So, with the help of about six other colleagues, Dr. Wilcox, Dr. Mueller, and Dr. Warstadt came up with the BabyLM Challenge to try to bring language models a little closer to how humans understand language. Back in January, they sent out an announcement calling for teams to train language models on the same number of words that a 13-year-old human would encounter — about 100 million. They will test the models on how well they generate language and pick up on subtle nuances, and then declare a winner. 🎉🧠🔍🔬
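For a sense of scale, here is a small, hypothetical Python helper for enforcing that kind of cap. The function name and the crude whitespace word count are assumptions for illustration, not part of the BabyLM rules.

```python
def cap_word_budget(lines, budget=100_000_000):
    """Yield lines of raw text until roughly `budget` words have
    been seen, mimicking a BabyLM-style cap of ~100 million words."""
    seen = 0
    for line in lines:
        n = len(line.split())  # crude whitespace word count
        if seen + n > budget:
            break
        seen += n
        yield line
```

By contrast, a trillion-word corpus like Chinchilla’s is roughly ten thousand times that budget.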

The day the challenge was announced, Eva Portelance, a linguist from McGill University, came across it. Her research sits in the space where computer science and linguistics meet. Back in the 1950s, the early A.I. pioneers aimed to model human thinking inside computers. The basic unit of information processing in A.I. is the “neuron,” and the early language models of the ’80s and ’90s drew direct inspiration from the human brain. But as processors grew more powerful and companies began focusing on marketable products, computer scientists found that training language models on massive amounts of data was often easier than fitting them into structures informed by human psychology. As a result, Dr. Portelance said, “They give us text that sounds humanlike, but there’s no real connection between us and how they work.” 🖥️📚🧠

For scientists interested in understanding how the human mind works, these large models offer little insight. And to make matters worse, only a few researchers have access to them, because they require significant computing power. As Dr. Wilcox said, “Only a few industry labs with huge resources can afford to train models with billions of parameters on trillions of words.” 😕💻💡

“And even loading them is tough,” Dr. Mueller added. “So now, research in the field is feeling less fair.” 😬💡

But the BabyLM Challenge, Dr. Portelance said, can be seen as a step away from the race to build bigger language models and a step toward A.I. that is more accessible and more intuitive. Even the large industry labs are starting to see the potential in this research. Sam Altman, the chief executive of OpenAI, recently said that making models bigger won’t yield the same improvements seen in the past. And companies like Google and Meta are investing in research on more efficient language models that draw inspiration from how humans understand language. After all, if a model can generate language from less data, then scaling it up could make it that much more capable. 💡💻📚

But for the folks behind the BabyLM Challenge, the prize was never money. “Just pride,” Dr. Wilcox said. 👏💪💡
