Machine Learning, Human Teaching
Posted 28 Jul at 8:57 pm in Productivity
- Large Language Models may be getting “dumber” over time
- ChatGPT 3.5 and 4 have been rather inconsistent
- We are at the tip of the iceberg for these programs and systems
Large Language Models and Machine Learning – Tech’s Latest Star
If you have opened any form of social media in the last six months or so, chances are you’ve had the concept of machine learning, artificial intelligence (AI), or large language models shoved in your face more than once. The conversation seems to pick up right where the Metaverse discussion fizzled out… touted as the future and heavily abused as a buzzword. While the Metaverse is still a developing idea, as is machine learning, many companies seem to latch onto these ideas for the sake of “keeping up with the Joneses”, if you will. That may sound doubtful, but I am rather optimistic about the future of machine learning; I am more interested, at this time, in discussing the direction these models are heading.
Breaking It Down
What Is Machine Learning?
While it has been around for some time (since the late 1940s and early 1950s), machine learning has recently stepped into the spotlight thanks to developments in artificial intelligence. In a more traditional style, developers code rules for a system to follow in order to achieve a certain task. In machine learning, developers instead train machines to learn those rules themselves. The system is fed a set of inputs (data) and expected outputs (results), and it uses this information to create an algorithm that can produce the desired output from a given input. There are a few different types of machine learning (supervised, unsupervised, and reinforcement), but that is the general idea.
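To make that concrete, here is a minimal sketch of supervised learning in Python. The library (scikit-learn) and the toy dataset are my own illustrative choices, not something from this article: we hand the system example inputs and expected outputs, and it works out the rule on its own.

```python
# A minimal sketch of supervised machine learning: instead of hand-coding a rule,
# we give the model example inputs and expected outputs and let it infer the rule.
# (Illustrative only -- scikit-learn and this toy dataset are assumptions, not from the article.)
from sklearn.linear_model import LinearRegression

# Inputs (hours studied) and expected outputs (test scores)
X = [[1], [2], [3], [4], [5]]
y = [52, 61, 70, 79, 88]

model = LinearRegression()
model.fit(X, y)              # the "learning" step: the model derives its own rule from the examples

print(model.predict([[6]]))  # about 97 -- the learned rule applied to an input it has never seen
```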
What Is a Large Language Model?
Large language models, or LLMs for short, are a breakthrough in the field of Natural Language Processing (NLP) that has revolutionized the way computers can understand and generate human language. LLMs fall into the broader category of deep learning and the domain of neural networks.
LLMs learn through “unsupervised learning”, meaning they learn from a vast amount of text data with no explicit human supervision. Typically, these models are trained on enormous datasets sourced from everything from your favorite book to this morning’s article in The Inquirer.
During training, the LLM processes the text data and learns to predict the probability of a word or phrase appearing in a given context. This process, known as language modeling, equips the LLM with a deeper understanding of grammar, syntax, and semantics.
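A toy example helps show what “predict the probability of a word in a given context” means. The snippet below builds a simple bigram model from a made-up corpus; real LLMs use neural networks over enormous datasets, but the objective is the same. The corpus and code here are illustrative assumptions, not anything from the article or from how GPT models are actually implemented.

```python
# Toy language modeling: estimate the probability of the next word from counts in a corpus.
# (Illustrative only -- the corpus is made up; real LLMs learn these probabilities with neural networks.)
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which (a bigram model)
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

# Probability of each word appearing after "the"
counts = following["the"]
total = sum(counts.values())
for word, count in counts.items():
    print(f"P({word!r} | 'the') = {count / total:.2f}")
# P('cat' | 'the') = 0.50, P('mat' | 'the') = 0.25, P('fish' | 'the') = 0.25
```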
“Always Be Learning”
As impressive as this technology is, many recent machine learning models learn from our inputs. Circling back to the idea of AI and machine learning, AI’s performance depends on the data it’s trained on and the quality of that data. So, if there are poor user entries, then there are rather “poor” outputs. If the training data exhibits bias, noise, or limited scope, the AI model may not perform as desired, and it may appear “dumber” in certain situations.
Imagine you have a helpful robot friend who loves learning new things and giving you answers to your questions. It’s like a smart assistant that can talk to you and understand what you say.
Now, the robot friend learns by listening to your questions and the answers you give. It uses this information to get better at answering similar questions in the future.
The caveat is, if you feed your robot friend wrong information or provide confusing answers, it will get confused. It will try its best to learn from what you say, but if what you say is incorrect or misleading, the robot might start giving you wrong answers.
In the same way, if developers train an AI system on poor user inputs or wrong data, it can start making mistakes and providing inaccurate responses.
If it learns from biased or unreliable information, it might start giving biased or unreliable answers.
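The “robot friend” point translates directly into code. In this hedged sketch (synthetic data and scikit-learn are my own choices, not from the article), the same classifier is trained once on correct labels and once on labels where a chunk of the answers have been flipped; the second “teacher” typically produces a worse student.

```python
# Sketch: the same classifier trained on clean vs. partly-wrong labels.
# (Synthetic data and scikit-learn are illustrative assumptions, not from the article.)
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)          # the "true" rule we want learned

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Teacher A: correct labels
clean = LogisticRegression().fit(X_train, y_train)

# Teacher B: 30% of the labels flipped -- the "wrong information" case
noisy_y = y_train.copy()
flip = rng.random(len(noisy_y)) < 0.30
noisy_y[flip] = 1 - noisy_y[flip]
noisy = LogisticRegression().fit(X_train, noisy_y)

print("clean-label accuracy:", clean.score(X_test, y_test))   # near-perfect
print("noisy-label accuracy:", noisy.score(X_test, y_test))   # typically lower -- garbage in, garbage out
```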
Deep Dive
The Numbers
Through the remainder of this article, the discussion will turn to the numbers behind large language models (such as OpenAI’s GPT-3.5 and GPT-4) and how they have developed (or not) over time.
As mentioned, these language models are continuously updated and will progress or, as the data will show, regress over time. See Figure 1 below to further understand the drift in behavior over time.
In short, the figure visualizes answers to the same questions asked a few months apart, in March and June of 2023.
These performance shifts are important for a few reasons. The language model learns over time, and not all of that learning has been beneficial. When looking at Figure 1, you may notice that some data sets have actually worsened over time. The bottom-right example of the figure visualizes this pretty well; in March, 52% of GPT-4’s code was directly executable, though in June only 10% of its code could be run directly.
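To give a sense of what “directly executable” means here, the sketch below checks whether a model’s raw answer runs as-is, without a human stripping out prose or markdown fences first. This is a hypothetical harness of my own, not the actual evaluation code behind the figure.

```python
# Rough sketch of a "directly executable" check: does the model's raw answer
# parse and run as-is, without cleaning it up first?
# (Hypothetical harness -- an assumption, not the study's actual evaluation code.)

def is_directly_executable(model_output: str) -> bool:
    try:
        compile(model_output, "<llm-answer>", "exec")   # syntax check only
        return True
    except SyntaxError:
        return False

runnable = "def add(a, b):\n    return a + b"
wrapped = "Sure! Here's the code:\n```python\ndef add(a, b):\n    return a + b\n```"

print(is_directly_executable(runnable))   # True
print(is_directly_executable(wrapped))    # False -- the extra text breaks direct execution
```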
Monitoring Performance
A study performed by researchers at Stanford and UC Berkeley used the question of whether or not a given integer is prime as a benchmark; it is a fairly simple question, though it requires some reasoning. On that prime-number question, in March 2023 GPT-4 sat at an impressive 97.6% accuracy… dropping all the way to 2.4% in June. Conversely, GPT-3.5 made a leap from 7.4% to 86.8% accuracy.
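Here is a rough sketch of how accuracy on a task like this can be tracked over time: ask the model a yes/no question and score the answer against a ground-truth check. The `ask_model` function below is a hypothetical stand-in for a real GPT-3.5 or GPT-4 API call (here it just answers “yes” so the example runs on its own); the scoring idea, not the plumbing, is the point.

```python
# Sketch of measuring accuracy on the "is this integer prime?" task:
# ask the model a yes/no question and score it against ground truth.
# (`ask_model` is a hypothetical placeholder, not a real API call.)

def is_prime(n: int) -> bool:
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def ask_model(question: str) -> str:
    return "yes"   # placeholder answer; swap in an actual model call to use this for real

def accuracy(numbers) -> float:
    correct = 0
    for n in numbers:
        answer = ask_model(f"Is {n} a prime number? Answer yes or no.")
        says_prime = answer.strip().lower().startswith("yes")
        correct += (says_prime == is_prime(n))
    return correct / len(numbers)

print(accuracy([7, 10, 13, 20]))   # 0.5 -- the always-"yes" stub only gets the actual primes right
```

Running the same battery of questions in March and again in June, and comparing the two accuracy numbers, is essentially how the drift in Figure 1 is expressed.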
The data shows it, so I likely do not need to say it: these programs provide inconsistent results. While they are great at some tasks, they are not great at all tasks. It is unfair to pin this entirely on the user, but we can infer that some human intervention has taken place, affecting the performance of these models.
In Closing
There are clearly some kinks to be worked out in the LLMs shown here, and in LLMs as a whole. That being said, they are impressive pieces of technology and show promise for what is to come. While they aren’t perfect, they aren’t meant to be; they are meant to learn over time, and if their teacher (us) fails to be perfect, then we should not expect these systems to perform with perfection either.
To read more topics like this, check out some of our other blogs.