They also announced that they would publish an open-source legal agreement to further facilitate model-sharing partnerships. OpenAI says it is aware of at least five existing GPT-2 replicas. According to a recent statement on its website, the lab released a 774 million parameter model after further research with partners and the AI community into the model's potential for misuse and societal benefit; the version released in February had 124 million parameters.
One of these replicas was published last week, Wired reports. Its creators, Aaron Gokaslan and Vanya Cohen, built it using free cloud computing provided by Google and, like GPT-2, trained it on text from millions of websites linked on Reddit. They say a high-school student with some coding knowledge could do the same. Their goal was to show that you don't have to be a Silicon Valley tech company with millions of dollars and a team of PhDs to create these text-generating programs.
“It allows everyone to have an important conversation about safety, and helps researchers guard against potential future abuses,” Cohen told Wired. Their argument is that it is better to keep these algorithms in the open so that we understand how they work.
These algorithms are designed to generate a complete article on any subject from a human-written prompt, and they can produce remarkably convincing copy, complete with fake quotes and statistics (take, for example, the famous story about a herd of unicorns). But they are not perfect. GPT-2 is a machine-learning program that picks up on statistical patterns in language rather than understanding what it writes. This can lead to a story about a fire underwater, for example, or an article that reads like a bunch of loosely connected sentences thrown together. As the research team behind GPT-2 explains, it performs better on topics related to politics and popular culture (which are well represented in the roughly 8 million web pages used to train the algorithm) than on technical subjects.
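To make the idea of "statistical patterns rather than understanding" concrete, here is a toy sketch of the same principle at a vastly smaller scale: a bigram model that counts which word tends to follow which in a training corpus and then generates text by sampling from those counts. This is not GPT-2's actual architecture (GPT-2 is a transformer neural network, and the corpus, function names, and prompt below are made up for illustration), but it shows why such a model can produce fluent-sounding yet factually untethered output.

```python
import random
from collections import defaultdict

def train_bigram_model(text):
    """Record, for each word, the list of words that followed it in the corpus."""
    words = text.split()
    model = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        model[current_word].append(next_word)
    return model

def generate(model, prompt_word, length=10, seed=0):
    """Walk the chain: each next word is sampled from what followed the current one."""
    random.seed(seed)  # fixed seed only to make the toy example repeatable
    word = prompt_word
    output = [word]
    for _ in range(length):
        followers = model.get(word)
        if not followers:  # dead end: the word never appeared mid-corpus
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

# Hypothetical mini-corpus echoing the unicorn example from the article.
corpus = ("the unicorns lived in a remote valley and "
          "the scientists said the unicorns spoke perfect English")
model = train_bigram_model(corpus)
print(generate(model, "the"))
```

Every pair of adjacent words in the output did occur somewhere in the training text, so each transition is locally plausible, yet the model has no notion of whether the whole sentence makes sense. GPT-2 learns far richer, longer-range patterns, but the same limitation is why it can confidently describe a fire burning underwater.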