Ian Mulvany

November 22, 2022

Two pieces of news regarding large language models

[Image: a scientist writing a paper with a robot, generated by DALL·E]

Within the last week, Papers with Code and Facebook released Galactica, a large language model trained on the scientific research literature: https://galactica.org. It looks like you could download the model and get started with it pretty quickly (though I failed to get it running on a Mac M1 due to TensorFlow issues).
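
For anyone who wants to try it, the model was published as a pip-installable package, galai. Here is a minimal sketch based on my reading of the project's README at the time; the checkpoint name and prompt are just illustrative, and the API may well have changed:

```python
# pip install galai
import galai as gal

# Load one of the published checkpoints. "standard" is the mid-sized
# 6.7B-parameter model; smaller variants such as "mini" and "base"
# exist for more modest hardware.
model = gal.load_model("standard")

# Galactica is prompted with plain text; opening a LaTeX display
# block nudges it towards completing an equation.
print(model.generate("The formula for scaled dot product attention is:\n\n\\["))
```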

In addition to synthesising the research literature, it can generate LaTeX equations and provide literature references (I would have loved to see whether it could automatically convert citations between reference styles!).

There were two schools of thought. Yann LeCun from Facebook, whose team co-developed Galactica, said:
https://twitter.com/ylecun/status/1593229143545896960 (basically, this is a tool to help write papers, much in the same way that Copilot is a tool to help write code; it should not be used as a primary source).

Many others tried it out and described it as a bullshit generator: something with no understanding that could quickly generate text that looked convincing but was totally meaningless.

As a result of the kerfuffle, the public Galactica demo has been taken down.


The second piece of interesting news is that GPT-4 is rumoured to be released soon: https://thealgorithmicbridge.substack.com/p/gpt-4-rumors-from-silicon-valley. There is no clear indication of what GPT-4 will look like exactly, but there are claims that it will be as big an improvement over GPT-3 as GPT-3 was over GPT-2. In practice, that should mean a larger number of tasks it can perform, with a better level of accuracy.

These two pieces of news, taken together, indicate that while large language models do not contain any insight, they should continue to evolve into powerful tools in support of knowledge creation, and the range of tasks they can assist with should equally increase over time.

It is likely that we are not yet near a ceiling on performance for these models, so it is a very good time to think about how we might best work with them.

About Ian Mulvany

Hi, I'm Ian. I work on academic publishing systems. You can find out more about me at mulvany.net. I'm always interested in engaging with folk on these topics, so if you have made your way here, don't hesitate to reach out if there is anything you want to share, discuss, or ask for help with!