Ian Mulvany

October 24, 2023

Three interesting Machine Learning Papers - Sept 2023

I follow a monthly roundup of Machine Learning papers by Davis Blalock (https://dblalock.github.io/about/). It's a fantastic way to keep up with where the focus of the research community is, even if only at a very high level. You can catch it here: https://dblalock.substack.com/

I'd say the trends I've been seeing tend to be around areas such as optimisation, fundamental architectures, reasoning, safety, and implementation. 

My impression is that there are tons of optimisations available: things like how to select the most appropriate model, how to batch queries, and how to batch work through the computational pipeline. The story we have today about energy inefficiency is clearly driving a lot of work in this area, and I think there will be a lot of improvement here. 
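
To make the batching idea concrete, here is a minimal sketch, not tied to any particular API or paper: it just groups incoming prompts into fixed-size batches so a model server could process them in one call. The `model.generate` call in the comment is a hypothetical stand-in.

```python
from typing import Iterable, Iterator

def batch_queries(prompts: Iterable[str], batch_size: int = 8) -> Iterator[list[str]]:
    """Yield lists of at most `batch_size` prompts so they can be sent together."""
    batch: list[str] = []
    for prompt in prompts:
        batch.append(prompt)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

# for batch in batch_queries(stream_of_prompts):
#     responses = model.generate(batch)   # hypothetical batched call
```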

Most of the papers are beyond me, but his summaries are accessible, so here are some papers that I liked the sound of from his recent roundup. What I like about these papers is that they are all showing us how to use this technology better. That there are still so many papers like this is a strong signal that we are very much at the early stages of what these kinds of tools can do. 


Pause tokens
This paper shows that training models with artificial pauses in their inference steps gives a big improvement in performance. The authors suggest that this might be because it allows the LLM to spend more time processing before responding (more thinking time?). In this way it may act a bit like a chain of reasoning. I'm reminded of Deep Thought from The Hitchhiker's Guide to the Galaxy, which took seven and a half million years to produce an answer that no one understood. 
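
The mechanism is simple enough to sketch. This is a minimal illustration, not the authors' code: the token ids, the <pause> token id, and the number of pauses are all made up for the example.

```python
# Illustrative sketch only: append learnable <pause> tokens to the prompt so
# the model gets extra forward passes before it has to commit to its first
# answer token.

def add_pause_tokens(prompt_ids: list[int], pause_token_id: int, num_pauses: int = 10) -> list[int]:
    """Return the prompt with `num_pauses` copies of the <pause> token appended."""
    return prompt_ids + [pause_token_id] * num_pauses

# During decoding, anything produced at the pause positions is ignored;
# the answer is read off only after the final pause token.
toy_prompt = [101, 2054, 2003, 1996, 3007, 1997, 2605, 102]   # made-up token ids
augmented = add_pause_tokens(toy_prompt, pause_token_id=50000)
```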


A classification of hallucinations: 
This is a very useful survey paper on the state of our current understanding of how to deal with hallucinations. 

From their paper: 

> We argue that the definition appears to have considerably expanded due to the versatility of LLMs. To this end, we categorise hallucination within the context of LLMs as follows:
> • Input-conflicting hallucination, where LLMs generate content that deviates from the source input provided by users;
> • Context-conflicting hallucination, where LLMs generate content that conflicts with previously generated information by itself;
> • Fact-conflicting hallucination, where LLMs generate content that is not faithful to established world knowledge.
> We present examples for each type of hallucination in Table 1, and discuss them in detail below.

The paper introduced me to a wonderful term: hallucination snowballing (Azaria and Mitchell, 2023). This is where an LLM starts out on a response and, even if there is a chance it could catch that it is wrong, it optimises for getting out the answer it has already committed to. I guess what it might look like if it didn't do this would be the LLM's response being something along the lines of "oh, hang on, that might not be right". Basically, LLMs like to mansplain. 

The paper does outline the approaches available to combat this; few of them are surprising: 

- Better training data 
- Human feedback 
- Intercepting the output during decoding 
- Connect to external databases 
- Uncertainty estimation 
- Use multiple LLMs in tandem (a rough sketch of this idea follows below)

On whether models have any internal signal of truthfulness, one quoted result from Azaria and Mitchell (2023) stood out:

> Experimental results indicate that LLMs might "know" when the statements they generate are false, and SAPLMA can effectively extract such information.
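
As a rough illustration of the "multiple LLMs in tandem" idea, here is a minimal sketch. The `ask` function and the model names are placeholders I've assumed, not a real API or anything from the paper; the point is just that agreement between models can serve as a crude confidence signal.

```python
# Query several models with the same question and use their agreement as a
# crude confidence signal for possible fact-conflicting hallucinations.
from collections import Counter

def ask(model_name: str, question: str) -> str:
    """Stand-in for a call to the model named `model_name`."""
    raise NotImplementedError

def cross_check(question: str, models: list[str]) -> tuple[str, float]:
    """Return the majority answer and the fraction of models that agree."""
    answers = [ask(m, question).strip().lower() for m in models]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / len(answers)

# A low agreement score is a hint that the answer needs checking:
# answer, agreement = cross_check("Who wrote Middlemarch?", ["model-a", "model-b", "model-c"])
```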

Overall, it's good to see work like this. 


LLMs are optimizers 
From their paper: 

> In each optimization step, the LLM generates new solutions from the prompt that contains previously generated solutions with their values, then the new solutions are evaluated and added to the prompt for the next optimization step.
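
That loop is easy to sketch in outline. This is a rough illustration of the idea described in the quote, not the authors' code: `llm` and `evaluate` are stand-in callables supplied by whoever runs it.

```python
# Optimisation-by-prompting, in outline: keep a scored history of solutions in
# the prompt, ask the LLM for a better one, score it, and feed it back in.

def optimise(llm, evaluate, task_description: str, steps: int = 20):
    """Return the best (solution, score) pair found after `steps` iterations."""
    history = []  # list of (solution, score) pairs
    best = None
    for _ in range(steps):
        scored = "\n".join(
            f"solution: {s}\nscore: {v}"
            for s, v in sorted(history, key=lambda pair: pair[1])
        )
        prompt = (
            f"{task_description}\n"
            f"Here are previously proposed solutions and their scores:\n{scored}\n"
            f"Propose a new solution that achieves a higher score."
        )
        candidate = llm(prompt)
        score = evaluate(candidate)
        history.append((candidate, score))
        if best is None or score > best[1]:
            best = (candidate, score)
    return best
```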



References:
Goyal, S., Ji, Z., Rawat, A., Menon, A., Kumar, S. & Nagarajan, V. (2023). Think before you speak: Training Language Models With Pause Tokens. arXiv: 2310.02226

Zhang, Y., Li, Y., Cui, L., Cai, D., Liu, L., Fu, T., Huang, X., Zhao, E., Zhang, Y., Chen, Y., Wang, L., Luu, A., Bi, W., Shi, F. & Shi, S. (2023). Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models. arXiv: 2309.01219

Yang, C., Wang, X., Lu, Y., Liu, H., Le, Q., Zhou, D. & Chen, X. (2023). Large Language Models as Optimizers. arXiv: 2309.03409

Classifications from OpenAI:
categories: 

1. monthly roundup of ml papers
2. optimisation in ml
3. fundamental architectures in ml
4. reasoning in ml
5. safety in ml
6. implementation of ml models
7. training models with artificial pauses
8. classification of hallucinations in llms
9. approaches to combat hallucinations in llms
10. llms as optimizers


About Ian Mulvany

Hi, I'm Ian - I work on academic publishing systems. You can find out more about me at mulvany.net. I'm always interested in engaging with folk on these topics. If you have made your way here, don't hesitate to reach out if there is anything you want to share, discuss, or ask for help with!