Chris Pal

Bernhard Schölkopf

2024-03-21

ArXiv (preprint)

Multi-Resolution Continuous Normalizing Flows

Vikram Voleti

Chris Finlay

Adam M. Oberman

2024-03-21

Annals of Mathematics and Artificial Intelligence (published)

IntentGPT: Few-shot Intent Discovery with Large Language Models

Juan A. Rodriguez

Nicholas Botzer

David Vazquez

Marco Pedersoli

Issam Hadj Laradji

2024-03-11

ICLR.cc/2024/Workshop/LLMAgents (poster)

Self-evaluation and self-prompting to improve the reliability of LLMs

Alexandre Piché

Aristides Milios

Dzmitry Bahdanau

To be deployed safely, Large Language Models (LLMs) must be capable of dynamically adapting their behavior based on their level of knowledge and uncertainty associated with specific topics. This adaptive behavior, which we refer to as self-restraint, is non-trivial to teach since it depends on the internal knowledge of an LLM. By default, LLMs are trained to maximize the next-token likelihood, which does not teach the model to modulate its answer based on its level of uncertainty. In order to learn self-restraint, we devise a simple objective that can encourage the model to produce generations that it is confident in. To optimize this objective, we introduce ReSearch, an iterative search algorithm based on self-evaluation and self-prompting. Our method results in fewer hallucinations overall, both for known and unknown topics, as the model learns to selectively restrain itself. In addition, our method elegantly incorporates the ability to decline when the model assesses that it cannot provide a response without a high proportion of hallucination.

2024-03-04

ICLR.cc/2024/Workshop/SeT_LLM (published)

Reinforcement Learning for Blind Stair Climbing with Legged and Wheeled-Legged Robots

Simon Chamorro

Victor Klemm

Miguel de La Iglesia Valls

Roland Siegwart

In recent years, legged and wheeled-legged robots have gained prominence for tasks in environments predominantly created for humans across various domains. One significant challenge faced by many of these robots is their limited capability to navigate stairs, which hampers their functionality in multi-story environments. This study proposes a method aimed at addressing this limitation, employing reinforcement learning to develop a versatile controller applicable to a wide range of robots. In contrast to conventional velocity-based controllers, our approach builds upon a position-based formulation of the RL task, which we show to be vital for stair climbing. Furthermore, the methodology leverages an asymmetric actor-critic structure, enabling the use of privileged information from simulated environments during training while eliminating the reliance on exteroceptive sensors during real-world deployment. Another key feature of the proposed approach is the incorporation of a boolean observation within the controller, enabling the activation or deactivation of a stair-climbing mode. We present our results on different quadrupedal and bipedal robots in simulation and showcase how our method allows the balancing robot Ascento to climb 15 cm stairs in the real world, a task that was previously impossible for this robot.

2024-02-09

ArXiv (preprint)

LitLLM: A Toolkit for Scientific Literature Review

Shubham Agarwal

Issam Hadj Laradji

Laurent Charlin

Conducting literature reviews for scientific papers is essential for understanding research, its limitations, and building on existing work. It is a tedious task, which makes an automatic literature review generator appealing. Unfortunately, many existing works that generate such reviews using Large Language Models (LLMs) have significant limitations. They tend to hallucinate (generate non-actual information) and ignore the latest research they have not been trained on. To address these limitations, we propose a toolkit that operates on Retrieval Augmented Generation (RAG) principles, specialized prompting and instructing techniques with the help of LLMs. Our system first initiates a web search to retrieve relevant papers by summarizing user-provided abstracts into keywords using an off-the-shelf LLM. Authors can enhance the search by supplementing it with relevant papers or keywords, contributing to a tailored retrieval process. Second, the system re-ranks the retrieved papers based on the user-provided abstract. Finally, the related work section is generated based on the re-ranked results and the abstract. There is a substantial reduction in time and effort for literature review compared to traditional methods, establishing our toolkit as an efficient alternative. Our open-source toolkit is accessible at https://github.com/shubhamagarwal92/LitLLM and as a Hugging Face Space (https://huggingface.co/spaces/shubhamagarwal92/LitLLM), with a video demo at https://youtu.be/E2ggOZBAFw0.

2024-02-02

ArXiv (preprint)

Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models

Pablo Pernias

Dominic Rampas

Mats Leon Richter

Marc Aubreville

2024-01-16

ICLR.cc/2024/Conference (oral presentation)