NVIDIA has just released Chat with RTX, a free AI chatbot for users running an NVIDIA GeForce RTX 30 or 40 Series GPU or an NVIDIA RTX Ampere or Ada Generation GPU. Put simply, this tech demo uses retrieval-augmented generation (RAG), NVIDIA TensorRT-LLM software, and NVIDIA RTX acceleration to bring generative AI capabilities to local, GeForce-powered Windows PCs.
Once installed, you’ll be able to quickly and easily connect local files on a PC as a dataset to an open-source large language model like Mistral or Llama 2. This means that instead of searching through notes or saved content, users can simply type queries. Chat with RTX supports various file formats, including .txt, .pdf, .doc/.docx and .xml. Download it here now.
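To make the RAG idea concrete, here is a minimal, self-contained sketch of the retrieval step: local text chunks are scored against a query and the best match is stitched into a prompt for a local model. This is an illustrative toy using simple word overlap, not NVIDIA's TensorRT-LLM implementation, and the sample notes are hypothetical.

```python
# Toy sketch of retrieval-augmented generation (RAG): score local text chunks
# against a query by word overlap, then build a prompt that a local LLM
# (e.g. Mistral or Llama 2) would answer from. Illustrative only -- the real
# app uses embeddings and TensorRT-LLM acceleration.
import re
from collections import Counter

def tokenize(text):
    # Lowercase alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(query, chunks, k=1):
    # Rank chunks by how many query words they share (multiset intersection).
    q = Counter(tokenize(query))
    scored = sorted(
        chunks,
        key=lambda c: sum((Counter(tokenize(c)) & q).values()),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, chunks, k=1):
    # Prepend the retrieved context so the model answers from local data.
    context = "\n".join(retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical notes standing in for content parsed from local .txt/.pdf files.
notes = [
    "The quarterly report is due on March 15.",
    "Chat with RTX runs locally on GeForce RTX 30 and 40 Series GPUs.",
]
print(build_prompt("When is the quarterly report due?", notes))
```

A production pipeline would replace the word-overlap scorer with vector embeddings, but the flow is the same: retrieve relevant local content first, then let the model generate an answer grounded in it.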
“Chat with RTX shows the potential of accelerating LLMs with RTX GPUs. The app is built from the TensorRT-LLM RAG developer reference project, available on GitHub. Developers can use the reference project to develop and deploy their own RAG-based applications for RTX, accelerated by TensorRT-LLM. Learn more about building LLM-based applications,” said NVIDIA.