local platform for LLMs
21 Mar 2025
Week Ten
It’s worth looking at the AI and diagrams article that was posted to Hacker News, along with the HN comments, which focus more generally on prompting techniques. The article discusses using AI to generate diagrams and concludes that it works well for simple diagrams and for brainstorming, but struggles with complex diagrams and with systems whose workings are not well documented.
Last week’s edition:
Pause to look at this week’s edition.
Today we’ll create a local RAG chatbot, using Ollama, DeepSeek R1, LangChain, ChromaDB, and Gradio.
We’ll need to install quite a bit of software, and then we’ll work through the localRAGchatbot.qmd file.
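Before installing anything, it may help to see the core idea in miniature. The sketch below is a stdlib-only toy of the retrieval step in RAG — pick the stored chunk most similar to the question and prepend it to the prompt — not the actual code from localRAGchatbot.qmd, which uses ChromaDB embeddings and LangChain rather than word counts:

```python
# A stdlib-only toy of the "R" in RAG: pick the stored chunk most similar to
# the query (bag-of-words cosine similarity) and prepend it to the prompt.
# The real chatbot uses ChromaDB embeddings and LangChain instead.
import math
from collections import Counter

def tokenize(text: str) -> Counter:
    # Lowercase and strip basic punctuation before counting words.
    return Counter(text.lower().replace(".", " ").replace(",", " ").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str]) -> str:
    # Return the chunk with the highest similarity to the query.
    q = tokenize(query)
    return max(chunks, key=lambda c: cosine(q, tokenize(c)))

chunks = [
    "Ollama runs large language models locally on your machine.",
    "ChromaDB stores document embeddings for similarity search.",
    "Gradio builds simple web interfaces in Python.",
]
question = "how do I run a language model locally"
context = retrieve(question, chunks)
prompt = f"Use this context to answer.\n\nContext: {context}\n\nQuestion: {question}"
print(context)
```

The augmented prompt is then what gets sent to the LLM; the model answers from the retrieved context rather than from its training data alone.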
Ollama is an open-source framework designed to facilitate the deployment of large language models on local environments. It aims to simplify the complexities involved in running and managing these models, providing a seamless experience for users across different operating systems. (source: nixos wiki)
Visit Ollama’s website for detailed installation instructions, or install directly via Homebrew on macOS:
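The Homebrew route is a single command:

```shell
# Install the Ollama CLI and server via Homebrew (macOS).
brew install ollama
```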
For Windows and Linux, follow the platform-specific steps provided on the Ollama website.
DeepSeek R1 is a large language model (LLM) developed by DeepSeek, a Chinese AI company, and released on 10 January 2025. It provides advanced natural language processing capabilities, including text generation and question answering. It is enormously popular but also controversial because of its adherence to Chinese government censorship policies. A big attraction is that we can run it locally, meaning that your data never leaves your machine.
Pull the DeepSeek R1 model onto your machine:
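The pull command is:

```shell
# Download the default DeepSeek R1 model (7B) into the local model store.
ollama pull deepseek-r1
```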
By default, this downloads the 7B DeepSeek R1 model (which is 4.4GB). If you want a specific distilled variant (e.g., 1.5B, 7B, 14B), or the full 671B-parameter model (tag 671b, which is 404GB), just specify its tag, like:
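For example:

```shell
# Pull a specific variant by tag, e.g. the 1.5B distilled model:
ollama pull deepseek-r1:1.5b
# ...or the full 671B model (404GB -- make sure you have the disk space):
ollama pull deepseek-r1:671b
```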
Do this in a separate terminal tab or a new terminal window:
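Start the Ollama server:

```shell
# Start the Ollama server; it runs in the foreground, so leave this tab open.
ollama serve
```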
You must keep this terminal window open. In other words, you must keep Ollama running while you’re using DeepSeek R1.
Once installed, you can interact with the model right from your terminal:
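```shell
# Open an interactive chat session with the default DeepSeek R1 model.
ollama run deepseek-r1
```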
Or, to run the 1.5B distilled model:
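```shell
# Run the 1.5B distilled variant instead.
ollama run deepseek-r1:1.5b
```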
Or, to prompt the model:
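You can pass a one-off prompt on the command line (the prompt here is just an illustration):

```shell
# Send a single prompt and print the model's reply.
ollama run deepseek-r1 "Explain retrieval-augmented generation in one paragraph."
```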
On my machine, the model manifest is located at
The model itself is located at
divided across several files. The name deepseek-r1 is important because that is the name Ollama uses to refer to the model. Any other models you download are stored in the same structure and are likewise accessed by name.
You can verify that you have the 7B model by saying
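One way to check (assuming Ollama's default model directory `~/.ollama/models`, which is where it lives on my macOS machine — yours may differ):

```shell
# List installed models with their names, tags, and sizes.
ollama list
# Or inspect the blob files directly (path is the usual default, an assumption).
ls -lh ~/.ollama/models/blobs
```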
One of the blob files should be about 4.4GB.
We now turn our attention to the localRAGchatbot.qmd file. You must download it from Canvas, along with Xiao2025.pdf. The latter file is a book, Xiao and Zhu (2025).
The first thing to do is copy and paste the pip install ... commands into a terminal window. Then you can try to render the localRAGchatbot.qmd file in RStudio.
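The authoritative package list is in the qmd file itself; a typical dependency set for this stack (an assumption on my part — defer to the file) looks something like:

```shell
# Assumed dependency list for the Ollama + LangChain + ChromaDB + Gradio stack.
# Copy the actual pip install lines from localRAGchatbot.qmd.
pip install langchain langchain-community chromadb gradio pypdf
```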
If you can’t render the file, there are several possible causes. The first thing to try is to download the localRAGchatbot.py file and run it directly by typing python localRAGchatbot.py at a terminal prompt; it takes about seven minutes on my machine. If that doesn’t work, the problem is probably in your Python installation. If it does work, the problem is probably in your RStudio installation.
END
This slideshow was produced using Quarto
Fonts are Roboto Light, Roboto Bold, and JetBrains Mono Nerd Font