Running LLMs Locally
Article Objective
In this article, we will run a large language model (LLM) locally on consumer hardware. Running an LLM locally lets you use it offline and offers privacy, customization, and no usage fees.
Installation
I am using a Windows laptop. To start, download and install Ollama from Ollama.com. Once it is installed, open your command line and try:
ollama --version
If it is installed, you should see the version number.
Next, we will download and run a smaller 1.5B-parameter version of the open-source DeepSeek-R1 model.
To download:
ollama pull deepseek-r1:1.5b
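Once the download finishes, you can confirm the model is available locally with:
ollama list
which lists every model you have pulled, along with its size on disk.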
Next, to start an interactive session in the terminal:
ollama run deepseek-r1:1.5b
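If you just want a single answer rather than an interactive session, you can also pass a prompt as an argument (the prompt text here is only an example):
ollama run deepseek-r1:1.5b "Summarize what a large language model is in one sentence."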
Either way, you can now chat directly with the model! For more capable responses, we can use models with larger parameter counts, but they require more RAM. As a general rule of thumb, you want roughly 1 GB of RAM per billion parameters, so the 1.5B model here needs only a couple of gigabytes, while a 7B model wants around 8 GB. The main attraction of running LLMs locally is privacy: instead of sending queries to wherever the LLM is hosted, you are sending them directly to your own hardware.
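Because Ollama also serves a local HTTP API (on port 11434 by default), other programs on your machine can query the model without anything leaving your computer. As a minimal sketch with an illustrative prompt, run from a Unix-style shell such as Git Bash or WSL (PowerShell quoting differs):
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
The reply comes back as a JSON object whose response field holds the generated text.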