AFAIK, currently there’re several ways to run LLM(large language model) locally.
Ollama
Go to “Ollama“ website to download the “ollama” application for your OS(Linux/macOS/Windows). Install it. Then you need to select a model to run with Ollama. Go to this page to select. For example, I select “deepseek-r1:32b” from here.
Run the following command to start.
1 | ollama run deepseek-r1:32b |
The first time you run a given model, it will be downloaded and then start.
llamafile(standalone with given model file)
From here.
In this mode, you need to have the “llamafile” application and a model file.
Download “llamafile” from “here“, such as “llamafile-0.9.0”. Extract the zip file if required and “chmod +x” to make it executable.
Go to ModelScope and select a “GGUF” format module file, such as “Qwen2.5-Coder-7B-Instruct-GGUF“ and download.
Run.
1 | # Before running, you can run "./llamafile-0.9.0 --help" to get the knowledge |
llamafile(all-in-one)
From here.
In this mode, all you need is to download one file such as “llava-v1.5-7b-q4.llamafile“, make it executable(“chmod +x”) and run:
1 | ./llava-v1.5-7b-q4.llamafile |
After that your browser should be opened automatically and display a chat interface. You can adjust the settings and start chatting.
An interesting part you may not notice above is that, one “llamafile” binary file can be used in almost all known operating systems like “Linux”, “BSD”, “macOS” and “Windows”, which is accomplished by combining “llama.cpp“ and “Cosmopolitan Libc“.
Resources
Hugging Face: The platform where the machine learning community collaborates on models, datasets, and applications.
ModelScope: 共享/共创/共进,构建持续创新的 AI 开源生态。And the team ever create a Python library “modelscope“. The library seeks to bring together most advanced machine learning models from the AI community, and streamlines the process of leveraging AI models in real-world applications.
TrendShift: A dashboard shows the trending GitHub repositories and developers.
build-your-own-x: This Git repository is a compilation of well-written, step-by-step guides for re-creating our favorite technologies from scratch. For example, “Python: A BitTorrent client in Python 3.5“.