Show HN: Bodhi App – Run Open Source/Weights HuggingFace LLMs Locally

github.com

21 points by anagri 6 days ago

Hi HN,

I'm excited to share Bodhi App, a tool designed to simplify running open-source Large Language Models (LLMs) locally on your laptop. We currently support M2 Macs, and we plan to support other platforms as our community grows.

# Problem

To use LLMs, you typically need to purchase a subscription from a provider like OpenAI or Anthropic, or spend OpenAI API credits through a compatible Chat UI. These options not only cost money but also raise data-security and privacy concerns.

Many laptops are capable of running powerful open-source LLMs, but for non-tech users, setting them up can be challenging.

# Solution: Bodhi App

Bodhi App allows you to run LLMs on your own hardware, ensuring data privacy and cost savings. Our goal is to bring the power of LLMs to everyone.

Built with non-tech users in mind, Bodhi App ships with a simple Chat UI, making it easy to start conversing with an LLM. It also exposes OpenAI-compatible APIs, enabling other apps to use LLM services without relying on external providers.
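For example, once a model is running, any OpenAI-style client can talk to the local server over plain HTTP. Here is a minimal curl sketch (the port and model alias below are assumptions; substitute the values from your setup):

```bash
# Ask a locally running model for a chat completion via the
# OpenAI-compatible API (port 1135 and the model alias are assumptions).
curl http://localhost:1135/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3:instruct",
    "messages": [{"role": "user", "content": "Hello from my laptop!"}]
  }'
```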

# Features

Bodhi App currently supports:

1. Running open-source LLMs in GGUF format from Huggingface repositories.

2. A built-in Chat UI to quickly start conversations with an LLM.

3. A powerful and familiar CLI to download and configure models from the Huggingface ecosystem.

4. Exposing LLMs as OpenAI-compatible APIs for use by other apps.

---

# Feature Comparison: Bodhi App vs. Ollama

| | Bodhi App | Ollama |
|---|---|---|
| Target audience | Non-tech users | Users with some technical insight |
| Chat UI | Simple built-in Chat UI to get started quickly | No inbuilt Chat UI |
| Model setup | Integrates with the Huggingface ecosystem: run a model by repo/filename, with chat templates taken from tokenizer_config.json | Requires baking the model via a custom process: a Modelfile plus a Golang template for chat templates |
| Platforms | Currently Mac M2 only | Various OS platforms |

Bodhi App leverages the Huggingface ecosystem, avoiding the need to reinvent the wheel with Modelfiles and custom chat templates, and making it easier to get started quickly.
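To make the difference concrete, here is a sketch of Ollama's baking step next to Bodhi's approach. FROM and TEMPLATE are real Modelfile directives, but the GGUF path and the Go template below are illustrative placeholders, not an exact recipe:

```bash
# Ollama: describe the model in a Modelfile, then bake it with `ollama create`.
# (The file path and template string are illustrative placeholders.)
cat > Modelfile <<'EOF'
FROM ./llama-3-8b-instruct.Q4_K_M.gguf
TEMPLATE "{{ .System }} USER: {{ .Prompt }} ASSISTANT:"
EOF
ollama create my-llama3 -f Modelfile

# Bodhi: no baking step; the chat template is read from the
# tokenizer_config.json that ships with the model on Huggingface.
```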

# Quickstart

Try Bodhi App today by following these simple steps:

```bash
# Add the BodhiSearch tap and install the Bodhi App cask
brew tap BodhiSearch/apps
brew install --cask bodhi

# Start chatting with the Llama 3 instruct model
bodhi run llama3:instruct
```

# Documentation and Tutorials

- Technical docs on GitHub (README.md, docs folder): https://github.com/BodhiSearch/BodhiApp

- YouTube playlist covering Bodhi App features in detail: https://www.youtube.com/playlist?list=PLavvg7KIktFI1ZaFc2nLe...

# Conclusion

We would love for HN to try out Bodhi App and provide feedback. You can reach us through:

- Raising an issue on GitHub: https://github.com/BodhiSearch/BodhiApp

- Connecting with the developer on Twitter: https://twitter.com/AmirNagri

- Leaving a comment on our YouTube tutorials: https://www.youtube.com/playlist?list=PLavvg7KIktFI1ZaFc2nLe...

Please show your support by starring the repo on GitHub.

Thank you for your time!

Best,
The Bodhi Team

_shuklahimanshu 4 days ago

This looks great for both tech and non-tech folks and solves use cases for both groups. Will definitely try it out. Thanks for open sourcing this!

abhinavrai 6 days ago

Interesting! This makes it so simple. Thanks for building & open sourcing this!

  • anagri 6 days ago

    thanks @abhinavrai, looking forward to your feedback. Hope you get to use it when you are developing GenAI apps locally as well.

akashkahlon 6 days ago

Congratulations on getting something built, this looks interesting, will definitely try it out.

  • anagri 6 days ago

    thanks @akashkahlon, looking forward to your feedback

irfn 6 days ago

Is there a performance comparison with Ollama? Do both use llama.cpp for serving?

  • anagri 6 days ago

    @irfn - that's an interesting idea. Will definitely try to create a benchmark using my local M2 machine and llama3-7b, just for comparison.

    Yes, Ollama and Bodhi App both use llama.cpp, but our approaches are different. Ollama embeds a llama.cpp server binary within its own binary, copies it to a tmp folder, and runs it as a webserver. Any request that comes to Ollama is then forwarded to this server, and the reply is sent back to the client.

    Bodhi embeds the llama.cpp server as a library, so there is no tmp binary to copy. When a request comes to Bodhi App, it invokes the llama.cpp code directly and sends the response back to the client. So there is no request hopping.

    Hopefully that approach provides us with some benefits.

    Also, Bodhi is written in Rust. IMHO Rust has an excellent interface with C/C++ libraries, so the C code is invoked over the C-FFI bridge. And given Rust's memory safety, fearless concurrency, and zero-cost abstractions, this should provide some performance benefit to Bodhi's approach.

    Will get back to you once I have results for these benchmarks. Thanks for the idea.
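    Roughly, a first pass could just time sequential chat completions against each server's OpenAI-compatible endpoint. A sketch (the Bodhi port and the model aliases below are assumptions; adjust to your setup):

```bash
# Time 10 sequential chat completions against an OpenAI-compatible server.
# Usage: bench <base_url> <model>
bench() {
  for i in $(seq 1 10); do
    curl -s -o /dev/null -w "%{time_total}s\n" "$1/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -d "{\"model\": \"$2\", \"messages\": [{\"role\": \"user\", \"content\": \"Say hi\"}]}"
  done
}

bench http://localhost:1135 llama3:instruct   # Bodhi App (port is an assumption)
bench http://localhost:11434 llama3           # Ollama's default port
```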

    Hope you try Bodhi, and have some equally valuable feedback on the app.

    Cheers.

nandx64 5 days ago

Tried it locally, this is smooth. Great one @anagri

  • anagri 5 days ago

    @nandx thanks for the shoutout, more features coming soon. Feel free to raise issues or feature requests on GitHub.

    Thanx

kartik7153 6 days ago

will check this out @anagri

  • anagri 5 days ago

    @kartik7153 - looking forward to your feedback

kaiwren 6 days ago

awesome!

  • anagri 5 days ago

    @kaiwren thanx for the shoutout, looking forward to your feedback