Until now, running LLMs has required substantial computing resources, mainly GPUs. Run locally on an average Mac, a simple prompt with a typical LLM takes ...
You can configure the client with a configuration file (YAML or JSON) or directly in code. You can also use different authentication methods: ...
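To make the two configuration routes concrete, here is a minimal sketch in Python. It does not use the client's real API: `ClientConfig`, its fields `base_url` and `timeout_seconds`, and the file name `client.json` are all hypothetical names chosen for illustration, and authentication is left out entirely. It only shows the general pattern of building the same settings either in code or from a JSON file.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ClientConfig:
    """Hypothetical settings container; the field names are illustrative only."""

    base_url: str = "http://localhost:8080"
    timeout_seconds: float = 30.0

    @classmethod
    def from_json_file(cls, path):
        """Build a config from a JSON file; keys missing from the file keep their defaults.

        Unknown keys raise TypeError, which surfaces typos in the file early.
        """
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        return cls(**data)


# Option 1: configure directly in code.
in_code = ClientConfig(base_url="https://example.invalid", timeout_seconds=10.0)

# Option 2: load the same settings from a JSON configuration file.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "client.json"
    settings = {"base_url": "https://example.invalid", "timeout_seconds": 10.0}
    config_path.write_text(json.dumps(settings), encoding="utf-8")
    from_file = ClientConfig.from_json_file(config_path)

# Both routes should yield equivalent settings.
print(in_code == from_file)
```

A YAML file could be handled the same way by swapping `json.loads` for a YAML parser; that would need a third-party package, so it is omitted here.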