Skip to main content

Frequently Asked Questions (FAQ)

General

Is l3mcore free?

Yes, l3mcore is open source. Costs come from the backends you configure (e.g., OpenAI API) or the hardware for local models.

Do I need a GPU to use l3mcore?

No. l3mcore's ML router runs on CPU. The backends you use (Ollama, vLLM) may or may not need a GPU depending on the models you choose.

Does l3mcore save my conversations?

No. l3mcore is stateless middleware. Prompts pass through and a sanitized summary is logged in logs/app.log, but there is no conversation storage.


Configuration

How many experts can I have?

The limit is set by max_experts in experts.json (default 15). Technically, you can increase this number, but with many experts, the vectorization time on startup increases (happens only once).

Can I change experts without restarting?

Currently no. Experts are loaded on startup. You must restart l3mcore for changes in experts.json to take effect.

Can I change the router model?

Yes. You can change the embeddings model by modifying the model_path parameter in config/config.json. By default it uses intfloat/multilingual-e5-small. You can point to any other model compatible with Hugging Face SentenceTransformers, to a local disk path, or leave it empty "" to disable ML routing and use only keywords (which reduces the router's RAM/CPU consumption to zero).

How do I know if the router is working well?

Check the logs: tail -f logs/app.log | grep Router. You should see scores > 0.6 for clear prompts.


Compatibility

Does it work with Continue (IDE plugin)?

Yes. In Continue, configure the OpenAI provider with the base URL http://your-ip:11435/v1.

Does it work with LiteLLM?

Yes. l3mcore exposes an OpenAI-compatible API, so LiteLLM can point to it as a provider.

Does it work with Langchain / LlamaIndex?

Yes. Use the Langchain/LlamaIndex OpenAI client pointing to http://your-ip:11435/v1.


Performance

How long does the router take to decide?

With multilingual-e5-small: ~10-20ms. With keywords only: < 1ms. The expert model's inference time dominates the total time.

What happens if an external backend (OpenAI) fails?

The Expert Dispatcher returns an HTTP 502/503 error to the client. There are no automatic retries currently. You can implement retry logic in an after_generation plugin.


Security

Can I expose l3mcore to the internet?

Not directly. Put it behind a reverse proxy (Nginx/Caddy) with authentication. l3mcore is designed for internal networks and homelabs.

Does l3mcore filter content?

Not by default. You can implement content filters using the plugin system.