Frequently Asked Questions (FAQ)

General

Is l3mcore free?

Yes, l3mcore is open source. Costs come from the backends you configure (e.g., OpenAI API) or the hardware for local models.

Do I need a GPU to use l3mcore?

No. l3mcore's ML router runs on CPU. The backends you use (Ollama, vLLM) may or may not need a GPU depending on the models you choose.

Does l3mcore save my conversations?

No. l3mcore is stateless middleware. Prompts pass through and a sanitized summary is logged in logs/app.log, but there is no conversation storage.

Configuration

How many experts can I have?

The limit is set by max_experts in experts.json (default 15). Technically, you can increase this number, but with many experts, the vectorization time on startup increases (happens only once).

Can I change experts without restarting?

Currently no. Experts are loaded on startup. You must restart l3mcore for changes in experts.json to take effect.

Can I change the router model?

Yes. You can change the embeddings model by modifying the model_path parameter in config/config.json. By default it uses intfloat/multilingual-e5-small. You can point to any other model compatible with Hugging Face SentenceTransformers, to a local disk path, or leave it empty "" to disable ML routing and use only keywords (which reduces the router's RAM/CPU consumption to zero).

How do I know if the router is working well?

Check the logs: tail -f logs/app.log | grep Router. You should see scores > 0.6 for clear prompts.

Compatibility

Does it work with Continue (IDE plugin)?

Yes. In Continue, configure the OpenAI provider with the base URL http://your-ip:11435/v1.

Does it work with LiteLLM?

Yes. l3mcore exposes an OpenAI-compatible API, so LiteLLM can point to it as a provider.

Does it work with Langchain / LlamaIndex?

Yes. Use the Langchain/LlamaIndex OpenAI client pointing to http://your-ip:11435/v1.

Performance

How long does the router take to decide?

With multilingual-e5-small: ~10-20ms. With keywords only: < 1ms. The expert model's inference time dominates the total time.

What happens if an external backend (OpenAI) fails?

The Expert Dispatcher returns an HTTP 502/503 error to the client. There are no automatic retries currently. You can implement retry logic in an after_generation plugin.

Security

Can I expose l3mcore to the internet?

Not directly. Put it behind a reverse proxy (Nginx/Caddy) with authentication. l3mcore is designed for internal networks and homelabs.

Does l3mcore filter content?

Not by default. You can implement content filters using the plugin system.

General​

Is l3mcore free?​

Do I need a GPU to use l3mcore?​

Does l3mcore save my conversations?​

Configuration​

How many experts can I have?​

Can I change experts without restarting?​

Can I change the router model?​

How do I know if the router is working well?​

Compatibility​

Does it work with Continue (IDE plugin)?​

Does it work with LiteLLM?​

Does it work with Langchain / LlamaIndex?​

Performance​

How long does the router take to decide?​

What happens if an external backend (OpenAI) fails?​

Security​

Can I expose l3mcore to the internet?​

Does l3mcore filter content?​