Reachy Mini · Python app · by SantiPa

Marginalia

A small voice in the margin of the page. Ask Reachy Mini about any book, character, motif, or piece of lore — from Hamlet and Mrs Dalloway to Confessions of a Justified Sinner, Dune, Borges' Ficciones, and the Silmarils. Reachy reads each answer aloud — or twist either antenna for a silent reading.

Q. What is Borges doing with mirrors in “The Library of Babel”?
A mirror in Borges is almost never an instrument of recognition — it is an instrument of multiplication. [1] In “The Library of Babel” the polished surface at the back of every hexagonal gallery doubles the bookshelves to infinity, so the architecture itself becomes an argument: any catalogue of the universe must already contain a perfect copy of the catalogue…
§ How to read this ¶ Further reading ♪ Ava — neural voice ♪ Andrew — companion

Preview only. The real conversation lives at http://0.0.0.0:8042 when the app is running on your robot.

How to use it on your robot

  1. Open the dashboard. Launch the Reachy Mini Control App on your laptop and connect to your robot (or the bundled simulator).
  2. Discover Apps → search marginalia. Click Install. The dashboard pulls this Space, sets up a private virtualenv, and registers the app's entry point.
  3. Hit Start. The browser opens at http://0.0.0.0:8042 automatically.
  4. Ask a question. Reachy sways its head while it thinks, nods the moment the first token lands, then starts speaking the answer paragraph by paragraph — the first words come through the speaker about a second after the LLM finishes, while the rest of the answer is still being synthesised in the background. The head dips, antennas perk up, and the body sways with the audio. Each answer comes with a How to read this panel of tips tailored to the work, plus Further reading for amplification.
  5. Prefer to read in silence? Twist either antenna and Reachy will switch to read mode — text only on the page, no speaker. Twist again to bring the voice back. The pill at the top of the in-app UI mirrors the same toggle if your hands are on the keyboard.

What every answer contains

How it works

  1. Wikipedia search. Your question goes through a single combined Wikipedia API call that returns titles, intro extracts, full URLs, and thumbnail images for the top hits — one round-trip, no key required.
  2. LLM, streamed. The question and the search excerpts are sent to a Hugging Face hosted model (meta-llama/Llama-3.3-70B-Instruct by default) via InferenceClient.chat_completion(stream=True). No extra API key — the app uses the cached hf auth login token already on your machine.
  3. Server-Sent Events. Tokens stream back to the browser, which renders markdown progressively with a deep-emerald drop-cap, gilded citation underlines, and chapter-style ornaments separating the three sections.
  4. Embodied reading. A 50 Hz state machine drives the head, antennas, and body together: idle sway → thinking sway with curling antennas → a single nod the moment the first token arrives → an attentive bob while the rest streams in → audio-reactive head and antenna motion while the answer is read aloud, so the robot moves with the voice rather than on a fixed loop.
  5. Voice. The cleaned answer prose is split into paragraph chunks. Each chunk is synthesised through edge-tts (Microsoft's free neural voices — Ava, Andrew, Sonia, Ryan, …) and pushed to the robot speaker as soon as it's ready, so the next chunk renders while the previous one is already playing. Fallbacks: HF Inference TTS, pyttsx3, gTTS. Twisting either antenna above ~23° toggles spoken vs silent mode and is acknowledged with a small head nod.

Try without a robot

The Reachy Mini Control App ships with a built-in MuJoCo simulator — the same install path works without hardware. Pick the simulation connection mode in the desktop app, then install Marginalia from Discover Apps as usual.

More: install guide · SDK on GitHub