LANCUN
§ 01 · CORE TECH
Distributed Architecture

Multi-Agents
Collaboration Architecture

A distributed agent architecture for efficient collaboration. Fuses speech recognition, memory, tool use and execution loops — natively suited to automated operation, embodied intelligence, and multi-sensor coordination. From conversational AI to actionable AI.

Input
Multimodal
Cognition
LLM Orchestration
Execution
Soft + Hardware
Persistence
Closed-loop
§ 02 · CLOSED-LOOP ARCHITECTURE

Input · Cognition · Execution · Memory Feedback

Not layered stacks — a closed loop. Three-stage flow plus memory continuously feeding decisions back, forming a self-evolving Agent system.

横向滑动查看完整架构图

Input & Recognition

Voice in · PCM 16kHz
Text in · Text Stream
DashScope ASR · server_vad endpoint

Cognition & Decision

Prompt assembly · Platform + persona + memory + skills
Agents System
LLM → Tool → LLM → Response

Execution & Output

Streaming TTS · PCM 24kHz
Action execution · Hardware / workflow / system tools
Text reply · Text Response

Persistence & Memory

chat_state JSONB · PostgreSQL
Input & RecognitionMillisecond recognition · multi-language mix · long-form uninterrupted
Cognition & DecisionThink-tool-think loop · model plans proactively rather than reacting
Execution & OutputThree channels in parallel · intent → real-world action in one shot
Persistence & MemoryStructured session state on disk · each turn becomes next turn's experience
§ 03 · CAPABILITIES

Three capability leaps

More natural multimodal interaction

Voice, text, vision, sensor input fused seamlessly; output side generates TTS streams, text replies, and action commands in sync. One input fires three outputs — conversation pace approaches human.

Stronger tool use & orchestration

From single-call to flow orchestration — Agents chain multiple tools to complete complex tasks autonomously. The model decides which tool, when; developers only declare what each tool can do.

Software automation × embodied control

Operate software UIs (click, type, navigate), drive hardware (servos, lights, displays), and link multi-sensor input into spatial awareness. AI no longer just thinks in the cloud.

From “Conversational AI” · to “Actionable AI”