impress_asr_input/docs/DEVELOPMENT.md

# Development Guide

## Project Structure

```
impress-asr-input/
├── src/
│   ├── core/                    # Core modules
│   │   ├── audio-recorder.ts    # Audio capture
│   │   ├── audio-processor.ts   # Audio processing (VAD, resampling, etc.)
│   │   ├── speech-recognizer.ts # ONNX speech recognition engine
│   │   ├── text-output.ts       # Text output
│   │   └── index.ts             # Module exports
│   ├── ui/                      # Electron UI
│   │   └── index.html           # Main window
│   ├── electron-main.ts         # Electron main process
│   ├── preload.ts               # Electron preload script
│   ├── main.ts                  # CLI entry point
│   └── utils/
│       └── config.ts            # Configuration management
├── models/                      # ONNX model files (download separately)
├── scripts/
│   └── postinstall.js           # Post-install script
├── test/
│   └── audio-processor.test.ts  # Unit tests
├── package.json
├── tsconfig.json
└── PRD.md
```

## Development Environment Setup

### Prerequisites

- Node.js >= 20.0.0
- npm >= 9.0.0

### Installation

```bash
# Install dependencies
npm install

# Download a model file (see "Model Download" below)

# Run the CLI in development mode
npm run dev

# Run Electron in development mode
npm run dev:electron
```

## Model Download

### Recommended Models

#### 1. SenseVoice (recommended)

```bash
# Download from Hugging Face
# https://huggingface.co/FunAudioLLM/SenseVoice/tree/main

# Or use ModelScope
# https://www.modelscope.cn/models/iic/SenseVoiceSmall
```

Place `model.onnx` in the `models/` directory.

#### 2. Whisper ONNX

```bash
# Hugging Face
# https://huggingface.co/onnx-community/whisper-base

# Download directly with the Hugging Face CLI
huggingface-cli download onnx-community/whisper-base --local-dir models/
```

#### 3. Paraformer (optimized for Chinese)

```bash
# ModelScope
# https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punct
```

### Model Configuration

Specify the model path in `src/main.ts` or in the settings UI:

```typescript
const recognizer = new SpeechRecognizer({
  modelPath: './models/model.onnx',
  language: 'zh',
  useVad: true,
  beamSize: 5,
});
```

## Development Commands

```bash
# Compile TypeScript
npm run build

# Run the CLI
npm start -- start

# Run tests
npm test

# Lint the code
npm run lint

# Build the Electron app
npm run build:electron
```

## Core Modules

### AudioRecorder (audio capture)

Captures audio data from the microphone.

```typescript
const recorder = new AudioRecorder({
  sampleRate: 16000,
  channels: 1,
  chunkDuration: 100,
});

recorder.on('data', (chunk: AudioChunk) => {
  // Handle the audio data
});

await recorder.start();
```

**Note**: The current implementation is based on the Web Audio API. In a pure Node.js environment you need a different approach, such as `node-audio` or Electron's audio APIs.

### SpeechRecognizer (speech recognition)

A speech recognition engine built on ONNX Runtime.

```typescript
const recognizer = new SpeechRecognizer({
  modelPath: './models/model.onnx',
  language: 'zh',
  useVad: true,
});

recognizer.on('result', (result: RecognitionResult) => {
  console.log(result.text);
});

await recognizer.initialize();
recognizer.start();
```

### TextOutput (text output)

Writes recognition results to the clipboard.

```typescript
const output = new TextOutput({
  outputMode: 'clipboard',
});

output.output({ text: '你好', isFinal: true, confidence: 0.95, timestamp: Date.now() });
```

### SimpleVAD (voice activity detection)

A simple energy-based VAD implementation.

```typescript
const vad = new SimpleVAD({
  energyThreshold: 0.01,
  silenceDuration: 500,
});

const { isSpeaking, isFinal } = vad.process(audioFrame, 16000);
```
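
For orientation, here is a minimal sketch of how an energy-based detector of this kind can work. It is not the project's `SimpleVAD` source: the class name `EnergyVad`, the use of RMS as the energy measure, and the `Float32Array` frame type are assumptions made for illustration. Only the two option names and the `{ isSpeaking, isFinal }` result shape are taken from the example above.

```typescript
// Hypothetical sketch of an energy-based VAD; the real SimpleVAD may differ.
interface EnergyVadOptions {
  energyThreshold: number; // RMS level at or above which a frame counts as speech
  silenceDuration: number; // ms of continuous silence that ends an utterance
}

interface VadDecision {
  isSpeaking: boolean;
  isFinal: boolean;
}

class EnergyVad {
  private readonly options: EnergyVadOptions;
  private speaking = false;
  private silenceMs = 0;

  constructor(options: EnergyVadOptions) {
    this.options = options;
  }

  process(frame: Float32Array, sampleRate: number): VadDecision {
    if (frame.length === 0) {
      return { isSpeaking: this.speaking, isFinal: false };
    }

    // Root-mean-square amplitude of the frame.
    const sumSquares = frame.reduce((acc, s) => acc + s * s, 0);
    const rms = Math.sqrt(sumSquares / frame.length);
    const frameMs = (frame.length * 1000) / sampleRate;

    if (rms >= this.options.energyThreshold) {
      // Speech frame: (re)enter the speaking state and reset the silence timer.
      this.speaking = true;
      this.silenceMs = 0;
      return { isSpeaking: true, isFinal: false };
    }

    if (!this.speaking) {
      // Silence before any speech: nothing to finalize.
      return { isSpeaking: false, isFinal: false };
    }

    // Silence inside an utterance: end it once the silence lasts long enough.
    this.silenceMs += frameMs;
    if (this.silenceMs >= this.options.silenceDuration) {
      this.speaking = false;
      this.silenceMs = 0;
      return { isSpeaking: false, isFinal: true };
    }
    return { isSpeaking: true, isFinal: false };
  }
}
```

In this sketch, short pauses below `silenceDuration` are still reported as speech, so a caller would only flush the buffered audio to the recognizer when `isFinal` is `true`.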

## Adding Support for a New Model

1. Create a model configuration file under `src/core/models/`:

```typescript
// src/core/models/sensevoice.ts
export const senseVoiceConfig = {
  inputShape: [1, 16000],
  outputKeys: ['output', 'logits'],
  // ...
};
```

2. Add the model adaptation logic to `SpeechRecognizer`.
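
One possible shape for that adaptation logic is a small lookup from model name to configuration, plus a helper that picks whichever configured output the model produced. Everything here apart from the `inputShape` and `outputKeys` fields is hypothetical (`ModelConfig`, `modelConfigs`, `getModelConfig`, `pickOutput`), and how `SpeechRecognizer` would call it is an assumption.

```typescript
// Hypothetical sketch; names other than inputShape/outputKeys are illustrative.
interface ModelConfig {
  inputShape: number[];
  outputKeys: string[];
}

const senseVoiceConfig: ModelConfig = {
  inputShape: [1, 16000],
  outputKeys: ["output", "logits"],
};

// Registry of supported models, keyed by a lowercase model name.
const modelConfigs: Record<string, ModelConfig> = {
  sensevoice: senseVoiceConfig,
};

function getModelConfig(name: string): ModelConfig {
  const config = modelConfigs[name.toLowerCase()];
  if (config === undefined) {
    const known = Object.keys(modelConfigs).join(", ");
    throw new Error(`Unsupported model "${name}". Known models: ${known}`);
  }
  return config;
}

// Returns the first configured output key that is present in the model outputs.
function pickOutput<T>(outputs: Record<string, T>, config: ModelConfig): T {
  for (const key of config.outputKeys) {
    const value = outputs[key];
    if (value !== undefined) {
      return value;
    }
  }
  throw new Error(`None of the expected outputs found: ${config.outputKeys.join(", ")}`);
}
```

With this layout, supporting another model would mean adding one config file and one registry entry, without branching on the model name inside the recognizer.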

## FAQ

### Q: How do I debug audio capture?

```typescript
recorder.on('data', (chunk) => {
  console.log('Audio samples:', chunk.data.length, 'Sample rate:', chunk.sampleRate);
});
```
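
If chunks arrive but nothing is recognized, it can also help to check whether the samples are silent. The helper below is a hypothetical sketch: `describeChunk` and `ChunkStats` are not part of the project, and it assumes `chunk.data` is a `Float32Array` of samples in the range [-1, 1]. Adjust it if the actual `AudioChunk` type differs.

```typescript
// Hypothetical debugging helper; assumes Float32Array samples in [-1, 1].
interface ChunkStats {
  samples: number;
  durationMs: number;
  peak: number; // largest absolute sample value in the chunk
}

function describeChunk(data: Float32Array, sampleRate: number): ChunkStats {
  const peak = data.reduce((max, s) => Math.max(max, Math.abs(s)), 0);
  return {
    samples: data.length,
    durationMs: (data.length * 1000) / sampleRate,
    peak,
  };
}
```

A peak that stays at or near 0 across many chunks usually points to a muted or wrong input device rather than a recognition problem.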

### Q: How do I reduce recognition latency?

1. Use a quantized (int8) model
2. Reduce `chunkDuration`
3. Enable `useVad` to avoid running recognition on non-speech audio
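
To see why `chunkDuration` matters: a chunk cannot be delivered until it has been fully captured, so each chunk adds at least its own duration in buffering delay. The arithmetic can be sketched as follows, assuming `chunkDuration` is in milliseconds (as the value `100` in the `AudioRecorder` example suggests). The function names are hypothetical.

```typescript
// Hypothetical helpers for reasoning about chunk size; not part of the project API.

/** Number of samples in one chunk of `chunkDurationMs` milliseconds. */
function samplesPerChunk(sampleRate: number, chunkDurationMs: number): number {
  return Math.round((sampleRate * chunkDurationMs) / 1000);
}

/** Chunks delivered per second, i.e. how often the 'data' handler runs. */
function chunksPerSecond(chunkDurationMs: number): number {
  return 1000 / chunkDurationMs;
}

console.log(samplesPerChunk(16000, 100)); // 1600
console.log(samplesPerChunk(16000, 50)); // 800
console.log(chunksPerSecond(50)); // 20
```

Halving `chunkDuration` halves the per-chunk buffering delay but doubles how often downstream handlers run, so very small values may trade latency for CPU overhead.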

### Q: Why does Electron packaging fail?

Check the `build` configuration in `package.json` and make sure the model files are included.

## Contributing

1. Fork the project
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request