swift-voice-input
ストリーミング認識とフローティングプレビュー UI を備えたプロトコル指向の音声入力
English | 日本語
swift-voice-input
A Swift package for voice input on iOS / macOS. Protocol-oriented design allows you to swap in multiple speech recognition backends including Apple Speech.
Features
- Protocol-oriented — Abstracts the recognition engine behind the
SpeechRecognizerprotocol. Plug in Apple Speech, Whisper, local LLMs, and more - Real-time streaming — Partial results delivered incrementally via
AsyncStream. UsepartialTextfor live display - Floating preview — An Aqua Voice-style overlay shows recognized text in real time above the input field
- DesignSystem compliant — UI components fully follow the token system of swift-design-system
- Two-target layout —
VoiceInput(Core) andVoiceInputUI(SwiftUI) are separated. Depend on Core alone when you don't need UI - Swift Concurrency — Actor-based audio engine and
@Observablestate management for safe concurrent access
Installation
Add the dependency to Package.swift:
dependencies: [
.package(url: "https://github.com/no-problem-dev/swift-voice-input.git", .upToNextMajor(from: "1.0.0")),
]
Add to your target:
// Core only (no UI needed)
.product(name: "VoiceInput", package: "swift-voice-input"),
// Core + SwiftUI components
.product(name: "VoiceInputUI", package: "swift-voice-input"),
Quick Start
Basic Usage
import VoiceInput
import VoiceInputUI
struct MyView: View {
@State private var session = VoiceInputSession()
@State private var text = ""
var body: some View {
VStack {
TextField("Type here...", text: $text)
VoiceInputButton(session: session)
}
.voiceInputOverlay(session: session) { transcript in
text = transcript
}
}
}
Inline Transcript Preview
Embed a transcript preview directly in the layout flow using InlineTranscriptView:
import VoiceInput
import VoiceInputUI
struct MyView: View {
@State private var session = VoiceInputSession()
@State private var text = ""
var body: some View {
VStack(alignment: .leading, spacing: 8) {
HStack {
TextField("Type here...", text: $text)
VoiceInputButton(session: session)
}
InlineTranscriptView(session: session) { transcript in
text = transcript
}
}
}
}
Custom Recognition Engine
Implement an Actor conforming to SpeechRecognizer:
actor WhisperRecognizer: SpeechRecognizer {
let displayName = "Whisper"
var isAvailable: Bool { true }
func requestPermissions() async -> Result<Void, SpeechRecognitionError> {
// Request microphone permission
.success(())
}
func start(locale: Locale) throws -> AsyncStream<SpeechRecognitionResult> {
// Start recognition with Whisper model
AsyncStream { _ in }
}
func stop() {
// Stop recognition
}
}
// Inject at session creation
@State private var session = VoiceInputSession(
recognizer: WhisperRecognizer()
)
Session API
let session = VoiceInputSession()
session.toggle() // Toggle start/stop
session.startListening() // Start
session.stopListening() // Stop
session.state // .idle, .requesting, .listening, .processing, .error
session.partialText // Real-time partial text
session.transcript // Finalized text
let text = session.confirm() // Confirm text + reset
session.reset() // Reset
Architecture
VoiceInput (Core)
├── Protocol/
│ ├── SpeechRecognizer # Recognition engine abstraction protocol (Actor)
│ ├── SpeechRecognitionResult # .partial / .final result type
│ └── SpeechRecognitionError # Error type
├── Engine/
│ ├── AppleSpeechRecognizer # Default Apple Speech implementation
│ └── PermissionRequester # Microphone & speech recognition permissions
└── Session/
└── VoiceInputSession # @Observable state management
VoiceInputUI (SwiftUI + DesignSystem)
├── Button/
│ └── VoiceInputButton # Microphone toggle button
├── Inline/
│ └── InlineTranscriptView # Inline transcript preview
└── Overlay/
└── TranscriptOverlayModifier # .voiceInputOverlay() modifier
Requirements
| Requirement | Version |
|---|---|
| iOS | 17.0+ |
| macOS | 14.0+ |
| Swift | 6.2+ |
| Xcode | 26.0+ |
License
MIT License — see LICENSE for details.
Links
同じカテゴリの OSS — UI / SwiftUI
swift-design-system
Swift PackageSwiftUI 向けの型安全で拡張可能なデザインシステム
swift-ui-routing
Swift PackageSwiftUI 向けの型安全で宣言的なルーティングライブラリ
swift-statable
Swift PackageAsyncValue パターンで Observable な状態管理を実現する Swift マクロ
swift-markdown-view
Swift PackageDesignSystem 統合とシンタックスハイライトを備えた SwiftUI ネイティブな Markdown レンダリング
swift-latex-view
Swift PackageSwiftUI ネイティブな LaTeX 数式レンダリング。LLM 出力にも堅牢
swift-cached-remote-image
Swift Packageメモリ & ディスクの二層キャッシュで高速表示する SwiftUI リモート画像
swift-google-slides-view
Swift PackageGoogle Slides API の JSON を SwiftUI で描画。A2A アーティファクトのストリーミングに対応
swift-document-scanner
Swift Package矩形検出・OCR・カメラ撮影を備えた iOS 向けドキュメントスキャン基盤