当前位置: 首页 > news >正文

C#使用whisper.net实现语音识别(语音转文本)

目录

介绍

效果

输出信息 

项目

代码

下载 


介绍

github地址:https://github.com/sandrohanea/whisper.net

Whisper.net. Speech to text made simple using Whisper Models

模型下载地址:https://huggingface.co/sandrohanea/whisper.net/tree/main/classic

效果

输出信息 

whisper_init_from_file_no_state: loading model from 'ggml-small.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3
whisper_model_load: mem required  =  743.00 MB (+   16.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     =  464.68 MB
whisper_model_load: model size    =  464.44 MB
whisper_init_state: kv self size  =   15.75 MB
whisper_init_state: kv cross size =   52.73 MB
00:00:00->00:00:20: 皇鶴楼,崔昊,西人已成皇鶴去,此地空于皇鶴楼,皇鶴一去不复返,白云千载空悠悠。
00:00:20->00:00:39: 青川莉莉汉阳树,方草七七英五周,日暮相关何处事,燕泊江上世人愁。

项目

代码

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using Whisper.net;
using static System.Net.Mime.MediaTypeNames;

namespace C_使用whisper.net实现语音转文本
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        string fileFilter = "*.wav|*.wav";
        string wavFileName = "";
        WhisperFactory whisperFactory;
        WhisperProcessor processor;
        private async void button2_Click(object sender, EventArgs e)
        {
            if (wavFileName == "")
            {
                return;
            }

            try
            {
                button2.Enabled = false;
                using var fileStream = File.OpenRead(wavFileName);
                await foreach (var result in processor.ProcessAsync(fileStream))
                {
                    Console.WriteLine($"{result.Start}->{result.End}: {result.Text}\r\n");
                    txtResult.Text += $"{result.Start}->{result.End}: {result.Text}\r\n";
                }
            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.Message);
            }
            finally
            {
                button2.Enabled = true;
            }

        }

        private void Form1_Load(object sender, EventArgs e)
        {
            whisperFactory = WhisperFactory.FromPath("ggml-small.bin");

            processor = whisperFactory.CreateBuilder()
               .WithLanguage("zh")//.WithLanguage("auto")
               .Build();

            wavFileName = "085黄鹤楼.wav";
            txtFileName.Text = wavFileName;
        }

        private void button1_Click(object sender, EventArgs e)
        {
            OpenFileDialog ofd = new OpenFileDialog();
            ofd.Filter = fileFilter;
            if (ofd.ShowDialog() != DialogResult.OK) return;

            txtResult.Text = "";

            wavFileName = ofd.FileName;
            txtFileName.Text = wavFileName;
        }
    }
}

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using Whisper.net;
using static System.Net.Mime.MediaTypeNames;namespace C_使用whisper.net实现语音转文本
{public partial class Form1 : Form{public Form1(){InitializeComponent();}string fileFilter = "*.wav|*.wav";string wavFileName = "";WhisperFactory whisperFactory;WhisperProcessor processor;private async void button2_Click(object sender, EventArgs e){if (wavFileName == ""){return;}try{button2.Enabled = false;using var fileStream = File.OpenRead(wavFileName);await foreach (var result in processor.ProcessAsync(fileStream)){Console.WriteLine($"{result.Start}->{result.End}: {result.Text}\r\n");txtResult.Text += $"{result.Start}->{result.End}: {result.Text}\r\n";}}catch (Exception ex){MessageBox.Show(ex.Message);}finally{button2.Enabled = true;}}private void Form1_Load(object sender, EventArgs e){whisperFactory = WhisperFactory.FromPath("ggml-small.bin");processor = whisperFactory.CreateBuilder().WithLanguage("zh")//.WithLanguage("auto").Build();wavFileName = "085黄鹤楼.wav";txtFileName.Text = wavFileName;}private void button1_Click(object sender, EventArgs e){OpenFileDialog ofd = new OpenFileDialog();ofd.Filter = fileFilter;if (ofd.ShowDialog() != DialogResult.OK) return;txtResult.Text = "";wavFileName = ofd.FileName;txtFileName.Text = wavFileName;}}
}

下载 

源码下载

http://www.lryc.cn/news/239706.html

相关文章:

  • 从零开始学习typescript——运算符(算术运算符、赋值运算符、比较运算符)
  • likeshop单商户商城系统 任意文件上传漏洞复现
  • CentOS 7 使用pugixml 库
  • 深度学习 loss 是nan的可能原因
  • [ 云计算 | AWS 实践 ] 基于 Amazon S3 协议搭建个人云存储服务
  • 第二十章:多线程
  • CentOS 7启动时报“Started Crash recovery kernel arming.....shutdown....”问题处理过程
  • Android 13 - Media框架(14)- OpenMax(二)
  • 【Python大数据笔记_day11_Hadoop进阶之MR和YARNZooKeeper】
  • 飞桨——总结PPOCRLabel中遇到的坑
  • LeetCode(30)长度最小的子数组【滑动窗口】【中等】
  • Niushop 开源商城 v5.1.7:支持PC、手机、小程序和APP多端电商的源码
  • Navmesh 寻路
  • YOLOv5 分类模型 数据集加载 3
  • 『亚马逊云科技产品测评』活动征文|AWS 存储产品类别及其适用场景详细说明
  • Mac | Vmware Fusion | 分辨率自动还原问题解决
  • SQL知多少?这篇文章让你从小白到入门
  • centos7安装MySQL—以MySQL5.7.30为例
  • 3.计算机网络补充
  • 【云原生】Spring Cloud Alibaba 之 Gateway 服务网关实战开发
  • opencv-直方图均衡化
  • npm install安装报错
  • Spring Boot创建和使用(重要)
  • python 基于gdal,richdem,pysheds实现 实现洼填、D8流向,汇流累计量计算,河网连接,分水岭及其水文分析与斜坡单元生成
  • 帝国cms开发一个泛知识类的小程序的历程记录
  • Kafka官方生产者和消费者脚本简单使用
  • 如何开发干洗店用的小程序
  • 回溯算法详解
  • 边云协同架构设计
  • 【c++】——类和对象(下) 万字解答疑惑