Android Text-To-Speech API Sounds Robotic

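This is my first time learning Android development, and my goal is to create a simple Hello World application that takes in some text and reads it aloud.

I based my code on an example I found. Here is my code: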

class MainFeeds : AppCompatActivity() {


    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main_feeds)



        card.setOnClickListener{
            Toast.makeText(this, "Hello", Toast.LENGTH_LONG).show()
            TTS(this, "Hello this is leo")
        }
    }

}


// Wraps a TextToSpeech instance and speaks `message` once the engine is ready.
class TTS(private val activity: Activity,
          private val message: String) : TextToSpeech.OnInitListener {

    // The third argument requests the Google engine by package name.
    private val tts: TextToSpeech = TextToSpeech(activity, this, "com.google.android.tts")

    override fun onInit(status: Int) {
        if (status == TextToSpeech.SUCCESS) {
            val result = tts.setLanguage(Locale.US)

            if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                Toast.makeText(activity, "This Language is not supported", Toast.LENGTH_SHORT).show()
            } else {
                speakOut(message)
            }
        } else {
            Toast.makeText(activity, "Initialization Failed!", Toast.LENGTH_SHORT).show()
        }
    }

    private fun speakOut(message: String) {
        tts.speak(message, TextToSpeech.QUEUE_FLUSH, null, null)
    }
}

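It works fine. The problem I'm running into is that the audio the synthesizer produces sounds very robotic, almost like Google Maps does when I'm disconnected from the internet. Does the Google Assistant voice make use of some other API that I have to enable?

EDIT: I have tried running the app on my Pixel 2 XL and it still sounds robotic, as in it is not using the Google Assistant voice.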

The quality of the voice depends, first of all, on which "speech engine" the TextToSpeech object you create uses:

private val tts: TextToSpeech = TextToSpeech(activity, this)

If you had instead written:

private val tts: TextToSpeech = TextToSpeech(activity, this, "com.google.android.tts")

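...then any device you run that code on will attempt to use the Google speech engine... but it will only actually be used if it exists on the device.

Similarly, using "com.samsung.SMT" will attempt to use the Samsung speech engine (which is also high quality, but usually only installed on Samsung [real] devices).

Whether the Google speech engine is available does not depend on the device's Android API level (as long as it is recent enough to run the Google engine), but on whether the actual Google Text-to-Speech engine is installed on the device.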
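To make the "attempt, then fall back" idea concrete, here is a minimal plain-Java sketch of choosing an engine package name. It is only an illustration: the class EnginePicker, its PREFERRED list and the "com.example.othertts" package are made up for this example. On a device, the installed names would come from TextToSpeech.getEngines() (each EngineInfo has a name field), and the result would be passed as the third constructor argument.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Android SDK.
class EnginePicker {

    // Assumed preference order: the two engines mentioned in this answer.
    static final List<String> PREFERRED =
            Arrays.asList("com.google.android.tts", "com.samsung.SMT");

    // Returns the first preferred engine that is installed, or null,
    // which is meant as "let Android pick the device's default engine".
    static String pickEngine(List<String> installedEngines) {
        for (String candidate : PREFERRED) {
            if (installedEngines.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // "com.example.othertts" is a placeholder package name.
        List<String> installed =
                Arrays.asList("com.example.othertts", "com.google.android.tts");
        System.out.println(pickEngine(installed)); // prints com.google.android.tts
    }
}
```

Checking up front like this lets you warn the user (or point them to the Play Store) instead of silently getting whatever engine the device falls back to.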

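To make sure the Google engine is installed:

On an Android Studio emulator:

Create a new emulator and select a system image that has "Google APIs" or "Google Play" in the "Target" column.

On a real device:

Go to the Play Store and install the Google speech engine.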

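TTS on Android (or at least trying to predict its behavior) can be a real beast.

Docs: Java | Kotlin.

I made a little test app that should answer this question for you.

It shows you a list of all the voices contained in the Google engine, and you can click on them and listen to them! Yay!

What it actually does:

  • Initializes a TextToSpeech object using the Google Text-to-Speech engine (if it exists on the device).
  • Lets you pick a specific voice from a ListView containing all the possibly available voices that match the locale specified in the code (English, in this case)... and that match the version of the Google Text-to-Speech engine you have installed.

This way, you can test all the voices to see whether the "Google Assistant" voice you are looking for is in there somewhere, and if not, you can keep checking as new versions of the Google Text-to-Speech engine are released. It seems to me that the highest-quality voices in this test are all quality: 400 and are flagged as requiring a network connection.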
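If you would rather pick a voice in code than by ear, the selection could look roughly like the plain-Java sketch below. VoiceInfo is a hypothetical stand-in for android.speech.tts.Voice so that the logic runs off-device, and the voice names are invented. Keep in mind that the "notInstalled" feature is only what the engine reports, so treat the result as a best guess.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class VoicePicker {

    // Hypothetical stand-in for android.speech.tts.Voice.
    static class VoiceInfo {
        final String name;          // like Voice.getName()
        final String language;      // like Voice.getLocale().getLanguage()
        final int quality;          // like Voice.getQuality()
        final Set<String> features; // like Voice.getFeatures()

        VoiceInfo(String name, String language, int quality, String... features) {
            this.name = name;
            this.language = language;
            this.quality = quality;
            this.features = new HashSet<>(Arrays.asList(features));
        }
    }

    // Returns the highest-quality voice for the language that is not flagged
    // as "notInstalled", or null if no voice qualifies.
    static VoiceInfo bestInstalledVoice(List<VoiceInfo> voices, String language) {
        VoiceInfo best = null;
        for (VoiceInfo v : voices) {
            if (!v.language.equals(language)) continue;
            if (v.features.contains("notInstalled")) continue;
            if (best == null || v.quality > best.quality) best = v;
        }
        return best;
    }

    public static void main(String[] args) {
        // Invented sample data for illustration only.
        List<VoiceInfo> voices = Arrays.asList(
                new VoiceInfo("en-a", "en", 300),
                new VoiceInfo("en-b", "en", 400),
                new VoiceInfo("en-c", "en", 500, "notInstalled"),
                new VoiceInfo("pt-a", "pt", 400));
        System.out.println(bestInstalledVoice(voices, "en").name); // prints en-b
    }
}
```

On a device, the same loop would run over googleTTS.getVoices() and the winner would be handed to setVoice().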

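Notes:

  • Even voices reported as "not installed" (especially English ones) may well still "play". This is because when you call setVoice(Voice v), the (Google) engine will return a "success" int even if the requested voice is not available (!), as long as it has some other "back-up" voice of the same language on hand. Unfortunately, it does all this behind the scenes and still sneakily reports that it is using the exact voice you requested, even if you call getVoice() and compare the objects. :(

  • Generally, if a voice says it is installed, then what you hear is the voice you requested.

  • For these reasons, you'll want to make sure you are online when testing these voices (so that a voice you request gets installed automatically if it isn't available)... and also so that voices requiring a network connection don't "auto-downgrade."

  • You can swipe/refresh the voice list to see whether a voice has been installed yet, or use the system pull-down menu to watch the download... or go to the Google Text-to-speech settings in the device's system settings.

  • In the list view, voice features such as "network required" and "installed" are just echoes of what the Google engine reports and may not be accurate. :(

  • The maximum possible voice quality specified in the Voice class documentation is 500. In my tests, I could only find voices up to quality 400. This may be because I don't have the latest version of Google Text-to-speech installed on my test device (I don't have Play Store access to update it). If you are using a real device, I suggest installing the latest version of Google TTS from the Play Store. You can verify the engine version in the logs. According to Wikipedia, the latest version at the time of writing is 3.15.18.200023596. The version on my test device is 3.13.1.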
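If you want to compare the engine version from the logs against a known release in code, something like the sketch below would do. It is plain Java and assumes the version string consists only of dot-separated integers, which holds for the two versions quoted above but is not guaranteed for every build. EngineVersion is a name made up for this example.

```java
class EngineVersion {

    // Compares two dotted numeric versions. Returns a negative number, zero,
    // or a positive number if a is older than, equal to, or newer than b.
    // Missing trailing components count as 0, so "3.15.18" equals "3.15.18.0".
    static int compare(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            long x = i < pa.length ? Long.parseLong(pa[i]) : 0L;
            long y = i < pb.length ? Long.parseLong(pb[i]) : 0L;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // The two versions mentioned in the note above.
        System.out.println(compare("3.13.1", "3.15.18.200023596") < 0); // prints true
    }
}
```

Comparing component by component as numbers avoids the classic mistake of comparing the raw strings, where "3.9" would sort after "3.13".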

To recreate this test app, create a blank Java project in Android Studio with a minimum API of 21. (getVoices() doesn't work before API 21.)

Manifest:

<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package=" [ your.package.name ] ">

    <application
        android:allowBackup="true"
        android:icon="@mipmap/ic_launcher"
        android:label="@string/app_name"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/AppTheme">
        <!-- windowSoftInputMode is an <activity> attribute, so it is set here -->
        <activity android:name=".MainActivity"
            android:windowSoftInputMode="stateHidden">
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />
                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>

</manifest>

MainActivity.java:

package [ your package name ];

import android.content.Intent;
import android.content.pm.PackageManager;
import android.content.pm.ResolveInfo;
import android.graphics.Color;
import android.speech.tts.TextToSpeech;
import android.speech.tts.UtteranceProgressListener;
import android.speech.tts.Voice;
import android.support.v4.widget.SwipeRefreshLayout;
import android.support.v7.app.AppCompatActivity;
import android.os.Bundle;
import android.util.Log;
import android.view.View;
import android.view.inputmethod.InputMethodManager;
import android.widget.AdapterView;
import android.widget.EditText;
import android.widget.ListView;
import android.widget.Spinner;
import android.widget.TextView;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;

public class MainActivity extends AppCompatActivity {

    EditText textToSpeak;
    TextView progressView;
    TextToSpeech googleTTS;
    ListView voiceListView;
    SwipeRefreshLayout swipeRefreshLayout;
    long timeOfSpeakRequest; // primitive, so it can never be null when read in the callbacks

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        textToSpeak = findViewById(R.id.textToSpeak);
        textToSpeak.setText("Do I sound robotic to you?  1,2,3,4... yabadabadoo.  "
                + "ooo! ahh! la-la-la-la-la!  num-num-dibby-dibby-num-tick-tock...  "
                + "Can I pronounce the word, Antidisestablishmentarianism?  "
                + "Gerp!  My pants are too tight!  "
                + "CODE RED!  CODE RED!  Initiate disassemble!  Ice Cream is cold "
                + "...in my pants.  Exterminate!  exterminate!  Directive 4 is "
                + "classified."
        );
        progressView = findViewById(R.id.progressView);
        voiceListView = findViewById(R.id.voiceListView);
        swipeRefreshLayout = findViewById(R.id.swipeRefresh);


        // Create the TTS and wait until it's initialized to do anything else
        if (isGoogleEngineInstalled()) {
            createGoogleTTS();
        } else {
            Log.i("XXX", "onCreate(): Google not installed -- nothing done.");
        }

    }

    @Override
    protected void onStart() {
        super.onStart();

        swipeRefreshLayout.setOnRefreshListener(new SwipeRefreshLayout.OnRefreshListener() {
            @Override
            public void onRefresh() {
                assignFullSetOfVoicesToVoiceListView();
            }
        });

    }

    // this is where the program really begins (when the TTS is initialized)
    private void onTTSInitialized() {

        setUpWhatHappensWhenAVoiceItemIsClicked();
        setUtteranceProgressListenerOnTheTTS();
        assignFullSetOfVoicesToVoiceListView();

    }

    // FACTORED/EXTRACTED METHODS ----------------------------------------------------------------
    // These are just pulled out to make onCreate() easier to read and the basic sequence
    // of events more obvious.

    private void createGoogleTTS() {

        googleTTS = new TextToSpeech(this, new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int status) {
                if (status != TextToSpeech.ERROR) {
                    Log.i("XXX", "Google tts initialized");
                    onTTSInitialized();
                } else {
                    Log.i("XXX", "Internal Google engine init error.");
                }
            }
        }, "com.google.android.tts");

    }

    private void setUpWhatHappensWhenAVoiceItemIsClicked() {
        voiceListView.setOnItemClickListener(new AdapterView.OnItemClickListener() {
            @Override
            public void onItemClick(AdapterView<?> parent, View view, int position, long id) {
                Voice desiredVoice = (Voice) parent.getAdapter().getItem(position);
                // setVoice() returning SUCCESS does not necessarily mean the voice you
                // want will actually be used, at least with the Google engine. :(
                if (googleTTS.setVoice(desiredVoice) == TextToSpeech.SUCCESS) {
                    Log.i("XXX", "Speech voice set to: " + desiredVoice.toString());
                    // The TTS may "auto-downgrade" the voice selection for internal
                    // reasons such as missing voice data.
                    // Unfortunately it will not tell you, and there seems to be no
                    // way of checking whether the presently selected voice (getVoice()) "equals"
                    // the desired voice.
                    speak();
                }
            }
        });
    }

    private void setUtteranceProgressListenerOnTheTTS() {

        UtteranceProgressListener blurp = new UtteranceProgressListener() {

            @Override // MIN API 15
            public void onStart(String s) {
                long timeSinceSpeakCall = System.currentTimeMillis() - timeOfSpeakRequest;
                Log.i("XXX", "progress.onStart() callback.  "
                        + timeSinceSpeakCall + " millis since speak() was called.");
                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        progressView.setTextColor(Color.GREEN);
                        progressView.setText("PROGRESS: STARTED");
                    }
                });
            }

            @Override // MIN API 15
            public void onDone(String s) {
                long timeSinceSpeakCall = System.currentTimeMillis() - timeOfSpeakRequest;
                Log.i("XXX", "progress.onDone() callback.  "
                        + timeSinceSpeakCall + " millis since speak() was called.");
                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        progressView.setTextColor(Color.GREEN);
                        progressView.setText("PROGRESS: DONE");
                    }
                });
            }

            // Getting an error can simply mean that the particular voice is not available
            // to the device yet... and still needs to be downloaded / is still downloading
            @Override // MIN API 15 (deprecated at API 21)
            public void onError(String s) {
                long timeSinceSpeakCall = System.currentTimeMillis() - timeOfSpeakRequest;
                Log.i("XXX", "progress.onERROR() callback.  "
                        + timeSinceSpeakCall + " millis since speak() was called.");
                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        progressView.setTextColor(Color.RED);
                        progressView.setText("PROGRESS: ERROR");
                    }
                });

            }
        };
        googleTTS.setOnUtteranceProgressListener(blurp);

    }

    // must happen AFTER the TTS is initialized
    private void assignFullSetOfVoicesToVoiceListView() {

        googleTTS.stop();

        List<Voice> tempVoiceList = new ArrayList<>();

        for (Voice v : googleTTS.getVoices()) {
            if (v.getLocale().getLanguage().contains("en")) { // only English voices
                tempVoiceList.add(v);
            }
        }

        // Sort the list by name in reverse alphabetical order (note v2 is compared to v1)
        Collections.sort(tempVoiceList, new Comparator<Voice>() {
            @Override
            public int compare(Voice v1, Voice v2) {
                return (v2.getName().compareToIgnoreCase(v1.getName()));
            }
        });

        VoiceAdapter tempAdapter = new VoiceAdapter(this, tempVoiceList);

        voiceListView.setAdapter(tempAdapter);
        swipeRefreshLayout.setRefreshing(false);
        progressView.setTextColor(Color.BLACK);
        progressView.setText("PROGRESS: ...");

    }

    private void speak() {
        HashMap<String, String> map = new HashMap<>();
        map.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, "merp");
        timeOfSpeakRequest = System.currentTimeMillis();
        googleTTS.speak(textToSpeak.getText().toString(), TextToSpeech.QUEUE_FLUSH, map);
    }

    // Checks if Google Engine is installed
    // ... (and gives more info in Logs).
    // The version number is going to dictate the quality of voices available
    private boolean isGoogleEngineInstalled() {

        final Intent ttsIntent = new Intent();
        ttsIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
        final PackageManager pm = getPackageManager();
        final List<ResolveInfo> list = pm.queryIntentActivities(ttsIntent, PackageManager.GET_META_DATA);

        boolean googleIsInstalled = false;

        for (ResolveInfo resolveInfoUnderScrutiny : list) {

            String engineName = resolveInfoUnderScrutiny.activityInfo.applicationInfo.packageName;

            if (engineName.equals("com.google.android.tts")) {
                String version = "null";
                try {
                    version = pm.getPackageInfo(engineName,
                            PackageManager.GET_META_DATA).versionName;
                } catch (Exception e) {
                    Log.i("XXX", "Error getting google engine version: " + e.toString());
                }
                Log.i("XXX", "Google engine version " + version + " is installed!");
                googleIsInstalled = true;
            }

        }

        // Log the negative result once, after every engine has been checked
        if (!googleIsInstalled) {
            Log.i("XXX", "Google Engine is not installed!");
        }
        return googleIsInstalled;
    }
}

VoiceAdapter.java:

package [ your package name ];

import android.content.Context;
import android.graphics.Color;
import android.speech.tts.Voice;
import android.view.LayoutInflater;
import android.view.View;
import android.view.ViewGroup;
import android.widget.BaseAdapter;
import android.widget.TextView;

import java.util.List;

public class VoiceAdapter extends BaseAdapter {

    private Context mContext;
    private LayoutInflater mInflater;
    private List<Voice> mDataSource;

    public VoiceAdapter(Context context, List<Voice> voicesToDisplay) {
        mContext = context;
        mDataSource = voicesToDisplay;
        mInflater = (LayoutInflater) mContext.getSystemService(Context.LAYOUT_INFLATER_SERVICE);
    }

    @Override
    public int getCount() {
        return mDataSource.size();
    }

    @Override
    public Object getItem(int position) {
        return mDataSource.get(position);
    }

    @Override
    public long getItemId(int position) {
        return position;
    }

    @Override
    public View getView(int position, View convertView, ViewGroup parent) {

        // In a real app this approach is inefficient;
        // the "View Holder" pattern should be used instead.
        View rowView = mInflater.inflate(R.layout.list_item_voice, parent, false);

        if (position%2 == 0) {
            rowView.setBackgroundColor(Color.rgb(245,245,245));
        }

        Voice voiceUnderScrutiny = mDataSource.get(position);

        // example output of Voice.toString() :
        // "Voice[Name: pt-br-x-afs#male_2-local, locale: pt_BR, quality: 400, latency: 200,
        // requiresNetwork: false, features: [networkTimeoutMs, notInstalled, networkRetriesCount]]"

        // Get title element
        TextView voiceTitleTextView =
                (TextView) rowView.findViewById(R.id.voice_title);

        TextView qualityTextView =
                (TextView) rowView.findViewById(R.id.voice_quality);

        TextView networkRequiredTextView =
                (TextView) rowView.findViewById(R.id.voice_network);

        TextView isInstalledTextView =
                (TextView) rowView.findViewById(R.id.voice_installed);

        TextView featuresTextView =
                (TextView) rowView.findViewById(R.id.voice_features);

        voiceTitleTextView.setText("VOICE NAME: " + voiceUnderScrutiny.getName());

        // Voice Quality...
        // ( https://developer.android.com/reference/android/speech/tts/Voice.html )
        // 100 = Very Low, 200 = Low, 300 = Normal, 400 = High, 500 = Very High
        qualityTextView.setText("QLTY: " + voiceUnderScrutiny.getQuality());
        if (voiceUnderScrutiny.getQuality() == Voice.QUALITY_VERY_HIGH) {
            qualityTextView.setTextColor(Color.GREEN); // set v. high quality to green
        }

        if (!voiceUnderScrutiny.isNetworkConnectionRequired()) {
            networkRequiredTextView.setText("NET_REQ?: NO");
        } else {
            networkRequiredTextView.setText("NET_REQ?: YES");
        }

        if (!voiceUnderScrutiny.getFeatures().contains("notInstalled")) {
            isInstalledTextView.setTextColor(Color.GREEN);
            isInstalledTextView.setText("INSTLLD?: YES");
        } else {
            isInstalledTextView.setTextColor(Color.RED);
            isInstalledTextView.setText("INSTLLD?: NO");
        }

        featuresTextView.setText("FEATURES: " + voiceUnderScrutiny.getFeatures().toString());

        return rowView;
    }
}

activity_main.xml:

<?xml version="1.0" encoding="utf-8"?>
<android.support.constraint.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:focusable="true"
    android:focusableInTouchMode="true"
    tools:context=".MainActivity">

    <EditText
        android:id="@+id/textToSpeak"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:ems="10"
        android:inputType="textPersonName"
        android:text="textToSpeak..."
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent" />

    <android.support.v4.widget.SwipeRefreshLayout
        android:id="@+id/swipeRefresh"
        android:layout_width="0dp"
        android:layout_height="0dp"
        android:layout_marginBottom="8dp"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.0"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/progressView">

    <ListView
        android:id="@+id/voiceListView"
        android:layout_width="0dp"
        android:layout_height="0dp"
        android:layout_marginBottom="8dp"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp">

    </ListView>

    </android.support.v4.widget.SwipeRefreshLayout>

    <TextView
        android:id="@+id/progressView"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="UTTERANCE_PROGRESS:"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/textToSpeak" />

</android.support.constraint.ConstraintLayout>

list_item_voice.xml:

<?xml version="1.0" encoding="utf-8"?>
<android.support.constraint.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="wrap_content"

    android:layout_centerInParent="true"
    android:paddingBottom="8dp"
    android:paddingLeft="16dp"
    android:paddingRight="16dp"
    android:paddingTop="8dp"
    >

    <TextView
        android:id="@+id/voice_title"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="NAME:"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent" />


    <TextView
        android:id="@+id/voice_installed"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginTop="8dp"
        android:fontFamily="monospace"
        android:text="INSTALLED? "
        android:textAlignment="textStart"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.5"
        app:layout_constraintStart_toEndOf="@+id/voice_network"
        app:layout_constraintTop_toBottomOf="@+id/voice_title" />

    <TextView
        android:id="@+id/voice_quality"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="QUALITY:"
        app:layout_constraintEnd_toStartOf="@+id/voice_network"
        app:layout_constraintHorizontal_bias="0.5"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/voice_title" />

    <TextView
        android:id="@+id/voice_features"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginBottom="8dp"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="FEATURES:"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/voice_quality" />

    <TextView
        android:id="@+id/voice_network"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginEnd="8dp"
        android:layout_marginStart="8dp"
        android:layout_marginTop="8dp"
        android:text="NET_REQUIRED?"
        app:layout_constraintEnd_toStartOf="@+id/voice_installed"
        app:layout_constraintHorizontal_bias="0.5"
        app:layout_constraintStart_toEndOf="@+id/voice_quality"
        app:layout_constraintTop_toBottomOf="@+id/voice_title" />

</android.support.constraint.ConstraintLayout>