How to generate predictions from new data using a trained TensorFlow network?

I want to train Google's VGGish network (Hershey et al. 2017) from scratch to predict classes specific to my own audio files.

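To do this, I am using the vggish_train_demo.py script available in its GitHub repository, which uses TensorFlow. I have been able to modify the script to extract melspec features from my own audio by changing the _get_examples_batch() function, and then to train the model on the output of that function. This runs to completion and prints the loss at each epoch.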

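However, I have been unable to figure out how to get this trained model to generate predictions from new data. Can this be done by changing the vggish_train_demo.py script?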

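For anyone who stumbles across this question in the future, I wrote the script below, which does the job. You must save the logmel specs for the training and test data in the arrays X_train, y_train, X_test and y_test. X_train/test are arrays of features with shape (n, 96, 64), and y_train/test are arrays of shape (n, _NUM_CLASSES) for the two classes, where n = the number of 0.96s audio segments and _NUM_CLASSES = the number of classes used.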
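As a rough illustration of the array layout described above, the following sketch builds placeholder arrays with the expected shapes using NumPy. The random features, the integer `labels` vector and the helper `to_one_hot` are hypothetical stand-ins, not part of the original script; real log-mel features would come from your own audio.

```python
import numpy as np

_NUM_CLASSES = 2   # number of classes, as in the description above
n = 5              # hypothetical number of 0.96 s audio segments


def to_one_hot(labels, num_classes):
    """Turn integer class labels of shape (n,) into an (n, num_classes) array."""
    one_hot = np.zeros((len(labels), num_classes), dtype=np.float32)
    one_hot[np.arange(len(labels)), labels] = 1.0
    return one_hot


# Placeholder features: one 96 x 64 log-mel patch per 0.96 s segment.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(n, 96, 64)).astype(np.float32)

# Hypothetical integer labels, one per segment, converted to one-hot rows.
labels = np.array([0, 1, 1, 0, 1])
y_train = to_one_hot(labels, _NUM_CLASSES)

print(X_train.shape)  # (5, 96, 64)
print(y_train.shape)  # (5, 2)
```

X_test and y_test would follow the same layout, with their own value of n.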

See the function definition statement for more information, along with my original VGGish GitHub post:

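### Run the network and save the predictions and accuracy at each epoch

### Train NN, output results
# Assumed setup (not shown in the original post): the flag definitions (FLAGS)
# from vggish_train_demo.py, and the globals X_train, y_train, X_test, y_test,
# num_epochs, batch_size, col_names and _NUM_CLASSES defined beforehand.
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf  # assumption; on TF 1.x use `import tensorflow as tf`
import tf_slim as slim             # assumption; on TF 1.x use `slim = tf.contrib.slim`
import vggish_params
import vggish_slim
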
r"""This uses the VGGish model definition within a larger model which adds two 
layers on top, and then trains this larger model. 

We input log-mel spectrograms (X_train) calculated above with associated labels 
(y_train), and feed the batches into the model. Once the model is trained, it 
is then executed on the test log-mel spectrograms (X_test), and the accuracy is
output, alongside a .csv file with the predictions for each 0.96s chunk and their
true class."""
    
def main(_):  # tf.app.run() passes argv to main(); it is not used here
  with tf.Graph().as_default(), tf.Session() as sess:
    # Define VGGish.
    embeddings = vggish_slim.define_vggish_slim(training=FLAGS.train_vggish)
    
    
    # Define a shallow classification model and associated training ops on top
    # of VGGish.
    with tf.variable_scope('mymodel'):
      # Add a fully connected layer with 100 units. Add an activation function
      # to the embeddings since they are pre-activation.
      num_units = 100
      fc = slim.fully_connected(tf.nn.relu(embeddings), num_units)

      # Add a classifier layer at the end, consisting of parallel logistic
      # classifiers, one per class. This allows for multi-class tasks.
      # `linear_out` holds the raw (pre-sigmoid) outputs of this layer.
      linear_out = slim.fully_connected(
          fc, _NUM_CLASSES, activation_fn=None, scope='linear_out')
      # Despite its name, `logits` holds the sigmoid of `linear_out`, i.e.
      # per-class scores in [0, 1]; it is used below for the predictions.
      logits = tf.sigmoid(linear_out, name='logits')
    
      # Add training ops.
      with tf.variable_scope('train'):
        global_step = tf.train.create_global_step()

        # Labels are assumed to be fed as a batch multi-hot vectors, with
        # a 1 in the position of each positive class label, and 0 elsewhere.
        labels_input = tf.placeholder(
            tf.float32, shape=(None, _NUM_CLASSES), name='labels')

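        # Cross-entropy label loss. sigmoid_cross_entropy_with_logits applies
        # the sigmoid itself, so it must be given the raw layer outputs
        # (`linear_out`), not the already-sigmoided `logits` tensor.
        xent = tf.nn.sigmoid_cross_entropy_with_logits(
            logits=linear_out, labels=labels_input, name='xent')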
        loss = tf.reduce_mean(xent, name='loss_op')
        tf.summary.scalar('loss', loss)

        # We use the same optimizer and hyperparameters as used to train VGGish.
        optimizer = tf.train.AdamOptimizer(
            learning_rate=vggish_params.LEARNING_RATE,
            epsilon=vggish_params.ADAM_EPSILON)
        train_op = optimizer.minimize(loss, global_step=global_step)

    # Initialize all variables in the model, and then load the pre-trained
    # VGGish checkpoint.
    sess.run(tf.global_variables_initializer())         
    vggish_slim.load_vggish_slim_checkpoint(sess, FLAGS.checkpoint)

    # The training loop.
    features_input = sess.graph.get_tensor_by_name(
        vggish_params.INPUT_TENSOR_NAME)
    


    # Build the evaluation ops once, outside the loop, so that no new graph
    # nodes are added at every epoch. tf.argmax returns the index of the
    # highest-scoring class for each example; a prediction counts as correct
    # when that index matches the index of the true class.
    correct = tf.equal(tf.argmax(logits, 1), tf.argmax(labels_input, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

    accuracy_scores = []
    for epoch in range(num_epochs):
      epoch_loss = 0
      # Step through the training data one mini-batch at a time.
      for start in range(0, len(X_train), batch_size):
        end = start + batch_size
        batch_x = np.array(X_train[start:end])
        batch_y = np.array(y_train[start:end])

        _, c = sess.run([train_op, loss],
                        feed_dict={features_input: batch_x, labels_input: batch_y})
        epoch_loss += c

      # Print the epoch number and the loss summed over its batches.
      print('Epoch', epoch + 1, 'completed out of', num_epochs, ', loss:', epoch_loss)

      # Evaluate on the test data after every epoch (this adds a small
      # computational cost). Evaluating `accuracy` with X_test and y_test fed
      # in runs every op it depends on, i.e. the whole model.
      accuracy1 = accuracy.eval({features_input: X_test, labels_input: y_test})
      accuracy_scores.append(accuracy1)
      print('Accuracy:', accuracy1)

      # Save predictions for the test data. `logits` is the sigmoid output
      # defined above, so these are per-class scores in [0, 1].
      predictions_sigm = logits.eval(feed_dict={features_input: X_test})
      test_preds = pd.DataFrame(predictions_sigm, columns=col_names)
      true_class = np.argmax(y_test, axis=1)  # index of the true class
      test_preds['True class'] = true_class   # add the true class to the table

      # Save the predictions table for this epoch as comma-separated values.
      # NB: np.savetxt writes only the values, so the column header is not
      # saved unless it is passed through the `header` argument.
      np.savetxt("/content/drive/MyDrive/..." + "Epoch_" + str(epoch + 1) + "_Accuracy_" + str(accuracy1),
                 test_preds.values, delimiter=",")

    
if __name__ == '__main__':
  tf.app.run()


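# Note: when run in a notebook, the message "An exception has occurred, use %tb
# to see the full traceback." appears at the end. This is expected and only
# means the script has finished: tf.app.run() calls sys.exit() once main()
# returns, and the notebook reports the resulting SystemExit as an exception.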
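
For readers who want to check the evaluation step outside TensorFlow, here is a minimal NumPy sketch of what the accuracy computation and the 'True class' column amount to. The score and label arrays are made-up examples, and `predicted_class` and `accuracy_from_scores` are hypothetical helper names, not part of the script above.

```python
import numpy as np


def predicted_class(scores):
    """Index of the highest-scoring class for each row of an (n, classes) array."""
    return np.argmax(scores, axis=1)


def accuracy_from_scores(scores, one_hot_labels):
    """Fraction of rows whose top-scoring class matches the one-hot label."""
    true_class = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted_class(scores) == true_class))


# Made-up per-class scores for four 0.96 s chunks and two classes.
scores = np.array([[0.9, 0.2],
                   [0.3, 0.6],
                   [0.7, 0.4],
                   [0.1, 0.8]])
# Made-up one-hot labels; the third chunk is misclassified by the scores above.
y_true = np.array([[1, 0],
                   [0, 1],
                   [0, 1],
                   [0, 1]])

print(predicted_class(scores))               # [0 1 0 1]
print(accuracy_from_scores(scores, y_true))  # 0.75
```

The same two functions could be applied to the rows of a saved predictions file, assuming its last column holds the true class index as in the script.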