Why do I get worse results in Java than in Python when using the same TensorFlow models?

Question

Introduction:

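For educational purposes I developed a Java class that lets students load TensorFlow models in the TensorFlow SavedModel format and use them for classification in Java. For example, they can create a model online with Google's Teachable Machine, download it, and use it directly in Java. This also works with many image classification models on tfhub.dev. For this I tried to use the new but poorly documented Java API rather than the deprecated old libtensorflow API (if I understood everything correctly). Since I use BlueJ for this, everything is based on pure Java code, and the required libraries are linked directly in BlueJ's preferences after downloading. The documentation in the Java code shows where to download the libraries.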

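Note: I know that the "normal way today" is to use Gradle or Maven or something similar, but the students do not use these tools. Another note: in the following I only use code excerpts, to keep everything small enough for this minimal example.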

Problem:

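The results of all models I load in Java are OK, but not as good as in Python, that is, in the online demos linked on the TensorFlow website, which are mostly Jupyter notebooks. So there seems to be one wrong step in my code.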

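As a representative test, I will now compare the results of the MoveNet model in Python and in Java. The MoveNet model "Thunder" detects 17 keypoints of the human body in an image of 256x256 pixels. I use exactly the same image in both setups (the same file, untouched and not resized). I uploaded it to my webspace; this step was done when updating this text, but it made no difference to the results.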

Python: The MoveNet model comes with a nice online Python demo in a Jupyter notebook:

https://www.tensorflow.org/hub/tutorials/movenet

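The code can be found here (note: I changed the image link to the same image as in my Java project by uploading it to my webspace and linking to it), and the classification result for the image looks like this: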

Java: My Java-based approach ends up with an image like this:

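I think it is not bad, but it is not perfect. With other models, e.g. Google's imagenet_mobilenet models, I get similar results, but I have the impression that they are always a bit better when running the online demos in Jupyter notebooks. I have no further evidence, only a feeling. In some cases the same image from the online demo is recognized as a different class, but not always. I might provide more data on that later.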

Assumptions and work done so far:

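There might be an error in the data structures or algorithms in my Java code. I searched the web for several weeks, but I am unsure whether my code is really correct, mainly because there are so few examples out there. For example, I tried changing the order of RGB, or the way it is computed, in the method that converts the image into an ndarray. However, I saw no significant change. Maybe the error is somewhere else. Or maybe this is just the way it is: if my code works well and is correct, that is fine with me, but I would still like to know why there are differences. Thanks for your answers!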

Code:

Here is a full example with two classes (I know, the frame with the panel drawing is ugly; I wrote it quickly just for this example):

/**
 * 1. TensorFlow Core API Library: org.tensorflow -> tensorflow-core-api
 *      https://mvnrepository.com/artifact/org.tensorflow/tensorflow-core-api
 *          -> tensorflow-core-api-0.4.0.jar
 *      
 * 2.   additionally click "View All" and open:
 *      https://repo1.maven.org/maven2/org/tensorflow/tensorflow-core-api/0.4.0/
 *      Download the correct native library for your OS
 *          -> tensorflow-core-api-0.4.0-macosx-x86_64.jar
 *          -> tensorflow-core-api-0.4.0-windows-x86_64.jar
 *          -> tensorflow-core-api-0.4.0-linux-x86_64.jar 
 *      
 * 3. TensorFlow Framework Library:  org.tensorflow -> tensorflow-framework
 *      https://mvnrepository.com/artifact/org.tensorflow/tensorflow-framework/0.4.0
 *          -> tensorflow-framework-0.4.0.jar      
 *          
 * 4. Protocol Buffers [Core]: com.google.protobuf -> protobuf-java
 *      https://mvnrepository.com/artifact/com.google.protobuf/protobuf-java
 *          -> protobuf-java-4.0.0-rc-2.jar
 * 
 * 5. JavaCPP: org.bytedeco -> javacpp
 *      https://mvnrepository.com/artifact/org.bytedeco/javacpp
 *          -> javacpp-1.5.7.jar
 * 
 * 6. TensorFlow NdArray Library:  org.tensorflow -> ndarray
 *      https://mvnrepository.com/artifact/org.tensorflow/ndarray
 *          -> ndarray-0.3.3.jar
 */
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Tensor;
import org.tensorflow.ndarray.IntNdArray;
import org.tensorflow.ndarray.NdArrays;
import org.tensorflow.ndarray.Shape;
import org.tensorflow.types.TInt32;
import java.util.HashMap;
import java.util.Map;
import java.awt.image.BufferedImage;
import javax.imageio.ImageIO;
import java.awt.Color;
import java.io.File;
import javax.swing.JFrame;
import javax.swing.JButton;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.awt.BorderLayout;

public class MoveNetDemo {

    private SavedModelBundle model;
    private String inputLayerName;
    private String outputLayerName;
    private String keyName;
    private BufferedImage image;
    private float[][] output;    
    private int width;
    private int height;

    public MoveNetDemo(String pFoldername, int pImageWidth, int pImageHeight) {
        width = pImageWidth;
        height = pImageHeight;

        model = SavedModelBundle.load(pFoldername, "serve");
        // Read input and output layer names from file
        inputLayerName = model.signatures().get(0).getInputs().keySet().toString();
        outputLayerName = model.signatures().get(0).getOutputs().keySet().toString();
        inputLayerName = inputLayerName.substring(1, inputLayerName.length()-1);
        outputLayerName = outputLayerName.substring(1, outputLayerName.length()-1);
        keyName = model.signatures().get(0).key();        
    }

    // not necessary here
    public String getModelInformation() { 
        String infos = "";
        for (int i=0; i<model.signatures().size(); i++) {
            infos += model.signatures().get(i).toString();
        }         
        return infos;
    }  

    public void setData(String pFilename) {
        image = null;
        try {
            image = ImageIO.read(new File(pFilename));            
        } 
        catch (Exception e) {          
        }
    }

    public BufferedImage getData() {
        return image;
    }

    private IntNdArray fillIntNdArray(IntNdArray pMatrix, BufferedImage pImage) {        
        try {
            int w = pImage.getWidth();
            int h = pImage.getHeight();                

            for (int i = 0; i < h; i++) {
                for (int j = 0; j < w; j++) {                 
                    Color mycolor = new Color(pImage.getRGB(j, i));
                    int red = mycolor.getRed();
                    int green = mycolor.getGreen();
                    int blue = mycolor.getBlue();
                    pMatrix.setInt(red, 0, j, i, 0);
                    pMatrix.setInt(green, 0, j, i, 1);
                    pMatrix.setInt(blue, 0, j, i, 2);                                       
                }
            }
        }
        catch (Exception e) {            
        }
        return pMatrix;        
    }

    public void run() {
        Map<String, Tensor> feed_dict = null;
        IntNdArray input_matrix = NdArrays.ofInts(Shape.of(1, width, height, 3));
        input_matrix = fillIntNdArray(input_matrix, image);            
        Tensor input_tensor = TInt32.tensorOf(input_matrix);
        feed_dict = new HashMap<>();
        feed_dict.put(inputLayerName, input_tensor); 
        Map<String, Tensor> res = model.function(keyName).call(feed_dict);                
        Tensor output_tensor = res.get(outputLayerName); 

        output = new float[17][3];
        for (int i= 0; i<17; i++) {
            output[i][0] = output_tensor.asRawTensor().data().asFloats().getFloat(i*3)*256;                
            output[i][1] = output_tensor.asRawTensor().data().asFloats().getFloat(i*3+1)*256;                
            output[i][2] = output_tensor.asRawTensor().data().asFloats().getFloat(i*3+2);
        }
    }

    public float[][] getOutputArray() {
        return output;
    }

    public static void main(String[] args) {
        MoveNetDemo im = new MoveNetDemo("/Users/myname/Downloads/Code/TF_Test_04_NEW/movenet_singlepose_thunder_4", 256, 256);        
        im.setData("/Users/myname/Downloads/Code/TF_Test_04_NEW/test.jpeg");

        JFrame jf = new JFrame("TEST");
        jf.setSize(300, 300);
        jf.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        ImagePanel ip = new ImagePanel(im.getData());
        jf.add(ip, BorderLayout.CENTER);

        JButton st = new JButton("RUN");
        st.addActionListener(new ActionListener() { 
                public void actionPerformed(ActionEvent e) {
                    im.run();                            
                    ip.update(im.getOutputArray());
                    
                } 
            });        
        jf.add(st, BorderLayout.NORTH);

        jf.setVisible(true);
    }
}

And the image panel class:

import javax.swing.JPanel;
import java.awt.image.BufferedImage;
import java.awt.Graphics;
import java.awt.Color;

public class ImagePanel extends JPanel {

    private BufferedImage image;
    private float[][] points;

    public ImagePanel(BufferedImage pImage) {        
        image = pImage;        
    }

    public void update(float[][] pPoints) {
        points = pPoints;
        repaint();
    }

    @Override
    protected void paintComponent(Graphics g) {                
        super.paintComponent(g);        
        g.drawImage(image, 0,0,null);
        g.setColor(Color.GREEN);
        if (points != null) {
            for (int j=0; j<17; j++) {                            
                g.fillOval((int)points[j][0], (int)points[j][1], 5, 5);
            } 
        }
    }
}

Answer 1

I found the answer. I mixed up height and width, twice! I do not know why this behaved so strangely (almost correct but not perfect), but it works now.

In the Jupyter notebook it says:

input_image: A [1, height, width, 3]

So I changed the method fillIntNdArray to:

private IntNdArray fillIntNdArray(IntNdArray pMatrix, BufferedImage pImage) {
    try {
        int w = pImage.getWidth();
        int h = pImage.getHeight();

        for (int i = 0; i < h; i++) {
            for (int j = 0; j < w; j++) {
                Color mycolor = new Color(pImage.getRGB(j, i));
                int red = mycolor.getRed();
                int green = mycolor.getGreen();
                int blue = mycolor.getBlue();
                pMatrix.setInt(red, 0, i, j, 0); // switched j and i
                pMatrix.setInt(green, 0, i, j, 1); // switched j and i
                pMatrix.setInt(blue, 0, i, j, 2); // switched j and i
            }
        }
    }
    catch (Exception e) {
    }
    return pMatrix;
}

And accordingly, in the run() method:

IntNdArray input_matrix = NdArrays.ofInts(Shape.of(1, height, width, 3));
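
The [1, height, width, 3] layout can be illustrated without TensorFlow. The following is a minimal sketch that uses a plain Java array in place of the IntNdArray; the class name NhwcLayoutDemo, the method toNhwc, and the 4x2 test image are made up for illustration. It shows that a pixel read with getRGB(x, y) belongs at index [0][y][x][channel], row first.

```java
import java.awt.image.BufferedImage;

final class NhwcLayoutDemo {

    // Copies an image into a [1][height][width][3] array:
    // batch, row (y), column (x), channel (R, G, B).
    static int[][][][] toNhwc(BufferedImage image) {
        int w = image.getWidth();
        int h = image.getHeight();
        int[][][][] data = new int[1][h][w][3];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = image.getRGB(x, y);          // BufferedImage takes (x, y)
                data[0][y][x][0] = (rgb >> 16) & 0xFF; // red
                data[0][y][x][1] = (rgb >> 8) & 0xFF;  // green
                data[0][y][x][2] = rgb & 0xFF;         // blue
            }
        }
        return data;
    }

    public static void main(String[] args) {
        // Hypothetical test image, width 4 and height 2, deliberately not square.
        BufferedImage img = new BufferedImage(4, 2, BufferedImage.TYPE_INT_RGB);
        img.setRGB(3, 1, 0xFF8000); // orange pixel at x=3, y=1
        int[][][][] t = toNhwc(img);
        System.out.println(t[0].length + " rows, " + t[0][0].length + " columns");
        System.out.println("red at row 1, column 3: " + t[0][1][3][0]);
    }
}
```

In this sketch, writing to [0][x][y] instead would throw an ArrayIndexOutOfBoundsException for the 4x2 image. With a square 256x256 image the same swap raises no error, and the array simply ends up holding the transposed picture.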

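In the Jupyter notebook you can expand the helper functions for visualization and see that the y coordinate is read first, then the x coordinate. Height first, then width. After making the corresponding change in the ImagePanel class, the problem was solved: the classification works as expected and has the same quality as in the online demo!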

if (points != null) {
    for (int j=0; j<17; j++) {                            
        // switched 0 and 1
        g.fillOval((int)points[j][1], (int)points[j][0], 5, 5);
    } 
}
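
One plausible explanation for the "almost correct but not perfect" behaviour, offered as a sketch rather than a verified diagnosis: with a square image, swapping rows and columns on input feeds the network the transposed picture, and reading the output as (x, y) instead of (y, x) transposes the coordinates back. The two mistakes cancel geometrically, so the points land roughly where they should, but the model has to locate a person in a mirrored, rotated image, which it may do less accurately. The class KeypointAxesDemo and all of its methods are hypothetical; onTransposedImage stands in for an idealized, perfectly accurate detector.

```java
final class KeypointAxesDemo {

    // Keypoint given as {y, x, score}, with y and x normalized to [0, 1].
    // Returns {xPixel, yPixel}, the order expected by drawing calls.
    static int[] toPixel(float[] keypoint, int imageWidth, int imageHeight) {
        int xPixel = Math.round(keypoint[1] * imageWidth);
        int yPixel = Math.round(keypoint[0] * imageHeight);
        return new int[] { xPixel, yPixel };
    }

    // What an idealized detector would report for the transposed image:
    // rows and columns trade places, so y and x trade places too.
    static float[] onTransposedImage(float[] keypoint) {
        return new float[] { keypoint[1], keypoint[0], keypoint[2] };
    }

    // The drawing logic before the fix: index 0 treated as x, index 1 as y.
    static int[] toPixelSwapped(float[] keypoint, int size) {
        return new int[] { Math.round(keypoint[0] * size), Math.round(keypoint[1] * size) };
    }

    public static void main(String[] args) {
        float[] nose = { 0.25f, 0.75f, 0.9f }; // hypothetical keypoint: y=0.25, x=0.75
        int[] correct = toPixel(nose, 256, 256);
        int[] doubleSwap = toPixelSwapped(onTransposedImage(nose), 256);
        System.out.println(correct[0] + "," + correct[1]);       // prints 192,64
        System.out.println(doubleSwap[0] + "," + doubleSwap[1]); // prints 192,64
    }
}
```

Both paths produce the same pixel position here only because the detector is assumed to be perfect; a real model would typically be less accurate on the transposed input.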
