Error when checking : expected input_1 to have 4 dimension(s), but got array with shape [320,240,3] in react native

I created a TensorFlow model with Teachable Machine and want to run it in React Native. I use cameraWithTensors to get the input; here is the camera view:

<TensorCamera
            // Standard Camera props
            style={styles.camera}
            type={Camera.Constants.Type.front}
            // Tensor related props
            cameraTextureHeight={textureDims.height}
            cameraTextureWidth={textureDims.width}
            resizeHeight={320}
            resizeWidth={240}
            resizeDepth={3}
            onReady={makeHandleCameraStream()}
            autorender={true}
          />

And here is the makeHandleCameraStream function:

const makeHandleCameraStream = () => {
    return (images, updatePreview, gl) => {
      const loop = async () => {
        const nextImageTensor = images.next().value;
        try {
          // const predictions = await model.estimateHands(nextImageTensor);
          const predictions = await model.predict(nextImageTensor); // this is the line where it breaks
          console.log(predictions);
          setPredictions(predictions);
        } catch (error) {
          // console.log(error.message)
        }
        requestAnimationFrame(loop);
      };
      loop();
    };
  };

This is the error I get when I try model.predict:

Error when checking : expected input_1 to have 4 dimension(s), but got array with shape [320,240,3]

I tried changing these two lines:

let expandedImageTensor = tf.expandDims(nextImageTensor, 0);

// error encountered without using reshape:
// Error when checking : expected input_1 to have shape [null,224,224,3] but got array with shape [1,320,240,3]

const predictions = await model.predict(expandedImageTensor.reshape([null, 240, 240, 3]));
// error after adding .reshape: Size(230400) must match the product of shape ,240,240,3

The usual format for image input in TFJS is NHWC (batch, height, width, channels), where N=1 for a single image and C=3 for an RGB input.

This means you need to expand the input to include the leading batch dimension - something like this should do it:

// add dimension at position 0
const expandedImageTensor = tf.expandDims(nextImageTensor, 0); 
// use it
const predictions = await model.predict(expandedImageTensor);
// dispose at the end to avoid memory leak
tf.dispose(expandedImageTensor);
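
If you prefer not to track disposals by hand, tf.tidy can release the intermediates for you. A minimal sketch, assuming model is a LayersModel (its predict is synchronous, so it can run inside tidy):

// tf.tidy frees every intermediate tensor created inside the callback;
// the tensor returned from the callback is kept alive
const predictions = tf.tidy(() => {
  const expanded = tf.expandDims(nextImageTensor, 0);
  return model.predict(expanded);
});
// read the values, then dispose the kept tensor
const scores = predictions.dataSync();
tf.dispose(predictions);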

A full example using the model posted in git (this is just a quick test using tfjs-node, but the same concepts apply to the ops you're using in react-native):

// git clone https://github.com/ChinmayMhatre/ml_test
const fs = require('fs');
const tf = require('@tensorflow/tfjs-node');

async function main() {
  const model = await tf.loadLayersModel('file://assets/models/model.json');
  const metadata = fs.readFileSync('assets/models/metadata.json');
  const labels = JSON.parse(metadata.toString()).labels;

  const t = {}; // container that will hold all tensor variables
  const buffer = fs.readFileSync('test.jpg');
  t.decoded = tf.node.decodeJpeg(buffer); // in browser use tf.browser.fromPixels
  t.resized = tf.image.resizeBilinear(t.decoded, [224, 224]);
  t.expanded = tf.expandDims(t.resized, 0);
  t.results = await model.predict(t.expanded);
  const data = await t.results.data();
  for (const tensor of Object.keys(t)) tf.dispose(t[tensor]); // deallocate all tensors in a single swoop
  const results = [];
  for (let i = 0; i < data.length; i++) {
    results.push({ score: data[i], label: labels[i] });
  }
  results.sort((curr, prev) => prev.score - curr.score);
  console.log(results);
}

main();
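
If the shapes are ever in doubt, logging them at each step makes the NHWC requirement visible (these lines are illustrative additions, not part of the repo):

console.log(t.decoded.shape);  // [height, width, 3] straight from the JPEG
console.log(t.resized.shape);  // [224, 224, 3] after resizeBilinear
console.log(t.expanded.shape); // [1, 224, 224, 3] - the NHWC batch the model expects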

I don't know what the model was trained on, and the labels are just "10", "20", ..., so the scores for my test input image are low, but it works.

On a side note, Google's Teachable Machine is quite old and the models it produces are simplistic and rough - they work for now, but I'd recommend not basing anything on them.

This is from another question I answered, also about making predictions on a live video feed in React Native.

A few notes:

  1. You don't have to resize the tensor yourself, since you can use the TensorCamera props resizeHeight, resizeWidth, and resizeDepth - in this case set them to 224, 224, 3.
  2. In the handleCameraStream() function, you only want to run predictions once the model has been set in state.
  3. You need to actively cancel the animation frames with cancelAnimationFrame, using the ID returned by requestAnimationFrame - honestly, I forget exactly why.

The full component:

export default function App() {
  const [isModelRead, setIsModelRead] = useState(false);
  const [useModel, setUseModel] = useState({});
  const [model, setModel] = useState(null);
  const [cameraPermission, setCameraPermission] = useState(false);
  const [predictions, setPredictions] = useState([]);

  let requestAnimationFrameId = 0;

  useEffect(() => {
    return () => {
      cancelAnimationFrame(requestAnimationFrameId);
    };
  }, [requestAnimationFrameId]);

  const setUp = async () => {
    try {
      await tf.ready();
      const { status } = await Camera.requestCameraPermissionsAsync();
      console.log(status);
      setCameraPermission(status == "granted");
      const newmodel = await tf.loadLayersModel(
        bundleResourceIO(modelJson, modelWeights)
      );
      setIsModelRead(true);
      setModel(newmodel);
      console.log("model loaded");
      console.log(cameraPermission);
      return newmodel; // the model state is not updated yet inside this call
    } catch (error) {
      console.log("Could not load model", error);
    }
  };

  useEffect(() => {
    setUp();
  }, []);

  let textureDims;
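  // size of the native camera texture, which differs per platform (common preview dimensions)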
  if (Platform.OS === "ios") {
    textureDims = {
      height: 1920,
      width: 1080,
    };
  } else {
    textureDims = {
      height: 1200,
      width: 1600,
    };
  }

  const handleCameraStream = (tensors) => {
    if (!tensors) {
      console.log("Image not found!");
    }
    const loop = async () => {
      // only predict once the model has been set in state
      if (model) {
        const imageTensor = tensors.next().value;
        // add dimension at position 0
        const expandedImageTensor = tf.expandDims(imageTensor, 0);

        const predictions = await model.predict(expandedImageTensor, {
          batchSize: 1,
        });
        setPredictions(predictions.dataSync());
        // free the tensors created this frame
        tf.dispose([imageTensor, expandedImageTensor, predictions]);
      }
      requestAnimationFrameId = requestAnimationFrame(loop);
    };
    loop();
  };

  const predictionAvailable = () => {
    return <Text>{Array.from(predictions).join(", ")}</Text>;
  };

  return (
    <View>
      {model && (
        <TensorCamera
          // Standard Camera props
          style={styles.camera}
          type={Camera.Constants.Type.front}
          cameraTextureHeight={textureDims.height}
          cameraTextureWidth={textureDims.width}
          resizeHeight={224}
          resizeWidth={224}
          resizeDepth={3}
          onReady={(tensors) => handleCameraStream(tensors)}
          autorender={true}
        />
      )}
      {predictions.length > 0 && predictionAvailable()}
    </View>
  );
}
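
To show labelled scores instead of raw numbers, the labels array from the model's metadata.json can be applied to the dataSync() output, the same way as in the node example above. A minimal sketch, assuming metadata.json can be imported from your bundled assets (hypothetical path):

import metadata from "./assets/models/metadata.json"; // hypothetical path - adjust to your bundle
const labels = metadata.labels;

const predictionAvailable = () => (
  <Text>
    {Array.from(predictions)
      .map((score, i) => `${labels[i]}: ${score.toFixed(2)}`)
      .join("\n")}
  </Text>
);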