How to set a box on a CameraX preview so it can be processed using ImageAnalysis in Java?
I have been working on an app that uses CameraX for its preview stream, but it also needs a box-style overlay from which to decode text. I have the preview working, but I can't seem to find a way to implement an overlay that restricts text decoding to the box without using any third-party library; right now the app decodes text from the whole screen. I saw code that does this in a Codelabs tutorial (link), but it's in Kotlin and I was unable to decipher it. If anyone could help me do this without third-party libraries, that would be great. Thanks in advance.
My XML code:
<androidx.camera.view.PreviewView
android:id="@+id/previewView"
android:layout_width="match_parent"
android:layout_height="675dp"
app:layout_constraintStart_toStartOf="parent"
app:layout_constraintTop_toBottomOf="@+id/toolbar" />
My camera logic:
PreviewView mCameraView;
Camera camera;
void startCamera() {
mCameraView = findViewById(R.id.previewView);
cameraProviderFuture = ProcessCameraProvider.getInstance(this);
cameraProviderFuture.addListener(() -> {
try {
ProcessCameraProvider cameraProvider = cameraProviderFuture.get();
bindPreview(cameraProvider);
} catch (ExecutionException | InterruptedException e) {
// No errors need to be handled for this Future.
// This should never be reached.
}
}, ContextCompat.getMainExecutor(this));
}
void bindPreview(@NonNull ProcessCameraProvider cameraProvider) {
Preview preview = new Preview.Builder()
.setTargetResolution(BestSize())
.build();
CameraSelector cameraSelector = new CameraSelector.Builder()
.requireLensFacing(CameraSelector.LENS_FACING_BACK)
.build();
preview.setSurfaceProvider(mCameraView.createSurfaceProvider());
ImageAnalysis imageAnalysis = new ImageAnalysis.Builder()
.setTargetResolution(new Size(4000, 5000))
.setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
.build();
imageAnalysis.setAnalyzer(executor, image -> {
frames++;
int rotationDegrees = degreesToFirebaseRotation(image.getImageInfo().getRotationDegrees());
Image mediaImage = image.getImage();
if (mediaImage == null) {
return;
}
FirebaseVisionImage firebaseVisionImage = FirebaseVisionImage.fromMediaImage(mediaImage,
rotationDegrees);
FirebaseVisionTextRecognizer detector =
FirebaseVision.getInstance().getOnDeviceTextRecognizer();
detector.processImage(firebaseVisionImage)
.addOnSuccessListener(firebaseVisionText -> {
// Task completed successfully
String text = firebaseVisionText.getText();
if (!text.isEmpty()) {
if (firstValidFrame == 0)
firstValidFrame = frames;
validFrames++;
}
mTextView.setText(text);
image.close();
})
.addOnFailureListener(
e -> {
Log.e("Error", e.toString());
image.close();
});
});
camera = cameraProvider.bindToLifecycle(this, cameraSelector, imageAnalysis, preview);
}
private int degreesToFirebaseRotation(int degrees) {
switch (degrees) {
case 0:
return FirebaseVisionImageMetadata.ROTATION_0;
case 90:
return FirebaseVisionImageMetadata.ROTATION_90;
case 180:
return FirebaseVisionImageMetadata.ROTATION_180;
case 270:
return FirebaseVisionImageMetadata.ROTATION_270;
default:
throw new IllegalArgumentException(
"Rotation must be 0, 90, 180, or 270.");
}
}
This is basically the logic from the codelab; the cropped image can then be sent to text recognition.
I found a solution and wrote an article with a demo repo for anyone facing the same problem. Here is the link:
https://medium.com/@sdptd20/exploring-ocr-capabilities-of-ml-kit-using-camera-x-9949633af0fe
- Basically, what I did was use ImageAnalysis to grab frames from the CameraX preview.
- Then I created a SurfaceView on top of the preview and drew a rectangle on it.
- Then I took the rectangle's offsets and cropped my bitmap accordingly.
- Then I fed the cropped bitmap to the Firebase image analyzer, so the text I get back comes only from inside the bounding box.
Here is the gist of the main activity:
public class MainActivity extends AppCompatActivity implements SurfaceHolder.Callback {
TextView textView;
PreviewView mCameraView;
SurfaceHolder holder;
SurfaceView surfaceView;
Canvas canvas;
Paint paint;
int cameraHeight, cameraWidth, xOffset, yOffset, boxWidth, boxHeight;
private ListenableFuture<ProcessCameraProvider> cameraProviderFuture;
private ExecutorService executor = Executors.newSingleThreadExecutor();
/**
*Responsible for converting the rotation degrees from CameraX into the one compatible with Firebase ML
*/
private int degreesToFirebaseRotation(int degrees) {
switch (degrees) {
case 0:
return FirebaseVisionImageMetadata.ROTATION_0;
case 90:
return FirebaseVisionImageMetadata.ROTATION_90;
case 180:
return FirebaseVisionImageMetadata.ROTATION_180;
case 270:
return FirebaseVisionImageMetadata.ROTATION_270;
default:
throw new IllegalArgumentException(
"Rotation must be 0, 90, 180, or 270.");
}
}
/**
* Starting Camera
*/
void startCamera(){
mCameraView = findViewById(R.id.previewView);
cameraProviderFuture = ProcessCameraProvider.getInstance(this);
cameraProviderFuture.addListener(new Runnable() {
@Override
public void run() {
try {
ProcessCameraProvider cameraProvider = cameraProviderFuture.get();
MainActivity.this.bindPreview(cameraProvider);
} catch (ExecutionException | InterruptedException e) {
// No errors need to be handled for this Future.
// This should never be reached.
}
}
}, ContextCompat.getMainExecutor(this));
}
/**
*
* Binding to camera
*/
private void bindPreview(ProcessCameraProvider cameraProvider) {
Preview preview = new Preview.Builder()
.build();
CameraSelector cameraSelector = new CameraSelector.Builder()
.requireLensFacing(CameraSelector.LENS_FACING_BACK)
.build();
preview.setSurfaceProvider(mCameraView.createSurfaceProvider());
//Image Analysis Function
//Set static size according to your device or write a dynamic function for it
ImageAnalysis imageAnalysis =
new ImageAnalysis.Builder()
.setTargetResolution(new Size(720, 1488))
.setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
.build();
imageAnalysis.setAnalyzer(executor, new ImageAnalysis.Analyzer() {
@SuppressLint("UnsafeExperimentalUsageError")
@Override
public void analyze(@NonNull ImageProxy image) {
if (image.getImage() == null) {
//Close the proxy so STRATEGY_KEEP_ONLY_LATEST can deliver the next frame
image.close();
return;
}
//Converting the rotation reported by CameraX into the Firebase rotation constant
int rotationDegrees = degreesToFirebaseRotation(image.getImageInfo().getRotationDegrees());
//Getting a FirebaseVisionImage object using the Image object and rotationDegrees
final Image mediaImage = image.getImage();
FirebaseVisionImage images = FirebaseVisionImage.fromMediaImage(mediaImage, rotationDegrees);
//Getting bitmap from FirebaseVisionImage Object
Bitmap bmp=images.getBitmap();
//Getting the values for cropping
int height = bmp.getHeight();
int width = bmp.getWidth();
int left, right, top, bottom, diameter;
diameter = width;
if (height < width) {
diameter = height;
}
int offset = (int) (0.05 * diameter);
diameter -= offset;
left = width / 2 - diameter / 3;
top = height / 2 - diameter / 3;
right = width / 2 + diameter / 3;
bottom = height / 2 + diameter / 3;
xOffset = left;
yOffset = top;
//Creating a new cropped bitmap, using the box computed from this frame's dimensions
Bitmap bitmap = Bitmap.createBitmap(bmp, left, top, right - left, bottom - top);
//initializing FirebaseVisionTextRecognizer object
FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance()
.getOnDeviceTextRecognizer();
//Passing FirebaseVisionImage Object created from the cropped bitmap
Task<FirebaseVisionText> result = detector.processImage(FirebaseVisionImage.fromBitmap(bitmap))
.addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
@Override
public void onSuccess(FirebaseVisionText firebaseVisionText) {
// Task completed successfully
// ...
textView = findViewById(R.id.text);
//getting the decoded text
String text = firebaseVisionText.getText();
//Setting the decoded text in the TextView
textView.setText(text);
//for getting blocks and line elements
for (FirebaseVisionText.TextBlock block: firebaseVisionText.getTextBlocks()) {
String blockText = block.getText();
for (FirebaseVisionText.Line line: block.getLines()) {
String lineText = line.getText();
for (FirebaseVisionText.Element element: line.getElements()) {
String elementText = element.getText();
}
}
}
image.close();
}
})
.addOnFailureListener(
new OnFailureListener() {
@Override
public void onFailure(@NonNull Exception e) {
// Task failed with an exception
// ...
Log.e("Error",e.toString());
image.close();
}
});
}
});
Camera camera = cameraProvider.bindToLifecycle((LifecycleOwner)this, cameraSelector, imageAnalysis,preview);
}
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
//Start Camera
startCamera();
//Create the bounding box
surfaceView = findViewById(R.id.overlay);
surfaceView.setZOrderOnTop(true);
holder = surfaceView.getHolder();
holder.setFormat(PixelFormat.TRANSPARENT);
holder.addCallback(this);
}
/**
*
* For drawing the rectangular box
*/
private void DrawFocusRect(int color) {
DisplayMetrics displaymetrics = new DisplayMetrics();
getWindowManager().getDefaultDisplay().getMetrics(displaymetrics);
int height = mCameraView.getHeight();
int width = mCameraView.getWidth();
//cameraHeight = height;
//cameraWidth = width;
int left, right, top, bottom, diameter;
diameter = width;
if (height < width) {
diameter = height;
}
int offset = (int) (0.05 * diameter);
diameter -= offset;
canvas = holder.lockCanvas();
canvas.drawColor(0, PorterDuff.Mode.CLEAR);
//border's properties
paint = new Paint();
paint.setStyle(Paint.Style.STROKE);
paint.setColor(color);
paint.setStrokeWidth(5);
left = width / 2 - diameter / 3;
top = height / 2 - diameter / 3;
right = width / 2 + diameter / 3;
bottom = height / 2 + diameter / 3;
xOffset = left;
yOffset = top;
boxHeight = bottom - top;
boxWidth = right - left;
//Changing the x in diameter / x changes the size of the box; a larger x gives a smaller box
canvas.drawRect(left, top, right, bottom, paint);
holder.unlockCanvasAndPost(canvas);
}
/**
* Callback functions for the surface Holder
*/
@Override
public void surfaceCreated(SurfaceHolder holder) {
}
@Override
public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
//Drawing rectangle
DrawFocusRect(Color.parseColor("#b3dabb"));
}
@Override
public void surfaceDestroyed(SurfaceHolder holder) {
}
}
Edit: I found you can also use a PNG with an ImageView instead of a SurfaceView. That would probably be cleaner, and you could also integrate custom layouts for the user to overlay.
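For reference, a minimal sketch of that ImageView-based overlay layout (the `scan_box` drawable and the ids here are my own placeholders, not from the original demo):

```xml
<FrameLayout
    xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <androidx.camera.view.PreviewView
        android:id="@+id/previewView"
        android:layout_width="match_parent"
        android:layout_height="match_parent" />

    <!-- Transparent PNG with the box outline, drawn on top of the preview -->
    <ImageView
        android:id="@+id/overlay"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:src="@drawable/scan_box"
        android:scaleType="centerInside" />
</FrameLayout>
```

Because the ImageView sits above the PreviewView in the same FrameLayout, no SurfaceHolder callbacks or canvas locking are needed.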
Edit 2: I found that sending a bitmap to the image analyzer can be inefficient (I was working on the ML Kit barcode reader and it explicitly logs a warning about this), so what we can do instead is:
imagePreview.setCropRect(r);
where imagePreview is the ImageProxy image and r is an android.graphics.Rect.
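The crop rectangle used throughout this answer is just a centered square derived from the frame size, so the same arithmetic works both for building the Rect passed to setCropRect and for the Bitmap.createBitmap crop. A self-contained sketch of that math (the helper name is mine; no Android dependencies):

```java
public class CropBox {
    /**
     * Computes the centered crop box used in the answer: take the smaller
     * frame dimension, shave off 5% as a margin, and center a box two-thirds
     * of the remaining size. Returns {left, top, right, bottom}.
     */
    static int[] centeredBox(int width, int height) {
        int diameter = Math.min(width, height);
        diameter -= (int) (0.05 * diameter); // 5% margin, as in DrawFocusRect
        int left = width / 2 - diameter / 3;
        int top = height / 2 - diameter / 3;
        int right = width / 2 + diameter / 3;
        int bottom = height / 2 + diameter / 3;
        return new int[] { left, top, right, bottom };
    }

    public static void main(String[] args) {
        // Same 720x1488 target resolution as the ImageAnalysis config above
        int[] box = centeredBox(720, 1488);
        System.out.println(box[0] + "," + box[1] + "," + box[2] + "," + box[3]);
        // prints 132,516,588,972
    }
}
```

In the analyzer you would then wrap these four values in an android.graphics.Rect and call image.setCropRect(rect) before handing the frame to the detector.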