HMS Community

On Device Camera Stream Text Recognition in KnowMyBoard Android App Using ML Kit, MVVM, Navigation Components

Introduction

In this article, we will learn how to integrate the Huawei ML Kit camera stream, Map Kit, and Location Kit into the Android application KnowMyBoard. Account Kit provides seamless login functionality for an app with a large user base.

The text recognition service can extract text from images of receipts, business cards, and documents. This service is useful for industries such as printing, education, and logistics. You can use it to create apps that handle data entry and check tasks.

The text recognition service is able to recognize text in both static images and dynamic camera streams with a host of APIs, which you can call synchronously or asynchronously to build your text recognition-enabled apps.

Precautions
[Image: precautions]

Development Overview

You need to install the Android Studio IDE, and I assume that you have prior knowledge of Android application development.

Hardware Requirements

A computer (desktop or laptop) running Windows 10.
An Android phone (with a USB cable), which is used for debugging.

Software Requirements

Java JDK 1.8 or later.
Android Studio or Visual Studio Code installed.
HMS Core (APK) 4.X or later.

Integration steps

Step 1. Create a Huawei developer account and complete identity verification on the Huawei developer website; refer to register Huawei ID.

Step 2. Create a project in AppGallery Connect.

Step 3. Add the HMS Core SDK.

Let's start coding

navigation_graph.xml

<?xml version="1.0" encoding="utf-8"?>
<navigation xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:id="@+id/navigation_graph"
    app:startDestination="@id/loginFragment">
    <fragment
        android:id="@+id/loginFragment"
        android:name="com.huawei.hms.knowmyboard.dtse.activity.fragments.LoginFragment"
        android:label="LoginFragment"/>
    <fragment
        android:id="@+id/mainFragment"
        android:name="com.huawei.hms.knowmyboard.dtse.activity.fragments.MainFragment"
        android:label="MainFragment"/>
    <fragment
        android:id="@+id/searchFragment"
        android:name="com.huawei.hms.knowmyboard.dtse.activity.fragments.SearchFragment"
        android:label="fragment_search"
        tools:layout="@layout/fragment_search" />
</navigation>

TextRecognitionActivity.java

public final class TextRecognitionActivity extends BaseActivity
        implements OnRequestPermissionsResultCallback, View.OnClickListener {
    private static final String TAG = "TextRecognitionActivity";
    private LensEngine lensEngine = null;
    private LensEnginePreview preview;
    private GraphicOverlay graphicOverlay;
    private ImageButton takePicture;
    private ImageButton imageSwitch;
    private RelativeLayout zoomImageLayout;
    private ZoomImageView zoomImageView;
    private ImageButton zoomImageClose;
    CameraConfiguration cameraConfiguration = null;
    private int facing = CameraConfiguration.CAMERA_FACING_BACK;
    private Camera mCamera;
    private boolean isLandScape;
    private Bitmap bitmap;
    private Bitmap bitmapCopy;
    private LocalTextTransactor localTextTransactor;
    private Handler mHandler = new MsgHandler(this);
    private Dialog languageDialog;
    private AddPictureDialog addPictureDialog;
    private TextView textCN;
    private TextView textEN;
    private TextView textJN;
    private TextView textKN;
    private TextView textLN;
    private TextView tv_language,tv_translated_txt;

    private String textType = Constant.POSITION_CN;
    private boolean isInitialization = false;
    MLTextAnalyzer analyzer;
    private static class MsgHandler extends Handler {
        WeakReference<TextRecognitionActivity> mMainActivityWeakReference;

        public MsgHandler(TextRecognitionActivity mainActivity) {
            this.mMainActivityWeakReference = new WeakReference<>(mainActivity);
        }

        @Override
        public void handleMessage(Message msg) {
            super.handleMessage(msg);
            TextRecognitionActivity mainActivity = this.mMainActivityWeakReference.get();
            if (mainActivity == null) {

                return;
            }

            //Log.d(TextRecognitionActivity.TAG, "msg what :" + msg.what);
            //Log.e("TAG", "msg what :" + msg.getTarget().getMessageName(msg));
            if (msg.what == Constant.SHOW_TAKE_PHOTO_BUTTON) {
                mainActivity.setVisible();

            } else if (msg.what == Constant.HIDE_TAKE_PHOTO_BUTTON) {
                mainActivity.setGone();

            }
        }
    }

    private void setVisible() {
        if (this.takePicture.getVisibility() == View.GONE) {
            this.takePicture.setVisibility(View.VISIBLE);
        }
    }

    private void setGone() {
        if (this.takePicture.getVisibility() == View.VISIBLE) {
            this.takePicture.setVisibility(View.GONE);
        }
    }

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        this.setContentView(R.layout.activity_text_recognition);
        if (savedInstanceState != null) {
            this.facing = savedInstanceState.getInt(Constant.CAMERA_FACING);
        }
        this.tv_language = this.findViewById(R.id.tv_lang);
        this.tv_translated_txt = this.findViewById(R.id.tv_translated_txt);
        this.preview = this.findViewById(R.id.live_preview);
        this.graphicOverlay = this.findViewById(R.id.live_overlay);
        this.cameraConfiguration = new CameraConfiguration();
        this.cameraConfiguration.setCameraFacing(this.facing);
        this.initViews();
        this.isLandScape = (this.getResources().getConfiguration().orientation == Configuration.ORIENTATION_LANDSCAPE);
        this.createLensEngine();
        this.setStatusBar();
    }

    private void initViews() {
        this.takePicture = this.findViewById(R.id.takePicture);
        this.takePicture.setOnClickListener(this);
        this.imageSwitch = this.findViewById(R.id.text_imageSwitch);
        this.imageSwitch.setOnClickListener(this);
        this.zoomImageLayout = this.findViewById(R.id.zoomImageLayout);
        this.zoomImageView = this.findViewById(R.id.take_picture_overlay);
        this.zoomImageClose = this.findViewById(R.id.zoomImageClose);
        this.zoomImageClose.setOnClickListener(this);
        this.findViewById(R.id.back).setOnClickListener(this);
        this.findViewById(R.id.language_setting).setOnClickListener(this);
        this.createLanguageDialog();
        this.createAddPictureDialog();
    }

    @Override
    public void onClick(View view) {
        if (view.getId() == R.id.takePicture) {
            this.takePicture();
        } else if (view.getId() == R.id.zoomImageClose) {
            this.zoomImageLayout.setVisibility(View.GONE);
            this.recycleBitmap();
        } else if (view.getId() == R.id.text_imageSwitch) {
            this.showAddPictureDialog();
        } else if (view.getId() == R.id.language_setting) {
            this.showLanguageDialog();
        } else if (view.getId() == R.id.simple_cn) {
            SharedPreferencesUtil.getInstance(this)
                    .putStringValue(Constant.POSITION_KEY, Constant.POSITION_CN);
            this.languageDialog.dismiss();
            this.restartLensEngine(Constant.POSITION_CN);
        } else if (view.getId() == R.id.english) {
            SharedPreferencesUtil.getInstance(this)
                    .putStringValue(Constant.POSITION_KEY, Constant.POSITION_EN);
            this.languageDialog.dismiss();
            this.preview.release();
            this.restartLensEngine(Constant.POSITION_EN);
        } else if (view.getId() == R.id.japanese) {
            SharedPreferencesUtil.getInstance(this)
                    .putStringValue(Constant.POSITION_KEY, Constant.POSITION_JA);
            this.languageDialog.dismiss();
            this.preview.release();
            this.restartLensEngine(Constant.POSITION_JA);
        } else if (view.getId() == R.id.korean) {
            SharedPreferencesUtil.getInstance(this)
                    .putStringValue(Constant.POSITION_KEY, Constant.POSITION_KO);
            this.languageDialog.dismiss();
            this.preview.release();
            this.restartLensEngine(Constant.POSITION_KO);
        } else if (view.getId() == R.id.latin) {
            SharedPreferencesUtil.getInstance(this)
                    .putStringValue(Constant.POSITION_KEY, Constant.POSITION_LA);
            this.languageDialog.dismiss();
            this.preview.release();
            this.restartLensEngine(Constant.POSITION_LA);
        } else if (view.getId() == R.id.back) {
            releaseLensEngine();
            this.finish();
        }
    }

    private void restartLensEngine(String type) {
        // Nothing to do if the selected language is already active.
        if (this.textType.equals(type)) {
            return;
        }
        // Release the current engine, if any, before creating a new one.
        if (this.lensEngine != null) {
            this.lensEngine.release();
            this.lensEngine = null;
        }
        this.createLensEngine();
        this.startLensEngine();
        if (this.lensEngine == null || this.lensEngine.getCamera() == null) {
            return;
        }
        this.mCamera = this.lensEngine.getCamera();
        try {
            this.mCamera.setPreviewDisplay(this.preview.getSurfaceHolder());
        } catch (IOException e) {
            Log.e(TextRecognitionActivity.TAG, "restartLensEngine: failed to set preview display", e);
        }
    }

    @Override
    public void onBackPressed() {
        if (this.zoomImageLayout.getVisibility() == View.VISIBLE) {
            this.zoomImageLayout.setVisibility(View.GONE);
            this.recycleBitmap();
        } else {
            super.onBackPressed();
            releaseLensEngine();
        }
    }

    private void createLanguageDialog() {
        this.languageDialog = new Dialog(this, R.style.MyDialogStyle);
        View view = View.inflate(this, R.layout.dialog_language_setting, null);
        // Set up a custom layout
        this.languageDialog.setContentView(view);
        this.textCN = view.findViewById(R.id.simple_cn);
        this.textCN.setOnClickListener(this);
        this.textEN = view.findViewById(R.id.english);
        this.textEN.setOnClickListener(this);
        this.textJN = view.findViewById(R.id.japanese);
        this.textJN.setOnClickListener(this);
        this.textKN = view.findViewById(R.id.korean);
        this.textKN.setOnClickListener(this);
        this.textLN = view.findViewById(R.id.latin);
        this.textLN.setOnClickListener(this);
        this.languageDialog.setCanceledOnTouchOutside(true);
        // Set the size of the dialog
        Window dialogWindow = this.languageDialog.getWindow();
        WindowManager.LayoutParams layoutParams = dialogWindow.getAttributes();
        layoutParams.width = WindowManager.LayoutParams.MATCH_PARENT;
        layoutParams.height = WindowManager.LayoutParams.WRAP_CONTENT;
        layoutParams.gravity = Gravity.BOTTOM;
        dialogWindow.setAttributes(layoutParams);
    }

    private void showLanguageDialog() {
        this.initDialogViews();
        this.languageDialog.show();
    }

    private void createAddPictureDialog() {
        this.addPictureDialog = new AddPictureDialog(this, AddPictureDialog.TYPE_NORMAL);
        final Intent intent = new Intent(TextRecognitionActivity.this, RemoteDetectionActivity.class);
        intent.putExtra(Constant.MODEL_TYPE, Constant.CLOUD_TEXT_DETECTION);
        this.addPictureDialog.setClickListener(new AddPictureDialog.ClickListener() {
            @Override
            public void takePicture() {
                lensEngine.release();
                isInitialization = false;
                intent.putExtra(Constant.ADD_PICTURE_TYPE, Constant.TYPE_TAKE_PHOTO);
                TextRecognitionActivity.this.startActivity(intent);
            }

            @Override
            public void selectImage() {
                intent.putExtra(Constant.ADD_PICTURE_TYPE, Constant.TYPE_SELECT_IMAGE);
                TextRecognitionActivity.this.startActivity(intent);
            }

            @Override
            public void doExtend() {

            }
        });
    }

    private void showAddPictureDialog() {
        this.addPictureDialog.show();
    }

    private void initDialogViews() {
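        String position = SharedPreferencesUtil.getInstance(this).getStringValue(Constant.POSITION_KEY);
        if (position == null) {
            // No language stored yet: fall back to the default so the switch below cannot throw.
            position = Constant.POSITION_CN;
        }
        this.textType = position;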
        this.textCN.setSelected(false);
        this.textEN.setSelected(false);
        this.textJN.setSelected(false);
        this.textLN.setSelected(false);
        this.textKN.setSelected(false);
        switch (position) {
            case Constant.POSITION_CN:
                this.textCN.setSelected(true);
                break;
            case Constant.POSITION_EN:
                this.textEN.setSelected(true);
                break;
            case Constant.POSITION_LA:
                this.textLN.setSelected(true);
                break;
            case Constant.POSITION_JA:
                this.textJN.setSelected(true);
                break;
            case Constant.POSITION_KO:
                this.textKN.setSelected(true);
                break;
            default:
        }
    }

    @Override
    protected void onSaveInstanceState(Bundle outState) {
        outState.putInt(Constant.CAMERA_FACING, this.facing);
        super.onSaveInstanceState(outState);
    }


    private void createLensEngine() {
        // Build the on-device text analyzer that takePicture() uses for the captured frame.
        MLLocalTextSetting setting = new MLLocalTextSetting.Factory()
                .setOCRMode(MLLocalTextSetting.OCR_DETECT_MODE)
                // Language to recognize. Hard-coded to Korean ("ko") here,
                // regardless of the selection made in the language dialog.
                .setLanguage("ko")
                .create();
        analyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer(setting);

        if (this.lensEngine == null) {
            this.lensEngine = new LensEngine(this, this.cameraConfiguration, this.graphicOverlay);

        }
        try {
            this.localTextTransactor = new LocalTextTransactor(this.mHandler, this);
            this.lensEngine.setMachineLearningFrameTransactor(this.localTextTransactor);
           // this.lensEngine.setMachineLearningFrameTransactor((ImageTransactor) new ObjectAnalyzerTransactor());
            isInitialization = true;
        } catch (Exception e) {
            Toast.makeText(
                    this,
                    "Can not create image transactor: " + e.getMessage(),
                    Toast.LENGTH_LONG)
                    .show();
        }
    }

    private void startLensEngine() {
        if (this.lensEngine != null) {
            try {
                this.preview.start(this.lensEngine, false);
            } catch (IOException e) {
                Log.e(TextRecognitionActivity.TAG, "Unable to start lensEngine.", e);
                this.lensEngine.release();
                this.lensEngine = null;
            }
        }
    }

    @Override
    public void onResume() {
        super.onResume();
        if (!isInitialization){
           createLensEngine();
        }
        this.startLensEngine();
    }

    @Override
    protected void onStop() {
        super.onStop();
        this.preview.stop();
    }

    private void releaseLensEngine() {
        if (this.lensEngine != null) {
            this.lensEngine.release();
            this.lensEngine = null;
        }
        recycleBitmap();
    }

    @Override
    protected void onDestroy() {
        super.onDestroy();
        releaseLensEngine();
        if (analyzer != null) {
            try {
                analyzer.stop();
            } catch (IOException e) {
                // Log the cause so a failed release can be diagnosed.
                Log.e(TAG, "Error while releasing analyzer", e);
            }
        }
    }

    private void recycleBitmap() {
        if (this.bitmap != null && !this.bitmap.isRecycled()) {
            this.bitmap.recycle();
            this.bitmap = null;
        }
        if (this.bitmapCopy != null && !this.bitmapCopy.isRecycled()) {
            this.bitmapCopy.recycle();
            this.bitmapCopy = null;
        }
    }

    private void takePicture() {

        this.zoomImageLayout.setVisibility(View.VISIBLE);
        LocalDataProcessor localDataProcessor = new LocalDataProcessor();
        localDataProcessor.setLandScape(this.isLandScape);
        this.bitmap = BitmapUtils.getBitmap(this.localTextTransactor.getTransactingImage(), this.localTextTransactor.getTransactingMetaData());

        float previewWidth = localDataProcessor.getMaxWidthOfImage(this.localTextTransactor.getTransactingMetaData());
        float previewHeight = localDataProcessor.getMaxHeightOfImage(this.localTextTransactor.getTransactingMetaData());
        if (this.isLandScape) {
            previewWidth = localDataProcessor.getMaxHeightOfImage(this.localTextTransactor.getTransactingMetaData());
            previewHeight = localDataProcessor.getMaxWidthOfImage(this.localTextTransactor.getTransactingMetaData());
        }
        this.bitmapCopy = Bitmap.createBitmap(this.bitmap).copy(Bitmap.Config.ARGB_8888, true);

        Canvas canvas = new Canvas(this.bitmapCopy);
        float min = Math.min(previewWidth, previewHeight);
        float max = Math.max(previewWidth, previewHeight);

        if (this.getResources().getConfiguration().orientation == Configuration.ORIENTATION_PORTRAIT) {
            localDataProcessor.setCameraInfo(this.graphicOverlay, canvas, min, max);
        } else {
            localDataProcessor.setCameraInfo(this.graphicOverlay, canvas, max, min);
        }
        localDataProcessor.drawHmsMLVisionText(canvas, this.localTextTransactor.getLastResults().getBlocks());
        this.zoomImageView.setImageBitmap(this.bitmapCopy);
        // Create an MLFrame object using the bitmap, which is the image data in bitmap format.
        MLFrame frame = MLFrame.fromBitmap(bitmap);
        Task<MLText> task = analyzer.asyncAnalyseFrame(frame);
        task.addOnSuccessListener(new OnSuccessListener<MLText>() {
            @Override
            public void onSuccess(MLText text) {
                // Recognition succeeded: the full recognized text is available here.
                String detectText = text.getStringValue();
            }
        }).addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(Exception e) {
                // Recognition failed: log the cause with the class tag.
                Log.e(TAG, "Text recognition failed", e);
            }
        });
    }

}
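
In createLensEngine above, setLanguage receives a fixed "ko", while the language dialog offers five choices. One way to connect the two is a small lookup from the stored selection to a language code. The sketch below is only an illustration: the OcrLanguages class, its POSITION_* string values, and the choice of "en" for the Latin option are assumptions, because the article does not show the Constant class or how LocalTextTransactor picks its language. Please check the ML Kit documentation for the codes the on-device model actually accepts.

```java
/** Hypothetical helper: maps the stored language selection to a language code. */
final class OcrLanguages {
    // Assumed placeholder values; the real ones live in the sample's Constant class.
    static final String POSITION_CN = "Position_CN";
    static final String POSITION_EN = "Position_EN";
    static final String POSITION_JA = "Position_JA";
    static final String POSITION_KO = "Position_KO";
    static final String POSITION_LA = "Position_LA";

    // The activity starts with POSITION_CN, so Chinese is used as the fallback.
    static final String DEFAULT_CODE = "zh";

    private OcrLanguages() {
    }

    /** Returns the language code for a selection, or the default for null/unknown input. */
    static String codeFor(String position) {
        if (position == null) {
            // switch on a null String would throw, so handle it first.
            return DEFAULT_CODE;
        }
        switch (position) {
            case POSITION_CN:
                return "zh";
            case POSITION_EN:
                return "en";
            case POSITION_JA:
                return "ja";
            case POSITION_KO:
                return "ko";
            case POSITION_LA:
                // Assumption: treat the Latin option like English.
                return "en";
            default:
                return DEFAULT_CODE;
        }
    }
}
```

With a helper like this, the setting could be built with .setLanguage(OcrLanguages.codeFor(position)) instead of a literal, assuming position is the value read from SharedPreferencesUtil.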

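
MsgHandler keeps the activity in a WeakReference and returns early when get() yields null, so a message that arrives late does not touch an activity that is already gone. The same idea can be tried outside Android with plain Java. In the sketch below, Screen, WeakDispatcher, the constant values, and simulateScreenCollected are hypothetical names made up for illustration; they are not part of the sample app or of any SDK.

```java
import java.lang.ref.WeakReference;

/** Hypothetical stand-in for the activity: it only tracks the button visibility. */
final class Screen {
    boolean takePhotoVisible = true;
}

/** Plain-Java analogue of MsgHandler: holds the screen weakly and skips it once it is gone. */
final class WeakDispatcher {
    // Assumed values; the sample keeps its own codes in the Constant class.
    static final int SHOW_TAKE_PHOTO_BUTTON = 1;
    static final int HIDE_TAKE_PHOTO_BUTTON = 2;

    private final WeakReference<Screen> screenRef;

    WeakDispatcher(Screen screen) {
        this.screenRef = new WeakReference<>(screen);
    }

    /** Returns true if the message was applied, false if the screen is gone or the code is unknown. */
    boolean dispatch(int what) {
        Screen screen = screenRef.get();
        if (screen == null) {
            return false;
        }
        if (what == SHOW_TAKE_PHOTO_BUTTON) {
            screen.takePhotoVisible = true;
            return true;
        }
        if (what == HIDE_TAKE_PHOTO_BUTTON) {
            screen.takePhotoVisible = false;
            return true;
        }
        return false;
    }

    /** Clears the reference by hand; in real code the garbage collector does this. */
    void simulateScreenCollected() {
        screenRef.clear();
    }
}
```

Clearing the reference by hand stands in for garbage collection here, which makes the early-return path easy to exercise without waiting for the collector.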

Result

[Image: result]

Tricks and Tips

Make sure that the agconnect-services.json file is added.
Make sure the required dependencies are added.
Make sure that the service is enabled in AGC.
Enable data binding in the build.gradle file.
Make sure the bottom navigation IDs are the same as the fragment IDs in the navigation graph.
Make sure that you set the API key before calling the service.
Make sure that you add the module-text module from the link below.
In the module-text Gradle file, change the module type from application to library.

Conclusion

In this article, we have learnt how to integrate the Huawei ML Kit camera stream to extract text from the on-device camera stream in the Android application KnowMyBoard. You can check the outcome in the Result section. You can also go through the previous article, part 4, here. I hope the Huawei ML Kit capabilities are helpful to you as well; like this sample, you can adapt them to your own requirements.

Thank you so much for reading. I hope this article helps you to understand the integration of Huawei ML Kit in the Android application KnowMyBoard.

Reference

Huawei ML Kit: Training video

ML Text Recognition

Module-text
