tests: add tesseract training data "tessdata", still trying to enhance OCR reliability in UI tests
author Edouard Tisserant <edouard.tisserant@gmail.com>
date Thu, 15 Dec 2022 14:45:52 +0100
branch wxPython4
changeset 3696 ea30051326e9
parent 3695 a89ebe406e35
child 3697 12b6add87876
tests/ide_tests/sikuliberemiz.py
tests/tools/Docker/Dockerfile
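
Note: this change reads TESSDATAPATH with os.environ["TESSDATAPATH"], so running the tests outside the Docker image without that variable set fails with a bare KeyError when sikuliberemiz.py is imported. A minimal sketch of a more explicit lookup follows. It assumes a hypothetical helper, resolve_tessdata_path, which is not part of this changeset. It only checks that the variable is set and names a directory, not that the directory holds valid training data.

```python
import os


def resolve_tessdata_path(environ=os.environ, variable="TESSDATAPATH"):
    """Return the tessdata directory named by `variable`.

    Hypothetical helper for illustration; not part of sikuliberemiz.py.
    Raises RuntimeError with a readable message instead of a bare KeyError.
    """
    path = environ.get(variable)
    if not path:
        raise RuntimeError(
            "%s is not set; it should point at a directory containing "
            "tesseract *.traineddata files" % variable)
    if not os.path.isdir(path):
        raise RuntimeError("%s=%r is not a directory" % (variable, path))
    return path
```

With such a helper, the module-level assignment could become tessdata_path = resolve_tessdata_path(), keeping the value passed to sikuli.OCR.Options().dataPath() unchanged.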
--- a/tests/ide_tests/sikuliberemiz.py	Mon Dec 05 15:53:25 2022 +0100
+++ b/tests/ide_tests/sikuliberemiz.py	Thu Dec 15 14:45:52 2022 +0100
@@ -12,9 +12,9 @@
 
 beremiz_path = os.environ["BEREMIZPATH"]
 python_bin = os.environ.get("BEREMIZPYTHONPATH", "/usr/bin/python")
-
 opj = os.path.join
 
+tessdata_path = os.environ["TESSDATAPATH"]
 
 class KBDShortcut:
     """Send shortut to app by calling corresponding methods.
@@ -199,6 +199,7 @@
             Returns:
                 Sikuli App class instance
         """
+        sikuli.OCR.Options().dataPath(tessdata_path)
         sikuli.OCR.Options().oem(0)
 
         self.screenshotnum = 0
--- a/tests/tools/Docker/Dockerfile	Mon Dec 05 15:53:25 2022 +0100
+++ b/tests/tools/Docker/Dockerfile	Thu Dec 15 14:45:52 2022 +0100
@@ -69,7 +69,17 @@
         twisted nevow autobahn click opcua \
         wxPython==4.1.1
 
-# Point to python binary test scripts will use
+RUN set -xe && \
+    cd  /home/$UNAME && mkdir tessdata && \
+    wget -q https://github.com/tesseract-ocr/tessdata/archive/refs/tags/4.1.0.tar.gz \
+         -O tessdata.tar.gz && \
+    echo 89e25c7c40a59be7195422a01f57fcb2 tessdata.tar.gz | md5sum -c && \
+    tar --strip-components=1 -C tessdata -x -v -z -f tessdata.tar.gz && \
+    rm tessdata.tar.gz
+
+ENV TESSDATAPATH /home/$UNAME/tessdata
+
+# Points to python binary that test will use
 ENV BEREMIZPYTHONPATH /home/$UNAME/beremizenv/bin/python
 
 # easy to remember 'do_tests' alias to invoke main makefile