Mokuro: PyTorch with AMD ROCm GPU acceleration
Mokuro is a fantastic OCR tool for reading manga in Japanese. It is built on PyTorch, which can be GPU-accelerated not only on Nvidia cards but also on AMD ones, through the ROCm libraries. Let's look at installation and performance.

UPDATE: This no longer works for me with kernel 6.12.9-arch1-1. It worked with 6.9.3.arch1-1, and it also works with 6.6.72-1-lts.
Installing Mokuro
Before starting, we can optionally create and activate a virtualenv:
j@arai ~/Pictures/_manga % python -m venv .venv [0]
j@arai ~/Pictures/_manga % source .venv/bin/activate [0]
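Before installing anything, it can be worth confirming that the virtualenv is really active. The following is a minimal sketch using only the standard library; the helper name in_virtualenv is hypothetical and not part of mokuro:

```python
import sys


def in_virtualenv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """Return True when the interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the venv directory while
    sys.base_prefix still points at the system installation.
    """
    return prefix != base_prefix


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

If this prints False, the packages below would end up in the system or user site-packages instead of .venv.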
We install mokuro:
(.venv) j@arai ~/Pictures/_manga % pip3 install mokuro [0]
Collecting mokuro
Downloading mokuro-0.1.8-py3-none-any.whl.metadata (2.5 kB)
Collecting fire (from mokuro)
Downloading fire-0.6.0.tar.gz (88 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 88.4/88.4 kB 6.2 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting loguru (from mokuro)
Downloading loguru-0.7.2-py3-none-any.whl.metadata (23 kB)
Collecting manga-ocr>=0.1.7 (from mokuro)
Downloading manga_ocr-0.1.12-py3-none-any.whl.metadata (20 kB)
Collecting natsort (from mokuro)
Downloading natsort-8.4.0-py3-none-any.whl.metadata (21 kB)
Collecting numpy (from mokuro)
Downloading numpy-2.0.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.9/60.9 kB 13.2 MB/s eta 0:00:00
Collecting opencv-python>=4.1.2 (from mokuro)
Downloading opencv_python-4.10.0.84-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (20 kB)
Collecting Pillow>=7.1.2 (from mokuro)
Downloading pillow-10.3.0-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (9.2 kB)
Collecting pyclipper (from mokuro)
Downloading pyclipper-1.3.0.post5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.0 kB)
Collecting requests (from mokuro)
Downloading requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting scipy (from mokuro)
Downloading scipy-1.14.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.8/60.8 kB 14.4 MB/s eta 0:00:00
Collecting shapely (from mokuro)
Downloading shapely-2.0.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.0 kB)
Collecting torch>=1.7.0 (from mokuro)
Downloading torch-2.3.1-cp312-cp312-manylinux1_x86_64.whl.metadata (26 kB)
Collecting torchsummary (from mokuro)
Downloading torchsummary-1.5.1-py3-none-any.whl.metadata (296 bytes)
Collecting torchvision>=0.8.1 (from mokuro)
Downloading torchvision-0.18.1-cp312-cp312-manylinux1_x86_64.whl.metadata (6.6 kB)
Collecting transformers>=4.25.0 (from mokuro)
Downloading transformers-4.42.3-py3-none-any.whl.metadata (43 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43.6/43.6 kB 10.9 MB/s eta 0:00:00
Collecting tqdm>=4.41.0 (from mokuro)
Downloading tqdm-4.66.4-py3-none-any.whl.metadata (57 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.6/57.6 kB 14.6 MB/s eta 0:00:00
Collecting yattag (from mokuro)
Downloading yattag-1.15.2.tar.gz (28 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting fugashi (from manga-ocr>=0.1.7->mokuro)
Downloading fugashi-1.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.0 kB)
Collecting jaconv (from manga-ocr>=0.1.7->mokuro)
Downloading jaconv-0.3.4.tar.gz (16 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting numpy (from mokuro)
Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.0/61.0 kB 8.9 MB/s eta 0:00:00
Collecting pyperclip (from manga-ocr>=0.1.7->mokuro)
Downloading pyperclip-1.9.0.tar.gz (20 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting unidic-lite (from manga-ocr>=0.1.7->mokuro)
Downloading unidic-lite-1.0.8.tar.gz (47.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.4/47.4 MB 55.4 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting filelock (from torch>=1.7.0->mokuro)
Downloading filelock-3.15.4-py3-none-any.whl.metadata (2.9 kB)
Collecting typing-extensions>=4.8.0 (from torch>=1.7.0->mokuro)
Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting sympy (from torch>=1.7.0->mokuro)
Downloading sympy-1.12.1-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch>=1.7.0->mokuro)
Downloading networkx-3.3-py3-none-any.whl.metadata (5.1 kB)
Collecting jinja2 (from torch>=1.7.0->mokuro)
Downloading jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting fsspec (from torch>=1.7.0->mokuro)
Downloading fsspec-2024.6.1-py3-none-any.whl.metadata (11 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.7.0->mokuro)
Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.7.0->mokuro)
Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.7.0->mokuro)
Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch>=1.7.0->mokuro)
Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch>=1.7.0->mokuro)
Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch>=1.7.0->mokuro)
Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.2.106 (from torch>=1.7.0->mokuro)
Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch>=1.7.0->mokuro)
Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch>=1.7.0->mokuro)
Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-nccl-cu12==2.20.5 (from torch>=1.7.0->mokuro)
Downloading nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl.metadata (1.8 kB)
Collecting nvidia-nvtx-cu12==12.1.105 (from torch>=1.7.0->mokuro)
Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.7 kB)
Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.7.0->mokuro)
Downloading nvidia_nvjitlink_cu12-12.5.40-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting huggingface-hub<1.0,>=0.23.2 (from transformers>=4.25.0->mokuro)
Downloading huggingface_hub-0.23.4-py3-none-any.whl.metadata (12 kB)
Collecting packaging>=20.0 (from transformers>=4.25.0->mokuro)
Downloading packaging-24.1-py3-none-any.whl.metadata (3.2 kB)
Collecting pyyaml>=5.1 (from transformers>=4.25.0->mokuro)
Downloading PyYAML-6.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting regex!=2019.12.17 (from transformers>=4.25.0->mokuro)
Downloading regex-2024.5.15-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.9/40.9 kB 5.4 MB/s eta 0:00:00
Collecting safetensors>=0.4.1 (from transformers>=4.25.0->mokuro)
Downloading safetensors-0.4.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB)
Collecting tokenizers<0.20,>=0.19 (from transformers>=4.25.0->mokuro)
Downloading tokenizers-0.19.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Collecting six (from fire->mokuro)
Downloading six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Collecting termcolor (from fire->mokuro)
Downloading termcolor-2.4.0-py3-none-any.whl.metadata (6.1 kB)
Collecting charset-normalizer<4,>=2 (from requests->mokuro)
Downloading charset_normalizer-3.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (33 kB)
Collecting idna<4,>=2.5 (from requests->mokuro)
Downloading idna-3.7-py3-none-any.whl.metadata (9.9 kB)
Collecting urllib3<3,>=1.21.1 (from requests->mokuro)
Downloading urllib3-2.2.2-py3-none-any.whl.metadata (6.4 kB)
Collecting certifi>=2017.4.17 (from requests->mokuro)
Downloading certifi-2024.6.2-py3-none-any.whl.metadata (2.2 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch>=1.7.0->mokuro)
Downloading MarkupSafe-2.1.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
Collecting mpmath<1.4.0,>=1.1.0 (from sympy->torch>=1.7.0->mokuro)
Downloading mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Downloading mokuro-0.1.8-py3-none-any.whl (129 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 129.8/129.8 kB 36.8 MB/s eta 0:00:00
Downloading manga_ocr-0.1.12-py3-none-any.whl (69 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 69.4/69.4 kB 23.1 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.0/18.0 MB 62.5 MB/s eta 0:00:00
Downloading opencv_python-4.10.0.84-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (62.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.5/62.5 MB 49.8 MB/s eta 0:00:00
Downloading pillow-10.3.0-cp312-cp312-manylinux_2_28_x86_64.whl (4.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 66.8 MB/s eta 0:00:00
Downloading torch-2.3.1-cp312-cp312-manylinux1_x86_64.whl (779.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 779.1/779.1 MB 12.7 MB/s eta 0:00:00
Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 20.7 MB/s eta 0:00:00
Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 64.4 MB/s eta 0:00:00
Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 60.4 MB/s eta 0:00:00
Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 62.5 MB/s eta 0:00:00
Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 13.2 MB/s eta 0:00:00
Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 39.9 MB/s eta 0:00:00
Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 52.1 MB/s eta 0:00:00
Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 40.1 MB/s eta 0:00:00
Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 31.3 MB/s eta 0:00:00
Downloading nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 176.2/176.2 MB 32.6 MB/s eta 0:00:00
Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 28.3 MB/s eta 0:00:00
Downloading torchvision-0.18.1-cp312-cp312-manylinux1_x86_64.whl (7.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.0/7.0 MB 66.4 MB/s eta 0:00:00
Downloading tqdm-4.66.4-py3-none-any.whl (78 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.3/78.3 kB 19.5 MB/s eta 0:00:00
Downloading transformers-4.42.3-py3-none-any.whl (9.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.3/9.3 MB 66.1 MB/s eta 0:00:00
Downloading loguru-0.7.2-py3-none-any.whl (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.5/62.5 kB 15.8 MB/s eta 0:00:00
Downloading natsort-8.4.0-py3-none-any.whl (38 kB)
Downloading pyclipper-1.3.0.post5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (966 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 966.6/966.6 kB 58.8 MB/s eta 0:00:00
Downloading requests-2.32.3-py3-none-any.whl (64 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.9/64.9 kB 19.9 MB/s eta 0:00:00
Downloading scipy-1.14.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (40.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.8/40.8 MB 57.8 MB/s eta 0:00:00
Downloading shapely-2.0.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 63.5 MB/s eta 0:00:00
Downloading torchsummary-1.5.1-py3-none-any.whl (2.8 kB)
Downloading certifi-2024.6.2-py3-none-any.whl (164 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 164.4/164.4 kB 47.0 MB/s eta 0:00:00
Downloading charset_normalizer-3.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (141 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 141.9/141.9 kB 37.6 MB/s eta 0:00:00
Downloading huggingface_hub-0.23.4-py3-none-any.whl (402 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 402.6/402.6 kB 55.2 MB/s eta 0:00:00
Downloading fsspec-2024.6.1-py3-none-any.whl (177 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 177.6/177.6 kB 45.4 MB/s eta 0:00:00
Downloading idna-3.7-py3-none-any.whl (66 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.8/66.8 kB 19.9 MB/s eta 0:00:00
Downloading packaging-24.1-py3-none-any.whl (53 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.0/54.0 kB 17.6 MB/s eta 0:00:00
Downloading PyYAML-6.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (724 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 725.0/725.0 kB 59.4 MB/s eta 0:00:00
Downloading regex-2024.5.15-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (788 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 788.8/788.8 kB 60.3 MB/s eta 0:00:00
Downloading safetensors-0.4.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 59.5 MB/s eta 0:00:00
Downloading tokenizers-0.19.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.6/3.6 MB 65.0 MB/s eta 0:00:00
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading urllib3-2.2.2-py3-none-any.whl (121 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.4/121.4 kB 38.7 MB/s eta 0:00:00
Downloading filelock-3.15.4-py3-none-any.whl (16 kB)
Downloading fugashi-1.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (622 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 622.1/622.1 kB 52.7 MB/s eta 0:00:00
Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.3/133.3 kB 35.8 MB/s eta 0:00:00
Downloading networkx-3.3-py3-none-any.whl (1.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 63.5 MB/s eta 0:00:00
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Downloading sympy-1.12.1-py3-none-any.whl (5.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 66.9 MB/s eta 0:00:00
Downloading termcolor-2.4.0-py3-none-any.whl (7.7 kB)
Downloading MarkupSafe-2.1.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (28 kB)
Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 57.6 MB/s eta 0:00:00
Downloading nvidia_nvjitlink_cu12-12.5.40-py3-none-manylinux2014_x86_64.whl (21.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.3/21.3 MB 62.1 MB/s eta 0:00:00
Building wheels for collected packages: fire, yattag, jaconv, pyperclip, unidic-lite
Building wheel for fire (pyproject.toml) ... done
Created wheel for fire: filename=fire-0.6.0-py2.py3-none-any.whl size=117032 sha256=7e965efc63cded145ad56aac976a9a48719a5be33cdc1d89ed089902e76aef9d
Stored in directory: /home/j/.cache/pip/wheels/97/4a/3f/f6f222440f03d888f026ac848b6b4ea57183ebfcd8f3cb8904
Building wheel for yattag (pyproject.toml) ... done
Created wheel for yattag: filename=yattag-1.15.2-py3-none-any.whl size=15666 sha256=9f154a02acf06c36b66c4f7024aa6a7bcaab43445fcc9db44d5372a505cf958a
Stored in directory: /home/j/.cache/pip/wheels/6a/74/92/547e659468b155688a648ab6193806c6ec52de541de6e135ef
Building wheel for jaconv (pyproject.toml) ... done
Created wheel for jaconv: filename=jaconv-0.3.4-py3-none-any.whl size=16416 sha256=fcfd0d8bfbb6638bbdea81cf069060d4756b5f806c36d0fee2dec22aab8a4b77
Stored in directory: /home/j/.cache/pip/wheels/06/50/ef/1d13b9ce39476d5f59bba936048f8c6f52f6034e54cb1b5ac4
Building wheel for pyperclip (pyproject.toml) ... done
Created wheel for pyperclip: filename=pyperclip-1.9.0-py3-none-any.whl size=11002 sha256=7314e58f51b7fdc8421f51406666ba55f1fbb992e2d28f795a5f4e9ec79d5a39
Stored in directory: /home/j/.cache/pip/wheels/e0/e8/fc/8ab8aa326e33bc066ccd5f3ca9646eab4299881af933f94f09
Building wheel for unidic-lite (pyproject.toml) ... done
Created wheel for unidic-lite: filename=unidic_lite-1.0.8-py3-none-any.whl size=47658815 sha256=af43bfc83e329c07fdc0db536c0f6c29e9a972676f3c62aba9e82b5674da157a
Stored in directory: /home/j/.cache/pip/wheels/5e/1f/0f/4d43887e5476d956fae828ee9b6687becd5544d68b51ed633d
Successfully built fire yattag jaconv pyperclip unidic-lite
Installing collected packages: yattag, unidic-lite, torchsummary, pyperclip, pyclipper, mpmath, jaconv, urllib3, typing-extensions, tqdm, termcolor, sympy, six, safetensors, regex, pyyaml, Pillow, packaging, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, natsort, MarkupSafe, loguru, idna, fugashi, fsspec, filelock, charset-normalizer, certifi, shapely, scipy, requests, opencv-python, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, fire, nvidia-cusolver-cu12, huggingface-hub, torch, tokenizers, transformers, torchvision, manga-ocr, mokuro
Successfully installed MarkupSafe-2.1.5 Pillow-10.3.0 certifi-2024.6.2 charset-normalizer-3.3.2 filelock-3.15.4 fire-0.6.0 fsspec-2024.6.1 fugashi-1.3.2 huggingface-hub-0.23.4 idna-3.7 jaconv-0.3.4 jinja2-3.1.4 loguru-0.7.2 manga-ocr-0.1.12 mokuro-0.1.8 mpmath-1.3.0 natsort-8.4.0 networkx-3.3 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.40 nvidia-nvtx-cu12-12.1.105 opencv-python-4.10.0.84 packaging-24.1 pyclipper-1.3.0.post5 pyperclip-1.9.0 pyyaml-6.0.1 regex-2024.5.15 requests-2.32.3 safetensors-0.4.3 scipy-1.14.0 shapely-2.0.4 six-1.16.0 sympy-1.12.1 termcolor-2.4.0 tokenizers-0.19.1 torch-2.3.1 torchsummary-1.5.1 torchvision-0.18.1 tqdm-4.66.4 transformers-4.42.3 typing-extensions-4.12.2 unidic-lite-1.0.8 urllib3-2.2.2 yattag-1.15.2
[notice] A new release of pip is available: 24.0 -> 24.1.1
[notice] To update, run: pip install --upgrade pip
We can upgrade pip, as suggested:
(.venv) j@arai ~/Pictures/_manga % pip install --upgrade pip [0]
Requirement already satisfied: pip in ./.venv/lib/python3.12/site-packages (24.0)
Collecting pip
Downloading pip-24.1.1-py3-none-any.whl.metadata (3.6 kB)
Downloading pip-24.1.1-py3-none-any.whl (1.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 32.0 MB/s eta 0:00:00
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 24.0
Uninstalling pip-24.0:
Successfully uninstalled pip-24.0
Successfully installed pip-24.1.1
Currently, mokuro fails with this error on Python 3.12+ (I am using 3.12.4):
(.venv) j@arai ~/Pictures/_manga % mokuro bookwalker/yamada_kun_jp/14 [0]
Traceback (most recent call last):
File "/mnt/media/Pictures/_manga/.venv/bin/mokuro", line 5, in <module>
from mokuro.__main__ import main
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/mokuro/__init__.py", line 3, in <module>
from mokuro.manga_page_ocr import MangaPageOcr, InvalidImage
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/mokuro/manga_page_ocr.py", line 7, in <module>
from comic_text_detector.inference import TextDetector
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/comic_text_detector/inference.py", line 11, in <module>
from comic_text_detector.basemodel import TextDetBase, TextDetBaseDNN
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/comic_text_detector/basemodel.py", line 9, in <module>
from comic_text_detector.models.yolov5.yolo import load_yolov5_ckpt
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/comic_text_detector/models/yolov5/yolo.py", line 4, in <module>
from comic_text_detector.utils.yolov5_utils import scale_img, check_version, check_anchor_order, initialize_weights, \
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/comic_text_detector/utils/yolov5_utils.py", line 4, in <module>
import pkg_resources as pkg
ModuleNotFoundError: No module named 'pkg_resources'
The cause is that pkg_resources, which is part of setuptools, is deprecated, and since Python 3.12 virtual environments created with venv no longer have setuptools preinstalled. We install it:
(.venv) j@arai ~/Pictures/_manga % pip install setuptools [1]
Collecting setuptools
Using cached setuptools-70.1.1-py3-none-any.whl.metadata (6.0 kB)
Using cached setuptools-70.1.1-py3-none-any.whl (883 kB)
Installing collected packages: setuptools
Successfully installed setuptools-70.1.1
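One way to detect this situation before running mokuro is to ask Python whether pkg_resources can be located at all. This is a minimal sketch with the standard library only; module_available is a hypothetical helper name:

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("pkg_resources"):
        print("pkg_resources found: setuptools is installed")
    else:
        print("pkg_resources missing: run 'pip install setuptools'")
```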
If we run mokuro now, it will use the CPU (in case the reader is curious about performance, it is a 7950X3D):
(.venv) j@arai ~/Pictures/_manga % mokuro bookwalker/yamada_kun_jp/14 [0]
Paths to process:
/mnt/media/Pictures/_manga/bookwalker/yamada_kun_jp/14
Each of the paths above will be treated as one volume. Continue? [yes/no]
yes
2024-07-01 11:08:50.906 | INFO | mokuro.run:run:48 - Processing 1/1: /mnt/media/Pictures/_manga/bookwalker/yamada_kun_jp/14
Processing pages...: 0%| | 0/194 [00:00<?, ?it/s]2024-07-01 11:08:50.910 | INFO | mokuro.manga_page_ocr:__init__:36 - Initializing text detector
2024-07-01 11:08:50.910 | INFO | mokuro.cache:_download_if_needed:22 - Downloading https://github.com/zyddnys/manga-image-translator/releases/download/beta-0.2.1/comictextdetector.pt
2024-07-01 11:08:53.600 | INFO | mokuro.cache:_download_if_needed:30 - Finished downloading https://github.com/zyddnys/manga-image-translator/releases/download/beta-0.2.1/comictextdetector.pt
2024-07-01 11:08:53.821 | INFO | manga_ocr.ocr:__init__:15 - Loading OCR model from kha-white/manga-ocr-base
preprocessor_config.json: 100%|███████████████████████████████| 228/228 [00:00<00:00, 2.45MB/s]
tokenizer_config.json: 100%|██████████████████████████████████| 486/486 [00:00<00:00, 6.96MB/s]
vocab.txt: 100%|███████████████████████████████████████████| 24.1k/24.1k [00:00<00:00, 161MB/s]
special_tokens_map.json: 100%|████████████████████████████████| 112/112 [00:00<00:00, 1.32MB/s]
config.json: 100%|████████████████████████████████████████| 77.5k/77.5k [00:00<00:00, 11.8MB/s]
pytorch_model.bin: 100%|████████████████████████████████████| 444M/444M [00:06<00:00, 70.3MB/s]
2024-07-01 11:09:03.411 | INFO | manga_ocr.ocr:__init__:31 - Using CPU
2024-07-01 11:09:03.689 | INFO | manga_ocr.ocr:__init__:38 - OCR ready
Processing pages...: 100%|███████████████████████████████████| 194/194 [07:55<00:00, 2.45s/it]
2024-07-01 11:16:46.126 | INFO | mokuro.run:run:56 - Processed successfully: 1/1
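As a sanity check on the CPU run, the progress bar reports 194 pages at 2.45 s/it. The small sketch below turns those two figures into a total time; the numbers are copied from the log and the function name is hypothetical:

```python
def format_duration(pages: int, seconds_per_page: float) -> str:
    """Estimate total run time as MM:SS from page count and per-page time."""
    total = round(pages * seconds_per_page)
    minutes, seconds = divmod(total, 60)
    return f"{minutes:02d}:{seconds:02d}"


# Figures from the CPU run: 194 pages at 2.45 s/it.
print(format_duration(194, 2.45))  # 07:55, matching the progress bar
```

The same helper can be reused to compare against the per-page time of a GPU run.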
Let's look at the installed packages:
(.venv) j@arai ~/Pictures/_manga % pip freeze [130]
certifi==2024.6.2
charset-normalizer==3.3.2
filelock==3.15.4
fire==0.6.0
fsspec==2024.6.1
fugashi==1.3.2
huggingface-hub==0.23.4
idna==3.7
jaconv==0.3.4
Jinja2==3.1.4
loguru==0.7.2
manga-ocr==0.1.12
MarkupSafe==2.1.5
mokuro==0.1.8
mpmath==1.3.0
natsort==8.4.0
networkx==3.3
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.5.40
nvidia-nvtx-cu12==12.1.105
opencv-python==4.10.0.84
packaging==24.1
pillow==10.3.0
pyclipper==1.3.0.post5
pyperclip==1.9.0
PyYAML==6.0.1
regex==2024.5.15
requests==2.32.3
safetensors==0.4.3
scipy==1.14.0
setuptools==70.1.1
shapely==2.0.4
six==1.16.0
sympy==1.12.1
termcolor==2.4.0
tokenizers==0.19.1
torch==2.3.1
torchsummary==1.5.1
torchvision==0.18.1
tqdm==4.66.4
transformers==4.42.3
typing_extensions==4.12.2
unidic-lite==1.0.8
urllib3==2.2.2
yattag==1.15.2
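The nvidia-*-cu12 entries in this list are the CUDA libraries that the pip log shows being collected as dependencies of torch. The sketch below lists them from inside the virtualenv with the standard library; cuda_packages is a hypothetical helper, and matching on the "nvidia-" prefix is an assumption based on the names seen here:

```python
from importlib.metadata import distributions


def cuda_packages(names):
    """Keep only the distribution names that start with 'nvidia-', sorted."""
    return sorted(n for n in names if n.lower().startswith("nvidia-"))


if __name__ == "__main__":
    # Collect the names of every distribution visible to this interpreter.
    installed = [d.metadata["Name"] for d in distributions() if d.metadata["Name"]]
    for name in cuda_packages(installed):
        print(name)
```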
PyTorch with ROCm
First we uninstall the «regular» non-ROCm torch and torchvision:
(.venv) j@arai ~/Pictures/_manga % pip3 uninstall torch torchvision [1]
Found existing installation: torch 2.3.1
Uninstalling torch-2.3.1:
Would remove:
/mnt/media/Pictures/_manga/.venv/bin/convert-caffe2-to-onnx
/mnt/media/Pictures/_manga/.venv/bin/convert-onnx-to-caffe2
/mnt/media/Pictures/_manga/.venv/bin/torchrun
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/functorch/*
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torch-2.3.1.dist-info/*
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torch/*
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torchgen/*
Proceed (Y/n)? Y
Successfully uninstalled torch-2.3.1
Found existing installation: torchvision 0.18.1
Uninstalling torchvision-0.18.1:
Would remove:
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torchvision-0.18.1.dist-info/*
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libcudart.7ec1eba6.so.12
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libjpeg.ceea7512.so.62
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libnvjpeg.f00ca762.so.12
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libpng16.7f72a3c5.so.16
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libz.4e87b236.so.1
/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torchvision/*
Proceed (Y/n)? Y
Successfully uninstalled torchvision-0.18.1
Then we install torch and torchvision with ROCm support. AMD's PyTorch on ROCm guide recommends using the repo.radeon.com repositories, but I had no problems with the ones from pytorch.org:
(.venv) j@arai ~/Pictures/_manga % pip3 install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.0 [0]
Looking in indexes: https://download.pytorch.org/whl/rocm6.0
Collecting torch
Downloading https://download.pytorch.org/whl/rocm6.0/torch-2.3.1%2Brocm6.0-cp312-cp312-linux_x86_64.whl (2193.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.2/2.2 GB 5.4 MB/s eta 0:00:00
Collecting torchvision
Downloading https://download.pytorch.org/whl/rocm6.0/torchvision-0.18.1%2Brocm6.0-cp312-cp312-linux_x86_64.whl (65.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.7/65.7 MB 43.1 MB/s eta 0:00:00
Requirement already satisfied: filelock in ./.venv/lib/python3.12/site-packages (from torch) (3.15.4)
Requirement already satisfied: typing-extensions>=4.8.0 in ./.venv/lib/python3.12/site-packages (from torch) (4.12.2)
Requirement already satisfied: sympy in ./.venv/lib/python3.12/site-packages (from torch) (1.12.1)
Requirement already satisfied: networkx in ./.venv/lib/python3.12/site-packages (from torch) (3.3)
Requirement already satisfied: jinja2 in ./.venv/lib/python3.12/site-packages (from torch) (3.1.4)
Requirement already satisfied: fsspec in ./.venv/lib/python3.12/site-packages (from torch) (2024.6.1)
Requirement already satisfied: numpy in ./.venv/lib/python3.12/site-packages (from torchvision) (1.26.4)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in ./.venv/lib/python3.12/site-packages (from torchvision) (10.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in ./.venv/lib/python3.12/site-packages (from jinja2->torch) (2.1.5)
Requirement already satisfied: mpmath<1.4.0,>=1.1.0 in ./.venv/lib/python3.12/site-packages (from sympy->torch) (1.3.0)
Installing collected packages: torch, torchvision
Successfully installed torch-2.3.1+rocm6.0 torchvision-0.18.1+rocm6.0
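To confirm which build ended up installed, one option is to look at the local version tag of the wheel (+rocm6.0 in the log above) and, if torch is importable, at torch.version.hip. The following is a minimal sketch: backend_from_version is a hypothetical helper, and the mapping from tags to backends is an assumption based on the wheel names seen here. ROCm builds expose AMD GPUs through the torch.cuda API, so torch.cuda.is_available() is the relevant check on AMD as well.

```python
def backend_from_version(version: str) -> str:
    """Guess the compute backend from a torch version string.

    Assumption: the local tag after '+' names the backend, as in
    '2.3.1+rocm6.0'; a version without a tag is the default PyPI wheel.
    """
    _, _, local = version.partition("+")
    if local.startswith("rocm"):
        return "rocm"
    if local.startswith("cu"):
        return "cuda"
    if local == "cpu":
        return "cpu"
    return "default"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("torch version:", torch.__version__)
        print("backend guess:", backend_from_version(torch.__version__))
        # torch.version.hip is a version string on ROCm builds, None otherwise.
        print("HIP runtime:", torch.version.hip)
        # ROCm builds reuse the torch.cuda API for AMD GPUs.
        print("GPU available:", torch.cuda.is_available())
```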
NOTE: if we did not uninstall torch and torchvision in the first step, the regular version will stay in place instead of the ROCm one:
Requirement already satisfied: torch in ./.venv/lib/python3.12/site-packages (2.3.1)
Requirement already satisfied: torchvision in ./.venv/lib/python3.12/site-packages (0.18.1)
If we try to run mokuro now, it will fail with this HIP error:
(.venv) j@arai ~/Pictures/_manga % mokuro bookwalker/yamada_kun_jp/14 [0]
amdgpu.ids: No such file or directory
amdgpu.ids: No such file or directory
Paths to process:
/mnt/media/Pictures/_manga/bookwalker/yamada_kun_jp/14
Each of the paths above will be treated as one volume. Continue? [yes/no]
yes
2024-07-01 11:31:52.362 | INFO | mokuro.run:run:48 - Processing 1/1: /mnt/media/Pictures/_manga/bookwalker/yamada_kun_jp/14
Processing pages...: 0%| | 0/194 [00:00<?, ?it/s]2024-07-01 11:31:52.366 | INFO | mokuro.manga_page_ocr:__init__:36 - Initializing text detector
2024-07-01 11:31:52.594 | INFO | manga_ocr.ocr:__init__:15 - Loading OCR model from kha-white/manga-ocr-base
2024-07-01 11:31:54.273 | INFO | manga_ocr.ocr:__init__:25 - Using CUDA
Processing pages...: 0%| | 0/194 [00:02<?, ?it/s]
2024-07-01 11:31:54.507 | ERROR | mokuro.run:run:52 - Error while processing /mnt/media/Pictures/_manga/bookwalker/yamada_kun_jp/14
Traceback (most recent call last):
File "/mnt/media/Pictures/_manga/.venv/bin/mokuro", line 8, in <module>
sys.exit(main())
│ │ └ <function main at 0x75a320e3aa20>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/mokuro/__main__.py", line 7, in main
fire.Fire(run)
│ │ └ <function run at 0x75a320e3ac00>
│ └ <function Fire at 0x75a320efe0c0>
└ <module 'fire' from '/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/fire/__init__.py'>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
│ │ │ │ │ └ 'mokuro'
│ │ │ │ └ {}
│ │ │ └ Namespace(verbose=False, interactive=False, separator='-', completion=None, help=False, trace=False)
│ │ └ ['bookwalker/yamada_kun_jp/14']
│ └ <function run at 0x75a320e3ac00>
└ <function _Fire at 0x75a320d3b100>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
│ └ <function _CallAndUpdateTrace at 0x75a320d3b240>
└ <function run at 0x75a320e3ac00>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
│ │ └ {}
│ └ ['bookwalker/yamada_kun_jp/14']
└ <function run at 0x75a320e3ac00>
> File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/mokuro/run.py", line 50, in run
ovg.process_dir(path, as_one_file=as_one_file)
│ │ │ └ True
│ │ └ PosixPath('/mnt/media/Pictures/_manga/bookwalker/yamada_kun_jp/14')
│ └ <function OverlayGenerator.process_dir at 0x75a320ea9760>
└ <mokuro.overlay_generator.OverlayGenerator object at 0x75a320d27530>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/mokuro/overlay_generator.py", line 86, in process_dir
self.init_models()
│ └ <function OverlayGenerator.init_models at 0x75a320e3b600>
└ <mokuro.overlay_generator.OverlayGenerator object at 0x75a320d27530>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/mokuro/overlay_generator.py", line 58, in init_models
self.mpocr = MangaPageOcr(self.pretrained_model_name_or_path, self.force_cpu, disable_ocr=self.disable_ocr, **self.kwargs)
│ │ │ │ │ │ │ │ │ │ └ {}
│ │ │ │ │ │ │ │ │ └ <mokuro.overlay_generator.OverlayGenerator object at 0x75a320d27530>
│ │ │ │ │ │ │ │ └ False
│ │ │ │ │ │ │ └ <mokuro.overlay_generator.OverlayGenerator object at 0x75a320d27530>
│ │ │ │ │ │ └ False
│ │ │ │ │ └ <mokuro.overlay_generator.OverlayGenerator object at 0x75a320d27530>
│ │ │ │ └ 'kha-white/manga-ocr-base'
│ │ │ └ <mokuro.overlay_generator.OverlayGenerator object at 0x75a320d27530>
│ │ └ <class 'mokuro.manga_page_ocr.MangaPageOcr'>
│ └ None
└ <mokuro.overlay_generator.OverlayGenerator object at 0x75a320d27530>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/mokuro/manga_page_ocr.py", line 38, in __init__
self.mocr = MangaOcr(pretrained_model_name_or_path, force_cpu)
│ │ │ └ False
│ │ └ 'kha-white/manga-ocr-base'
│ └ <class 'manga_ocr.ocr.MangaOcr'>
└ <mokuro.manga_page_ocr.MangaPageOcr object at 0x75a358af0bc0>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/manga_ocr/ocr.py", line 36, in __init__
self(example_path)
│ └ PosixPath('/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/manga_ocr/assets/example.jpg')
└ <manga_ocr.ocr.MangaOcr object at 0x75a320e57650>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/manga_ocr/ocr.py", line 53, in __call__
x = self.model.generate(x[None].to(self.model.device), max_length=300)[0].cpu()
│ │ │ │ │ │ └ <property object at 0x75a320e0db20>
│ │ │ │ │ └ VisionEncoderDecoderModel(
│ │ │ │ │ (encoder): ViTModel(
│ │ │ │ │ (embeddings): ViTEmbeddings(
│ │ │ │ │ (patch_embeddings): ViTPatchEmbeddin...
│ │ │ │ └ <manga_ocr.ocr.MangaOcr object at 0x75a320e57650>
│ │ │ └ tensor([[[-0.6471, -0.9373, -0.8824, ..., 0.1059, 0.1059, 0.0980],
│ │ │ [-0.8431, -0.8745, -0.1373, ..., 0.1059, 0...
│ │ └ <function GenerationMixin.generate at 0x75a320faf7e0>
│ └ VisionEncoderDecoderModel(
│ (encoder): ViTModel(
│ (embeddings): ViTEmbeddings(
│ (patch_embeddings): ViTPatchEmbeddin...
└ <manga_ocr.ocr.MangaOcr object at 0x75a320e57650>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
│ │ └ {'max_length': 300}
│ └ <unprintable tuple object>
└ <function GenerationMixin.generate at 0x75a320faf880>
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/transformers/generation/utils.py", line 1664, in generate
self._prepare_special_tokens(generation_config, kwargs_has_attention_mask, device=device)
│ │ │ │ └ device(type='cuda', index=0)
│ │ │ └ False
│ │ └ GenerationConfig {
│ │ "decoder_start_token_id": 2,
│ │ "early_stopping": true,
│ │ "eos_token_id": 3,
│ │ "length_penalty": 2.0,
│ │ "...
│ └ <function GenerationMixin._prepare_special_tokens at 0x75a320faf740>
└ VisionEncoderDecoderModel(
(encoder): ViTModel(
(embeddings): ViTEmbeddings(
(patch_embeddings): ViTPatchEmbeddin...
File "/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/transformers/generation/utils.py", line 1513, in _prepare_special_tokens
if eos_token_id is not None and torch.isin(elements=eos_token_id, test_elements=pad_token_id).any():
│ │ │ │ └ tensor(0, device='cuda:0')
│ │ │ └ tensor([3], device='cuda:0')
│ │ └ <built-in method isin of type object at 0x75a3fd3887e0>
│ └ <module 'torch' from '/mnt/media/Pictures/_manga/.venv/lib/python3.12/site-packages/torch/__init__.py'>
└ tensor([3], device='cuda:0')
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
2024-07-01 11:31:54.516 | INFO | mokuro.run:run:56 - Processed successfully: 0/1
We can avoid this with the HSA_OVERRIDE_GFX_VERSION environment variable. In my case I have an RX 7600, which is gfx1102. I set the value to "11.0.0" so that ROCm treats the GPU as if it were that version (gfx1100), which works correctly:
(.venv) j@arai ~/Pictures/_manga % HSA_OVERRIDE_GFX_VERSION=11.0.0 mokuro bookwalker/yamada_kun_jp/14 [130]
amdgpu.ids: No such file or directory
amdgpu.ids: No such file or directory
Paths to process:
/mnt/media/Pictures/_manga/bookwalker/yamada_kun_jp/14
Each of the paths above will be treated as one volume. Continue? [yes/no]
yes
2024-07-01 11:33:01.913 | INFO | mokuro.run:run:48 - Processing 1/1: /mnt/media/Pictures/_manga/bookwalker/yamada_kun_jp/14
Processing pages...: 0%| | 0/194 [00:00<?, ?it/s]2024-07-01 11:33:01.917 | INFO | mokuro.manga_page_ocr:__init__:36 - Initializing text detector
2024-07-01 11:33:02.139 | INFO | manga_ocr.ocr:__init__:15 - Loading OCR model from kha-white/manga-ocr-base
2024-07-01 11:33:04.002 | INFO | manga_ocr.ocr:__init__:25 - Using CUDA
2024-07-01 11:33:06.723 | INFO | manga_ocr.ocr:__init__:38 - OCR ready
Processing pages...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 194/194 [03:58<00:00, 1.23s/it]
2024-07-01 11:37:00.122 | INFO | mokuro.run:run:56 - Processed successfully: 1/1
We got a 99.58% improvement over the 7950X3D CPU. Practically twice as fast!
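The HSA_OVERRIDE_GFX_VERSION override used for this run can also be set from Python instead of on the command line. The following is only a minimal sketch: the helper name `apply_gfx_override` is hypothetical (it is not part of mokuro or PyTorch), and it assumes that "11.0.0" is the right value for your card, as it was for the RX 7600 here. To be safe, the variable is set before PyTorch is imported.

```python
import os


def apply_gfx_override(version: str = "11.0.0") -> str:
    """Set HSA_OVERRIDE_GFX_VERSION unless it is already exported.

    Hypothetical helper. Call it before `import torch` so the variable
    is already in the environment when the ROCm runtime starts.
    """
    return os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", version)


if __name__ == "__main__":
    print("HSA_OVERRIDE_GFX_VERSION =", apply_gfx_override())
    try:
        import torch  # imported only after the variable is in place
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        # ROCm builds of PyTorch expose the GPU through the torch.cuda API,
        # which is why the logs above say "Using CUDA" on an AMD card.
        print("GPU available:", torch.cuda.is_available())
        if torch.cuda.is_available():
            print("Device:", torch.cuda.get_device_name(0))
```

Because `setdefault` is used, a value exported in the shell still takes precedence over the default in the script.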
UPDATE: This no longer works for me with kernel 6.12.9-arch1-1:
(.venv) j@arai ~/Pictures/0_manga/kodomo_no_jikan % HSA_OVERRIDE_GFX_VERSION=11.0.0 mokuro 02 [143]
amdgpu.ids: No such file or directory
Paths to process:
/mnt/media/Pictures/0_manga/kodomo_no_jikan/02
Each of the paths above will be treated as one volume. Continue? [yes/no]
yes
2025-01-30 19:21:48.281 | INFO | mokuro.run:run:48 - Processing 1/1: /mnt/media/Pictures/0_manga/kodomo_no_jikan/02
Processing pages...: 0%| | 0/195 [00:00<?, ?it/s]2025-01-30 19:21:48.284 | INFO | mokuro.manga_page_ocr:__init__:36 - Initializing text detector
2025-01-30 19:21:48.507 | INFO | manga_ocr.ocr:__init__:15 - Loading OCR model from kha-white/manga-ocr-base
2025-01-30 19:21:50.280 | INFO | manga_ocr.ocr:__init__:25 - Using CUDA
zsh: segmentation fault (core dumped) HSA_OVERRIDE_GFX_VERSION=11.0.0 mokuro 02
UPDATE 2: I upgraded to the latest kernel (6.15.5-arch1-1) and still got the same error. I uninstalled torch, torchvision and torchaudio:
(.venv) j@arai ~/Pictures/0_manga % pip3 remove torch torchvision torchaudio [0]
ERROR: unknown command "remove"
(.venv) j@arai ~/Pictures/0_manga __ERROR__% pip3 uninstall torch torchvision torchaudio [1]
Found existing installation: torch 2.7.1
Uninstalling torch-2.7.1:
Would remove:
/mnt/media/Pictures/0_manga/.venv/bin/torchfrtrace
/mnt/media/Pictures/0_manga/.venv/bin/torchrun
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/functorch/*
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torch-2.7.1.dist-info/*
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torch/*
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchgen/*
Proceed (Y/n)? Y
Successfully uninstalled torch-2.7.1
Found existing installation: torchvision 0.22.1
Uninstalling torchvision-0.22.1:
Would remove:
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchvision-0.22.1.dist-info/*
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libcudart.45e7f3ed.so.12
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libjpeg.cee450dc.so.8
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libnvjpeg.e5f20359.so.12
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libpng16.ca116d9f.so.16
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libsharpyuv.e09fa5b1.so.0
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libwebp.16dd7af3.so.7
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchvision.libs/libz.8053c0ed.so.1
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchvision/*
Proceed (Y/n)? Y
Successfully uninstalled torchvision-0.22.1
Found existing installation: torchaudio 2.7.1
Uninstalling torchaudio-2.7.1:
Would remove:
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchaudio-2.7.1.dist-info/*
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torchaudio/*
/mnt/media/Pictures/0_manga/.venv/lib/python3.12/site-packages/torio/*
Proceed (Y/n)? Y
Successfully uninstalled torchaudio-2.7.1
… and reinstalled them pointing at ROCm 6.3:
(.venv) j@arai ~/Pictures/0_manga % pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.3 [0]
Looking in indexes: https://download.pytorch.org/whl/rocm6.3
Collecting torch
Downloading https://download.pytorch.org/whl/rocm6.3/torch-2.7.1%2Brocm6.3-cp312-cp312-manylinux_2_28_x86_64.whl (4543.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 GB 2.5 MB/s eta 0:00:00
Collecting torchvision
Downloading https://download.pytorch.org/whl/rocm6.3/torchvision-0.22.1%2Brocm6.3-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (6.1 kB)
Collecting torchaudio
Downloading https://download.pytorch.org/whl/rocm6.3/torchaudio-2.7.1%2Brocm6.3-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (6.6 kB)
Requirement already satisfied: filelock in ./.venv/lib/python3.12/site-packages (from torch) (3.15.4)
Requirement already satisfied: typing-extensions>=4.10.0 in ./.venv/lib/python3.12/site-packages (from torch) (4.12.2)
Requirement already satisfied: setuptools in ./.venv/lib/python3.12/site-packages (from torch) (70.1.1)
Requirement already satisfied: sympy>=1.13.3 in ./.venv/lib/python3.12/site-packages (from torch) (1.14.0)
Requirement already satisfied: networkx in ./.venv/lib/python3.12/site-packages (from torch) (3.3)
Requirement already satisfied: jinja2 in ./.venv/lib/python3.12/site-packages (from torch) (3.1.4)
Requirement already satisfied: fsspec in ./.venv/lib/python3.12/site-packages (from torch) (2024.6.1)
Collecting pytorch-triton-rocm==3.3.1 (from torch)
Downloading https://download.pytorch.org/whl/pytorch_triton_rocm-3.3.1-cp312-cp312-linux_x86_64.whl.metadata (1.5 kB)
Requirement already satisfied: numpy in ./.venv/lib/python3.12/site-packages (from torchvision) (1.26.4)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in ./.venv/lib/python3.12/site-packages (from torchvision) (10.3.0)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in ./.venv/lib/python3.12/site-packages (from sympy>=1.13.3->torch) (1.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in ./.venv/lib/python3.12/site-packages (from jinja2->torch) (2.1.5)
Downloading https://download.pytorch.org/whl/pytorch_triton_rocm-3.3.1-cp312-cp312-linux_x86_64.whl (254.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 254.2/254.2 MB 23.6 MB/s eta 0:00:00
Downloading https://download.pytorch.org/whl/rocm6.3/torchvision-0.22.1%2Brocm6.3-cp312-cp312-manylinux_2_28_x86_64.whl (3.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.1/3.1 MB 72.0 MB/s eta 0:00:00
Downloading https://download.pytorch.org/whl/rocm6.3/torchaudio-2.7.1%2Brocm6.3-cp312-cp312-manylinux_2_28_x86_64.whl (1.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 67.5 MB/s eta 0:00:00
Installing collected packages: pytorch-triton-rocm, torch, torchvision, torchaudio
Attempting uninstall: pytorch-triton-rocm
Found existing installation: pytorch-triton-rocm 3.1.0
Uninstalling pytorch-triton-rocm-3.1.0:
Successfully uninstalled pytorch-triton-rocm-3.1.0
Successfully installed pytorch-triton-rocm-3.3.1 torch-2.7.1+rocm6.3 torchaudio-2.7.1+rocm6.3 torchvision-0.22.1+rocm6.3
[notice] A new release of pip is available: 24.1.1 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
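One way to check which builds ended up in the virtualenv is to look at the local version tag: the pip output above reports torch-2.7.1+rocm6.3, whereas the uninstall log listed plain torch 2.7.1. The sketch below uses only the standard library. The function names are hypothetical, and treating a "+rocm" suffix as the mark of a ROCm wheel is an assumption based on the version strings in this log.

```python
from importlib import metadata


def is_rocm_build(version: str) -> bool:
    """Return True if the version has a ROCm local tag, e.g. '2.7.1+rocm6.3'."""
    _, _, local = version.partition("+")
    return local.startswith("rocm")


def report(packages=("torch", "torchvision", "torchaudio")) -> dict:
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in report().items():
        if version is None:
            print(f"{name}: not installed")
        else:
            kind = "ROCm build" if is_rocm_build(version) else "not a ROCm build"
            print(f"{name}: {version} ({kind})")
```

Run it with the virtualenv's own interpreter so that it inspects the same site-packages directory that mokuro uses.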
With the ROCm 6.3 wheels installed, it works again!
(.venv) j@arai ~/Pictures/0_manga % HSA_OVERRIDE_GFX_VERSION=11.0.0 mokuro kodomo_no_jikan/08 [0]
2025-07-08 17:11:09.639 | WARNING | mokuro.run:run:55 - Legacy HTML output is deprecated and will not be further developed. It's recommended to use .mokuro format and web reader instead. Legacy HTML will be disabled by default in the future. To explicitly enable it, run with option --legacy-html.
2025-07-08 17:11:09.639 | INFO | mokuro.run:run:63 - Scanning paths...
Found 1 volumes:
/mnt/media/Pictures/0_manga/kodomo_no_jikan/08 (partially processed)
Each of the paths above will be treated as one volume.
Continue? [yes/no]yes
2025-07-08 17:11:11.933 | INFO | mokuro.run:run:133 - Processing 1/1: /mnt/media/Pictures/0_manga/kodomo_no_jikan/08
Processing pages...: 0%| | 0/203 [00:00<?, ?it/s]2025-07-08 17:11:11.944 | INFO | mokuro.manga_page_ocr:__init__:41 - Initializing text detector, using device cuda
2025-07-08 17:11:12.440 | INFO | manga_ocr.ocr:__init__:15 - Loading OCR model from kha-white/manga-ocr-base
2025-07-08 17:11:14.207 | INFO | manga_ocr.ocr:__init__:25 - Using CUDA
2025-07-08 17:11:16.622 | INFO | manga_ocr.ocr:__init__:38 - OCR ready
Processing pages...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 203/203 [04:44<00:00, 1.40s/it]
2025-07-08 17:15:56.366 | INFO | mokuro.run:run:146 - Processed successfully: 1/1
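For a rough comparison of the two successful GPU runs, the per-page time can be recomputed from the tqdm lines in the logs (194 pages in 03:58 and 203 pages in 04:44). This is just a small arithmetic sketch with hypothetical function names. The two runs processed different volumes with different software versions, so it is not a strict benchmark.

```python
def mmss_to_seconds(elapsed: str) -> int:
    """Convert a tqdm-style 'MM:SS' elapsed time to seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + int(seconds)


def seconds_per_page(elapsed: str, pages: int) -> float:
    """Average processing time per page for one run."""
    return mmss_to_seconds(elapsed) / pages


# Elapsed times and page counts copied from the two progress bars above.
runs = {
    "2024-07-01 (194 pages)": ("03:58", 194),
    "2025-07-08 (203 pages)": ("04:44", 203),
}

for label, (elapsed, pages) in runs.items():
    print(f"{label}: {seconds_per_page(elapsed, pages):.2f} s/page")
# 2024-07-01 (194 pages): 1.23 s/page
# 2025-07-08 (203 pages): 1.40 s/page
```

The results match the 1.23s/it and 1.40s/it figures that tqdm printed.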