Sugoi Toolkit V9 - Fastest toolkit in the west

Download Link: (public link available on the 15th-16th of every month)

https://www.patreon.com/mingshiba/

List of updates: (major ones are explained below)

  • 2x speed for EVERY program using Sugoi Offline Model
  • 10x image recognition CPU speed on VN OCR
  • Official GPU installation script for offline translation and transcription model
  • Copy image to translate with Sugoi Manga OCR (beta release)
  • ChatGPT is now available with Sugoi Translator (beta release)
  • 30% faster CPU transcription speed - 12s vs 18s for ASMR input
  • Sugoi Translator can now disable OCR image in the menu
  • Updated NodeJS from version 12 to version 20
  • Added Sugoi Offline Debug to menu for easier debugging
  • Removed Sugoi Image Upscaler
  • Removed Sugoi Manga Downloader

Introduction:

Sugoi Toolkit V9.0 is all about speed, with significant improvements in default translation, image recognition, and transcription speed. It also comes bundled with the official GPU installation script for even better performance.

Two heavily requested features are now in beta: copy image to translate for Sugoi Manga OCR, and ChatGPT support for Sugoi Translator.

2x speed for EVERY program using Sugoi Offline Model:

Input text: その場所に向けて伸ばしている右腕は、同じく伸ばしている左腕とは少し異なっていた。

  • Sugoi Toolkit V8: 1300 milliseconds
  • Sugoi Toolkit V9: 700 milliseconds

Sugoi Offline Model now uses the CTranslate2 (CT2) package by default, replacing the previous fairseq library. Accuracy is about the same, while CPU processing is roughly twice as fast (and faster still with GPU enabled). There are tricks to boost the speed further, such as increasing the inter or intra thread counts, but that will be for another post. These parameters and more are in User-Settings.json, so you can test them out if you want.
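For anyone curious what the inter and intra thread parameters look like on the CTranslate2 side, here is a minimal sketch. It is not the toolkit's own loading code: the setting names ("device", "inter_threads", "intra_threads") and the helper functions are assumptions made up for illustration, and the real key names in User-Settings.json may differ. Only the ctranslate2.Translator arguments themselves come from the CTranslate2 package.

```python
import json


def translator_kwargs(settings: dict) -> dict:
    """Map a (hypothetical) settings dict to ctranslate2.Translator keyword arguments."""
    return {
        # "cpu" or "cuda"
        "device": settings.get("device", "cpu"),
        # inter_threads: how many translations can run in parallel
        "inter_threads": int(settings.get("inter_threads", 1)),
        # intra_threads: threads used inside one translation (0 = let CTranslate2 decide)
        "intra_threads": int(settings.get("intra_threads", 0)),
    }


def load_translator(model_dir: str, settings: dict):
    """Build a CTranslate2 translator from a converted model directory."""
    import ctranslate2  # imported lazily so the helper above works without the package

    return ctranslate2.Translator(model_dir, **translator_kwargs(settings))


# Example settings fragment; the key names are assumptions, not the toolkit's real ones.
example = json.loads('{"device": "cpu", "inter_threads": 2, "intra_threads": 4}')
print(translator_kwargs(example))
```

As a rough rule, more intra threads help a single sentence finish sooner, while more inter threads help when several requests arrive at once. Going above the number of physical CPU cores rarely pays off.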

10x image recognition CPU speed on VN OCR:

  • Sugoi Toolkit V8: 900 milliseconds
  • Sugoi Toolkit V9: 80 milliseconds
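To put the V8 vs V9 timings quoted in this post side by side, here is a small sketch that turns each pair into a speedup factor and a percentage of time saved. The numbers are the ones listed in this post, and the helper names are just for illustration.

```python
def speedup(old: float, new: float) -> float:
    """How many times faster the new timing is compared to the old one."""
    return old / new


def time_saved_percent(old: float, new: float) -> float:
    """Percentage of the old running time that was cut."""
    return (old - new) / old * 100


# (old, new) timings as quoted in this post; units cancel out in both ratios
timings = {
    "Offline translation (ms)": (1300, 700),
    "VN OCR image recognition (ms)": (900, 80),
    "CPU transcription, ASMR input (s)": (18, 12),
}

for name, (old, new) in timings.items():
    print(f"{name}: {speedup(old, new):.2f}x, {time_saved_percent(old, new):.0f}% less time")
```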

The default image-to-text speed of VN OCR is now 10 times faster than before. Just for reference, human reaction time is about 250 milliseconds. This change comes from preloading the Tesseract binary and its language data (more than 20MB) instead of loading them on every single request, as is commonly done.
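The preloading idea can be sketched in a few lines. This is a generic illustration of the pattern, not the actual VN OCR code: load_engine is a stand-in that only simulates a slow start-up, and all names and the sleep duration are made up for the example.

```python
import time
from functools import lru_cache


def load_engine() -> dict:
    """Stand-in for an expensive start-up step (e.g. loading a binary and language data)."""
    time.sleep(0.05)  # simulated load cost, not a measured figure
    return {"lang": "jpn"}


def recognize_cold(image: bytes) -> str:
    """Pays the full load cost on every single request."""
    engine = load_engine()
    return f"{engine['lang']}:{len(image)}"


@lru_cache(maxsize=1)
def get_engine() -> dict:
    """Loads the engine on the first call and reuses the same object afterwards."""
    return load_engine()


def recognize_warm(image: bytes) -> str:
    """Reuses the preloaded engine, so only the recognition work remains."""
    engine = get_engine()
    return f"{engine['lang']}:{len(image)}"


def timed(fn, image: bytes, runs: int = 5) -> float:
    """Total wall-clock time for `runs` calls of fn."""
    start = time.perf_counter()
    for _ in range(runs):
        fn(image)
    return time.perf_counter() - start


if __name__ == "__main__":
    get_engine()  # preload once at start-up
    img = b"\x00" * 1024
    print(f"cold: {timed(recognize_cold, img):.3f}s")
    print(f"warm: {timed(recognize_warm, img):.3f}s")
```

The trade-off is a slightly longer start-up and some memory held for the whole session, in exchange for each request skipping the load step.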

Official GPU installation script for offline translation and transcription model:

Users have been requesting an official GPU script for many versions now. Your wish is granted. Inside this folder you'll find an "install-cuda.bat" file. Click on it, and soon the translation model and the audio/video transcription model should be blazingly fast (I haven't measured the speed, but it should be at least twice as fast as CPU).

QUICK NOTE: after installation is complete, you can safely remove these two folders and the zip file to save space.

Copy image to translate with Sugoi Manga OCR (beta release):

Copying an image to translate leads to much better text box detection and is also a lot more convenient, as users can now view the whole page at once. However, for some reason the viewing window doesn't pop up as usual after copying the image, so you have to manually click on that window in the taskbar. I'll try to resolve this issue in the next version.

ChatGPT is now available with Sugoi Translator (beta release):

On the Sugoi Toolkit menu, there is a "new" option called "Sugoi Translator Premium". Get your API key from the ChatGPT website and paste it into line 15 of the User-Settings.json file. In the future, I plan to make this a universal translator that works with a local LLAMA model or other online services like Gemini.