Visual Novel OCR Guide

Links:

The download link is available in the About section of my Patreon page. VN OCR is bundled along with other programs in the Sugoi Translation Toolkit package. https://www.patreon.com/mingshiba

For a video demo and tutorials, please check these two YouTube videos:

Introduction:

Visual Novel OCR, the software and the movement, represents a new approach to grabbing text from visual novels and dialogue-heavy games in general. As long as the game is on your screen, VN OCR will work. An increasingly common trend is game streaming/cloud gaming: game content is processed on a different machine (PS4, Xbox, server, etc.) and only video is sent back to your local machine. A text hooker cannot work on video or image data (MP4, PNG, or JPEG), so in this case the only solution is OCR technology.

OCR stands for “optical character recognition”, or image to text, to put it simply. Visual Novel OCR leverages Tesseract 5, the best open-source OCR engine available, along with pre-trained models for Japanese horizontal and vertical text recognition.
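To give a rough idea of what calling Tesseract on a cropped dialogue image can look like, here is a minimal sketch using the standard Tesseract command line. How VNO itself invokes Tesseract is not described in this guide, so the function names `build_tesseract_command` and `recognize` are hypothetical, and the page segmentation modes chosen here are an assumption. `jpn` and `jpn_vert` are the usual names of the pre-trained Japanese horizontal and vertical models.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, vertical=False):
    """Build a Tesseract CLI call that prints recognized Japanese text to stdout.

    Hypothetical helper for illustration; not VNO's actual code.
    """
    # Pre-trained Japanese models: horizontal vs. vertical text.
    lang = "jpn_vert" if vertical else "jpn"
    # Assumed page segmentation modes: 5 = single block of vertical text,
    # 6 = single uniform block of text.
    psm = "5" if vertical else "6"
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", psm]


def recognize(image_path, vertical=False):
    """Run Tesseract if it is installed; return the text, or None if it is missing."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_command(image_path, vertical),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `recognize("dialogue.png")` would run the horizontal model on a saved screenshot of the dialogue box, and `recognize("dialogue.png", vertical=True)` would switch to the vertical model.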

Motivation:

Previously, there were two major ways to understand Japanese games. The first is to wait for a fan or official translation (which requires a substantial amount of patience), and the second is to use text hooker software like VNR or Textractor. The latter approach works by injecting a monitor script into the running game to find and “hook” text data, mainly dialogue, for dictionary lookup or direct machine translation. This method works very well with common game engines like Kirikiri (Fate/stay night) or Ren'Py (Doki Doki Literature Club). However, it becomes very complicated or impossible when handling newer engines, in-house engines, or emulators.

Unsuccessful attempt to hook text in Suikoden 1 (RetroArch emulator)

Texthooker vs OCR?

Text hooker:

  • Text hooker is the fastest and most accurate method
  • However, hook codes can be hard to find at times
  • Doesn’t work on some games

OCR:

  • OCR works on all games that are on the screen
  • However, accuracy depends on image quality
  • For VNO, users need some practice with the color settings

The Verdict:

Use both because why not. Users who downloaded both Textractor and Visual Novel OCR are reported to be happier because they don’t have to decide what to use, and now they can play all the visual novels they want.

Download and Installation:

The download link is in the Links section above.

The program doesn’t require any installation, except that you will need to allow “NodeJS” to run on your machine the first time you use it. This is the back-end of VNO, which handles online translation and connects the various moving components into one cohesive package.

When you open the program, you should see the main menu window and Translation Aggregator (a very flexible program).

Features:

Mirror Screen Capture:

This is one of the two key features of Visual Novel OCR. It is a sort of permanent mirror/window lying on top of the dialogue section in games. This allows users to conveniently get the coordinates of the text area and extract the content inside. First, drag and resize the text capture window to fit the dialogue section.

Then, click “transparent” to turn it into a see-through window.

As you can see from the picture, there are three buttons corresponding to three things you can do.

Return:

Goes back to the semi-transparent window so you can move it around.

Translate:

Lets you crop the text inside the capture window. You can also press the backtick (`) or TAB key instead of clicking. (On the first attempt, you need to manually set up the color contrast threshold; more on that later.)

Crop:

Lets you crop/snip any area on the screen, like menu items or choices.

Translating text after setting an optimal color contrast threshold

Crop in-game choices
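The idea of reading the text area's coordinates from the capture window can be sketched as below. This is only an illustration, not VNO's actual code: the function name `capture_box` and its parameters are hypothetical, and it simply assumes the window's position and size are known in screen pixels.

```python
def capture_box(window_x, window_y, window_w, window_h, screen_w, screen_h):
    """Convert the capture window's position and size into a crop box.

    Returns (left, top, right, bottom) in screen pixels, clamped to the
    screen so that a window hanging off an edge still gives a valid region.
    Hypothetical helper for illustration only.
    """
    left = max(0, window_x)
    top = max(0, window_y)
    right = min(screen_w, window_x + window_w)
    bottom = min(screen_h, window_y + window_h)
    if right <= left or bottom <= top:
        raise ValueError("capture window is entirely off-screen")
    return (left, top, right, bottom)
```

A box like this could then be handed to a screenshot function (for example, Pillow's `ImageGrab.grab(bbox=...)`) every time the Translate button or hotkey is pressed, which is why the window only has to be positioned once.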

Color Contrast Threshold:

The most important feature of Visual Novel OCR. It is a window that lets you adjust color contrast properties so that the final output is an image with black/colored text on a white background.

A sample setting

The reason human eyes can discern text from the background is that text has distinct shapes and is also “brighter”. VNO focuses on the distinct color aspects, summed up as HSV (hue, saturation, and value, i.e. the color, saturation, and brightness in the game window). Usually with saturation, brightness, or a combination of the two, users can capture all the text in the game. This setting requires a bit of practice but is not difficult. One thing to note is that thin lines will yield better results than thick lines.

Thick line (can lead to inaccurate kanji)

Thin line (much better now)
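For readers curious about what a color contrast threshold does under the hood, here is a minimal sketch of HSV thresholding using only Python's standard library. It is not VNO's implementation: the function name `threshold_pixels` is hypothetical, and the default ranges are an assumption that the game uses bright, low-saturation (whitish) text.

```python
import colorsys


def threshold_pixels(rows, s_range=(0.0, 0.3), v_range=(0.8, 1.0)):
    """Turn an RGB image into black text on a white background.

    `rows` is a list of rows, each a list of (r, g, b) tuples in 0-255.
    A pixel counts as text when its saturation and brightness (value)
    both fall inside the given ranges (0.0-1.0). The default ranges are
    an assumed example for whitish text, not VNO's settings.
    """
    output = []
    for row in rows:
        out_row = []
        for r, g, b in row:
            # colorsys works with floats in 0.0-1.0; hue is not used here.
            _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            is_text = (s_range[0] <= s <= s_range[1]
                       and v_range[0] <= v <= v_range[1])
            out_row.append(0 if is_text else 255)  # 0 = black, 255 = white
        output.append(out_row)
    return output
```

Adjusting `s_range` and `v_range` plays the same role as moving the saturation and brightness sliders: the goal is to keep only the pixels that belong to the text and push everything else to white.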

Translation Aggregator:

This is a great program that can do many things. It can hook text, get translations from many online services, and act as a very useful dictionary. The bundled Translation Aggregator mainly serves the third function, but technically, you can freely customize it.

Why OCR with Visual Novel OCR?

VNO is not the first tool to offer OCR functionality that works with visual novels. Some general OCR tools like Capture2Text let users snip the screen for text extraction, and even for direct translation, like text hooker software. However, the capture process is tedious, requiring users to constantly drag a long rectangle over each new line of dialogue. On top of that, the accuracy can range from quite bad to really bad when encountering a somewhat transparent background, which is common in visual novels.

Capture2Text doesn’t work with a semi-transparent background

Visual Novel OCR solves these two issues, finally making OCR a user-friendly alternative to text hooking. The two main mechanisms involved are mirror screen capture and color contrast threshold.