Touhou Patch Center:Wine secrets

From Touhou Patch Center

Fonts

Have you ever wondered why fonts don't always work as intended in Wine? No? I think it's interesting.

Explanations

Displaying

When an application displays text containing valid code points that have no corresponding graphic in a particular font, the rendering engine cannot draw these symbols in the chosen typeface. When this happens, the engine can do one of three things:

  1. draw a placeholder character (like ⍰)
  2. skip the symbol entirely
  3. try drawing the glyph using another font

For application users and developers, option 3 is the most convenient solution; it allows the software to fall back to another font and retry drawing the symbol. On Windows operating systems, this behavior happens automatically for (all?) fonts: Windows graphics libraries will generally try supplementary fonts when a valid character cannot be displayed with the chosen one. Some developers of internationalized software rely on this feature to display text from a wide range of scripts.

Wine's imitation of this behavior can be unreliable. Some versions/configurations of Wine seemingly neglect to apply supplementary fonts, and instead, draw generic .notdef symbols from the current font when faced with unsupported scripts. Unfortunately, the fonts used by thcrap language patches do not provide full coverage of all of the writing systems that appear in community translations. So some Wine environments may not be able to show all symbols in translated games.

Font Links

font stack
0 base font
1 font X
2 font Y
3 font Z

Windows operating systems support a feature called "font linking" via their GDI and GDI+ APIs. It allows users and applications to set custom rules that control how individual fonts can supplement each other, effectively creating layered composite fonts. A font-linked typeface has a font stack containing 1 base font and 1 or more linked fonts. If a decoded character needs to be rendered and it's not found in the base font, it falls through to the next (linked) font in the stack. It keeps traveling down the list until one of the candidate fonts finds the corresponding glyph in its 'cmap' & 'glyf' tables and draws it. Once the character is drawn, subsequent linked fonts cannot overwrite its presentation. If the character falls through to the very bottom of the stack, a .notdef glyph from the base font (if present) is rendered in its place.
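The fall-through rule can be sketched as a toy model. This is only an illustration: the font names and coverage sets below are made up, and neither Windows nor Wine is implemented this way. A real renderer would consult each font's 'cmap' table rather than a Python set.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical font stack: (name, characters the font can draw).
FONT_STACK: List[Tuple[str, Set[str]]] = [
    ("base font", set("ABCabc")),
    ("font X", set("東方")),
    ("font Y", set("東紅魔郷")),
]

def pick_font(char: str, stack: List[Tuple[str, Set[str]]] = FONT_STACK) -> Optional[str]:
    """Return the first font in the stack that covers char, else None."""
    for name, coverage in stack:
        if char in coverage:
            # The first match wins; later fonts never override it.
            return name
    # Fell through the whole stack: the caller would draw .notdef.
    return None

for ch in "A東紅?":
    font = pick_font(ch)
    print(ch, "->", font if font else ".notdef from the base font")
```

Note that 東 is covered by both linked fonts in this model, but only font X ever draws it, because it sits higher in the stack.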

Wine also supports the GDI API. Although it's missing some of the functionality from Microsoft's version, the font-linking feature remains intact for (most?) wine runners. Font links can be added or removed by the user in much the same way, by editing registry values.

Learn more about Font Linking at Microsoft Learn, TTF font structure at SIL

Tutorials

Font Link Tofu Fix

Tested in these environments:

Working

Wine      [ver:9.0]  [from:WineHQ(RPM)] [prefix:32bit,64bit] / Fedora [DE:Gnome] / thcrap [lang:en] / th:07,08
Wine-GE   [ver:8-26] [from:Lutris]      [prefix:32bit,64bit] / Fedora [DE:Gnome] / thcrap [lang:en] / th:07,08
Proton-GE [ver:9-12] [from:Lutris]      [prefix:32bit,64bit] / Fedora [DE:KDE]   / thcrap [lang:en-literal] / th:06,07,09 


Not Working


( feel free to add your own test results here ⮥ )

Steps

1. Find the typeface family of the font your patch is using.
  • You must look through your thcrap patch stack (thcrap/config/??.js), and find your patch's corresponding "font" in the repos directory. For older games, check the thXX.js files.
  • What you actually need is the typeface family, which is often confused with the font name. Example: write Touhou Biolinum instead of THBiolinum.
  • Keep in mind that patches may override each other, so you must find the "final" patch that specifies its own typeface.
2. Install the fonts you would like to use. Take note of their file names! If you use Proton-GE to run the games, this step isn't required: unlike Wine and Wine-GE, it has the required fonts preinstalled.
  • For thcrap patches, ensure your fonts support the Unicode character set (ISO 10646). Most modern fonts will meet this requirement.
  • Make sure at least one of these fonts supports the missing scripts you want your game to display.
  • There are multiple directories that Wine checks for fonts. You can place font files in any of them.
  • These are the relevant font paths for Linux:
    • ~/.fonts/ (if you want to use a personal font across all Wine prefixes)
    • /usr/share/fonts/ (if you installed a font with a package manager, it's somewhere in here)
    • ~/.wine/drive_c/windows/Fonts/ (if you want to bundle fonts within the standard prefix)
3. Find the wine launcher for your prefix, and use it to open the registry editor.
  • If you use wine from the command line:
    • wine regedit (if the wine executable is in your $PATH)
    • /path/to/your/client/wine regedit (if it's not in your $PATH, find the path and prepend it to the command)
  • If you use wine through Lutris:
    • You can find the registry editor at the bottom of the interface, by clicking the ▲ button next to "Platform:"
  • At this point, you may back up the registry if you wish, by choosing Registry/Export Registry File...
4. Navigate to the following registry key and add a new multi-string value.
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink
Value Example
  Name: Aroania
  Type: REG_MULTI_SZ
  Data (one filename per line):
    NotoSansCJK-VF.ttc
    notosans_jp.ttf
    last-resort.ttf
  • The Name should be the base "font" (the typeface family you found in step 1)
  • The Data should be a list of linked fonts (the filenames of fonts you decided to use in step 2)
  • Make sure there are no leading or trailing spaces anywhere! Spaces in the middle of the family name are OK.
5. Run your game again with thcrap to check if the tofu is gone.
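As an alternative to typing the value from step 4 by hand, the same entry could be prepared as .reg text. The sketch below only builds and prints that text. It assumes the usual .reg convention of storing REG_MULTI_SZ data as hex(7), meaning UTF-16LE strings that are each NUL-terminated, with one more NUL closing the list. The helper names are hypothetical, and whether your regedit imports the result cleanly (for example, which file encoding it expects) is something to verify in your own prefix.

```python
# Registry key from step 4.
KEY = r"HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink"

def multi_sz_hex(entries):
    """Encode strings as REG_MULTI_SZ bytes, written as comma-separated hex."""
    raw = b"".join(e.encode("utf-16-le") + b"\x00\x00" for e in entries)
    raw += b"\x00\x00"  # the extra NUL that closes the list
    return ",".join(f"{b:02x}" for b in raw)

def build_reg(family, font_files):
    """Return .reg text linking font_files to the base typeface family."""
    family = family.strip()  # no leading or trailing spaces
    files = [f.strip() for f in font_files if f.strip()]
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{KEY}]\n"
        f'"{family}"=hex(7):{multi_sz_hex(files)}\n'
    )

# Family and filenames taken from the value example in step 4.
reg_text = build_reg(
    "Aroania",
    ["NotoSansCJK-VF.ttc", "notosans_jp.ttf", "last-resort.ttf"],
)
print(reg_text)
```

Stripping whitespace in build_reg mirrors the warning about leading and trailing spaces; spaces inside the family name are left alone.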

Drawbacks
Here are a few known issues with this solution:

Size Mismatch
Some font-linked typefaces do not match in size. If the new font draws the text too large, symbols may get "cut off" at the top.
Immortal Tofu
Sometimes, .notdef symbols still remain, even when Last Resort is linked as the final font.
System Font Weirdness
Wine's system typeface families, like Times New Roman, may not font-link as expected, probably due to conflicts with preexisting font mappings in the registry.

Encoding

You may be asking how changing the locale on your host OS affects Wine's environment. This section intends to explain this sorcery.

Explanations

Encoding Mismatch

For a computer to store text, it must convert it to binary data (1s and 0s), and for a program to display the text back to the user, it must decode that data back into human-readable form. So, in theory, when the user types characters into a text editor, the software encodes those symbols using a specific standard. To show text onscreen, the program decodes that same series of bytes (using the same standard) to find the addresses of each glyph used. It then looks up those addresses (called "code points") in a font's character map (which also complies with the standard) to draw their associated outlines. But what if the encoder and decoder do not follow the same standard? The wrong symbols would be shown to the user.

Encoding History

Throughout computing history, there have been many encoding schemes for plain text. Before the rise of Unicode, standards were often localized to geographic regions where people share a common writing system, such as Cyrillic (cp1251), Greek (cp1253), and Arabic (cp1256). In Japan, JIS (Japanese Industrial Standards) became widely adopted in many industries; most notably, their 'X' subset for digital information processing. Originating in JIS X 0208 Appendix 1, a versatile (if not convoluted) encoding called Shift JIS or "SJIS" became the basis for Japanese locales on Windows and Mac OSes. SJIS can encode many character sets, including JIS X 0201, JIS X 0208, and most of US-ASCII ( note that \ is replaced by ¥ ). However, SJIS is not compatible with modern character encodings like UTF-8, which uses the Unicode character set. It's also incompatible with 8-bit code pages such as Windows-1252, AKA "Windows ANSI" (misnomer).

Touhou Encoding

Touhou Project games frequently utilize Shift JIS in programmed string literals and documentation. It won't always match what the system uses. For example, here's how the TH08 game title appears with various decoding schemes:

Encoding Representation
Raw Bytes (Hex) 93 8C 95 FB 89 69 96 E9 8F B4 81 40 81 60 20 49 6D 70 65 72 69 73 68 61 62 6C 65 20 4E 69 67 68 74 2E
Shift JIS 東 方 永 夜 抄 ␣ ～ ␣ I m p e r i s h a b l e ␣ N i g h t .
ISO 8859-15 · · · û · i · é · Ž · @ · ` ␣ I m p e r i s h a b l e ␣ N i g h t .
Windows-1252 “ Œ • û ‰ i – é · ´ · @ · ` ␣ I m p e r i s h a b l e ␣ N i g h t .
UTF-8 ✗ ✗ ✗ ✗ ✗ i ✗ [U+93F4] ✗ @ ✗ ` ␣ I m p e r i s h a b l e ␣ N i g h t .
  ␣ valid space separators
  · "non-printing" control characters and unused code points
  ✗ invalid bytes. They're un-decodable!
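The comparison can be reproduced with Python's built-in codecs. This is a rough sketch, not a model of what Wine does: Python's codec tables are an assumption standing in for the real decoders, and they differ in small ways (for example, Python's cp1252 treats 0x81 and 0x8F as undefined). Undecodable bytes are shown as U+FFFD instead of raising an error.

```python
# Raw bytes of the TH08 title, copied from the table.
raw = bytes.fromhex(
    "93 8C 95 FB 89 69 96 E9 8F B4 81 40 81 60 20"
    "49 6D 70 65 72 69 73 68 61 62 6C 65 20 4E 69 67 68 74 2E"
)

CODECS = ("shift_jis", "iso8859_15", "cp1252", "utf-8")

# errors="replace" swaps each undecodable byte for U+FFFD.
decoded = {codec: raw.decode(codec, errors="replace") for codec in CODECS}

for codec, text in decoded.items():
    # repr() keeps control characters visible and terminal-safe.
    print(f"{codec:12} {text!r}")
```

Only the Shift JIS line yields the intended title. The trailing ASCII text survives under every codec because all four agree on the bytes 0x20 through 0x7E used there.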

To avoid confusion, I should emphasize that UTF-8 can also encode the kanji shown by SJIS; the difference is that UTF-8 must use Unicode's scalar values and its own encoding strategy, which produces a totally different binary representation. Also note that Windows uses its own extended version of SJIS known as MS932; it features even more kanji, and extra special characters like circled numerals.
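A short sketch of both points, again leaning on Python's built-in codecs as a stand-in (an assumption: "cp932" is Python's name for MS932, and its "shift_jis" codec is the stricter one that is expected to reject the extensions):

```python
title = "東方永夜抄"

sjis = title.encode("shift_jis")  # 2 bytes per kanji
utf8 = title.encode("utf-8")      # 3 bytes per kanji

print("Shift JIS:", sjis.hex(" ").upper())
print("UTF-8:    ", utf8.hex(" ").upper())

# Circled numerals are an MS932 extension.
circled = "①"
ms932 = circled.encode("cp932")
print("MS932 ①:  ", ms932.hex(" ").upper())

try:
    circled.encode("shift_jis")
    print("plain shift_jis also accepted ①")
except UnicodeEncodeError:
    print("plain shift_jis cannot encode ①")
```

The same five kanji occupy 10 bytes in Shift JIS and 15 in UTF-8, with no byte sequences in common, which is why one cannot be read as the other.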

Thankfully, users are unlikely to see any mojibake while using thcrap, because it enforces UTF-8 internally to represent translated text strings, but other mods (like vpatch) may introduce SJIS of their own. Finally, there's an encoding hurdle for *nix users trying to run ZUN's installers and updaters, which thcrap cannot localize.

Learn more about Shift JIS at sljfaq.org

Linux to Wine Locale Mapping

A locale is a collection of regional settings that control how the operating system encodes and displays information. On Linux systems, the locale identifier is typically split into two parts: the region-specific language code, and the encoding 'charmap'. For Japanese, there are two predominant charmaps available.


Extended Unix Code – Japanese

ja_JP.EUC-JP | ja_JP.UJIS | ja_JP
(these are all equivalent)

Unicode Transformation Format – 8-bit

ja_JP.UTF-8


As you may have already noticed, neither of these options is compatible with SJIS. So how does Wine remedy this issue? By translating locales between the host OS and the compatibility layer.

first half: language
Wine uses this half of the Unix-style locale to decide which Windows code page to simulate.
If it's ja_JP, Wine uses Microsoft's code page 932 (MS932) to decode non-Unicode text.
If it were en_US, Wine would use code page 1252, and for ru_RU it would use cp1251.

second half: encoding
Wine uses this half (mostly) to control what encoding is used when files are created, opened, written, and deleted.
Wine essentially takes the character sequence decoded by the code page, and re-encodes it into your locale's current charmap.
If it's UTF-8, Wine uses this encoding when referencing files; if EUC-JP is used, Wine reads and writes filenames using that standard.
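The two halves can be modeled with a small lookup. This is a toy illustration of the mapping described above, not Wine's actual logic: the table holds only the three languages named in the text, and the function names are hypothetical.

```python
from typing import Optional, Tuple

# Language half -> Windows code page (only the examples from the text).
CODE_PAGES = {"ja_JP": 932, "en_US": 1252, "ru_RU": 1251}

def split_locale(locale_id: str) -> Tuple[str, Optional[str]]:
    """Split 'ja_JP.UTF-8' into ('ja_JP', 'UTF-8')."""
    language, _, charmap = locale_id.partition(".")
    # No charmap given (e.g. plain 'ja_JP'): the system default applies.
    return language, charmap or None

def describe(locale_id: str) -> Tuple[Optional[int], Optional[str]]:
    """Return (simulated code page, filename charmap) for a locale."""
    language, charmap = split_locale(locale_id)
    return CODE_PAGES.get(language), charmap

for loc in ("ja_JP.UTF-8", "ja_JP.EUC-JP", "en_US.UTF-8", "ru_RU.UTF-8"):
    code_page, charmap = describe(loc)
    print(f"{loc:14} code page {code_page}, filenames in {charmap}")
```

The first two rows differ only in the charmap, so in this model both decode non-Unicode text with code page 932 and differ only in how filenames are written to disk.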

example #1: scalar mismatch
To illustrate this, I will use the TH06 installer as an example. If you run the installer on Linux with the locale set to en_US.UTF-8, the program will not be able to copy files with Japanese characters into the destination folder.
Here's why in detail:

  1. Installer would try creating the game's directory 東方紅魔郷 [hex: 938C-95FB-8D67-9682-8BBD] using SJIS
    • The name's data would be decoded with cp1252, producing “Œ•ûg–‚‹½ [hex: 93-8C-95-FB-8D-67-96-82-8B-BD] in the compatibility layer (8D is invisible)
    • When creating the folder on your filesystem, Wine would re-encode these glyphs as UTF-8, because in this scenario, your unix locale is configured to use it
    • The folder name would become “Œ•ûg–‚‹½ [hex: E2809C-C592-E280A2-C3BB-C28D-67-E28093-E2809A-E280B9-C2BD]
    • Warning: C28D is Reverse Line Feed, it could mess with your terminal
  2. Installer then tries copying 東方紅魔郷.exe using SJIS to the newly created folder
    • Wine translates this request as: copy kouma/“Œ•ûg–‚‹½.exe into folder /path/to/“Œ•ûg–‚‹½/
    • Wine searches the kouma folder for this filename, but never finds it
    • kouma's 東方紅魔郷.exe isn't found because it contains (properly) UTF8-encoded kanji
    • Even though kouma filenames and the locale both use UTF-8, the scalar values no longer match
  3. Installer shows the user an error (in mojibake) and aborts the installation
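The byte values in this walkthrough can be reproduced with a short sketch. It is an approximation under one stated assumption: Python's cp1252 codec refuses bytes such as 0x8D, so the hypothetical helper below passes them through as the same-numbered control character, which matches the C28D listed in step 1.

```python
def decode_like_cp1252(data: bytes) -> str:
    """Decode byte-by-byte as Windows-1252, keeping undefined bytes
    (such as 0x8D) as the control character with the same number."""
    out = []
    for b in data:
        try:
            out.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            out.append(chr(b))
    return "".join(out)

name = "東方紅魔郷"

sjis = name.encode("shift_jis")           # what the installer sends
seen_by_wine = decode_like_cp1252(sjis)   # misread through cp1252
on_disk = seen_by_wine.encode("utf-8")    # re-encoded for the filesystem
expected = name.encode("utf-8")           # the properly encoded name

print("SJIS bytes:", sjis.hex("-").upper())
print("on disk:   ", on_disk.hex("-").upper())
print("expected:  ", expected.hex("-").upper())
print("match:     ", on_disk == expected)
```

Both byte strings are valid UTF-8, yet they name different characters, which is why the lookup for the existing file fails.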