2022-12-21
You may want to try the unicode-math package from CTAN, which provides Unicode input for math, so that you can write ∑ instead of \sum. Also, perhaps more importantly, copying text from a PDF created using unicode-math will give you proper Unicode math symbols, independent of whether you chose to write \sum or ∑ in the input.
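As a small illustration of the two input styles, the following sketch typesets the same sum twice. It assumes compilation with LuaLaTeX or XeLaTeX (unicode-math requires one of them) and a UTF-8 encoded source file; the formula itself is only a placeholder.

```latex
% Compile with lualatex or xelatex; unicode-math does not work under pdflatex.
\documentclass{article}
\usepackage{unicode-math} % loads Latin Modern Math by default
\begin{document}
% Both lines should give the same printed output and the same
% copyable Unicode character (U+2211) in the PDF.
\[ \sum_{i=1}^{n} i = \frac{n(n+1)}{2} \]
\[ ∑_{i=1}^{n} i = \frac{n(n+1)}{2} \]
\end{document}
```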
However, if you try to use this neat package in a short sample document such as:
\documentclass{article}
\usepackage{unicode-math}
\begin{document}
\[ \mathbb{N}\setminus\{0\} \]
\end{document}
The output will be rendered as:
ℕ {0}
The backslash is simply missing, and the only hint is a silent message in the log file saying that the requested character isn’t found in Latin Modern Math:
Missing character: There is no ⧵ in font [latinmodern-math.otf]/OT:script=math; language=dflt;!
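Because the message only lands in the log file, it is easy to miss. A possible way to surface it, assuming a reasonably recent TeX distribution (TeX Live 2021 or later, where \tracinglostchars accepts the value 3), is to raise the lost-character tracing level in the preamble:

```latex
\documentclass{article}
\usepackage{unicode-math}
% 2: also print "Missing character" messages on the terminal
% 3: (TeX Live 2021 or later) turn them into errors
\tracinglostchars=3
\begin{document}
\[ \mathbb{N}\setminus\{0\} \]
\end{document}
```

With this setting the sample document should stop at the formula instead of quietly dropping the glyph.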
The missing character is partly down to history, and partly down to a perhaps “ideological” choice about how fonts and encodings should behave.
\setminus vs. \smallsetminus problem.

Historically, the \setminus command produced the backslash and \smallsetminus a differently sloped “smaller” slash [1]:
\setminus is in plain.tex, Knuth’s original, and reuses the backslash (space considerations). When the smaller, differently sloped form was requested, it was included in the amsfonts under the name \smallsetminus in amssymb.
There are two different “ideologies” surrounding the correct approach w.r.t. fonts & Unicode:
\setminus would exist or make any meaningful sense. This is further highlighted by arguing that semantically, there can only be one set minus, and independent of what the actual glyph looks like, any set minus glyph should be encoded as one and only one symbol in the output. However, if a user wants to use both symbols, they have to use two different fonts.

While the GUST team providing Latin Modern Math and other LaTeX fonts seems to hold opinion (1), the Unicode Consortium standardised (2). They chose the name “REVERSE SOLIDUS OPERATOR” for the “traditional” TeX style, since it uses the reverse solidus, and the name “SET MINUS” for the symbol that LaTeX users were used to calling \smallsetminus.
This is perhaps confusing to LaTeX users.
To make matters worse, since the GUST team is of opinion (1), they not only implemented just one “variant”, they also mapped it to just one of the two available code points, leaving the other one unmapped. As mentioned in (1), this is possibly because they see “SET MINUS” as a semantic description whereas “REVERSE SOLIDUS OPERATOR” is not. Arguably, independent of the visual result, any glyph representing a subtraction of two sets should be encoded as “SET MINUS” only.
The result is that \setminus using unicode-math will lead to a lookup of the “REVERSE SOLIDUS OPERATOR”, which doesn’t exist in LM Math (the default font).
UNICODE | HTML | TeX/amssymb | LM (Math) | unicode-math |
---|---|---|---|---|
U+2216 SET MINUS | smallsetminus | smallsetminus | ∖ | |
U+29F5 REV SOLIDUS OP | setminus | setminus | N/A | |
U+005C REV SOLIDUS | bsol | backslash | \ | |
Since the GUST team seems not to want to provide both symbols, the easiest approach for them would be to simply map both code points to the same symbol. This would not break any existing code; it would only forestall any future plans to give “REVERSE SOLIDUS OPERATOR” a different symbol. Since the “SET MINUS” symbol provided by GUST does look different from the “REVERSE SOLIDUS”, this may indeed pose an issue.

Finally, the “missing” character could be seen as a warning to the user that their setup isn’t complete or correct yet, and that they need to choose how “SET MINUS” is supposed to look and how it is entered in the source code. The more appropriate fix could be to build an actual “REVERSE SOLIDUS OPERATOR” based on the existing “REVERSE SOLIDUS” glyph.
Since this issue isn’t fixed at the time of writing (Dec 2022), there are three solutions to the problem, with slight differences between them.
In any case, we first provide the missing character for when it appears directly in the source. In this case, we assume that a REVERSE SOLIDUS OPERATOR is indeed requested, and we construct it from the REVERSE SOLIDUS by adding binary math operator spacing:
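One way this could be done is sketched below. It assumes the newunicodechar package (not mentioned above) to make U+29F5 an active character, and builds the operator from \backslash wrapped in \mathbin; treat it as an untested illustration under LuaLaTeX or XeLaTeX rather than a definitive definition.

```latex
% In the preamble, after \usepackage{unicode-math}.
% Assumption: newunicodechar is used to make U+29F5 an active character.
\usepackage{newunicodechar}
% Typeset U+29F5 as the plain backslash glyph (U+005C, which LM Math
% does provide) with binary-operator spacing.
\newunicodechar{⧵}{\mathbin{\backslash}}
```

With this in place, typing ⧵ directly in math mode should no longer trigger the missing-character message, because the glyph is taken from the REVERSE SOLIDUS slot of the font.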
Now we have three choices.
1. Route \setminus to this newly created glyph:

\AtBeginDocument{\renewcommand{\setminus}{^^^^29f5}}

This will yield virtually the same optical output that LaTeX users would expect. However, this way, \setminus would semantically be typeset using the Unicode code point REVERSE SOLIDUS.

2. Always use SET MINUS. This forces \setminus to emit SET MINUS and not REVERSE SOLIDUS OPERATOR as before, effectively making it behave exactly like \smallsetminus.

\AtBeginDocument{\renewcommand{\setminus}{^^^^2216}}

This way, both commands yield the same output and there’s no way to enter REVERSE SOLIDUS OPERATOR other than by direct Unicode input.

3. Always use REVERSE SOLIDUS. This is the same as (1) but additionally overrides \smallsetminus to \setminus, thus making them behave the same and both emit REVERSE SOLIDUS.

\AtBeginDocument{\renewcommand{\setminus}{^^^^29f5}}
\AtBeginDocument{\renewcommand{\smallsetminus}{\setminus}}

I just wanted to provide this for completeness’ sake, but I have no idea why you would want to do that.
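For reference, this is how the sample document from the beginning might look with choice (2) applied. It only combines snippets already shown above; compilation with LuaLaTeX or XeLaTeX is assumed.

```latex
\documentclass{article}
\usepackage{unicode-math}
% Choice (2): make \setminus emit U+2216 SET MINUS, which LM Math provides.
\AtBeginDocument{\renewcommand{\setminus}{^^^^2216}}
\begin{document}
\[ \mathbb{N}\setminus\{0\} \]
\end{document}
```

The formula should then show the font’s SET MINUS glyph between ℕ and {0}, and the missing-character line should disappear from the log.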
Other solutions could include simply overriding the \setminus command to \smallsetminus, but this would have the disadvantage of being “wrong” any time the font chooses to provide a SET MINUS symbol which is not equivalent to a REVERSE SOLIDUS (as, in fact, LM Math does).

\AtBeginDocument{\renewcommand{\setminus}{\smallsetminus}}

Similarly, one could simply map \setminus directly to the custom-created backslash glyph. The only difference from (1) would be that entering the Unicode code point directly would still emit REVERSE SOLIDUS OPERATOR and thus result in a failing lookup.

\AtBeginDocument{\renewcommand{\setminus}{\mathbin{\backslash}}}
Sometimes, finding the “correct” choice is quite difficult. The GUST team mostly follows the “semantic” reading: regardless of how the output “looks”, a set minus operator should always be encoded as the set minus code point. The unicode-math team is simply consistent with the naming that HTML and others chose for the two characters, and provides both commands, but for accessing different code points. Whether these code points differ only in looks or also in semantics is the big debate.

Unfortunately, knowing this doesn’t help the troubled user. In my case, at least, it forced me to go down this rabbit hole for far too long, but I did learn something. So, thanks, I guess?
[1] https://tex.stackexchange.com/questions/140279/which-unicode-math-fonts-support-setminus/140343#comment1383759_523798