/*! This crate provides a way to access glyphs from TeX fonts. It is intended to be used by
crates using [`tex_engine`](https://crates.io/crates/tex_engine).

TeX deals with fonts by parsing *font metric files* (`.tfm` files), which contain information
about the dimensions of each glyph in the font. So from the point of view of (the core of) TeX,
a *glyph* is just an index $0 \leq i \leq 255$ into the font metric file.

To find out what a glyph actually looks like, we ideally want to know the corresponding
Unicode codepoint. This crate attempts to determine exactly that.

# Usage

This crate attempts to associate a TeX font (identified by the file name stem of its `.tfm` file) with:
1. A list of [`FontModifier`](fontstyles::FontModifier)s (e.g. bold, italic, sans-serif)
2. A [`GlyphList`], being an array `[`[`Glyph`]`;256]`

A [`Glyph`] is then either undefined (i.e. the glyph is not present in the font, or the crate couldn't
figure out what exactly it is) or presentable as a string.

Consider e.g. `\mathbf{\mathit{\Gamma^\kappa_\ell}}` (i.e. $\mathbf{\mathit{\Gamma^\kappa_\ell}}$).
From the point of view of TeX, this is a sequence of 3 glyphs, represented as indices into the font
`cmmib10`, namely 0, 20, and 96.

Here's how to use this crate to obtain the corresponding Unicode characters, i.e. `𝜞`, `𝜿` and `ℓ`:

### Instantiation

First, we instantiate a [`FontInfoStore`](encodings::FontInfoStore) with a function that
allows it to find files. This function should take a file name as a string (e.g. `cmmib10.tfm`) and
return its full path as a string (e.g. `/usr/share/texmf-dist/fonts/tfm/public/cm/cmmib10.tfm`).
This could be done by calling `kpsewhich`, for example, but repeated and frequent calls to
`kpsewhich` are slow, so more efficient alternatives are recommended.

```no_run
use tex_glyphs::encodings::FontInfoStore;
let mut store = FontInfoStore::new(|s| {
    std::str::from_utf8(std::process::Command::new("kpsewhich")
        .args(vec!(s)).output().expect("kpsewhich not found!")
        .stdout.as_slice()).unwrap().trim().to_string()
});
```
This store will now use the provided function to find your `pdftex.map` file, which lists
all the fonts that are available to TeX and associates them with `.enc`, `.pfa` and `.pfb` files.
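
Since repeated `kpsewhich` calls are slow, one possible alternative is to memoize the lookup, so
that each file name is resolved at most once. The following is only a sketch under stated
assumptions: `CachedLookup` and the `kpsewhich` helper function are hypothetical names and not
part of this crate, and the resolver is passed in as a parameter so that it can be swapped for
something faster than a subprocess call.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical helper (not part of this crate): memoizes a file lookup
/// function so that each file name is resolved at most once.
struct CachedLookup<F: Fn(&str) -> String> {
    resolve: F,
    cache: RefCell<HashMap<String, String>>,
}

impl<F: Fn(&str) -> String> CachedLookup<F> {
    fn new(resolve: F) -> Self {
        Self { resolve, cache: RefCell::new(HashMap::new()) }
    }

    /// Returns the cached path for `name`, calling the resolver only on a miss.
    fn get(&self, name: &str) -> String {
        let cached = self.cache.borrow().get(name).cloned();
        if let Some(path) = cached {
            return path;
        }
        let path = (self.resolve)(name);
        self.cache.borrow_mut().insert(name.to_string(), path.clone());
        path
    }
}

/// Hypothetical resolver that shells out to `kpsewhich`;
/// it yields an empty string if the call fails.
fn kpsewhich(name: &str) -> String {
    std::process::Command::new("kpsewhich")
        .arg(name)
        .output()
        .ok()
        .and_then(|out| String::from_utf8(out.stdout).ok())
        .map(|s| s.trim().to_string())
        .unwrap_or_default()
}
```

Assuming `FontInfoStore::new` accepts any closure from `&str` to `String` (as the example above
suggests), such a cache could be created with `let lookup = CachedLookup::new(kpsewhich);` and
passed along as `move |s| lookup.get(s)`.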

### Obtaining Glyphs

If we now query the store for the [`GlyphList`] of some font, e.g. `cmmib10`, like so:
```no_run
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
#     std::str::from_utf8(std::process::Command::new("kpsewhich")
#         .args(vec!(s)).output().expect("kpsewhich not found!")
#         .stdout.as_slice()).unwrap().trim().to_string()
# });
let ls = store.get_glyphlist("cmmib10");
```
...it will attempt to parse the `.enc` file associated with `cmmib10`, if one exists. If there is none,
or if parsing fails, it will try to parse the `.pfa` or `.pfb` file. If neither works, it will search
for a `.vf` file and try to parse that. If that fails too, it will return an empty [`GlyphList`].

From whichever of these three sources succeeds, it will then attempt to associate each byte index
with a [`Glyph`]:
```no_run
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
#     std::str::from_utf8(std::process::Command::new("kpsewhich")
#         .args(vec!(s)).output().expect("kpsewhich not found!")
#         .stdout.as_slice()).unwrap().trim().to_string()
# });
# let ls = store.get_glyphlist("cmmib10");
let zero = ls.get(0);
let twenty = ls.get(20);
let ninety_six = ls.get(96);
println!("0={}={}, 20={}={}, and 96={}={}",
    zero.name(),zero,
    twenty.name(),twenty,
    ninety_six.name(),ninety_six
);
```
```text
0=Gamma=Γ, 20=kappa=κ, and 96=lscript=ℓ
```
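
The fallback order described above (`.enc` first, then `.pfa`/`.pfb`, then `.vf`, and finally an
empty list) can be pictured as a chain of lazily evaluated attempts. This is merely an illustrative
sketch, not the crate's actual implementation: `Table`, `resolve_table` and the three parser
closures are hypothetical stand-ins.

```rust
/// Hypothetical stand-in for a glyph table: 256 optional glyph names.
type Table = Vec<Option<String>>;

/// Tries each source in order and returns the first table that could be
/// produced. Later parsers only run if all earlier ones returned `None`;
/// if every attempt fails, the result is a table of 256 undefined entries.
fn resolve_table(
    from_enc: impl FnOnce() -> Option<Table>,
    from_type1: impl FnOnce() -> Option<Table>,
    from_vf: impl FnOnce() -> Option<Table>,
) -> Table {
    from_enc()
        .or_else(from_type1)
        .or_else(from_vf)
        .unwrap_or_else(|| vec![None; 256])
}
```

Passing the parsers as closures keeps the chain lazy, so an expensive `.vf` parse would be skipped
whenever an earlier source already succeeded.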

### Font Modifiers

So far, so good, but these glyphs are neither bold nor italic, whereas in `cmmib10` they are.
So let's check out what properties `cmmib10` has:
```
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
#     std::str::from_utf8(std::process::Command::new("kpsewhich")
#         .args(vec!(s)).output().expect("kpsewhich not found!")
#         .stdout.as_slice()).unwrap().trim().to_string()
# });
let font_info = store.get_info("cmmib10").unwrap();
println!("{:?}",font_info.styles);
println!("{:?}",font_info.weblink);
```
```text
ModifierSeq { blackboard: false, fraktur: false, script: false, bold: true, capitals: false, monospaced: false, italic: true, oblique: false, sans_serif: false }
Some(("Latin Modern Math", "https://fonts.cdnfonts.com/css/latin-modern-math"))
```
...so this tells us that the font is bold and italic, but not sans-serif, monospaced, etc.
It also tells us that the publicly available web-compatible equivalent
of this font is called "Latin Modern Math" and that we can find it at the provided
URL, if we want to use it in e.g. HTML :)

Now we only need to apply the modifiers to the glyphs:
```
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
#     std::str::from_utf8(std::process::Command::new("kpsewhich")
#         .args(vec!(s)).output().expect("kpsewhich not found!")
#         .stdout.as_slice()).unwrap().trim().to_string()
# });
# let ls = store.get_glyphlist("cmmib10");
# let zero = ls.get(0);
# let twenty = ls.get(20);
# let ninety_six = ls.get(96);
# let font_info = store.get_info("cmmib10").unwrap();
use tex_glyphs::fontstyles::FontModifiable;
println!("{}, {}, and {}",
    zero.to_string().apply(font_info.styles),
    twenty.to_string().apply(font_info.styles),
    ninety_six.to_string().apply(font_info.styles)
);
```
```text
𝜞, 𝜿, and ℓ
```

The [`apply`](fontstyles::FontModifiable::apply)-method stems
from the trait [`FontModifiable`](fontstyles::FontModifiable), which is implemented
for any type that implements `AsRef<str>`, including `&str` and `String`.
It also provides more direct methods, e.g. [`make_bold`](fontstyles::FontModifiable::make_bold),
[`make_italic`](fontstyles::FontModifiable::make_italic), [`make_sans`](fontstyles::FontModifiable::make_sans), etc.
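
To illustrate how a blanket implementation over `AsRef<str>` makes such methods available on both
`&str` and `String`, here is a minimal sketch. The trait `MathBold` and its method `to_math_bold`
are hypothetical and are not this crate's `FontModifiable`; the sketch only handles ASCII letters
and digits, which it maps into Unicode's Mathematical Bold range, and leaves everything else as is.

```rust
/// Hypothetical trait (not this crate's `FontModifiable`): maps ASCII letters
/// and digits to their Unicode "Mathematical Bold" counterparts.
trait MathBold {
    fn to_math_bold(&self) -> String;
}

// Blanket implementation: any type viewable as `&str` gets the method.
impl<S: AsRef<str>> MathBold for S {
    fn to_math_bold(&self) -> String {
        self.as_ref()
            .chars()
            .map(|c| {
                let code = match c {
                    // U+1D400 is MATHEMATICAL BOLD CAPITAL A
                    'A'..='Z' => 0x1D400 + (c as u32 - 'A' as u32),
                    // U+1D41A is MATHEMATICAL BOLD SMALL A
                    'a'..='z' => 0x1D41A + (c as u32 - 'a' as u32),
                    // U+1D7CE is MATHEMATICAL BOLD DIGIT ZERO
                    '0'..='9' => 0x1D7CE + (c as u32 - '0' as u32),
                    // Anything else is passed through unchanged.
                    _ => return c,
                };
                char::from_u32(code).unwrap_or(c)
            })
            .collect()
    }
}
```

With this trait in scope, both `"test".to_math_bold()` and `String::from("test").to_math_bold()`
compile, since `&str` and `String` both implement `AsRef<str>`.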

# Fixing Mistakes
The procedure above for determining glyphs and font modifiers is certainly not perfect; not just
because `.enc` and `.pfa`/`.pfb` files might contain wrong or unknown glyph names, but also because
font modifiers are determined heuristically. For that reason, we provide a way to fix mistakes:
1. The map from glyph names to Unicode is stored in the file [glyphs.map](https://github.com/Jazzpirate/RusTeX/blob/main/tex-glyphs/src/resources/glyphs.map)
2. Font modifiers, web font names and links, or even full glyph lists can be added
   to the markdown file [patches.md](https://github.com/Jazzpirate/RusTeX/blob/main/tex-glyphs/src/resources/patches.md),
   which additionally serves as a how-to guide for patching any mistakes you might find.

Both files are parsed *during compilation*.

If you notice any mistakes, feel free to open a pull request for these files.
*/
#![allow(text_direction_codepoint_in_literal)]
#![warn(missing_docs)]

pub mod encodings;
pub mod fontstyles;
pub mod glyphs;
mod parsing;

pub use crate::glyphs::{Combinator, Glyph, GlyphList};
pub use encodings::FontInfoStore;

include!(concat!(env!("OUT_DIR"), "/codegen.rs"));

#[cfg(test)]
mod tests {
    use super::fontstyles::{FontModifiable, FontModifier};
    use super::*;
    use crate::encodings::FontInfoStore;
    #[test]
    fn test_glyphmap() {
        assert_eq!(Glyph::get("AEacute").to_string(), "Ǽ");
        assert_eq!(Glyph::get("contourintegral").to_string(), "∮");
        assert_eq!(Glyph::get("bulletinverse").to_string(), "◘");
        assert_eq!(Glyph::get("Gangiacoptic").to_string(), "Ϫ");
        assert_eq!(Glyph::get("zukatakana").to_string(), "ズ");
        assert_eq!("test".make_bold().to_string(), "𝐭𝐞𝐬𝐭");
        assert_eq!("test".make_bold().make_sans().to_string(), "𝘁𝗲𝘀𝘁");
        assert_eq!(
            "test"
                .apply_modifiers(&[FontModifier::SansSerif, FontModifier::Bold])
                .to_string(),
            "𝘁𝗲𝘀𝘁"
        );
    }
    fn get_store() -> FontInfoStore<String, fn(&str) -> String> {
        FontInfoStore::new(|s| {
            std::str::from_utf8(
                std::process::Command::new("kpsewhich")
                    .args(vec![s])
                    .output()
                    .expect("kpsewhich not found!")
                    .stdout
                    .as_slice(),
            )
            .expect("unexpected kpsewhich output")
            .trim()
            .to_string()
        })
    }

    #[test]
    fn test_encodings() {
        let mut es = get_store();
        let names = es
            .all_encs()
            .take(50)
            .map(|e| e.tfm_name.clone())
            .collect::<Vec<_>>();
        for n in names {
            es.get_glyphlist(n);
        }
    }
    #[test]
    fn print_table() {
        env_logger::builder()
            .filter_level(log::LevelFilter::Debug)
            .try_init()
            .expect("failed to initialize tests");
        let mut es = get_store();
        log::info!(
            "cmr10:\n{}",
            es.display_encoding("cmr10").expect("cmr10 not found")
        );
        log::info!(
            "cmbx10:\n{}",
            es.display_encoding("cmbx10").expect("cmbx10 not found")
        );
        log::info!(
            "wasy10:\n{}",
            es.display_encoding("wasy10").expect("wasy10 not found")
        );
        /*
        log::info!("ptmr7t:\n{}",es.display_encoding("ptmr7t").unwrap());
        log::info!("ecrm1095:\n{}",es.display_encoding("ecrm1095").unwrap());
        log::info!("ec-lmr10:\n{}",es.display_encoding("ec-lmr10").unwrap());
        log::info!("jkpbitc:\n{}",es.display_encoding("jkpbitc").unwrap());
        log::info!("ot1-stix2textsc:\n{}",es.display_encoding("ot1-stix2textsc").unwrap());
        log::info!("stix-mathbbit-bold:\n{}",es.display_encoding("stix-mathbbit-bold").unwrap());
        log::info!("MnSymbolE10:\n{}",es.display_encoding("MnSymbolE10").unwrap());
         */
    }
    /*
    #[test]
    fn vfs() {
        env_logger::builder().filter_level(log::LevelFilter::Debug).try_init().unwrap();
        use tex_engine::engine::filesystem::kpathsea::*;
        let mut store = encodings::EncodingStore::new(|s| {
            match KPATHSEA.which(s).map(|s| s.to_str().map(|s| s.to_string())).flatten() {
                Some(s) => s,
                _ => "".into()
            }
        });
        let vfs = &KPATHSEA.post.clone();
        for v in vfs.values() {
            match v.extension() {
                Some(e) if e == "vf" => {
                    let name = v.file_stem().unwrap().to_str().unwrap();
                    log::info!("{}",v.display());
                    match store.display_encoding(name) {
                        Some(s) => log::info!("{}",s),
                        None => log::info!("Failed!")
                    }
                    print!("");
                }
                _ => ()
            }
        }
    }

     */
}