/*! This crate provides a way to access glyphs from TeX fonts. It is intended to be used by
crates using [`tex_engine`](https://crates.io/crates/tex_engine).

TeX deals with fonts by parsing *font metric files* (`.tfm` files), which contain information
about the dimensions of each glyph in the font. So from the point of view of (the core of) TeX,
a *glyph* is just an index $0 \leq i \leq 255$ into the font metric file.

In order to find out what the glyph actually looks like, we ideally want to know the corresponding
Unicode codepoint. This crate attempts to do exactly that.

# Usage

This crate attempts to associate a TeX font (identified by the file name stem of its `.tfm` file) with:
1. A list of [`FontModifier`](fontstyles::FontModifier)s (e.g. bold, italic, sans-serif)
2. A [`GlyphList`], being an array `[`[`Glyph`]`;256]`

A [`Glyph`] then is either undefined (i.e. the glyph is not present in the font, or the crate couldn't
figure out what exactly it is) or presentable as a string.

Consider e.g. `\mathbf{\mathit{\Gamma^\kappa_\ell}}` (i.e. $\mathbf{\mathit{\Gamma^\kappa_\ell}}$).
From the point of view of TeX, this is a sequence of 3 glyphs, represented as indices into the font
`cmmib10`, namely 0, 20, and 96.

Here's how to use this crate to obtain the corresponding Unicode characters, i.e. `𝜞`, `𝜿` and `ℓ`:

### Instantiation

First, we instantiate a [`FontInfoStore`](encodings::FontInfoStore) with a function that
allows it to find files. This function should take a string (e.g. `cmmib10.tfm`) and return a string
(e.g. `/usr/share/texmf-dist/fonts/tfm/public/cm/cmmib10.tfm`). This could be done by calling `kpsewhich`,
for example, but repeated and frequent calls to `kpsewhich` are slow, so more efficient alternatives
are recommended.

```no_run
use tex_glyphs::encodings::FontInfoStore;
// Resolve a file name to its full path by shelling out to `kpsewhich`.
let mut store = FontInfoStore::new(|s| {
    std::str::from_utf8(std::process::Command::new("kpsewhich")
        .args(vec!(s)).output().expect("kpsewhich not found!")
        .stdout.as_slice()).unwrap().trim().to_string()
});
```
This store will now use the provided function to find your `pdftex.map` file, which lists
all the fonts that are available to TeX and associates them with `.enc`, `.pfa` and `.pfb` files.
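
Since repeated calls to `kpsewhich` are slow, one possible alternative is to remember every
answer. The following is only a minimal sketch of that idea, using nothing but the standard library:
`CachedLookup` is a hypothetical helper and not part of this crate, and the wrapped function is
assumed to have the same "file name in, path out" shape as the closure above.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of this crate): wraps a slow lookup function
/// and remembers every answer, so each file name is resolved at most once.
pub struct CachedLookup<F: FnMut(&str) -> String> {
    lookup: F,
    cache: HashMap<String, String>,
}

impl<F: FnMut(&str) -> String> CachedLookup<F> {
    /// Wraps `lookup`; nothing is resolved until `get` is called.
    pub fn new(lookup: F) -> Self {
        Self { lookup, cache: HashMap::new() }
    }

    /// Returns the path for `name`, calling the wrapped function only on a cache miss.
    pub fn get(&mut self, name: &str) -> String {
        if let Some(path) = self.cache.get(name) {
            return path.clone();
        }
        let path = (self.lookup)(name);
        self.cache.insert(name.to_string(), path.clone());
        path
    }
}
```

Whether such a wrapper can be handed to `FontInfoStore::new` directly depends on the closure
bounds that constructor accepts, which are not shown here.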

### Obtaining Glyphs

If we now query the store for the [`GlyphList`] of some font, e.g. `cmmib10`, like so:
```no_run
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
#     std::str::from_utf8(std::process::Command::new("kpsewhich")
#         .args(vec!(s)).output().expect("kpsewhich not found!")
#         .stdout.as_slice()).unwrap().trim().to_string()
# });
let ls = store.get_glyphlist("cmmib10");
```
...it will attempt to parse the `.enc` file associated with `cmmib10`, if one exists. If there is none, or if
parsing fails, it will try to parse the `.pfa` or `.pfb` file. If neither works, it will search for a `.vf` file
and try to parse that. If that fails too, it will return an empty [`GlyphList`].
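
The fallback order described above can be sketched with `Option::or_else`. This is an
illustration only, not the crate's actual code: `load_glyph_names` and the `GlyphNames` alias are
hypothetical, and the three closures stand in for the `.enc`, `.pfa`/`.pfb` and `.vf` parsers.

```rust
/// Hypothetical stand-in for a glyph list: just the glyph names.
type GlyphNames = Vec<String>;

/// Tries the three sources in order and returns the first result that parses.
/// Later parsers only run if all earlier ones returned `None`; if every source
/// fails, the result is an empty list.
fn load_glyph_names(
    enc: impl FnOnce() -> Option<GlyphNames>,
    type1: impl FnOnce() -> Option<GlyphNames>,
    vf: impl FnOnce() -> Option<GlyphNames>,
) -> GlyphNames {
    enc().or_else(type1).or_else(vf).unwrap_or_default()
}
```

Passing closures rather than already-parsed values keeps the chain lazy, so a font with a usable
`.enc` file never triggers the more expensive fallbacks.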

From whichever of these three sources succeeds, it will then attempt to associate each byte index with a
[`Glyph`]:
```no_run
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
#     std::str::from_utf8(std::process::Command::new("kpsewhich")
#         .args(vec!(s)).output().expect("kpsewhich not found!")
#         .stdout.as_slice()).unwrap().trim().to_string()
# });
# let ls = store.get_glyphlist("cmmib10");
let zero = ls.get(0);
let twenty = ls.get(20);
let ninety_six = ls.get(96);
println!("0={}={}, 20={}={}, and 96={}={}",
    zero.name(),zero,
    twenty.name(),twenty,
    ninety_six.name(),ninety_six
);
```
```text
0=Gamma=Γ, 20=kappa=κ, and 96=lscript=ℓ
```

### Font Modifiers

So far, so good - but the glyphs we obtained are neither bold nor italic, whereas in `cmmib10` they are both.
So let's check out what properties `cmmib10` has:
```
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
#     std::str::from_utf8(std::process::Command::new("kpsewhich")
#         .args(vec!(s)).output().expect("kpsewhich not found!")
#         .stdout.as_slice()).unwrap().trim().to_string()
# });
let font_info = store.get_info("cmmib10").unwrap();
println!("{:?}",font_info.styles);
println!("{:?}",font_info.weblink);
```
```text
ModifierSeq { blackboard: false, fraktur: false, script: false, bold: true, capitals: false, monospaced: false, italic: true, oblique: false, sans_serif: false }
Some(("Latin Modern Math", "https://fonts.cdnfonts.com/css/latin-modern-math"))
```
...so this tells us that the font is bold and italic, but not sans-serif, monospaced, etc.
It also tells us that the publicly available web-compatible equivalent
of this font is called "Latin Modern Math" and that we can find it at the provided
URL, if we want to use it in e.g. HTML :)

Now we only need to apply the modifiers to the glyphs:
```
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
#     std::str::from_utf8(std::process::Command::new("kpsewhich")
#         .args(vec!(s)).output().expect("kpsewhich not found!")
#         .stdout.as_slice()).unwrap().trim().to_string()
# });
# let ls = store.get_glyphlist("cmmib10");
# let zero = ls.get(0);
# let twenty = ls.get(20);
# let ninety_six = ls.get(96);
# let font_info = store.get_info("cmmib10").unwrap();
use tex_glyphs::fontstyles::FontModifiable;
println!("{}, {}, and {}",
    zero.to_string().apply(font_info.styles),
    twenty.to_string().apply(font_info.styles),
    ninety_six.to_string().apply(font_info.styles)
);
```
```text
𝜞, 𝜿, and ℓ
```

The [`apply`](fontstyles::FontModifiable::apply) method comes
from the trait [`FontModifiable`](fontstyles::FontModifiable), which is implemented
for any type that implements `AsRef<str>`, including `&str` and `String`.
It also provides more direct methods, e.g. [`make_bold`](fontstyles::FontModifiable::make_bold),
[`make_italic`](fontstyles::FontModifiable::make_italic), [`make_sans`](fontstyles::FontModifiable::make_sans), etc.
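
To give an idea of what such a modifier amounts to, here is a minimal sketch that maps ASCII
letters into the Unicode "Mathematical Bold" range (capitals start at U+1D400, small letters at
U+1D41A). `bold_sketch` is a hypothetical function written for this illustration; it is not how
this crate implements `make_bold`, and it ignores digits, Greek letters and combinations of
modifiers.

```rust
/// Hypothetical sketch (not the crate's implementation): maps ASCII letters to
/// the Unicode "Mathematical Bold" letters and leaves all other characters alone.
fn bold_sketch(s: &str) -> String {
    s.chars()
        .map(|c| {
            let code = match c {
                // Mathematical Bold Capital A is U+1D400.
                'A'..='Z' => 0x1D400 + (c as u32 - 'A' as u32),
                // Mathematical Bold Small A is U+1D41A.
                'a'..='z' => 0x1D41A + (c as u32 - 'a' as u32),
                // Everything else is passed through unchanged.
                _ => return c,
            };
            char::from_u32(code).unwrap_or(c)
        })
        .collect()
}
```

With this sketch, `bold_sketch("test")` yields `𝐭𝐞𝐬𝐭`, the same string the crate's `make_bold`
is expected to produce for that input.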

# Fixing Mistakes
The procedure above for determining glyphs and font modifiers is certainly not perfect; not just
because `enc` and `pfa`/`pfb` files might contain wrong or unknown glyph names, but also because
font modifiers are determined heuristically. For that reason, we provide a way to fix mistakes:
1. The map from glyph names to Unicode is stored in the file [glyphs.map](https://github.com/Jazzpirate/RusTeX/blob/main/tex-glyphs/src/resources/glyphs.map)
2. Font modifiers, web font names and links, or even full glyph lists can be added
   to the markdown file [patches.md](https://github.com/Jazzpirate/RusTeX/blob/main/tex-glyphs/src/resources/patches.md),
   which additionally serves as a how-to guide for patching any mistakes you might find.

Both files are parsed *during compilation*.
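
Parsing at compile time is commonly done in a Cargo build script that turns a resource file into
Rust source. The following sketch shows only the text-to-source step. Everything in it is an
assumption made for illustration: `generate_table` and `GLYPH_TABLE` are hypothetical names, and the
two-column `name hex-codepoint` input format is invented here and need not match the real
`glyphs.map`.

```rust
/// Turns lines of the (assumed) form `name hex-codepoint` into the source text
/// of a Rust constant. Lines that do not fit this shape are skipped.
fn generate_table(map_src: &str) -> String {
    let mut out = String::from("pub const GLYPH_TABLE: &[(&str, char)] = &[\n");
    for line in map_src.lines() {
        let mut parts = line.split_whitespace();
        if let (Some(name), Some(hex)) = (parts.next(), parts.next()) {
            // Only emit an entry if the second column is a valid Unicode scalar value.
            if let Some(c) = u32::from_str_radix(hex, 16).ok().and_then(char::from_u32) {
                out.push_str(&format!("    ({:?}, '\\u{{{:x}}}'),\n", name, c as u32));
            }
        }
    }
    out.push_str("];\n");
    out
}
```

A build script would then write the returned string to a file inside the directory named by the
`OUT_DIR` environment variable, from where the crate can pull it in with `include!`.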

If you notice any mistakes, feel free to open a pull request for these files.
*/
#![allow(text_direction_codepoint_in_literal)]
#![warn(missing_docs)]

pub mod encodings;
pub mod fontstyles;
pub mod glyphs;
mod parsing;

pub use crate::glyphs::{Combinator, Glyph, GlyphList};
pub use encodings::FontInfoStore;

include!(concat!(env!("OUT_DIR"), "/codegen.rs"));

#[cfg(test)]
mod tests {
    use super::fontstyles::{FontModifiable, FontModifier};
    use super::*;
    use crate::encodings::FontInfoStore;
    #[test]
    fn test_glyphmap() {
        assert_eq!(Glyph::get("AEacute").to_string(), "Ǽ");
        assert_eq!(Glyph::get("contourintegral").to_string(), "∮");
        assert_eq!(Glyph::get("bulletinverse").to_string(), "◘");
        assert_eq!(Glyph::get("Gangiacoptic").to_string(), "Ϫ");
        assert_eq!(Glyph::get("zukatakana").to_string(), "ズ");
        assert_eq!("test".make_bold().to_string(), "𝐭𝐞𝐬𝐭");
        assert_eq!("test".make_bold().make_sans().to_string(), "𝘁𝗲𝘀𝘁");
        assert_eq!(
            "test"
                .apply_modifiers(&[FontModifier::SansSerif, FontModifier::Bold])
                .to_string(),
            "𝘁𝗲𝘀𝘁"
        );
    }
    fn get_store() -> FontInfoStore<String, fn(&str) -> String> {
        FontInfoStore::new(|s| {
            std::str::from_utf8(
                std::process::Command::new("kpsewhich")
                    .args(vec![s])
                    .output()
                    .expect("kpsewhich not found!")
                    .stdout
                    .as_slice(),
            )
            .expect("unexpected kpsewhich output")
            .trim()
            .to_string()
        })
    }

    #[test]
    fn test_encodings() {
        let mut es = get_store();
        let names = es
            .all_encs()
            .take(50)
            .map(|e| e.tfm_name.clone())
            .collect::<Vec<_>>();
        for n in names {
            es.get_glyphlist(n);
        }
    }
    #[test]
    fn print_table() {
        env_logger::builder()
            .filter_level(log::LevelFilter::Debug)
            .try_init()
            .expect("failed to initialize tests");
        let mut es = get_store();
        log::info!(
            "cmr10:\n{}",
            es.display_encoding("cmr10").expect("cmr10 not found")
        );
        log::info!(
            "cmbx10:\n{}",
            es.display_encoding("cmbx10").expect("cmbx10 not found")
        );
        log::info!(
            "wasy10:\n{}",
            es.display_encoding("wasy10").expect("wasy10 not found")
        );
        /*
        log::info!("ptmr7t:\n{}",es.display_encoding("ptmr7t").unwrap());
        log::info!("ecrm1095:\n{}",es.display_encoding("ecrm1095").unwrap());
        log::info!("ec-lmr10:\n{}",es.display_encoding("ec-lmr10").unwrap());
        log::info!("jkpbitc:\n{}",es.display_encoding("jkpbitc").unwrap());
        log::info!("ot1-stix2textsc:\n{}",es.display_encoding("ot1-stix2textsc").unwrap());
        log::info!("stix-mathbbit-bold:\n{}",es.display_encoding("stix-mathbbit-bold").unwrap());
        log::info!("MnSymbolE10:\n{}",es.display_encoding("MnSymbolE10").unwrap());
         */
    }
    /*
       #[test]
       fn vfs() {
           env_logger::builder().filter_level(log::LevelFilter::Debug).try_init().unwrap();
           use tex_engine::engine::filesystem::kpathsea::*;
           let mut store = encodings::EncodingStore::new(|s| {
               match KPATHSEA.which(s).map(|s| s.to_str().map(|s| s.to_string())).flatten() {
                   Some(s) => s,
                   _ => "".into()
               }
           });
           let vfs = &KPATHSEA.post.clone();
           for v in vfs.values() {
               match v.extension() {
                   Some(e) if e == "vf" => {
                       let name = v.file_stem().unwrap().to_str().unwrap();
                       log::info!("{}",v.display());
                       match store.display_encoding(name) {
                           Some(s) => log::info!("{}",s),
                           None => log::info!("Failed!")
                       }
                       print!("");
                   }
                   _ => ()
               }
           }
       }

    */
}