unicode_linebreak

Function split_at_safe

pub fn split_at_safe(s: &str) -> (&str, &str)

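Splits the string at the last index from which later break opportunities do not depend on the preceding text, returning the text before that index and the text from it onward.

The trivial index at the end of the text is excluded.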

A common optimization is to compute only the nearest line break opportunity before the first character that would make the line overfull. Finding it requires traversing backward, for which there are two approaches:

  • Cache the breaks found during forward traversals
  • Repeatedly step backward and use split_at_safe to find a position from which it is safe to search forward
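
As a rough illustration of the first approach, the sketch below looks up the nearest cached break at or before a byte offset. The function name nearest_break_at_or_before and the limit parameter are hypothetical and not part of this crate. It assumes the break offsets were collected in ascending order during a forward pass, for example from linebreaks.

```rust
/// Returns the largest cached break offset that is `<= limit`, if any.
///
/// `breaks` is assumed to hold byte offsets in ascending order, gathered in
/// one forward traversal. With this crate that could be (assumption):
///     let breaks: Vec<usize> = linebreaks(s).map(|(i, _)| i).collect();
fn nearest_break_at_or_before(breaks: &[usize], limit: usize) -> Option<usize> {
    // `partition_point` does a binary search and returns how many leading
    // elements satisfy the predicate, i.e. the count of offsets <= limit.
    let count = breaks.partition_point(|&b| b <= limit);
    // No offset qualifies when the count is zero.
    count.checked_sub(1).map(|i| breaks[i])
}
```

Here limit would be the byte offset of the first character that overflows the line. The lookup is a binary search, so the backward traversal is avoided at the cost of keeping the offsets in memory.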

Examples

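use unicode_linebreak::{linebreaks, split_at_safe};

let s = "Not allowed to break within em dashes: — —";
let (prev, safe) = split_at_safe(s);
let n = prev.len();
// Breaks computed on `safe` alone must equal the breaks of `s` from offset `n` onward, shifted down by `n`.
let expected = linebreaks(s).filter_map(|(i, x)| i.checked_sub(n).map(|i| (i, x)));
assert!(linebreaks(safe).eq(expected));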