Awk to take lines from Plan 9’s /lib/unicode and prepend the actual glyph and a tab: awk '{cmd = sprintf("unicode %s", $1); cmd | getline c; close(cmd); printf("%s\t%s\n", c, $0)}'
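For comparison, here is a rough Rust sketch of the same idea that skips the external unicode command. It assumes each line starts with a hexadecimal code point followed by whitespace and the character name (my reading of the /lib/unicode format, not something stated above); prepend_glyph is a made-up helper name.

```rust
/// Prepend the glyph and a tab to a line whose first field is a hex code point.
/// Assumption: lines look like "00e9<TAB>latin small letter e acute".
/// Returns None when the first field is missing, is not hex, or is not a valid
/// Unicode scalar value (for example a surrogate).
fn prepend_glyph(line: &str) -> Option<String> {
    let hex = line.split_whitespace().next()?;
    let code = u32::from_str_radix(hex, 16).ok()?;
    let glyph = char::from_u32(code)?;
    Some(format!("{}\t{}", glyph, line))
}

fn main() {
    // Sample lines made up for illustration, not copied from the real file.
    let samples = ["0041\tlatin capital letter a", "00e9\tlatin small letter e acute"];
    for line in samples.iter() {
        match prepend_glyph(line) {
            Some(out) => println!("{}", out),
            None => eprintln!("skipped: {}", line),
        }
    }
}
```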
Unicode characters in iskeyword
Tag proposal: typesetting
For stories relating to: how text is laid out an
Discussion: should it include e.g. stories about Pango? Unicode (partially relevant to typesetting/typography I suppose)? These are a bit further from “typesetting” but are relevant to typography and encoding? Should it include stories about markup languages?
See searches for e.g. [LaTeX](https://lobste.rs/search …
How to display non-printable unicode characters?
@lyse@lyse.isobeef.org The underlines are a bit much, yes. It appears to be related to my font (Helvetica) … Maybe they do some Unicode trickery these days, I don’t know. 🫤
fn sub(foo: &String) {
    println!("We got this string: [{}]", foo);
}

fn main() {
    // "Hello", 0x00, 0x00, "!"
    let buf: [u8; 8] = [0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x00, 0x00, 0x21];
    // Create a string from the byte array above, interpret as UTF-8, ignore decoding errors.
    let lossy_unicode = String::from_utf8_lossy(&buf).to_string();
    sub(&lossy_unicode);
}
Create a string from a byte array, but the result isn’t a string, it’s a cow 🐮, so you need another to_string() to convert your “string” into a string.
- https://doc.rust-lang.org/std/string/struct.String.html#method.from_utf8_lossy
- https://doc.rust-lang.org/std/borrow/enum.Cow.html
I still have a lot to learn.
(into_owned() instead of to_string() also works and makes more sense to me, it’s just that the compiler suggested to_string() first, which led to this funny example.)
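A small sketch of why the cow shows up at all, as far as I understand it: from_utf8_lossy only allocates a new String when it has to replace invalid bytes, and otherwise just borrows the input. The helper names decode_lossy and describe are made up for this example.

```rust
use std::borrow::Cow;

/// Decode bytes as UTF-8, replacing invalid sequences with U+FFFD,
/// and always hand back an owned String.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Report which Cow variant from_utf8_lossy produced for this input.
fn describe(bytes: &[u8]) -> &'static str {
    match String::from_utf8_lossy(bytes) {
        Cow::Borrowed(_) => "borrowed",
        Cow::Owned(_) => "owned",
    }
}

fn main() {
    // NUL bytes are valid UTF-8, so nothing needs replacing here.
    let valid: [u8; 8] = [0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x00, 0x00, 0x21];
    // 0xFF can never appear in UTF-8, so this input forces a replacement.
    let invalid: [u8; 3] = [0x48, 0x69, 0xFF];
    println!("{} -> {:?}", describe(&valid), decode_lossy(&valid));
    println!("{} -> {:?}", describe(&invalid), decode_lossy(&invalid));
}
```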
(Why is there no bass emoji in Unicode? Pah.)
Trinity Desktop Environment R14.1.4 released
The Trinity Desktop Environment, the modern-day continuation of the KDE 3.x series, has released version R14.1.4. This maintenance release brings new vector wallpapers and colour schemes, support for Unicode surrogate characters and planes above zero (for emoji, among other things), tabs in kpdf, transparency and other new visual effects for Dekorator, and much more. TDE R14.1.4 is already available for a variety of Linux distributions, and c …
Understanding surrogate pairs: why some Windows filenames can’t be read
Windows was an early adopter of Unicode, and its file APIs have used UTF‑16 internally since Windows 2000 (they used UCS‑2 in the Windows 95 era, when the Unicode standard was only a draft on paper, but that’s another topic). Using UTF‑16 means that filenames, text strings, and other data are stored as sequences of 16‑bit units. For Windows, a properly formed surrogate pair is perfectly acceptable. However …
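The excerpt is cut off here, so the following is only a guess at where it was heading: a sketch, assuming the trouble comes from unpaired surrogates, of how a strict UTF‑16 decoder accepts a proper pair but rejects a lone half. The helper names utf16_strict and utf16_lossy are made up for illustration.

```rust
/// Strict decoding: fails if the input contains an unpaired surrogate.
fn utf16_strict(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}

/// Lossy decoding: unpaired surrogates become U+FFFD.
fn utf16_lossy(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}

fn main() {
    // "a" followed by U+1F600, which UTF-16 encodes as the pair D83D DE00.
    let paired: [u16; 3] = [0x0061, 0xD83D, 0xDE00];
    // "a" followed by a high surrogate with no low surrogate after it.
    let lone: [u16; 2] = [0x0061, 0xD83D];

    println!("paired strict: {:?}", utf16_strict(&paired));
    println!("lone strict:   {:?}", utf16_strict(&lone));
    println!("lone lossy:    {:?}", utf16_lossy(&lone));
}
```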
@bender@twtxt.net @prologic@twtxt.net I’m not exactly asking yarnd to change. If you are okay with the way it displayed my twts, then by all means, leave it as is. I hope you won’t mind if I continue to write things like 1/4 to mean “first out of four”.
What has text/markdown got to do with this? I don’t think Markdown says anything about replacing 1/4 with ¼, or other similar transformations. It’s not needed, because ¼ is already a unicode character that can simply be directly inserted into the text file.
What’s wrong with my original suggestion of doing the transformation before the text hits the twtxt.txt file? @prologic@twtxt.net, I think it would achieve what you are trying to achieve with this content-type thing: if someone writes 1/4 on a yarnd instance or any other client that wants to do this, it would get transformed, and other clients simply wouldn’t do the transformation. Every client that supports displaying unicode characters, including Jenny, would then display ¼ as ¼.
Alternatively, if you prefer yarnd to pretty-print all twts nicely, even ones from simpler clients, that’s fine too and you don’t need to change anything. My 1/4 -> ¼ thing is nothing more than a minor irritation which probably isn’t worth overthinking.
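A minimal sketch of what “transform before it hits the twtxt.txt file” could look like on the writing client. This is not how yarnd does it, just an illustration; prettify_fractions is a made-up name, and it only touches space-separated words that are exactly one of three fractions.

```rust
/// Replace a few standalone ASCII fractions with their Unicode characters.
/// Only words that match exactly are changed, so "11/42" or "1/4," stay as typed.
fn prettify_fractions(text: &str) -> String {
    text.split(' ')
        .map(|word| match word {
            "1/4" => "¼",
            "1/2" => "½",
            "3/4" => "¾",
            other => other,
        })
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    // A client that wants the pretty form runs this once before appending the twt.
    println!("{}", prettify_fractions("part 1/4 of the series"));
}
```

A client that wants 1/4 to keep meaning “first out of four” would simply not call it.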
Unicode doesn’t distinguish between a dollar sign with one and a dollar sign with two strokes, which makes me sad.
Account Problems
[Emacs] Setting a dedicated font for math symbols
Weird Unicode Math Symbols
someday i will descend upon the unicode consortium and add sub/superscript version of the whole latin alphabet
@lyse@lyse.isobeef.org What the heck? no emoji? do you even Unicode!
@prologic@twtxt.net Yeah like normally I’m just a little annoyed and just say “whatever” and shrug it off, but come on I am searching for emojis here. Do you really need to harvest my user data for what is essentially a fuzzy search in the Unicode table?
@prologic@twtxt.net lol. just testing some Unicode.
On the blog: Where Have All the Emoji Gone? https://john.colagioia.net/blog/2021/09/29/emoji.html #programming #techtips #unicode #blog
https://metacpan.org/release/WOLFSAGE/perl-5.35.4/changes#Unicode-14.0-is-supported As of Perl 5.35.4, the supported Unicode version has moved up to 14.0.0.
@prologic@twtxt.net should we enable all unicode glyphs for tags? https://txt.sour.is/conv/55yrura
I wrote a ‘banner’-like program for Plan 9 (and p9p) that uses the Unicode box drawing characters: http://txtpunk.com/banner/index.html
https://www.materialui.co/unicode-characters design unicode web
huh, txtnish seems to have problems with linebreaks & unicode.
Accented and other unicode characters in groff/troff
Teletext graphics characters among those added to Unicode – Teletext Art http://teletextart.co.uk/teletext-graphics-characters-among-those-added-to-unicode
Because of the use of ‘rune’ to refer to unicode codepoints in go, a futhark transliteration program might have somewhat confusing source…
A Spectre is Haunting Unicode https://www.dampfkraft.com/ghost-characters.html
unum - Interconvert numbers, Unicode, and HTML/XHTML characters http://www.fourmilab.ch/webtools/unum/
Does unicode also work? 💚☎
All fonts I tried are either ugly or the unicode glyphs are so small that they become unreadable. #fonts
The pretty format is very similar to twtxt without the unicode glyphs and the relative date.