Just typing twts directly into my twtxt file.
Details:
- Opening my twtxt file remotely using
vim scp://user@remote:port//path/to/twtxt.txt
- Inserting the date, time and tab part of the twt with
:.!echo "$(date -Is)\t"
- In case I need to add a new line I just
Ctrl+Shift+u, type in the2028and hitEnter
- In order to replay, you just steal a twt hash from your favorite Yarn instance.
It looks tedious, but itās fun to know I can twt no matter where I am, as long as can ssh in.
@zvava@twtxt.net My clients trusts the first url field it finds. If there is none, it uses the URL that Iām using for fetching the feed.
No validation, no logging.
In practice, Iāve not seen issues with people messing with this field. (What I do see, of course, is broken threads when people do legitimate edits that change the hash.)
I donāt see a way how anyone can impersonate anybody else this way. š¤ Sure, you could use my URL in your url field, but then what? You will still show up as zvava in my client or, if you also change your nick field, as movq (zvava).
@zvava@twtxt.net Yes, the specification defines the first url to be used for hashing. No matter if it points to a different feed or whatever. Just unsubscribe from malicious feeds and youāre done.
Since the first url is used for hashing, it must never change. Otherwise, it will break threading, as you already noticed. If your feed moves and you wanna keep the old messages in the same new feed, you still have to point to the old url location and keep that forever. But you can add more urls. As I said several times in the past, in hindsight, using the first url was a big mistake. It would have been much better, if the last encountered url were used for hashing onwards. This way, feed moves would be relatively straightforward. However, that ship has sailed. Luckily, feeds typically donāt relocate.
@movq@www.uninformativ.de You were seeing that mayn hash collisions for you to notice this? š±
The twtiverse appears to have shrunk. Among the 61 feeds that I follow, I donāt see any hash collisions anymore. š¤
Exactly, @zvava@twtxt.net, I agree. (Although, in my client at least, I wouldnāt use hashes anywhere.)
@alexonit@twtxt.alessandrocutolo.it Yeah I think weāre overstating the UNIX principles a bit here 𤣠I get what youāre trying to say though @zvava@twtxt.net š If I could go back in time and do it all over again, I would have gotten the Hash length correct and I would have used SHA-256 instead. But someone way smarter than me designed the Twt Hash spec, we adopted it and well here we are today, it works⢠š
@zvava@twtxt.net Going to have to hard disagree here Iām sorry. a) no-one reads the raw/plain twtxt.txt files, the only time you do is to debug something, or have a stick beak at the comments which most clients will strip out and ignore and b) Iām sorry youāve completely lost me! Iām old enough to pre-date before Linux became popular, so Iām not sure what UNIX principles you think are being broken or violated by having a Twt Subject (Subject) whose contents is a cryptographic content-addressable hash of the āthingā⢠youāre replying to and forming a chain of other replies (a thread).
Iām sorry, but the simplest thing to do is to make the smallest number of changes to the Spec as possible and all agree on a āMagic Dateā for which our clients use the modified function(s).
@bender@twtxt.net Well honestly, this is just it. My strong position on this is quite simple:
Do the simplest thing that could work.
Itās one of the age old UNIX philosphies.
Therefore, the simplest thing⢠to do here is to just increase the hash length, mark a magic⢠date/time as @lyse@lyse.isobeef.org has indicated and call it a day. Weāll then be fine for a few hundred years, at which point thereāll be no-one left alive to give a shit⢠anyway š¤£
@prologic@twtxt.net considering other alternatives we have seeing (of which I have lost track already), yes. Why donāt you guys (client makers) take a step at a time and, for now, increase the hash length to deal with the collisions. Then location-based addressing can be added⦠or not, you know. š
Of course we still have to fix the hashing algorithm and length.
@lyse@lyse.isobeef.org I donāt think thereās any point in continuing the discussion of Location vs. Content based addressing.
I want us to preserve Content based addressing.
Letās improve the user experience and fix the hash commission problems.
@prologic@twtxt.net I know we wonāt ever convince each other of the otherās favorite addressing scheme. :-D But I wanna address (haha) your concerns:
I donāt see any difference between the two schemes regarding link rot and migration. If the URL changes, both approaches are equally terrible as the feed URL is part of the hashed value and reference of some sort in the location-based scheme. It doesnāt matter.
The same is true for duplication and forks. Even today, the ācannonical URLā has to be chosen to build the hash. Thatās exactly the same with location-based addressing. Why would a mirror only duplicate stuff with location- but not content-based addressing? I really fail to see that. Also, who is using mirrors or relays anyway? I donāt know of any such software to be honest.
If there is a spam feed, I just unfollow it. Done. Not a concern for me at all. Not the slightest bit. And the byte verification is THE source of all broken threads when the conversation start is edited. Yes, this can be viewed as a feature, but how many times was it actually a feature and not more behaving as an anti-feature in terms of user experience?
I donāt get your argument. If the feed in question is offline, one can simply look in local caches and see if there is a message at that particular time, just like looking up a hash. Whereās the difference? Except that the lookup key is longer or compound or whatever depending on the cache format.
Even a new hashing algorithm requires work on clients etc. Itās not that you get some backwards-compatibility for free. It just cannot be backwards-compatible in my opinion, no matter which approach we take. Thatās why I believe some magic time for the switch causes the least amount of trouble. You leave the old world untouched and working.
If these are general concerns, Iām completely with you. But I donāt think that they only apply to location-based addressing. Thatās how I interpreted your message. I could be wrong. Happy to read your explanations. :-)
Here is just a small list of things⢠that Iām aware will break, some quite badly, others in minor ways:
- Link rot & migrations: domain changes, path reshuffles, CDN/mirror use, or moving from txt ā jsonfeed will orphan replies unless every reader implements perfect 301/410 history, which they wonāt.
- Duplication & forks: mirrors/relays produce multiple valid locations for the same post; readers see several āparentsā and split the thread.
- Verification & spam-resistance: content addressing lets you dedupe and verify youāre pointing at exactly the post you meant (hash matches bytes). Location anchors can be replayed or spoofed more easily unless you add signing and canonicalization.
- Offline/cached reading: without the original URL being reachable, readers canāt resolve anchors; with hashes they can match against local caches/archives.
- Ecosystem churn: all existing clients, archives, and tools that assume content-derived IDs need migrations, mapping layers, and fallback logic. Expect long-lived threads to fracture across implementations.
Weāve been discussing the idea of changing the threading model from Content-based Addressing to Location-based addressing for years now. The problem is quite complex, but I feel I have to keep reminding yāall of the potential perils of changing this and the pros/cons of each model:
With content-addressed threading, a reply points at something thatās intrinsically identified (hash of author/feed URI + timestamp + content). That ID never changes as long as the content doesnāt. Switching to location-based anchors makes the reply target extrinsicāit now depends on where the post currently lives. In a pull-based, decentralised network, locations drift. The moment they do, thread identity fragments.
@zvava@twtxt.net There would be only one hash for a message. Some to be defined magic date selects which hash to use. If the message creation timestamp is before this epoch, hash it with v1, otherwise hammer it through v2. Eventually, support for v1 could be dropped as nobody interacts with the old stuff anymore. But Iād keep it around in my client, because why not.
If users choose a client which supports the extensions, they donāt have to mess around with v1 and v2 hashing, just like today.
As for the school of thought, personally, Iād prefer something else, too. Iām in camp location-based addressing, or whatever it is called. There more I think about it, a complete redesign of twtxt and its extensions would be necessary in my opinion. Retrofitting has its limits. Of course, this is much more work, though.
@zvava@twtxt.net I was about to suggest that you post some examples. By now, weāre pretty good at debugging hashing issues, because that happens so often. š But it looks like you figured it out on your own. āļø
@zvava@twtxt.net we have to amend the spec and increase the hash length. We just havenāt done so yet š
@zvava@twtxt.net may I recommend to change the mention format upon hitting reply to something similar to what itās used in Yarn, and perhaps hiding the hash on the post too? Looking good!
@movq@www.uninformativ.de Yeah, weāve seen how this plays out in practice 𤣠@dce@hashnix.club My advice, do what @movq@www.uninformativ.de has hinted at and donāt change the 1st # url = field in your feed. Iām not sure if you had already, but the first url field is kind of important in your feed as it is used as the āHashing URIā for threading.
@dce@hashnix.club Ah, oh, well then. š„“
My client supports that, if you set multiple url = fields in your feedās metadata (the top-most one must be the āmainā URL, that one is used for hashing).
But yeah, multi-protocol feeds can be problematic and some have considered it a mistake to support them. š¤
huh.. so not even trying to be compatible with existing hashes?
I have zero mental energy for programming at the moment. š«¤
Iāll try to implement the new hashing stuff in jenny before the ādeadlineā. But I donāt think youāll see any texudus development from me in the near future. ā¹ļø
@ About the URL, since it no longer used for hashing there might be no need to change it. I agree that we keep all the parts that already are out there for the most parts. Instead of a contact field you could also just use links like: link = Email mailto:user@example.dk or link = Signal https://signal.me/sthF4raI5Lg_ybpJwB1sOptDla4oU7p[...]
@lyse@lyse.isobeef.org Yeah to avoid cutting off bits at the end making hashes end in either q or a š¤£
if clauses to this. My point is: Every time I see a hash, Iād like to have a hint as to where to find the corresponding twt.
The reason I think this can work so well and Iām in full support of it is that itās the least disruptive way to resolve the issue of:
where did this hash come from?
@prologic@twtxt.net Not sure Iād attach any if clauses to this. My point is: Every time I see a hash, Iād like to have a hint as to where to find the corresponding twt.
@movq@www.uninformativ.de If weāre focusing on solving the āmissing rootsā problems. I would start to think about āclient recommendationsā. The first recommendation would be:
- Replying to a Twt that has no initial Subject must itself have a Subject of the form (hash; url).
This way itās a hint to fetching clients that follow B, but not A (in the case of no mentions) that the Subject/Root might (very likely) is in the feed url.
If we must stick to hashes for threading, can we maybe make it mandatory to always include a reference to the original twt URL when writing replies?
Instead of
(<a href="https://we.loveprivacy.club/search?q=%23123467">#123467</a>) hello foo bar
you would have
(<a href="https://we.loveprivacy.club/search?q=%23123467">#123467</a> http://foo.com/tw.txt) hello foo bar
or maybe even:
(<a href="https://we.loveprivacy.club/search?q=%23123467">#123467</a> 2025-04-30T12:30:31Z http://foo.com/tw.txt) hello foo bar
This would greatly help in reconstructing broken threads, since hashes are obviously unfortunately one-way tickets. The URL/timestamp would not be used for threading, just for discovery of feeds that you donāt already follow.
I donāt insist on including the timestamp, but having some idea which feed weāre talking about would help a lot.
7 to 12 and use the first 12 characters of the base32 encoded blake2b hash. This will solve two problems, the fact that all hashes today either end in q or a (oops) š
And increasing the Twt Hash size will ensure that we never run into the chance of collision for ions to come. Chances of a 50% collision with 64 bits / 12 characters is roughly ~12.44B Twts. That ought to be enough! -- I also propose that we modify all our clients and make this change from the 1st July 2025, which will be Yarn.social's 5th birthday and 5 years since I started this whole project and endeavour! š± #Twtxt #Update
I will be adding the code in for yarnd very soon⢠for this change, with a if the date is >= 2025-07-01 then compute_new_hashes else compute_old_hashes
Finally I propose that we increase the Twt Hash length from 7 to 12 and use the first 12 characters of the base32 encoded blake2b hash. This will solve two problems, the fact that all hashes today either end in q or a (oops) š
And increasing the Twt Hash size will ensure that we never run into the chance of collision for ions to come. Chances of a 50% collision with 64 bits / 12 characters is roughly ~12.44B Twts. That ought to be enough! ā I also propose that we modify all our clients and make this change from the 1st July 2025, which will be Yarn.socialās 5th birthday and 5 years since I started this whole project and endeavour! š± #Twtxt #Update
I had Chick-fil-A breakfast today (sausage, egg, and cheese biscuit, hash browns, coffee, and orange juice). Then at lunch my work place offered hot dogs. I had two (kosher, if that matters), plus a coke, a macadamia nuts cookie, and a small chocolate brownie.
So, here I am, at home, feeling hungry but guilty and refusing to eat anything else for the rest of the day. To top it off, I have only clocked 4,000 steps today (and I donāt feel like walking). I am going to hell, am I?
MaxAgeDays configuration at the pod level, that now some profiles are rather empty. This is only because well, they're a bit "inactive" so to speak š£ļø Not sure what to do about this at the moment... Open to ideas? š”
yes it used be http:// only and to keep hashes from breaking i added # url = http://... and now we are stock with it due to the curret specs.
Hmmm thereās a bug somewhere in the way Iām ingesting archived feeds š¤
sqlite> select * from twts where content like 'The web is such garbage these days%';
hash = 37sjhla
feed_url = https://twtxt.net/user/prologic/twtxt.txt/1
content = The web is such garbage these days š Or is it the garbage search engines? š¤
created = 2024-11-14T01:53:46Z
created_dt = 2024-11-14 01:53:46
subject = #37sjhla
mentions = []
tags = []
links = []
sqlite>
Some A hole has been trying to pull every single Twtxt feed that existed/still exists since forever. How do I know? Welpā Theyāve been querying my Timeline⢠instance for all of it, every single twtxt file and twt Hash they can find. šš¤¦ It must have been going on for days and I have just noticed⦠+ itās all coming from the same ASN AS136907 HWCLOUDS-AS-AP HUAWEI CLOUDS
Thank you Huawei for the DDos you sons of Glitches!!!
@quark@ferengi.one No editing old Twts that are the root of a thread with replies in the ecosystem. Just results in a fork. Unless the client has an implementation that does not store Twts keyed by Hash.
@david@collantes.us @andros@twtxt.andros.dev The correct hash would be si4er3q. See https://twtxt.dev/exts/twt-hash.html, a timezone offset of +00:00 or -00:00 must be replaced by Z.
(That said, thereās a bug in jenny as well. It only replaces +00:00, not -00:00. š¤”)
@prologic@twtxt.net interesting. What would happen on a hash collision? š¤
@bender@twtxt.net Itās a bug in the UI for sure. The hash is the primary key.
@david@collantes.us Yeah, weāve been debugging that a bit yesterday. Looks like the wrong input (sometimes) gets fed to the hash function ā broken threads.
./yarnc debug <your feed url>:
The actual hash is fs7673q.
./yarnc debug <your feed url>:
@prologic@twtxt.net thatās not what I see. The hash znf6csa cannot be found.
@prologic@twtxt.net There was no edit according to my Git history. š¤ On my end, the hash is fs7673q and thatās also what kat used to reply.
Doesnāt look like it Hmmm
sqlite> select * from twts where content LIKE '%Linux installation%';
hash = znf6csa
feed_url = https://www.uninformativ.de/twtxt.txt
content = I wonder if my current Linux installation will actually make it to 20 years:
$ head -n 1 /var/log/pacman.log
[2011-07-07 11:19] installed filesystem (2011.04-1)
Itās not toooo far into the future.
It would be crazy ⦠20 years without reinstalling once ⦠phew. š„“
created = 2025-04-07T19:59:51Z
subject = (#znf6csa)
mentions = []
tags = []
links = []
@movq@www.uninformativ.de Apparently you wrote it :D The hash doesnāt lie? 𤣠https://twtxt.net/twt/znf6csa
@prologic@twtxt.net What happened here ā did I edit my twt or is this hash wrong? š„“
@bender@twtxt.net Yeah, as you mentioned in the other thread, @andros@twtxt.andros.devās hashes appear to be not quite right. š¤
@andros@twtxt.andros.dev Hm, looks correct to me. The image to be displayed is a thumbnail and this links to the full-sized image. The thumbnail (JPG) is auto-generated from the full image (PNG), hence the two extensions.
What does look strange, though, is that your client came up with the hash pqsmcka, while it should have been te5quba. š¤
True. Though if the idea turns out to be better.. then community will adopt it.
if you look at the subject for that twt you will see that it uses the extended hash format to include a URL address.
@andros@twtxt.andros.dev I believe you have just reproduced the bug⦠it looks like youāve replayed to a twt but the hash is wrong. I can see the hash here from Jenny, but it doesnāt look like it corresponds to any{twt,thing}. if you check it out on any yarn instance it wonāt look like a replay.