Yes, that is exactly what I meant. I like that collection and âtwtxt v2â feels like a departure.
Maybe thereâs an advantage to grouping it into one spec, but IMO that shouldnât be done at the same time as introducing new untested ideas.
See https://yarn.social (especially this section: https://yarn.social/#self-host) â It really doesnât get much simpler than this đ¤Ł
Again, I like this existing simplicity. (I would even argue you donât need the metadata.)
That page says âFor the best experience your client should also support some of the Twtxt ExtensionsâŚâ but it is clear you donât need to. I would like it to stay that way, and publishing a big long spec and calling it âtwtxt v2â feels like a departure from that. (I think the content of the document is valuable; Iâm just carping about how itâs being presented.)
More thoughts about changes to twtxt (as if we havenât had enough thoughts):
- There are lots of great ideas here! Is there a benefit to putting them all into one document? Seems to me this could more easily be a bunch of separate efforts that can progress at their own pace:
1a. Better and longer hashes.
1b. New possibly-controversial ideas like edit: and delete: and location-based references as an alternative to hashes.
1c. Best practices, e.g. Content-Type: text/plain; charset=utf-8
1d. Stuff already described at dev.twtxt.net that doesnât need any changes.
We wonât know what will and wonât work until we try them. So Iâm inclined to think of this as a bunch of draft ideas. Maybe later when weâve seen it play out it could make sense to define a group of recommended twtxt extensions and give them a name.
Another reason for 1 (above) is: I like the current situation where all you need to get started is these two short and simple documents:
https://twtxt.readthedocs.io/en/latest/user/twtxtfile.html
https://twtxt.readthedocs.io/en/latest/user/discoverability.html
and everything else is an extension for anyone interested. (Deprecating non-UTC times seems reasonable to me, though.) Having a big long âtwtxt v2â document seems less inviting to people looking for something simple. (@prologic@twtxt.net you mentioned an anonymous comment âyouâve ruined twtxtâ and while I donât completely agree with that commenterâs sentiment, I would feel like twtxt had lost something if it moved away from having a super-simple core.)All that being said, these are just my opinions, and Iâm not doing the work of writing software or drafting proposals. Maybe I will at some point, but until then, if youâre actually implementing things, youâre in charge of what you decide to make, and Iâm grateful for the work.
@prologic@twtxt.net Done. Also, I went ahead and made two changes: changed hexadecimal to base64 for hashes (wasnât sure if anyone objected), and changed âMUST follow the chainâ to âSHOULD follow the chain.
@prologic@twtxt.net Thanks for pointing out it lasts four hours. Thatâs a big window! I wonder when most people will be on. I might aim for halfway through unless I hear otherwise. (12:00Z is a bit early for me.)
@bender@twtxt.net I am also in camp no edit signals. deletes only breaks the head of a thread. all the replies are unaffected.
Yes. I have only twtxt and scp hook in twet and it enough
Wait, webfinger? Mandate this ruin philosophy âtwtxt is just text fileâ
I dont think that is ruined twtxt. Twtxt v2 is just standartize twtxt and yarn extensions. What is bad?
@bender@twtxt.net I believe it is Unix-Unix Copy Protocol. Not Unix Copy-Copy Protocol.
@aelaraji@aelaraji.com ooooh! Itâs that kind mission! /me stands, salutes, turns around, and exits the room. LOL.
@aelaraji@aelaraji.com why, having a party with lots of libations? LOL.
@david@collantes.us having offsets were nice because it gives you context of where the user is in relation to you.
@prologic@twtxt.net thanks. I hate it. Might as well use UUID
This is only first draft quality, but I made some notes on the #twtxt v2 proposal. http://a.9srv.net/b/2024-09-25
No, json is overhead. I love twtxt for simplicity where blog is just text file and not several json files where fields are repeatedâŚ
(#2024-09-24T12:53:35Z) What does this screenshot show? The resolution it too low for reading the textâŚ
(#2024-09-24T12:45:54Z) @prologic@twtxt.net Iâm not really buying this one about readability. Itâs easy to recognize that this is a URL and a date, so you skim over it like you would we mentions and markdown links and images. If you are not suppose to read the raw file, then we might a well jam everything into JSON like mastodon
(#2024-09-24T12:44:35Z) There is a increase in space/memory for sure. But calculating the hashes also takes up CPU. Iâm not good with that kind of math, but itâs a tradeoff either way.
(#2024-09-24T12:39:32Z) @prologic@twtxt.net It might be simple for you to run echo -e "\t\t" | sha256sum | base64, but for people who are not comfortable in a terminal and got their dev env set up, then that is magic, compared to the simplicity of just copy/pasting what you see in a textfile into another textfile â Basically what @movq@www.uninformativ.de also said. Iâm also on team extreme minimalism, otherwise we could just use mastodon etc. Replacing line-breaks with a tab would also make it easier to handwrite your twtxt. You donât have to hardwrite it, but at least you should have the option to. Just as i do with all my HTML and CSS.
(#2024-09-24T12:34:31Z) WebMentions does would work if we agreed to implement it correctly. I never figured out how yarndâs WebMentions work, so I decide to make my own, which Iâm the only one usingâŚ
I had a look at WebSub, witch looks way more complex than WebMentions, and seem to need a lot more overhead. We donât need near realtime. We just need a way to notify someone that someone they donât know about mentioned or replied to their post.
Aggred. But reading twtxt in raw form sounds⌠I canât do this
Some more arguments for a local-based treading model over a content-based one:
The format:
(#<DATE URL>)or(@<DATE URL>)both makes sense: # as prefix is for a hashtag like we allredy got with the(#twthash)and @ as prefix denotes that this is mention of a specific post in a feed, and not just the feed in general. Using either can make implementation easier, since most clients already got this kind of filtering.Having something like
(#<DATE URL>)will also make mentions via webmetions for twtxt easier to implement, since there is no need for looking up the#twthash. This will also make it possible to make 3th part twt-mentions services.Supporting twt/webmentions will also increase discoverability as a way to know about both replies and feed mentions from feeds that you donât follow.
Finally pubnix is alive! Thatâs im missing? Im only reading twtxt.net timeline because twtxt-v2.sh works slowly for displaying timelineâŚ
@movq@www.uninformativ.de Yes, the tools are surprisingly fast. Still, magrep takes about 20 seconds to search through my archive of 140K emails, so to speed things up I would probably combine it with an indexer like mu, mairix or notmuch.
@falsifian@www.falsifian.org I believe the preserve means to include the original subject hash in the start of the twt such as (#somehash)
@bender@twtxt.net Ha! Maybe I should get on the Markdown train. Youâre taking away my excuses.
Sorry, youâre right, I should have used numbers!
Iâm donât understand what âpreserve the original hashâ could mean other than âmake sure thereâs still a twt in the feed with that hashâ. Maybe the text could be clarified somehow.
Iâm also not sure what you mean by markdown already being part of it. Of course people can already use Markdown, just like presumably nothing stopped people from using (twt subjects) before they were formally described. But itâs not universal; e.g. as a jenny user I just see the plain text.
@falsifian@www.falsifian.org The GDPR does not apply to the processing of data for a purely personal or household activity that is not connected to a professional or commercial activity.
@prologic@twtxt.net Do you feel the same about published vs. privately stored data?
For me thereâs a distinction. I feel very strongly that I should be able to retain whatever private information I like. On the other hand, I do have some sympathy for requests not to publish or propagate (though I personally feel itâs still morally acceptable to ignore such requests).
@lyse@lyse.isobeef.org Iâd suggest making the whole content-type thing a SHOULD, to accommodate people just using some hosting service they donât have much control over. (The same situation could make detecting followers hard, but IMO âplease email me if you follow meâ is still legit twtxt, even if inconvenient.)
@prologic@twtxt.net Thanks for writing that up!
I hope it can remain a living document (or sequence of draft revisions) for a good long time while we figure out how this stuff works in practice.
I am not sure how I feel about all this being done at once, vs. letting conventions arise.
For example, even today I could reply to twt abc1234 with â(#abc1234) Edit: âŚâ and I think all you humans would understand it as an edit to (#abc1234). Maybe eventually it would become a common enough convention that clients would start to support it explicitly.
Similarly we could just start using 11-digit hashes. We should iron out whether itâs sha256 or whatever but thereâs no need get all the other stuff right at the same time.
I have similar thoughts about how some users could try out location-based replies in a backward-compatible way (append the replyto: stuff after the legacy (#hash) style).
However I recognize that Iâm not the one implementing this stuff, and itâs less work to just have everything determined up front.
Misc comments (I havenât read the whole thing):
Did you mean to make hashes hexadecimal? You lose 11 bits that way compared to base32. Iâd suggest gaining 11 bits with base64 instead.
âClients MUST preserve the original hashâ â do you mean they MUST preserve the original twt?
Thanks for phrasing the bit about deletions so neutrally.
I donât like the MUST in âClients MUST follow the chain of reply-to referencesâŚâ. If someone writes a client as a 40-line shell script that requires the user to piece together the threading themselves, IMO we shouldnât declare the client non-conforming just because they didnât get to all the bells and whistles.
Similarly I donât like the MUST for user agents. For one thing, you might want to fetch a feed without revealing your identty. Also, it raises the bar for a minimal implementation (Iâm again thinking again of the 40-line shell script).
For âwho followsâ lists: why must the long, random tokens be only valid for a limited time? Do you have a scenario in mind where they could leak?
Why canât feeds be served over HTTP/1.0? Again, thinking about simple software. I recently tried implementing HTTP/1.1 and it wasnât too bad, but 1.0 would have been slightly simpler.
Why get into the nitty-gritty about caching headers? This seems like generic advice for HTTP servers and clients.
Iâm a little sad about other protocols being not recommended.
I donât know how I feel about including markdown. I donât mind too much that yarn users emit twts full of markdown, but Iâm more of a plain text kind of person. Also it adds to the length. I wonder if putting a separate document would make more sense; that would also help with the length.
@movq@www.uninformativ.de I cases of these kind of âabuseâ of social trust. Then I think people should just delete their replies, unfollow the troll and leave them to shouting in the void. This is a inter-social issue, not a technical issue. Anything can be spoofed. We are not building a banking app, we are just having conversation and if trust are broken then communication breaks down. These edge-cases are all very hypothetical and not something I think we need to solve with technology.
Weâre happy to report that @burglar@yarnd.lyse.isobeef.org was taken into custody, @prologic@twtxt.net. Always remember, criminals cannot escape the law.
Our investigations revealed: https://lyse.isobeef.org/tmp/twtinjector.tar.bz2
Fear not, @prologic@twtxt.net, weâre deploying our helicopter and will arrive shortly.
Heads up, @prologic@twtxt.net! Weâre seeing increased spate of burglaries in your neighbourhood. Please stay alert, while we keep you safe out there.
@prologic@twtxt.net I have no specifics, only hopes. (I have seen some articles explaining the GDPR doesnât apply to a âpurely personal or household activityâ but I donât really know what that means.)
I donât know if itâs worth giving much thought to the issue unless either you expect to get big enough for the GDPR to matter a lot (I imagine making money is a prerequisite) or someone specifically brings it up. Unless you enjoy thinking through this sort of thing, of course.
@david@collantes.us Thanks, thatâs good feedback to have. I wonder to what extent this already exists in registry servers and yarn pods. I havenât really tried digging into the past in either one.
How interested would you be in changes in metadata and other comments in the feeds? Iâm thinking of just permanently saving every version of each twtxt file that gets pulled, not just the twts. It wouldnât be hard to do (though presenting the information in a sensible way is another matter). Compression should make storage a non-issue unless someone does something weird with their feed like shuffle the comments around every time I fetch it.
@movq@www.uninformativ.de I donât think it has to be like that. Just make sure the new version of the twt is always appended to your current feed, and have some convention for indicating itâs an edit and which twt it supersedes. Keep the original twt as-is (or delete it if you donât want new followers to see it); doesnât matter if itâs archived because you arenât changing that copy.
@prologic@twtxt.net Do you have a link to some past discussion?
Would the GDPR would apply to a one-person client like jenny? I seriously hope not. If someone asks me to delete an email they sent me, I donât think I have to honour that request, no matter how European they are.
I am really bothered by the idea that someone could force me to delete my private, personal record of my interactions with them. Would I have to delete my journal entries about them too if they asked?
Maybe a public-facing client like yarnd needs to consider this, but that also bothers me. I was actually thinking about making an Internet Archive style twtxt archiver, letting you explore past twts, including long-dead feeds, see edit histories, deleted twts, etc.
@prologic@twtxt.net what time in UTC?
@david@collantes.us Well, I wouldnât recommend using my code for your main jenny use anyway. If you want to try it out, set XDG_CONFIG_HOME and XDG_CACHE_HOME to some sandbox directories and only run my code there. If @movq@www.uninformativ.de is interested in any of this getting upstreamed, Iâd be happy to try rebasing the changes, but otherwise itâs a proof of concept and fun exercise.
@david@collantes.us Hello!
@prologic@twtxt.net Hi. i have noticed sometimes when i hit the back button i lose all the surrounding layout and just have a list of twts.

I wrote some code to try out non-hash reply subjects formatted as (replyto ), while keeping the ability to use the existing hash style.
I donât think we need to decide all at once. If clients add support for a new method then people can use it if they like. The downside of course is that this costs developer time, so I decided to invest a few hours of my own time into a proof of concept.
With apologies to @movq@www.uninformativ.de for corrupting jennyâs beautiful code. I donât write this expecting you to incorporate the patch, because it does complicate things and might not be a direction you want to go in. But if you like any part of this approach feel free to use bits of it; I release the patch under jennyâs current LICENCE.
Supporting both kinds of reply in jenny was complicated because each email can only have one Message-Id, and because itâs possible the target twt will not be seen until after the twt referencing it. The following patch uses an sqlite database to keep track of known (url, timestamp) pairs, as well as a separate table of (url, timestamp) pairs that havenât been seen yet but are wanted. When one of those âwantedâ twts is finally seen, the mail file gets rewritten to include the appropriate In-Reply-To header.
Patch based on jenny commit 73a5ea81.
https://www.falsifian.org/a/oDtr/patch0.txt
Not implemented:
- Composing twts using the (replyto âŚ) format.
- Probably other important things Iâm forgetting.
@prologic@twtxt.net Thatâs definitely a little less depressing, when thinking of it that way 𤣠Be interesting when the hype dies down.
@quark@ferengi.one It does not. That is why Iâm advocating for not using hashes for treads, but a simpler link-back scheme.
@prologic@twtxt.net where was that idea?
@prologic@twtxt.net the basic idea was to stem the hash.. so you have a hash abcdef0123456789... any sub string of that hash after the first 6 will match. so abcdef, abcdef012, abcdef0123456 all match the same. on the case of a collision i think we decided on matching the newest since we archive off older threads anyway. the third rule was about growing the minimum hash size after some threshold of collisions were detected.
@prologic@twtxt.net Wikipedia claims sha1 is vulnerable to a âchosen-prefix attackâ, which I gather means I can write any two twts I like, and then cause them to have the exact same sha1 hash by appending something. I guess a twt ending in random junk might look suspcious, but perhaps the junk could be worked into an image URL like
. If thatâs not possible now maybe it will be later.git only uses sha1 because theyâre stuck with it: migrating is very hard. There was an effort to move git to sha256 but I donât know its status. I think there is progress being made with Game Of Trees, a git clone that uses the same on-disk format.
I canât imagine any benefit to using sha1, except that maybe some very old software might support sha1 but not sha256.
@sorenpeter@darch.dk hmm, how does your client handles âa little editingâ? I am sure threads would break just as well. đ