@falsifian@www.falsifian.org based on the Twt Subject Extension, your subject is invalid. You can have custom subjects, that is, ones that aren't a valid hash, but you can't just put anything and expect it to be treated as a TwtSubject, methinks.
Hmm, but yarnd also isn't showing these twts as being part of a thread. @prologic@twtxt.net you said yarnd respects custom subjects. Shouldn't these twts count as having a custom subject, and get threaded together?
yarnd just doesn't render the subject. Fair enough. It's (replyto http://darch.dk/twtxt.txt 2024-09-15T12:50:17Z), and if you don't want to go on a hunt, the twt hash is weadxga: https://twtxt.net/twt/weadxga
@sorenpeter@darch.dk I like this idea. Just for fun, I'm using a variant in this twt. (Also because I'm curious how non-hash subjects appear in jenny and yarn.)
URLs can contain commas, so I suggest a different character to separate the url from the date. In this twt I've used a space (also after "replyto", for symmetry).
I think this solves:
- Changing feed identities: although @mckinley@twtxt.net points out URLs can change, I think this syntax should be okay as long as the feed at that URL can be fetched, and as long as the current canonical URL for the feed lists this one as an alternate.
- editing, if you don't care about message integrity
- finding the root of a thread, if you're not following the author
An optional hash could be added if message integrity is desired. (E.g. if you don't trust the feed author not to make a misleading edit.) Other recent suggestions about how to deal with edits and hashes might be applicable then.
People publishing multiple twts per second should include sub-second precision in their timestamps. As you suggested, the timestamp could just be copied verbatim.
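(A hypothetical example of how that could look: a reply subject of (replyto https://example.com/twtxt.txt 2024-09-17T08:39:18.123456Z), where the fractional seconds are copied exactly as they appear in the original feed.)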
Maybe I'm being a bit too purist/minimalistic here. As I said before (in one of the 1372739 posts on this topic, or maybe I didn't even send that twt, I don't remember), I never really liked hashes to begin with. They aren't super hard to implement but they are kind of against the beauty of the original twtxt, because you need special client support for them. It's not something that you could write manually in your twtxt.txt file. With @sorenpeter@darch.dk's proposal, though, that would be possible.
Tangentially related, I was a bit disappointed to learn that the twt subject extension is now never used except with hashes. Manually-written subjects sounded so beautifully ad-hoc and organic as a way to disambiguate replies. Maybe I'll try it some time just for fun.
@movq@www.uninformativ.de I'm glad you like it. A mention (@<movq https://www.uninformativ.de/twtxt.txt>) is also long, but we live with it anyway. In a way a replyto: is just a mention of a twt instead of a feed/person. Maybe we could even model the syntax for replies on mentions: (#<2024-09-17T08:39:18Z https://www.eksempel.dk/twtxt.txt>) ?!
@prologic@twtxt.net Brute force. I just hashed a bunch of versions of both tweets until I found a collision.
I mostly just wanted an excuse to write the program. I don't know how I feel about actually using super-long hashes; could make the twts annoying to read if you prefer to view them untransformed.
@prologic@twtxt.net earlier you suggested extending hashes to 11 characters, but here's an argument that they should be even longer than that.
Imagine I found this twt one day at https://example.com/twtxt.txt :
2024-09-14T22:00Z Useful backup command: rsync -a "$HOME" /mnt/backup
and I responded with "(#5dgoirqemeq) Thanks for the tip!". Then I've endorsed the twt, but it could later get changed to
2024-09-14T22:00Z Useful backup command: rm -rf /some_important_directory
which also has an 11-character base32 hash of 5dgoirqemeq. (I'm using the existing hashing method with https://example.com/twtxt.txt as the feed url, but I'm taking 11 characters instead of 7 from the end of the base32 encoding.)
That's what I meant by "spoofing" in an earlier twt.
I don't know if preventing this sort of attack should be a goal, but if it is, the number of bits in the hash should be at least two times log2(number of attempts we want to defend against), where the "two times" is because of the birthday paradox.
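(For example: the 11-character hashes discussed here carry 55 bits, so by the birthday bound a collision is expected after roughly 2^27.5 (about 190 million) attempts, which is the same ballpark as the attempt count reported below. Defending against, say, 2^40 attempts would need at least 80 bits, i.e. 16 base32 characters.)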
Side note: current hashes always end with "a" or "q", which is a bit wasteful. Maybe we should take the first N characters of the base32 encoding instead of the last N.
Code I used for the above example: https://fossil.falsifian.org/misc/file?name=src/twt_collision/find_collision.c
I only needed to compute 43394987 hashes to find it.
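In case anyone wants to play with this, here is a minimal Python sketch of the hash computation as I understand the Twt Hash Extension (Blake2b-256 over the feed URL, the RFC 3339 timestamp and the twt text joined by newlines, then lowercase base32 without padding); the from_end flag just compares taking the last N characters, as the spec does, with taking the first N:

```python
import base64
import hashlib

def twt_hash(feed_url: str, timestamp: str, text: str,
             length: int = 7, from_end: bool = True) -> str:
    # Payload per my reading of the Twt Hash Extension; details such as
    # timestamp normalization may differ from what real clients do.
    payload = "\n".join([feed_url, timestamp, text]).encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[-length:] if from_end else encoded[:length]

# Example: an 11-character hash for the first twt in the scenario above.
print(twt_hash("https://example.com/twtxt.txt",
               "2024-09-14T22:00Z",
               'Useful backup command: rsync -a "$HOME" /mnt/backup',
               length=11))
```

The "always ends in a or q" observation falls out of this: 256 bits is not a multiple of 5, so the final base32 character only carries one significant bit.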
HTTPS is supposed to do [verification] anyway.
TLS provides verification that nobody is tampering with or snooping on your connection to a server. It doesn't, for example, verify that a file downloaded from server A is from the same entity as the one from server B.
I was confused by this response for a while, but now I think I understand what you're getting at. You are pointing out that with signed feeds, I can verify the authenticity of a feed without accessing the original server, whereas with HTTPS I can't verify a feed unless I download it myself from the origin server. Is that right?
I.e. if the HTTPS origin server is online and I don't mind taking the time and bandwidth to contact it, then perhaps signed feeds offer no advantage, but if the origin server might not be online, or I want to download a big archive of lots of feeds at once without contacting each server individually, then I need signed feeds.
feed locations [being] URLs gives some flexibility
It does give flexibility, but perhaps we should have made them URIs instead for even more flexibility. Then, you could use a tag URI, urn:uuid:*, or a regular old URL if you wanted to. The spec seems to indicate that the url tag should be a working URL that clients can use to find a copy of the feed, optionally at multiple locations. I'm not very familiar with IP{F,N}S but if it ensures you own an identifier forever and that identifier points to a current copy of your feed, it could be a great way to fix it on an individual basis without breaking any specs :)
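(A couple of hypothetical examples: tag:example.com,2024:twtxt or urn:uuid:123e4567-e89b-12d3-a456-426614174000 would both be syntactically valid URIs for naming a feed without being fetchable locations.)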
I'm also not very familiar with IPFS or IPNS.
I haven't been following the other twts about signatures carefully. I just hope whatever you smart people come up with will be backwards-compatible so it still works if I'm too lazy to change how I publish my feed :-)
@sorenpeter@darch.dk There was a client that would generate a unique hash for each twt. It didn't get wide adoption.
@prologic@twtxt.net does that mean that for every new post (not replies) the client will have to generate a UUID or similar when posting and add that to the twt?
@lyse@lyse.isobeef.org This looks like a nice way to do it.
Another thought: if clients can't agree on the url (for example, if we switch to this new way, but some old clients still do it the old way), that could be mitigated by computing many hashes for each twt: one for every url in the feed. So, if a feed has three URLs, every twt is associated with three hashes when it comes time to put threads together.
A client still needs to choose one url to use for the hash when composing a reply, but this might add some breathing room if there's a period when clients are doing different things.
(From what I understand of jenny, this would be difficult to implement there since each pseudo-email can only have one msgid to match to the in-reply-to headers. I don't know about other clients.)
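A rough sketch of that idea in Python (twt_hash repeats the hashing scheme assumed in the earlier sketch, and thread_keys is a made-up name purely for illustration):

```python
import base64
import hashlib

def twt_hash(feed_url: str, timestamp: str, text: str) -> str:
    # Assumed scheme: Blake2b-256 over "url\ntimestamp\ntext", base32, last 7 chars.
    payload = "\n".join([feed_url, timestamp, text]).encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()[-7:]

def thread_keys(feed_urls: list[str], timestamp: str, text: str) -> set[str]:
    # One hash per known URL for the same twt; a reply whose subject matches
    # any of these hashes gets threaded together with this twt.
    return {twt_hash(url, timestamp, text) for url in feed_urls}
```

The reply author still picks a single URL to hash against, as noted above; the extra hashes only help on the reading/threading side.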
@movq@www.uninformativ.de Another idea: just hash the feed url and time, without the message content. And don't twt more than once per second.
Maybe you could even just use the time, and rely on @-mentions to disambiguate. Not sure how that would work out.
Though I kind of like the idea of twts being immutable. At least, it's clear which version of a twt you're replying to (assuming nobody is engineering hash collisions).
@prologic@twtxt.net Some criticisms and a possible alternative direction:
Key rotation. I'm not a security person, but my understanding is that it's good to be able to give keys an expiry date and replace them with new ones periodically.
It makes maintaining a feed more complicated. Now instead of just needing to put a file on a web server (and scan the logs for user agents) I also need to do this. What brought me to twtxt was its radical simplicity.
Instead, maybe we should think about a way to allow old urls to be rotated out? Like, my metadata could somehow say that X used to be my primary URL, but from date D onward my primary url is Y. (Or, if you really want to use public key cryptography, maybe something similar could be used for key rotation there.)
It's nice that your scheme would add a way to verify the twts you download, but https is supposed to do that anyway. If you don't trust https to do that (maybe you don't like relying on root CAs?) then maybe your preferred solution should be reflected by your primary feed url. E.g. if you prefer the security offered by IPFS, then maybe an IPNS url would do the trick. The fact that feed locations are URLs gives some flexibility. (But then rotation is still an issue, if I understand ipns right.)
@movq@www.uninformativ.de @prologic@twtxt.net Another option would be: when you edit a twt, prefix the new one with (#[old hash]) and some indication that it's an edited version of the original tweet with that hash. E.g. if the hash used to be abcd123, the new version should start "(#abcd123) (redit)".
What I like about this is that clients that don't know this convention will still stick it in the same thread. And I feel it's in the spirit of the old pre-hash (subject) convention, though that's before my time.
I guess it may not work when the edited twt itself is a reply, and there are replies to it. Maybe that could be solved by letting twts have more than one (subject) prefix.
But the great thing about the current system is that nobody can spoof message IDs.
I don't think twtxt hashes are long enough to prevent spoofing.
@prologic@twtxt.net Perfect, thanks. For my own future reference: curl -H 'Accept: application/json' https://twtxt.net/twt/st3wsda
@prologic@twtxt.net Specifically, I could view yarnd's copy here, but only as rendered for a human to view: https://twtxt.net/twt/st3wsda
@movq@www.uninformativ.de thanks for getting to the bottom of it. @prologic@twtxt.net is there a way to view yarnd's copy of the raw twt? The edit didn't result in a visible change; being able to see what yarnd originally downloaded would have helped me debug.
@prologic@twtxt.net One of your twts begins with (#st3wsda): https://twtxt.net/twt/bot5z4q
Based on the twtxt.net web UI, it seems to be in reply to a twt by @cuaxolotl@sunshinegardens.org which begins "I've been sketching out…".
But jenny thinks the hash of that twt is 6mdqxrq. At least, there's a twt in their feed with that hash that has the same text as appears on yarn.social (except with a different apostrophe character).
Based on this, it appears jenny and yarnd disagree about the hash of the twt, or perhaps the twt was edited (though I can't see any difference, assuming the apostrophe variation is just a rendering choice).
@prologic@twtxt.net I believe you when you say registries as designed today do not crawl. But when I first read the spec, it conjured in my mind a search engine. Now I don't know how things work out in practice, but just based on reading, I don't see why it can't be an API for a crawling search engine. (In fact I don't see anything in the spec indicating registry servers shouldn't crawl.)
(I also noticed that https://twtxt.readthedocs.io/en/latest/user/registry.html recommends "The registries should sync each others user list by using the users endpoint". If I understood that right, registering with one should be enough to appear on others, even if they don't crawl.)
Does yarnd provide an API for finding twts? Is it similar?
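For context, here is a minimal sketch of querying a registry (registry.example.com is a placeholder, and the /api/plain/users path is my reading of the registry docs linked above rather than something I have verified against a live server):

```python
from urllib.request import urlopen

# Fetch the plain-text list of users/feeds known to a (hypothetical) registry.
with urlopen("https://registry.example.com/api/plain/users") as resp:
    for line in resp.read().decode("utf-8").splitlines():
        print(line)  # one registered nick/feed per line
```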
I just manually followed the steps at https://dev.twtxt.net/doc/twthashextension.html and got 6mdqxrq. I wonder what happened. Did @cuaxolotl@sunshinegardens.org edit the twt in some subtle way after twtxt.net downloaded it? I couldn't spot a diff, other than one apostrophe character appearing as a different one on yarn.social, which I assume is a transformation done by twtxt.net.
@prologic@twtxt.net How does yarn.social's API fix the problem of centralization? I still need to know whose API to use.
Say I see a twt beginning (#hash) and I want to look up the start of the thread. Is the idea that if that twt is hosted by a yarn.social pod, it is likely to know the thread start, so I should query that particular pod for the hash? But what if no yarn.social pods are involved?
The community seems small enough that a registry server should be able to keep up, and I can have a couple of others as backups. Or I could crawl the list of feeds followed by whoever emitted the twt that prompted my query.
I have successfully used registry servers a little bit, e.g. to find a feed that mentioned a tag I was interested in. Was even thinking of making my own, if I get bored of my too many other projects :-)
@movq@www.uninformativ.de Thanks, it works!
But when I tried it out on a twt from @prologic@twtxt.net, I discovered jenny and yarn.social seem to disagree about the hash of this twt: https://twtxt.net/twt/st3wsda . jenny assigned it a hash of 6mdqxrq but the URL and prologic's reply suggest yarn.social thinks the hash is st3wsda. (And as a result, jenny --fetch-context didn't work on prologic's twt.)
@movq@www.uninformativ.de I think you are worrying about a non-issue. I see nothing to do on your example twt, because there is no context. Furthermore, if I wanted to follow the feed, everything I need is already on that twt example. :-)
@prologic@twtxt.net Yes, fetching the twt by hash from some service could be a good alternative, in case the twt I have does not @-mention the source. (Besides yarnd, maybe this should be part of the registry API? I don't see fetch-by-hash in the registry API docs.)
@movq@www.uninformativ.de I don't know if I'd want to discard the twts. I think what I'm looking for is a command "jenny -g https://host.org/twtxt.txt" to fetch just that one feed, even if it's not in my follow list. I could wrap that in a shell script so that when I see a twt in reply to a feed I don't follow, I can just tap a key and the feed will get added to my maildir. I guess the script would look for a mention at the start of a selected twt and call jenny -g on the feed.
@movq@www.uninformativ.de Is there a good way to get jenny to do a one-off fetch of a feed, for when you want to fill in missing parts of a thread? I just added @slashdot@feeds.twtxt.net to my private follow file just because @prologic@twtxt.net keeps responding to the feed :-P and I want to know what he's commenting on even though I don't want to see every new slashdot twt.
twts are taking a very long time to post from yarn after the latest upgrade. Like a good 60 seconds.
@prologic@twtxt.net I don't know if this is new, but I'm seeing:
Jul 25 16:01:17 buc yarnd[1921547]: time="2024-07-25T16:01:17Z" level=error msg="https://yarn.stigatle.no/user/stigatle/twtxt.txt: client.Do fail: Get \"https://yarn.stigatle.no/user/stigatle/twtxt.txt\": dial tcp 185.97.32.18:443: i/o timeout (Client.Timeout exceeded while awaiting headers)" error="Get \"https://yarn.stigatle.no/user/stigatle/twtxt.txt\": dial tcp 185.97.32.18:443: i/o timeout (Client.Timeout exceeded while awaiting headers)"
I no longer see twts from @stigatle@yarn.stigatle.no at all.
This is a test. I am not seeing twts from @stigatle@yarn.stigatle.no and it seems like @prologic@twtxt.net might not be seeing twts from me. Do people see this?
@prologic@twtxt.net I am not seeing twts from @stigatle@yarn.stigatle.no anymore. Are you seeing twts from me?
@prologic@twtxt.net There are a lot of logs being generated by yarnd, which is also something I haven't seen before:
Jul 25 14:32:42 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:42 (162.211.155.2) "GET /twt/ubhq33a HTTP/1.1" 404 29 643.251µs
Jul 25 14:32:43 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:43 (162.211.155.2) "GET /twt/112073211746755451 HTTP/1.1" 400 12 505.333µs
Jul 25 14:32:44 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:44 (111.119.213.103) "GET /twt/whau6pa HTTP/1.1" 200 37360 35.173255ms
Jul 25 14:32:44 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:44 (162.211.155.2) "GET /twt/112343305123858004 HTTP/1.1" 400 12 455.069µs
Jul 25 14:32:44 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:44 (168.199.225.19) "GET /external?nick=lovetocode999&uri=http%3A%2F%2Fwww.palapa.pl%2Fbaners.php%3Flink%3Dhttps%3A%2F%2Fwww.dwnewstoday.com HTTP/1.1" 200 36167 19.582077ms
Jul 25 14:32:44 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:44 (162.211.155.2) "GET /twt/112503061785024494 HTTP/1.1" 400 12 619.152µs
Jul 25 14:32:46 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:46 (162.211.155.2) "GET /twt/111863876118553837 HTTP/1.1" 400 12 817.678µs
Jul 25 14:32:46 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:46 (162.211.155.2) "GET /twt/112749994821704400 HTTP/1.1" 400 12 540.616µs
Jul 25 14:32:47 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:47 (103.204.109.150) "GET /external?nick=lovetocode999&uri=http%3A%2F%2Fampurify.com%2Fbbs%2Fboard.php%3Fbo_table%3Dfree%26wr_id%3D113858 HTTP/1.1" 200 36187 15.95329ms
I've seen that nick=lovetocode999 a bunch.
There are also a bunch of log messages scrolling by. I've never seen this much activity in the log:
Jul 25 01:37:39 buc.ci yarnd[829]: [yarnd] 2024/07/25 01:37:39 (149.71.56.69) "GET /external?nick=lovetocode999&uri=https://pagez.co.uk/services/your-own-100-fully-owned-online-vi>
Jul 25 01:37:39 buc.ci yarnd[829]: [yarnd] 2024/07/25 01:37:39 (162.211.155.2) "GET /twt/112135496802692324 HTTP/1.1" 400 12 826.65µs
Jul 25 01:37:40 buc.ci yarnd[829]: [yarnd] 2024/07/25 01:37:40 (51.222.253.14) "GET /conv/muttriq HTTP/1.1" 200 36881 20.448309ms
Jul 25 01:37:40 buc.ci yarnd[829]: [yarnd] 2024/07/25 01:37:40 (162.211.155.2) "GET /twt/112730114943543514 HTTP/1.1" 400 12 663.493µs
Jul 25 01:37:40 buc.ci yarnd[829]: [yarnd] 2024/07/25 01:37:40 (27.75.213.253) "GET /external?nick=lovetocode999&uri=http%3A%2F%2Falfarah.jo%2FHome%2FChangeCulture%3FlangCode%3Den>
Jul 25 01:37:40 buc.ci yarnd[829]: time="2024-07-25T01:37:40Z" level=error msg="http://bynet.com.br/log_envio.asp?cod=335&email=%21%2AEMAIL%2A%21&url=https%3A%2F%2Fwww.almanacar.c>
Jul 25 01:37:40 buc.ci yarnd[829]: [yarnd] 2024/07/25 01:37:40 (162.211.155.2) "GET /twt/111674756400660911 HTTP/1.1" 400 12 545.106µs
Jul 25 01:37:40 buc.ci yarnd[829]: time="2024-07-25T01:37:40Z" level=warning msg="feed FetchFeedRequest: @<lovetocode999 http://alfarah.jo/Home/ChangeCulture?langCode=en&returnUrl>
Jul 25 01:37:41 buc.ci yarnd[829]: [yarnd] 2024/07/25 01:37:41 (162.211.155.2) "GET /twt/112507964696096567 HTTP/1.1" 400 12 838.946µs
Something really weird is going on?
The delete twt feature is not working.
So updated. Seems to duplicate here in the ui. And what is this "Read More" on every twt now?
Testing something.. can someone mention me in a twt?
It's not that easy @xuu@txt.sour.is since I implemented webmentions in a different way than how it has been done in yarnd, to work with txt-files. You can find the code in webmention_endpoint.php and new_twt.php at main · sorenpeter/timeline
@shreyan@twtxt.net What do you mean when you say federation protocol?
Either use webfinger for identity like mastodon etc. or use ATproto from Bluesky (or both?)
We can use webmentions or create our own twt-mentions for notifying someones feed (WIP code at: https://github.com/sorenpeter/timeline/tree/webmention/views)
I'm not sure we need much else. I would not even bother with encryption since other platforms do that better, and for me twtxt/yarn/timeline is for making things public
Twtxt spec enhancement proposal thread 🧵
Adding attributes to individual twts similar to adding feed attributes in the heading comments.
https://git.mills.io/yarnsocial/go-lextwt/pulls/17
The basic use case would be for multilingual feeds where there is a default language and some twts will be written in a different language.
As seen in the wild: https://eapl.mx/twtxt.txt
The attributes are formatted as [key=value]
They can show up anywhere in the twt as long as they are not enclosed by another element, such as a codeblock, or part of a markdown link.
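A hypothetical example of a feed line using such an attribute (the tab after the timestamp is the usual twtxt field separator; [lang=es] is just an illustration):

```
2024-09-17T12:00:00Z	[lang=es] Hola a todos, ¿cómo va el día?
```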
@sorenpeter@darch.dk this makes sense as a quote twt that references a direct URL. If we go back to how it developed on Twitter, originally it was RT @nick: original text; because it contained the original text, the Twitter algorithm would boost that text into trending.
i like the format (#hash) @<nick url> > "Quoted text"\nThen a comment
as it preserves the human-readable text and has the hash for linking to the yarn. The comment part could be optional for just boosting the twt.
The only issue i think i would have would be that the yarn could then become a mess of repeated quotes, unless the client knows to interpret them as multiple users having reposted/boosted the thread.
The format is also how the iPhone does reactions to SMS messages, with +number liked: original SMS
@eapl.me@eapl.me this is interesting. Is the square bracket something used in the wild for multilingual twts?
@prologic@twtxt.net what are your thoughts? Should we extend the parser to handle [lang] and [boost]? Or a generic attribute spec: a single word is a boolean attribute, and one with an = is a string key/value.
Early morning twt from me
If you're reading this, this is my first automated twt. I added a line to twtxt.txt, typed "make", and everything else was automatic.
Here it's 6.40 am. I like to get my twts done early
when is twt.nfld.uk coming back?
@prologic@twtxt.net I do, but you didn't specify in your twt that you needed to use a GitHub account. I copy-pasted the ssh command you posted verbatim!
<darch> testing out generating twt-hash using php
Let's assume for a moment that an answer to a question would be met with so many words you don't know what the answer was at all. Why? Why do this? Is this a stereotype of academics and philosophers? If so, it's not a very straight-forward way of thinking, let alone answering a simple question.
Well, I can't know what's in these people's minds and hearts. Personally I think it's a way of dissembling, of sowing doubt, and of maintaining plausible deniability. The strategy is to persuade as many people as possible to change their minds, and then force the remaining people to accept the idea because they think too many other people believe it.
Let's say you want, for whatever reason, to get a lot of people to accept an idea that you know most people find horrible. The last thing you should do is express the idea clearly and concisely and repeat it over and over again. All you'd accomplish is to cement people's resistance to you, and label yourself as a person who harbors horrible ideas that they don't like. So you can't do that.
What do you do instead? The entire field of "rhetoric", dating back at least to Plato and Aristotle (around 400 BC), is all about this. How to persuade people to accept your idea, even when they resist it. There are way too many techniques to summarize in a twt, but it seems almost obvious that you have to use more words and to use misleading or at least embellished or warped descriptions of things, because that's the opposite of clearly and concisely expressing yourself, which would directly lead to people rejecting your idea.
That's how I think of it anyway.
@prologic@twtxt.net I would politely suggest again that we not react to people with bad attitudes who talk shit about yarn. If twt is forked, it should be forked to add features that are otherwise not possible. Not to appease people who will probably never be appeased.
I'm not a big fan of using JSON. I feel we could still use text as the medium. Maybe a modified version to fix any weaknesses.
What if instead of signing each twt individually we generated a merkle tree using the twt hashes? Then a signature of the root hash. This would ensure the full stream of twts is intact with minimal overhead, with the added bonus of helping clients identify missing twts when syncing/gossiping.
Have two endpoints: one as the webfinger to link profile details and avatar like you posted, plus the signature for the merkle-root twt; and the other a pageable stream of twts, or individual twts/merkle branches to incrementally access twt feeds.
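Not a spec, just a sketch of the merkle idea: pair up the twt hashes and hash upward until a single root remains (SHA-256 and duplicating the last node on odd levels are arbitrary choices here, not anything agreed upon):

```python
import hashlib

def merkle_root(twt_hashes: list[str]) -> str:
    """Fold an ordered list of twt hashes into a single root hash."""
    if not twt_hashes:
        return ""
    level = [hashlib.sha256(h.encode("utf-8")).digest() for h in twt_hashes]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: duplicate the last node
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()
```

The feed owner would sign merkle_root(all twt hashes) and publish the signature; a client holding the full feed recomputes the root to verify it, and a mismatch (or the intermediate branches) points at which twts are missing or altered.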