[Return] [Catalog]

1 guest@cc 2019-02-25T19:58:56
I needed to open MHTs on Debian. My old solutions were booting into Windows (dual-booting or VirtualBox) and using Firefox 3.6, but I remembered that legacy Opera could open MHT files. It worked great, but I also got a metric fuck ton of feels. Opera and Edge have moved on to Blink, and Firefox is trying its best to look like Chrome. The web has become bloated with JS, and browsers have become heavy. Very sad.
»
2 guest@cc 2019-02-26T00:09:59
Hey, don't complain about bloat if you're saving your web pages in MHTML. It's arguably a handy archive format (I'd specifically argue against that, though), but it uses far more space than it should have to.
»
3 guest@cc 2019-02-26T00:42:23
>>2
I'm not a fan of the internal formatting either (and the MHTs I'm looking at could benefit massively from sharing assets as well), but I find the overall concept less objectionable than having index.html + index_files, or Print as > index.pdf. (Wikipedia seems to suggest that it's based on e-mail technology, which, to me, explains a lot.) And besides, what else is 7zip for?
»
4 guest@cc 2019-02-26T04:05:32
>>3
I just wget my sites and hope the developer was sensible enough to use relative links. In case they weren't, I just try to live without the CSS/images knowing they've been backed up to my disk anyway, but if I were less lazy, I'd have a setup where I automatically host the pages on a local web server. I'm sure there are programs for it out there.
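
For what it's worth, the local web server part can be roughly sketched with nothing but the Python standard library. This is only a guess at what such a setup might look like, not an existing program: `make_mirror_server` and the `mirror/` directory are made-up names. Serving the mirror from the server root should make root-relative links like `/css/style.css` resolve again; links that hardcode the original domain would still need rewriting.

```python
import functools
import http.server


def make_mirror_server(root: str, port: int = 0) -> http.server.ThreadingHTTPServer:
    """Return a server that exposes the directory `root` at http://127.0.0.1:<port>/.

    `root` is assumed to be the folder wget mirrored the site into.
    Port 0 asks the OS for any free port.
    """
    # Bind the handler to the mirror directory instead of the current directory.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=root
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


if __name__ == "__main__":
    # Hypothetical usage: serve ./mirror on port 8000 until interrupted.
    server = make_mirror_server("mirror", 8000)
    print("Serving on port", server.server_address[1])
    server.serve_forever()
```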

Anyway, as you'd probably know, MHT is short for "MIME HTML". MIME is used in all sorts of protocols nowadays, but it was originally designed for email, hence "Multipurpose Internet Mail Extensions". Email was already well-established by the time people decided they wanted to start attaching pictures and stuff to their messages, but the problem was that email was and always has been just a method of passing single text files around, except in a highly structured way. Breaking that structure to add attachments would've ended up being pretty disastrous, so to work around this, the guys writing the standards decided that attachments should be converted to text in the middle of the email, and it'd be left to the clients to convert that text back to whatever it originally was. There are different ways to encode binary files as text, but the most popular one is Base64. Clients which couldn't handle this at the time would basically just get a mess of letters and numbers at the bottom of the message, which is a lot better than not being able to read the message at all.

People eventually got the funny idea of turning their emails into web pages with pictures, bold coloured text, and maybe a little JavaScript program which tells the sender when you've read their email, and that's where MHTML comes from. Basically a normal HTML page, but anything that was embedded originally now travels in the same file as extra MIME parts, usually encoded as Base64.
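
If it helps to see it, here's a minimal sketch of that idea using Python's standard `email` package. It isn't how any particular browser writes its MHT files, just an illustration: the `example.invalid` URLs, the fake image bytes, and the function names `build_mhtml` and `extract_parts` are all made up for the example. The page and the image each become their own MIME part inside one `multipart/related` message, and the image part is Base64 text until a client decodes it.

```python
from email import message_from_string
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Stand-in bytes for an image; a real archive would read the file from disk.
FAKE_PNG = b"\x89PNG\r\n\x1a\n" + bytes(range(16))


def build_mhtml(html: str, image: bytes) -> str:
    """Pack an HTML page and one image into a single MIME message (as text)."""
    archive = MIMEMultipart("related")
    archive["Subject"] = "Saved page"

    # The page itself is an ordinary text part.
    page = MIMEText(html, "html")
    page["Content-Location"] = "http://example.invalid/index.html"
    archive.attach(page)

    # MIMEImage Base64-encodes the binary payload by default.
    picture = MIMEImage(image, "png")
    picture["Content-Location"] = "http://example.invalid/logo.png"
    archive.attach(picture)

    return archive.as_string()


def extract_parts(mhtml: str) -> dict:
    """Do what a client does: turn the text back into the original bytes."""
    parsed = message_from_string(mhtml)
    return {
        part["Content-Location"]: part.get_payload(decode=True)
        for part in parsed.walk()
        if not part.is_multipart()
    }


if __name__ == "__main__":
    text = build_mhtml('<html><body><img src="logo.png"></body></html>', FAKE_PNG)
    print(text)  # plain text all the way through, including the "image"
```

A client that doesn't understand MIME would just show the printed text as-is, Base64 blob and all, which is the "mess of letters and numbers" case.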

I'm a real fan of the technology behind it, but I'm kind of autistic, so I don't really like to change or convert files at all. It'd be best if people basically just abandoned normal HTML entirely, and started hand-writing their pages in MHTML instead, but nothing like that's ever really going to happen.

https://en.wikipedia.org/wiki/Data_URI_scheme#HTML Check this out, isn't it cool?
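
A quick sketch of how one of those data URIs gets put together, in case anyone wants to play with it. The helper name `to_data_uri` and the sample bytes are made up for the example; the `data:<type>;base64,<payload>` layout is the part that comes from the scheme itself.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data: URI that can be pasted into src= or href=."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


if __name__ == "__main__":
    # Placeholder bytes; a real page would read an actual image file here.
    uri = to_data_uri(b"not really a png", "image/png")
    print(f'<img src="{uri}" alt="inline image">')
```

Unlike MHTML, the encoded resource sits directly inside the markup here, so the HTML file stays a normal HTML file.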
»
5 guest@cc 2019-02-27T21:05:58
>>4
>I just wget my sites and hope the developer was sensible enough to use relative links.

This reminds me, I had downloaded a bunch of web pages a long while ago and wrote custom CSS + a frames-based HTML wrapper for my offline reading pleasure.

>Anyway, as you'd probably know [...]

I actually did not know. I'm only vaguely aware of how emails are internally (?) formatted. Intriguing.

>I'm kind of autistic, so I don't really like to change or convert files at all.

Oh, absolutely. I use ffmpeg to extract the audio directly from youtube-dl downloads. I appreciate how MHT preserves things decently well in one package.

As for writing pages directly in MHTML and/or data URI schemes: I prefer the idea of keeping resources relatively separate, although there are definitely use cases for putting everything in one file. I used to block images sometimes when browsing the web for faster loading, and did not appreciate Google embedding images as data URIs in the results page. And hand-writing MHTML sounds hellish!
»
6 guest@cc 2019-03-03T14:59:19
>I actually did not know.
Just so I can make myself look less autistic: I meant the part about it being short for "MIME HTML". I didn't expect you to know the rest.
