Proton Mail (Bridge) sends quoted strings with not allowed bytes #38

Open
soywod opened this issue Aug 22, 2024 · 12 comments
Labels
AFFECTED=Proton Bridge v3.6.1 PROTO=IMAP Related to IMAP protocol STATE=REPRODUCED Issue could be reproduced (explained in issue)

Comments


soywod commented Aug 22, 2024

Given the following malformed message:

Date: Sat, 17 Aug 2024 12:21:18 +0000
From: "Bàd" <from@localhost>
To: "Wrȯng" <to@localhost>
Subject: All is wrong?

Too bad.

When the envelope of this message is fetched from Proton Bridge, the FETCH response contains envelope addresses with disallowed bytes inside quoted strings, which makes imap_next fail with the error Received malformed message.
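For reference, here is a minimal sketch of what "disallowed bytes" means here, assuming the RFC 3501 grammar for quoted strings (7-bit characters only, no NUL, CR or LF, and a backslash may only escape a double quote or another backslash). The function name disallowed_quoted_bytes is made up for this illustration and is not part of imap-codec or imap_next.

```python
def disallowed_quoted_bytes(payload: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte) pairs that RFC 3501 forbids inside a quoted string.

    `payload` is the content between the double quotes, still escaped.
    Hypothetical helper for illustration only.
    """
    bad = []
    i = 0
    while i < len(payload):
        b = payload[i]
        if b == 0x5C:  # a backslash must escape '"' or '\'
            if i + 1 < len(payload) and payload[i + 1] in (0x22, 0x5C):
                i += 2
                continue
            bad.append((i, b))
        elif b in (0x00, 0x0A, 0x0D, 0x22) or b > 0x7F:
            # NUL, LF, CR, a bare double quote, or any 8-bit byte
            bad.append((i, b))
        i += 1
    return bad


# The display name from the message above, sent as raw UTF-8:
print(disallowed_quoted_bytes("Bàd".encode("utf-8")))
```

Both bytes of the UTF-8 encoding of "à" are reported, which is consistent with a strict parser rejecting the whole response.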

I ran the same test against Gmail, Outlook, Posteo and even 163; they all encode the sender correctly. Proton Bridge seems to return envelope addresses exactly as they appear in the original message.

CC @duesee

Author

soywod commented Aug 22, 2024

To be precise, the error occurs only with Proton Bridge (v3.6.1 tested). The malformed message displays correctly in their web interface.

@chibenwa

See https://datatracker.ietf.org/doc/html/rfc5738

Do they take it for granted and enable it by default?

How do they handle UTF-8 in mailbox names?

Author

soywod commented Aug 22, 2024

In fact, Proton Bridge sends invalid bytes even when the message is correctly encoded. With From: =?utf-8?q?Cl=C3=A9ment_DOUIN?= <clement.douin@localhost>, I get the same error:

* 2 FETCH (UID 4 FLAGS (\\Seen) ENVELOPE (\"Thu, 22 Aug 2024 07:48:03 +0000\" \"Test\" ((\"Clément DOUIN\" NIL \"clement.douin\" \"localhost\")) ((\"Clément DOUIN\" NIL \"clement.douin\" \"localhost\")) ((\"Clément DOUIN\" NIL \"clement.douin\" \"localhost\")) ((NIL NIL \"pimalaya.org\" \"proton.me\")) NIL NIL NIL \"<2087a3200a5a1f8f27bcb2add236e70d@localhost>\") BODYSTRUCTURE (\"text\" \"plain\" (\"charset\" \"utf-8\") NIL NIL \"quoted-printable\" 7 1 NIL NIL NIL NIL))\r\n
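To make the difference concrete, here is a small sketch using Python's standard email.header module. It decodes the encoded-word from the From header above and lists the non-ASCII bytes of the result; those are the bytes that show up raw in the ENVELOPE. That Proton Bridge performs this decoding step internally is an assumption based on the observed output, not something confirmed by its source.

```python
from email.header import decode_header, make_header

encoded = "=?utf-8?q?Cl=C3=A9ment_DOUIN?="

# decode_header() splits the value into (bytes, charset) chunks and
# make_header() joins them back into a single Unicode string.
decoded = str(make_header(decode_header(encoded)))

# Bytes above 0x7F are not allowed in an RFC 3501 quoted string.
non_ascii = [b for b in decoded.encode("utf-8") if b > 0x7F]

print(decoded)
print([hex(b) for b in non_ascii])
```

The encoded-word itself is pure ASCII, so a server that passes it through unchanged stays within the quoted-string grammar.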

Do they take it for granted and enable it by default?

It looks like it. Note that I do not send any ENABLE UTF8 command.

How do they handle UTF-8 in mailbox names?

I created a folder named Envoyés, and it looks like they properly encode it using modified UTF-7 as defined in https://www.rfc-editor.org/rfc/rfc3501#section-5.1.3.

event: Ok(
    DataReceived {
        data: List {
            items: [
                Unmarked,
            ],
            delimiter: Some(
                QuotedChar(
                    '/',
                ),
            ),
            mailbox: Other(
                MailboxOther(
                    String(
                        Quoted(
                            Quoted("Folders/Envoy&AOk-s"),
                        ),
                    ),
                ),
            ),
        },
    },
)
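The mailbox name in the output above can be reproduced with a short sketch of the modified UTF-7 encoding from RFC 3501 section 5.1.3: printable ASCII is kept as-is, "&" becomes "&-", and every other run of characters is encoded as UTF-16BE, then base64 without padding and with "," instead of "/". The function name encode_mailbox_name is made up for this example.

```python
import base64


def encode_mailbox_name(name: str) -> str:
    """Encode a mailbox name with IMAP modified UTF-7 (illustrative sketch)."""
    out = []
    run = []  # pending characters that cannot be sent as printable ASCII

    def flush():
        if run:
            raw = "".join(run).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            run.clear()

    for ch in name:
        if ch == "&":
            flush()
            out.append("&-")  # a literal ampersand is escaped as "&-"
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)
        else:
            run.append(ch)
    flush()
    return "".join(out)


print(encode_mailbox_name("Folders/Envoyés"))
```

This matches the Quoted("Folders/Envoy&AOk-s") value in the LIST response, so mailbox names seem to be handled per the RFC even though envelope strings are not.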

Author

soywod commented Aug 22, 2024

It looks like they also return invalid bytes in NO responses. Trying to create a mailbox named Bàd leads to a MalformedMessage error: 0.1 NO invalid mailbox name [\"Bàd\"]: operation not allowed\r\n.

Author

soywod commented Aug 22, 2024

It looks like Proton Bridge does not support the ENABLE capability. And if I still try to enable it, I once again get a MalformedMessage error: 0.1 BAD [Error offset=5]: unknown command 'enable'\r\n. How come we receive malformed messages for regular errors? Is Proton Bridge that broken?

Author

soywod commented Aug 22, 2024

Member

duesee commented Aug 22, 2024

And if I still try to enable it, I get once again a MalformedMessage error: 0.1 BAD [Error offset=5]: unknown command 'enable'\r\n.

The message is malformed I would say. But: IMAP is weird.

 0.1 BAD [Error offset=5]: unknown command 'enable'\r\n.

... can be interpreted as ...

Status {
  tag: ...
  code: Some(Other("Error offset=5")),
  // The next ":" is broken
  text: "unknown command `enable`"
}

... or (with a lot of goodwill) as ...

Status {
  tag: ...
  code: None,
  text: "[Error offset=5]: unknown command `enable`",
}

See #31. I think the [ and ] should clearly signal that this is a code. But: You could also argue it's a valid message (and imap-codec should consume it)... :-/
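The two readings above, plus outright rejection, can be sketched as a tiny status-line parser. This is only an illustration of the ambiguity, not how imap-codec works internally; parse_status and the on_broken_code modes are hypothetical names.

```python
def parse_status(line: bytes, on_broken_code: str = "reject"):
    """Parse a tagged status response into (tag, kind, code, text).

    `on_broken_code` picks what to do when "[...]" is not followed by a
    space: "reject", "as_code" (first reading) or "as_text" (second reading).
    """
    tag, kind, rest = line.rstrip(b"\r\n").decode("ascii").split(" ", 2)
    if kind not in ("OK", "NO", "BAD"):
        raise ValueError("not a status response")
    if rest.startswith("[") and "]" in rest:
        code, after = rest[1:].split("]", 1)
        if after.startswith(" "):
            return tag, kind, code, after[1:]  # well-formed code
        if on_broken_code == "as_code":
            # Skip the stray ":" and treat the bracketed part as a code.
            return tag, kind, code, after.lstrip(": ")
        if on_broken_code == "reject":
            raise ValueError("response code not followed by a space")
    return tag, kind, None, rest  # everything is human-readable text


LINE = b"0.1 BAD [Error offset=5]: unknown command 'enable'\r\n"
print(parse_status(LINE, "as_code"))
print(parse_status(LINE, "as_text"))
```

Which of the three behaviours is the right default is exactly the open question in this comment; the sketch only shows that each one is easy to express.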

Author

soywod commented Dec 12, 2024

Another case of an envelope containing invalid characters. Shall we initiate a new quirk?

Member

duesee commented Dec 12, 2024

Can you give me a short reminder of where we left off here? I remember there was a bug where we activated UTF-8 (without being prepared for it). Can we confirm that Proton uses UTF-8 unilaterally without us asking for it via A ENABLE UTF-8=...? If so, we are back to implementing a quirk (which should arguably only switch on UTF-8 handling even without ENABLE).


ajanvrin commented Dec 14, 2024

I'd like to give more context on pimalaya/himalaya#525 which I submitted, and is possibly related:

For context, I'm using ProtonBridge v3.15.1 (br-206), on Windows, and himalaya v1.0.0.

I initially thought that UTF8 was not the cause, as out of 2828 invalid fetches, 4 of them did not, at a glance, include raw UTF8 bytes (all 2824 other invalid fetches contained either \xc3, \xe2 or \xc2 and a followup byte).
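A rough sketch of the kind of check behind those counts, in Python: a regular expression that looks for a UTF-8 lead byte followed by the right number of continuation bytes. It is deliberately loose (it does not reject overlong or surrogate encodings), and the names UTF8_MULTIBYTE and has_raw_utf8 are made up for this example.

```python
import re

# Lead byte 0xC2-0xF4 followed by one to three continuation bytes 0x80-0xBF.
UTF8_MULTIBYTE = re.compile(
    rb"[\xC2-\xDF][\x80-\xBF]"
    rb"|[\xE0-\xEF][\x80-\xBF]{2}"
    rb"|[\xF0-\xF4][\x80-\xBF]{3}"
)


def has_raw_utf8(line: bytes) -> bool:
    """True if the line contains something shaped like a UTF-8 multi-byte sequence."""
    return UTF8_MULTIBYTE.search(line) is not None


lines = [
    "Clément DOUIN".encode("utf-8"),
    b"Mail Delivery Subsystem",
]
with_utf8 = [line for line in lines if has_raw_utf8(line)]
without_utf8 = [line for line in lines if not has_raw_utf8(line)]
print(len(with_utf8), len(without_utf8))
```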

I used python's subprocess module to collect the raw output, and here are the raw bytestrings of the 4 interesting log lines:

b'\x1b[2m2024-12-14T08:13:37.131617Z\x1b[0m \x1b[33m WARN\x1b[0m \x1b[1mfetch_envelopes_by_sequence\x1b[0m\x1b[1m{\x1b[0m\x1b[3mclient\x1b[0m\x1b[2m=\x1b[0m1\x1b[1m}\x1b[0m\x1b[2m:\x1b[0m \x1b[2mimap_client::tasks\x1b[0m\x1b[2m:\x1b[0m skipping invalid fetch \x1b[3mfetch\x1b[0m\x1b[2m=\x1b[0m"* 14 FETCH (UID 14 FLAGS () ENVELOPE (\\"Sat, 19 Oct 2024 15:18:01 -0600\\" \\"=?utf-8?q?Comment_avez-vous_trouv=C3=A9_votre_vol_au_d=C3=A9part_de_City?= =?utf-8?q?_et_=C3=A0_destination_de_City=C2=A0=3F?=\\" ((\\"Flight Company Customer Satisfaction\\\\u00a0\\" NIL \\"noreply\\" \\"domain.tld\\")) ((\\"Flight Company Customer Satisfaction\\\\u00a0\\" NIL \\"noreply\\" \\"domain.tld\\")) ((\\"Flight Company Customer Satisfaction\\\\u00a0\\" NIL \\"no-reply\\" \\"qualtrics-survey.com\\")) ((\\"FIRSTNAME LASTNAME\\" NIL \\"user\\" \\"domain.tld\\")) NIL NIL NIL \\"<1214041441.1249006.1729372681678@a2fb25952a56>\\") BODYSTRUCTURE (\\"text\\" \\"html\\" (\\"charset\\" \\"utf-8\\") NIL NIL \\"quoted-printable\\" 34072 894 NIL NIL NIL NIL))\\r\\n"'
b'\x1b[2m2024-12-14T08:13:37.137532Z\x1b[0m \x1b[33m WARN\x1b[0m \x1b[1mfetch_envelopes_by_sequence\x1b[0m\x1b[1m{\x1b[0m\x1b[3mclient\x1b[0m\x1b[2m=\x1b[0m1\x1b[1m}\x1b[0m\x1b[2m:\x1b[0m \x1b[2mimap_client::tasks\x1b[0m\x1b[2m:\x1b[0m skipping invalid fetch \x1b[3mfetch\x1b[0m\x1b[2m=\x1b[0m"* 214 FETCH (UID 214 FLAGS (\\\\Seen) ENVELOPE (\\"Sun, 11 Jun 2023 10:16:44 -0600 (MDT)\\" \\"=?utf-8?q?Merci_d\'=C3=A9valuer_votre_vol_Air_France_au_d=C3=A9part_de_CutCityFirstHalf?= =?utf-8?q?CutCitySecondHalf_et_=C3=A0_destination_de_City?=\\" ((\\"Flight Company Customer Satisfaction\\\\u00a0\\" NIL \\"noreply\\" \\"domain.tld\\")) ((\\"Flight Company Customer Satisfaction\\\\u00a0\\" NIL \\"noreply\\" \\"domain.tld\\")) ((\\"Flight Company Customer Satisfaction\\\\u00a0\\" NIL \\"no-reply\\" \\"qualtrics-survey.com\\")) ((\\"FIRSTNAME LASTNAME\\" NIL \\"user\\" \\"domain.tld\\")) NIL NIL NIL \\"<337850967.7769735.1686500204291@cd67e213b4e3>\\") BODYSTRUCTURE (\\"text\\" \\"html\\" (\\"charset\\" \\"utf-8\\") NIL NIL \\"quoted-printable\\" 35256 894 NIL NIL NIL NIL))\\r\\n"'
b'\x1b[2m2024-12-14T08:13:37.181983Z\x1b[0m \x1b[33m WARN\x1b[0m \x1b[1mfetch_envelopes_by_sequence\x1b[0m\x1b[1m{\x1b[0m\x1b[3mclient\x1b[0m\x1b[2m=\x1b[0m1\x1b[1m}\x1b[0m\x1b[2m:\x1b[0m \x1b[2mimap_client::tasks\x1b[0m\x1b[2m:\x1b[0m skipping invalid fetch \x1b[3mfetch\x1b[0m\x1b[2m=\x1b[0m"* 1809 FETCH (UID 1809 FLAGS (\\\\Seen) ENVELOPE (\\"Mon, 12 Aug 2013 08:57:04 +0200\\" \\"SUBJECT WITH SPACES\\" ((\\"company\\" NIL \\"company\\" \\"domain.tld\\")) ((\\"company\\" NIL \\"company\\" \\"domain.tld\\")) ((\\"company\\" NIL \\"company\\" \\"domain.tld\\")) () NIL NIL NIL \\"<52088740.6020001@domain.tld>\\") BODYSTRUCTURE (((\\"text\\" \\"html\\" (\\"charset\\" \\"utf-8\\") NIL NIL \\"quoted-printable\\" 4251 92 NIL NIL NIL NIL)(\\"image\\" \\"jpeg\\" (\\"filename\\" \\"part4.08050502.05050707@domain.tld\\" \\"name\\" \\"part4.08050502.05050707@domain.tld\\") \\"<part4.08050502.05050707@domain.tld>\\" NIL \\"base64\\" 11256 NIL (\\"inline\\" (\\"filename\\" \\"part4.08050502.05050707@domain.tld\\")) NIL NIL) \\"related\\" (\\"boundary\\" \\"2f917aa8ac3f4ae1ffd7cad73f22103164031b2ec4556f9da1d787853f754740\\") NIL NIL NIL) \\"mixed\\" (\\"boundary\\" \\"f2b6cb68d1cc50d33c0e69dbd11f7064cbf4a252816423eebb5e02a759bbeebe\\") NIL NIL NIL))\\r\\n"'
b'\x1b[2m2024-12-14T08:13:37.219571Z\x1b[0m \x1b[33m WARN\x1b[0m \x1b[1mfetch_envelopes_by_sequence\x1b[0m\x1b[1m{\x1b[0m\x1b[3mclient\x1b[0m\x1b[2m=\x1b[0m1\x1b[1m}\x1b[0m\x1b[2m:\x1b[0m \x1b[2mimap_client::tasks\x1b[0m\x1b[2m:\x1b[0m skipping invalid fetch \x1b[3mfetch\x1b[0m\x1b[2m=\x1b[0m"* 2810 FETCH (UID 2810 FLAGS (\\\\Seen) ENVELOPE (\\"Mon, 17 Dec 2018 11:09:00 -0800 (PST)\\" \\"Delivery Status Notification (Failure)\\" ((\\"Mail Delivery Subsystem\\" NIL \\"mailer-daemon\\" \\"googlemail.com\\")) ((\\"Mail Delivery Subsystem\\" NIL \\"mailer-daemon\\" \\"googlemail.com\\")) ((\\"Mail Delivery Subsystem\\" NIL \\"mailer-daemon\\" \\"googlemail.com\\")) ((NIL NIL \\"user\\" \\"gmail.com\\")) NIL NIL \\"<0509D27D-5C76-4E62-9F9F-E5CF61E6FF83@gmail.com>\\" \\"<5c17f44c.1c69fb81.375a8.09a0.GMR@mx.google.com>\\") BODYSTRUCTURE ((\\"text\\" \\"html\\" (\\"charset\\" \\"utf-8\\") NIL NIL \\"quoted-printable\\" 5661 142 NIL NIL NIL NIL)(\\"image\\" \\"png\\" (\\"filename\\" \\"icon.png\\" \\"name\\" \\"icon.png\\") \\"<icon.png>\\" NIL \\"base64\\" 1986 NIL (\\"attachment\\" (\\"filename\\" \\"icon.png\\")) NIL NIL)((\\"text\\" \\"plain\\" (\\"charset\\" \\"us-ascii\\") NIL NIL \\"7bit\\" 4 2 NIL NIL NIL NIL)(\\"text\\" \\"plain\\" () NIL NIL NIL 30 1 NIL NIL NIL NIL) \\"rfc822\\" (\\"filename\\" \\"email-1.3.eml\\" \\"name\\" \\"email-1.3.eml\\") (\\"attachment\\" (\\"filename\\" \\"email-1.3.eml\\")) NIL NIL) \\"mixed\\" (\\"boundary\\" \\"43a5f034afc02f5e5820f7480ed886e0cfb29c4f5e75dc230225b75dc67cae11\\") NIL NIL NIL))\\r\\n"'

The first two contain encoded UTF8 and raw non-breaking spaces (\u00a0); the other two seem mundane.
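For anyone who wants to repeat this on their own logs, here is a small sketch that strips the ANSI colour codes from a captured line and extracts the fetch= payload. It also shows one likely reason the first two lines escaped the byte search: the logger prints U+00A0 as the ASCII escape \u00a0, whereas its UTF-8 encoding is the two bytes C2 A0. The sample line is shortened, and ANSI_SGR and extract_fetch are hypothetical names.

```python
import re

# SGR colour/style sequences such as ESC[2m or ESC[0m.
ANSI_SGR = re.compile(rb"\x1b\[[0-9;]*m")


def extract_fetch(log_line: bytes) -> bytes:
    """Remove ANSI styling and return the value of the fetch= field."""
    plain = ANSI_SGR.sub(b"", log_line)
    _, found, payload = plain.partition(b"fetch=")
    if not found:
        raise ValueError("no fetch= field in log line")
    return payload.strip()


sample = (
    b"\x1b[33m WARN\x1b[0m skipping invalid fetch "
    b'\x1b[3mfetch\x1b[0m\x1b[2m=\x1b[0m"* 14 FETCH (UID 14 FLAGS ())"'
)
print(extract_fetch(sample))

# A non-breaking space is a two-byte sequence in UTF-8, starting with 0xC2.
print("\u00a0".encode("utf-8"))
```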

I minimally redacted the log lines, here's what I replaced the original bits with (the cut city name is mildly interesting in itself, there was no reason to cut it as it only contained ascii letters). None of the original bits included non-ascii bytes:

City
CutCityFirstHalf?= =?utf-8?q?CutCitySecondHalf
FIRSTNAME
Flight Company
LASTNAME
SUBJECT WITH SPACES
company
domain.tld
user

I'm going to try to compile himalaya from source and poke around the code to find potential clues (I'll try to understand how quirks work and what new quirk may solve the issue).
I'm not super proficient in Rust, nor IMAP, so don't hold your breath :).
If there are any diagnostic steps you'd like me to run, please tell me.

@ajanvrin

Trying to be useful, I wanted to answer #38 (comment) specifically.

Here's the observed behavior of ProtonBridge v3.15.1 (br-206) (latest version that I'm aware of):

"a1 LOGIN redacted redacted`r
a2 ENABLE UTF8=ACCEPT`r
a3 SELECT Folders/to_sort`r
a4 FETCH 2815 ENVELOPE`r
a5 LOGOUT`r`n"|openssl s_client -connect localhost:1143 -starttls imap
Connecting to 127.0.0.1
CONNECTED(000001F0)
Can't use SSL_get_servername
depth=0 C=CH, O=Proton AG, OU=Proton Mail, CN=127.0.0.1
verify error:num=18:self-signed certificate
verify return:1
depth=0 C=CH, O=Proton AG, OU=Proton Mail, CN=127.0.0.1
verify return:1
. OK CAPABILITY

---
Certificate chain
[TLS-related lines removed for brevity]
---
read R BLOCK
a1 OK [CAPABILITY ID IDLE IMAP4rev1 MOVE STARTTLS UIDPLUS UNSELECT] Logged in
a2 BAD [Error offset=4]: unknown command 'enable'
* FLAGS ($Forwarded Forwarded \Deleted \Flagged \Seen)
* 2823 EXISTS
* 0 RECENT
* OK [PERMANENTFLAGS ($Forwarded Forwarded \Deleted \Flagged \Seen)] Flags permitted
* OK [UIDNEXT 2830] Predicted next UID
* OK [UIDVALIDITY 58736456] UIDs valid
* OK [UNSEEN 1] Unseen messages
a3 OK [READ-WRITE] SELECT
* 2815 FETCH (ENVELOPE ("Thu, 13 Jun 2019 10:31:12 +0200 (CEST)" "=?utf-8?q?Votre_relev=C3=A9_mensuel_redacted" (("Votre mensuel redacted_contains_raw_UTF8_bytes" NIL "redacted" "redacted.com")) (("Votre mensuel redacted_contains_raw_UTF8_bytes" NIL "redacted" "redacted.com")) ((NIL NIL "redacted" "redacted.com")) ((NIL NIL "redacted" "gmail.com")) NIL NIL NIL "<redacted@redacted.com>"))
a4 OK command completed in 573 microsec.
* BYE
a5 OK LOGOUT
closed

I think we can confidently say that:

  • ProtonBridge does not seem to support the ENABLE command
  • ProtonBridge never asked the client to enable UTF8 using an ENABLE command
  • ProtonBridge indeed sent raw UTF8 bytes

Member

duesee commented Dec 15, 2024

Thank you for your amazing help! I'm a bit short on time, but what this means to me is that I should prioritize the implementation of the UTF-8 extension in imap-codec/types and enable a quirk to accept UTF-8, maybe even by default. The libraries stick to the RFC very tightly, but I feel these issues will keep coming...
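To illustrate the idea of such a quirk (in Python rather than Rust, and not reflecting imap-codec's actual design), here is a minimal sketch of a quoted-string decoder with an opt-in flag. The names decode_quoted and allow_utf8 are assumptions made up for this example; the sketch only covers the character set and the two backslash escapes.

```python
import re

# A backslash followed by a double quote or another backslash.
_ESCAPE = re.compile(r'\\(["\\])')


def decode_quoted(payload: bytes, allow_utf8: bool = False) -> str:
    """Decode the body of a quoted string (the part between the quotes).

    In strict mode the payload must be ASCII, so any byte above 0x7F raises
    UnicodeDecodeError. With the hypothetical `allow_utf8` quirk enabled,
    well-formed UTF-8 is accepted as well.
    """
    text = payload.decode("utf-8" if allow_utf8 else "ascii")
    return _ESCAPE.sub(r"\1", text)


raw = "Clément DOUIN".encode("utf-8")
print(decode_quoted(raw, allow_utf8=True))
```

With the flag off, the same input fails to decode, which is the strict behaviour that currently surfaces as a malformed message.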
