Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of filenames with umlauts and other special characters #1017

Closed
phorward opened this issue Jan 10, 2024 · 2 comments
Closed

Handling of filenames with umlauts and other special characters #1017

phorward opened this issue Jan 10, 2024 · 2 comments
Assignees
Labels
discussion These topics must be discussed before completion Priority: High After critical issues are fixed, these should be dealt with before any further issues. security For security related bugs

Comments

@phorward
Copy link
Member

This problem currently exists in ViUR.
A file named testupload- ö ä -(123).txt should be uploaded.

  1. The file testupload- ö ä -(123).txt is selected or dropped.
    image
  2. file/getUploadURL is being called with this filename, and generates an upload URL to Google Cloud Storage with the cloud bucket and file path project-id.appspot.com/06eJqwTaFo8JO/source/testupload-%20%C3%B6%20%C3%A4%20-%28123%29.txt.
    image
    This first sanitizes the filename to the above filename form (source).
         fileName = sanitizeFileName(fileName)
    
         targetKey = utils.generateRandomString()
         blob = bucket.blob("%s/source/%s" % (targetKey, fileName))
         uploadUrl = blob.create_resumable_upload_session(content_type=mimeType, size=size, timeout=60)
    Then, it creates a new file leaf skel with the filename pending in (source):
         fileSkel = self.addSkel("leaf")
         fileSkel["name"] = "pending"
         fileSkel["size"] = 0
         fileSkel["mimetype"] = "application/octetstream"
         fileSkel["dlkey"] = targetKey
         fileSkel["parentdir"] = None
         fileSkel["pendingparententry"] = db.keyHelper(node, self.addSkel("node").kindName) if node else None
         fileSkel["pending"] = True
         fileSkel["weak"] = True
         fileSkel["width"] = 0
         fileSkel["height"] = 0
         fileSkel.toDB()
  3. The file is being uploaded to Google Cloud Storage using the provided upload URL.
  4. Afterwards, file/add is being called with the key of the file leaf skel created by file/getUploadURL. This now reads the Google Cloud storage object blob and extracts the filename again from it (source):
             skel["name"] = utils.escapeString(blob.name.replace("%s/source/" % skel["dlkey"], ""))
    Resulting file entity in Google Cloud Datastore:
    image
    Wow! Such a nicely resolved piece of code.
  5. The file is uploaded. The filename is displayed "broken" everywhere it is being used. A simple string unescaping is not enough, as the filename is URL-encoded. Why? For security reasons?
    image

In the end, this entire stuff is inconsistent and breaks the filename.

  1. In step 2: Why not save the pending file leaf skeleton with the unquoted filename, like testupload- ö ä -(123).txt (pending)
  2. In step 4: Why is skel["name"] being created from blob.name? Why is 06eJqwTaFo8JO/source/testupload-%20%C3%B6%20%C3%A4%20-%28123%29.txt turned into testupload-%20%C3%B6%20%C3%A4%20-%28123%29.txt and then the URL-escaped string is utils.escapeStringed again. Is this extra-secure? Why not just use the original filename provided before without the (pending)-postfix, and this, as normal for StringBone content, is required to be utils.escapeStringed.
  3. General question: Why is ViUR required to save escaped data for any string content? Why is such data not escaped when rendered, so the HTML-renderer is the place where data is going to be escaped, so no code leaking or injection will be possible. In JSON or any other format, the escaping is irellevant and impractical.
@phorward phorward added discussion These topics must be discussed before completion security For security related bugs viur-meeting Issues to discuss in the next ViUR meeting labels Jan 10, 2024
@phorward phorward added bootcamp Mausbrand Bootcamp Topic and removed viur-meeting Issues to discuss in the next ViUR meeting labels Jan 19, 2024
phorward added a commit to phorward/viur-core that referenced this issue Jan 31, 2024
This pull request is focuesed to (partly) fix viur-framework#1017.
@phorward phorward removed the bootcamp Mausbrand Bootcamp Topic label Feb 16, 2024
@phorward phorward self-assigned this Feb 16, 2024
@phorward phorward added this to the ViUR-core v3.6 milestone Feb 16, 2024
@phorward phorward added the Priority: High After critical issues are fixed, these should be dealt with before any further issues. label Mar 4, 2024
@phorward
Copy link
Member Author

phorward commented Mar 4, 2024

#1050 has a new naming scheme which is allowed for filenames, so a paranoid escaping like here should not be required anymore. Filenames not following the naming scheme will be rejected.

phorward added a commit that referenced this issue Mar 8, 2024
This pull request is focused to (partly) fix #1017.

---------

Co-authored-by: Sven Eberth <mail@sveneberth.de>
@phorward
Copy link
Member Author

phorward commented Mar 8, 2024

Hopefully fixed by #1050, otherwise please reopen this issue.
Existing code and uploads are still provided with the escaped filename.
They should be re-uploaded. @xnopasaranx

@phorward phorward closed this as completed Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion These topics must be discussed before completion Priority: High After critical issues are fixed, these should be dealt with before any further issues. security For security related bugs
Projects
None yet
Development

No branches or pull requests

1 participant