Why does base64.b64encode() return a bytes object?

 2 years ago
source link: https://www.codesd.com/item/why-does-base64-b64encode-return-a-bytes-object.html

The purpose of the base64.b64encode() function is to convert binary data into ASCII-safe "text". However, the function returns a bytes object, e.g.:

Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import base64
>>> base64.b64encode(b'abc')
b'YWJj'

Now, it's easy to simply take that output and decode() it, but my question is: what is the significance of base64.b64encode() returning a bytes object rather than a str?


The purpose of the base64.b64encode() function is to convert binary data into ASCII-safe "text"

Python disagrees with that - base64 has been intentionally classified as a binary transform.

It was a design decision in Python 3 to force the separation of bytes and text and prohibit implicit transformations. Python is now so strict about this that bytes.encode doesn't even exist, and so b'abc'.encode('base64') would raise an AttributeError.
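A short sketch of that strictness, assuming any recent Python 3 interpreter (the variable names are only illustrative): bytes has no encode() method, and b64encode() itself refuses a str argument.

```python
import base64

# bytes objects offer decode() but no encode(); str objects offer the reverse.
try:
    b'abc'.encode('base64')
except AttributeError as exc:
    print("AttributeError:", exc)

# b64encode() maps bytes to bytes.
encoded = base64.b64encode(b'abc')
print(type(encoded).__name__, encoded)  # bytes b'YWJj'

# Passing text instead of bytes is rejected rather than implicitly encoded.
try:
    base64.b64encode('abc')
except TypeError as exc:
    print("TypeError:", exc)
```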

The position the language takes is that a bytestring object is already encoded. A codec which encodes bytes into text does not fit into this paradigm, because going from the bytes domain to the text domain is a decode. Note that rot13 was also dropped from the encodings accepted by str.encode() for the same reason: as a text-to-text transform it didn't fit the Python 3 paradigm, so, like base64, it is reachable only through codecs.encode().
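To illustrate the distinction, here is a minimal sketch assuming Python 3.4 or later, where the codecs module exposes these transforms: both codecs still exist, but only through codecs.encode(), never through str.encode().

```python
import codecs

# base64 is registered as a bytes-to-bytes transform.
# The codec appends a trailing newline, unlike base64.b64encode().
b64 = codecs.encode(b'abc', 'base64')
print(b64)  # b'YWJj\n'

# rot_13 is registered as a text-to-text transform.
rot = codecs.encode('abc', 'rot_13')
print(rot)  # nop

# str.encode() only accepts real text encodings, so this is rejected.
try:
    'abc'.encode('rot_13')
except LookupError as exc:
    print("LookupError:", exc)
```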

There is also a performance argument to be made. The base64 output is an ASCII-encoded binary representation produced by C code in the binascii module. Suppose Python automatically decoded that output into a text-domain object: if you actually wanted the bytes, you would have to undo the decoding by encoding to ASCII again. That would be a wasteful round-trip, an unnecessary double negation. It is better to opt in to the decode-to-text step.
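The opt-in step is a single decode() call. A minimal sketch, with illustrative variable names:

```python
import base64

raw = b'abc'

# Stay in the bytes domain when the result goes to a socket, file, or hash.
encoded_bytes = base64.b64encode(raw)

# Opt in to text only when a str is needed (JSON, logging, templates).
# base64 output is pure ASCII, so decoding as 'ascii' cannot fail.
encoded_text = encoded_bytes.decode('ascii')
print(encoded_text)  # YWJj

# b64decode() accepts either the bytes or the ASCII str form.
assert base64.b64decode(encoded_bytes) == raw
assert base64.b64decode(encoded_text) == raw
```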

