The Basics of Public Key Cryptography

July 22nd, 2021

cryptography

Every transaction sent on the Ethereum network must be signed with a private key. Regardless of the type of wallet you use (e.g., hardware wallet, secret recovery phrase, keystore file), the transaction is always signed using a private key, which is derived from your hardware wallet, secret recovery phrase, or keystore file (unless you’re already using a plain private key).

In addition to a private key, you also need a public key, since Ethereum uses asymmetric cryptography (also called public-key cryptography). This is a type of cryptography where you have a pair of keys: one private and one public. However, when you want to send funds to someone, you need their address, which is different from both the public key and the private key. In this article, I will explain the difference between the private key, public key, and address, and how they all tie together.

The different types of cryptography

Let's start with the basics of cryptography. There are essentially three different types:

Symmetric cryptography (or secret key cryptography)
Asymmetric cryptography (or public key cryptography)
Cryptographic hash functions (We won't go over this type of cryptography in detail in this article, but if you're interested, I recommend reading the Wikipedia page.)

With symmetric cryptography, you only have one (secret) key. You can use this key to encrypt data, and use the same key to then decrypt that data. To share data with someone else, both you and the recipient must have access to the same secret key. An example of a use case for symmetric cryptography is protecting your sensitive data, e.g., by encrypting your computer's drive with BitLocker. It's important to keep this secret key a secret (as implied by the name), or anyone else who knows it would be able to access your sensitive data.

Symmetric encryption: Encrypting and decrypting data using a single secret key.
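
To make the "one key for both directions" idea concrete, here is a minimal toy sketch. It uses a one-time pad (XOR with a random key as long as the message) purely because it fits in a few lines of standard-library Python. This is my own illustration, not what BitLocker does (BitLocker uses AES), and the names `xor_bytes`, `message`, and `secret_key` are made up for this example.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the matching byte of key."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"my sensitive data"

# The single shared secret. For a one-time pad it must be random,
# as long as the message, and never reused.
secret_key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, secret_key)    # encrypt
recovered = xor_bytes(ciphertext, secret_key)  # decrypt with the SAME key
```

Anyone who obtains `secret_key` can run the second call just as easily as you can, which is exactly why the key has to stay secret.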

In contrast to symmetric cryptography, asymmetric cryptography involves a key pair: a private key and a public key. Data is encrypted using the public key and can only be decrypted with its corresponding private key. In other words, a sender only needs the public key of the receiver, and the receiver can decrypt the data using their private key. You only need to keep the private key a secret, and it's fine to share the public key with others.

Asymmetric encryption: Encrypting and decrypting data with a separate public and private key.
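
As a rough illustration of encrypting with one key and decrypting with the other, here is a toy sketch using textbook RSA with tiny primes. Ethereum itself does not use RSA, and numbers this small offer no security at all; I picked it only because the math fits in a few lines. The primes, exponent, and function names are assumptions chosen for this example.

```python
# Toy key pair built from two tiny primes (insecure on purpose).
p, q = 61, 53
n = p * q                           # modulus, shared by both keys
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (needs Python 3.8+)

public_key = (e, n)    # safe to share with anyone
private_key = (d, n)   # must be kept secret

def encrypt(message: int, key: tuple) -> int:
    """Encrypt a number smaller than n with the PUBLIC key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(ciphertext: int, key: tuple) -> int:
    """Decrypt with the matching PRIVATE key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)

ciphertext = encrypt(65, public_key)
plaintext = decrypt(ciphertext, private_key)
```

The sender only ever touches `public_key`; only the holder of `private_key` can turn `ciphertext` back into the original number.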

Asymmetric cryptography can be used, for example, to share sensitive information between two parties. In fact, the website you're using to read this article uses asymmetric cryptography: HTTPS relies on it to authenticate the server and to agree on a key for the rest of the connection. In Ethereum, it's more commonly used for signing data, rather than encrypting it. This makes it possible for someone to verify that you own a private key without exposing the private key itself.

Asymmetric cryptography and Ethereum

As briefly explained above, asymmetric cryptography is used in Ethereum for things like sending transactions and signing messages. When you send a transaction to the network, you clearly don't want to include your private key in the transaction. So, instead, the transaction includes proof that you have access to the private key: a signature. This lets the network verify that the transaction (or a specific message) was authorized by the holder of the private key, without exposing the private key itself.

Ethereum uses elliptic curve cryptography (ECC), specifically the Elliptic Curve Digital Signature Algorithm (ECDSA) on the secp256k1 elliptic curve. ECDSA is an algorithm made for signing and verifying data, and it also allows the public key to be recovered from a signature.

When you want to send a transaction, the transaction is first serialized and hashed, and that hash is signed, resulting in a signature {r, s, v}. This signature is added to the transaction, which can then be broadcast to the network. Transactions do not include the address they were sent from. Instead, the public key is recovered from the signature using ECDSA, and the sender's address is derived from it. I recommend this article for a more detailed guide on how ECDSA and signatures work.

Asymmetric cryptography in Ethereum: A transaction is signed with a private key, and the public key is recovered from the signature (simplified).

This way, we can safely send transactions (or sign messages) on Ethereum without exposing our private key.
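
The sign-then-recover flow described above can be sketched in plain Python. Treat this as a simplified illustration under a few assumptions of mine: the message hash is a SHA-256 stand-in (Ethereum hashes the serialized transaction with Keccak-256), `v` is just the parity bit 0 or 1 (Ethereum encodes it differently on the wire), and the nonce `k` is random (real wallets typically derive it deterministically). The helper names are made up; the curve constants are the published secp256k1 parameters.

```python
import hashlib
import secrets

# secp256k1 parameters: field prime, group order, and base point G.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        m = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (m * m - x1 - x2) % P
    return (x3, (m * (x1 - x3) - y1) % P)

def mul(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def sign(z, d):
    """Sign the integer hash z with private key d, returning (r, s, v)."""
    while True:
        k = secrets.randbelow(N - 1) + 1      # one-time secret nonce
        rx, ry = mul(k, G)
        if rx >= N:                           # extremely rare; retry to keep v to one bit
            continue
        r = rx
        s = pow(k, -1, N) * (z + r * d) % N
        if s == 0:
            continue
        v = ry & 1                            # parity of R's y coordinate
        if s > N // 2:                        # Ethereum only accepts the "low s" form
            s, v = N - s, v ^ 1
        return r, s, v

def recover(z, r, s, v):
    """Recover the signer's public key point from the hash and signature."""
    y = pow((r**3 + 7) % P, (P + 1) // 4, P)  # square root, valid because P % 4 == 3
    if y & 1 != v:
        y = P - y
    r_inv = pow(r, -1, N)
    # Q = r^-1 * (s*R - z*G)
    return add(mul(s * r_inv % N, (r, y)), mul(-z * r_inv % N, G))

private_key = secrets.randbelow(N - 1) + 1
# Stand-in digest for this sketch only.
z = int.from_bytes(hashlib.sha256(b"example transaction").digest(), "big")
r, s, v = sign(z, private_key)
recovered = recover(z, r, s, v)
```

Note that `recover` never sees `private_key`: the hash and the signature are enough to get back the public key point, which is the property Ethereum relies on.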

How keys are created

Private keys can either be derived from another source (like a secret recovery phrase or a hardware wallet) or randomly generated using a strong cryptographically secure pseudorandom number generator (CSPRNG). A private key is essentially a random 32-byte (256-bit) number. It does have a few requirements, however:

The private key cannot be 0.
The private key must be smaller than the order of the curve (n). This is the number of distinct points that can be reached by repeatedly adding the base point (G) to itself.

To generate a private key, you can simply take a random 32-byte number (generated by a strong CSPRNG), and check the requirements above. If the requirements are met, you now have a valid private key.
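
As a minimal sketch of that procedure, the snippet below draws 32 random bytes from Python's `secrets` module (a CSPRNG) and retries until both requirements hold. The function name `generate_private_key` is my own; the constant `N` is the published order of secp256k1.

```python
import secrets

# Order of the secp256k1 curve (n).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def generate_private_key() -> bytes:
    """Return 32 random bytes that form a valid secp256k1 private key."""
    while True:
        candidate = secrets.token_bytes(32)
        value = int.from_bytes(candidate, "big")
        # Requirement 1: not zero. Requirement 2: smaller than n.
        if 0 < value < N:
            return candidate

private_key = generate_private_key()
```

Because n is only slightly smaller than 2^256, the loop almost never needs a second attempt. Don't print or log a key generated this way if you intend to use it.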

Getting the public key involves elliptic curve mathematics, based on the private key. The public key isn't randomly generated, but rather calculated by "multiplying" the private key with the base point (G) on the elliptic curve, which means adding G to itself that many times. This results in a new point on the elliptic curve, which is the public key. This multiplication is a one-way operation: going forward is fast, but there is no known feasible way to calculate the private key from the public key.
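
Here is a small, unoptimized sketch of that multiplication using double-and-add. It is meant to show the idea rather than to be used with real funds (production libraries use constant-time code). The helper names `add` and `mul` are mine, the constants are the published secp256k1 parameters, and the private key is the example key used later in this article.

```python
# secp256k1 parameters: field prime, group order, and base point G.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                       # a point plus its mirror image
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P      # tangent slope
    else:
        m = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P     # chord slope
    x3 = (m * m - x1 - x2) % P
    return (x3, (m * (x1 - x3) - y1) % P)

def mul(k, point):
    """Compute k * point by scanning the bits of k (double-and-add)."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

private_key = 0xeaf2c50dfd10524651e7e459c1286f0c2404eb0f34ffd2a1eb14373db49fceb6
public_point = mul(private_key, G)
x, y = public_point

# Uncompressed encoding: header byte 0x04, then x and y as 32 bytes each.
uncompressed = b"\x04" + x.to_bytes(32, "big") + y.to_bytes(32, "big")
print("0x" + uncompressed.hex())
```

The printed value can be compared with the uncompressed public key shown in the worked example further down.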

Public keys and addresses

A common misconception is that the public key is the same as the address in Ethereum. They are two different things, however, and the address is derived from the public key.

Public keys are either 65 bytes long for uncompressed public keys (0x04 || x || y), or 33 bytes long for compressed public keys (0x02 or 0x03 || x), where x and y are the 32-byte coordinates of the point on the elliptic curve. The difference is that the uncompressed public key includes the y value of the point, and the compressed public key does not. Using uncompressed public keys can speed up computation, at the cost of more storage. The first byte is the public key header, and it determines whether the key is compressed or uncompressed. For compressed public keys, the first byte also encodes the parity of y (0x02 for even, 0x03 for odd), since every x coordinate on the curve has two possible y values.
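
The two encodings can be converted into each other, which the following sketch tries to show. The function names `compress` and `decompress` are my own, and the decompression step assumes the curve equation of secp256k1 (y² = x³ + 7). The input is the example public key from this article.

```python
P = 2**256 - 2**32 - 977  # secp256k1 field prime

def compress(uncompressed: bytes) -> bytes:
    """Turn 0x04 || x || y into 0x02/0x03 || x."""
    if len(uncompressed) != 65 or uncompressed[0] != 0x04:
        raise ValueError("expected a 65-byte uncompressed public key")
    x, y = uncompressed[1:33], uncompressed[33:]
    header = b"\x03" if y[-1] & 1 else b"\x02"   # parity of y
    return header + x

def decompress(compressed: bytes) -> bytes:
    """Rebuild y from x and the parity stored in the header byte."""
    if len(compressed) != 33 or compressed[0] not in (0x02, 0x03):
        raise ValueError("expected a 33-byte compressed public key")
    x = int.from_bytes(compressed[1:], "big")
    # Square root modulo P; this shortcut works because P % 4 == 3.
    y = pow((x**3 + 7) % P, (P + 1) // 4, P)
    if y & 1 != compressed[0] & 1:
        y = P - y                                # pick the other root
    return b"\x04" + compressed[1:] + y.to_bytes(32, "big")

uncompressed_key = bytes.fromhex(
    "04b884d0c53b60fb8aafba20ca84870f20428082863f1d39a402c36c2de356cb0c"
    "6c0a582f54ee29911ca6f1823d34405623f4a7418db8ebb0203bc3acba08ba64"
)
compressed_key = compress(uncompressed_key)
```

Here the y coordinate ends in 0x64, which is even, so the compressed key starts with 0x02.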

To go from a public key to an address, we take the uncompressed public key, omit the first header byte (to get a 64-byte long public key consisting of x and y), and hash that using Keccak-256. Then, take the last 20 bytes of the hash, which is the address.

Given the following private key:

0xeaf2c50dfd10524651e7e459c1286f0c2404eb0f34ffd2a1eb14373db49fceb6

Using elliptic curve point multiplication we get the following (uncompressed) public key:

0x04b884d0c53b60fb8aafba20ca84870f20428082863f1d39a402c36c2de356cb0c6c0a582f54ee29911ca6f1823d34405623f4a7418db8ebb0203bc3acba08ba64

Then we hash the public key, without its leading 0x04 header byte, with Keccak-256, which results in:

0xf0d03901469804f101fd1c62c6d5a3c98ec9073b54fa0969957bd582e8d874bf

Finally, we take the last 20 bytes (40 characters), which results in the following address:

0xc6D5a3c98EC9073B54FA0969957Bd582e8D874bf

Addresses are shorter than public keys, while still providing sufficient uniqueness and security for sending transactions on Ethereum.
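
The steps of the worked example above can be sketched in code. One caveat: Python's built-in `hashlib.sha3_256` is the standardized SHA-3, which pads its input differently from the original Keccak-256 that Ethereum uses, so the two give different results. To stay within the standard library, the sketch below includes a small, slow Keccak-256 of its own, adapted from the structure of the Keccak team's compact reference code. The function names are mine, and for real use I'd reach for a well-tested library instead.

```python
MASK = (1 << 64) - 1

def rol64(a: int, n: int) -> int:
    """Rotate a 64-bit lane left by n bits."""
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK

def keccak_f1600(lanes):
    """The 24-round Keccak-f[1600] permutation on a 5x5 grid of lanes."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column with its neighbours
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate lanes and move them around the grid
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: the non-linear step, applied row by row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: add a round constant generated by a small LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes

def keccak256(data: bytes) -> bytes:
    """Original Keccak-256: rate 136 bytes, padding byte 0x01 (SHA-3 uses 0x06)."""
    rate = 136
    padded = bytearray(data)
    pad_len = rate - (len(data) % rate)
    padded += b"\x01" + b"\x00" * (pad_len - 1)
    padded[-1] |= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        block = padded[offset:offset + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = keccak_f1600(lanes)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))

def public_key_to_address(public_key: bytes) -> str:
    """Derive a lowercase hex address from a 65-byte uncompressed public key."""
    if len(public_key) != 65 or public_key[0] != 0x04:
        raise ValueError("expected a 65-byte uncompressed public key")
    digest = keccak256(public_key[1:])   # hash x || y, without the header byte
    return "0x" + digest[-20:].hex()     # keep only the last 20 bytes

public_key = bytes.fromhex(
    "04b884d0c53b60fb8aafba20ca84870f20428082863f1d39a402c36c2de356cb0c"
    "6c0a582f54ee29911ca6f1823d34405623f4a7418db8ebb0203bc3acba08ba64"
)
address = public_key_to_address(public_key)
print(address)
```

The sketch prints the address in lowercase, so compare it with the address above without regard to letter case.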

Private keys versus hardware wallets, secret recovery phrases, and keystore files

When using something like a hardware wallet, secret recovery phrase, or keystore file, transactions are still signed using a private key. You cannot sign a transaction using something like a secret recovery phrase directly. In these situations, the private key is derived from the hardware wallet, secret recovery phrase, or keystore file, which in itself involves a bunch of cryptographic functions.

This derivation step is always done under the hood when using something like MyCrypto or a hardware wallet. In the case of hardware wallets, derivation happens on the device itself, so the private key (or secret recovery phrase) never leaves the device. For secret recovery phrases, derivation is done locally by MyCrypto based on the derivation path you select. In the end, you'll always end up with a key pair consisting of a private and public key, with a corresponding address.
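
To give a feel for what "derivation" means for a secret recovery phrase, here is a sketch of the first two steps, assuming the wallet follows the BIP-39 and BIP-32 standards (this article doesn't name them, so treat that as my assumption). It stops at the master key; walking a full derivation path down to an account key takes further steps that are left out here. The phrase below is a widely published test phrase, so never use it for real funds.

```python
import hashlib
import hmac
import unicodedata

def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP-39: stretch the phrase into a 64-byte seed with PBKDF2-HMAC-SHA512."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, 64)

def seed_to_master_key(seed: bytes):
    """BIP-32: split HMAC-SHA512 of the seed into a master key and chain code."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]

# Publicly known test phrase, for illustration only.
phrase = ("abandon abandon abandon abandon abandon abandon "
          "abandon abandon abandon abandon abandon about")

seed = mnemonic_to_seed(phrase)
master_private_key, chain_code = seed_to_master_key(seed)
```

The master key and chain code are then combined with each step of the derivation path to arrive at the private key that actually signs your transactions.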

Conclusion

To send transactions on Ethereum, you need a key pair, consisting of a private key and a public key. Other methods of accessing an account, such as a hardware wallet, secret recovery phrase, or keystore file, can be used as well, but under the hood a private key (and with it a public key and address) is always derived to actually sign the transaction.

Using asymmetric cryptography (public-key cryptography), we can safely send transactions by including a signature, without exposing the private key itself. Transactions are verified by recovering the public key, using elliptic curve cryptography, which can then be further hashed to get an actual Ethereum address.

The major difference between symmetric cryptography and asymmetric cryptography is that with the former you only have one key: the secret key. While not (commonly) used for Ethereum, it has a lot of other possibilities, like protecting sensitive data or sharing data between people using the same secret key.
