You may well ask: How can Kong tell that two documents were signed using the same secret if it does not know that secret? And how can Kong encrypt a document so that only the possessor of a certain secret can decrypt it, if it does not know that secret?
The short answer is that the line in the signature that follows your name contains information about the secret, but is not itself the secret. It is generated from the secret using a one way function. That line is called the public key.
The two lines that look different in each signature must agree both with the document and with the public key. That is to say they should have certain mathematical properties that relate the public key to the particular document. Only a person who knows the secret key can create a signature that agrees both with the public key and with the document.
Thus if the signature agrees with the document, then the document has not been changed since the possessor of the secret corresponding to the public key signed it, and if two documents have the same public key and valid signatures, then they were signed using the same secret, and thus presumably signed by the same person.
We encrypt a document using the public key, and only the secret key can decrypt it.
Kong uses elliptic curves to sign and encrypt documents. Elliptic points can similarly be used to provide anonymous internet money.
One can construct a third point on an elliptic curve from any two other points. This operation is commutative and associative, so we will call it addition, even though it is not addition, and we will use the plus sign for it. (In fact this operation is not mere addition, it involves several multiplications and additions modulo various quantities that thoroughly scramble the bits of the points together.)
If P, Q, and R, are points on an elliptic curve, then P+Q = Q+P, and (P+Q) + R = P + (Q+R)
What mathematicians call an elliptic curve, is not part of an ellipse, and is usually not a curve. We are using discrete elliptic curves, which are like modular integer arithmetic.
Because of associativity, we can efficiently add a point to itself a very large number of times. This means we can efficiently multiply elliptic curve points by ordinary numbers. For example 40*P can be efficiently calculated as follows
2P = P + P
4P = 2P + 2P
8P = 4P + 4P
16P = 8P + 8P
32P = 16P + 16P
40P = 32P + 8P
Which requires only six additions, not 40 additions. This gets more interesting if we multiply by enormous numbers.
For example if we multiply by a trillion, we only need about sixty additions, not a trillion additions. So if we use very large numbers, much much larger than a trillion, multiplication is a one way operation.
If P is an elliptic curve point, and n is a large integer, and Q = n*P, then the attacker who is given P and Q cannot discover n. The order of the curve used by Crypto Kong is slightly more than 2240, thus to discover n the attacker would need to perform about 2120 additions of elliptic curve points, which is around one billion billion billion billion additions.
So for our crypto system, we choose a certain elliptic curve, and a certain point on that elliptic curve, which we will call the generator. All the points that we will use will created by adding the generator to itself many times.
To generate your secret key, your computer hashes your passphrase, your secret file, and the name, to generate a big number, a two hundred and forty bit number. That is a number somewhere around 1000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000. It does this every time that you need the secret key.
So the secret generated from your secret key is really a very big number.
The public key, which appears in your signature, is an elliptic point, the generator multiplied by that number. This point is represented by the x coordinate of the elliptic point, a 255 bit number, plus a sign bit, represented in base 64 format. Because division is hard, the adversary cannot discover the secret key from the public key.
For compactness, these very large numbers are represented in base 64, instead of represented in base ten. The digits 0 to 63 are represented as: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/
High order digits are first, thus 10(base 64) = 64(base 10) 100 (base 64) = 4096 (base 10) This means that order in the text is the reverse of the order in RAM.
The security of the system derives from the fact that to calculate the public key from the secret number merely takes a few hundred elliptic curve additions, but to do the reverse operation, to discover the number from the public key, would require billions and billions of elliptic curve additions, and elliptic curve additions are a rather slow operation. It would require roughly 1000 000 000 000 000 000 000 000 000 000 000 000 operations, which is well beyond the power of even a major government on a matter vital to national security.
The problem of solving the reverse problem is called the discrete logarithm problem by mathematicians, even though it does not involve what ordinary people think of as logarithms. Many mathematicians have studied this problem, and not found any efficient solution. Certicom has offered a hundred thousand dollar prize for a mathematician who finds a more efficient solution to the discrete logarithm problem for elliptic curves.
This is the one way operation, which makes it possible for Crypto Kong to tell that two documents were signed using the same secret when it does not know that secret and to encrypt a document so that only the possessor of a certain secret can decrypt it, when it does not know that secret.
So all our long integer arithmetic is done modulo the order of the generator.
Let any lower case letter represent these large integers.
Let any upper case letter represent a point on our elliptic curve.
To sign data, one generates a one way hash of it using SHA
Let int(P) denote an integer generated from an elliptic curve point.
Let h be the hash code of the data to be signed
Let Ann's secret key be a, and her public key be A = a*G, where G is the generator.
Ann's secret key is a very large number generated by hashing her name, her secret passphrase, and her secret file. The secret file is treated as a continuation of her passphrase.
Ann's computer chooses a random number k, and computes r = int(k*G)+h and s = k - a*r where h is a hash of the data being signed.
The signature is the is Ann's name, followed by Ann's public key A = a*G, followed by r, followed by s+(hash of Ann's name).
Those are the four lines you see in the signature.
Name Public Key r s+(hash of name) |
Ann 8UnJhnG3uIciSsxf598ed1X+hu1dT7mF5VdligtNw4u MkpOPpUnnf5q9OdFGroBARk/tbf3Q2MUVnBE2UZt 769rwaBGXuGoufSTTY4EvJTe4uLu4QrNwn7s/mpj |
h= r - int(s*G + r*A)
This equation must be true because of associativity s*G+r*A = (k-a*r)G +r*a*G = k*G , but no one can discover values for r and s such that it will be true, unless they know the secret key a
We can calculate k*A, because we know k, and everyone knows A, and she can calculate a*K, because she alone knows a, and we gave her K in the clear.
We use the shared secret to encrypt the message using Arc4.
Arc4 (RC4) is a very simple, very well known, very well studied symmetric key algorithm, produced by a great cryptographer. It has certain well known and well understood weaknesses, none of which are relevant because we are using it with 256 bit keys and using it to send signed messages. We use the shared secret a*K=k*A as the symmetric key.
Our encrypted message looks like this:
--Key A K --Cryptotext |
--Key 8UnJhnG3uIciSsxf598ed1X+hu1dT7mF5VdligtNw4u F/bObzQg/8mv2M+2ZtPTbxSpX4KOdC0/l2b/3JsObsX --Cryptotext 0020nCapmnBpzhmvUOQDgnsqOnPYRiEeVYOy+N1EtxHErtiXYA4Wx1j4J8MHjXH6rdyJs+YO g+WMvam4v0k85EjYdMLHrk3vl+UsgdP2XUqAdCepONJVmnh2UaBnrU3Ka5QKzyiCPx3jJUFs 7NL9l5PUMRRT96YC4hntjOX8Pa9ceuOus5ftAN7OHbqUpGluqwfj/kYZCZ9x8s9wvZOoWs3X E6mzivIc70wTwfr2VkidMV6V/LARTYAIGBJgsLVOJSzhO3x+QmJQTF3Rob6n |
Encryption for multiple recipients is similar,
though slightly more complicated.