Authentication

Authentication is the act of confirming the truth of an attribute of a single piece of data (a datum) claimed true by an entity. (from Wikipedia)

There are basically three methods for authenticating a person:

Information
- Something only the user knows, e.g. a password
An item
- Something only this user owns, e.g. a smart card
Biometric data
- Something the user is, e.g. fingerprints

Single factor authentication

Usually only one method from the above list is used to authenticate the user. The easiest is to verify the user's knowledge, e.g. by asking for a password, a PIN code or a pattern.

Examples:

Using a password to log into web mail or a study information system
Opening the Android screen lock with a pattern
Logging into an online banking site using a password and a reusable code card. As this is considered insecure, low (daily) transaction limits are enforced when using single factor authentication.

Authenticating with passwords

Secure passwords are long and randomly generated. Moreover, a unique password should be used for every service, otherwise one leaked password may mean losing access to many or all of the used services. However, it is hard for people to memorize long and random passwords.

Source: XKCD

To illustrate the problem, take a look at a web site that predicts how long it takes for one PC to crack a given password: https://howsecureismypassword.net/. Do not enter your real password on that site!
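
The kind of estimate such a site makes can be sketched roughly as follows. The guessing rate below is an assumed round number, not a measurement, and the function names are made up for this illustration.

```python
# Assumed attacker speed in guesses per second (illustrative only).
GUESSES_PER_SECOND = 10**10

def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly this length."""
    return alphabet_size ** length

def worst_case_seconds(alphabet_size: int, length: int) -> float:
    """Time needed to try every candidate at the assumed rate."""
    return keyspace(alphabet_size, length) / GUESSES_PER_SECOND

# 8 lowercase letters versus 16 characters drawn from letters and digits.
short_pw = worst_case_seconds(26, 8)
long_pw = worst_case_seconds(62, 16)
print(f"8 lowercase letters: {short_pw:.1f} seconds")
print(f"16 letters/digits:   {long_pw / (3600 * 24 * 365):.2e} years")
```

Each extra character multiplies the work by the alphabet size, which is why length matters so much.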

Next, we give a brief overview of how a web service (and many other services) uses passwords for user authentication. It is clear that the service needs to know something about a user's password in order to verify it. However, a service should not store users' passwords in plain text, as a break-in would leak all users' passwords. Instead, a service should store a hash of the password rather than the password itself.

A hash function (in Estonian: räsifunktsioon) takes an input of arbitrary size and outputs a bitstring of fixed length. A hash function is deterministic, but its output does not reveal any information about the input.

For example, this is the SHA-256 hash value of the text "test":
sha256(“test”) = 9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08
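
The value above can be reproduced with Python's standard hashlib module. This is a minimal sketch; the helper name sha256_hex is chosen here for illustration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

digest = sha256_hex("test")
print(digest)
# The output length is fixed (64 hex digits = 256 bits) regardless of input size.
print(len(sha256_hex("a" * 100000)) * 4, "bits")
```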

Service providers store hashes of passwords instead of the passwords themselves. If a user wants to authenticate to a service, he sends his password to the service as usual. The service then hashes the password and compares it to the hash in its database. If they match, then the user is successfully authenticated.

However, now we have a new problem: a hash function is deterministic, so instead of trying to crack each hash, an attacker can precompute hashes of common words. For example, an attacker could hash each word in the English dictionary together with some of the most common combinations and store the result as a (hash -> word) lookup table. Now if a database with password hashes leaks from a service, the attacker can just look through the leaked database to see if any of the password hashes are present in his lookup table. Hence, storing hashes of passwords instead of passwords themselves does not help if the password is a real word or some simple combination of words and numbers.
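
A minimal sketch of such a lookup table, assuming unsalted SHA-256 hashes. The five-word list is a hypothetical stand-in for a real dictionary.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical tiny word list standing in for a full dictionary.
WORDLIST = ["password", "letmein", "test", "dragon", "sunshine"]

# Built once, before any leak happens: hash -> word.
LOOKUP = {sha256_hex(word): word for word in WORDLIST}

def crack(leaked_hashes):
    """Look each leaked hash up in the table; None means not found."""
    return [LOOKUP.get(h) for h in leaked_hashes]

leaked = [sha256_hex("test"), sha256_hex("x9$Lq2!vRw")]
print(crack(leaked))  # ['test', None]
```

The dictionary word is recovered with a single lookup, while the random-looking password is not in the table.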

To solve this problem, service providers should randomize the password hashes. This is accomplished by adding some random data, called a salt, to the password before hashing it. Now, with high probability, all hashes are unique, even if some users use the same password. To precompute a lookup table, an attacker now also has to take this randomness into account. If, for example, a 64-bit salt is used, then there are 2^64 (2 to the power of 64) different salts and the attacker would have to compute a different lookup table for each possible salt value. This is practically impossible in terms of both computational power and storage.

An example of adding a salt to the hashing of "test":
sha2(“test+j2Bl”)=4cb0ccd18a4f985823c5640e97103b6c7ee23d175cffc01691baeb006773c365

Where to store the salt?

Even though this might seem insecure, the salt itself can be stored in plain text together with the salted hash value of the password. If this database leaks, then an attacker would of course learn the salt values and can eventually crack some passwords if they use common words from dictionaries. However, the attacker can only start guessing after seeing the leaked database; there is no way to precompute anything useful.
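
Salting and storing the salt next to the hash could look roughly like this. It is a sketch only: the in-memory dict stands in for a user database, and prepending the salt to the password is one assumed convention among several.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str, salt: bytes) -> str:
    # Assumption: the salt is simply prepended to the password bytes.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def register(db: dict, user: str, password: str) -> None:
    salt = secrets.token_bytes(8)  # 64-bit random salt
    db[user] = (salt, hash_password(password, salt))  # salt kept in plain text

def verify(db: dict, user: str, password: str) -> bool:
    if user not in db:
        return False
    salt, stored = db[user]
    return hmac.compare_digest(stored, hash_password(password, salt))

db = {}
register(db, "alice", "test")
register(db, "bob", "test")  # same password, different salt
print(db["alice"][1] == db["bob"][1])
print(verify(db, "alice", "test"), verify(db, "alice", "wrong"))
```

Both users chose the same password, yet their stored hashes differ (except with negligible probability), so one precomputed table cannot serve for both.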

Example: Let's see how easy it is to guess a simple password by seeing just the hash. For example, if the password is only 4 characters long, it is enough to hash all 4-character passwords. Note that in reality the attacker would not know the length of the password based on the hash value.
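
A small sketch of this guessing attack, assuming the password consists of four lowercase letters and was hashed with unsalted SHA-256 (both are assumptions made to keep the search small).

```python
import hashlib
import itertools
import string

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def brute_force(target_hash: str, length: int = 4,
                alphabet: str = string.ascii_lowercase):
    """Hash every candidate of the given length until one matches."""
    for candidate in itertools.product(alphabet, repeat=length):
        guess = "".join(candidate)
        if sha256_hex(guess) == target_hash:
            return guess
    return None  # password is outside the searched space

target = sha256_hex("test")
print(brute_force(target))  # test
```

With 26 letters there are only 26^4 = 456,976 candidates, which an ordinary computer goes through very quickly.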

Service providers make cracking the leaked password even harder by using a special password-based key derivation function instead of a hash function. The main feature of such functions is that they hash the passwords not once, but for example 10,000 times, making it even more expensive for attackers to guess the password. Examples of such functions are Bcrypt, PBKDF2, scrypt, etc.
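
As an illustration, PBKDF2 is available in Python's standard library as hashlib.pbkdf2_hmac. The sketch below uses the 10,000 iterations mentioned above; the salt length and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 10_000  # the example figure from the text

def derive(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, ITERATIONS)

salt = secrets.token_bytes(16)
stored = derive("correct horse", salt)

# Verification repeats the same (deliberately slow) computation.
print(hmac.compare_digest(stored, derive("correct horse", salt)))
print(hmac.compare_digest(stored, derive("wrong guess", salt)))
```

Every guess now costs the attacker 10,000 hash computations instead of one, while a legitimate login barely notices the delay.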

Examples of how not to store users' passwords

Many service providers store users' passwords insecurely, e.g. in plain text or hashed without a salt. Some service providers do something even more interesting, for example encrypting the passwords. Encrypting passwords is a bad idea, as they can be decrypted if the decryption key leaks.

In August 2014, a user database of an Estonian web shop http://seemnemaailm.ee/ was leaked. The database contained plain text passwords of more than 10,000 users. A local newspaper also covered the story: Postimees: "Veebikeskkond jättis tuhanded kliendid andmelekkest teavitamata" (in Estonian). Interestingly, the newspaper story mentions encrypting the passwords as a possible solution, but this is actually the wrong way to protect passwords.
In autumn 2013, the user database of Adobe leaked, containing information about 150 million users. Interestingly, Adobe had decided to encrypt the users' passwords instead of hashing them. This by itself is a bad idea, as encrypted values can be decrypted, while hashing is a one-way operation. Moreover, they used an insecure solution for encryption (ECB operation mode), which meant that identical passwords gave identical ciphertexts. On top of that, the database contained password hints, making this a large crossword puzzle:

Source: XKCD

Problems with passwords

As a user of a service, you have no control over how your password is stored by the service provider. The only thing a user can do is to use a unique and secure password for each service.

How to memorize and use a secure password? Secure (long and random) passwords are by definition hard to remember. We will talk about password management in the lab session.

Some ideas:

Memorize the passwords
Write the passwords down on paper
Store passwords in a text document
Use special password management software

Usernames and passwords are easy to copy and distribute. Secure passwords are hard to memorize and short passwords are easy to crack. Moreover, passwords can be easily stolen by using a keylogger, i.e. malware that records the user's key presses and sends them to an attacker.

Example attack: A web browser is infected with malware that intercepts the key presses and sends them to an attacker.

Additional materials

An overview of user authentication is given by the MIT course 6.858 Computer Systems Security. The following video lecture is made available through the MIT OpenCourseWare program. Watching the video is not compulsory; it is extra material for students who would like more information about user authentication.

Two-factor authentication

To make authentication secure, something more than just the user's knowledge has to be used: either something that the user has or something that the user is. Therefore, an additional authentication factor is required. The additional factor could be a device that is owned by the user or something that is directly connected with the user, e.g., biometric properties.

In two-factor authentication, two items from the following list are verified:

user's knowledge
something that the user possesses
biometric data

Authenticating with a smart card

Using a smart card (e.g. the Estonian ID-card) is two-factor authentication, as it combines something physical that the user has (the card) with something that the user knows (the PIN code). It is important that the private keys stored on the card cannot be copied, so they are strictly tied to the physical object.

One-time passwords

Ordinary passwords may leak and they can also be easily copied and distributed. Therefore, some systems are designed to use one-time passwords. One-time passwords are passwords that are used only once, and they can be used only if the client and the server are synchronized (we don't cover those algorithms in this course). More information about one-time passwords can be found at: https://en.wikipedia.org/wiki/One-time_password.

Now, one might have a question about the distribution of such passwords. How could the server secretly share the one-time passwords with the client? Actually, there are many options for doing this:

Delivered on paper - It is possible that the one-time passwords are sent to the client via post or delivery service. E.g., Nordea bank uses one-time passwords which are sent to the clients by using the postal service. It is important to note that the new set of passwords can be used only after they have been activated. This protects the passwords from being read by a third party. If the client notices that the special envelope is damaged then he / she should not activate the new set of one-time passwords.
Sent via SMS - It is a good option as almost everyone has a mobile device. The problem with this approach comes from the weakness of the encryption that is used to protect the SMS messages. In addition, when roaming is used then the client has to trust the mobile service providers. This is also the reason why NIST is deprecating the use of SMS for two-factor authentication in their latest draft of Digital Authentication Guideline:

Note: Out-of-band authentication using the PSTN (SMS or voice) is deprecated, and is being considered for removal in future editions of this guideline.

Delivered on a device - It is possible that a pre-synchronized algorithm is built into the device. E.g., PIN-calculators are synced in a way which allows them to generate valid one-time passwords. For some services there are also smartphone apps that generate one-time passwords. The following subsection will focus on PIN-calculators and on authentication with mobile devices.

Authentication with the help of a mobile device or some other device

A mobile phone fits very well into a two-factor authentication system, as it is not directly connected to the computer and has a separate communication channel (GSM, 3G, 4G). Therefore, to attack a two-factor authentication system that uses the mobile device as a second factor, one would also have to access or infect the mobile device. Infecting both the computer and the mobile device is much more difficult for the attacker and thus also much more expensive. The infection itself is feasible, as mobile devices, especially Android smartphones, have lots of vulnerabilities; the hard part is delivering the malware. How can the attacker infect the secondary device? In some cases the device might be connected to the computer, but this may not happen frequently and may not be enough to infect the device. We can therefore generalize: it is possible to target and infect the mobile device of an individual who uses it for two-factor authentication, but targeting and infecting a large part of the population is probably too expensive. Hence, using two-factor authentication should significantly increase your security (as long as no one is specifically targeting you).

Examples of the second authentication factor:

PIN-calculators used in banks
Google's two-factor authentication
Facebook's two-factor authentication

PIN-calculator

A PIN-calculator is a device that generates a pseudorandom number which can be used for logging in to the online bank or for confirming online bank transactions. It is important to understand how such devices work, i.e., why they are claimed to be secure. These devices have been synchronized with the bank; more specifically, the initial seed that the pseudorandom number generator inside the PIN-calculator uses is shared with the bank. Thus, the security of such devices depends on the specific pseudorandom number generator. In addition, it is important that the initial seed is random and does not leak. The device itself does not have an internet connection and it is also not connected to other devices.

So, how can the code be generated? One way is to use a time-based PIN-calculator, which uses the current time and the shared value to generate the codes. Another approach is to use a hash function, so that the synchronized value is hashed in some predetermined way. For example, the bank and the PIN-calculator could each keep a counter in addition to the shared secret value. Each time a code is generated, the device increases its counter, and each time a code is used correctly, the bank increases its own counter. The bank therefore has to accept a window of codes, in case a few generated codes were not entered. This keeps the PIN-calculator usable in such situations and lets the bank re-synchronize its counter as soon as a valid code arrives. The window of accepted codes should be quite small to prevent random guesses. Hash-based code generation could also work differently: the shared secret value could be hashed repeatedly to create a hash chain. The PIN-calculator would hash the secret value, then hash the hash of the secret value, etc. The first authentication would then use the last value in the hash chain, the next authentication the previous value, etc.
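
The counter-based variant with a window of accepted codes can be sketched as follows. The code computation follows the publicly specified HOTP construction (RFC 4226); that any particular bank's PIN-calculator works exactly this way is an assumption, and the Bank class, the window size and the demo secret (the RFC's test key) are illustrative.

```python
import hashlib
import hmac
import struct

def code_for(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP-style code: HMAC the counter, then truncate to a few digits."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10**digits).zfill(digits)

class Bank:
    """Hypothetical verifier keeping its own counter and a small window."""
    def __init__(self, secret: bytes, window: int = 3):
        self.secret = secret
        self.counter = 0      # next counter value the bank expects
        self.window = window  # how many codes ahead are still accepted

    def check(self, code: str) -> bool:
        for c in range(self.counter, self.counter + self.window):
            if hmac.compare_digest(code, code_for(self.secret, c)):
                self.counter = c + 1  # re-synchronize; earlier codes are dead
                return True
        return False

secret = b"12345678901234567890"  # test key from RFC 4226, demo only
bank = Bank(secret)

unused = code_for(secret, 0)  # generated on the device but never entered
used = code_for(secret, 1)    # the code the user actually types in
print(bank.check(used))       # True: inside the window, bank re-syncs
print(bank.check(used))       # False: the same code cannot be replayed
print(bank.check(unused))     # False: the skipped code is no longer valid
```

A larger window tolerates more skipped codes but also gives a guessing attacker more valid targets, which is why it should stay small.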

The pseudorandom code that is generated by the PIN-calculator can be used only once, i.e., it cannot be used to log in to the online bank twice. Therefore, even malware cannot really benefit from copying the code if the user has already used it. Malware is only able to use the code if it is faster than the user, i.e., if it can log in with the code before the user does.
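
The hash chain idea described earlier also illustrates why a used code is worthless. In the sketch below the server remembers only the hash of the code it expects next; this bookkeeping, the class name and the chain length are assumptions made for the example.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_chain(secret: bytes, n: int) -> list:
    """Return [H(secret), H(H(secret)), ...] with n entries."""
    chain, value = [], secret
    for _ in range(n):
        value = H(value)
        chain.append(value)
    return chain

class Server:
    """Hypothetical verifier storing the hash of the next expected code."""
    def __init__(self, anchor: bytes):
        self.expected = anchor

    def check(self, code: bytes) -> bool:
        if H(code) != self.expected:
            return False
        self.expected = code  # the next code must hash to this one
        return True

chain = build_chain(b"shared secret", 5)  # held by the device
server = Server(H(chain[-1]))

first = chain.pop()               # last value of the chain is used first
print(server.check(first))        # True
print(server.check(first))        # False: already used
print(server.check(chain.pop()))  # True: the previous value in the chain
```

Because a hash function is one-way, someone who observes a used code cannot compute the next one, which sits earlier in the chain.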

What happens if a stranger finds the PIN-calculator or if it gets stolen? If the device is protected with a strong PIN code, no one else can use the calculator, as it has to be unlocked before it can be used. The device allows only three wrong PIN entries before it gets blocked. Therefore, using a PIN-calculator is considered to be a quite secure two-factor authentication method, and the major banks in Estonia have not set daily transaction limits for clients who authenticate with a PIN-calculator.

Google two-factor authentication

Google's two-factor authentication asks the user to enter a verification code when the user is trying to log in from a new web browser (or a new computer). The verification code is delivered to the user's mobile phone via SMS or a special smartphone app, to verify that the user has access to his phone (or its SIM card). It is also possible to let Google call you and read out the verification code.

When using such a two-factor authentication mechanism, an attacker who has stolen your password (e.g. by using a keylogger) still cannot access your account, as he does not have access to your phone. More information about Google's two-factor authentication: http://www.google.com/landing/2step/.

After enabling Google's two-factor authentication, it is important to enable backup authentication methods in case something happens with your phone or its SIM card. For this you can write down the single-use backup codes that Google generates in order for you to access your account if your phone gets lost or stolen. These codes must be kept secret.

You will have to generate application-specific passwords for applications that use the Google account but do not support two-factor authentication. Such applications may include, for example, your desktop mail client. Application-specific passwords can be managed from: "Account -> Security -> 2-step verification -> Manage your application specific passwords".

Facebook two-factor authentication

Facebook's two-factor authentication works similarly to Google's by requiring the user to enter a verification code sent to the user's phone via SMS. This extra verification step is required only when logging in from a new web browser or computer.

In addition to Google and Facebook, many other service providers also support two-factor authentication, e.g. Microsoft, Apple, Twitter and Wordpress.

Biometric authentication

Biometric authentication methods include authentication using fingerprints, retina or speech recognition.

Fingerprint scanner on a laptop.

Using biometric data for authentication by itself is not considered to be multi-factor authentication and hence it should be used together with other authentication means, e.g. passwords or physical devices.

Authenticating via third party

We already mentioned that in addition to using secure, randomly generated passwords, it is important to use a different password for each service. At the same time, many passwords are difficult to remember. Using a password manager is one solution, but there are also other options. For example, some web services allow users to authenticate themselves using an account at some other service provider. In this case, users do not have to create another user name and password for this web page.

OpenID

OpenID is one of the first solutions that allow users to authenticate themselves to a web service using some third party. In addition to the user and the web service, there is a third party - the OpenID service provider - that is trusted by the web service requiring authentication. The authentication process itself is as follows:

The user goes to the web service that he wants to log into and enters his OpenID handle (usually an e-mail address or a domain name)
The web page redirects the user to the log-in page of the chosen OpenID service provider where the user has to authenticate himself to this OpenID service provider. If the user is already authenticated at the OpenID service provider, this step is done automatically.
OpenID service provider asks the user if he wants to authenticate to the given web page and if he agrees that the OpenID service provider shares some information about the user with the web page. The information that has to be shared depends on the web page, but usually it is an e-mail address, full name and/or profile picture, etc.
If the user accepts, then the OpenID service provider contacts the web page directly, shares the requested information and confirms that the user is successfully authenticated at the OpenID service provider.
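
The steps above can be mimicked with a toy in-memory simulation. Nothing here is the real OpenID wire protocol: the class names, the example handle and the use of an HMAC over a shared key (standing in for whatever lets the web page trust the provider's confirmation) are all assumptions, and the user is assumed to be already logged in at the provider.

```python
import hashlib
import hmac
import json
import secrets

class Provider:
    """Stands in for the OpenID service provider (hypothetical)."""
    def __init__(self):
        self.sessions = {"alice@example.org"}  # users already logged in here
        self.profiles = {"alice@example.org":
                         {"email": "alice@example.org", "name": "Alice"}}
        self.keys = {}  # web service name -> shared signing key

    def associate(self, service: str) -> bytes:
        self.keys[service] = secrets.token_bytes(32)
        return self.keys[service]

    def confirm(self, service, handle, fields, consent):
        # Steps 2-3: the user must be authenticated and must agree to share.
        if handle not in self.sessions or not consent:
            return None
        # Step 4: pass on the requested information plus a confirmation.
        claims = {"handle": handle, "audience": service}
        claims.update({f: self.profiles[handle][f] for f in fields})
        payload = json.dumps(claims, sort_keys=True)
        tag = hmac.new(self.keys[service], payload.encode("utf-8"),
                       hashlib.sha256).hexdigest()
        return payload, tag

class WebService:
    """The web page the user wants to log into (hypothetical)."""
    def __init__(self, name: str, provider: Provider):
        self.name = name
        self.key = provider.associate(name)

    def accept(self, assertion):
        if assertion is None:
            return None
        payload, tag = assertion
        good = hmac.new(self.key, payload.encode("utf-8"),
                        hashlib.sha256).hexdigest()
        if not hmac.compare_digest(tag, good):
            return None  # confirmation did not come from the provider
        claims = json.loads(payload)
        return claims if claims["audience"] == self.name else None

provider = Provider()
shop = WebService("shop.example", provider)
# Step 1: the user enters the handle at the shop; then the provider confirms.
result = provider.confirm("shop.example", "alice@example.org", ["email"], True)
print(shop.accept(result))
```

Note that the web page never sees the user's password; it only receives the provider's confirmation and the fields the user agreed to share.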

To use OpenID, the following conditions have to be met:

the web page where the user wants to log into must support at least one OpenID provider;
the user must have an account at one of the supported OpenID service providers.

OpenID service providers have included, among others, Google, Yahoo!, Facebook, AOL and PayPal. Additionally, it is possible to use a personal homepage domain as an OpenID handle (that redirects to some other supported OpenID service provider). All connections between the parties should be encrypted. Lately, the number of OpenID providers has decreased, as service providers switch to the more flexible OAuth/OpenID Connect framework.

In Estonia, there is an OpenID service provider OpenID.ee that allows users to authenticate themselves with an ID-card or mobile-ID. Web pages that trust this OpenID service provider get a more strict (cryptographically secure) guarantee that the provided information (full name and personal identification code) is real.

OAuth

OAuth is a similar framework that allows a user to authorize a service to act on his behalf at a third party - the OAuth service provider. As such, OAuth is actually not meant for authentication (identification) but rather for authorization (giving permissions). To also enable authentication, a separate layer, OpenID Connect, is built on top of OAuth 2.0.

As OAuth is meant for authorization, its usage differs from that of OpenID, although for the end user it seems quite the same. In the case of OpenID, a user asks the OpenID service provider to confirm his identity to the web page. In OAuth, the user authorizes the web page to request information from the OAuth service provider. The difference is that, in addition to allowing requests about the user's identity, the user may also authorize the web page to access other resources from the OAuth service provider on his behalf. These resources can be contacts, calendars, photos, or whatever else the OAuth service provider stores about the user. The exact list of shared resources is confirmed during the authorization process and the user can accept or reject it.

OAuth/OpenID Connect service providers are services that have other resources in addition to the user's identity to share, e.g. photos, contact list, etc. Some providers are Google, Facebook, Flickr. See Wikipedia for a partial list.

Why use OpenID or OAuth/OpenID Connect?

For a web page, authenticating its users with a third party means that the web page does not have to store a separate user database. Hence, no password hashes can be leaked, as the web page simply does not have them. Moreover, OpenID and OAuth service providers are recognized web services that have more knowledge and resources to protect their users' information. A small company behind a web page may not have that.

For a user, OpenID and OAuth provide a way to memorize (or manage) fewer passwords and identities. However, it must be noted that both the web page and the user have to trust the identity service provider. The web page has to trust that the identity provider does not lie about the users' identities, and the user has to trust the identity provider not to misuse his identity.

Infoturve (Information Security), autumn 2018/19