UPDATED September 25, 2023
Originally built in the United States, the Internet used the American Standard Code for Information Interchange (ASCII) to encode the Latin-alphabet characters of English for communication between computers. You see, computers don’t understand letters or languages, only ones and zeros. But many of the world’s languages use characters that are not found in the Latin alphabet. The international community saw this as a problem and, since the early nineties, has created and adopted the standard known as Unicode so that characters from different writing systems can be handled and rendered on the web.
The Domain Name System (DNS), which we use to translate user-friendly names into IP addresses for Internet communication, only understands ASCII. So in 2010, the Internet Corporation for Assigned Names and Numbers (ICANN) adopted the use of Internationalized Domain Names (IDNs) and decided that IDNs should be converted to an ASCII-based form, known as Punycode, so they could be handled by web browsers, other applications (the mechanism is called Internationalizing Domain Names in Applications, or IDNA), and DNS. Basically, this makes things easier for the computer, and the user gets an easier experience as well. But hackers are exploiting this, keeping pace with the growing sophistication of the Internet.
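To make the ASCII conversion concrete, here is a minimal Python sketch. It assumes Python 3.7 or later and uses the built-in "punycode" codec, which covers only the encoding step; a real browser also normalizes and validates the name under the IDNA rules, which this sketch skips. The function names to_ascii_label and to_ascii_domain are made up for illustration.

```python
def to_ascii_label(label: str) -> str:
    """Return the ASCII-compatible form of a single domain label."""
    if label.isascii():
        return label  # plain ASCII labels are left unchanged
    # The codec performs the Punycode encoding only; the "xn--" prefix
    # marks the label as an encoded internationalized name.
    return "xn--" + label.encode("punycode").decode("ascii")


def to_ascii_domain(domain: str) -> str:
    """Encode each dot-separated label of a domain name on its own."""
    return ".".join(to_ascii_label(part) for part in domain.split("."))


if __name__ == "__main__":
    # German "bücher" (books) contains one non-ASCII letter.
    print(to_ascii_domain("b\u00fccher.example"))  # xn--bcher-kva.example

    # Five Cyrillic letters that render almost exactly like Latin "apple".
    lookalike = "\u0430\u0440\u0440\u04cf\u0435"
    print(to_ascii_domain(lookalike + ".com"))     # xn--80ak6aa92e.com
```

The second example shows why the ASCII form matters to a defender: the two names look alike on screen, but the encoded form that DNS actually sees is nothing like "apple.com".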
Very simply put, a word used in a domain name registration can be interpreted by our computers to mean something else. And if the word looks like what we expected to see in the address bar, then we can be fooled into clicking on a link or visiting a site that has been crafted to steal sensitive credentials. This is a form of Unicode phishing known as an IDN homograph attack. These attacks are not new, and they are nearly impossible to detect, according to Xudong Zheng, a web developer who successfully demonstrated the exploit to the world on his own website and promptly reported the vulnerability to Google’s Chrome Security Team. Note the similarities of the URL and the domain code and the drastic change in the redirect.
The Cyrillic alphabet, traditionally credited to two brothers who were later canonized as saints, contains letters that look almost identical to Latin ones. This allows hackers to register Unicode domain names that look like their legitimate counterparts but route users to a malicious website. Modern browsers can easily detect a homograph attack when only one or a few characters are replaced, such as swapping the first letter of the word “apple” for a Cyrillic “а”, because the name then mixes two scripts. But this is not always the case when all of the characters are replaced with Cyrillic characters. By choosing characters that look very similar to the letters of our own alphabet, hackers can easily fool their phishing targets.
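The difference between the two cases can be sketched in Python. This is an illustration under stated assumptions, not how any particular browser works: the script of each letter is approximated from the first word of its Unicode character name, and LOOKALIKES is a tiny hand-picked table invented for this example (a real check would rely on the Unicode Consortium's published confusables data). All function names are hypothetical.

```python
import unicodedata

# Hypothetical, deliberately incomplete table of Cyrillic letters that
# resemble Latin ones. Assumption: only good enough for this demo.
LOOKALIKES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u04cf": "l",  # CYRILLIC SMALL LETTER PALOCHKA
}


def scripts_in(label: str) -> set:
    """Approximate the scripts used by the letters in a label."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def is_mixed_script(label: str) -> bool:
    """True when the letters of a label come from more than one script."""
    return len(scripts_in(label)) > 1


def latin_skeleton(label: str) -> str:
    """Replace known look-alike letters with their Latin twins."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in label)


def imitates(label: str, brand: str) -> bool:
    """True when a label is not the brand but reduces to it letter by letter."""
    return label != brand and latin_skeleton(label) == brand


if __name__ == "__main__":
    one_swapped = "\u0430pple"                    # Cyrillic "а" + Latin "pple"
    all_swapped = "\u0430\u0440\u0440\u04cf\u0435"  # every letter Cyrillic
    for name in (one_swapped, all_swapped, "apple"):
        print(name, is_mixed_script(name), imitates(name, "apple"))
```

A mixed-script test flags the name with one swapped letter but passes the all-Cyrillic one, since that label uses a single script throughout. Only the look-alike comparison against a known brand catches it.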
Xudong exposed a previously unknown vulnerability by replacing all of the Latin characters in the word “apple” with look-alike Cyrillic characters, “аррӏе.” Chrome and Firefox failed to catch it every time. He then registered the converted Unicode name and built a simple webpage announcing the exploit. Chrome has since corrected the vulnerability, and Firefox allows you to display the target’s Punycode URL so you can decide if that is where you want to go.
So how do they do it?
Use deterrent, preventive, and reactive countermeasures in a layered defense such as:
Still curious? We're happy to help.