Handling Unicode Usernames in Dart or Flutter

Problem Statement

Unicode is a large and complex beast. Because of that complexity, it can leave us open to security issues without our even realizing it. For example, there are many non-printing Unicode characters, such as the zero-width space (U+200B). There are also distinct code points that render as nearly the same glyph (usually the full-width and half-width equivalents of a character), such as U+FF52 (ｒ) and U+0072 (r); U+FFD2 (ￒ) and U+315B (ㅛ); and U+FFE1 (￡) and U+00A3 (£).

When presented in usernames, these can cause lots of confusion for humans and are often used for impersonation.
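To make the lookalike problem concrete, here is a minimal sketch using only Dart's core String API. The helper name describeCodePoints is made up for this illustration. It shows that the full-width and ASCII "r" compare as different strings even though they look nearly the same on screen.

```dart
/// Formats every code point in [input] as U+XXXX, separated by spaces.
/// (Hypothetical helper, not part of any package.)
String describeCodePoints(String input) => input.runes
    .map((r) => 'U+${r.toRadixString(16).toUpperCase().padLeft(4, '0')}')
    .join(' ');

void main() {
  final ascii = String.fromCharCode(0x0072); // LATIN SMALL LETTER R
  final fullWidth = String.fromCharCode(0xFF52); // FULLWIDTH LATIN SMALL LETTER R

  // The two strings are not equal, even though they render alike.
  print(ascii == fullWidth); // false
  print(describeCodePoints(ascii)); // U+0072
  print(describeCodePoints(fullWidth)); // U+FF52
}
```

A plain == check therefore treats the two spellings as different usernames, which is exactly what an impersonator relies on.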

Who's real: E​​lon, Elon, El​​on, or Elo​​n? (hint: double click to highlight the name and the name that highlights completely is the real name.)
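The trick behind the fake names can be sketched in a few lines of core Dart. The set of code points below is a small, hand-picked assumption for illustration only; it is not the PRECIS rule set, and hasInvisibleCharacters is a hypothetical helper name.

```dart
/// A small, illustrative set of invisible code points.
/// This is NOT the PRECIS rule set, just enough to show the idea.
const invisibleCodePoints = <int>{
  0x200B, // ZERO WIDTH SPACE
  0x200C, // ZERO WIDTH NON-JOINER
  0x200D, // ZERO WIDTH JOINER
  0x2060, // WORD JOINER
  0xFEFF, // ZERO WIDTH NO-BREAK SPACE
};

/// Returns true if [input] contains any code point from the set above.
bool hasInvisibleCharacters(String input) =>
    input.runes.any(invisibleCodePoints.contains);

void main() {
  final real = 'Elon';
  final fake = 'El\u200Bon'; // renders the same, but hides a zero-width space

  print(real == fake); // false
  print(hasInvisibleCharacters(real)); // false
  print(hasInvisibleCharacters(fake)); // true
  print('${real.length} vs ${fake.length}'); // 4 vs 5
}
```

A hand-rolled list like this will always miss cases, which is the motivation for using a standardized rule set instead.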

Solution

Enter PRECIS rules from RFC 8264, the username and password rules in RFC 8265, and the nickname rules in RFC 8266.

These RFCs define rules for validating and sanitizing strings that contain Unicode characters. And these RFCs now have a Dart implementation that you can download: Precis Pub Page.

Unfortunately, while Dart has full Unicode support for string handling, not enough of the magic is exposed to developers, so we had to build our own character handling from (mostly) scratch.

I'd like to thank the Java SDK for being open source, since I translated almost all of the Character classes into Dart for this package.

I'd also like to thank Christian Schudt and the rocks.xmpp.precis Project for being open source, since this package is a direct translation from there into Dart.

Download

This code was tested with Flutter 3.3.2 and Dart 2.18.1.

Visit the Precis Pub Page to find the latest version of the published package for use in your projects.

Visit the GitHub repository to clone the source for this plugin, submit bug reports, or see additional documentation.

Setup

Unlike previous tutorials, I have not built a full example app for this plugin. This tutorial only discusses how to integrate the package into an existing app or Dart project.

To add this package to your existing project, run one of the following commands:

With Dart:

$ dart pub add precis

With Flutter:

$ flutter pub add precis

The Code

Before anything else, we must import our package into the code where it is needed. I prefer to use a named import, but feel free not to.

import 'package:precis/precis.dart' as precis;

Since I know this is what you really care about, I'll start with some code snippets and then dive deeper into library usage.

Use Case 1: Validating that a username does not contain invalid characters

// Returns a String with some of the basic rules applied
// Throws [InvalidCodePointException] on failure
try {
  precis.usernameCaseMapped.prepare(username);
} on precis.PrecisException catch (e) {
  // The exception types need the `precis.` prefix because of the named import.
  print('Invalid username: ${e.message}');
}

Use Case 2: Formatting a username to check for duplicates

// Returns the formatted string with all rules applied
// Throws [InvalidCodePointException] if there are invalid characters
// Throws [InvalidDirectionalityException] if there are invalid LTR and RTL character mixes
// Throws [EnforcementException] for other errors (like empty strings)
try {
  precis.usernameCaseMapped.enforce(username);
} on precis.PrecisException catch (e) {
  // The exception types need the `precis.` prefix because of the named import.
  print('Invalid username: ${e.message}');
}
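For the duplicate check itself, one possible approach is to store each username in its enforced form and compare against that. The sketch below is an assumption about how you might wire this up, not something the package provides: UsernameRegistry and its canonicalize hook are hypothetical names, and a lowercase stand-in replaces the library call so the example runs on its own.

```dart
/// Tracks taken usernames by their canonical form.
///
/// [canonicalize] is a hypothetical hook. In a real app you could pass
/// `precis.usernameCaseMapped.enforce` from the snippet above.
class UsernameRegistry {
  UsernameRegistry(this.canonicalize);

  final String Function(String) canonicalize;
  final Set<String> _taken = {};

  /// Reserves [username] and returns true if its canonical form is unused.
  /// Any exception thrown by [canonicalize] propagates to the caller.
  bool tryRegister(String username) => _taken.add(canonicalize(username));
}

void main() {
  // Stand-in canonicalizer so the sketch runs without the package.
  final registry = UsernameRegistry((name) => name.toLowerCase());

  print(registry.tryRegister('Alice')); // true
  print(registry.tryRegister('ALICE')); // false
}
```

Keeping the canonicalizer as a constructor argument also makes it easy to swap profiles later, for example switching to the case-preserving username profile.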

Use Case 3: Comparing new passwords to make sure they match

/// Verify that the given passwords are the same.
/// This method enforces all rules on both strings. If either string violates
/// them (any of the three exceptions mentioned above), the passwords are
/// treated as not matching.
bool passwordsMatch(String password1, String password2) {
  try {
    return precis.opaqueString.compare(password1, password2) == 0;
  } on precis.PrecisException {
    return false;
  }
}

Deep Dive

Profiles

This library contains four PRECIS profile implementations for the RFCs mentioned above.

  1. usernameCaseMapped - used for case-insensitive usernames as defined by RFC 8265
  2. usernameCasePreserved - used for case-sensitive usernames as defined by RFC 8265
  3. opaqueString - used for passwords as defined by RFC 8265
  4. nickname - used for nicknames as defined by RFC 8266

Each profile implements the following methods that can be used to interact with strings of the given profile:

  1. String prepare(String) - Ensure only characters defined by the underlying PRECIS rule are present. Returns the input string for convenience.
  2. String enforce(String) - Apply all the PRECIS rules to the passed string. Returns the formatted string with all rules applied.
  3. int compare(String, String) - A Comparator that applies the rules to each passed string before comparing.
  4. String toComparableString(String) - Returns a String that has all the PRECIS rules applied and can be used for comparisons.

Mix and match the profiles and the methods to your heart's content to get the desired results.

You can also extend the abstract PrecisProfile to create your own implementation of the rules for custom use cases, but this is not recommended.

Exceptions

This library throws exceptions to communicate error cases. All return values are valid and contain no error information.

  1. PrecisException - the abstract base class for all Exceptions thrown by the library
  2. InvalidDirectionalityException - thrown to indicate that the directionality rules have been violated
  3. InvalidCodePointException - thrown to indicate that a string contains invalid code points after applying preparation or enforcement of the PRECIS framework
  4. EnforcementException - thrown to indicate that a string had errors while enforcing its conditions

If you want granular handling, you can catch the most precise Exception for the cases you care about. Otherwise, just catch the abstract PrecisException and handle all errors in a generic way.
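The granular-versus-generic pattern can be sketched as follows. To keep the example runnable without the package, the exception classes here are local stand-ins that only mirror the names listed above; their constructors and the describeFailure helper are assumptions made for this illustration.

```dart
// Local stand-ins that mirror the exception names listed above so this sketch
// runs without the package. In a real project, delete these and catch the
// library's own types (prefixed with `precis.` if you use a named import).
abstract class PrecisException implements Exception {
  PrecisException(this.message);
  final String message;
}

class InvalidCodePointException extends PrecisException {
  InvalidCodePointException(String message) : super(message);
}

class EnforcementException extends PrecisException {
  EnforcementException(String message) : super(message);
}

/// Runs [validate] and maps failures to a user-facing message.
/// Returns null when validation succeeds. (Hypothetical helper.)
String? describeFailure(void Function() validate) {
  try {
    validate();
    return null;
  } on InvalidCodePointException {
    // Most specific type first: a tailored message for bad characters.
    return 'That name contains characters that are not allowed.';
  } on PrecisException catch (e) {
    // Generic fallback for every other error in the hierarchy.
    return 'Invalid name: ${e.message}';
  }
}

void main() {
  print(describeFailure(() => throw InvalidCodePointException('U+200B')));
  print(describeFailure(() => throw EnforcementException('empty string')));
  print(describeFailure(() {})); // null
}
```

The order of the on clauses matters: Dart uses the first clause that matches, so the base type has to come last.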

Additional Reading

If you're dealing with Unicode in Dart, you should be aware of the built-in codeUnits and runes handling. There is a great breakdown of Dart Runes at GeeksForGeeks.

Additionally, the characters package can be extremely useful for handling multi-code-unit characters in strings. Be aware, though, that it works on grapheme clusters rather than individual code points, and its parsing rules make it unsuitable for most uses with PRECIS-style codeUnit parsing.
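As a quick illustration of the difference between the two views, the sketch below uses only core Dart. The helper names are made up for this example. A character outside the Basic Multilingual Plane takes two UTF-16 code units but is a single rune (code point).

```dart
/// Number of UTF-16 code units in [input] (the same value as String.length).
int codeUnitCount(String input) => input.codeUnits.length;

/// Number of Unicode code points in [input].
int codePointCount(String input) => input.runes.length;

void main() {
  final ascii = 'r'; // U+0072, a single UTF-16 code unit
  final emoji = '\u{1F600}'; // U+1F600, stored as a surrogate pair

  print(codeUnitCount(ascii)); // 1
  print(codePointCount(ascii)); // 1
  print(codeUnitCount(emoji)); // 2
  print(codePointCount(emoji)); // 1

  // Rebuilding the string from its code points gives back the original.
  print(String.fromCharCodes(emoji.runes) == emoji); // true
}
```

Code that indexes a string by position works on code units, so rule checks that reason about whole code points should iterate over runes instead.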

In Conclusion

Now you have the tools at your fingertips to help prevent Elon Crypto Scammers from invading your app and stealing from your users! You can also now use Unicode in your apps in a standardized way. And you get all this without ever having to convert a Java string composed of hex and octal Unicode escapes into a hex-only Unicode string compatible with Dart!

Brian Armstrong

I'm Brian Armstrong, a SaaS developer with 15+ years of programming experience. I am a Flutter evangelist and React.js enthusiast.