I recently found myself trying to debug a most perplexing problem. Two identical strings were different. How could this be?

Consider this code:

print("String1: '\(string1)'")
print("String2: '\(string2)'")

print(string1 == string2)

Giving this output:

String1: '123456'
String2: '123456'
false

Wut? I was reading the strings from a file. I started doubting reality. The strings were the same, yet they were different.

I thought I’d try getting rid of any weird whitespace characters:

let trimmed1 = string1.trimmingCharacters(in: .whitespacesAndNewlines)
let trimmed2 = string2.trimmingCharacters(in: .whitespacesAndNewlines)

print("String1: '\(trimmed1)'")
print("String2: '\(trimmed2)'")

print(trimmed1 == trimmed2)

Nope:

String1: '123456'
String2: '123456'
false

I finally did some spelunking and discovered the joys of the Byte Order Mark (BOM), U+FEFF, written \u{FEFF} in a Swift string literal. It is invisible when printed.

My strings actually contained this (although since I was reading them from a file it wasn’t obvious):

let string1 = "\u{FEFF}123456"
let string2 = "123456"
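One way to make an invisible character like this show up is to dump a string's Unicode scalars instead of printing the string itself. Here is a minimal sketch; scalarDump and the two sample strings are hypothetical names made up for illustration, not part of any library:

```swift
// Hypothetical helper: lists every Unicode scalar in a string as "U+" plus
// its hex value, so characters that print as nothing still show up.
func scalarDump(_ s: String) -> String {
    return s.unicodeScalars
        .map { "U+" + String($0.value, radix: 16, uppercase: true) }
        .joined(separator: " ")
}

// Sample strings standing in for what was read from the file.
let sampleWithBOM = "\u{FEFF}123456"
let samplePlain = "123456"

print(scalarDump(sampleWithBOM))  // U+FEFF U+31 U+32 U+33 U+34 U+35 U+36
print(scalarDump(samplePlain))    // U+31 U+32 U+33 U+34 U+35 U+36
```

The extra U+FEFF at the front of the first line is the culprit that the quoted print output hides.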

Now I have a handy extension:

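extension String {

    /// The string with a single leading byte order mark (U+FEFF) removed, if present.
    var withoutBOM: String {
        let bom = "\u{FEFF}"
        if hasPrefix(bom) {
            // The BOM is a single Character, so bom.count is 1.
            return String(dropFirst(bom.count))
        }
        return self
    }
}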
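If the BOM is coming from a UTF-8 file, another option is to drop it at the byte level before the text is ever decoded. In UTF-8 the BOM is the three bytes EF BB BF. The following is only a sketch under that assumption; decodeUTF8SkippingBOM is a hypothetical helper name, and the Data values are built in memory here rather than read from a real file:

```swift
import Foundation

// Hypothetical helper: decodes UTF-8 data, skipping a leading
// UTF-8 byte order mark (EF BB BF) if one is present.
func decodeUTF8SkippingBOM(_ data: Data) -> String? {
    let utf8BOM: [UInt8] = [0xEF, 0xBB, 0xBF]
    let payload = data.starts(with: utf8BOM) ? data.dropFirst(utf8BOM.count) : data
    return String(data: payload, encoding: .utf8)
}

// Stand-ins for file contents: one with a BOM, one without.
let bytesWithBOM = Data([0xEF, 0xBB, 0xBF] + Array("123456".utf8))
let bytesPlain = Data("123456".utf8)

print(decodeUTF8SkippingBOM(bytesWithBOM) == decodeUTF8SkippingBOM(bytesPlain))  // true
```

This only handles UTF-8 input; files in other encodings would need their own handling.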

Foundation's trimmingCharacters(in: .controlCharacters) does the trick too, because the .controlCharacters set includes Unicode format characters such as U+FEFF:

import Foundation

let string1 = "\u{FEFF}123456"
let string2 = "123456"

let trimmed1 = string1.trimmingCharacters(in: .controlCharacters)
let trimmed2 = string2.trimmingCharacters(in: .controlCharacters)

print("String1: '\(trimmed1)'")
print("String2: '\(trimmed2)'")

print(trimmed1 == trimmed2)

Giving this output:

String1: '123456'
String2: '123456'
true
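One caveat with the trimming approach: it is broader than a BOM-specific check in one way and narrower in another. It removes every control and format character at either end (tabs and newlines included), but it leaves a BOM in the middle of a string alone. A small sketch, with sample strings invented for illustration:

```swift
import Foundation

// A tab, a BOM, and a newline around the digits: all are in .controlCharacters.
let padded = "\t\u{FEFF}123456\n"
let paddedTrimmed = padded.trimmingCharacters(in: .controlCharacters)
print(paddedTrimmed == "123456")  // true

// A BOM in the middle is not at either end, so trimming does not touch it.
let inner = "123\u{FEFF}456"
let innerTrimmed = inner.trimmingCharacters(in: .controlCharacters)
print(innerTrimmed == "123456")   // false
```

If only a leading BOM should go and everything else should stay as it is, a prefix check is the more targeted tool.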

Whew! The universe is still internally consistent (I think).