Friday, August 7, 2009
I ran into a problem today while writing an output file: it kept starting with the prefix FFFE when viewed in a hexadecimal editor.
This prefix was in my source file, and attempts to remove it with String.Replace were fruitless.
The FFFE prefix was only visible in a file and not within the Visual Studio debugging environment.

So I started investigating.

A Unicode text file can begin with a byte order mark (BOM), a prefix that identifies the byte order of the file. For UTF-16 it shows up as these two bytes:
FFFE = Little Endian
FEFF = Big Endian
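
To see which prefix a given encoding would put at the start of a file, you can ask the encoding for its preamble. This is only a small sketch: the class name BomDemo and the helper PreambleHex are made up for illustration, and the UTF-8 line is included purely for comparison.

```csharp
using System;
using System.Text;

public static class BomDemo
{
    // Returns the BOM bytes an encoding emits at the start of a stream,
    // formatted as hex, or "(none)" when the encoding emits no BOM.
    public static string PreambleHex(Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble();
        return preamble.Length == 0 ? "(none)" : BitConverter.ToString(preamble);
    }

    public static void Main()
    {
        // UnicodeEncoding(bigEndian, byteOrderMark)
        Console.WriteLine(PreambleHex(new UnicodeEncoding(false, true)));  // FF-FE (little endian)
        Console.WriteLine(PreambleHex(new UnicodeEncoding(true, true)));   // FE-FF (big endian)
        Console.WriteLine(PreambleHex(new UnicodeEncoding(false, false))); // (none)
        Console.WriteLine(PreambleHex(new UTF8Encoding(true)));            // EF-BB-BF
    }
}
```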

So I attempted to strip these characters by passing the UTF8 encoding as a parameter when instantiating my StreamWriter.
No luck.

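The solution was to instead define a custom UnicodeEncoding with the byte order mark disabled. The BOM is never part of the string itself; the StreamWriter emits it from the encoding when it starts writing, which is why String.Replace could not touch it.
You do this by passing new UnicodeEncoding(false, false) as the encoding parameter to StreamWriter: the first argument selects the byte order (false = little endian) and the second controls whether a BOM is written (false = no BOM).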

Here is the code.

// open our input file

StreamReader readerEDI = new StreamReader(@"input.txt");


// set up custom unicode encoding: little endian (first false), no bom (second false)

UnicodeEncoding unicode = new UnicodeEncoding(false, false);


// output file stream

Stream filestream = new FileStream(@"output", FileMode.CreateNew);


// instantiate new streamwriter, apply our custom unicode encoding

StreamWriter writerEDI = new StreamWriter(filestream, unicode);


// write to file



// clean up
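
For reference, here is one way the whole thing could look as a self-contained program. It is a sketch under a few assumptions, not the exact code from my project: the class name BomFreeCopy, the Copy helper, the sample file names and the "hello" test content are all made up for illustration, the write step is assumed to be a plain ReadToEnd/Write copy, and FileMode.Create is used instead of FileMode.CreateNew so the program can be rerun without throwing when the output file already exists.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomFreeCopy
{
    // Copies the text of inputPath to outputPath as little-endian UTF-16
    // without a byte order mark.
    public static void Copy(string inputPath, string outputPath)
    {
        // false = little endian, false = do not emit a BOM
        UnicodeEncoding unicode = new UnicodeEncoding(false, false);

        // StreamReader detects a BOM in the input by default and does not
        // return it as part of the text.
        using (StreamReader reader = new StreamReader(inputPath))
        using (FileStream stream = new FileStream(outputPath, FileMode.Create))
        using (StreamWriter writer = new StreamWriter(stream, unicode))
        {
            // assumed write step: copy everything through unchanged
            writer.Write(reader.ReadToEnd());
        }
        // the using blocks flush and close the reader, writer and stream
    }

    public static void Main()
    {
        // hypothetical sample input, written as UTF-16 with an FF FE BOM
        File.WriteAllText("sample-input.txt", "hello", Encoding.Unicode);

        Copy("sample-input.txt", "sample-output.txt");

        // show the first two bytes of the output file
        byte[] bytes = File.ReadAllBytes("sample-output.txt");
        Console.WriteLine(BitConverter.ToString(bytes, 0, 2));
    }
}
```

If the first two bytes printed are the encoded first character rather than FF-FE, the BOM is gone.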