Tuesday, October 23, 2007

Closing FileStream and StreamWriter, and Garbage Collection

I came across this tidbit about the Close methods in FileStream and StreamWriter in an article by Jeffrey Richter: Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework. I was never really sure which Close to call and why, but this sums up whats going on behind the scenes:

FileStream fs = new FileStream("C:\\SomeFile.txt", FileMode.Open, FileAccess.Write, FileShare.Read);
StreamWriter sw = new StreamWriter(fs);
sw.Write ("Hi there");
// The call to Close below is what you should do
sw.Close();
// NOTE: StreamWriter.Close closes the FileStream. The FileStream
// should not be explicitly closed in this scenario

-

Notice that the StreamWriter's constructor takes a FileStream object as a parameter. Internally, the StreamWriter object saves the FileStream's pointer. Both of these objects have internal data buffers that should be flushed to the file when you're finished accessing the file. Calling the StreamWriter's Close method writes the final data to the FileStream and internally calls the FileStream's Close method, which writes the final data to the disk file and closes the file. Since StreamWriter's Close method closes the FileStream object associated with it, you should not call fs.Close yourself.


What do you think would happen if you removed the two calls to Close? Well, the garbage collector would correctly detect that the objects are garbage and the objects would get finalized. But, the garbage collector doesn't guarantee the order in which the Finalize methods are called. So if the FileStream gets finalized first, it closes the file. Then when the StreamWriter gets finalized, it would attempt to write data to the closed file, raising an exception. Of course, if the StreamWriter got finalized first, then the data would be safely written to the file.

How did Microsoft solve this problem? Making the garbage collector finalize objects in a specific order is impossible because objects could contain pointers to each other and there is no way for the garbage collector to correctly guess the order to finalize these objects. So, here is Microsoft's solution: the StreamWriter type doesn't implement a Finalize method at all. Of course, this means that forgetting to explicitly close the StreamWriter object guarantees data loss. Microsoft expects that developers will see this consistent loss of data and will fix the code by inserting an explicit call to Close.


So if you were wondering why your StreamWriter writes don't get committed to a file if you forget to call the Close method, that's the reason: it was designed that way.