C# Regular Expression Recipes—Counting Lines of Text
© 2004 O'Reilly & Assoc., Inc.
Counting Lines of Text
Problem
You need to count lines of text within a string or within a file.
Solution
Read in the entire file and count the number of linefeeds, as shown in the following method:
    using System;
    using System.Text.RegularExpressions;
    using System.IO;

    public static long LineCount (string source, bool isFileName)
    {
        if (source != null)
        {
            string text = source;

            if (isFileName)
            {
                // Read the entire file into memory
                FileStream FS = new FileStream(source, FileMode.Open,
                                               FileAccess.Read, FileShare.Read);
                StreamReader SR = new StreamReader(FS);
                text = SR.ReadToEnd( );
                SR.Close( );
                FS.Close( );
            }

            Regex RE = new Regex("\n", RegexOptions.Multiline);
            MatchCollection theMatches = RE.Matches(text);

            // A file is assumed to terminate every line, so the linefeed
            // count is the line count and a zero-length file yields zero.
            // The last line of a string has no terminator, so one is
            // added to the linefeed count.
            if (isFileName)
            {
                return (theMatches.Count);
            }
            else
            {
                return (theMatches.Count) + 1;
            }
        }
        else
        {
            // Handle a null source here
            return (0);
        }
    }
An alternative version of this method uses the StreamReader.ReadLine method to
count lines in a file and a regular expression to count lines in a string:
    public static long LineCount2(string source, bool isFileName)
    {
        if (source != null)
        {
            string text = source;
            long numOfLines = 0;

            if (isFileName)
            {
                FileStream FS = new FileStream(source, FileMode.Open,
                                               FileAccess.Read, FileShare.Read);
                StreamReader SR = new StreamReader(FS);

                // ReadLine returns null once the end of the file is reached
                while (text != null)
                {
                    text = SR.ReadLine( );
                    if (text != null)
                    {
                        ++numOfLines;
                    }
                }

                SR.Close( );
                FS.Close( );

                return (numOfLines);
            }
            else
            {
                Regex RE = new Regex("\n", RegexOptions.Multiline);
                MatchCollection theMatches = RE.Matches(text);

                return (theMatches.Count + 1);
            }
        }
        else
        {
            // Handle a null source here
            return (0);
        }
    }
The following method counts the lines within a specified text file and a specified string:
    public static void TestLineCount( )
    {
        // Count the lines within the file TestFile.txt
        LineCount(@"C:\TestFile.txt", true);

        // Count the lines within a string
        // Notice that the \r\n sequence starts a new line,
        // as does the \n character on its own
        LineCount("Line1\r\nLine2\r\nLine3\nLine4", false);
    }
Discussion
Every line ends with a line-terminating character or sequence. For Windows files, the line-terminating
characters are a carriage return followed by a linefeed. This sequence of characters is
described by the regular expression pattern \r\n. Unix files terminate their lines with
just the linefeed character (\n). The regular expression "\n" is the lowest common
denominator for both sets of line-terminating characters. Consequently, this method
runs a regular expression that looks for the pattern "\n" in a string or file.
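To make the "lowest common denominator" point concrete, here is a minimal sketch that runs both patterns against text with mixed line endings. The class name TerminatorDemo and the helper CountMatches are hypothetical names introduced only for this illustration; they are not part of the recipe.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TerminatorDemo
{
    // Hypothetical helper: counts the non-overlapping matches of a pattern.
    public static int CountMatches(string text, string pattern)
    {
        return Regex.Matches(text, pattern).Count;
    }

    public static void Main()
    {
        // Two Windows-style terminators followed by one Unix-style terminator.
        string mixed = "Line1\r\nLine2\r\nLine3\nLine4";

        // "\r\n" finds only the Windows-style terminators.
        Console.WriteLine(CountMatches(mixed, "\r\n")); // 2

        // "\n" finds every terminator, whichever style produced it.
        Console.WriteLine(CountMatches(mixed, "\n"));   // 3
    }
}
```

Because every \r\n sequence contains a \n, searching for the linefeed alone covers both conventions without counting any terminator twice.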
Simply running this regular expression against a string returns the number of lines minus one, because the last line does not have a line-terminating character. To account for this, one is added to the final count of linefeeds in the string. For a file, LineCount adds nothing: it assumes that every line, including the last, ends with a linefeed.
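The add-one rule for strings has a few edge cases worth seeing. The following sketch, in which the class StringLineCount and the method CountLines are hypothetical names, applies only the string branch of the logic described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringLineCount
{
    // Linefeed count plus one, mirroring the string branch of the recipe.
    public static int CountLines(string text)
    {
        return Regex.Matches(text, "\n").Count + 1;
    }

    public static void Main()
    {
        // Two linefeeds separate three lines.
        Console.WriteLine(CountLines("one\ntwo\nthree"));   // 3

        // A trailing linefeed is counted as starting one more (empty) line.
        Console.WriteLine(CountLines("one\ntwo\nthree\n")); // 4

        // An empty string contains no linefeeds, so the result is 1.
        Console.WriteLine(CountLines(""));                  // 1
    }
}
```

If a trailing linefeed or an empty string should not count as a line in your application, the result would need to be adjusted by the caller; the recipe itself does not do so.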
The LineCount method accepts two parameters. The first is a string that either contains
the actual text that will have its lines counted or the path and name of a text file
whose lines are to be counted. The second parameter, isFileName, determines
whether the first parameter (source) is a string or a file path. If this parameter is true,
the source parameter is a file path; otherwise, it is simply a string.
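The effect of the isFileName flag can be seen by passing the same path with both values. This is a sketch under stated assumptions: the class LineCountUsage is hypothetical, its LineCount is a compact stand-in that keeps the recipe's parameters but uses File.ReadAllText rather than the original FileStream and StreamReader code, and the temporary file exists only for the demonstration.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class LineCountUsage
{
    // Compact stand-in for LineCount with the same two parameters.
    public static long LineCount(string source, bool isFileName)
    {
        if (source == null)
        {
            return 0;
        }

        // When isFileName is true, source is a path and the file's text is loaded.
        string text = isFileName ? File.ReadAllText(source) : source;
        int linefeeds = Regex.Matches(text, "\n").Count;

        // Files are assumed to terminate every line; strings are not.
        return isFileName ? linefeeds : linefeeds + 1;
    }

    public static void Main()
    {
        // Hypothetical temporary file used only for this demonstration.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "Line1\r\nLine2\r\nLine3\r\n");

            // source is treated as a path: the file's three lines are counted.
            Console.WriteLine(LineCount(path, true));  // 3

            // source is treated as text: the path itself is a single line.
            Console.WriteLine(LineCount(path, false)); // 1
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Passing the wrong flag does not raise an error when a path is treated as text; it silently counts the path string itself, so callers need to set isFileName deliberately.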
See Also
See the ".NET Framework Regular Expressions," "FileStream Class," and "StreamReader Class" topics in the MSDN documentation.

