Understanding Line-, Token-, and String-Based Command Line Utilities

My books, Windows Command-Line Administration Instant Reference and Administering Windows Server 2008 Server Core, both contain batch file sections that answer basic needs, but sometimes you need more than basic information to perform a task. A reader asked me how to perform a task using the FindStr utility the other day based on my Regular Expressions with FindStr post. The problem is that FindStr is a line-based utility, and the reader was trying to obtain a token-based result. Using FindStr alone won’t solve the problem. Here is the original reader comment:

 

If I have lines like below in a file called Sum.txt :

Total001 abcdefg
Total002 hijklmn
Total099 opqrstuv

and I use a regular expression to get all the results like “findstr Total[000-099] Sum.txt” the result printed is :

Total001 abcdefg
Total002 hijklmn
Total099 opqrstuv

But I want it to print only the matches to the regular expression like

Total001
Total002
Total099

How can this be done?


And my response:

 

The FindStr
utility is line oriented, which means you obtain an entire line as
output, not individual tokens. In order to accomplish what you want to
do, you need to create a For loop. Using a For
loop would allow you to process the individual tokens in the line. The
following command will do what I think you want to accomplish:




For /F “UseBackQ” %1 In (`FindStr Total[000-099] Sum.txt`) Do @Echo %1




There are two important things to notice here. First, you must provide the “UseBackQ”
option or the command won’t work. The command itself must appear in
back-quotes—not regular quotes. The back-quote normally appears above
the Tab button and to the left of the 1 on a keyboard. It usually
appears with the tilde (~) character.



Using For makes it possible to create a token-based output from the line-based FindStr output. The default For setting relies on the space and tab characters as delimiters, but you can use the Delimiters= option to change the default behavior. However, sometimes a token-based result isn’t enough. You may not want an entire word (or whatever element the delimiters define). In this case, you need a string-based output.

One of the undocumented features of the command line is to create substrings from variables. For example, let’s say you define the following variable:

 

Set MyVariable=Hello World


Now, you want to obtain just a piece of that variable to use somewhere in your batch file. To obtain the substring, you use the tilde (~) operator. This operator uses a 0-based offset. So, let’s say you issue the following command:

 

Echo %MyVariable:~3%


The output of this command is: lo World. The output begins with the forth character, which is an l and displays the remainder of the string. However, let’s say you don’t need the rest of the string. Well, in this case, you can add a second number to define the characters you do need. If you issue this command:

 

Echo %MyVariable:~3,6%


the output is: lo Wor. The output begins with the fourth character and proceeds to the ninth character. The output contains the six characters you requested. In short, it’s possible to perform some fancy string manipulation in batch files as long as you keep the short of output you need in mind. Let me know how you use batch files to perform various sorts of string manipulation at [email protected].