Searching For and Extracting Data
grep
The grep command searches for files that contain a specified string and returns the name of the file and (if it’s a text file) the line containing that string.
- can also be used to search a specified file for a specified string
- uses regular expressions
- if a filename is not specified, it uses standard input
- the shell uses some characters for its own purposes, so you may need to enclose the regex in quotes
- e.g., | or *
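The points above can be sketched as a short script. The file name pets.txt, its contents, and the variable names are assumptions made up for illustration; only standard grep behavior is used.

```shell
# Hypothetical sample file, created in a temporary directory for illustration.
workdir=$(mktemp -d)
printf 'cat\ndog\nbird\ncatfish\n' > "$workdir/pets.txt"

# Search one named file for a string; the matching lines are printed.
file_matches=$(grep 'cat' "$workdir/pets.txt")

# Quote the regex so the shell does not treat | as a pipe or * as a glob.
# -E enables extended regular expressions, where | means "or".
regex_matches=$(grep -E 'dog|bird' "$workdir/pets.txt")

# With no filename, grep reads standard input.
stdin_matches=$(printf 'one\ntwo\nthree\n' | grep 't.*e')

printf '%s\n' "$file_matches" "$regex_matches" "$stdin_matches"

# Remove the temporary directory.
rm -r "$workdir"
```

Single quotes are the safest choice here because the shell passes everything inside them to grep unchanged.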
find
The find command searches through a specified directory tree and locates files by criteria such as filename and date stamps.
- tends to be slow because it takes a brute-force approach, examining every file in the tree
- can use multiple directory paths
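A minimal sketch of searching by name and by date stamp follows. The directory layout (docs, logs) and the file names are hypothetical and exist only to give find something to search.

```shell
# Hypothetical directory tree, created in a temporary location for illustration.
workdir=$(mktemp -d)
mkdir "$workdir/docs" "$workdir/logs"
touch "$workdir/docs/notes.txt" "$workdir/docs/todo.txt" "$workdir/logs/app.log"

# Search by filename pattern. The pattern is quoted so that find,
# not the shell, interprets the * wildcard.
txt_files=$(find "$workdir/docs" -name '*.txt' | sort)

# Several starting directories can be listed before the tests.
# -type f restricts results to regular files;
# -mtime -1 matches files modified less than one day ago.
recent_files=$(find "$workdir/docs" "$workdir/logs" -type f -mtime -1 | sort)

printf '%s\n' "$txt_files" "$recent_files"

# Remove the temporary directory.
rm -r "$workdir"
```

The output is piped through sort only to make the order predictable; find itself returns files in whatever order the file system reports them.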
wc
The wc command reports basic counts for text files: lines, words, and bytes.
- e.g.,
wc newfile.txt
- outputs:
37 59 1990 newfile.txt
- 37 lines
- 59 words
- 1,990 bytes
cut
The cut command extracts text from fields in a file record.
- frequently used to extract variable information from a file whose contents are highly patterned
- to use:
- pass to it one or more options that specify what information you want
- followed by one or more filenames
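As a rough illustration of the options-then-filenames usage described above, the sketch below extracts fields from a colon-delimited file. The file users.txt and its contents are invented for this example.

```shell
# Hypothetical, highly patterned sample file: colon-separated records.
workdir=$(mktemp -d)
printf 'alice:x:1000:/home/alice\nbob:x:1001:/home/bob\n' > "$workdir/users.txt"

# -d sets the field delimiter and -f selects which fields to print.
names=$(cut -d: -f1 "$workdir/users.txt")

# Several fields can be requested at once; the delimiter is kept between them.
names_and_homes=$(cut -d: -f1,4 "$workdir/users.txt")

# -c selects by character position instead of by field.
first_three=$(cut -c1-3 "$workdir/users.txt")

printf '%s\n' "$names" "$names_and_homes" "$first_three"

# Remove the temporary directory.
rm -r "$workdir"
```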
sort
The sort command sorts information in a file.
- sorts alphabetically by default with no options
- e.g.,
$ sort pets.txt
bird
cat
dog
fish
- no changes are made to the file's data; only the output is sorted
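The sketch below repeats the pets.txt example in a self-contained form and adds two common options, -r and -n. The unsorted file contents are an assumption; the source shows only the sorted output.

```shell
# Hypothetical unsorted version of pets.txt, created in a temporary directory.
workdir=$(mktemp -d)
printf 'dog\ncat\nfish\nbird\n' > "$workdir/pets.txt"

# Default behavior: alphabetical order.
sorted=$(sort "$workdir/pets.txt")

# -r reverses the order.
reversed=$(sort -r "$workdir/pets.txt")

# -n compares numerically, so 9 comes before 10 and 100.
numeric=$(printf '10\n9\n100\n' | sort -n)

# The file itself is untouched; only the output was sorted.
unchanged=$(cat "$workdir/pets.txt")

printf '%s\n' "$sorted" "$reversed" "$numeric" "$unchanged"

# Remove the temporary directory.
rm -r "$workdir"
```

To keep the sorted result, redirect the output to a different file or use sort -o.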
cat
The cat command displays text files on screen and can concatenate files together.
- the files themselves are not modified; only the output is combined
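A short sketch of both uses, displaying and concatenating, is shown here. The file names a.txt, b.txt, and both.txt are hypothetical.

```shell
# Two hypothetical one-line files, created in a temporary directory.
workdir=$(mktemp -d)
printf 'first\n' > "$workdir/a.txt"
printf 'second\n' > "$workdir/b.txt"

# Display a single file.
shown=$(cat "$workdir/a.txt")

# Concatenate two files; the result goes to standard output.
combined=$(cat "$workdir/a.txt" "$workdir/b.txt")

# Redirect the output to keep the combined text in a new file.
cat "$workdir/a.txt" "$workdir/b.txt" > "$workdir/both.txt"
saved=$(cat "$workdir/both.txt")

# The original file still holds only its own line.
original=$(cat "$workdir/a.txt")

printf '%s\n' "$shown" "$combined" "$saved" "$original"

# Remove the temporary directory.
rm -r "$workdir"
```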