Search in office files on the command line with the free Open Source tool Swiss File Knife
sfk office file support
sfk can read office files in the Office Open XML
and OpenDocument formats, which have been standard
since 2007, with these filename extensions:
.docx .dotx .dotm .docb .xlsx .xlsm .xltx .xltm
.pptx .pptm .potx .potm .ppam .ppsx .ppsm .sldx
.sldm .odt .ods .odp .odg .odc .odf .odi
.odm .ott .ots .otp .otg
sfk cannot read older office file formats
like .doc, .xls or .ppt.
supported commands:

sfk olist mydir
list all office files in folder mydir.
sfk ofind mydir "/myword/"
search office and plain text files in mydir
containing the word 'myword'.
sfk ofind mydir "/foo*bar/"
search foo followed by bar in the same line.
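
The "/foo*bar/" pattern above can be pictured as a per-line wildcard match. The following Python sketch is only an illustration and not how sfk is implemented: the helper names wildcard_to_regex and matching_lines are hypothetical, it assumes that '*' stands for any run of characters within a single line, and it does not model how sfk treats upper and lower case.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a simple wildcard pattern ('*' = any text) into a regex."""
    # Escape each literal part, then join the parts with '.*'.
    # '.' does not match a newline, so a match never spans two lines.
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(parts))

def matching_lines(text: str, pattern: str) -> list:
    """Return every line of text that contains the wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [line for line in text.splitlines() if regex.search(line)]

if __name__ == "__main__":
    sample = "the foo and the bar\nonly foo here\nbar before foo\n"
    for line in matching_lines(sample, "foo*bar"):
        print(line)
```

Only the first sample line is printed, because it is the only one where bar appears after foo on the same line.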
for more info type: sfk ofind

sfk ofilter in.xlsx -+foo
filter content of a spreadsheet table
for lines containing 'foo'
for more info type: sfk ofilter

sfk oload in.xlsx
load and display content of in.xlsx
sfk oload in.xlsx +xex "/*\tapple\t/*"
find fields containing just 'apple'
and print the whole surrounding row.
sfk oload in.xlsx +filt -spat -+\tapple\t
same as above, using +filter.
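
The idea behind the \tapple\t patterns above is to match a whole field rather than a substring. A minimal Python sketch of that idea follows. It assumes that each spreadsheet row arrives as one line with fields separated by tab characters, as the \t in the patterns suggests; the helper name rows_with_field is hypothetical and the code is not sfk's implementation. The sketch compares complete fields, so it accepts 'apple' in any column.

```python
def rows_with_field(lines, wanted):
    """Yield rows (as lists of fields) in which one field equals wanted exactly."""
    for line in lines:
        # Assumption: one row per line, fields separated by tabs.
        fields = line.rstrip("\n").split("\t")
        if wanted in fields:
            yield fields

if __name__ == "__main__":
    table = [
        "id\tfruit\tprice",
        "1\tapple\t0.50",
        "2\tpineapple\t2.00",
        "3\tapple pie\t3.50",
    ]
    for row in rows_with_field(table, "apple"):
        print("\t".join(row))
```

Rows 2 and 3 are skipped because 'pineapple' and 'apple pie' merely contain the word, while the row with id 1 has a field that is exactly 'apple'.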
for more info type: sfk oload

sfk snapto=alldoc.txt -office mydir
collect plain text, and text from office
files like .docx .xlsx .odt .ods,
from folder mydir into one file alldoc.txt
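
To make the collecting step more concrete, here is a rough Python sketch of gathering text from a folder into one file. It is an assumption-laden illustration, not what sfk does internally: it handles only .txt and .docx, the function names docx_text and collect_text are hypothetical, and the '--- path ---' separator line is invented for this example. It relies on the fact that a .docx file is a zip archive whose main text is stored in word/document.xml.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_text(path):
    """Return the text of a .docx file, one line per paragraph."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph consists of runs; each run's text sits in a <w:t> node.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)

def collect_text(folder, out_path):
    """Write the text of all .txt and .docx files under folder into out_path."""
    out_path = Path(out_path)
    with open(out_path, "w", encoding="utf-8") as out:
        for path in sorted(Path(folder).rglob("*")):
            # Skip folders and the output file itself.
            if not path.is_file() or path.resolve() == out_path.resolve():
                continue
            suffix = path.suffix.lower()
            if suffix == ".txt":
                text = path.read_text(encoding="utf-8", errors="replace")
            elif suffix == ".docx":
                text = docx_text(path)
            else:
                continue
            # Invented separator so each source file can be told apart.
            out.write(f"--- {path} ---\n{text}\n")
```

Calling collect_text("mydir", "alldoc.txt") would then produce a single text file that ordinary line-based search tools can scan.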
sfk find alldoc.txt foo bar
search alldoc.txt for lines containing
the words foo and bar
dview alldoc.txt
browse and search alldoc.txt interactively
with the Depeche View text file browser.
for details see: sfk view

sfk unzip in.xlsx -todir tmp
extract all contents of in.xlsx
into a folder tmp. -todir is important,
otherwise you end up with many files
in the current folder.
sfk zip -rel out.xlsx tmp
recreate an office file out.xlsx
from all contents in tmp.
-rel is important to strip the folder
name 'tmp' from the content filenames.
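
The reason the folder name has to be stripped can be seen in a small Python sketch of the same round trip using the standard zipfile module. This is an illustration under assumptions, not a description of sfk: the function names unpack and repack are hypothetical. The point is that member names inside the archive must be relative to the extracted folder, for example xl/workbook.xml and not tmp/xl/workbook.xml.

```python
import os
import zipfile

def unpack(office_file, target_dir):
    """Extract every member of an office file into target_dir."""
    with zipfile.ZipFile(office_file) as archive:
        archive.extractall(target_dir)

def repack(source_dir, office_file):
    """Zip source_dir into office_file, storing names relative to source_dir."""
    with zipfile.ZipFile(office_file, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                # Drop the leading folder name so the archive contains
                # 'xl/workbook.xml' rather than 'tmp/xl/workbook.xml'.
                relative_name = os.path.relpath(full_path, source_dir)
                archive.write(full_path, arcname=relative_name)
```

With these helpers, unpack("in.xlsx", "tmp") followed by repack("tmp", "out.xlsx") mirrors the two commands above. Whether an office application opens the result still depends on the contents being valid.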
search office files interactively with DView
the GUI tool Depeche View can browse and
search text from office files directly.
for details type: sfk view