A New Year’s Thought on Shortcutting Data Organization

By Elizabeth Thede

There’s a good chance that one of your New Year’s Resolutions is to “get this pile of data organized so that I have everything available at my fingertips.” That goal is what I want to focus on today. And here’s the best part: you can achieve that goal without ever actually going into your data and organizing a thing.

The key to getting this to work is to deploy a search engine like dtSearch to sift through your data. dtSearch enterprise and developer products instantly search terabytes of “Office” files, PDFs, emails plus attachments, databases and Internet/Intranet data. Because dtSearch can instantly search terabytes, many dtSearch customers are large enterprises like Fortune 100 companies and federal, state and international government agencies. But dtSearch doesn’t just work with large enterprise data; you can also instantly search your own documents, emails and the like.

Whether dtSearch is providing concurrent multithreaded enterprise search or operating for individual use, it starts by building one or more data indexes, each of up to terabyte size. This kind of index isn’t like the index at the back of a book. Rather, it is an internal tool that lets dtSearch instantly search across all of the full-text content plus metadata in a data collection.
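For readers who like to see the idea in code, here is a minimal sketch of one common way to build a search index: an inverted index that maps each word to the documents and word positions where it occurs. This is only an illustration in Python. The function names and sample documents are made up, and nothing here describes dtSearch's actual index format.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to {document name: [word positions]}."""
    index = {}
    for name, text in docs.items():
        for position, word in enumerate(tokenize(text)):
            index.setdefault(word, {}).setdefault(name, []).append(position)
    return index

def docs_containing(index, word):
    """Look a word up in the index without rescanning any document."""
    return sorted(index.get(word.lower(), {}))

# Hypothetical sample data, for illustration only.
docs = {
    "memo.txt": "Green curlicues and purple squiggles",
    "mail.eml": "Purple squiggles in outer space",
}
index = build_index(docs)
print(docs_containing(index, "Purple"))
```

Once the index exists, each lookup is a dictionary access rather than a scan of every file, which is why indexed search stays fast as a collection grows.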

To start dtSearch indexing, all you have to do is point dtSearch at the relevant folders and other data. dtSearch will figure out for itself what mix of data you have, such as Word, Excel, Access, PowerPoint, OneNote, XML, HTML, PDF, ZIP or RAR files, or even emails with multilayer nested email attachments. Then dtSearch will automatically build the index.

At that point, you can instantly search for anything using over 25 different search options to make sure that you immediately find what you need.

dtSearch can do any combination of structured Boolean and/or/not searching along with phrase searching. That way, you can find all documents and emails that mention both green curlicues and purple squiggles but omit any mention of lavender hearts. Or you could do a proximity search, finding green curlicues but only if they appear in a document or email within 35 words of outer space. Or you could find green curlicues but only if they appear within 8 words prior to outer space.
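To make the logic of these requests concrete, here is a small Python sketch of Boolean, unordered proximity and ordered proximity matching. It is not dtSearch's query syntax or engine. The helper names are hypothetical, the sample documents are invented, and distance is simply counted in words between the starts of the two phrases.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_positions(tokens, phrase):
    """Return every token offset at which the phrase starts."""
    words = tokenize(phrase)
    n = len(words)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]

def contains(text, phrase):
    """Phrase search: True if the exact word sequence appears."""
    return bool(phrase_positions(tokenize(text), phrase))

def within(text, first, second, max_words, ordered=False):
    """Proximity search. With ordered=True, `first` must come before `second`."""
    tokens = tokenize(text)
    for i in phrase_positions(tokens, first):
        for j in phrase_positions(tokens, second):
            gap = j - i
            if ordered and 0 < gap <= max_words:
                return True
            if not ordered and 0 < abs(gap) <= max_words:
                return True
    return False

# Hypothetical sample data, for illustration only.
docs = {
    "memo": "Green curlicues and purple squiggles cover the page.",
    "email": "Purple squiggles, green curlicues and lavender hearts.",
    "note": "Outer space is vast; green curlicues are not.",
}

# green curlicues AND purple squiggles AND NOT lavender hearts
boolean_hits = [
    name for name, text in docs.items()
    if contains(text, "green curlicues")
    and contains(text, "purple squiggles")
    and not contains(text, "lavender hearts")
]
print(boolean_hits)
print(within(docs["note"], "green curlicues", "outer space", 35))
print(within(docs["note"], "green curlicues", "outer space", 8, ordered=True))
```

In the sample note, green curlicues appears after outer space, so the unordered search matches while the ordered "prior to" search does not.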

If you don’t want to do a structured Boolean/proximity-type search request, dtSearch can also do so-called natural language searching, looking for all of the above search terms anywhere and then ranking the retrieved documents and emails by hit term density and rarity. That way, if curlicues appears in 5 million documents in your collection, but hearts appears in only 3 documents, then the documents containing hearts would get a much higher relevancy ranking.
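As a rough illustration of ranking by density and rarity, the Python sketch below scores each document by how often a term appears in it (density) multiplied by a factor that grows as fewer documents contain the term (rarity). The formula is a generic inverse-document-frequency style assumption chosen for the example, not dtSearch's actual relevancy calculation, and the sample documents are invented.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(docs, query):
    """Return document names ordered from most to least relevant."""
    terms = tokenize(query)
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    total_docs = len(docs)
    # Number of documents containing each query term.
    doc_freq = {t: sum(1 for c in counts.values() if c[t]) for t in terms}
    scores = {}
    for name, c in counts.items():
        length = sum(c.values())
        score = 0.0
        for t in terms:
            if c[t]:
                density = c[t] / length
                rarity = math.log(1 + total_docs / doc_freq[t])
                score += density * rarity
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical sample data: curlicues is common, hearts is rare.
docs = {
    "a": "green curlicues on the page",
    "b": "more curlicues in the margin",
    "c": "curlicues again in this file",
    "d": "lavender hearts on the card",
}
ranking = rank(docs, "curlicues hearts")
print(ranking[0])
```

Because hearts occurs in only one of the four sample documents, that document ranks first even though every document contains exactly one query term.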

dtSearch also supports variable term weighting, so you could give curlicues a negative weighting of 8 and squiggles a positive weighting of 6. Or dtSearch could do all of the above but with stemming to find different word endings on the same root word, so you could retrieve not only squiggles but also squiggling or squiggler.
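Here is a toy Python sketch of how term weights and stemming could combine. The suffix-stripping stemmer is deliberately naive, and the scoring rule (weight times number of stemmed matches) is an assumption made for illustration. Neither reflects dtSearch's real stemming rules or weighting formula.

```python
import re

# Naive suffix list, checked in this order. Real stemmers are far more careful.
SUFFIXES = ("ing", "ers", "er", "es", "ed", "e", "s")

def stem(word):
    """Strip one common suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stems(text):
    """Tokenize the text and reduce every word to its stem."""
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]

def weighted_score(text, weights):
    """Sum weight * match count for every weighted term, matching on stems."""
    doc_stems = stems(text)
    return sum(
        weight * doc_stems.count(stem(term.lower()))
        for term, weight in weights.items()
    )

# Weights taken from the example in the text.
weights = {"curlicues": -8, "squiggles": 6}

print(stems("squiggles squiggling squiggler"))
print(weighted_score("The squiggler kept squiggling squiggles", weights))
print(weighted_score("curlicues and squiggles", weights))
```

All three squiggle variants reduce to the same stem, so the first sentence scores three positive matches, while the second is pulled below zero by the negatively weighted term.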

dtSearch can also activate fuzzy searching adjustable from 1 to 10 to sift through typographical or OCR errors. That way, even if you mistyped squiggles in an email as squipples, dtSearch could still find it. Or dtSearch could look for squiggles but only in the subject line of an email while looking for outer space in the main body of an email.
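One standard way to tolerate typos like this is edit distance, which counts how many single-character insertions, deletions or substitutions turn one word into another. The Python sketch below uses that technique. The `max_edits` parameter is a hypothetical stand-in and is not the same thing as dtSearch's 1 to 10 fuzziness setting, whose internals the article does not describe.

```python
import re

def edit_distance(a, b):
    """Levenshtein distance computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]

def fuzzy_find(text, term, max_edits):
    """Return words in the text within max_edits of the search term."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if edit_distance(w, term.lower()) <= max_edits]

email = "Please send the purple squipples by Friday"
print(edit_distance("squiggles", "squipples"))
print(fuzzy_find(email, "squiggles", max_edits=2))
```

The mistyped word differs from squiggles by two substituted letters, so a tolerance of two edits finds it while a tolerance of one would not.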

After a search, dtSearch can display individual retrieved documents and emails with multicolor highlighted hits, like curlicues in green and squiggles in purple. Or dtSearch can generate a search report for you showing all retrieved hits from all documents and emails with as many words of context as you want.
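A search report of this kind is essentially a list of hits, each shown with a few surrounding words. As a minimal Python sketch, the hypothetical function below marks each hit with square brackets, standing in for color highlighting, and keeps a configurable number of context words on each side. It illustrates the concept only and is not how dtSearch generates its reports.

```python
import re

def hits_in_context(text, term, context_words=3):
    """Return one snippet per hit, with the hit wrapped in [brackets]."""
    words = text.split()
    snippets = []
    for i, word in enumerate(words):
        # Compare without punctuation or case so "Squiggles," still matches.
        if re.sub(r"\W", "", word).lower() == term.lower():
            start = max(0, i - context_words)
            before = words[start:i]
            after = words[i + 1:i + 1 + context_words]
            snippets.append(" ".join(before + [f"[{word}]"] + after))
    return snippets

text = "We drew green curlicues beside the purple squiggles today"
print(hits_in_context(text, "squiggles", context_words=2))
```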

dtSearch even lets you leverage advanced search techniques that go beyond word searching, like searching for credit card numbers or file hash values. And if you are a developer, there are even more search options to apply, like faceted searching using database metadata or granular classification of retrieved search results for security purposes.
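To give a feel for what searching for credit card numbers involves, a common general approach is to find digit runs of plausible length and then keep only those that pass the Luhn checksum used by payment card numbers. The Python sketch below does that. The article does not say how dtSearch detects card numbers, so treat this as an assumption-laden illustration. The sample number is a widely used test value, not a real card.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, ch in enumerate(reversed(digits)):
        value = int(ch)
        if index % 2 == 1:   # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0

def find_card_numbers(text):
    """Return candidate numbers (digits only) that pass the Luhn check."""
    found = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found

sample = "Card 4111 1111 1111 1111 on file, ref 4111 1111 1111 1112."
print(find_card_numbers(sample))
```

The checksum step matters because it filters out most digit strings that merely look like card numbers, such as the second number in the sample.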

So go to dtSearch.com, and download a fully-functional 30-day evaluation version to instantly search terabytes. One New Year’s Resolution done!

“dtSearch products have over 25 search features, and can display retrieved data with highlighted hits. The product line includes extensive international language support, as well as special forensics search features.”

 
