
Ingest is hard so we made it easy

At Jigsaw we deal with big data projects nearly daily. Our customers use our platform to ingest billions of items per day and gain direct insight into that data by parsing out items of interest through keyword searches.

Ask any intelligence analyst and they will tell you that a hit list of important terms and items of interest, one they can update while doing research, is a top feature request. It helps them keep up with what is happening in the world and with the cyber-related threats that may affect their businesses or the businesses they protect. Being alerted to email compromises, spotting hackers on IRC planning an attack on your organization, and identifying dark web activity that suggests your organization may have security issues are just some of the things typically done with the Jigsaw platform. Search for relevant items of interest and you may be able to stop the next cyber security threat or terrorist attack.

Ingest Made Easy

Jigsaw Security has been listening to our customers' requests. One of our latest features is the ability to directly ingest PDFs, Word documents, images, video and nearly any other media type that may be of interest. Our ingest currently processes 190 file formats natively by pulling out metadata and by running OCR on PDF and image files. This allows you to quickly and easily find items of interest that are streaming into your environment.

Previously, ingest took the form of customized scripts that constantly needed tweaking to find items of interest and then create reference data pointing to the files stored in cloud environments. With Jigsaw's solution, all you do is SFTP or SCP your files of interest to a directory. Our platform then looks through the documents in the ingest directory and does two things. First, it extracts all of the text from each document and stores that text in your platform index, making it easy to find what you're looking for. Second, it makes the document available to all users in your platform, with a link to the original content so they can download it. This automated ingest makes things extremely easy: as long as you can get the data into the ingest directory, using an automated or a manual process, the data is instantly available to your cloud users.
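To make the two steps concrete, here is a minimal sketch of the idea in Python. It is not the Jigsaw implementation: the function names (`extract_text`, `ingest_directory`, `search`), the in-memory index and the `https://platform.example/docs` base URL are all assumptions made for illustration, and the stand-in extractor only reads plain text, where a real pipeline would choose a parser per file type and run OCR on PDFs and images.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path


def extract_text(path: Path) -> str:
    """Stand-in extractor (assumption): read the file as UTF-8 text.

    A real pipeline would pick a parser per file type and OCR PDFs/images.
    """
    return path.read_text(encoding="utf-8", errors="ignore")


def ingest_directory(ingest_dir: Path, base_url: str):
    """Index every file in ingest_dir and return (index, links)."""
    index = defaultdict(set)  # term -> names of documents containing it
    links = {}                # document name -> link to the original file
    for path in sorted(ingest_dir.iterdir()):
        if not path.is_file():
            continue
        # Step 1: extract the text and store its terms in the index.
        text = extract_text(path)
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(path.name)
        # Step 2: record a link back to the original content.
        links[path.name] = f"{base_url}/{path.name}"
    return index, links


def search(index, *terms):
    """Return the names of documents that contain every given term."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ingest = Path(tmp)
        (ingest / "nc_report.txt").write_text("Ransomware hits North Carolina government")
        (ingest / "other.txt").write_text("Wildfires in California")
        index, links = ingest_directory(ingest, "https://platform.example/docs")
        for name in sorted(search(index, "ransomware", "government")):
            print(name, links[name])
```

The point of the sketch is the shape of the workflow: whatever lands in the directory gets indexed and linked, so nothing has to be scripted per source.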

Let's have a look

So let's have a look at an example of our ingest process. We will go through a typical Jigsaw analyst workflow and see how we get from finding content to delivering it. First we need a document of interest to ingest, so we start by using our OSINT-X platform to find one.

OSINT-X: A Jigsaw Security framework for bringing content into our platform. OSINT-X is an aggregator for RSS, Twitter and other feeds, developed to keep track of items of interest on RSS, social media, and from CERTs, ISACs and similar sources. The data it downloads is written directly into the Jigsaw Security platform. See the screenshot below.

In this window we see an interesting article on a recent ransomware attack on North Carolina government.

Since we have automated ingest set up, this same content should already have been pulled and ingested into our platform. The article is sure to contain North Carolina, Government and Ransomware as keywords. So while researching incidents in the Jigsaw Platform, let's have a look and see what the platform did with this content on ingest. Because this content is ingested as a document, we look for it in the Jigsaw Document Viewer in the platform.

Let's search using the keywords we agreed on above and see what our results look like.

As you can see, the search returned 4 documents, including the document of interest. To view the document, all the analyst needs to do is click the drop-down box, so we will do that next.

Here's a closeup of the result.

Clicking the dropdown icon reveals details of the article.

The metadata shows items such as file size on disk, the parser used to read the content, the author, etc.

Most importantly, the keywords we selected are highlighted in the results.
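As a rough illustration of keyword highlighting, here is a short Python sketch. The `highlight` function and the `[[` and `]]` markers are hypothetical, chosen for this example only; how the Jigsaw Document Viewer actually renders highlights is not described here.

```python
import re
from typing import Iterable


def highlight(text: str, keywords: Iterable[str], left: str = "[[", right: str = "]]") -> str:
    """Wrap each case-insensitive, whole-word keyword match in marker strings."""
    # Longest keywords first so a phrase wins over a shorter overlapping term.
    ordered = sorted(keywords, key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in ordered) + r")\b",
        re.IGNORECASE,
    )
    # Keep the matched text as written in the document, only add the markers.
    return pattern.sub(lambda m: f"{left}{m.group(0)}{right}", text)


print(highlight(
    "Ransomware attack hits North Carolina government offices",
    ["North Carolina", "Government", "Ransomware"],
))
# prints: [[Ransomware]] attack hits [[North Carolina]] [[government]] offices
```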

Metadata is King

As you can see, not only did we run OCR on the PDF document, but we also extracted all of the relevant details. This allows analysts to find items of interest quickly and easily. How will you use the Jigsaw platform in your organization? Will you be able to quickly determine the risk levels of items of interest? The platform automatically highlights content of interest by the categories your organization is interested in finding.

This area-of-interest approach shows you details about the articles you are viewing and allows quick filtering by content.


So when ingesting unstructured data, you need to be able to quickly and easily identify, or be alerted to, items of interest that may affect the security posture of your organization. The Jigsaw Analytic Platform can ingest multiple streams and make those data streams instantly available to search, or alert on items that are of concern to your organization. Your domain name lets you monitor your company's reputation, email addresses may allow you to quickly find compromised accounts in your organization, and keywords pertaining to your business operations may give you insight into events on the world stage that would otherwise have been missed, such as wildfires in California or terrorist attacks in Bali. By putting hit lists and items of interest into your platform, you can make sure relevant details are not missed. With the Jigsaw Security Analytic Platform, we make this possible and more. Call your sales representative today for additional information or to see a demo of our platform in action.
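The hit list idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the platform's alerting engine: the `Alert` class, the `check_item` function, the category names and the `example.com` values are all made up for the example, and matching is a plain case-insensitive substring test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    item_id: str
    category: str
    term: str


def check_item(item_id, text, hit_list):
    """Return one Alert per hit-list term found in text (case-insensitive substring)."""
    lowered = text.lower()
    return [
        Alert(item_id, category, term)
        for category, terms in hit_list.items()
        for term in terms
        if term.lower() in lowered
    ]


# Hypothetical hit list: a plain dict, so an analyst can add terms mid-research.
hit_list = {
    "domain": ["example.com"],
    "email": ["ceo@example.com"],
    "keyword": ["wildfire", "ransomware"],
}

# Hypothetical incoming items: (id, extracted text).
stream = [
    ("item-1", "Dump posted containing ceo@example.com credentials"),
    ("item-2", "Wildfire spreads across northern California"),
    ("item-3", "Quarterly earnings call scheduled"),
]

for item_id, text in stream:
    for alert in check_item(item_id, text, hit_list):
        print(alert)
```

Note that the first item raises two alerts, one for the email address and one for the domain it contains. A substring test is the simplest choice and will over-match short terms; word-boundary matching is a reasonable refinement.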
