Overview

Since the advent of Google, search has been a cornerstone of the information age. At the touch of a button, it is possible to access any publicly indexed document. The drawback of conventional search engines such as Google and Bing is that they cannot be used on proprietary data sets.

Our client, a large multinational bank, had an internal database of PDF files containing information about whether a statement or comment should be flagged. Certain keywords and phrases used by traders could amount to insider trading, and the job of the compliance team was to closely monitor Bloomberg chat logs for such language. A very high number of flags were false, which meant that the team had to find the relevant document each time to check whether a flag was genuine.

Approach

DataSpartan was asked to solve the problem by creating a tool that would allow the compliance team to search its internal databases faster for the relevant information. A custom interface was built using Django, with PDF previews for keywords and phrases so that officers could check documents by eye for relevance. This is the first component of a larger system, which involves a document recommendation engine that explicitly highlights where the keywords are used in each document.
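
As a rough illustration of keyword search with previews, the sketch below finds each phrase in a set of documents and returns the surrounding text as a snippet. It is not the tool that was built for the client. It assumes the text of each PDF has already been extracted into plain strings, and the names `Hit`, `search_documents` and `context` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Hit:
    document: str  # hypothetical document identifier, e.g. a PDF file name
    phrase: str    # the phrase that was searched for
    snippet: str   # surrounding text shown as a preview


def search_documents(texts, phrases, context=40):
    """Return a preview snippet for every occurrence of each phrase.

    `texts` maps a document name to its already-extracted plain text
    (PDF text extraction is assumed to have happened elsewhere).
    """
    hits = []
    for phrase in phrases:
        # Escape the phrase so it is matched literally, ignoring case.
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        for name, text in texts.items():
            for match in pattern.finditer(text):
                # Take up to `context` characters on either side of the match.
                start = max(match.start() - context, 0)
                end = min(match.end() + context, len(text))
                # Collapse runs of whitespace so the preview fits on one line.
                snippet = " ".join(text[start:end].split())
                hits.append(Hit(name, phrase, snippet))
    return hits


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {
        "policy.pdf": "Traders must not share Material Non-Public information with clients.",
        "memo.pdf": "Lunch menu for Friday.",
    }
    for hit in search_documents(sample, ["non-public"]):
        print(f"{hit.document}: ...{hit.snippet}...")
```

In a web interface, snippets like these could be listed next to each document so that an officer can judge relevance before opening the full file.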

Results

The component is currently being integrated into the client's servers and is projected to save each officer 3 hours of work each month. Full integration of the larger system is projected to save 10 hours of work each month.