Why Data Curation is Vital to Your Financial Services Firm


This blog post was featured on 

One of the “Five Things Every Financial Services Professional Needs To Know For 2018” is that financial services firms are starting to manage their data more carefully as a valuable asset.

Imagine if you visited a museum and every piece on display was presented on one shelf with no labels or description. It would be impossible to determine what was valuable and what was worthless.

Data stored at large financial services firms faces the same challenge. Many terabytes are available, but unless the data is cataloged and curated, anyone who wants to work with it and create meaningful, accurate reports will not know where to begin.

Museums have solved this problem by having a curator responsible for cataloging, describing and managing the artifacts so others will find them useful. Similarly, financial services organizations have realized that data is actually their most valuable asset and also needs to be carefully cataloged and managed to make it useful.

Since we can’t put data behind glass with a plaque, we must use software tools instead. Software packages such as Alation, Collibra and Informatica offer businesses user-friendly ways to catalog and manage an organization’s data so that the right data can be sourced with high confidence and in a consistent manner throughout the organization.

Studies have shown that 90% of the time knowledge workers spend creating new reports is spent recreating information that already exists. In my work with banking clients, I have often seen other issues, including confusion about the context of a specific piece of data (e.g., interest rate). A bank has many database fields called “interest rate,” but which one applies to bonds and which to loans? This situation makes the data guru who has been around for a long time very valuable from a job security standpoint, but it impedes everyone else’s productivity.

Data dictionaries, which store information about data (also known as metadata), have been around for a while. Data catalogs take the concept further and act as a combination of a wiki and a search engine. Users can search data catalogs to seamlessly locate the required data and gain a better understanding of the data field and its intended use.
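To make the “wiki plus search engine” idea concrete, here is a minimal sketch in Python. It is not how Alation, Collibra or Informatica work internally; the table names, field names, descriptions and owners are all hypothetical, chosen to mirror the “interest rate” confusion described above.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """Metadata describing one database field (all values below are hypothetical)."""
    table: str
    field: str
    description: str
    owner: str


# A toy catalog: two different fields share the name "interest_rate".
CATALOG = [
    CatalogEntry("bond_positions", "interest_rate",
                 "Annual coupon rate of the bond, in percent", "Treasury"),
    CatalogEntry("loan_accounts", "interest_rate",
                 "Contractual rate charged to the borrower, in percent", "Retail Lending"),
    CatalogEntry("loan_accounts", "origination_date",
                 "Date the loan was funded", "Retail Lending"),
]


def search(catalog, query):
    """Return entries whose table, field name or description contains every query word."""
    words = query.lower().split()
    results = []
    for entry in catalog:
        haystack = f"{entry.table} {entry.field} {entry.description}".lower()
        if all(word in haystack for word in words):
            results.append(entry)
    return results


# Searching with context narrows two "interest_rate" fields down to the loan one.
for hit in search(CATALOG, "interest rate loan"):
    print(f"{hit.table}.{hit.field}: {hit.description} (owner: {hit.owner})")
```

Even in this toy version, the description and owner stored alongside each field are what let a business user tell the bond rate from the loan rate without asking the resident data guru.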

Data curation takes the process to the next level. It includes a crowdsourcing model of metadata and annotations. Users can upvote or downvote data assets, annotate why a certain filter was used or add nuance to a data definition, for example: “This field ties directly to Line 1 of the PPNR schedule of the FR Y-14Q report.”
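The crowdsourcing model can be pictured as a small extension of a catalog entry: each data asset carries votes and free-text annotations from its users. The sketch below is one possible shape for that, under the assumption that each user gets a single vote per asset; the class, field and user names are hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class CuratedField:
    """A data asset plus the crowdsourced feedback attached to it (hypothetical design)."""
    name: str
    definition: str
    votes: dict = field(default_factory=dict)        # user -> +1 (upvote) or -1 (downvote)
    annotations: list = field(default_factory=list)  # (user, note) pairs, oldest first

    def vote(self, user, up=True):
        # Assumption: one vote per user, so voting again overwrites the earlier vote.
        self.votes[user] = 1 if up else -1

    def annotate(self, user, note):
        self.annotations.append((user, note))

    @property
    def score(self):
        # Net score: upvotes minus downvotes.
        return sum(self.votes.values())


asset = CuratedField("loan_accounts.interest_rate",
                     "Contractual rate charged to the borrower, in percent")
asset.vote("alice")
asset.vote("bob")
asset.vote("carol", up=False)
asset.vote("alice")  # a repeat vote from the same user does not count twice
asset.annotate("bob", "This field ties directly to Line 1 of the PPNR schedule "
                      "of the FR Y-14Q report.")

print(asset.name, "score:", asset.score)
for user, note in asset.annotations:
    print(f"  {user}: {note}")
```

A net score like this gives the next analyst a quick signal about which of several similar fields colleagues actually trust, while the annotations capture the context that would otherwise live only in someone's head.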

Data cataloging and data curation can be viewed as the next step in the evolution of the self-service model for data analysis. Analytics tools popularized over the past few years, such as Tableau and QlikView, moved data analysis out of the realm of IT and into the hands of business users. Data cataloging takes this trend further by democratizing the sourcing of data as well: a business user can now find the right data without relying on a data jockey from the IT team.

Financial services professionals should capitalize on this democratization wave by getting on board with their organization’s data program as a data cataloger or data curator — no pocket protector required.

We have now covered three of the “Five Things Every Financial Services Professional Needs To Know For 2018”: data curation, LIBOR and CECL. Next in this series is an overview of Ripple, one of the hottest players in the cryptocurrency market.
