Government agencies are continually looking to extract greater insights from their massive amounts of data, which is why the public sector is integrating data lakes into its IT initiatives. Because they hold both structured and raw information, data lakes allow agencies such as the Department of Defense (DoD) to better use data as a strategic asset.
“If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state,” explained James Dixon, who coined the term “data lake.” “The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.”
Because they are accessible and place no restrictions on data types or sources, data lakes have become increasingly attractive to federal agencies. According to a report from ResearchAndMarkets.com, the global data lake market is anticipated to grow at a Compound Annual Growth Rate (CAGR) of 20.6 percent through 2024. The report attributes this growth to the need for deeper insights from growing volumes of data, as well as the need for simplified access to data held in legacy systems and departmental silos.
Earlier this year, the U.S. Census Bureau announced its interest in an enterprise data lake, with an estimated project budget of $22.3 million. According to Census Bureau leaders, an enterprise data lake would modernize data storage and improve data analysis capabilities. IT leaders at the Bureau believe a data lake, which would require an additional 30 full-time employees to manage, would increase the Bureau’s capacity to handle growing administrative records and large amounts of economic and demographic data.
The DoD also has plans to use a large-scale data lake. Through a partnership between Ironclad Technology Services and Qlik, an agency within the DoD will migrate data from ten legacy systems into an integrated data lake. According to a press release, Qlik’s solution will move the data more seamlessly and allow it to be organized downstream without first having to be moved upstream.
“We had to manually rebuild our data replication any time we changed anything in the legacy systems, which was extremely time-consuming,” explained Chris Hutcheson, vice president of operations at Ironclad. “We are able to automate these processes and manipulate tables however we need with Qlik data integration so that it can anticipate what the data will look like downstream.”
The importance of data lakes is not to be underestimated. Recognizing this, federal agencies including the U.S. Census Bureau and the DoD have announced plans for data lake integration as a way to better store and organize all types and sources of data, including structured, semi-structured, and raw. By moving from legacy systems to data lakes, agencies open the door to a more modern and simplified data management process, ultimately leading to greater insights that positively impact mission delivery.