Technical experts share tested and effective data minimization strategies

Data collection allows businesses to gather essential information for improving processes, reducing costs, delivering better customer service and more. However, it also comes with risk: Any organization that collects data, no matter its size, is a target for hackers. Further, new and evolving government regulations, including the EU General Data Protection Regulation and the California Consumer Privacy Act, are placing new responsibilities and restrictions on companies when it comes to collecting and using data.

Organizations across industries are exploring data minimization initiatives that focus not only on reducing the volume of data they already hold, but also on collecting less new data in the future. Below, 20 members of the Forbes Technology Council share practical and effective data minimization strategies that companies in a variety of industries can use, along with success stories they’ve overseen themselves.

1. Create a data map

To minimize the amount of data you hold, first identify where critical and sensitive information is stored. A data map that shows storage locations and security models helps you determine what to keep and how to manage it in a storage plan. Watching one organization create overlapping data management plans for multiple regulators highlighted the importance of cross-planning for regulatory compliance and future training. – Kathleen Hurley, Sage, Inc.
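A data map can start as a simple structured inventory. The sketch below is only an illustration: the `DataAsset` fields, the sensitivity labels and the two helper functions are assumptions, not a format described by the contributor. It shows how one inventory can be queried both for minimization candidates and for a per-regulator view, which is one way to avoid writing overlapping plans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    """One entry in a hypothetical data map."""
    name: str
    location: str            # where the data is stored
    sensitivity: str         # assumed labels: "public", "internal", "sensitive"
    security_model: str      # how access is controlled
    regulations: tuple = ()  # regulations that apply to this asset
    business_purpose: str = ""


def review_candidates(assets):
    """Sensitive assets with no documented purpose are the first to question."""
    return [a for a in assets if a.sensitivity == "sensitive" and not a.business_purpose]


def by_regulation(assets):
    """Group asset names by regulation so one map can serve several regulators."""
    index = {}
    for asset in assets:
        for regulation in asset.regulations:
            index.setdefault(regulation, []).append(asset.name)
    return index


if __name__ == "__main__":
    inventory = [
        DataAsset("crm_contacts", "crm-db", "sensitive", "role-based access",
                  ("GDPR", "CCPA"), "customer support"),
        DataAsset("legacy_exports", "shared-drive", "sensitive", "open share",
                  ("GDPR",)),
    ]
    print([a.name for a in review_candidates(inventory)])
    print(by_regulation(inventory))
```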

2. Adopt data mesh and data fabric architectures

Two modern data architecture paradigms can help with data minimization: the data mesh and the data fabric. The data mesh provides decentralized data management so that each domain collects, processes and maintains only the data it needs. The data fabric provides unified data access and management, which reduces the need for multiple copies of data and minimizes duplication and storage overhead. – Suri Nuthalapati, Cloudera

3. Discard irrelevant information

We helped a retail store clean up its customer data by keeping only essential information, such as names and contact details, and making sure it was all securely protected. This not only saved the store money on data storage, but also increased the security of customers’ personal information. By following strict data usage rules, we improved customer service without compromising privacy. – Balasubramani Murugesan, Digit7

4. Keep regulatory requirements in mind

With the introduction of GDPR, I saw data volumes increase significantly in retail banking. In cooperation with data protection officers, we led several simplification initiatives to strike a balance between meeting legal requirements and retaining only necessary data. We became very precise in our classifications, taking care to distinguish which types of data should be kept for a certain period (for example, 10 years) rather than just keeping all the information collected. – Luboslava Uram, Solvd Group

5. Leverage causality analysis

Using causality analysis in our AI work has significantly reduced the amount of data required to derive intelligent recommendations about a variety of decisions facing a Fortune 100 company. With less data, we can still achieve useful and actionable knowledge about manufacturing operations, potential market opportunities, optimizing customer service and improving culture. – Pravir Malik, QIQuantum

6. Take a multi-pronged approach

I worked with a financial services company to reduce data overload. The idea was to minimize the collection, storage and processing of sensitive data to reduce the risk of a breach and improve the efficiency of data management. We classified data, eliminated unnecessary collection points, implemented strict collection and retention policies, and applied data masking and anonymization, resulting in a 40% reduction in retained data. – Dutt Kalluri, Celsius Technologies
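Masking and anonymization can be illustrated with a small sketch. Everything named here (the record fields, the key and the balance bands) is a hypothetical example rather than the approach this team used. Note also that a keyed hash is pseudonymization, not full anonymization, because anyone holding the key can recompute the mapping.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real one would live in a secrets manager.
PSEUDONYM_KEY = b"example-key-rotate-me"


def mask_account_number(account_number: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with '*'."""
    hidden = len(account_number) - visible
    if visible <= 0 or hidden <= 0:
        # Too short to reveal anything safely, so mask the whole value.
        return "*" * len(account_number)
    return "*" * hidden + account_number[hidden:]


def pseudonymize(value: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Keyed hash: the same input always maps to the same reference."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def minimize_record(record: dict) -> dict:
    """Keep only what an analyst needs; raw identifiers never leave this function."""
    return {
        "customer_ref": pseudonymize(record["email"]),
        "account": mask_account_number(record["account_number"]),
        # Store a coarse band instead of the exact balance.
        "balance_band": "high" if record["balance"] >= 100_000 else "standard",
    }


if __name__ == "__main__":
    raw = {"email": "a@example.com", "account_number": "1234567890", "balance": 2500}
    print(minimize_record(raw))
```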

7. Reduce the amount of personal data collected

We have significantly reduced the amount of personal data we collect from users as part of our data minimization strategy. This not only increased user privacy, but also improved data security, leading to the near elimination of incidents. The strategy also fostered greater confidence among customers, supported broader regulatory compliance and improved overall customer satisfaction. – Michael Beygelman, Claro Analytics

8. Implement schema-based data creation

Implementing schema-based (structured) data generation on end-user devices has greatly reduced the computing power required for processing and minimized the amount of derived data that must be stored before gold data can be made available to engineers and data scientists. This approach has reduced our storage costs by 40%, improved our GDPR and CCPA compliance and increased customer confidence. – Ravi Bandlamudi, AtoB
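One way to picture schema-based data creation is a small validation step that runs on the device before anything is sent. The sketch below is an assumption-laden illustration: `EVENT_SCHEMA` and its three fields are invented for the example and are not the contributor's schema. Fields outside the schema are dropped at the source, so they never need to be stored or cleaned up downstream.

```python
from typing import Any

# Hypothetical schema: field name -> expected type.
EVENT_SCHEMA = {
    "device_id": str,
    "event_type": str,
    "timestamp": int,
}


def conform(raw: dict, schema: dict = EVENT_SCHEMA) -> dict:
    """Return a copy of `raw` containing only schema fields, validated by type."""
    missing = [name for name in schema if name not in raw]
    if missing:
        raise ValueError(f"missing required fields: {missing}")

    event: dict[str, Any] = {}
    for name, expected in schema.items():
        value = raw[name]
        if not isinstance(value, expected):
            raise TypeError(f"{name} should be {expected.__name__}")
        event[name] = value
    return event  # anything not named in the schema has been discarded


if __name__ == "__main__":
    captured = {
        "device_id": "d-1",
        "event_type": "tap",
        "timestamp": 1700000000,
        "gps": (52.1, 4.3),     # not in the schema, so never transmitted
        "contacts": ["alice"],  # not in the schema, so never transmitted
    }
    print(conform(captured))
```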

9. Eliminate duplicate and outdated data

Data cleansing is essential for organizations looking to minimize the amount of data they store. A life sciences company we worked with had accumulated various legacy systems and needed to merge all of its data into a single SAP ECC instance. We helped the company eliminate duplicate data and remove obsolete and unnecessary data before migrating to the new, unified system. – Kevin Campbell, Syniti
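A pre-migration cleansing pass of this kind might look roughly like the following. This is a generic sketch with assumed field names (`name`, `email`, `last_updated`) and an assumed matching rule. It is not Syniti's tooling or an SAP API, and real matching logic is usually far more involved.

```python
from datetime import date


def normalize_key(record: dict) -> tuple:
    """Treat records as duplicates when name and email match after normalization."""
    return (record["name"].strip().lower(), record["email"].strip().lower())


def cleanse(records: list, cutoff: date) -> list:
    """Drop records older than `cutoff`, then keep the newest copy of each duplicate."""
    latest = {}
    for record in records:
        if record["last_updated"] < cutoff:
            continue  # obsolete: do not migrate
        key = normalize_key(record)
        current = latest.get(key)
        if current is None or record["last_updated"] > current["last_updated"]:
            latest[key] = record
    return list(latest.values())


if __name__ == "__main__":
    legacy = [
        {"name": "Ada Lovelace", "email": "ada@example.com", "last_updated": date(2023, 1, 1)},
        {"name": " ada lovelace ", "email": "ADA@example.com", "last_updated": date(2024, 1, 1)},
        {"name": "Old Customer", "email": "old@example.com", "last_updated": date(2010, 1, 1)},
    ]
    for row in cleanse(legacy, cutoff=date(2018, 1, 1)):
        print(row)
```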

10. Implement a timed retention and purge policy

Internally, we recently implemented a seven-year data retention and purge policy. We had veteran employees with records from the 2000s! Our clients’ project files were just as old. Keeping old data made the risk of having to disclose a breach 300% higher than it is now. Users initially resisted, but by applying some change management principles, we deleted all but a few HR and financial records without complaint. – Chris Stegh, eGroup | Enabling Technologies
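A timed retention policy ultimately comes down to a cutoff date plus a list of exemptions. The sketch below assumes a seven-year window approximated as 7 × 365 days, and it treats "hr" and "finance" as hypothetical category labels for the exempt records. The record layout is likewise invented for illustration.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7 * 365)    # roughly seven years (ignores leap days)
EXEMPT_CATEGORIES = {"hr", "finance"}  # hypothetical labels for records kept longer


def split_for_purge(records, now=None):
    """Return (keep, purge) lists based on record age and category."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - RETENTION
    keep, purge = [], []
    for record in records:
        expired = record["created_at"] < cutoff
        if expired and record["category"] not in EXEMPT_CATEGORIES:
            purge.append(record)
        else:
            keep.append(record)
    return keep, purge


if __name__ == "__main__":
    utc = timezone.utc
    files = [
        {"id": 1, "category": "project", "created_at": datetime(2005, 6, 1, tzinfo=utc)},
        {"id": 2, "category": "hr", "created_at": datetime(2005, 6, 1, tzinfo=utc)},
        {"id": 3, "category": "project", "created_at": datetime(2022, 6, 1, tzinfo=utc)},
    ]
    keep, purge = split_for_purge(files, now=datetime(2024, 1, 1, tzinfo=utc))
    print("keep:", [r["id"] for r in keep])
    print("purge:", [r["id"] for r in purge])
```

Returning the purge list instead of deleting in place leaves room for a review step before anything is removed, which may help with the user resistance described above.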

11. Replace sensitive elements with tokens

In digital advertising, data plays a crucial role, and the key is to collect only the data you need. At the same time, I recommend that businesses replace sensitive elements with tokens. Since tokens cannot be accidentally decrypted, the system is more secure, which benefits both the organization and its customers. – Roman Vrublivskyi, SmartHub
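To make the idea concrete, here is a minimal in-memory sketch of tokenization. The `TokenVault` class and the `tok_` prefix are assumptions for illustration. A production vault would be a separately secured, access-controlled service rather than a Python dictionary. Because each token is random rather than derived from the value, there is nothing to decrypt: the only way back to the original is a lookup in the vault.

```python
import secrets


class TokenVault:
    """Hypothetical vault mapping random tokens to the sensitive values they replace."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return a stable random token for `value`, creating one if needed."""
        existing = self._value_to_token.get(value)
        if existing is not None:
            return existing
        token = "tok_" + secrets.token_urlsafe(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; raises KeyError for unknown tokens."""
        return self._token_to_value[token]


if __name__ == "__main__":
    vault = TokenVault()
    token = vault.tokenize("user@example.com")
    # Downstream systems store and join on the token only.
    print(token)
    print(vault.detokenize(token))
```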

12. Remove fields from receipt forms

We decided to remove some of the fields on certain receipt forms and customer lists to increase productivity. The result was master lists that are easier and faster to analyze, along with improved productivity across the board. Sometimes, simple is better. – Michael Gargiulo, VPN.com

13. Audit, anonymize and automate

At a financial company, I led a data minimization initiative that included auditing and anonymizing data and automating data cleansing. This increased data security, reduced storage costs, improved compliance and customer trust, and increased operational efficiency. The initiative protected the company from security threats and compliance issues, benefiting both the company and its customers. – Sumit Bhatnagar, JP Morgan Chase

14. Remember that quality trumps quantity

AI requires large amounts of data, but as always, quality trumps quantity. By looking carefully at data quality, a computer vision company managed to reduce the number of images needed to train models by about 16%. It did so by removing similar and poor-quality images and improving the quality of annotations. The result was not only better performance, but also cheaper and faster AI training. – Erik Aasberg, eSmart Systems

15. Use ML to filter out unnecessary data

We conducted frequent data audits, redesigned our collection policies, and implemented machine learning algorithms to filter out unnecessary data. These efforts resulted in a 25% improvement in system performance, increased data security, increased customer satisfaction, and led to a 20% reduction in data management costs. – Ketan Anand, Suuchi

16. Break down departmental silos

I worked with a government that wanted to consolidate spending from all departments into a centrally managed procurement panel. The issue was the proliferation of data: each department kept its own data and there was no data management. By breaking down these silos so that all data could be analyzed in one place, the government realized over $1 million in savings in the first three months. – Lewis Wynne-Jones, ThinkData Works

17. Foster a culture of privacy and responsibility

A data minimization initiative should include fostering a culture of privacy and responsibility within the company. Employees should become more aware of the importance of data privacy and their role in protecting sensitive information, as this will lead to better data handling practices across departments. Automation tools can simplify compliance checks and data management processes. – Roman Reznikov, Intellias

18. Think of data as a “toxic asset”

Our policy is to treat data as a “toxic asset.” This is a useful concept to keep in mind when dealing with data. As we would with a toxic substance, we aim to minimize handling of data, limit the number of people who come into contact with it, reduce the amount of time we store it and reduce exposure. – M. Nash, Integrity

19. Focus on data collection only for specific and defined purposes

We have benefited the most by optimizing the user analytics process. We revised our data collection practices by implementing a strategy of collecting only the essential data points needed for performance analytics and improving the user experience. This has reduced the amount of data stored and improved security, regulatory compliance and customer confidence. – Phil Portman, Textrip

20. Implement edge devices to minimize data sent to the cloud

Production generates 18 PB of data per year. We have implemented edge devices to normalize and contextualize data and detect anomalies, sending only useful data to the cloud. This reduces costs and avoids over-processing. For example, when anomalies are detected in images or videos, only the selected files are sent for further analysis and model training. The approach has also significantly optimized data handling and cost efficiency. – Ravi Soni, Amazon Web Services
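The filtering step on an edge device can be sketched with a simple statistical check. The z-score rule below is a stand-in chosen for brevity, not the detection method used in this deployment, and `select_for_upload` and its threshold are hypothetical. The point is only that normal readings stay on the device and the cloud receives the small remainder.

```python
from statistics import mean, pstdev


def select_for_upload(readings, threshold=3.0):
    """Return (index, value) pairs whose z-score within the batch exceeds `threshold`."""
    if len(readings) < 2:
        return []
    mu = mean(readings)
    sigma = pstdev(readings)
    if sigma == 0:
        return []  # every reading is identical, so nothing stands out
    return [
        (index, value)
        for index, value in enumerate(readings)
        if abs(value - mu) / sigma > threshold
    ]


if __name__ == "__main__":
    batch = [10.0] * 20 + [100.0]  # one obvious outlier in 21 readings
    to_cloud = select_for_upload(batch)
    print(f"uploading {len(to_cloud)} of {len(batch)} readings: {to_cloud}")
```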
