

The most critical component of any vehicle is its engine. Without it, the vehicle ceases to operate. For a business, the same can be said about its data. Without data, the company would cease to operate.
As history tells us, engines and data each started out with a physical commodity: water and paper, respectively. Ironically, both have evolved through similar major milestones, and both now align with AI…

In fact, engines have also grown in their capacity to generate data. According to Tuxera.com, autonomous cars generate 300TB of data per year. So, it raises the question… is the engine still the critical component of the vehicle, or is it the data?
This fun analogy brings us to January’s annual Data Privacy Week and gives us more to contemplate as we consider our tolerances for the privacy and security of our data. While data has always been a constant, what has changed is the level of detail in the data being consumed. Take our engine example: initially, the metrics collected covered temperatures, pressure, fuel consumption, and emissions to help engineers improve design and performance. Over time, more data metrics were discovered that could enhance the business in additional ways, such as fuel and energy efficiency, location and GPS, safety and diagnostics, usage and driving behavior, infotainment and connectivity, etc. These data sources all provide a tremendous amount of information that can potentially improve the vehicle and the overall driver and passenger experience. However, the more data is generated, correlated, and consumed, the more it crosses into the territory of an invasion of privacy, particularly when it captures location, habits, and usage associated with the owner and/or passenger(s).
Consumers in Europe were the first to draw a line and call for policies regulating the data generated and how it is associated with individuals. In fact, Europe’s Data Protection Day falls on January 28th, the date in 1981 when the Council of Europe’s data protection convention (Convention 108) was opened for signature. The US and Canada didn’t create their respective Data Privacy Day until 2008, and today an entire week is dedicated to socializing the importance of data privacy. While the US is slowly establishing privacy policies at the state level (except for healthcare data, which has been regulated for many years), there is still a very long way to go. Fewer than half of the states have privacy laws, and only a couple offer comprehensive coverage.
Let’s take a real-world look at the issue, continuing with our engine example. Data is now being collected on all these aspects of our vehicle and the time spent in it. The business then realizes it could share or, even better, sell this data: to the infotainment manufacturer to make the features better, to the insurance industry for insight into safety and driving habits, and to businesses that want GPS information to understand customer patterns. From here the options continue to explode. We call this second source data, and it can continue replicating as third/fourth/fifth… source data for other businesses. This becomes especially relevant in the new world of AI.
We can think of data for AI like oil for an engine: there is pure (conventional) oil and synthetic oil. Conventional oil comes directly from crude oil, whereas synthetic oil is chemically engineered. In AI we have pure data and synthetic data. Pure data is real data, and synthetic data is artificially generated. Now, today’s engines are designed to run better on synthetic oil, but that is not the case for AI. For AI to be most effective, it needs pure data. That makes data protection one of the most consequential moves in establishing your AI program.
In today’s world, Data Protection should be what Endpoint Protection Platforms (EPP) or Endpoint Detection and Response (EDR) are for all network-connected devices… a standard device posture requirement! In most organizations you cannot even join the VPN if your device doesn’t have EPP/EDR enabled. Data Protection should likewise be treated as a standard network posture requirement. As the need for data in AI continues to intensify well beyond first source and into second/third/fourth/fifth source data, the protection of data should be of utmost importance to every organization.
When I think about protecting data, I visualize a transformer toy that can transform into five key areas of protection: privacy, access, mask, encryption, and distribution.
Thanks to AI graphics, my transformer named PAMED (pa-med) comes into view as our Full Data Protection Transformer…

Privacy
• Classifying and categorizing what the data is. Tags could be applied at this level as well for further enhancement down the protection path.
Access
• Based on classification and category, owners and associated access controls can be applied.
Mask
• Depending on the classification or category the data may need to be masked for further protection where it resides or is accessed, particularly if it’s going to be distributed.
Encryption
• Whether at rest or in transit, this will keep the data protected from unauthorized access.
Distribution
• Ensure appropriate sharing of your data, whether that is with other users, AI, or outside entities.
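As a rough illustration of how these areas could chain together, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the labels, the roles, and the function names are hypothetical and do not come from any particular product or framework. Encryption is deliberately left out, because it belongs to a vetted cryptography library rather than hand-written code.

```python
# Hypothetical classification rules: field name -> sensitivity label.
CLASSIFICATION = {
    "gps_location": "personal",
    "driving_behavior": "personal",
    "engine_temperature": "operational",
    "fuel_consumption": "operational",
}

# Hypothetical access policy: role -> labels that role may read.
ACCESS_POLICY = {
    "engineer": {"operational"},
    "privacy_officer": {"operational", "personal"},
}


def classify(field: str) -> str:
    """Privacy: label a field; unknown fields get the strictest label."""
    return CLASSIFICATION.get(field, "personal")


def can_access(role: str, field: str) -> bool:
    """Access: allow a read only if the role is cleared for the field's label."""
    return classify(field) in ACCESS_POLICY.get(role, set())


def mask(value: str, visible: int = 2) -> str:
    """Mask: keep the last `visible` characters and hide the rest."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def prepare_for_distribution(record: dict, role: str) -> dict:
    """Distribution: drop fields the role cannot read and mask personal ones."""
    shared = {}
    for field, value in record.items():
        if not can_access(role, field):
            continue  # the role is not cleared for this field
        if classify(field) == "personal":
            shared[field] = mask(str(value))
        else:
            shared[field] = value
    return shared


if __name__ == "__main__":
    record = {"gps_location": "41.8781,-87.6298", "engine_temperature": 92}
    print(prepare_for_distribution(record, "engineer"))
    print(prepare_for_distribution(record, "privacy_officer"))
```

In this sketch, unknown fields and unknown roles both fall back to the most restrictive outcome, so a new data source stays protected until someone classifies it.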
It’s evident that data has become a true treasure, surpassing even a pirate’s gold and jewels. The protection of data can get complicated, but not if it is planned and executed using generally available frameworks and programs. Sayers is readily available to assist you in developing and operationalizing a robust data protection strategy that aligns with any AI initiatives. Please reach out to hello@sayers.com.