Where personal data is held within the Apama platform
Most deployments of Apama deal with personal data only in customer-defined data fields, which are largely under the control and responsibility of the customer who writes and deploys the Apama application.
In Apama applications, customer-defined data is usually held in memory in EPL event fields and monitor variables, by connectivity plug-ins, or by EPL plug-ins such as the MemoryStore.
Customer-defined application data may also be stored at rest in the following places:
*Correlator, IAF and dashboard server log files (the main log file, and for the correlator also any additional files defined by eplLogging and correlatorLogging configuration). These files include logging performed by the customer's application and by standard Apama connectivity and EPL plug-ins and the correlator itself. For example, the contents of Apama events are often logged (either in full, or truncated) if an error occurs during processing or sending of the event, and data from events or other EPL data structures may be logged as part of correlator error messages.
*Correlator input log. If enabled, it contains the contents of all events sent into the correlator.
*Correlator persistence database (and if using JMS, the reliable receive database). If enabled, it contains an on-disk representation of the state of the Apama application.
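As a rough illustration of auditing the at-rest locations listed above, the following Python sketch scans text lines (for example, lines read from a log file) for strings that look like IPv4 addresses or email addresses, and can redact them. The regular expressions, the function names redact_line and find_personal_data, and the sample log line are assumptions made for this example. They are not part of Apama, do not reflect the actual Apama log format, and are not a complete definition of what counts as personal data in your environment.

```python
import re

# Deliberately simple, illustrative patterns (assumptions for this sketch).
# They will miss some personal data and may flag some harmless strings.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def redact_line(line: str) -> str:
    """Return the line with email-like and IPv4-like strings replaced."""
    line = EMAIL_RE.sub("<email>", line)
    line = IPV4_RE.sub("<ip>", line)
    return line


def find_personal_data(lines):
    """Yield (line_number, kind, matched_text) for each suspicious string."""
    for number, line in enumerate(lines, start=1):
        for kind, pattern in (("email", EMAIL_RE), ("ipv4", IPV4_RE)):
            for match in pattern.finditer(line):
                yield number, kind, match.group(0)


if __name__ == "__main__":
    # Hypothetical log line, made up for this example.
    sample = "Connection from 192.168.0.12 for user alice@example.com"
    print(redact_line(sample))
    print(list(find_personal_data([sample])))
```

A scan like this only reports candidates for review. Binary stores such as the persistence database cannot be inspected this way, so it is best treated as a supplement to, not a replacement for, knowing what your application writes.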
Apama also provides the ability to store customer-defined data in external systems such as a Terracotta distributed cache (using our MemoryStore API) or a database. Consult the documentation of these systems for information about how to ensure that personal data written there by your application is properly handled and protected. You should also check that your Apama application logic includes mechanisms to rectify or erase personal data stored there, if required.
We strongly advise against allowing any personal data to exist in the application logic itself (the EPL source files), and this documentation assumes that this principle is being followed.
In addition to the customer-defined data mentioned above, there are a small number of situations where the Apama platform could potentially be considered to directly handle personal data. You should establish which of the users listed below, in your own environment, represent the personal data of an identifiable human protected by legislation, and which merely represent machine-to-machine communication or system administrators who have accepted the logging of their user name and IP address as part of their terms of employment.
Each entry below gives the product area, the potential "personal data", and where the data could be stored.

Product area: Correlator, IAF, dashboard servers
Potential "personal data": User identifiers and IP addresses for direct connections to/from Apama server processes (typically only for machine-to-machine communication between server processes, or monitoring and management by system administrator accounts). These are logged to provide an audit trail in case of an attack or accidental mistake by a system administrator.
Where data could be stored:
*correlator main log file
*correlator input log
*correlator persistence database
*IAF and dashboard server log files

Product area: Scenario Service API, queries, DataViews, dashboard servers, custom clients and dashboards
Potential "personal data": The Scenario Service event protocol contains a username field, identifying users who created instances of scenarios (for example, DataViews or queries). There are various places where this username could show up.
Note: Apama queries are deprecated and will be removed in a future release.
Where data could be stored:
*correlator and dashboard server log files
*correlator input log
*correlator in-memory state
*correlator persistence database

Product area: Dashboard servers
Potential "personal data": User identifiers and IP addresses of dashboard clients that connect, who may be end-users. These are logged to provide an audit trail.
Where data could be stored:
*dashboard server log files

Product area: Dashboard servers, if using JAAS
Potential "personal data": User identifiers of dashboard clients for authentication purposes. These are logged to provide an audit trail.
Where data could be stored: Under the control of the JAAS plug-in used. For example, the UserFileLoginModule provided by Apama stores usernames in plaintext in an XML file, whereas other plug-ins are available that hold usernames on a remote server such as an LDAP server. See Administering Dashboard Security. If required, you can choose a JAAS plug-in that complies with how you need to protect the user data.

Product area: HTTP server connectivity plug-in
Potential "personal data": User identifiers and IP addresses of clients that connect to the HTTP server, as specified in the HTTP header. These are written to the log file. Along with other HTTP headers, they are also present in the message metadata, so they can optionally be mapped to fields in an Apama event using a connectivity codec such as the mapper codec.
Where data could be stored:
*correlator main log file
*correlator input log file
*correlator in-memory state
*correlator persistence database

Product area: HTTP server connectivity plug-in, only if authentication is enabled
Potential "personal data": User identifiers of clients who are permitted to connect to the HTTP server (with a secure hash of the passwords).
Where data could be stored:
*HTTP server authentication password file, which is stored on disk in plaintext and contains unencrypted usernames and salted, hashed passwords (see Authentication). As the file is completely under the user's control, you can use standard tools included with your operating system to set access control for protecting this data as needed. Users can be deleted from the file using a text editor or the httpserver_passman tool provided by Apama, as described in the documentation.
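
As one possible way of applying operating system access control to a file such as the HTTP server authentication password file, the following Python sketch restricts a file to its owner on a POSIX system. The function name restrict_to_owner and the path shown in the usage comment are hypothetical; the real location of the password file depends on your configuration. On Windows, os.chmod only controls the read-only flag, so a different mechanism (such as file ACLs) would be needed there.

```python
import os
import stat


def restrict_to_owner(path: str) -> int:
    """Set POSIX mode 0600 (owner read/write only) and return the new mode."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    # Read the permission bits back so the caller can verify the result.
    return stat.S_IMODE(os.stat(path).st_mode)


# Usage (hypothetical path, shown for illustration only):
#   mode = restrict_to_owner("/path/to/http-server-passwords.txt")
#   print(oct(mode))
```

This is equivalent to running chmod 600 on the file. Whether owner-only access is appropriate depends on which account runs the correlator and who needs to administer the file.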