Data Minimisation
BrainPredict is architected around the principle that AI should use no more data than accurate prediction requires. Every layer of the stack, from data ingestion to model training to audit logging, enforces minimisation by default, with no configuration required.
6 Minimisation Principles — Active by Default
Collection Minimisation
GDPR Art. 5(1)(c): Only the features statistically necessary for prediction accuracy are ingested. The AI Readiness Connector computes feature importance before any data enters the pipeline, and irrelevant columns are dropped at source.
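As an illustration of dropping columns at source, the sketch below scores each candidate column against the prediction target and keeps only those above a threshold. It is a minimal stand-in, not the AI Readiness Connector itself: the importance measure (absolute Pearson correlation), the `select_columns` helper, the threshold of 0.1, and the column names are all assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal about the target
    return cov / sqrt(var_x * var_y)


def select_columns(columns, target, threshold=0.1):
    """Return the names of columns whose |correlation| with the target meets the threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical candidate columns and target, for illustration only.
columns = {
    "monthly_spend": [10.0, 20.0, 30.0, 40.0, 50.0],
    "support_tickets": [5.0, 4.0, 3.0, 2.0, 1.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0, 1.0],
    "row_parity": [0.0, 1.0, 0.0, 1.0, 0.0],
}
target = [1.0, 2.0, 3.0, 4.0, 5.0]

kept = select_columns(columns, target)
dropped = [name for name in columns if name not in kept]
```

Only the columns in `kept` would be ingested; those in `dropped` never enter the pipeline. A production importance measure would likely be model-based rather than a simple linear correlation.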
At-Rest Encryption
GDPR Art. 32: All training caches are encrypted at rest with AES-256-GCM, with the encryption keys protected by Kyber-768 post-quantum key encapsulation. BrainCode™ compression reduces the size of the payload to be encrypted by 60–80%.
Pseudonymisation by Default
GDPR Art. 4(5): The Installation Wizard automatically detects PII fields (direct identifiers such as names, emails, and national IDs) and applies salted SHA-256 one-way hashing before data enters any AI model. These identifiers are never seen in clear text by the ensemble.
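The hashing step can be sketched with Python's standard `hashlib` module. This is an illustrative sketch under stated assumptions, not the Installation Wizard's code: the `pseudonymise` helper, the fixed `PII_FIELDS` set (standing in for automatic detection), and the example salt are hypothetical.

```python
import hashlib

# Hypothetical list of detected PII fields; the real detection logic is not shown here.
PII_FIELDS = {"name", "email", "national_id"}


def pseudonymise(record, salt, pii_fields=PII_FIELDS):
    """Return a copy of the record with PII values replaced by salted SHA-256 digests."""
    out = {}
    for key, value in record.items():
        if key in pii_fields:
            # Salt is prepended so identical values hash differently per deployment.
            digest = hashlib.sha256(salt + str(value).encode("utf-8")).hexdigest()
            out[key] = digest
        else:
            out[key] = value  # non-PII features pass through unchanged
    return out


salt = b"example-salt-change-me"  # assumption: in practice a secret, per-deployment value
record = {"name": "Ada Lovelace", "email": "ada@example.com", "monthly_spend": 42.5}
safe = pseudonymise(record, salt)
```

The same input and salt always yield the same digest, so records for one person can still be linked without exposing the identifier. Because identifiers such as emails are guessable, the salt should be kept secret; a keyed construction such as HMAC-SHA256 is a common alternative.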
BrainCode™ Compression
Data Minimisation: A Tabular Autoencoder compresses training data into a 32-dimensional latent space, discarding statistical noise while preserving 98%+ of the signal. Latent vectors replace raw records in storage.
Automated Retention Limits
GDPR Art. 5(1)(e): The training cache enforces a configurable rolling window (default: 10,000 samples). Records outside the window are purged automatically, with no manual cleanup required.
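A count-based rolling window of this kind can be sketched with a bounded `collections.deque`, which discards the oldest entry whenever a new one would exceed the limit. The `RollingCache` class and its method names are assumptions for illustration; only the 10,000-sample default comes from the description above.

```python
from collections import deque


class RollingCache:
    """Keeps at most `max_samples` records; the oldest are purged automatically."""

    def __init__(self, max_samples=10_000):
        # A deque with maxlen drops items from the left when it is full.
        self._samples = deque(maxlen=max_samples)

    def add(self, sample):
        self._samples.append(sample)

    def oldest(self):
        return self._samples[0]

    def __len__(self):
        return len(self._samples)


# Small window so the purge is visible in the example.
cache = RollingCache(max_samples=3)
for i in range(5):
    cache.add({"id": i})
```

After five inserts into a three-sample window, the records with `id` 0 and 1 are gone and the oldest remaining record has `id` 2.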
Consent Ledger
GDPR Art. 7: Every data subject's consent status is tracked in an append-only ledger. Withdrawal of consent triggers automated model retraining, with the subject's data excluded, within 24 hours.
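One way to picture the ledger is as a list of immutable consent events in which the latest event per subject determines the current status. The sketch below is a simplified assumption rather than the product's implementation: `ConsentEvent`, `ConsentLedger`, and `training_rows` are hypothetical names, and the 24-hour retraining trigger is not modelled.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentEvent:
    subject_id: str
    granted: bool
    recorded_at: datetime


class ConsentLedger:
    """Append-only: events are only ever added, never edited or removed."""

    def __init__(self):
        self._events = []

    def record(self, subject_id, granted):
        event = ConsentEvent(subject_id, granted, datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def has_consent(self, subject_id):
        # The most recent event for a subject wins; no event means no consent.
        status = False
        for event in self._events:
            if event.subject_id == subject_id:
                status = event.granted
        return status

    def history(self):
        return tuple(self._events)  # read-only view of the full trail


def training_rows(rows, ledger):
    """Keep only rows whose subject currently has consent on record."""
    return [row for row in rows if ledger.has_consent(row["subject_id"])]


ledger = ConsentLedger()
ledger.record("subj-1", granted=True)
ledger.record("subj-2", granted=True)
ledger.record("subj-1", granted=False)  # withdrawal is a new event, not a deletion

rows = [{"subject_id": "subj-1", "x": 1.0}, {"subject_id": "subj-2", "x": 2.0}]
eligible = training_rows(rows, ledger)
```

Here `eligible` contains only the row for `subj-2`, while the ledger still holds all three events, so the withdrawal itself remains auditable.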
Data Minimisation Stack
| Layer | Technology | Action |
|---|---|---|
| Data Ingestion | Apache Arrow + AI Readiness Connector | Column-level feature selection; irrelevant columns dropped at source |
| Pseudonymisation | SHA-256 + configurable salt | PII fields hashed before touching any model |
| Compression | BrainCode™ Tabular Autoencoder | 60–80% size reduction, 98%+ signal fidelity |
| Encryption | Kyber-768 (PQC) + AES-256-GCM | At-rest encryption of all training caches |
| Retention | Rolling cache (10,000 samples default) | Automatic purge of oldest records |
| Audit | Dilithium-3 signed audit log | Every data access event logged and signed |