Yaw Joseph Etse

Leaking sensitive data via membership inference attacks on machine learning models

The paper, "Membership Inference Attacks against Machine Learning Models," shows how easy it is to expose how models can inadvertently leak sensitive information about the data they were trained on.

Membership inference attacks, which essentially allow an adversary to determine whether a particular data record was part of a model's training set, are another reason why the work happening in the differential privacy space is so interesting and valuable. It's extremely common to deal with confidential and sensitive data in enterprise settings, and the assurance that this data cannot be reverse-engineered or exposed is critical.

The authors do a nice job walking through multiple experiments and evaluations, demonstrating that machine learning models, especially overfitted ones, are susceptible to these attacks. They show that models behave differently when queried with data they have seen before than with unseen data, and this difference in behavior can be exploited to infer membership.
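To make that intuition concrete, here is a minimal sketch of the simplest version of the idea: an overfitted model tends to be far more confident on its own training records than on unseen ones, so even a crude confidence threshold leaks membership. The scikit-learn models, the synthetic dataset, and the 0.9 threshold are my own illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a confidence-threshold membership inference test.
# Dataset, model choice, and the 0.9 threshold are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# Deliberately let the target model overfit so the train/test gap is visible.
target = RandomForestClassifier(n_estimators=50, max_depth=None, random_state=0)
target.fit(X_train, y_train)

def top_confidence(model, X):
    """Highest predicted class probability for each record."""
    return model.predict_proba(X).max(axis=1)

train_conf = top_confidence(target, X_train)  # records the model has seen
test_conf = top_confidence(target, X_test)    # records it has not seen

# Naive attack: guess "member" whenever confidence exceeds a threshold.
threshold = 0.9
print(f"flagged as members: train={(train_conf > threshold).mean():.2f}, "
      f"test={(test_conf > threshold).mean():.2f}")
```

If the fraction flagged on training records is much higher than on held-out records, an adversary already gains a meaningful signal about membership.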

A key contribution of the paper is the introduction of a technique called "shadow training." The idea is to train models (referred to as shadow models) that mimic the behavior of the target model, using data similar to the target model's training data. Because the attacker controls the shadow models, they know exactly which records were in each shadow model's training set, so the shadow models' outputs can be labeled and used as a dataset for training an attack model. The attack model learns to distinguish outputs produced on training members from outputs produced on non-members, and it can then be applied to the target model's outputs to infer membership of new data records.
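Below is a rough sketch of that pipeline, assuming access to auxiliary data drawn from a distribution similar to the target's training data. The model choices, the number of shadow models, and the simplification of using a single attack model (the paper trains one attack model per output class) are all my own assumptions for brevity.

```python
# Rough sketch of shadow training; auxiliary data, model choices, and the
# single shared attack model are simplifying assumptions, not the paper's setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

aux_X, aux_y = make_classification(n_samples=6000, n_features=20, random_state=1)

attack_X, attack_y = [], []
n_shadows = 5
for i in range(n_shadows):
    # Each shadow model gets its own "in" (training) and "out" (held-out) split.
    Xs, _, ys, _ = train_test_split(aux_X, aux_y, train_size=2000, random_state=i)
    X_in, X_out, y_in, y_out = train_test_split(Xs, ys, test_size=0.5, random_state=i)

    shadow = RandomForestClassifier(n_estimators=50, random_state=i)
    shadow.fit(X_in, y_in)

    # Label the shadow model's output vectors: 1 = in its training set, 0 = not.
    attack_X.append(shadow.predict_proba(X_in));  attack_y.append(np.ones(len(X_in)))
    attack_X.append(shadow.predict_proba(X_out)); attack_y.append(np.zeros(len(X_out)))

attack_X = np.vstack(attack_X)
attack_y = np.concatenate(attack_y)

# The attack model learns to tell "member" output vectors from "non-member" ones.
attack_model = LogisticRegression(max_iter=1000)
attack_model.fit(attack_X, attack_y)

# To attack a target model, feed its predict_proba() output for a candidate
# record into attack_model.predict().
```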

The implications of this are far-reaching. Whenever the usefulness of privacy-enhancing technologies comes up, it's appropriate to raise the risks associated with membership inference attacks and to take steps to mitigate them. That includes being mindful of the trade-off between model accuracy and vulnerability to such attacks, and implementing strategies to prevent overfitting.

Some questions that arise from this research, and that I believe warrant further exploration, include:

  1. How can we effectively measure the susceptibility of our models to membership inference attacks? (One simple starting point is sketched after this list.)

  2. What are the best practices for implementing shadow training in a real-world scenario, and how can we ensure its effectiveness?

  3. Are there specific types of data or model architectures that are more prone to these attacks, and how can we safeguard against this?
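On the first question, one simple way to gauge exposure (an assumed auditing approach, not one prescribed by the paper) is a "membership advantage" metric: the gap between how often the attack flags known members and how often it wrongly flags known non-members. The helper below is hypothetical and reuses the `attack_model` and `target` objects from the sketches above.

```python
# Hypothetical auditing helper: membership advantage = attack TPR - FPR.
# 0 means the attack does no better than chance; values near 1 mean heavy leakage.
def membership_advantage(attack_model, target_model, member_X, nonmember_X):
    """Difference between the attack's true-positive and false-positive rates."""
    tpr = attack_model.predict(target_model.predict_proba(member_X)).mean()
    fpr = attack_model.predict(target_model.predict_proba(nonmember_X)).mean()
    return tpr - fpr

# Example, reusing objects from the earlier sketches:
# print(membership_advantage(attack_model, target, X_train, X_test))
```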

I would be interested in engaging with researchers who are currently delving deeper into these questions and in exploring potential collaborations to address these vulnerabilities.
