AP CSA Unit 4.1: Data Privacy and Ethics Practice

Unit 4, Section 4.1
Day 1 Practice • January 7, 2026
🎯 Focus: Data Privacy and Ethics

Practice Question

Scenario:
A fitness tracking app collects data about users' daily steps, heart rate, sleep patterns, and GPS location throughout the day. The company's terms of service state that they may "share aggregated, anonymized data with third parties for research purposes."

A data scientist discovers that by combining the "anonymized" location data with publicly available information (home addresses from property records), they can identify specific users and their daily routines.

Which of the following best describes the primary ethical concern in this situation?

What This Tests: Section 4.1 addresses the ethical and social implications of data collection—a critical topic as software increasingly handles personal information. The AP CSA curriculum requires understanding privacy concerns, responsible data handling, and the potential for harm even when data is "anonymized." This tests whether you can identify genuine privacy violations.

Key Concept: De-anonymization and Privacy

Anonymization is the process of removing personally identifiable information (PII) from data. However, true anonymization is extremely difficult because:

  • Data combination: Multiple "anonymized" datasets can be combined to re-identify individuals
  • Unique patterns: Location data, browsing habits, and behavior patterns are often unique enough to identify specific people
  • Public records: Combining anonymized data with public information (property records, social media, etc.) can break anonymity

Example: Even if the app removes names from the data, knowing someone lives at "123 Main St" (public record) and seeing "anonymized" location data showing someone regularly at that address reveals their identity.
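
The linkage described in this example can be sketched in Java. This is only an illustration of the idea, not how any real app or attacker works: the `Ping` and `PropertyRecord` classes, the "most frequent overnight location is home" heuristic, and the rounding to about 100 meters are all assumptions made up for this sketch.

```java
import java.util.*;

class LinkageSketch {
    // Hypothetical types invented for this illustration.
    static class Ping {
        final String pseudonym; final double lat; final double lon; final int hour;
        Ping(String pseudonym, double lat, double lon, int hour) {
            this.pseudonym = pseudonym; this.lat = lat; this.lon = lon; this.hour = hour;
        }
    }

    static class PropertyRecord {
        final String owner; final double lat; final double lon;
        PropertyRecord(String owner, double lat, double lon) {
            this.owner = owner; this.lat = lat; this.lon = lon;
        }
    }

    // Round coordinates to 3 decimal places (about 100 m of latitude) to form a grid cell key.
    static String cell(double lat, double lon) {
        return Math.round(lat * 1000) + ":" + Math.round(lon * 1000);
    }

    // Assumption: a pseudonym's most frequent overnight cell is its home.
    // If a public record sits in that cell, the pseudonym is linked to a name.
    static Map<String, String> reidentify(List<Ping> pings, List<PropertyRecord> records) {
        Map<String, String> ownerByCell = new HashMap<>();
        for (PropertyRecord r : records) {
            ownerByCell.put(cell(r.lat, r.lon), r.owner);
        }

        // Count overnight pings (10 pm to 6 am) per pseudonym and cell.
        Map<String, Map<String, Integer>> nightCounts = new HashMap<>();
        for (Ping p : pings) {
            if (p.hour >= 22 || p.hour < 6) {
                nightCounts.computeIfAbsent(p.pseudonym, k -> new HashMap<>())
                           .merge(cell(p.lat, p.lon), 1, Integer::sum);
            }
        }

        Map<String, String> linked = new TreeMap<>();
        for (Map.Entry<String, Map<String, Integer>> e : nightCounts.entrySet()) {
            String homeCell = Collections.max(
                e.getValue().entrySet(), Map.Entry.comparingByValue()).getKey();
            String owner = ownerByCell.get(homeCell);
            if (owner != null) {
                linked.put(e.getKey(), owner);
            }
        }
        return linked;
    }

    public static void main(String[] args) {
        List<Ping> pings = List.of(
            new Ping("user-17", 40.71281, -74.00601, 23),
            new Ping("user-17", 40.71279, -74.00598, 2),
            new Ping("user-17", 40.75800, -73.98550, 13));
        List<PropertyRecord> records = List.of(
            new PropertyRecord("Resident of 123 Main St", 40.7128, -74.0060));
        System.out.println(reidentify(pings, records));
    }
}
```

No name appears anywhere in the "anonymized" pings. The identity comes entirely from joining them with the public record.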

Detailed Explanation

Why B is correct: The core ethical issue is that the company claimed data was "anonymized" (implying privacy protection), but it can actually be re-identified by combining it with other data sources. This violates users' reasonable expectation that their personal routines won't be trackable by third parties.

Key privacy principle violated: Users consented to sharing "anonymized" data, not identifiable data. The company's technical failure to truly anonymize the data breaks the trust users placed in it.

Real-world impact:

  • Stalkers could track specific individuals' daily routines
  • Insurance companies could identify people with health issues
  • Employers could track employees outside work hours
  • Targeted advertising becomes surveillance

Common Mistakes

Mistake 1: Answer A (Collecting too much data)

While the amount of data collected could be debated, the question is about what happens AFTER collection. The ethical issue isn't that they collect heart rate or location—it's that they share it in a way that compromises privacy despite claiming anonymization. Many legitimate fitness apps collect similar data responsibly.

Mistake 2: Answer C (Never share any data)

This is too extreme. Sharing truly anonymized, aggregated data for research (like "average steps per day in a city") can be valuable and ethical. The problem here is that the anonymization failed—not that sharing happened at all.
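
As a rough sketch of what an aggregate like "average steps per day in a city" might look like in Java: the method below returns only per-city averages, never individual rows. The `minGroupSize` cutoff is an assumption added for this illustration (a very small group's average can still point to one person); the scenario itself does not describe any such rule.

```java
import java.util.*;

class AggregateSketch {
    // cities[i] and steps[i] describe one user's day. Only city-level averages are returned.
    // Assumption: cities with fewer than minGroupSize users are left out entirely.
    static Map<String, Double> averageStepsByCity(String[] cities, int[] steps, int minGroupSize) {
        Map<String, Integer> count = new TreeMap<>();
        Map<String, Long> total = new TreeMap<>();
        for (int i = 0; i < cities.length; i++) {
            count.merge(cities[i], 1, Integer::sum);
            total.merge(cities[i], (long) steps[i], Long::sum);
        }

        Map<String, Double> averages = new TreeMap<>();
        for (String city : count.keySet()) {
            int n = count.get(city);
            if (n >= minGroupSize) {
                averages.put(city, (double) total.get(city) / n);
            }
        }
        return averages;
    }

    public static void main(String[] args) {
        String[] cities = {"Austin", "Austin", "Austin", "Boise"};
        int[] steps = {8000, 10000, 6000, 12000};
        // Boise has a single user, so its "average" would just be that user's step count.
        System.out.println(averageStepsByCity(cities, steps, 3));
    }
}
```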

Mistake 3: Answer D (Users' responsibility)

This "blame the victim" approach ignores that companies have ethical obligations to protect user data. Users should be able to benefit from technology without sacrificing all privacy. The responsibility lies with developers to implement proper privacy protections.

Mistake 4: Answer E (Terms of service = no problem)

Following terms of service doesn't automatically make something ethical. If the terms claim anonymization but the technical implementation fails, that's still an ethical violation. Ethics and legality are separate—something can be legal but unethical, or vice versa.

Responsible Data Practices

Principles for Ethical Data Collection

As a developer, you should:

  1. Minimize collection: Only collect data you actually need
  2. Informed consent: Users should understand what you collect and why
  3. Genuine anonymization: If you claim data is anonymous, ensure it actually is
  4. Purpose limitation: Only use data for the stated purpose
  5. Security: Protect data from breaches and unauthorized access
  6. User control: Let users access, correct, and delete their data

Real-World Examples

Example 1: AOL Search Data (2006)

AOL released "anonymized" search queries in which usernames were replaced with numeric IDs. Because all of one user's queries shared the same ID, individuals could be identified by combining search terms (unique medical conditions + hometown + age). New York Times reporters identified one person from queries like "landscapers in [city]" + "[specific medical condition]" + "[specific hobby]".

Example 2: Netflix Prize Dataset (2007)

Netflix released "anonymized" movie ratings for a competition. Researchers re-identified users by matching the ratings to public IMDb reviews, revealing political views and other sensitive information.

Example 3: Strava Heat Map (2018)

Strava published an aggregate heat map of users' exercise routes. It inadvertently revealed locations of secret military bases because soldiers' jogging patterns around bases were visible, even though individual names weren't shown.

Technical Solutions

Better anonymization techniques:

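  • Differential privacy: Add calibrated random noise to released results so that any one person's data has only a small, bounded effect on what is shared
  • k-anonymity: Ensure each record is indistinguishable from at least k-1 other records on its quasi-identifiers (attributes such as ZIP code, age, or gender that could be linked to outside data)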
  • Data aggregation: Only share summary statistics, never individual records
  • Time/location fuzzing: Reduce precision (e.g., "within 1 mile" instead of exact coordinates)
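
Three of the techniques above can be sketched in a few lines of Java. These are simplified illustrations under stated assumptions, not production-grade privacy code: the class and method names are made up for this example, the k-anonymity check assumes each record's quasi-identifiers have already been joined into one string key, and the noise method is a textbook Laplace mechanism for a simple count (where one person changes the count by at most 1).

```java
import java.util.*;

class PrivacyTechniquesSketch {
    // Location fuzzing: keep only `decimals` decimal places of a coordinate.
    // Two decimal places of latitude is roughly one kilometer.
    static double fuzz(double coordinate, int decimals) {
        double scale = Math.pow(10, decimals);
        return Math.round(coordinate * scale) / scale;
    }

    // k-anonymity check: every quasi-identifier combination must appear at least k times.
    // Assumption: each string is one record's quasi-identifiers, e.g. "787**|30-39".
    static boolean isKAnonymous(List<String> quasiIdentifierKeys, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String key : quasiIdentifierKeys) {
            counts.merge(key, 1, Integer::sum);
        }
        for (int c : counts.values()) {
            if (c < k) {
                return false;
            }
        }
        return true;
    }

    // Differential privacy (Laplace mechanism) for a count query.
    // The noise scale is 1 / epsilon; a smaller epsilon means more noise and more privacy.
    static double noisyCount(int trueCount, double epsilon, Random rng) {
        double u;
        do {
            u = rng.nextDouble() - 0.5;   // uniform in [-0.5, 0.5)
        } while (u == -0.5);              // avoid log(0) below
        double noise = -(1.0 / epsilon) * Math.signum(u) * Math.log(1 - 2 * Math.abs(u));
        return trueCount + noise;
    }

    public static void main(String[] args) {
        System.out.println(fuzz(40.712776, 2));
        System.out.println(isKAnonymous(
            List.of("787**|30-39", "787**|30-39", "837**|20-29"), 2));
        System.out.println(noisyCount(100, 1.0, new Random()));
    }
}
```

Each technique trades some accuracy for privacy: coarser coordinates, broader groups, or noisier counts are less useful to researchers but harder to link back to one person.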

Related Topics

  • Section 4.2: Introduction to Using Data Sets (working with data responsibly)
  • Section 5.E: Legal and ethical concerns (broader computing ethics)
  • Computing Innovation Impact (Create Performance Task requirement)

Difficulty: Medium • Time: 3-4 minutes • AP Skill: 5.E - Explain how computing systems might have unintended consequences
