Topic 2.2: Data Compression | AP CSP Big Idea 2 | APCSExamPrep.com
Data Compression
After this lesson, you will be able to:
- Distinguish between lossless and lossy compression and explain how each works
- Explain why lossy compression typically achieves greater size reduction than lossless
- Select the appropriate compression type for a given scenario based on priorities
- Explain why fewer bits does not necessarily mean less information
Spotify's 320 kbps setting streams Ogg Vorbis audio, which is lossy compression. Apple Music offers lossless ALAC at resolutions up to 24-bit/192 kHz, a source signal of 9,216 kbps before compression. The gap in data rate can approach 30x. Most people can't hear the difference on normal headphones. Some audiophiles insist they can. This is the data compression trade-off made real: how much quality loss is acceptable in exchange for how much size reduction? The AP exam asks you to make this judgment for specific contexts.
Why Compress Data?
Data takes up space: on disk, in memory, and on the wire when transmitted. Raw, uncompressed 4K video at 30 frames per second and 24 bits per pixel runs roughly 2.7 TB per hour. An uncompressed CD-quality audio track of a three-minute song is about 30 MB. These sizes are impractical for storage and transmission. Data compression solves this.
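The three-minute song figure can be checked with a little arithmetic. The sketch below assumes CD-quality audio (44,100 samples per second, 16 bits per sample, 2 channels) and compares it with a 320 kbps stream; the function names are made up for this illustration.

```python
def raw_audio_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Size in bytes of uncompressed audio (assumed CD-quality defaults)."""
    return seconds * sample_rate * bits_per_sample * channels // 8


def stream_bytes(seconds, kilobits_per_second):
    """Size in bytes of audio delivered at a fixed bit rate."""
    return seconds * kilobits_per_second * 1000 // 8


song_seconds = 3 * 60
raw = raw_audio_bytes(song_seconds)
lossy = stream_bytes(song_seconds, 320)

print(f"uncompressed: {raw / 1_000_000:.1f} MB")    # 31.8 MB
print(f"320 kbps:     {lossy / 1_000_000:.1f} MB")  # 7.2 MB
print(f"ratio:        {raw / lossy:.2f}x")          # 4.41x
```

Under these assumptions the uncompressed track is a little over four times the size of the 320 kbps version.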
Data compression reduces the number of bits used to represent data. The key question: can you reduce bits without losing information? The answer depends on the type of compression.
The AP exam almost always presents compression as a scenario question: “A hospital stores patient X-rays. Which compression is most appropriate?” Know the rule: when quality or exact reconstruction is critical → lossless. When minimizing size is the priority and some quality loss is acceptable → lossy.
Lossless Compression
Lossless compression reduces the number of bits stored or transmitted while guaranteeing complete reconstruction of the original data. Not a single bit of information is permanently lost.
How does it work? By finding and exploiting patterns and redundancy in the data. A simple example: instead of storing “AAAAAABBBBCCCC” as 14 characters, a lossless algorithm might store it as “6A4B4C” — 6 characters. The original can be perfectly reconstructed from the compressed version.
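The “6A4B4C” idea is called run-length encoding. Here is a minimal sketch of it, assuming the input text contains no digits (otherwise the counts would be ambiguous). The function names `rle_encode` and `rle_decode` are chosen for this example only; real formats such as ZIP and PNG use more sophisticated algorithms.

```python
from itertools import groupby


def rle_encode(text):
    """Replace each run of a repeated character with <count><character>."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


def rle_decode(encoded):
    """Rebuild the original text from <count><character> pairs."""
    pieces = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch                      # counts may have several digits
        else:
            pieces.append(ch * int(count))   # expand the run
            count = ""
    return "".join(pieces)


original = "AAAAAABBBBCCCC"
encoded = rle_encode(original)
print(encoded)                          # 6A4B4C
print(rle_decode(encoded) == original)  # True
```

Decoding gives back exactly the original string, which is what makes this lossless. It only helps when the data actually has runs: `rle_encode("ABC")` produces `"1A1B1C"`, which is longer than the input.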
Real-world lossless formats: ZIP files, PNG images, FLAC audio, GIF images.
When to choose lossless:
- Medical images (X-rays, MRIs) — pixel-perfect accuracy required for diagnosis
- Legal and financial documents — exact reconstruction is legally required
- Software and code files — a single altered bit can break a program
- Scientific data — measurement precision must be preserved
Lossy Compression
Lossy compression can significantly reduce the number of bits stored or transmitted — but only allows reconstruction of an approximation of the original data. Some information is permanently removed.
Lossy algorithms are selective about what they remove: they discard data that humans are less likely to notice. MP3 audio drops sounds that listeners rarely perceive, such as quiet tones masked by louder ones and frequencies near or beyond the upper limit of hearing. JPEG images reduce fine color detail that the eye struggles to perceive at normal viewing distances. The result can be dramatically smaller files with minimal perceived quality loss.
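To see why discarded detail cannot be recovered, here is a toy sketch that rounds each value to the nearest multiple of 10. This is not how MP3 or JPEG actually work; it is a simplified stand-in for the general idea of throwing away fine detail, and the sample values and function names are invented for the example.

```python
def quantize(samples, step):
    """Lossy step: keep only which multiple of `step` each sample is nearest to."""
    return [round(s / step) for s in samples]


def reconstruct(codes, step):
    """Best available reconstruction: the multiple of `step` each code stands for."""
    return [c * step for c in codes]


original = [101, 98, 103, 150, 149, 152]
codes = quantize(original, step=10)
approx = reconstruct(codes, step=10)

print(codes)               # [10, 10, 10, 15, 15, 15]
print(approx)              # [100, 100, 100, 150, 150, 150]
print(approx == original)  # False
```

The reconstructed values are close to the originals but not identical, and nothing in `codes` says whether the first sample was 101, 98, or 103. Notice also that the codes are now full of repeats, which is the kind of redundancy a lossless step can then shrink further.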
Real-world lossy formats: JPEG images, MP3 audio, MP4/H.264 video, AAC audio.
When to choose lossy:
- Music streaming — slightly reduced audio quality is acceptable at much smaller file sizes
- Video streaming — some visual quality traded for smooth playback over limited bandwidth
- Social media photos — storage cost reduction outweighs minor quality loss
- Video conferencing — minimizing transmission time is more important than perfect quality
Students sometimes think lossy compression just means “lower quality.” The precise AP definition is: lossy compression allows only reconstruction of an approximation of the original data. The original data cannot be recovered once lossy compression is applied. This is why converting an MP3 to an uncompressed WAV file produces a larger file but does not bring back the original recording: the WAV simply stores the MP3's approximation.
The Key Trade-off
Lossy compression can reduce file size more than lossless, but at the cost of permanent data loss. The right choice depends entirely on the use case. Neither is universally better — the AP exam tests your ability to match the compression type to the context.
Key Vocabulary
| Term | AP Definition | Plain English |
|---|---|---|
| Data compression | The process of reducing the number of bits needed to represent data | Making files smaller |
| Lossless compression | Compression that guarantees complete reconstruction of the original data | Smaller file, no information lost — ZIP, PNG |
| Lossy compression | Compression that allows only reconstruction of an approximation of the original data | Smaller file, some data permanently removed — JPEG, MP3 |
| Redundancy | Repeated or predictable patterns in data that compression algorithms exploit | The thing lossless compression finds and removes |
| Compression ratio | The ratio of the original file size to the compressed file size | How much smaller the file gets |
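
Compression ratio and lossless reconstruction can both be observed with Python's standard `zlib` module, which implements DEFLATE, the algorithm commonly used inside ZIP files. The sample data below is an invented, highly repetitive byte string, so the ratio it produces is far better than typical real-world files would get.

```python
import zlib

# Highly redundant sample data (an assumption chosen to make the effect obvious).
original = b"AAAAAABBBBCCCC" * 1000

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

ratio = len(original) / len(compressed)

print(f"original size:   {len(original)} bytes")
print(f"compressed size: {len(compressed)} bytes")
print(f"compression ratio: {ratio:.1f} to 1")
print(f"perfectly restored: {restored == original}")
```

The compressed version uses far fewer bits, yet decompressing it returns every byte of the original. Fewer bits did not mean less information here, only less redundancy.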
Big Idea 2 data concepts appear in the Create Task when you describe how your program processes or uses data. Understanding how to extract information and work with datasets will strengthen your written response.
Practice question: Which of the following statements are true?
I. Fewer bits always means less information.
II. Lossless compression can guarantee complete reconstruction of the original data.
III. Lossy compression typically achieves greater size reduction than lossless compression.