1
Spam
  • Edward W. Felten
  • Dept. of Computer Science
  • Princeton University
2
Scope of the Problem
  • About 60% of all email is spam
    • Much is fraudulent
    • Much is inappropriate for kids
  • 5% of U.S. net users have bought something from a spammer
    • Billions of dollars of sales
    • Spamming pays
  • This talk focuses on email, but spam affects other communication technologies too
3
An Email Message
4
Email Transport
5
What is Spam?
  • (1) Email that the recipient doesn’t want.
  • Problems:
    • Only defined after the fact
    • A ban raises First Amendment issues


  • (2) Unsolicited email.
  • Problem: lots of unsolicited email is desired


6
What is Spam?
  • (3) Unsolicited commercial email.
  • But what exactly does “unsolicited” mean?
7
Free Speech Issues
  • Law sometimes allows speech, even when the listener doesn’t want to hear it.


  • Commercial speech less protected than political speech.


  • At the very least, let’s not block a message if both parties want it to get through.
8
Working Definition of Spam
  • Any commercial, non-political email is spam, unless
  • (a) the recipient has consented to receive it,
  • (b) the sender and receiver have an ongoing business relationship, or
  • (c) the message relates to an ongoing commercial transaction between the sender and receiver.


  • Note: just looking at a message won’t tell you whether or not it’s spam.
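A minimal sketch of the working definition as a predicate. The `MessageContext` type and its field names are hypothetical, introduced only for illustration. Every input is a fact about the sender and recipient rather than about the message text, which is the point of the note above.

```python
from dataclasses import dataclass


@dataclass
class MessageContext:
    # Hypothetical fields: none of them can be read off the message itself.
    commercial: bool
    political: bool
    recipient_consented: bool          # exception (a)
    ongoing_relationship: bool         # exception (b)
    relates_to_open_transaction: bool  # exception (c)


def is_spam(ctx: MessageContext) -> bool:
    """Apply the working definition: commercial, non-political, no exception."""
    if not ctx.commercial or ctx.political:
        return False
    exempt = (
        ctx.recipient_consented
        or ctx.ongoing_relationship
        or ctx.relates_to_open_transaction
    )
    return not exempt
```

The same message text can be spam for one recipient and legitimate for another, depending only on the three exception fields.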
9
Anti-spam Measures
  • Enforce laws against wire fraud, false medical claims, etc.
  • Require accurate labeling of origin, which allows filtering by origin
    • Big spammer just sentenced to nine years in Virginia state prison for mislabeling
10
Private Lawsuits by ISPs
  • ISP sends spammer cease-and-desist letter
  • Spammer keeps sending spam
  • ISP files suit
    • Claiming cyber-trespass
    • Seeking money damages
    • Seeking injunction against further spamming
  • Some success so far, but mostly useful as deterrent
11
Blacklists
  • Make list of known email addresses, or known IP addresses, of spammers
  • Discard email from those addresses
  • Problems
    • Spammers try to mislead about message origin
    • Spammers move around a lot
    • Innocent users sometimes end up sharing addresses with spammers
    • False accusations
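A rough sketch of a blacklist check, assuming the list holds both sender addresses and IP networks. The entries below are made-up placeholders from reserved documentation ranges, not real spammers.

```python
import ipaddress

# Hypothetical blacklist entries, for illustration only.
BLOCKED_SENDERS = {"bulk@spammer.example"}
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_blacklisted(sender: str, source_ip: str) -> bool:
    """Return True if the claimed sender or the connecting IP is on the list."""
    if sender.strip().lower() in BLOCKED_SENDERS:
        return True
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in BLOCKED_NETWORKS)
```

Note that `is_blacklisted("alice@example.org", "203.0.113.9")` is true even if Alice is innocent: she simply shares an address range with a listed spammer, which is one of the problems above. The sender check is also only as trustworthy as the claimed sender address.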
12
Whitelists
  • Make list of people/places you want to get email from
  • Impractical to accept email only from these people
  • But still useful
    • Make other anti-spam measures more stringent
    • Make an exception for people on the whitelist
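One way to read this combination, sketched under assumptions: some other filter produces a spam score between 0 and 1, the cutoff is set strictly, and whitelisted senders bypass it. The names `WHITELIST`, `STRICT_THRESHOLD`, and `accept` are hypothetical.

```python
# Hypothetical whitelist and cutoff, for illustration only.
WHITELIST = {"colleague@example.edu"}
STRICT_THRESHOLD = 0.3  # stricter than one would dare use without a whitelist


def accept(sender: str, spam_score: float) -> bool:
    """Accept whitelisted senders outright; hold everyone else to a strict cutoff."""
    if sender.strip().lower() in WHITELIST:
        return True
    return spam_score < STRICT_THRESHOLD
```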
13
Payment
  • Try to raise cost of sending email
    • Ideally, raise more for spammers than for normal senders
  • Pay in the form of:
    • Money
    • Wasted computational resources
    • Human attention
14
Problems with payment
  • If using real money, involves the banking system
  • If paying in resources, waste of resources
    • Resources are cheap for spammers anyway
  • Deters some legitimate email – especially big (legitimate) mailing lists
15
Sender authentication
  • Various schemes
  • Make sure that mail comes from the right place, given the (claimed) sender
    • e.g. my mail comes from a Princeton IP address
  • Works okay, but
    • Complicated in presence of forwarding etc.
    • Doesn’t address spambots on stolen machines
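A simplified sketch of the check, assuming each domain publishes the networks its mail may come from. Real schemes such as SPF publish this through DNS; here a hard-coded table (`AUTHORIZED`, with a placeholder domain and address range) stands in for that lookup.

```python
import ipaddress

# Hypothetical table: domain -> networks allowed to send mail for it.
AUTHORIZED = {
    "university.example": [ipaddress.ip_network("192.0.2.0/24")],
}


def sender_is_plausible(claimed_sender: str, source_ip: str) -> bool:
    """Return True if the connecting IP is authorized for the claimed sender's domain."""
    domain = claimed_sender.rpartition("@")[2].lower()
    addr = ipaddress.ip_address(source_ip)
    # Domains with no published networks cannot be verified, so they fail here.
    return any(addr in net for net in AUTHORIZED.get(domain, []))
```

A message forwarded through a third-party server arrives from that server's IP and would fail this check, and a spambot on a stolen machine inside the authorized range would pass it, matching the two caveats above.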
16
Content-Based Filtering
  • Classify incoming messages based on contents
    • Apply fixed rules (e.g. SpamAssassin)
    • Machine learning, based on user labeling
      • Word-based Bayesian learning
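A toy sketch of word-based Bayesian learning from user labels: a naive Bayes classifier with add-one smoothing. The class name and whitespace tokenization are simplifying assumptions; production filters tokenize and weight far more carefully.

```python
import math
from collections import Counter


class WordBayes:
    """Naive Bayes over words, trained on messages the user labels 'spam' or 'ham'."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = Counter()

    def train(self, text: str, label: str) -> None:
        self.messages[label] += 1
        self.words[label].update(text.lower().split())

    def spam_probability(self, text: str) -> float:
        vocab = set(self.words["spam"]) | set(self.words["ham"])
        total = sum(self.messages.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed prior, then one smoothed likelihood term per word.
            score = math.log((self.messages[label] + 1) / (total + 2))
            denom = sum(self.words[label].values()) + len(vocab) + 1
            for word in text.lower().split():
                score += math.log((self.words[label][word] + 1) / denom)
            log_score[label] = score
        # Convert the two log scores into P(spam | text).
        return 1 / (1 + math.exp(log_score["ham"] - log_score["spam"]))
```

Usage: call `train` on each labeled message, then treat `spam_probability(new_message)` above some cutoff as suspected spam.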
17
Filtering Issues
  • Fairly accurate, but not foolproof
    • Trade off false positives vs. false negatives
    • Still need to look at suspected-spam messages
  • Spammers using countermeasures
    • “word salad”
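The false positive vs. false negative trade-off comes down to where the cutoff on the filter's score is placed. The sketch below counts both kinds of error at a given threshold; the scores in `SAMPLE` are made up for illustration.

```python
def error_counts(scored, threshold):
    """scored: list of (spam_score, is_actually_spam) pairs.

    Returns (false_positives, false_negatives) when messages scoring
    at or above the threshold are treated as spam.
    """
    false_pos = sum(1 for score, spam in scored if score >= threshold and not spam)
    false_neg = sum(1 for score, spam in scored if score < threshold and spam)
    return false_pos, false_neg


# Made-up scores: three spam messages and two legitimate ones.
SAMPLE = [(0.95, True), (0.80, True), (0.60, False), (0.40, True), (0.10, False)]

for threshold in (0.3, 0.5, 0.9):
    print(threshold, error_counts(SAMPLE, threshold))
```

On this sample, raising the threshold to 0.9 removes the false positive but lets two spam messages through, while lowering it to 0.3 catches all the spam but still flags one legitimate message.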
18
Case Study: Do-Not-Email List
  • In CAN-SPAM Act, Congress asked FTC to study a National Do-Not-Email (DNE) list
    • Like Do-Not-Call list for telemarketing
  • Congress asked:
    • Should we have a DNE list?
    • If we have one, how should it work?
  • FTC hired experts (including me) to give technical advice.
19
DNE List: Law
  • Users can put their email addresses on the DNE list.
  • Domain owner can put whole domain (e.g. washington.edu) on DNE list.


  • Illegal to send spam to anybody on the list.
20
DNE List: Approaches
  • Give spammers the list
    • Very bad idea: “whom-to-spam” list
    • Could seed each spammer’s copy of the list with “telltale” addresses? (Interesting CS theory problem.)
  • Spammer submits their mailing list to DNE service; service returns “scrubbed” list
    • Spammer still learns about some valid addresses
    • Might be able to limit this by limiting access, charging for access, etc.
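A sketch of the scrubbing approach, assuming the DNE service holds a set of individual addresses plus a set of whole domains. All names and entries are hypothetical.

```python
# Hypothetical DNE registry contents, for illustration only.
DNE_ADDRESSES = {"alice@example.org"}
DNE_DOMAINS = {"school.example"}


def on_dne_list(address: str) -> bool:
    """True if the address itself, or its whole domain, is registered."""
    address = address.strip().lower()
    domain = address.rpartition("@")[2]
    return address in DNE_ADDRESSES or domain in DNE_DOMAINS


def scrub(mailing_list):
    """Return the submitted list with every DNE-registered address removed."""
    return [a for a in mailing_list if not on_dne_list(a)]


submitted = ["alice@example.org", "kid@school.example", "bob@example.net"]
scrubbed = scrub(submitted)
# The leak: comparing the two lists shows which submitted addresses are registered.
leaked = set(submitted) - set(scrubbed)
```

Here `leaked` contains the two registered addresses, so a dishonest submitter can probe the list with guesses. This is why limiting or charging for access matters.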


21
DNE List: Approaches
  • Spam-forwarding service
    • Spammer must direct all spam through a DNE service
    • Service forwards email to addresses not on DNE list
    • Silently drops if address is on list
    • Doesn’t leak information about list
    • Irony: as an anti-spam measure, the government is forwarding spam
  • All approaches: risk that list will leak
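A sketch of the forwarding service's core step, under the assumption that delivery is handled by a caller-supplied function. The names `forward_spam`, `dne`, and `deliver` are hypothetical, and whole-domain entries are omitted for brevity.

```python
def forward_spam(recipient: str, message: str, dne: set, deliver) -> None:
    """Forward a message unless the recipient is on the DNE list.

    Returns None in both cases, so the submitter cannot tell
    whether the message was delivered or silently dropped.
    """
    if recipient.strip().lower() in dne:
        return  # silently drop
    deliver(recipient, message)
```

The design choice that matters is the identical return value on both paths: any status code, bounce, or timing difference visible to the sender would leak list membership.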
22
Outlaw Spam
  • Biggest problem for DNE List is outlaw spammers
    • Ignore the law
    • Send spam from stolen machines
    • Very hard to catch them
23
Spam: Bottom Line
  • Spam will be with us, as long as people buy stuff from spammers.
  • People will keep buying the kinds of products that spammers sell.
  • At best, we’ll fight to a stalemate.