1. This presentation uses some slides from the lectures of Associate Prof.
Tran Quang Anh (FIT, HANU)
Anti-spam
Group No 2C12
4. 1. Background knowledge
PRIMARY FIELDS
1. From
2. To
3. Subject
4. Date
5. Message-ID
SECONDARY FIELDS
6. Bcc (Blind Carbon Copy)
7. Cc (Carbon Copy)
8. Content-Type
9. Importance
10. In-Reply-To
11. Precedence
12. Received
13. Return-Path
14. Sender
15. X-Originating-IP
MIME FIELDS
16. MIME format
17. Content encoding
18. Content type
19. Content-Disposition
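To make the header fields concrete, here is a minimal sketch that builds a message with Python's standard `email` package and prints the resulting headers. The addresses, subject, and body are hypothetical placeholders, not taken from the slides.

```python
from email.message import EmailMessage
from email.utils import formatdate, make_msgid


def build_message() -> EmailMessage:
    """Build a small message that carries several of the listed fields."""
    msg = EmailMessage()
    # Primary fields (all values are hypothetical placeholders).
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.org"
    msg["Subject"] = "Meeting notes"
    msg["Date"] = formatdate(usegmt=True)
    msg["Message-ID"] = make_msgid(domain="example.com")
    # A secondary field.
    msg["Cc"] = "carol@example.org"
    # set_content() adds the MIME fields (MIME-Version, Content-Type, ...).
    msg.set_content("See you at 10:00.")
    return msg


msg = build_message()
for name, value in msg.items():
    print(f"{name}: {value}")
```

Running it shows that the MIME fields are added by the library once a body is set, while the primary and secondary fields are set explicitly.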
5. 1. Background knowledge
1.2 Email sending steps
If a Gmail server wants to send an email to
manhnv@hanu.edu.vn, it will:
Step 1: Look up the MX record of hanu.edu.vn to find
its mail server, then resolve that server's IP address
Step 2: Connect to port 25 at that IP
address
Step 3: Follow the SMTP protocol to deliver the message
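The three steps can be sketched in Python as follows. This is only an illustration: the MX lookup is not part of the standard library, so the mail server host name is passed in as a parameter (an assumption), and the helper names `recipient_domain`, `smtp_commands`, and `deliver` are hypothetical.

```python
import smtplib


def recipient_domain(address: str) -> str:
    """Step 1 starts here: the domain whose MX record must be looked up."""
    return address.rsplit("@", 1)[1].lower()


def smtp_commands(client_host: str, sender: str, recipient: str, message: str) -> list:
    """Step 3: the basic SMTP command sequence a sending server issues."""
    return [
        f"EHLO {client_host}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
        message + "\r\n.",  # a line with a single dot ends the message
        "QUIT",
    ]


def deliver(mx_host: str, sender: str, recipient: str, message: str) -> None:
    """Steps 2 and 3 with smtplib. mx_host is assumed to come from an MX lookup."""
    with smtplib.SMTP(mx_host, 25, timeout=10) as smtp:  # Step 2: port 25
        smtp.sendmail(sender, [recipient], message)      # Step 3: SMTP dialogue


domain = recipient_domain("manhnv@hanu.edu.vn")
commands = smtp_commands("mail.example.com", "alice@example.com",
                         "manhnv@hanu.edu.vn", "Subject: Hi\r\n\r\nHello")
```

`deliver` is defined but not called here, since it would open a real network connection.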
6. 2. Email Spam
2.1 What is email spam?
UBE (Unsolicited Bulk Email)
The same content sent to a large number of recipients
Purposes: advertising,
phishing, spreading malware, etc.
7. 2. Email Spam
2.2 Why does email spam exist?
o Technical considerations
  o The sender can remain anonymous
  o Internet access (email, ADSL) is widespread
o Economic considerations
  o Sending an email costs very little
  o There is demand for advertising
8. 2. Email Spam
2.3 Problems caused by
email spam:
o Denial of service (mailboxes fill up,
legitimate mail gets deleted by mistake)
13. 3. Anti-spam
Content-based method
o Analyzes the frequency of top keywords in an email (e.g. SpamAssassin)
o Effective algorithm: Bayesian filtering
o Example: giá (price), cơ hội (opportunity), siêu (super), miễn phí (free) (Vietnamese keywords); free, like,
subscribe, Facebook, hot deal, sale off (English keywords)
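As a rough illustration of Bayesian filtering, the sketch below trains a tiny naive Bayes model on keyword counts and scores new text. It is not how SpamAssassin is implemented; the training messages are hypothetical, and the function names `train` and `spam_score` are made up for this example.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, is_spam) pairs. Returns word counts and totals."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals


def spam_score(text, counts, totals):
    """Log-odds that text is spam; a positive value means 'more likely spam'."""
    vocab = set(counts[True]) | set(counts[False])
    n_spam = sum(counts[True].values())
    n_ham = sum(counts[False].values())
    log_odds = math.log(totals[True] / totals[False])  # prior odds
    for word in text.lower().split():
        # Laplace (add-one) smoothing avoids zero probabilities.
        p_spam = (counts[True][word] + 1) / (n_spam + len(vocab))
        p_ham = (counts[False][word] + 1) / (n_ham + len(vocab))
        log_odds += math.log(p_spam / p_ham)
    return log_odds


# Hypothetical training data, using some keywords from the slide.
TRAINING = [
    ("free hot deal sale off", True),
    ("miễn phí siêu giá", True),
    ("meeting agenda for monday", False),
    ("lecture slides attached", False),
]
counts, totals = train(TRAINING)
```

For instance, `spam_score("free hot deal", counts, totals)` comes out positive and `spam_score("lecture slides", counts, totals)` negative with this toy data. A real filter would need far more training mail and better tokenization.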
14. 3. Anti-spam
Header-based method
o Examines the headers of email messages to detect spam
o Approaches:
  o Whitelist: a database of email addresses of known legitimate senders
  o Blacklist: a database of IP addresses of known spammers
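A whitelist and blacklist check might look like the sketch below, which reads the `From` and `X-Originating-IP` headers with Python's standard `email` package. The list contents and the function name `classify_by_header` are hypothetical; real deployments typically query shared blacklist services rather than a local set.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical lists for illustration only.
WHITELIST = {"teacher@hanu.edu.vn"}
BLACKLIST_IPS = {"203.0.113.7"}  # address from a documentation-only range


def classify_by_header(raw_message: str) -> str:
    """Return 'accept', 'reject', or 'unknown' based on headers alone."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    # The header value is often written in brackets, e.g. [203.0.113.7].
    origin_ip = msg.get("X-Originating-IP", "").strip("[] ")
    if sender in WHITELIST:
        return "accept"
    if origin_ip in BLACKLIST_IPS:
        return "reject"
    return "unknown"


SPAM_SAMPLE = (
    "From: Deals <promo@example.net>\n"
    "X-Originating-IP: [203.0.113.7]\n"
    "Subject: Hot deal\n"
    "\n"
    "Buy now\n"
)
print(classify_by_header(SPAM_SAMPLE))
```

Checking the whitelist first is a design choice in this sketch: a known legitimate sender is accepted even if the originating IP is listed.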
15. 3. Anti-spam
Source: http://www.mcafee.com/threat-intelligence/ip/spam-senders.aspx
17. 3. Anti-spam
Sender authentication
o Spammers can forge their identity (they can claim to be anyone).
o Sender authentication (SA) addresses this problem.
o How does SA work?
1. The domain owner adds a "marker" record to the DNS server, which lists the email
servers designated to send mail for that domain.
2. A receiving server verifies whether a received email message actually came from one of these email
servers.
o Examples: Sender Policy Framework (AOL, HANU), Sender ID (Microsoft),
DomainKeys (Yahoo)
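The verification step can be illustrated with a deliberately simplified SPF check. The sketch below assumes the SPF record has already been fetched from DNS and handles only the `ip4`, `ip6`, and `all` mechanisms; real SPF also has `include`, `a`, `mx`, and other terms that are ignored here. The function name `spf_allows` and the sample record are hypothetical.

```python
import ipaddress


def spf_allows(record: str, sender_ip: str):
    """Return True/False if the record decides, or None if it does not apply."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return None  # not an SPF record
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # default qualifier means "pass"
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return qualifier == "+"
        elif name == "all":
            return qualifier == "+"
        # Other mechanisms (include, a, mx, ...) are skipped in this sketch.
    return None


# Hypothetical record: only 192.0.2.0/24 may send mail for the domain.
RECORD = "v=spf1 ip4:192.0.2.0/24 -all"
print(spf_allows(RECORD, "192.0.2.10"))
print(spf_allows(RECORD, "198.51.100.5"))
```

Here the first address falls inside the designated range and is allowed, while the second falls through to `-all` and is refused.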
18. 3. Anti-spam
Social network
o PageRank (Google)
o Graph theory:
• Consider an email network whose nodes
are users and whose links are email
exchanges between them
• Coefficient: low (do not exchange email
frequently), high
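The graph view can be sketched as follows. The slide does not say which coefficient is meant; this example assumes the local clustering coefficient (the fraction of a user's contacts that also exchange email with each other), and that choice is an assumption. The user names and transactions are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(transactions):
    """Nodes are users; an undirected link joins two users who exchanged email."""
    graph = defaultdict(set)
    for a, b in transactions:
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clustering_coefficient(graph, user):
    """Assumed metric: share of the user's contact pairs that are linked."""
    neighbours = graph.get(user, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


# Hypothetical data: a, b, c write to each other; x mails strangers p, q, r.
TRANSACTIONS = [("a", "b"), ("b", "c"), ("a", "c"),
                ("x", "p"), ("x", "q"), ("x", "r")]
graph = build_graph(TRANSACTIONS)
print(clustering_coefficient(graph, "a"))
print(clustering_coefficient(graph, "x"))
```

Under this assumption, a user inside a group of mutual correspondents gets a high value, and an account that mails unrelated recipients gets a low one.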
19. 4. Gmail anti-spam
4.1 Gmail anti-spam technique
o Gmail uses multiple techniques:
o SPF (Sender Policy Framework),
o DomainKeys
o DKIM (DomainKeys Identified Mail)
20. 4. Gmail anti-spam
4.2 Gmail header format
o How to read a header? (Demonstration with web browser)
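Besides reading a header in the browser, the same fields can be pulled out programmatically. The sketch below parses a raw header with Python's standard `email` package and extracts the `Received` hops and the SPF/DKIM verdicts from `Authentication-Results`. The sample header is hypothetical and much shorter than a real Gmail header; `auth_results` is a made-up helper name.

```python
from email import message_from_string

# Hypothetical, shortened header for illustration.
SAMPLE = (
    "Received: from mail.example.com (mail.example.com [192.0.2.10])\n"
    "        by mx.google.com with ESMTP\n"
    "Authentication-Results: mx.google.com;\n"
    "        spf=pass smtp.mailfrom=alice@example.com;\n"
    "        dkim=pass header.i=@example.com\n"
    "From: Alice <alice@example.com>\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)


def auth_results(raw_message: str) -> dict:
    """Map each authentication method (spf, dkim, ...) to its result."""
    msg = message_from_string(raw_message)
    value = msg.get("Authentication-Results", "")
    results = {}
    # The first ';'-separated part names the checking server; skip it.
    for part in value.split(";")[1:]:
        tokens = part.split()
        if tokens and "=" in tokens[0]:
            method, result = tokens[0].split("=", 1)
            results[method] = result
    return results


msg = message_from_string(SAMPLE)
hops = msg.get_all("Received", [])  # one entry per server the mail passed
print(len(hops), auth_results(SAMPLE))
```

Each `Received` entry is one server the message passed through, so reading them in order traces the delivery path.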