Categories
Admin Consumer Interest Consumer Tech Firewall Home Computing Linux Scams Security Spam Web Site Technologies

Types of Cyberattacks and other terms from the world of cyber security

Intro

It’s convenient to name drop different types of cyber attacks at a party. I often struggle to name more than a few. I will try to maintain a running list of them.

But I find you cannot speak about cybersecurity unless you also have a basic understanding of information technology so I am including some of those terms as well.

As I write this I am painfully aware that you could simply ask ChatGPT to generate a list of all relevant terms in cybersecurity along with their definitions – at least I think you could – and come up with a much better and more complete list. But I refuse to go that route. These are terms I have personally come across so they have special significance for me personally. In other words, this list has been organically grown. For instance I plowed through a report by a major vendor specializing in reviewing other vendor’s offerings and it’s just incredible just how dense with jargon and acronyms each paragragh is: a motherlode of state-of-the-art tech jargon.

AiTM (Adversary in the Middle)
Baitortion

I guess an attack which has a bait such as a plum job offer combined with some kind of extortion? The usage was not 100% clear.

BYOVD (Bring Your Own Vulnerable Driver)
Clickfix infection chain

Upon visiting compromised websites, victims are redirected to domains hosting fake popup windows that instruct them to paste a script into a PowerShell terminal to fix an issue.

Collision attack

I.e., against the MD5 hash algorithm as done in the Blast RADIUS exploit.

Credential Stuffing Attack

I.e., password re-use. Takes advantage of users re-using passwords for different applications. Nearly three of four consumers re-use password this way. Source: F5. Date: 3/2024

Data Wiper
Authentication Bypass

See for instance CVE-2024-0012

Email bombing

A threat actor might flood a victom with spam then offer “assistance” to fix it.

Evasion

Malicious software built to avoid detection by standard security tools.

Password spraying

A type of attack in which the threat actor tries the same password with multiple accounts, until one combination works. 

Port Scan
Host Sweep
Supply Chain attack
Social Engineering
Hacking
Hacktivist

I suppose that would be an activitst who uses hacking to further their agenda.

Living off the land
Data Breach
Keylogger
Darknet
Captcha
Click farms
Jackpotting

This is one of my favorite terms. Imagine crooks implanted malware into an ATM and were able to convince it to dispense all its available cash to them on the spot! something like this actually happened. Scary.

Overlay Attack

Example: When you open a banking app on your phone, malware loads an HTML phishing page that’s designed to look just like that particular app and the malware’s page is overlaid on top.

Payment fraud attack

In a recent example, the victim experienced “multiple fraudulently induced outbound wire transfers to accounts controlled by unknown third parties.”

Skimmer
XSS (Cross site Scripting)
bot
Anti-bot, bot defense
Mitigation
SOC
Selenium (Se) or headless browser
Obfuscation
PII, Personally Identifiable Information
api service
Reverse proxy
Inline
endpoint, e.g., login, checkout
scraping
Layer 7
DDOS
Carpet bombing DDOS attack

Many sources hitting many targets within the same subnet. See:

https://www.a10networks.com/blog/carpet-bombing-attacks-highlight-the-need-for-intelligent-and-automated-ddos-protection/#:~:text=Carpet-bombing%20attacks%20are%20not,entire%20CIDR%20or%20multiple%20ASNs.

SYN flood
DOS
Visibility
Automation
Token
Post
JavaScript
Replay
Browser Fingerprint
OS
Browser
GDPR
PCI DSS
AICPA Trust Services
Grandparent scam

A social engineering attack where scammers target grandparents by pretending to be a grandchild in a bind.

GUI
(JavaScript) Injection
Command Injection
Hotfix
SDK
URL
GET|POST Request
Method
RegEx
Virtual Server
TLS
Clear text
RCA
SD-WAN
PoV
PoC
X-Forwarded-For
Client/server
Threat Intelligence
Carding attack
Source code
CEO Fraud
Phishing
Vishing

(Voice Phishing) A form of cyber-attack where scammers use phone calls to trick individuals into revealing sensitive information or performing certain actions.

Business email compromise (BEC)
Deepfake
Threat Intelligence
Social engineering
Cybercriminal
SIM box
Command and control (C2)
Typo squatting
Voice squatting

A technique similar to typo squatting, where Alexa and Google Home devices can be tricked into opening attacker-owned apps instead of legitimate ones.

North-South
East-West
Exfiltrate
Malware
Infostealer
Obfuscation
Antivirus
Payload
Sandbox
Control flow obfuscation
Buffer overflow
Use after free
Indicators of Compromise
AMSI (Windows Antimalware Scan Interface)
Polymorphic behavior
WebDAV
Protocol handler
Firewall
Security Service Edge (SSE)
Secure Access Service Edge (SASE)
Zero Trust

Zero Trust is a security model that assumes that all users, devices, and applications are inherently untrustworthy and must be verified before being granted access to any resources or data.

Zero Trust Network Access (ZTNA)
ZTA (Zero Trust Architecture)
Zero Trust Edge (ZTE)
Secure Web Gateway (SWG)
Cloud Access Security Broker (CASB)
Remote Browser Isolation (RBI)
Content Disarm and Reconstruction (CDR)
Firewall as a service
Egress address
Data residency
Data Loss Prevention (DLP)
Magic Quadrant
Managed Service Provider (MSP)
0-day or Zero day
User Experience (UX)
Watermark
DevOps
Multitenant
MSSP
Remote Access Trojan (RAT)
SOGU

2024. A remote access trojan.

IoC (Indicators of Compromise)
Object Linking and Embedding
(Powershell) dropper
Backdoor
Data Bouncing

A technique for data exfiltration that uses external, trusted web hosts to carry out DNS resolution for you

TTP (Tactics, Techniques and Procedures)
Infostealer
Shoulder surfing
Ransomware
Pig butchering

This is particularly disturbing to me because there is a human element, a foreign component, crypto currency, probably a type of slave trade, etc. See the Bloomberg Businessweek story about this.

Forensic analysis
Sitting Ducks

An entirely preventable DNS hijack exploit. See https://blogs.infoblox.com/threat-intelligence/who-knew-domain-hijacking-is-so-easy/

Attack vector
Attack surface
Economic espionage
Gap analysis
AAL (Authentication Assurance Level)
IAL (Identity Assurance Level)
CSPM (Cloud Security Posture Management)
Trust level
Network perimeter
DMZ (Demilitarized zone)
Defense in depth
Lateral movement
Access policy
Micro segmentation
Least privilege
Privilege Escalation (PE)
Breach
Intrusion
Insider threat
Cache poisoning

I know it as DNS cache poisoning. If an attacker manages to fill the DNS resolver’s cache with records that have been altered or “poisoned.”

Verify explicitly
Network-based attack
Adaptive response
Telemetry
Analytics
Consuming entity
Behavior analysis
Authentication
Authorization
Real-time
Lifecycle management
Flat network
Inherent trust
Cloud native
Integrity
Confidentiality
Data encryption
EDR (Endpoint Detection and Response)
BSOD (Blue Screen of Death)

Everyone’s favorite Windows error!

BSI (Bundesamt für Sicherheit in der Informationstechnik)

German Federal Office for Information Security (Bundesamt für Sicherheit in der Informationstechnik)

ICS (Industrial Control System)
Reverse shell

A text-based interfaces that allow for remote server control.

Crypto Miner
RCE (Remote Code Execution)
Threat Actor
APT (Advanced Persistent Threat)
Compromise
Vulnerability
Bug
Worm
Remote Access VPN (RAVPN)
XDR (Extended Detection and Response)
SIEM (Security Information and Event Management)
User Entity Behavior Analytics (UEBA)
Path traversal vulnerability

An attacker can leverage path traversal sequences like “../” within a request to a vulnerable endpoint which ultimately allows access to sensitive files like /etc/shadow.

Tombstoning
Post-exploit persistence technique
Volumetric DDoS
MFA bomb

Bombard a user with notifications until they finally accept one.

Use-after-free (UAF)

use-after-free vulnerability occurs when programmers do not manage dynamic memory allocation and deallocation properly in their programs.

Cold boot attack

A cold boot attack focuses on RAM and the fact that it is readable for a short while after a power cycle.

Random Prefix Attack

A type of DNS attack. https://developers.cloudflare.com/dns/dns-firewall/random-prefix-attacks/

Famous named attacks

Agent Tesla
Cloudbleed
Heartbleed
log4j
Morris Worm

Explanations of exploits

https://blog.sonicwall.com/en-us/2024/06/critical-path-traversal-vulnerability-in-check-point-security-gateways-cve-2024-24919-2/

Famous attackers

APT29 (Cozy Bear)

A Russia-nexus threat actor often in the news

Volt Typhoon

2024. A China-nexus threat actor

Cybersecurity Terminology

Blast Radius

One of those annoying terms borrowed from the military that only marketing people like to throw around. It means what you think it might mean.

Blue Team – see Red Team
BSI (The German Federal Office for Information Security)
DLS (Data Leak Sites)

Sites where you can see who has had their data stolen.

Red Team

 In a red team/blue team exercise, the red team is made up of offensive security experts who try to attack an organization’s cybersecurity defenses.

SNORT (Probably stands for something)

An open source rule-matching engine to scan network traffic and serve as an IDS.

IT terminology

2FA (2 Factor Authentication)
802.1x
ACL (Access Control List)
AD (Active Directory)
ADO (Azue DevOps)
AFK (Away From Keyboard)
AGI (Artificial General Intelligence)

AGI is the theory and development of computer systems that can act rationally.

AIOps

Applying AI to IT operations.

ANN (Artificial Neural Network)
Ansible

I would call it an open source orchestrator.

anti-aliasing

When you smooth out color in neighboring pixels.

anycast
Anydesk

A popular remote management software.

apache

A formerly popular open source web server which became bloated with features.

APM (Application Performance Management)
ARIN
ARM

A processor architecture from ARM Corporation, as opposed to, e.g., x86. Raspberry Pis use ARM. I think Androids do as well.

ARP (Address Resolution Protocol)
ASCII

An early attempt at representing alpha-numeric characters in binary. Was very english-focussed.

ASN (Autonomous System Number)

Each AS is assigned an autonomous system number, for use in Border Gateway Protocol routing

ASN.1 (Abstract Syntax Notation One)

A standard interface description language (IDL) for defining data structures that can be serialized and deserialized in a cross-platform way.

ASPA (Autonomous System Provider Authorization)

An add-on to RPKI that allows an ASN to create a record that lists which ASNs can be providers for that ASN. The concepts are “customer” (an ASN) and “providers” (a list of ASNs). This is used to do hop by hop checking of AS paths.

ASR (Aggregation Services Router)

A high-end Interent router offered by Cisco for business customers.

AV (anti-virus)
AWS (Amazon Web Services)
Azure AD
BGP (Border Gateway Protocol)
BIND (Berkeley Internet Name Daemon)

An open source implementation of DNS, found on many flavors of linux.

BOM (Bill of Material)
Boot start

A flag for a driver in Windows that tells it to always start on boot.

bootp

A predecessor protocol to DHCP.

broadcast
Browser
BYOD (Bring Your Own Device)

I.e., when employees are permitted to use their personal smartphone to conduct company business.

BYOL (Bring Your Own License)

F5 permits this approach to licensing one of their cloud appliances.

CA (Certificate Authority)
Callback

A routine designed to be called when someone else’s code is executing. At least that’s how I understand it.

CDR (Call Detail Record)

Metadata for a phone call.

CDN (Content Distribution Network)
CDP (Cisco Discovery Protocol)

This protocol allows devices connected to switch ports to learn what switch and which switch port they are connected to. It is a layer 2 protocol.

CDSS (Cloud Delivered Security Services)

Only used in Palo Alto Networks land.

CGN (Carrier Grade NAT)

The address space 100.64.0.0/10 is handled specially by ISPs for CGN. RFC 6598

CHAP
Chatbot

A computer program that simulates human conversation with and end user.

CI (Configuration Item)
CI/CD

An ITIL term referring to the object upon which changes are made.

CISA (Cybersecurity and Infrastructure Security Agency)
CISO (Chief Information Security Officer)
Cleartext

Format where no encryption has been applied.

CMDB (Configuration Management Database)
CMO (Current Mode of Operations)
CNN (Congruential Neural Network)
Computer Vision

A field of AI that leverages machine learning and neutral networks to enable machines to identify and understand visual information such as images and videos.

Copilot

Microsoft’s AI built into their productivity software. Sorry, no more Clippy.

Courrier

A well-known fixed-width font.

CPE (Customer Premise Equipment)
CSR (Certificate Signing Request)
CUPS (Common Unix Printing Systems)
Customer Edge (CE)
CVE

CVEs, or Common Vulnerabilities and Exposures, are a maintained list of vulnerabilities and exploits in computer systems. These exploits can affect anything, from phones to PCs to servers or software.  Once a vulnerability is made public, it’s given a name in the format CVE–. There are also scoring systems for CVEs, like the CVSS (Common Vulnerability Scoring System), which assigns a score based on a series of categories, such as how easy the vulnerability is to exploit, whether any prior access or authentication is required, as well as the impact the exploit could have.

CVSS (Common Vulnerability Scoring System)

Part of CVE lingo.

DAST (Dynamic Application Security Testing)
Data at rest
Data in motion
Data Plane

A physical security appliance separates data traffic from its management traffic, which transits the managemenbt plane.

Data Remanence

The residual representation of data that remains even after attempting to erase or initialize RAM.

DDI (DNS, DHCP and IP address management)
Debian Linux

A nice distro which I prefer. It is free and open source. Its packages are relatively uptodate.

Deep Learning

A subset of machine learningthat focus on using deep neural networks with multiple layers to model complex patterns in data.

Deepfake

A manipulated video or other digital representation produced by sophisticated machine-learning techniquies that yield seemingly realistic, but fabricated images and sounds.

DHCP (Dynamic Host Control Protocol)
Distributed Cloud

A Gartner term for a SaaS service which runs over multiple cloud environments.

DLL
DLP (Data Loss Prevention)
DNAT (Destination NAT)
DNS (Domain Name System)
DNSSEC (Domain Name System Security Extensions)
DOA (Dead on Arrival)

Usage: That equipment arrived DOA!

Docker
DoH (DNS over HTTPS)
Domain
DRM (Digital Rights Management)
DVI (DeVice Independent file)

See LaTEX entry.

EAP
East-West

Data movement with a data center, I believe, as oppose to North-South.

EBITDA (Earnings Before Interest, Taxes, Depreciation and Amortization)

Hey, an IT person needs to know some business terminology!

Eduroam
Enhanced Factory Reset (EFR)
Entra

From Microsoft. The new name for Azure AD

EU AI Act
EULA (End User Licnese Agreement)
Exact Data Matching (EDM)
FAQ (Frequently Asked Questions)
Fedora Linux

Free and open source linux. New features are introduced here before migrating into Redhat Linux

FEX (Fabric Extender)
FIFO (First in, First Out)
FIPS (Federal Information Processing Standard)

Government security practices. Best to avoid if possible.

FMO (Future Mode of Operation)

As opposed to CMO.

FN (False Negative)
Forensics
Fortran

An ancient procedural programming language popular in the scientific and engineering communities from decades ago.

FOSS (Free and Open Source Software)
FP (False Positive)
freeBSD

A Unix variant which still exists today.

Fritz!Box

A popular home router in Germany.

GA (General Availability)
Gartner Group

A well-regarded research firm which reviews software and SaaS products. They decide which vendors are in the Magic Quadrant.

GBIC

A type of fiber optic transceiver that converts electric signals to optical signals.

GCP (Google Cloud Provider)
GDPR (General Data Protection Regulation)

An EU directive to achieve data privacy.

Generative AI

AI which can create new human-quality content, including text, images, audio or video.

GMP (Good Manufacturing Practice)

FDA lingo that implies their rules are being followed.

GMT – see UTC
GRE
GSLB (global Server Load Balancing)
GUI (Graphical User Interface)
HA (High Availability)
Hallucination

When an LLM perceives patterns that are non-existent creating nonsensical or inaccurate outputs.

Hands and Eyes

When you don’t have physical access to a server, you need someone who does to be this for you.

HIBP (Have I Been Pwned)

https://haveibeenpwned.com/

HIP (Host Information Profile)

Only used in the world of Palo Alto Networks.

HLD (High Level Design)
HPC (High Performance Computing)
HSM (Hardware Security Module)
HTML (HyperText Markup Language)

I started with version 0.9!

Hypervisor
IaaS (Infrastructure as a Service)

E.g., brining up a VM on AWS.

IANA (Internet Assigned Numbers Authority)
ICANN (Internet Corporation for Assigned Numbers and Names)
ICS
IDE (Integrated Development Environment)
IdP (Identity Provider)
IDS (Intrusion Detection System)
ILEC (Incumbent Local Exchange Carrier)
Incident Response Team

Variations include: Computer emergency Response team, Security incident Response Team, etc.

IPAM (IP Address Management)
IPI (IP Intelligence)

At least in the world of F5 this means IP Intelligence, i.e., the reputation of a given IP address.

IPS (Intrusion Prevention System)
IPSEC
IPv6 (Internet Protocol version 6)
iRule

F5 specific lingo for programmable control over load-balancing and routing decisions. Uses the TCL language.

ISC (Internet Software Consortium)

A body which maintains an open source reference implementation for DNS (BIND) and DHCP.

ISO 9001
ISP (Internet Service Provider)
ITIL (IT Infrastructure Library)
JSON (JavaScript Object Notation)

Pronounced JAY-son. A popluar format for data exchange. Sort of human-friendly. Example: {“hi”:”there”,”subnets_ignore”:[“10/8″,”192.168/16”]}

Kanban

Agile way of tracking progress on tasks and brief meetings.

Kernel mode
Kerning

Adjusting the spacing between letters in a proportional font.

KEV (Known Exploited vulnerabilities)

CISA maintains this catalog.

K8s (Kubernetes)

Open source system for automating deployment, scaling, and management of containerized applications

KVM (Kernel Virtual Module)
L2TP (Layer 2 Tunneling Protocol)
L3, L4, L7 (Layer 3, Layer 4, Layer 7)

Refers to ISO 7-layer traffic model.

LACP (Link Access Control Protocol)
LAMP (Linux Apache MySQL and PHP)

An application stack which gives a server needed software to do “interesting things.”

LaTEX

A markup language based on TEX I used to use to write a scientific paper. I think it gets transformed into a DVI, and then into a postscript file.

LEC (Local Exchange Carrier)
Link
Linux
LLD (Low Level Design)
LLD (Low Level Discovery)

A command-line browser for unix systems.

LLDP (Link layer Discovery Protocol)

See also CDP

An open source OS similar to Unix.

LLM (Large Langiuage Model)
lynx

A command-line browser for linux systems.

MAC (Media Access Control) Address

Layer 2 address of a device, e.g., fa-2f-36-b4-8c-f5

Machine Learning

A subfield of AI that deals with creating systems that can learn from data and improve their performance without explicit programming.

Magic Quadrant

Gartner’s term for vendors who exceed in both vision and ability to execute.

Management Plane

See Data Plane.

Mandiant
MD5 (Message Digest 5)
MDM (Mobile Device Management)

Management software used to administer smartphones and tablets.

MFA (Multi Factor Authentication)
MITRE ATT&CK
Modbus protocol
MS-CHAPv2
MSS (Maximum Segment Size)

Set by a TCP option in the beginning of the communcation.

MTTI (Mean Time To Identification)

Probably only Cisco uses this acronym e.g., in their ThousandEyes product.

MTTR (Mean Time To Resolution)
MTU (Maximum transmission unit)

Often 1500 bytes.

multicast
NAESAD (North American Energy Software Assurance Database)
Named pipes

I read it’s a Windows thing. huh. Hardly. It’s been on unix systems long before it was a twinkle in the eye of Bill gates. It acts like a pipe (|) except you give it a name in the filesystem and so it is a special file type. It’s used for inter-process communication.

NAT (Network Address Translation)
NDA (Non-Disclosure Agreement)
.NET
Netflow

Think of it like a call detail record for IP communications. Metadata for a communications stream.

NGFW (Next Generation FireWall)

Palo Alto Networks describes their firewalls this way.

NGINX

A web server that is superioir to apache for most applications.

NLP (Natural Language Processing)

A branch of AI that uses machine learning to enable computers to understand, interpret, and respond to human language.

NOC (Network Operations Center)
North-South

Data movement from/to the data center. Also see East-West.

NSA (National Security Agency)
NTLM

Relies on a three-way handshake between the client and server to authenticate a user.

OAuth bearer token

A security token with the property that any party in possession of the token (a “bearer“) can use the token in any way that any other party in possession of it can.

OCR (Optical Character Recognition)
Offensive Security – see Red Team
OKRs (Objectives and Key Results)

HR lingo.

OpenRoaming
openssl

A common open source implementation of SSL/TLS.

OS (Operating System)
OSFP (Open Shortest Path First)
OSS (Open Source Software)
OT (Operational Technology)
ova

The image filetype for a virtual host.

Overlay

See underlay.

OWASP (Open Worldwide Application Security Project)

An online community that produces freely available articles, methodologies, documentation, tools, and technologies in the fields of IoT, system software and web application security.

PAP
Patch
PaaS (Platform as a Service)
PBR (Policy Based Routing)
PCI (Payment Card International?)

A standard which seeks to define security practices around the handling of credit cards.

PDF (Portable Document File)
PDU (Protocol Data Unit)
PE (Provider Edge)

Telecom lingo so cisco uses this term a lot.

PEM (Privacy Enhanced Mail)

The format certificates are normally stored in.

PHP (Probably stands for something)

A scripting language often used to program back-end web servers.

PII (Personally Identifiable Information)
PKCS (Public Key Cryptography Standard)
PKI (Public Key Infrastructure)
Plain Text

A human-readable format, i.e., no encyrption and not a binary file.

PLC (programmable logic controller)
PM (Product Manager)

Could also be Project Manager but for me it usually means Product Manager.

PO (Purchase Order)
POC (Point of Contact)
POC (Proof of Concept)
Port Channel
Portable Executable (PE)
POTS (Plain Old Telephone Service)

Voice-grade telephone service employing analog signal transmission over copper 

POV (Proof of Value)
Private Cloud
Prompt Engineering

The practice of crafting effective prompts that elicit high-quality answers from generative AI tools.

PS (PostScript)

A file type I used to use. It is a vector-oriented language, stack-based, which tells the printer how to move its ink pens around the page. Before there was PDF, there was postscript.

PS (PowerShell)

A versatile scripting language developed by Microsoft and available on all Windows computers.

PTO (Paid Time Off)
Purple Team

Purple teams combine red team and blue team functions. See Red Team.

PyPi (Python Package Index)
Python

A popular programming language, not the snake.

QSFP (Quad Small Form factor Pluggable)

A newer kind of SFP.

Rack Unit
RADIUS
RAG (Retrieval Augmented Generation)

A method to train LLMs.

Ray

An open-source unified compute framework used by the likes of OpenAI, Uber, and Amazon which simplifies the scaling of AI and Python workloads, including everything from reinforcement learning and deep learning to tuning and model serving.

RBAC (Role-Based Access Control)
RDP (Remote Desktop Protocol)
Recursive

A function which calls itself.

Redhat Linux

A commercialized version of Fedora whose packages are always dated, usually by years.

Redirect
Remediation

Addressing a security flaw.

Remote Desktop Licensing (RDL) services

Often deployed on Windows severs with Remote Desktop Services deployed.

Retrieval-Augmented Generation (RAG)
Reverse Engineer

To figure out the basic building blocks or code by first observing behavior of a system.

Reverse Proxy

A TCP gateway which terminates a tcp connection and maintains a separate tcp connection to a back-end server.

RFC (Request for Comment)
RFI (Requst for Information)
RFO (Reason for Outage)
RFP (Request for Proposal)
RFQ (Request for Quote)
RIPE
RIR (Regional Internet Registry)
RMA (Return Merchandise Authorization)

You hear this a lot when It guys need to get a replacement for failed equipment.

RMM (ReMote Management)
ROA (Route Origin Authorization)
ROCE (Return on Capital Employed)

Hey, an IT person has to know a few business terms!

Route 53

In AWS-land, an intellugent DNS service, i.e., geoDNS +.

RPC (Remote Procedure Call)
RPKI (Resource Public key Infrastructure)

Provides a way to connect Internet number resource information to a trust anchor.

RPi (Raspberry Pi)

A popular small, inexpensive server aimed at the educational crowd.

RPZ (Response Policy Zone)

A concept in DNS for either a DNS firewall or way to overwrite DNS responses.

RR (Resource Record)
RSA

Asymmetric encryption standard named after its creators, Ron Rivest, Adi Shamir and Leonard Adleman.

RTFM (Read The “flippin'” Manual)
SaaS (Software as a Service)
SAML
SANS

Private outfit in the US which specializes in information security and cybersecurity training.

Sans-Serif

A font type which does not have the fancy rounded blobs at the tips of the letter, such as Helvetica.

SASE (Secure Access Service Edge)
SCADA
Scale sets

In cloud, a service which automates the build-up or tear-down of VMs behind a load balancer.

SDWAN (Software defined WAN)
SEO (Search Engine Optimization)
SFP (Small Form factor Pluggable)

A type of optic transceiver that converts electric signals to optical signals.

SGML (Standard Generalized Markup Language)

If you ask the French they proudly point to this as the predeccesor to the more widely known HTML.

SFTP (Secure file Transfer Protocol)
SHA (Secure Hash Algorithm)
SIEM (Security Information and Event Management)
SRE (Site Reliability Engineer)
SMP (Symmetric Multi Processing)
SMTP (Secure Mail Transfer Protocol)
SNAT (Source NAT)
SNI (Server Name Indication or similar, I think)

When multiple HTTP[S web sites whare a single IP this technology can be used to identify which certificate to send to a requester.

SNMP (Simple Network Management Protocol)

All security appliances support this protocol which permits system monitoring.

Spoofing

When a source IP address is faked.

SSH
SSL (Secure Socket Layer)
SOC (System on a Chip)

I believe the RPi is described to be this.

SOC (Security Operations Center)
Solaris

A Unix variant possibly still available. Offered by Oracle and formerly Sun Microsystems Corporation.

SSL (Secure Sockets Layer)
SSL labs

A Qualys (so you know it has to be good quality) service where you can test a web site’s SSL certificate.

SSO (Single Sign On)
Steal with Pride

To unashamedly build on someone else’s work.

SVI (Switch virtual Interface)

A layer 3 on-switch routing between vlans on that switch. It’s a Cisco thing.

TAC (Technical Account or something?)
TAM (Technical Account Manager)

Another Cisco term.

Cisco uses this term a lot.

TCP (Transport Control Protocol)
TeamViewer

A remote management tool.

Telnet
Terraform
The epoch

The first moment of January 1st, 1970

TI (Threat Intelligence)
TLP (Traffic Light Protocol)
TLS (Transport Layer Security)
Tooltip

An element of a graphical user interface in the form of a box of text that appears when a cursor is made to hover over an item; normally used to explain the function of the item.

TPM (Trusted Platform Module)

TPM, a Microsoft security feature required by Windows 11, is a dedicated chip designed to provide “hardware-level security services for your device,” keeping your private information and credentials safe from unauthorized users. 

TSF (Tech Support File)

Palo Alto Networks-specific lingo for a dump file they require for a firewall support case.

Ubuntu Linux

A commercialized implementation of Debian Linux from Canonical.

UC (Unified Communications)

Cisco likes this term.

udev rules

udev rules in Linux are used to manage device nodes in the /dev directory. Those nodes are created and removed every time a user connects or disconnects a device.

UI5

SAP’s UI for HTML 5.

Ultrix

A Unix variant which ran on DEC workstations.

Underlay

SD Wan terminology for the underlying network. As opposed to overlay.

Unit testing
UPS (Uninterruptible Power Supply)
URL
Use case
UTC (Universal Time Coordinated)

What used to be called GMT.

UTF-8

Common representation of common language characters. I think of it as a successor to ASCII.

Validated

In FDA parlance, an adjective used to describe a system which follows FDA controls.

VDI

A virtual desktop offered by Citrix.

VLAN
VM (Virtual Machine)
VMWare

Will Broadcom destroy this company the way they did to Bluecoat/Symantec?

VNC (Virtual Networking Computer)

VNC is a software used to remotely control a computer.

VPC (Virtual Private Cloud)
vPC (Virtual Port Channel)

A virtual port channel (vPC) allows links that are physically connected to two different Cisco FEXes to appear as a single port channel by a third device.

VPG (Virtual Port Group)

A Cisco-ism.

VPN – Virtual Private Network
VRF

A logically separated network when using MPLS.

WAF (Web Application Firewall)
Webhook
Website
Wiki

A less formal and usually more collaborative approach to documentation, the prime example being Wikipedia.

Windows PE or Win PE

A small OS for repairing or restoring Windows systems.

WWW (World Wide Web)
x86

A type of processor architecture. Found in most Windows PCs.

XHR (XMLHttpRequest)

I.e., ajax.

XML (eXtensible Markup Language)

Common file format for data exchange, but not too human-friendly.

YAML
YARA
Zabbix

An open source infrastructure monitoring system.

Categories
Scams Spam

Latest spear phishing: your password plus extortion

Intro
Three users that I know at a certain company have all received spear phishing emails worded very much like this one:

Spear Phishing shows you your password and extorts you

The details
I don’t really have many more details. One user described it to me as follows. He got this email at work. It displayed to him a password which he uses for some of his personal accounts and maybe for a few work-related logins. He said the wording was very similar to the one I showed in the above screenshot.

This one comes from IP 40.92.6.45, which is a legitimate Microsoft-owned IP. So it has an air of legitimacy to traditoinal spam filters.

I htikn all the users are reluctant to pursue the normal methods o reporting phishing, which involve sending the entire email to some unknown group of analysts because the email does in fatc contain a legitimate password of theirs. This makes it that much harder for an incident repsonse team to kick into gear and start a detailed analysis.

I mentioned three users – those are just the ones brought to my attention, and I’m not even in the business any more. So by extrapolation, this has probably occurred to many more users at just this one company. It’s disturbing…

November update
Another one came in to a different user. I have the text of this one and have only changed the recipient information.

From: [email protected] <[email protected]>
Sent: Thursday, November 29, 2018 11:55 AM
To: Dr J <[email protected]>
Subject: [email protected] has been hacked! Change your password immediately!
 
Hello!
 
I have very bad news for you.                                                                                                                                 03/08/2018 - on this day I hacked your OS and got full access to your account [email protected] On this day your account [email protected] has password: drj1234
 
So, you can change the password, yes.. But my malware intercepts it every time.
 
How I made it:
In the software of the router, through which you went online, was a vulnerability.
I just hacked this router and placed my malicious code on it.
When you went online, my trojan was installed on the OS of your device.
 
After that, I made a full dump of your disk (I have all your address book, history of viewing sites, all files, phone numbers and addresses of all your contacts).
 
A month ago, I wanted to lock your device and ask for a not big amount of btc to unlock.
But I looked at the sites that you regularly visit, and I was shocked by what I saw!!!
I'm talk you about sites for adults.
 
I want to say - you are a BIG pervert. Your fantasy is shifted far away from the normal course!
 
And I got an idea....
I made a screenshot of the adult sites where you have fun (do you understand what it is about, huh?).
After that, I made a screenshot of your joys (using the camera of your device) and glued them together.
Turned out amazing! You are so spectacular!
 
I'm know that you would not like to show these screenshots to your friends, relatives or colleagues.
I think $709 is a very, very small amount for my silence.
Besides, I have been spying on you for so long, having spent a lot of time!
 
Pay ONLY in Bitcoins!
My BTC wallet: 1FgfdebSqbXRciP2DXKJyqPSffX3Sx57RF
 
You do not know how to use bitcoins?
Enter a query in any search engine: "how to replenish btc wallet".
It's extremely easy
 
For this payment I give you two days (48 hours).
As soon as this letter is opened, the timer will work.
 
After payment, my virus and dirty screenshots with your enjoys will be self-destruct automatically.
If I do not receive from you the specified amount, then your device will be locked, and all your contacts will receive a screenshots with your "enjoys".
 
I hope you understand your situation.
- Do not try to find and destroy my virus! (All your data, files and screenshots is already uploaded to a remote server)
- Do not try to contact me (you yourself will see that this is impossible, the sender address is automatically generated)
- Various security services will not help you; formatting a disk or destroying a device will not help, since your data is already on a remote server.
 
P.S. You are not my single victim. so, I guarantee you that I will not disturb you again after payment!
 This is the word of honor hacker
 
I also ask you to regularly update your antiviruses in the future. This way you will no longer fall into a similar situation.
 
Do not hold evil! I just do my job.
Good luck.

Conclusion
A new disturbing type of spear phishing campaign is presented. The email presents an actual password (no hint as to how the hacker obtained it) and then tries to extort the user for quite a bit of money to avoid reputation-damaging disclosures to their close associates.

References and related
This is a useful site, albeit a little frightening, that shows you the many sites that have leaked your Email address due to a data breach: https://haveibeenpwned.com/

Categories
Admin Internet Mail Spam

Gmail: not as much a white-glove service as you thought

Intro
I have a pretty high regard for Google and their Gmail email service. They really seem to strive to keep its reputation sterling. But lately a persistent spam has been coming in to me from one of their users and no action is being taken, so I am beginning to wonder.

The details

It’s not that I don’t get spam from Gmail account holders. I do. That’s not shocking as I get hundreds of spam each day since my address is available from whois registrations on hundreds of domains, amongst other readily available sources.

How do I know it’s a real Gmail user and not someone spoofing the sender address? These two headers tell me:

Received-SPF: pass (google.com: domain of [email protected] designates 209.85.160.44 as permitted sender) client-ip=209.85.160.44;
Received: from mail-pb0-f44.google.com ([209.85.160.44]) (using TLSv1) by drjohnstechtalk.com, etc.

In other words one of the received headers provided by a trusted server gives me the IP of the sending server (209.85.160.44), which is in Google’s directly allocated IP space.
This can be confirmed at arin.net.

The email itself looks like this:

From: "Tom Zhu" <[email protected]>
 
Dear Sir/Madam,
 
We are owner of your_domain.mx and Presently we would like to know if you have an
interest in buying it. We are looking to sell this domain for 2,000 Euro.
It has been listed on Sedo.com. You can buy it through the following link:
 
https://sedo.com/search/details.php4?domain=your_domain.mx
...

But instead of your_domain the email mentions a specific copyrighted domain name.

I’ve received it over 10 times from the same sender. The sender is a cybersquatter sending repeated, unsolicited spam. If that doesn’t constitute a violation of their Terms of Use then I don’t know what does. I’ve filed no fewer than five formal abuse complaints to Google over the course of the several months. The Gmail abuse link is in the references. But they keep coming in so I know Google has taken no action whatsoever. And of course I have never heard back from them.

I’ve filed lots of other abuse complaints about other Gmail senders as well, but those others seem to be one-off spams and I don’t get additional emails from them. Yes it takes time to fill out the abuse form, but I do it for the overall good of the Internet. We technical people have a responsibility to make our world better…

Conclusion
I am personally miffed and professionally concerned that Google Gmail may not be operating as clean a shop (white glove) as we all had thought. Here I’ve documented a specific case of documented abuse about which they have done nothing for months on end.

References
Gmail abuse link.
ARIN’s IP lookup service is here.
A detailed look at how enom has handled some spam/domain complaints is written up here.
My successful fight to conquer scads of Chinese language spam is documented here.

Categories
Internet Mail Spam

enom is the source of recent spam campaigns

Intro
I’m still watching over spam. The latest trend are spam campaigns which have a few characteristics in common perhaps the most interesting of which is that the domains have all been registered at enom.com.

The details
Some other things in common. These recent campaigns fell into two main categories. One set uses domains which are semi-pronounceable. The other are domains which incorporate sensible english words. Both categories have these other features in common.

– brevity (no HTML, for instance)
– valid SPF records (!)
– domains were used for spam almost immediately after having been registered (new domains)

Today’s example

From:        Patriot Survival Plan <[email protected]> 
To:        <[email protected]> 
Date:        05/22/2014 04:22 AM 
Subject:        REVEALED: The Coming Collapse 
 
 
 
--------------------------------------------------------------------------------
 
 
 
 
[email protected]
 
Since I exposed this I'm getting a lot of comments. 
 
People are terrified and they are asking me to spread the word even more...
 
So don't miss this because it might be too late for you and your family!
 
Obama's done a lot of stupid things so far, but this one will freeze the blood in your veins!
 
He's been trying hard to keep this from American Patriots... but now his betrayal has finally come to light.
 
And he'll have to pay through the nose for this.
 
But here's a Warning: the effects of Obama's actions will hit you and your family by the end of this year.
 
And they'll hit you like nothing you've ever seen before...
 
So watch this revealing video to know what to expect...
and how to protect against it.
 
-> Watch Blacklisted video now, before it's too late -->                 http://check.best-survival-plan-types.com
 
 
 
 
 
 
 
No_longer_receive_this _Warning :   http://exit.best-survival-plan-types.com
Patriot Survival Plan _405 W. Fairmont Dr. _Tempe, AZ 85282
 
 
 
 
 
First off, there's nothing special 22409526 in the Ironbound. Food in quantity, 22409526not quality. It's amazing how many people 22409526 rate these establishments as excellent. This said, I've always had fun going to these places, 22409526 as long as your dining expectations are gauged accordingly. Therefore, 22409526 my rating reflects those reduced expectations. :)
 
Being a steakhouse, 22409526 one would expect a thorough steak menu such as those at Gallagher's, Luger's, or even Del Frisco's. However, you're not getting true steakhouse fare here; 22409526 it's the Ironbound after all. So, you're getting a less than Prime cut of beef, 22409526 sometimes cooked to your liking.

Whois lookup of best-survival-plan-types.com shows this:

Domain Name: BEST-SURVIVAL-PLAN-TYPES.COM
Registry Domain ID: 1859701370_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.enom.com
Registrar URL: www.enom.com
Updated Date: 2014-05-21 17:26:19Z
Creation Date: 2014-05-22 00:26:00Z
Registrar Registration Expiration Date: 2015-05-22 00:26:00Z
Registrar: ENOM, INC.
Registrar IANA ID: 48
Registrar Abuse Contact Email: [email protected]
Registrar Abuse Contact Phone: +1.4252744500
Reseller: NAMECHEAP.COM
Domain Status: clientTransferProhibited
Registry Registrant ID:
Registrant Name: DONI FOSTER
Registrant Organization: NONE
Registrant Street: 841-4 SPARKLEBERRY LN
Registrant City: COLUMBIA
Registrant State/Province: SC
Registrant Postal Code: 29229
Registrant Country: US
Registrant Phone: +1.8037886966
Registrant Phone Ext:
Registrant Fax: +1.5555555555
Registrant Fax Ext:
Registrant Email: [email protected]
Registry Admin ID:
Admin Name: DONI FOSTER
Admin Organization: NONE
Admin Street: 841-4 SPARKLEBERRY LN
Admin City: COLUMBIA
Admin State/Province: SC
Admin Postal Code: 29229
Admin Country: US
Admin Phone: +1.8037886966
Admin Phone Ext:
Admin Fax: +1.5555555555
Admin Fax Ext:
Admin Email: [email protected]
Registry Tech ID:
Tech Name: DONI FOSTER
Tech Organization: NONE
Tech Street: 841-4 SPARKLEBERRY LN
Tech City: COLUMBIA
Tech State/Province: SC
Tech Postal Code: 29229
Tech Country: US
Tech Phone: +1.8037886966
Tech Phone Ext:
Tech Fax: +1.5555555555
Tech Fax Ext:
Tech Email: [email protected]
Name Server: DNS1.REGISTRAR-SERVERS.COM
Name Server: DNS2.REGISTRAR-SERVERS.COM
Name Server: DNS3.REGISTRAR-SERVERS.COM
Name Server: DNS4.REGISTRAR-SERVERS.COM
Name Server: DNS5.REGISTRAR-SERVERS.COM

See 1) that it was registered yesterday at 17:26:19 Universal Time, and 2) that the registrar is enom?

And the SPF record:

> dig +short txt best-survival-plan-types.com

"v=spf1 a mx ptr ~all"

Actually this domain is a small aberration insofar as it does not have a SPF record with a -all at the end – the others I checked do.

What to do, what to do
Well, I reported the spam to Postini, but I don’t think that has any effect as they are winding down their business.

I am pinning greater hopes on filling out enom’s abuse form. Of course I have no idea what actions, if any, they take. But they claim to take abuse seriously so I am willing to give them their chance to prove that.

enom’s culpability
I don’t feel enom is complicit in this spam. I’m not even sure they can easily stop these rogue operators. But they have to try. Their reputation is at stake. On the Internet there are complaints like this from years ago, that enom domains are spamming.

Every one that comes across my desk I am reporting to them. The time it takes for me to report any individual one isn’t worth the effort compared to the ease of hitting DELETE, but I am hoping to help lead enom to find a pattern in all these goings-on so they can stop these registrations before new ones cause harm – that is why I feel my actions are for the greater good.

Other recently deployed enom domains

Domain

First spam seen

First registered

onlinetncresults.us

8/22

8/21

checkdnconlinesystems.us

8/20

8/20

extremeconcretecoating.com

8/8

8/8

woodsurface.com

8/7

8/7

shorttermloanspecial.com

7/24

7/23

heartattackfighter1.com

6/19

3/2

handle-unsafe-parasites.me

6/10

6/9

best-survivalplan-learn.com

5/28

5/28

survival-plan-days.com

5/27

5/26

only-survival-plan.com

5/20

5/19

local-vehicle-clearance.us

5/19

5/19

ghiused.com

5/14

5/14

pastutmy.com

5/14

5/14

lekabamow.com

5/14

5/14

etc – there are plenty more!

Finally we hear back
Weeks later, on June 14th, I finally received a formal response concerning only-survival-plan.com and local-vehicle-clearance.us.

From: [email protected]
Subject: [~OOQ-128-23745]: FW: eNom - Report Abuse - Reference #ABUSE-11116
 
Hello, 
 
Thank you for your email. While the domain name(s) reported is registered with Namecheap, it is hosted with another company. So we cannot check the logs for the domain(s) and confirm if it is involved in sending unsolicited bulk emails. We can only take an action if a report is confirmed by blacklists of trusted anti-spam organizations like SpamHaus or SURBL.
 
Thus, we have initiated a case regarding the following domain(s) blacklisted by trusted anti-spam organizations:
only-survival-plan.com
In case the listing is not removed, the domain(s) will be suspended.
 
The following domain(s) has already been suspended:
local-vehicle-clearance.us
 
Let us also suggest you addressing the issue to the hosting company which servers were involved in email transmission for help with investigating the incident of spam. You may find their IP address in the headers. To find their contact details, please whois this IP address. You may use any public Whois tool like https://www.domaintools.com/ 
 
Kindly let us know if you have any question.
 
-------------------------------
Regards,
Alexander XXX.
Legal & Abuse Department
Namecheap Group
http://www.namecheapgroup.com

Analysis of their response
Reading between the lines, here’s my analysis. There’s some not-well-documented relationship between enom and namecheap.com. I reported the abuse to enom and got a response from namecheap.com. I kind of agree that suspending a domain is a BIG DEAL and a registrar has to be on firm footing to do so. As I write this one Jun 16th, the domains do not yet appear to be suspended. Are you really going to trust Spamhaus to render your judgement? That’s basically one of those extortionist enterprises purportedly offering a take-it-or-leave-it service. If the author of that email was a lawyer, well, their English isn’t the best. That doesn’t provide a lot of confidence in their handling of the matter. And wasn’t my complaint by itself good enough for them to initiate action? I do have to concede the point that the sending of the spam was probably out of their control and probably did come from another hosting company. But it is glib advice to suppose it is that easy to track them down the way they describe. Since they are part of the problem and have the evidence why don’t they follow up with the hosting provider themselves?? There was no mention of my other eight or so formal complaints. So this still seems to be getting an ad hoc one-by-one case treatment and not the, Whoa, we got a problem on our hands and there’s something systemically wrong with what we’re doing here reaction I had hoped to provoke.

Actually I got two responses but with slightly different wording. So they were crafted by hand from some boilerplate text, and yet the person stitching together the boilerplate was sufficiently mindless of the task as to forget they had already just sent me the first email??

So their response is better than a blackhole, but perhaps could be characterized as close to the bare minimum.

I have gotten several other responses from some of my other complaints as well, all saying pretty much the same thing. In August the responses started to look different however.

August responses
Here’s one I received this morning about woodsurface.com, 19 days after my initial complaint:

Hello,
 
This is to inform you that woodsurface.com domain was suspended. It is now pointed to non-resolving nameservers and will be nullrouted once the propagation is over. The domain is locked for modifications in our system.
 
Thank you for letting us know about the issue. 
 
------------------
Regards,
Alexander T.
Legal & Abuse Department
Namecheap.com

Conclusion
I hope my actions spur enom into some action of their own in figuring out where there domain registration requirements are too lax that spammers are taking wholesale advantage of the situation and sullying their reputation.

June, 2014 Update
The storm of spam from enom has subsided. I’m basically not seeing any. Oops. Spoke too soon! New enom-registered domains popped up and created more spam storms (documented in the table above), but not as severe as in the past. I don’t know if our anti-spam filter got better or enom stepped up to the plate and improved their scrutiny of domain registrants. If another spam storm hits us I’ll report back…

August, 2014
enom-generated spam is back!

References
My most popular spam-fighting article describes how to defeat Chinese-language spam.
A new type of spam that uses Google search results for link laundering is described here.

Categories
Internet Mail IT Operational Excellence Spam

How to Stop Chinese Spam – for Mail Admins, w/ June 2014 update

(Updated 12/19/2011 and 6/2014 with additional character sets)
(updated 9/2012 with additional signature)
Intro
I have been a target for random Chinese language spam in my various email accounts, but the problem has really gotten worse in the past few months.

The thing about these messages is that at first Postini (a Google spam filtering service used mostly by businesses), wasn’t very good at catching them. Postini is about the best in the business, and they’re competently catching just about every other type of spam. But these Chinese character messages kept slipping through…

Their support tech gave me some advice which turned out to be incorrect, but led me in the right direction. Their tech told told me to create a content manager rule, but the actual rule he provided was only going to catch Russian and Ukranian spam!

This is the rule he provided:

Rule Name: Non_English_spam
"Match Any"
Header - matches regex

koi8-r|koi8-u|koi7|koi8
Disposition: delete (blackhole)
Set quarantine to Recipient

I had no idea what that was doing, so I looked up koi8-r, koi8, etc and found that it had to do with the Cyrillic alphabet. So I wondered if the Chinese language spams have something similar, but for Chinese. Indeed they do: gb2312. Looking at a few of my Chinese spams, almost all contain this string in the headers. It’s not always in the exact same place, but it’s there. To be concrete, here’s an example (some headers have been obfuscated to prevent the bad guys from trying to reverse engineer Postini’s scoring algorithms):

Received: from websmtp.sohu.com ([61.135.132.136]) by eu1sys200amx108.postini.com ([207.126.147.10]) with SMTP;
		 Sun, 28 Aug 2011 18:41:21 GMT
Received: from omlbw (unknown [110.53.27.141])
		 by websmtp.sohu.com (Postfix) with ESMTPA id 9B3C6720CEA;
		 Sun, 28 Aug 2011 23:55:04 +0800 (CST)
Message-ID: <[email protected]>
From: =?gb2312?B?y7O1wsf4xu/A1rbguabE3NfU0NCztdPQz965q8u+?= <[email protected]>
To: 
Subject: =?gb2312?B?d3Azz/ogytsg1vcgudwg1/Yg0KkgIMqyIMO0IA==?=
		 =?gb2312?B?uaQg1/cgssUgxNwgzOEgIMn9INK1ILyoIKO/LS0=?=
		 =?gb2312?B?qIk=?=
Date: Sun, 28 Aug 2011 23:55:37 +0800
MIME-Version: 1.0
X-mailer: Lzke 2
X-SOHU-Antispam-Bayes: 0
X-pstn-levels:     omitted
X-pstn-settings: omitted
X-pstn-addresses: from <[email protected]> [49/2] 

Content-Type: multipart/mixed;
		 boundary="----=_NextPart_000_015A_013AC9FA.1A2D5A60"

------=_NextPart_000_015A_013AC9FA.1A2D5A60
Content-Transfer-Encoding: base64
Content-Type: text/html;
		 charset="gb2312"

See it? charset=”gb2312″ appears in the content-type header and =?gb2312? appears in both the Subject and From fields.

That message looks like this as displayed in my mail client:

How do I know this is Chinese? I pasted the characters into translate.google.com and it auto-detected it. That’s a convenient tool!

How do I know it is spam? I am open-minded. Perhaps it is a legitimate business proposition that just happens to be written in Chinese? It does sort of read that way from the translation of any one such message. On the other side are some stronger pieces of evidence. The empty To: header is a strong hint, but some legitimate messages could contain that undesirable feature, so that is merely an indicator but not definitive. Most important is the fact that I get these messages, all showing similar patterns in appearance, and most telling always coming from a different sender tells me unambiguously that this is really, truly spam.

So the actual Postini Content Manager rule to capture Chinese spam is this:

Rule Name: Chinese_spam
"Match Any"
Header matches regex (charset="gb2312"|=\?GB2312\?)

Disposition: delete (blackhole)
Set quarantine to Recipient

Obviously this type of rule is a bit dangerous. What if you are expecting something written in Chinese? It will be subject to the same treatment as the spam. That is why the suggestion is to Set quarantine to recipient so that these messages could be delivered from the user quarantine.

And over the course of a couple months Postini has gotten much better about capturing this type of spam. That is the best thing – to let the experts handle it. They just needed to train their algorithms. I was quite concerned at first that this spam is so different from the usual, recognizable spam campaigns that they might have a hard time spotting it while simultaneously allowing the good Chinese email through. But they’re almost there…

12/19 UpdateThe filter described above has been working extremely well for me. Essentially perfectly, in fact, as I can see when I look in my quarantine. But not today. Today I got some suspected Chinese spam in and examing the headers showed something slightly different. The subject looks like this:

Subject: =?GBK?B?bnZ2dyAyMDExLjEyLTIwMTItMDEgvqsgxrcgzcYgz/ogIGZkZXI=?=

And the Mime header also had that string:

Content-Type: text/plain;
		 charset=GBK

Looking up GBK character set you’ll immediately see it is simplified Chinese, extended. So I think we better add that character set to our expression. It makes our content manager rule only a little more complicated. Now we would have:

Rule Name: Chinese_spam
"Match Any"
Header matches regex (charset="gb(k|2312)"|=\?GB(K|2312)\?)

Disposition: delete (blackhole)
Set quarantine to Recipient

For the complete prescription see the summary in the Conclusion.

If you happened upon this article and don’t have the Postini service is there any relevance? Yes, I think so. You should be able to filter on the message headers to look for the string =?gb2312? or =?gbk? in the beginning of the subject line. To speak about mailers with which I have some experience, in sendmail you could do this with a milter. In PureMessage it would be possible to concoct an appropriate rule as well.

9/2012 Update
My filter was working so well these past few months I essentially forgot about the problem, but the occasional Chinese spam slipped through. How? It used a different encoding. Here is an example subject line:

Subject: =?utf-8?B?6K+35p+l5pS277yB?=

This is displayed by my mail client as three Chinese characters followed by “!” They used a different encoding. This one drove me to do a little research. This is an Encoded-Word, according to Wikipedia’s excellent MIME writeup. The “?B?” in the front means base64 encoding. I had previously written a mimedecoder in perl, which I put to use:

> mimedecode 6K+35p+l5pS277yB

which produces:

???!

which is pretty much garbage. So I decided to analyze the output with unix utility od:

> mimedecode 6K+35p+l5pS277yB|od -x

which gives

0000000 e8af b7e6 9fa5 e694 b6ef bc81

Next, I needed a UTF-8 converter, which I found at this Swiss site.

I used it with input type hexadecimal.

The results reproduced exactly the Chinese characters my mail client displayed to me! It also gives a lot of other descriptions for these characters (such as Cangjie). The first few lines begin:

As character names:

U+8BF7 CJK UNIFIED IDEOGRAPH character (请)
U+67E5 CJK UNIFIED IDEOGRAPH character (查)
U+6536 CJK UNIFIED IDEOGRAPH character (收)
U+FF01 FULLWIDTH EXCLAMATION MARK character (!)

As raw characters:

请查收!

Well, that was an interesting exercise, but I’m not sure we’ve learned anything that can be put to use in a RegEx on the original expression. Unless there’s a way to uniquely identify Chinese characters by the beginning of the encoded-word sequence following the ?B?. I have my doubts, but since I don’t seem to get thee UTF-8 emails from other sources, and I have a sample size of about five emails that fooled the other filter to work with, I have developed a content filter which would capture all of them!

Check for a header containing the RegEx:

=\?utf-8\?B\?[56]

More specifically sometimes the utf-8 string is used in the From header, sometimes it is in the subject. Most of my samples would have been caught by the simpler RegEx =\?utf-8\?B\?5, and I mention that in case you want to be more specific, but there was one recent one that had a “6” instead of a “5.”

For the record here’s that mimedecode “program”

#!/usr/bin/perl
# base64 MIME decoding
# example:
# mimedecode Nz84QGxhdGU=
# =&gt; 7?8@late
use MIME::Base64;
 
foreach (@ARGV) {
#      $encoded = encode_base64($_);
      $decoded = decode_base64($_);
#print "enc,dec: $encoded, $decoded\n";
        print $decoded;
}

And its sister program, which I call mimeencode:

#!/usr/bin/perl
# base64 MIME decoding
# DrJ, 6/2004
# example:
# mimedecode Nz84QGxhdGU=
# =&gt; 7?8@late
use MIME::Base64;
 
foreach (@ARGV) {
      $encoded = encode_base64($_);
#      $decoded = decode_base64($_);
#print "enc,dec: $encoded, $decoded\n";
        print $encoded;
}

There’s probably a built-in linux utility which does the same thing, I just don’t know what that is.

2022(!) update

Well, I finally ran across it. The built-in program to do mimeencode/mimedecode is base64. Oh well, better late than never…

Conclusion
Your users needn’t suffer from Chinese Spam. The vast majority are characterized by, um, Chinese characters, of course, whose presence is almost always indicated by the string gb2312 in the message headers. You can take advantage of that fact and build an appropriate rule for Postini or your mailer. But beware of throwing out the baby with the bathwater! In other words, make sure you only subject your users to this rule unless you either have a good quarantine, or they are sure they should never receive this type of email.

There are some spam types which evade the gb2312 rule mentioned above, however. And this part is not as well tested, frankly. The exceptions, which are still a minority of my Chinese spam, are characterized by a subject line or sender that contains =?utf-8?B?5… or =?utf-8?B?6… (see summary below). My honest expectation is that a rule this broad and coarse will also catch a few other languages (Portuguese?, Urdu?, etc.) so be careful! If you are expecting to get non-english email more testing is in order before implementing the utf-8 filter. But it will certainly help to eliminate even more Chinese spam.

4/2013 update
Summary, including 6/2014 update
My filter has worked very well for me and has withstood the test of time. I catch at least a dozen Chinese spams each day. One got through in 6/2014 however, with character set gb18030. I realize reading the above write-up is confusing because I’ve mixed my love of telling a good IT mystery with my desire to convey useful information. So, to summarize, the new combined rule is:

Match Any:

Header matches RegEx:
(charset=”gb(k|2312|18030)”|=\?GB(K|2312|18030)\?)

Header matches RegEx:
=\?utf-8\?B\?[56]

References
A spate of spam from enom-registered domains is described here.
A disappointing case where Google is not operating their Gmail service as a white-glove service is described here.