What arrives in our in-boxes these days is becoming progressively richer and
fatter. The content includes HTML-formatted rich text, hyperlinks and
attachments of various types, including Office documents, databases,
images, videos, etc. It is now estimated that more than 5 per cent of emails
contain images.
Most companies that employ content security have an
email/Internet policy document. Email users must read this document and
agree to abide by its rules. Email misuse can be deliberate or accidental,
so it is important to have the ability to enforce the rules in the policy to
prevent a policy breach. The ability to analyze messages and detect
unacceptable images is a very important part of this policy enforcement.
Pornography is a big issue in many parts of the
world, but acceptance of pornography varies by region and culture. In the
workplace it is unacceptable in most regions. Most companies would not find
it acceptable for employees to bring pornographic magazines into work and
start reading them at their desk during working hours. The same type of
content is available via email and this should be treated with the same
attitude.
There have been many high-profile cases involving
pornography in email. There are often large numbers of people involved,
simply because it is so easy to forward email to large groups of people.
Terminating employees, or suspending them on full pay during a lengthy
investigation, represents an enormous cost to a business in terms of lost
revenue and damage to reputation.
Pornography is not the only threat
There are hundreds of ‘joke image’ web sites
available, with thousands of joke images, many of which could be considered
blasphemous, racist, sexist, pornographic or otherwise offensive. Some
office jokers would be easily tempted to click on the ‘email to a friend’
button on these sites, but not everyone may have the same sense of humor. An
image that is a joke to one person may well be offensive to another.
After September 11, 2001, and the subsequent U.S. military action, a great
number of ‘joke’ emails were circulated containing images featuring
Osama bin Laden. Many people found these images offensive.
Mail can easily be misdirected; it is very easy to
send a message to the wrong person. Email clients often auto-complete
addresses from the address book, and this is a common source of
mistakes. Most email users have either sent or received a misdirected email
at some point. Employing a policy-based content security solution can help
reduce the risk of misdirected content.
Protecting the value of images
Images can contain confidential information. These
images could be photographs from medical or legal records, confidential
designs such as silicon chip designs, or the shape and styling of a new
prototype car. It may be completely acceptable, or even necessary, for these
images to circulate within a company, but it is so easy for them to be
accidentally or deliberately forwarded to the wrong person outside the
company. One of our customers in East Asia is a car manufacturer. The
company's new car designs were registered in the Clearswift software as
images that must not be transmitted outside the company. When a company
insider attempted to send the designs to a competitor, the image management
software prevented it.
Litigation related to content security is on the
increase. In most cases the organization is responsible and liable for the
actions of its employees in the workplace, including employees’ use of
email and all the information transmitted from their systems. Legal issues
can arise if an employee sends an image that depicts other companies or
individuals in an unfavorable light.
It is easy to see how unacceptable images in email
can lead to legal problems - for example, if employees receive emailed
pornography and their employer has made no effort to prevent this. In many
countries, companies have a legal responsibility to protect their employees
from exposure to content threats, such as racist, sexist, pornographic and
other offensive material.
Images can be protected by copyright. Forwarding an email with
such an image could constitute an infringement of copyright. The issues of
digital rights management are becoming better understood and organizations
will progressively seek to formalize the description, identification,
trading, protection, monitoring and tracking of image assets - including the
management of rights holders’ relationships.
Mishandling of images can slow the company down
Images add a considerable size overhead to email. A
typical text email could be around 1KB. Adding one JPEG (typically about
40KB) to a small email could make the email over 40 times its original size.
An email with five attached images could be more than 200 times larger than
an email containing just text. If on average there is one image for every
ten emails, those ten emails grow from about 10KB to about 50KB, which
increases the volume of email traffic by 400 per cent.
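The arithmetic behind these figures can be checked with a short calculation. The sketch below uses the illustrative sizes quoted above (1KB of text, 40KB per JPEG); the function names are ours, and real messages would also carry encoding overhead that is ignored here.

```python
TEXT_KB = 1      # typical text-only email, per the figures above
IMAGE_KB = 40    # typical JPEG attachment, per the figures above


def email_size_kb(num_images: int) -> int:
    """Size of one email carrying the given number of images."""
    return TEXT_KB + num_images * IMAGE_KB


def traffic_increase_percent(emails: int, images: int) -> float:
    """Percentage growth in total volume when `images` images are
    spread across `emails` otherwise text-only emails."""
    text_only = emails * TEXT_KB
    with_images = text_only + images * IMAGE_KB
    return 100 * (with_images - text_only) / text_only


print(email_size_kb(1) / TEXT_KB)        # 41.0
print(email_size_kb(5) / TEXT_KB)        # 201.0
print(traffic_increase_percent(10, 1))   # 400.0
```

On these assumptions, one image in every ten emails turns 10KB of traffic into 50KB.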
This has implications in terms of network bandwidth
and storage. Emails with images take up more space on email servers.
Many copies of the same email may be stored on the same
email server. For example, a cartoon email received by one person may be
forwarded to colleagues, and then on to others. It is not uncommon to
receive the same joke email several times from different sources - “Oh no,
not that one again.” This is a waste of bandwidth and storage resources.
Advanced content security products should be able to
extract images from reports, spreadsheets, presentations and many other
types of documents. Documents are often distributed using email, both
internally within an organization, and externally, to customers and
partners. Many of these documents contain images. For content security
software to be effective, it is essential that rich documents can be
decomposed to extract the images within them. These images can then be
passed to an image analysis component.
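As a rough illustration of decomposing a rich document, the sketch below pulls embedded pictures out of a ZIP-based Office file. It assumes the newer Office formats (.docx, .xlsx, .pptx), which store pictures in a media folder inside the archive; older binary formats and other document types would need different parsers. The function name extract_images is ours, not part of any product.

```python
import io
import zipfile

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff")


def extract_images(document_bytes: bytes) -> dict[str, bytes]:
    """Return {archive_path: image_bytes} for every embedded image.

    Assumes a ZIP-based Office file, where pictures live under
    word/media/, xl/media/ or ppt/media/ inside the archive.
    """
    images = {}
    with zipfile.ZipFile(io.BytesIO(document_bytes)) as archive:
        for name in archive.namelist():
            if "/media/" in name and name.lower().endswith(IMAGE_EXTENSIONS):
                images[name] = archive.read(name)
    return images


# Demo with a tiny stand-in archive rather than a real document.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as fake_doc:
    fake_doc.writestr("word/document.xml", "<w:document/>")
    fake_doc.writestr("word/media/image1.png", b"\x89PNG fake image data")

found = extract_images(buffer.getvalue())
print(sorted(found))   # ['word/media/image1.png']
```

Each extracted byte string could then be handed to whatever image analysis component is in use.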
Reducing the incidence of ‘false positives’
Image recognition software will never be 100 per
cent accurate when trying to detect ‘unknown’ porn images. Therefore
there will be false positives (innocent images detected as pornography) and
false negatives (pornography passing through undetected).
Both these factors are very important to consider
when looking at detection rates. False negatives mean that ‘unacceptable’
images are delivered; false positives mean that potentially
business-critical mail is held up. From a business continuity point of view,
false positives can have the biggest impact: a legitimate business email
could be held in a quarantine area because it has been incorrectly
identified as containing pornography.
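The two error rates can be made concrete with a small calculation. This is a minimal sketch using invented counts, not measured results from any product; the class and field names are ours.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    true_positives: int    # unacceptable images correctly caught
    false_positives: int   # innocent images wrongly quarantined
    true_negatives: int    # innocent images correctly delivered
    false_negatives: int   # unacceptable images that slipped through

    def false_positive_rate(self) -> float:
        """Share of innocent images that were wrongly flagged."""
        innocent = self.false_positives + self.true_negatives
        return self.false_positives / innocent if innocent else 0.0

    def false_negative_rate(self) -> float:
        """Share of unacceptable images that were missed."""
        unacceptable = self.true_positives + self.false_negatives
        return self.false_negatives / unacceptable if unacceptable else 0.0


# Invented counts for illustration only.
counts = DetectionCounts(true_positives=45, false_positives=20,
                         true_negatives=980, false_negatives=5)
print(counts.false_positive_rate())   # 0.02
print(counts.false_negative_rate())   # 0.1
```

Because innocent images usually far outnumber unacceptable ones, even the 2 per cent false positive rate in this made-up example holds up 20 legitimate images against 45 genuine detections.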
Some companies will wish to implement a policy
temporarily or intermittently. A company introducing content security or
refining its policy may start by simply monitoring to get an idea of what
the current situation is. This monitoring policy may deliver all emails, but
gather information that will help with the implementation of policy.
Pornography can be detected in images by examining a
variety of image attributes such as shape, color, tone gradients, position,
body part recognition, etc. This type of image analysis is processor
intensive. It is also possible to create signatures of ‘known images’
and use them to block known unacceptable images or pass
known acceptable images. Matching against signatures is very fast.
Sophisticated image analysis adds a processing
overhead to every message containing an image that can be
processed. This overhead may affect other processes on the host and delay
email. Combining image analysis with comparison against hashes of known
images is much more efficient. For example, known company logos can be added
to the list of ‘acceptable images’ to avoid false positives and reduce the
processing required.
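The combination described here might look like the following sketch: look the image up in the known lists first, and run the expensive analysis only on unknown images. SHA-256 is our assumption for the signature; it matches only byte-identical copies, so a resized or re-encoded image would need a more tolerant signature. The analyze callback stands in for a real content analyzer and is purely hypothetical.

```python
import hashlib
from typing import Callable


def fingerprint(image_bytes: bytes) -> str:
    """Exact-match signature: SHA-256 of the raw image bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def classify(image_bytes: bytes,
             acceptable: set[str],
             unacceptable: set[str],
             analyze: Callable[[bytes], bool]) -> str:
    """Check the known-image lists first; fall back to slow analysis."""
    digest = fingerprint(image_bytes)
    if digest in unacceptable:
        return "block"
    if digest in acceptable:
        return "pass"
    # Only unknown images pay the cost of full content analysis.
    return "block" if analyze(image_bytes) else "pass"


calls = []


def slow_analyzer(image_bytes: bytes) -> bool:
    """Placeholder analyzer: records each call and flags nothing."""
    calls.append(image_bytes)
    return False


logo = b"company logo bytes"
acceptable = {fingerprint(logo)}
print(classify(logo, acceptable, set(), slow_analyzer))              # pass
print(len(calls))                                                    # 0
print(classify(b"unknown image", acceptable, set(), slow_analyzer))  # pass
print(len(calls))                                                    # 1
```

The known logo is passed without the analyzer ever being called, which is where the saving in processing comes from.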
Keeping good images in and bad images out
Many companies may already have some form of content
security, but few content security solutions have the ability to manage
images. There are many options for implementing policy for images. These
include monitoring images in email; deleting all inbound and outbound mail
containing images; removing images from all email, but passing the email on
without the images; manually validating all email containing images; manual
validation with an ‘acceptable list’; and using an automatic process to
validate images in emails based on analysis of the image content.
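The options listed above can be written down as a simple policy table. This is only a sketch: the policy names, the outcome strings and the decide function are ours, chosen to mirror the list, and do not describe any particular product's configuration.

```python
from enum import Enum


class ImagePolicy(Enum):
    MONITOR = "monitor"                # deliver everything, just record it
    DELETE_MESSAGE = "delete"          # drop any mail containing images
    STRIP_IMAGES = "strip"             # deliver the mail without its images
    MANUAL_REVIEW = "manual"           # hold every image mail for a person
    MANUAL_WITH_LIST = "manual+list"   # as above, but skip known-good images
    AUTOMATIC = "automatic"            # decide by analyzing image content


def decide(policy: ImagePolicy,
           has_images: bool,
           all_known_acceptable: bool = False,
           analysis_flags_image: bool = False) -> str:
    """Map a policy and the facts about one message to an outcome."""
    if not has_images or policy is ImagePolicy.MONITOR:
        return "deliver"
    if policy is ImagePolicy.DELETE_MESSAGE:
        return "delete"
    if policy is ImagePolicy.STRIP_IMAGES:
        return "deliver_without_images"
    if policy is ImagePolicy.MANUAL_REVIEW:
        return "quarantine"
    if policy is ImagePolicy.MANUAL_WITH_LIST:
        return "deliver" if all_known_acceptable else "quarantine"
    # AUTOMATIC: trust the analysis result.
    return "quarantine" if analysis_flags_image else "deliver"


print(decide(ImagePolicy.MONITOR, has_images=True))        # deliver
print(decide(ImagePolicy.STRIP_IMAGES, has_images=True))   # deliver_without_images
print(decide(ImagePolicy.MANUAL_WITH_LIST, has_images=True,
             all_known_acceptable=True))                   # deliver
```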
The ideal solution would combine monitoring,
automatic analysis, manual validation and validation from ‘acceptable’
and ‘unacceptable’ lists. Automatic analysis performs most of the
validation. This validation can be manually reviewed and corrected. The ‘acceptable’
and ‘unacceptable’ lists are used to correct the automatic analysis and
reduce the processing and administration workload. Once an image has been
manually validated, it can be permanently added to a pre-classification
database (as ‘acceptable’ or ‘unacceptable’) and will not need to be
manually validated again.
Problem images not identified during automatic
analysis can be added to the ‘unacceptable’ database, as can images that
are confidential or intellectual property. The database will contain only
the image fingerprints, not the images themselves. Once a problem image is
found, it should be very easy to block any email containing that image.
Fingerprints can be added from monitoring and, over time, the
pre-classification database will contain many image fingerprints, making the
solution accurate and efficient, and reducing costs.
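A pre-classification database of this kind could be sketched as follows. The table layout, the class name and the use of SQLite and SHA-256 are all our assumptions for illustration; the point is that only fingerprints and verdicts are stored, never the images.

```python
import hashlib
import sqlite3
from typing import Optional


class PreClassificationDB:
    """Stores image fingerprints and verdicts, never the images."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS images ("
            "fingerprint TEXT PRIMARY KEY, verdict TEXT NOT NULL)"
        )

    @staticmethod
    def fingerprint(image_bytes: bytes) -> str:
        return hashlib.sha256(image_bytes).hexdigest()

    def record(self, image_bytes: bytes, verdict: str) -> None:
        """Save a reviewer's decision: 'acceptable' or 'unacceptable'."""
        if verdict not in ("acceptable", "unacceptable"):
            raise ValueError(f"unknown verdict: {verdict}")
        self.conn.execute(
            "INSERT OR REPLACE INTO images VALUES (?, ?)",
            (self.fingerprint(image_bytes), verdict),
        )
        self.conn.commit()

    def lookup(self, image_bytes: bytes) -> Optional[str]:
        """Return the stored verdict, or None if the image is unknown."""
        row = self.conn.execute(
            "SELECT verdict FROM images WHERE fingerprint = ?",
            (self.fingerprint(image_bytes),),
        ).fetchone()
        return row[0] if row else None


db = PreClassificationDB()
design = b"prototype car design bytes"
print(db.lookup(design))            # None
db.record(design, "unacceptable")   # e.g. confidential material
print(db.lookup(design))            # unacceptable
```

Once a reviewer has recorded a verdict, later copies of the same image are recognized by lookup alone and need no further manual validation.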
Paul Rutherford is chief marketing officer for
Clearswift (www.clearswift.com).