Sex Trafficking and Data Mining

Chris White, a Harvard computer science graduate, has applied data mining tools to make major advances in the fight against sex trafficking. In September 2010, Dr. White travelled to Afghanistan to bust an online financial system and confront al-Qaida. While at Harvard, White had studied the intersection of big data, statistics, and machine learning.


Defense Advanced Research Projects Agency

A mentor introduced Dr. White to DARPA, the Pentagon’s scientific development agency. DARPA stands for the “Defense Advanced Research Projects Agency”. The agency focuses on making large investments in technology that advances U.S. security. It was created in 1958 in response to the Soviet Union's launch of Sputnik in 1957, with the goal of strengthening the national security of the United States. Most of its workers are part of the government, but it also employs students who are new to the field. DARPA has produced not only advanced military technology, such as precision weapons and stealth capabilities, but also much of the groundwork for technology that civilians use today. Such technology includes the Internet, voice recognition, and the Global Positioning System (GPS). Its work has also contributed to night vision, Agent Orange, and weather satellites. DARPA currently employs about 220 government officials, including nearly 100 managers, and runs about 250 research programs. It aims to keep the US at the top of the technological game, and every five to ten years it releases major advances in technology that the rest of the world goes on to enjoy.


The Birth of Memex

Attending a DARPA conference taught White about the world and the wars taking place in it. He learned at a high level about the brutality of the tactics used to kill, to terrorize, and to defend. This was White’s first introduction to the idea that big data could be used to combat dark problems. While working for DARPA, White learned that the U.S. had collected more information about events in Afghanistan than it could sort through. This problem led to the development of tools that could help with sorting through huge amounts of information. White then took on the task of making these tools easier to use. He called his project Memex, combining the words "memory" and "index". The project took three years and around $50 million. The search-engine toolbox contained components coded by both industry and university professionals. The purpose of the project was to mine data that couldn’t be easily reached through a conventional search engine such as Google. The tool would help make associations between different ideas and facts, and so make huge amounts of data visual. It would start by going into the dark side of the web.

The Onion Router

There is a lot about the internet that regular users don’t know. By some estimates, only about 5-20% of the internet is open to the public. For example, some data on social media is protected by passwords. However, the lesser-known area of the internet isn’t an untraveled place. Much of it is accessed through something called The Onion Router (or “TOR”). TOR is free software that allows people to communicate with each other anonymously. It can conceal a user's location and keeps people from being discovered through traffic analysis. The software is intended to help protect people’s privacy and to encourage confidential communication. It was originally developed at the U.S. Naval Research Laboratory, with later funding from DARPA, to protect U.S. intelligence communications. However, it has also become popular with people who wish to hide their activity from the government. TOR has about 2.5 million daily users, including ISIS planners and hacktivists. Sex trafficking is unusual in that buyers must be able to find what is being sold, so traffickers have to advertise somewhere. To stay out of easy view, they avoid regular search engines and advertise in these hidden areas instead.

Memex Applications

Memex is able to search sites hidden behind The Onion Router and present them in a simple list. This accomplishment created huge waves in the possibilities of data mining. Detectives and private investigators would normally spend two weeks working twelve-hour shifts every day to search through the same amount of data that Memex can search through in moments. These detectives would need to go page by page on Google, writing down each new piece of information to search for (such as an email address) and losing track of the other pages that came up in the same search. A Memex tool called Datawake represents search results as a series of circles. It organizes information in a way that is easy to follow, and allows detectives to look at all of the results at once instead of setting some of them aside. Old cases lead to new cases, and the pattern continues. With these revolutionary changes in the ability to mine huge amounts of data, detectives are making major breakthroughs in crime-stopping technology.
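The manual workflow described above (finding an identifier such as an email address on one page, then hunting for other pages that mention it) can be illustrated with a small sketch. This is not Memex or Datawake code. The sample pages, the regular expressions, and the function names `extract_identifiers` and `group_pages` are all assumptions made up for illustration. The sketch simply groups pages that share an email address or phone number, which is one plausible way a tool might link related pages automatically.

```python
import re
from collections import defaultdict

# Hypothetical sample pages; the text is invented for illustration only.
PAGES = {
    "page_a": "Contact anna@example.com or call 555-0101",
    "page_b": "New listing, email anna@example.com",
    "page_c": "Call 555-0199 for details",
    "page_d": "Reach me at 555-0199 or bob@example.org",
}

# Deliberately simple patterns (assumptions); real pages need far more robust parsing.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}-\d{4}\b")


def extract_identifiers(text):
    """Return the set of email addresses and phone numbers found in text."""
    return set(EMAIL_RE.findall(text)) | set(PHONE_RE.findall(text))


def group_pages(pages):
    """Group page names that share at least one identifier, directly or through a chain."""
    parent = {name: name for name in pages}

    def find(name):
        # Follow parent links to the representative of this page's group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # identifier -> first page where it appeared
    for name, text in pages.items():
        for ident in extract_identifiers(text):
            if ident in first_seen:
                union(first_seen[ident], name)
            else:
                first_seen[ident] = name

    groups = defaultdict(set)
    for name in pages:
        groups[find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    for group in group_pages(PAGES):
        print(group)
```

In this toy data, the two pages that share an email address end up in one group and the two that share a phone number end up in another, so an investigator sees related pages together rather than rediscovering each link by hand.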


The power of the internet is only three clicks away.

