Stop Thinking, Just Do!

Sungsoo Kim's Blog

“Big Data” Promises a Revolution


3 March 2014


Summary

  • Article Source: Bruce McClendon and Wally Hill, “Big Data” Promises a Revolution, PM Magazine, VOLUME 95, NUMBER 6, JULY 2013 [11]
  • Article URL: “Big Data” Promises a Revolution



Introduction

It’s called “big data.” And big it is.

This emerging field promises to revolutionize government and business. How? By capturing, curating, managing, processing, and analyzing with computers valuable information stored in vast databases. Information technology combined with predictive analytics makes it possible to identify correlations and predict and manage for better outcomes.

In the new book Big Data: A Revolution That Will Transform How We Live, Work, and Think, the authors write that big data [1]

“refers to things one can do at a large scale that cannot be done at a smaller one, to achieve new insights or create new forms of value, in ways that change markets, organizations, the relationships between citizens and government, and more.”

The real value of big data resides in using it, not in just possessing it. In the private sector, competitors and shareholders drive corporations to access, capture, and build huge customer-centered troves of big databases and then to unlock and monetize the implicit latent value of this data.

Local governments can be voracious collectors of huge amounts of data but most do not have the public pressure, interest, knowledge, or technical skills necessary to actually put this data to use. Big data could be used to boost use of services, generate additional revenues, enhance decision making, improve the value and reliability of services, or increase employee productivity.

CASE STUDY: HOW NEW YORK CITY USES BIG DATA

Housing inspections

While most local governments, to be honest, lag behind the private sector’s knowledge and interest in using big data and analytics, there are some remarkable exceptions. New York City, for example, receives 25,000 complaints a year for illegal residential conversions and takes them seriously. But it can only afford to budget for 200 inspectors to respond to these complaints.

Cutting up dwellings and creating a series of unsafe, hazardous, smaller units that are constructed without the benefit of electrical, structural, plumbing, health, and fire code inspections – which can create fire and life safety hazards and harbor diseases and pest infestations – is associated with serious social welfare issues, criminal activity, and declining property values.

Illegal residential conversions are a common problem in every large metropolitan area and most local governments have more demand for code enforcement services than they have resources. What is uncommon is how New York City chose to respond to this situation.

Not all illegal conversion complaints result in finding code violations and even when they do, some violations are more serious than others. The highest and most critical priorities are violations in residential structures that cannot be repaired and which pose a risk for catastrophic fires. In actually conducting such high-priority inspections, New York City found that 13 percent of the complaints resulted in a vacate order.

However, New York City decided it needed better results than 13 percent. Its options were either to hire more inspectors or to have the existing staff become more efficient and more effective to achieve better results. The city chose the latter option and assigned it to Mike Flowers, the first director of analysis for the Office of Policy and Strategic Planning, and his five-person staff of statistical analysts.

Observing and learning from the inspectors in the field and using existing big databases from 19 different agencies for 900,000 parcels of property that included such factors as type of building, year built, property tax delinquencies, foreclosure proceedings, utility service billing issues and service cutoffs, ambulance visits, crime rates, fire records, and more, the analysts were able to find enough correlations to develop a model for predicting inspection outcomes and prioritizing the order of inspections.

When New York City activated the system, the percentage of complaints that resulted in vacate orders increased fivefold. The analysts had taken [1]

“massive quantities of data that had been lying around for years, largely unused after it was collected, and harnessed it in a novel way to extract real value.”
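The article does not disclose the city's actual model, but the general idea of scoring each complaint from property records and inspecting the highest-scoring ones first can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the field names, the weights in WEIGHTS, and the helper functions risk_score and prioritize are hypothetical and are not taken from New York City's system.

```python
from dataclasses import dataclass


@dataclass
class Complaint:
    """One illegal-conversion complaint joined with property records.

    The fields are hypothetical stand-ins for the kinds of factors the
    article lists (tax delinquencies, foreclosures, utility cutoffs).
    """
    complaint_id: str
    tax_delinquent: bool
    foreclosure: bool
    utility_cutoff: bool


# Hypothetical weights; a real model would estimate them from past
# inspection outcomes rather than hard-coding them.
WEIGHTS = {
    "tax_delinquent": 2.0,
    "foreclosure": 1.5,
    "utility_cutoff": 1.0,
}


def risk_score(complaint: Complaint) -> float:
    """Sum the weights of every risk factor present on the property."""
    return sum(w for name, w in WEIGHTS.items() if getattr(complaint, name))


def prioritize(complaints: list[Complaint]) -> list[Complaint]:
    """Return complaints ordered from highest to lowest risk score."""
    return sorted(complaints, key=risk_score, reverse=True)


complaints = [
    Complaint("A", tax_delinquent=False, foreclosure=False, utility_cutoff=False),
    Complaint("B", tax_delinquent=True, foreclosure=True, utility_cutoff=False),
    Complaint("C", tax_delinquent=False, foreclosure=False, utility_cutoff=True),
]
inspection_order = [c.complaint_id for c in prioritize(complaints)]
```

With a fixed number of inspectors, working down inspection_order from the top is what concentrates visits on the properties most likely to warrant a vacate order.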

Sewer violations and bootlegged cigarette sales

In addition to increasing the effectiveness of its overburdened housing inspectors, New York City’s in-house, number-crunching geek squad was responsible for the Department of Environmental Protection’s 95 percent hit rate in tracking down restaurants that were illegally dumping cooking oil into the city’s sewer system as well as doubling the city’s success rate in finding stores selling bootlegged cigarettes.

Flowers calls his staff the “get stuff done folks” and modestly explains that, “All we do is take and process massive amounts of information and use it to do things more effectively” [2].

Defective manholes

In New York City, electrical power has been provided since 1882 by Con Edison, which is one of the largest investor-owned utilities in the country. In recent years, Con Ed and New York City have been plagued by the random hazards of smoking, flaming, and exploding manholes that would catapult the 300-pound covers up to 50 feet in the air.

The city has 51,000-odd manholes and service boxes in Manhattan and only a few hundred serious events occur each year. While the number of events may not be proportionally large in comparison to the size of the system, it was important to Con Ed to make every possible effort to eliminate or at least to reduce the number of dangerous manhole events as much as possible.

Con Ed’s engineers, who were the experts on the operating systems, reviewed the various factors that were responsible for the explosions and concluded there was no way to predict which manholes were more likely to malfunction. But rather than accept the opinion of its engineers and settle for having to make unranked and unprioritized inspections of every manhole, the company asked researchers from Columbia University to look at its data and try to predict which manholes were more likely to explode and should be inspected first.

Cynthia Rudin, who is now an assistant professor in the Operations Research and Statistics group at the MIT Sloan School of Management, led the team of researchers from Columbia. Using historical data dating as far back as the 1880s, they identified more than 100 possible variables that were associated to one degree or another with the potential for a manhole event.

Subsequently, analysts developed an algorithm to analyze past records and find the strongest and most meaningful correlations so they could identify and rank order the manholes with the critical characteristics that were associated with the likelihood of an event. This successful effort of using big data and predictive analytics was highlighted in an online edition of Wired in 2010. (Wired is a monthly magazine and online periodical that was described by its original founders as the Rolling Stone of Technology [3].)

The two biggest factors that were associated with the exploding manholes were the age of the underground cables and whether the manholes had experienced previous troubles.
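As a rough illustration of what "finding the strongest correlations" can mean in practice, the sketch below computes a plain Pearson correlation between each candidate variable and a 0/1 event indicator, then ranks the variables by the strength of that correlation. This is not the Columbia team's algorithm, which the article does not describe in detail; the six-manhole data set and the variable names are invented for the example.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_variables(features: dict[str, list[float]], events: list[float]):
    """Rank variables by absolute correlation with the event indicator."""
    scores = {name: pearson(values, events) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Invented toy data for six manholes (1 = had a serious event).
events = [1, 0, 1, 0, 1, 0]
features = {
    "cable_age_years": [80, 10, 65, 5, 90, 20],
    "prior_troubles": [1, 0, 1, 0, 1, 0],
    "cover_diameter_in": [30, 32, 30, 32, 32, 30],
}
ranking = rank_variables(features, events)
ranked_names = [name for name, _ in ranking]
```

In this toy data the two variables the article singles out, prior troubles and cable age, come out on top by construction; a real analysis over more than 100 variables would also have to handle missing records and correlated variables.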

In retrospect, this causal relationship should have been obvious but quoting network theorist Duncan Watts, the coauthors of Big Data reminded us that [4],

“Everything is obvious once you know the answer.”

And, if you have ever attended a controversial public hearing, there are few things as persuasive as obvious information that is confirmed and backed up by real data.

INTEREST GROWS IN PREDICTIVE POLICING

In the television show Person of Interest, a computer that was initially developed to find terrorists uses a huge classified database and analytics to predict specific individuals who will either be murdered or commit a murder. The two featured actors in the drama and their supportive colleagues in law enforcement have the challenge of finding the profiled individual, figuring out who the actual “bad guy” is, and preventing the crime from taking place.

It turns out that in real life, truth is just as strange as fiction. In the spring of 2010, Jody Weis, superintendent of the Chicago Police Department, established a predictive analytics group. The group’s responsibility was to sort through crime statistics and demographic data and produce twice-a-day intelligence reports identifying violent crime “hot spots” where teams of roving officers should be deployed.

In October of that same year, the crime-forecasting unit of the group that was analyzing 911 calls for service produced an intelligence report predicting a shooting would soon occur on a particular block on the South Side. Three minutes later, a shooting took place. The specifics of the predictive analytics behind the report have not been publicly disclosed to protect the methodology from being compromised by criminals [5].

In the nonfiction book Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, and Die, author Eric Siegel notes that the state of Maryland analytically generates predictions of which people under supervision will kill or be killed. University and law enforcement researchers have developed an analytic model that predicts the likelihood that someone who has previously been convicted of homicide will commit another murder [6].

Predictive policing is the term that is used to describe the use of data and analytics to proactively prevent crime. The spreading interest in data-driven predictive policing by law enforcement agencies is often related to externally driven budgeting and accountability issues involving concerns about right sizing, cost effectiveness, and the use of performance standards.

For example, faced with a 30 percent increase in service calls since 2000, a 20 percent decline in staffing levels, and burglaries on pace to set a record, Santa Cruz, California, decided to experiment with predictive policing. Eight years of crime data were entered into a computer and a complex algorithmic program was developed by researchers at Santa Clara University (SCU), including mathematicians George Mohler and Martin Short; Jeff Brantingham, an anthropologist; and criminologist George Tita. Dr. Mohler is an assistant professor in the Department of Mathematics and Computer Science.

The predictive policing model they created was based on models developed for predicting aftershocks from earthquakes. It analyzes statistical data from past years to detect patterns in criminal behavior and generates projections of the areas and times in which crimes such as home burglaries and vehicle thefts are more likely to occur.

At roll calls, police officers are given a list of the 10 highest-probability “hot spots” for that day. New data is entered into the model on a daily basis [7]. During the first six months of the trial program, the total number of reported burglaries in Santa Cruz declined by 11 percent [8].
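One common way to formalize the aftershock analogy is a self-exciting point process, in which each past crime temporarily raises the expected rate of new crimes nearby and that boost fades over time. The sketch below applies this idea to grid cells and returns the k highest-scoring cells for a given day. It is a simplified illustration, not the researchers' actual software: the cell labels, the single shared background rate, and the parameter values (background, boost, decay) are all assumptions, and a real model would fit them to historical data.

```python
import math
from collections import defaultdict


def cell_intensity(event_days, today, background, boost, decay):
    """Expected crime rate for one cell on `today`.

    rate = background + sum over past events of
           boost * decay * exp(-decay * (today - event_day))
    so recent events add more than old ones.
    """
    return background + sum(
        boost * decay * math.exp(-decay * (today - day))
        for day in event_days
        if day < today
    )


def top_hot_spots(events, today, k=10, background=0.01, boost=0.5, decay=0.2):
    """Return the k cells with the highest intensity on `today`.

    `events` is an iterable of (cell_id, day) pairs for past crimes.
    """
    days_by_cell = defaultdict(list)
    for cell, day in events:
        days_by_cell[cell].append(day)
    scores = {
        cell: cell_intensity(days, today, background, boost, decay)
        for cell, days in days_by_cell.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Invented example: cell A had two burglaries long ago, B one yesterday,
# C two in the last two days.
past_events = [("A", 1), ("A", 2), ("B", 9), ("C", 8), ("C", 9)]
hot_spots = top_hot_spots(past_events, today=10, k=2)
```

Because the boost decays, cell B with one recent event outranks cell A with two old ones, which is the behavior that makes daily data updates worthwhile.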

The crime data for developing the predictive model used in Santa Cruz was provided by the Los Angeles Police Department, which was also monitoring the experiment. Dr. Mohler told us that after the SCU researchers’ original work for Santa Cruz, the predictive software was deployed in the Foothill, Southwest, and North Hollywood divisions in Los Angeles. In randomized controlled trials, it was found to be twice as effective in predicting crime as the police department’s expert crime analysts.

For predictive analytics to have any really useful value, data analysts must be able to demonstrate that their results are superior to human intuition and traditional analysis. Based on the success of the analytical tool, it has been expanded to include the entire city of Los Angeles and several other agencies in California. It has also been implemented in Seattle and Tacoma, Washington, and in Kent, United Kingdom.

Over the past decade, Los Angeles has taken an interest and been an active participant in predictive policing. The Los Angeles Police Department (LAPD) is one of seven agencies that have received a predictive policing planning grant from the National Institute of Justice (NIJ).

Police Chief Charlie Beck is quoted as saying [9],

“I’m not going to get more money. I’m not going to get more cops. I have to be better at using what I have, and that’s what predictive policing is about. If this old street cop can change the way he thinks about this stuff, then I know that my (officers) can do the same.”

A BRAVE NEW WORLD

In light of the recent Boston Marathon bombing, we want to mention the growing interest on the part of law enforcement officials in the use of predictive analytics for video surveillance. Several companies have developed software that is designed for digital signal processor-based programs embedded inside of cameras, DVRs, and Internet Protocol (IP) devices.

Using artificial intelligence, the software can automatically detect such anomalous behavior as a package or backpack being left behind and then send an electronic alert to law enforcement agencies. The innovative smart cameras and video monitors that are currently being used and developed with artificial intelligence and predictive analytics capabilities will increasingly become a part of government efforts to use technology to more effectively and efficiently deliver law enforcement services and proactively anticipate and prevent crimes.

The perception, commonly held in government, that there is little significant public interest in or demand for technology-driven e-government services is unfounded. Our own experience and proprietary surveys of residents and customers, conducted as part of customer service audits by a management consulting firm, show that residents and customers want, and are willing to pay more for, the speed, convenience, effectiveness, and efficiency of electronic government.

Some customer-centered local governments are already using information technology to flatten their organizational structures, streamline their permitting processes and procedures, reduce processing times and inspection costs, enhance the effectiveness and efficiency of services, and empower their customers to help themselves [10]. They are, however, at this point in time, the exception rather than the rule.


Predictive policing

Predictive policing refers to the usage of predictive and analytical techniques in law enforcement to identify potential offenders. It uses technology based on earthquake prediction, and has been described in the media as a revolutionary innovation capable of “stopping crime before it starts” [12].

In November 2011, TIME Magazine named predictive policing as one of the 50 best inventions of 2011. In the United States, the practice of predictive policing has been implemented by police departments in several states such as California, Washington, South Carolina, Arizona, Tennessee, and Illinois.

History

In 2008, Police Chief William Bratton at the Los Angeles Police Department (LAPD) began working with the acting directors of the Bureau of Justice Assistance (BJA) and the National Institute of Justice (NIJ) to explore the concept of predictive policing in crime prevention.

In 2010, researchers proposed that it was possible to predict certain crimes, much like scientists forecast earthquake aftershocks.

Predictive policing programs are currently used by police departments in several U.S. states, such as California, Washington, South Carolina, Arizona, Tennessee, and Illinois.

Effectiveness

The effectiveness of predictive policing was recently tested by the Los Angeles Police Department (LAPD), which found its accuracy to be twice that of its current practices. In Santa Cruz, California, the implementation of predictive policing over a 6-month period resulted in a 19 percent drop in the number of burglaries. In Kent, 8.5 percent of all street crime occurred in locations predicted by PredPol, beating the 5 percent from police analysts.
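The Kent comparison is a hit-rate measure: the share of recorded crimes that fell inside the locations flagged in advance. A minimal sketch of that calculation follows. The function name hit_rate and the cell labels are hypothetical, and the counts are invented solely to reproduce the 8.5 percent and 5 percent figures quoted above.

```python
def hit_rate(crime_cells, predicted_cells):
    """Fraction of crimes that occurred in a predicted location."""
    predicted = set(predicted_cells)
    if not crime_cells:
        return 0.0
    hits = sum(1 for cell in crime_cells if cell in predicted)
    return hits / len(crime_cells)


# Invented counts: 200 crimes, 17 in software-flagged cells ("m"),
# 10 in analyst-flagged cells ("a"), 173 elsewhere ("x").
crimes = ["m"] * 17 + ["a"] * 10 + ["x"] * 173
software_rate = hit_rate(crimes, {"m"})
analyst_rate = hit_rate(crimes, {"a"})
```

A fair comparison requires both methods to flag the same amount of area, since flagging more locations raises the hit rate regardless of predictive skill.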

References

[1] Mayer-Schönberger, Viktor, and Cukier, Kenneth. 2013. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Boston: Houghton Mifflin Harcourt.
[2] Feuer, Alan. 2013. “The Mayor’s Geek Squad.” The New York Times. New York: The New York Times Company, March 23, 2013.
[3] http://www.wired.com/wiredscience/2010/07/manhole-explosions/ (Ehrenberg, Rachel. “Predicting the Next Deadly Manhole Explosion”).
[4] See note 1, supra.
[5] http://www.suntimes.com/news/metro/3295264-418/intelligence-weis-center-crime-department.html. (Police sensing crime before it happens).
[6] Siegel, Eric. 2013. Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, and Die. Hoboken, New Jersey: John Wiley & Sons.
[7] http://www.nytimes.com/2011/08/16/us/16police.html?_r=0.
[8] http://www.gtweekly.com/index.php/santa-cruz-news/santa-cruz-local-news/3472-scpd-predictive-policing-review.html.
[9] www.predpol.com.
[10] McClendon, Bruce and Birch, Mac and Quay, Ray. 2013. Customer Service Gov. Citygate Press.
[11] McClendon, Bruce and Hill, Wally. 2013. “Big Data” Promises a Revolution. PM Magazine, Volume 95, Number 6, July 2013.
[12] Predictive policing. From Wikipedia, the free encyclopedia, 2013.

