Skynet, Smugglers and The Gift of Fear: What we can learn from snap judgements, and machines can learn from us

So, in the day or two since I posted the piece about “Big Filter“, I’ve gotten several calls, comments and emails that all seemed to focus on the scary notion of “machines that think like us”.  Some folks went all “isn’t that what Skynet and The Matrix, and (if you’re older, like me) The Forbin Project, and W.O.P.R were on about?”  If machines start to think like us, doesn’t that mean all kinds of bad things for humanity? 

Actually, what I said was, “We have to focus on technologies that can encapsulate how people, people who know what they’re doing on a given topic, can inform those systems… We need to teach the machines to think like us, at least about the specific problem at hand.”  Unlike some people, I neither have unrealistic expectations for the grand possibilities of “smart machines”, nor fear that they will somehow take over the world and render us all dead or irrelevant.  (Anyone who has ever tried to keep a Windows machine from crashing or bogging down or “acting weird” past about age 2 should share my comfort in knowing that machines can’t even keep themselves stable, relevant or serviceable for very long.) 

No, what I was talking about, to use a terribly out-of-date phrase, was what used to be known as “Expert Systems”, a term out of favor now, but that doesn’t mean the basic idea is wrong. I was talking about systems that are “taught” how someone who knows a very specific topic or field of knowledge thinks about a very specific problem.  If, and this is a big if, you can ring-fence the explicit question you’re trying to answer, then it is, I believe, possible to teach a machine to replicate the basic decision tree that will get you to a clear, and correct, answer most of the time.  (I’m a huge believer in the Pareto Principle or “80-20 rule”, and most of the time is more than good enough to save gobs and gobs of time and money on many, many things.  More on that in a moment.) 
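To make that concrete, here is a minimal sketch, in Python, of what I mean. Fair warning: every rule, field name and threshold below is hypothetical, invented purely for illustration. The point is the shape of the thing, a few explicit, expert-derived questions standing in for one narrow, ring-fenced judgement.

```python
# A toy "expert system" for one ring-fenced question:
# should this loan application be flagged for human review?
# Every rule and threshold here is invented for illustration.

def flag_for_review(app: dict) -> bool:
    """Walk a hand-coded decision tree distilled from an imagined expert.

    Per the 80-20 rule, the goal is a clear, correct answer most of
    the time, with the weird residue left for an actual human.
    """
    # Rule 1: unverifiable income plus a large amount is
    # always worth a second look.
    if not app["income_verified"] and app["amount"] > 50_000:
        return True
    # Rule 2: brand-new credit history and no collateral.
    if app["credit_history_years"] < 1 and not app["has_collateral"]:
        return True
    # Rule 3: everything else sails through automatically.
    return False

print(flag_for_review({
    "income_verified": False,
    "amount": 80_000,
    "credit_history_years": 5,
    "has_collateral": True,
}))  # True: tripped Rule 1
```

Trivial? Sure. But chain enough of those explicit questions together and you’ve captured a real, if narrow, slice of how the expert actually decides.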

A few years ago now, I read a book called “The Gift of Fear” by Gavin de Becker, an entertaining and easy read for anyone interested in psychology, crime fighting, or the stuff I’m talking about.  The very basic premise of that book, among other keen insights, is that our rational minds can get in the way of our limbic or caveman brains telling us things we already “know”, the kind of instantaneous, can’t-explain-it-but-I-know-I’m-right, in-our-gut knowledge that our rational brains sometimes override or interfere with, occasionally to our great harm.  (See the opening chapter of The Gift of Fear, in which a woman whose “little voice”, as I call it, told her there was something wrong with that guy, but she didn’t listen, and was assaulted as a result.  Spoiler alert: she did, however, also escape that man, who intended to kill her, using the same intuition. Give it a read.) 

De Becker, himself a survivor of abuse and violence, went on to study the evil that men do in great detail, and from there, to codify a set of principles and metrics that, encoded into a piece of software, enabled his firm to evaluate risk and “take-it-seriously-or-not-ness” for threats against the battered spouses, movie stars and celebrities his physical security firm often protects.  Is this Skynet taking over NORAD and annihilating humanity? Of course not.  What it is, however, is the codification of often-hard-won experience and painful learning, the systematizing of smarts. 

I was thinking about all this in part because, in addition to the comments on my last post, I’m in the middle of re-reading “Blink” (sorry, I appear to be on a Malcolm Gladwell kick these days).  It’s about snap decision-making and the part of our brain that decides things in two seconds without rational input or logical thought.  A few years ago, as some of you know, my good friend Nick Selby of (among many other capes and costumes) the Police Led Intelligence Blog decided he was so passionate about applying technology to making the world better and communities safer that he both founded a software company (StreetCred Software – congrats on winning the Code for America competition this year!) and became a police officer to gain the expertise he and his partner would encode into the software.  He told me a story from his days at the police academy.  I may have the details wrong on this bit of apocrypha, but you’ll get the point. 

During training outside of Dallas, there was an experienced veteran who would sometimes spend time helping catch smugglers running north through Texas from the Mexican border.  “Magic Mike”, as I’ll call him since I can’t remember his real name, could stand on an overpass and tell the rookies, “Watch this.”  He’d watch the traffic flowing by beneath him, pick out one car seemingly at random and say, “That one.” (Note that, viewed at 60 mph and looking at the roof from above, the age, gender, race or other “profiling” concerns of the occupants are essentially a non-issue here.) 

Another officer would pull over the car in question a bit down the road, and, with shocking regularity, Magic Mike was exactly right.  How does that happen?!  And can we capture it?  My argument from yesterday is that we can, and should.  We’re not teaching intelligent machines in any kind of scary, Turing-Test kind of way.  No, it’s much clearer and more focused than that.  Whatever went on in Magic Mike’s head – the instantaneous Mulligan Stew of car make, model, year, speed, pattern of motion, state of license plate, condition, etc. – if it can be extracted, codified and automated, then we can catch a lot more bad guys. 
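What might “extracted, codified and automated” look like? Here is one hedged guess at the shape of it: Magic Mike’s gut, recast as a weighted checklist. Every cue, weight and threshold below is invented; the real ones are precisely the hard-won expertise you would have to sit on that overpass and extract.

```python
# A hypothetical reconstruction of a smuggler-spotter's snap judgement
# as a weighted checklist. All cues and weights are invented examples.

SUSPICION_WEIGHTS = {
    "out_of_state_plate": 2.0,
    "rides_low_in_rear": 3.0,        # heavy, hidden cargo?
    "driving_exactly_at_limit": 1.5, # conspicuously careful
    "fresh_paint_on_old_body": 2.5,
    "known_rental_fleet_model": 1.0,
}

def suspicion_score(observed_cues: set) -> float:
    """Sum the weights of every cue the spotter noticed."""
    return sum(SUSPICION_WEIGHTS.get(cue, 0.0) for cue in observed_cues)

car = {"out_of_state_plate", "rides_low_in_rear", "driving_exactly_at_limit"}
if suspicion_score(car) >= 5.0:  # threshold tuned against past stops
    print("That one.")
```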

I personally led a similar effort in cyberspace.  Some years ago, AOL decided that member safety was a costly luxury and started laying off lots of people who knew an awful lot about Phishing and spoof sites.  Among those in the groups being RIF’ed was a kid named Brian, who had spent untold hours sitting in a cube looking at Web pages that appeared to be banks, or PayPal or whatever, saying, “That one’s real. That one’s fake.  That one’s real, that one’s fake.”  He could do it in seconds. So, we hired him, locked him in an office and said, “You can’t go to the bathroom ’til you write down how you do that.” 

He said it was no big deal – over the years he’d developed a 27-step process so he could teach it to new guys on the team.  Just one of those steps turned out to be “does it look like any of the thousands of fake sites I’ve gotten to know over the years?”  Encapsulating Brian’s 27 steps in a form a machine could understand took 400 algorithms and nearly 5,000 individual steps.  But… so what?  When the weeks of effort were done, we had the world’s most experienced Phish-spotter built into a machine that thought the way he did, and worked 24×7 with no bathroom breaks.  We moved this very bright person on to other useful things, while a machine now did what AOL used to pay a team of people to do, and it did it based not on simple queries or keywords, but by mimicking the complex thought process of the best guy there was. 
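For flavor, here is what two or three machine-checkable steps of that kind might look like. To be clear, these checks are my invention, not Brian’s actual list; they are simply the sort of explicit question a two-second expert judgement decomposes into once you lock the expert in an office.

```python
# Invented examples of expert phish-spotting rules, NOT the real
# AOL system: the kind of checks a human intuition decomposes into.

from urllib.parse import urlparse

BRAND_WORDS = ("paypal", "ebay", "bank")  # illustrative watch-list

def looks_like_phish(url: str, page_title: str) -> bool:
    host = (urlparse(url).hostname or "").lower()
    # Step 1: brand name in the page title, but not in the host name.
    brand_in_title = any(b in page_title.lower() for b in BRAND_WORDS)
    brand_in_host = any(b in host for b in BRAND_WORDS)
    if brand_in_title and not brand_in_host:
        return True
    # Step 2: a raw IP address where a hostname should be.
    if host.replace(".", "").isdigit():
        return True
    # Step 3: keyword-stuffed paths that real sites rarely use.
    if any(w in url.lower() for w in ("secure-login", "verify-account")):
        return True
    return False

print(looks_like_phish("http://192.0.2.7/secure-login", "PayPal - Log In"))
# True: this example trips all three checks
```

Multiply that by a few hundred algorithms and a few thousand steps, and you have Brian-in-a-box.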

If we can sit with Brian, who can spot a Phishing site, or De Becker, who can spot a serious threat among the celebrity-stalker wannabes, or Magic Mike, who can spot a smuggler’s car from an overpass at 60 miles an hour, and understand how they know what they know in those instant flashes of insight or experience, then we can teach machines to produce an outcome based not just on simple rules but by modeling the thoughts of the best in the business.  Whatever that business is – catching bad guys, spotting fraudulent Web sites, diagnosing cancer early or tracking terrorist financing through the banking system – that (to me) is not Skynet, or WOPR, or Colossus.  That’s a way to better communities, better policing, better healthcare, and a better world. 

Corny? Sure.  Naive? Probably.  Worth doing?  Definitely.  

“Big Filter”: Intelligence, Analytics and why all the hype about Big Data is focused on the wrong thing

These days, it seems like the tech set, the VC set, Wall Street and even the government can’t shut up about “Big Data”.  An almost meaningless buzzword, “Big Data” is the catch-all used to try and capture the notion of the truly incomprehensible volumes of information now being generated by everything from social media users – half a billion Tweets, a billion Facebook activities, 8 years of video uploaded to YouTube… per day?! – to Internet-connected sensors of endless types, from seismography to traffic cams.  (As an aside, for many more, often mind-blowing, statistics on the relatively minor portion of data generation that is accounted for by humans and social media, check out these two treasure troves of statistics on Cara Pring’s “Social Skinny” blog.)

http://thesocialskinny.com/216-social-media-and-internet-statistics-september-2012/

http://thesocialskinny.com/100-more-social-media-statistics-for-2012/

In my work (and occasionally from baffled relatives) I am now fairly regularly asked, “so, what’s all this ‘big data’ stuff about?”  I actually think this is the wrong question.

The idea that there would be lots and lots of machines generating lots and lots… and lots… of data was foreseen long before we mere mortals thought about it.  I mean, the dork set was worrying about IPv4 address exhaustion in the late 1980s.  This is when AOL dial-up was still marketed as “Quantum Internet Services” and made money by helping people connect their Commodore 64s to the Internet.  Seriously – while most of us were still saying “what’s an Internet?” and the nerdy kids at school were going crazy because, in roughly 4 hours, you could download and view the equivalent of a single page of Playboy, there were people already losing sleep over the notion that the Internet was going to run out of its roughly four-and-a-half billion IP addresses.  My point is, you didn’t have to be Ray Kurzweil to see there would be more and more machines generating more and more data.
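(If you’re wondering where that four-and-a-half billion comes from, an IPv4 address is 32 bits, so the arithmetic is a one-liner.)

```python
print(2 ** 32)  # 4294967296: every possible 32-bit IPv4 address
```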

What I think is important is that more and more data serves no purpose without a way to make sense of it.  Otherwise, more data just adds to the problem of “we have all this data, and no usable information.” Despite all the sound and fury lately about Edward Snowden and NSA, including my own somewhat bemused comments on the topic, the seemingly omnipotent NSA is actually both the textbook example and the textbook victim of this problem.

It seems fairly well understood now that they collect truly ungodly amounts of data.  But they still struggle to make sense of it.  Our government excels at building ever more vast, capable and expensive collection systems.  Which only accentuates what I call the “September 12th problem.”  (Just Google “NSA, FBI al-Mihdhar and al-Hazmi” if you want to learn more.)  We had all the data we ever needed to catch these guys.  We just couldn’t see it in the zettabytes of other data with which it was mixed.  On September twelfth it was “obvious” we should have caught these guys, and Congress predictably (and in my opinion unfairly) took the spook set out to the woodshed, perched on the high horse of hindsight.

What they failed to acknowledge was that the fact we had collected the necessary data was irrelevant.  NSA collects so much data they have to build their new processing and storage facilities in the desert because there isn’t enough space or power left in the state of Maryland to support it.  (A million square feet of space, 65 megawatts of power consumption, nearly two million gallons of water a day just to keep the machines cool?  That is BIG data, my friends.)  And yet, what is (at least in the circles I run in) one of the most poignant bits of apocrypha on the subject?  The senior intelligence official’s lament: “Don’t give me another bit, give me another analyst.”

It is this problem that has made “data scientist” the hottest job title in the universe, and made the founders of Splunk, Palantir and a host of other analytical tool companies a great deal of money.  In the end, I believe we need to focus not just on rule-based systems, or cool visualizations, or fancy algorithms from Israeli and Russian Ph.D.s.  We have to focus on technologies that can encapsulate how people, people who know what they’re doing on a given topic, can inform those systems to scale up to the volumes of data we now have to deal with.  We need to teach the machines to think like us, at least about the specific problem at hand.  Full disclosure: working on exactly this kind of technology is what I do in my day job, but just because my view is parochial doesn’t make it wrong.  The need for human-like processing of data based on expertise, not just rules, was powerfully illustrated by Malcolm Gladwell’s classic piece on mysteries and puzzles.

The upshot of that fascinating post (do read it, it’s outstanding) was in part this.  Jeffrey Skilling, the now-imprisoned CEO of Enron, proclaimed to the end he was innocent of lying to investors. I’m not a lawyer, and certainly the company did things I think were horrible, unethical, financially outrageous and predictably self-destructive, but that last is the point.  They were predictably self-destructive, predictable because, whatever else, Enron didn’t, despite reports to the contrary, hide the evidence of what they were doing. As Gladwell explains in his closing shot, for the exceedingly rare few willing to wade through hundreds or thousands of pages of incomprehensible Wall Street speak, all the signs, if not the out-and-out evidence, that Enron was a house of cards, were there for anyone to see.

Jonathan Weil of the Wall Street Journal wrote the September 2000 article that got the proverbial rock rolling down the mountain, but long before that, a group of Cornell MBA students sliced and diced Enron as a school project and found it was a disaster waiting to happen.  Not the titans of Wall Street, six B-school students with a full course load. (If you’re really interested, you can still find the paper online 15 years later.)  My point is this – the data were all there. In a world awash in “Big Data”, collection of information will have ever-declining value.  Cutting through the noise, filtering it all down to the bits that matter to your topic of choice, from earthquake sensors to diabetes data to intelligence on terrorist cells: that is where the value, the need and the benefits to the world will lie. 
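If “cutting through the noise” sounds abstract, here is a minimal sketch of the shape I mean. The scoring function below is just a placeholder for codified expertise of the Brian or Magic Mike variety; the real point is that a filter like this keeps only the few items worth an analyst’s time, in bounded memory, no matter how enormous the stream grows.

```python
# "Big Filter" in miniature: score everything, keep almost nothing.
# The scoring function is a placeholder for real codified expertise.

import heapq
import itertools

def expert_score(record: dict) -> float:
    """Stand-in for an expert-derived model (see the posts above)."""
    return record.get("relevance", 0.0)

def big_filter(stream, keep: int = 10) -> list:
    """Reduce an arbitrarily large stream to the `keep` most relevant
    records, using memory proportional to `keep`, not to the stream."""
    top = []  # a min-heap of (score, tiebreak, record)
    for i, record in enumerate(stream):
        heapq.heappush(top, (expert_score(record), i, record))
        if len(top) > keep:
            heapq.heappop(top)  # toss the least relevant so far
    return [rec for _, _, rec in sorted(top, reverse=True)]

# A million noise records and one that matters:
noise = ({"relevance": 0.1, "id": n} for n in range(1_000_000))
signal = [{"relevance": 9.9, "id": "the-one-that-matters"}]

print(big_filter(itertools.chain(noise, signal), keep=3)[0]["id"])
# the-one-that-matters
```

The design choice is the whole argument: the value isn’t in holding the million records, it’s in never having to.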

Screw “Big Data”, I want to be in the “Big Filter” business.

A really smart guy blows it completely… Malcolm Gladwell isn’t exactly wrong, he just missed the point.

So let me start with a couple of quick disclaimers.

  1. Malcolm Gladwell is a really smart guy, I respect a lot of his ideas, and I really liked several of his books.
  2. I’m not trying to pick a fight with someone famous just to elevate my blog.  What might I accomplish? Doubling the two dozen people who read it?  This isn’t gratuitous, and (as evidenced by my spotty posting record) I’m obviously not trying to make this blog a platform for fame or visibility.
  3. He’s also a lot more famous, rich and brainy than I am, so if those are the metrics that serve as proxy for right and wrong, maybe I should shut up. That said… Yeah, he totally blew it.

So a while back, I wrote a post called “Tech Coup 2.0 – The Revolution Will Be Twittervised…”, one in a long list of plays on the original title, poem and song (The Revolution Will Not Be Televised, Gil Scott-Heron, 1970).

Unbeknownst to me at the time, Gladwell had written a piece a few months before for the New Yorker called “Small Change: The Revolution Will Not Be Tweeted”.  The reason I’m taking this on now, when the question might seem oh-so-totally-six-months-ago, is not just to defend my position, but because I think this is going to be the question of 2012 far more than it was the question of 2010.  First let’s talk about how he’s missed the point, then I’ll touch on why I think the impact of this is going to reach far beyond the past year’s “Arab Spring”.

Gladwell’s argument, as well as those of several learned and impressive people he cites, including Golnaz Esfandiari’s excellent piece in Foreign Policy, “Misreading Tehran: The Twitter Devolution”, is that social media and virtual networks have fundamental flaws as a tool for organizing revolution.  I won’t recap his whole argument here, but citing examples from East Germany to the US civil rights movement, he explains that, among other things, social uprising against the status quo requires two very important elements.

The first is what he calls “strong ties”.  It’s easy, he argues, to “join a cause” by clicking the “Like” button on Facebook or giving a dollar via a Web site, but when we’re talking about rising up against a regime or authority with the ability and willingness to use coercion and force, it’s a different ballgame.  Being willing to stand in front of the proverbial tank or put flowers in a rifle barrel aimed at your head, to take a true (that is, physically dangerous) stand against authority, has traditionally required a personal connection to others involved.  Flash-mobbing Wall Street in New York, where the rule of law and one’s physical safety are essentially not in question (recent left-wing hysteria about pepper spray and fascism notwithstanding), is totally different than coming out of your house to face down Assad’s security forces because of a text message or Tweet.  People, he argues, put their asses on the line because people they know and care about are taking to the streets too, and/or have been victims of the condition against which they protest.

The second factor a true uprising requires to be sustained, he argues, is hierarchy and organizational control, the very antithesis of social, informal and virtual networks.  If your goal is just to create havoc, then sure, a loose confederation of like-minded individuals acting semi-autonomously is fine.  But if your goal is explicit, specific and clear policy and governmental change, then (he argues, citing examples like the NAACP) a clearly structured organization and chain of command is explicitly required.

He also does a fine job of pointing out the flaws of, if not completely tearing down, the arguments for the power of social media that are made in Aaker and Smith’s “The Dragonfly Effect” and Shirky’s “Here Comes Everybody”.  My favorite nugget:

“ ‘Social networks are particularly effective at increasing motivation,’ Aaker and Smith write.  But that’s not true.  Social networks are effective at increasing participation – by lessening the level of motivation that ‘participation’ requires.” 

Again, I won’t restate his whole argument here; it’s really worth it to read Gladwell’s piece.  And I say that because (and here comes the potentially confusing part) I think he’s absolutely right. I think his critique of the whole “social media will change the world” view is dead-on in terms of the flaws he exposes in social media as a tool of organization for large-scale social or revolutionary change.

So… HUH?  Didn’t I start this whole discussion saying Gladwell’s wrong?  Nope.  I said he missed the point.  Not the same thing at all.  He’s absolutely right that technology, social networks and the like will not likely play (and explicitly have NOT to date played) the role their cheerleaders have claimed.

Here’s the point I think he missed, and it was the core point of my own Revolution post, which perhaps I didn’t state explicitly enough.  Technology and social networks will not bring the tools and organization and strong ties required to bring people out in the face of the threat of physical force.  But let’s remember what gets people out in the street in the first place – a motivation to take the risk, something so inspiring, egregious or powerful it overcomes the collective inertia of not revolting.  And that is what technology can, and will, bring.

Gladwell is right that it took organization, strong ties, and deeply seated moral beliefs among both the black protesters and the white freedom riders and volunteers who eventually rose up to begin changing life for black Americans.  Twitter and YouTube can’t provide the ties, or the organizational structure.  What they can provide is the motivation, the evidence, the “why”.  How much sooner, and in how much greater numbers, might white supporters have come, and how many more black students might have sat in, if lynchings and beatings and rapes of black girls by white men had been caught on cellphone cameras and posted on YouTube?

What was the catalyst that started the Tunisian upheaval? One poor street vendor, despondent and disheartened to the point of self-immolation, became the (literal) match that lit the fuse of revolution.

Can Twitter or SMS really provide the organizational structure and the belief systems to make thousands turn out in the face of arrest, imprisonment or worse, and keep them focused on a long-term goal or societal change?  Not at all.  Does it provide the strong familial or social ties that get folks to link arms in front of a machine gun?  Nope.  But…

Can it, in seconds and nearly unstoppably, communicate out to a million people the photo, video, report or account of an atrocity, injustice or societal wrong that will get them in the streets and provide the motivation to organize, reach out and engage one’s close ties?

It has (flip-phone vid of Saddam Hussein being hanged, anyone?), it can, and it will.

Like I said, Gladwell wasn’t wrong; in fact, I agree with his criticisms of the social-media evangelist set. Social media doesn’t play the role its cheerleaders claim.  On this, he’s right. I just think he’s arguing the wrong point.  I’ll close by repeating my own thought from the previous post, for whatever that’s worth.

If, and where, keeping the world from knowing “what’s really happening” is important to maintaining advantage, power or undeserved legitimacy, the inability to keep the information genie in the bottle ever, at all, anywhere, is going to catch a whole lot of employers, governments and belief systems up short.

From cults to political parties to hate organizations to repressive regimes, the daylight is coming to shine on you and your beliefs.  If you can’t say it out loud and in public without losing support, money or legitimacy, know that your days are numbered.  I think, whether in months or years, the end is nigh, and your doom will come not from jackbooted troops, police SWAT teams or even intrepid reporters, but in the form of the individual with a conscience and the cheap, ubiquitous camera phone.

Disclaimer: The views expressed on this blog are mine alone, and do not represent the views, policies or positions of Cyveillance, Inc. or its parent, QinetiQ-North America.  I speak here only for myself and no postings made on this blog should be interpreted as communications by, for or on behalf of, Cyveillance (though I may occasionally plug the extremely cool work we do and the fascinating, if occasionally frightening, research we openly publish.)
