
Why the Polls Don’t Mean What You Think They Mean

There are three kinds of lies: lies, damned lies and statistics.

Benjamin Disraeli

 

With the elections less than two weeks away, many have already proclaimed Hillary Clinton as the winner.

This is based on Clinton’s overwhelming dominance in the polls and election forecasts.

 

I’d like to show that the race, based on what we know now, is much closer than most of us think, in two very meaningful ways:

  1. We humans aren’t generally very good at interpreting the “forecast” numbers that are being presented to us.
  2. The polls and forecasts, despite being very professionally executed, can go wrong in more systemic and structural ways than we’re aware of, and may end up incorrectly predicting the outcome.

 

Disclaimer:

Please don’t take what follows to mean that I am making a prediction opposite to what most forecasts suggest. It is indeed likely, based on the information we have now, that Hillary Clinton will win the election.

Nor am I suggesting that the best forecasting methods out there are not well designed and executed. They are. (Nate Silver literally wrote the book on forecasting, and here I am criticizing his system.)

The last thing to know is that I do have a strong preference for one candidate over the others, and that, combined with my neurotic nature, makes me more interested in convincing myself and you that the race is closer than it seems.

 

National Polls Are Meaningless, So Just Don’t

Let’s kill off one huge thing really quickly to begin, and that is the National Polls.

[Screenshot: national poll numbers, October 26, 2016]

These are actually the “predictive” numbers that get the most airplay on traditional media, which isn’t great, because they have very little to do with how presidential elections actually work.

 

Presidential elections use an electoral college where the winner in each state gets all the state’s electoral votes (with only two exceptions, Maine and Nebraska), a quantity roughly proportional to each state’s population.

Getting the most votes nationwide is not what gets you elected, and so nationwide polls are an obviously flawed method of forecasting the outcome of the election.

State-by-State Simulation Is Better

Luckily, there’s a better system, popularized by Nate Silver of FiveThirtyEight and now used by many major news outlets like the New York Times.

[Screenshot: FiveThirtyEight forecast, October 31, 2016]

Roughly: it aggregates polls on a state-by-state basis, finding a probability of each candidate winning each state. Then, it runs 20,000 (in the FiveThirtyEight case) computer simulations where each state’s outcome is determined by a random selection, weighted according to this probability. Finally, the outcomes are tallied using the real election system (number of electors for each state) and a winner of each simulation is determined. The bottom line, referred to as “Chance of Winning”, is the percentage of simulations in which each candidate was the winner.
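To make the mechanics concrete, here’s a minimal sketch of that kind of simulation in Python. The state list, win probabilities and elector counts are made-up placeholders for illustration, not FiveThirtyEight’s actual model or numbers:

import random

# Hypothetical per-state (probability of a Clinton win, electoral votes).
# Illustrative placeholders only, not real poll aggregates.
states = {
    "Florida":        (0.55, 29),
    "Pennsylvania":   (0.77, 20),
    "Ohio":           (0.35, 18),
    "North Carolina": (0.52, 15),
    "Safe Clinton":   (1.00, 217),
    "Safe Trump":     (0.00, 239),
}

SIMULATIONS = 20_000
TO_WIN = 270  # electoral votes needed for a majority of the 538

wins = 0
for _ in range(SIMULATIONS):
    electors = 0
    for p_win, votes in states.values():
        # Each state's outcome is a weighted coin flip.
        if random.random() < p_win:
            electors += votes
    if electors >= TO_WIN:
        wins += 1

print(f"Chance of winning: {wins / SIMULATIONS:.1%}")

Run enough simulations and the printed percentage is exactly the kind of “Chance of Winning” number the forecasts report.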

 

Despite being vastly superior to a nationwide poll, this method has huge issues when we’re trying to interpret its output. I’ll list a few:

 

1. Simulated Probabilities for Rare Events Are Weird

Our brains are wired to interpret the world in a pretty binary way – both phenomena and reasoning work like that:

You either eat the cake or not.  It either rained last Tuesday or it didn’t. A statement is either true or it’s false.

The probability of something happening in the future – that’s not something we can accurately process.

One way to think about the “chances of winning” is: we can put 100 colored balls into a bag. 76 are colored blue and 24 are colored red. We then shuffle the balls and pull one out without looking. The likelihood of pulling a blue ball is the likelihood of Clinton winning the election.

Well, that should already trouble a Clinton supporter deeply: there’s nearly a 1 in 4 chance that you pull out a red ball!

This underlines a problem with thinking of a very rare and impactful event in terms of probability: there’s only one election every four years and the winner gets to command a huge nuclear weapons arsenal. All of a sudden a 76%, or even a 90% chance, doesn’t inspire that much confidence.

 

2. Probabilities Don’t Capture Change Over Time

Importantly, what this style of prediction fails to capture is the likelihood of changes to the prediction itself between now and the predicted event:

One day the probability may be one thing, while the next day, when something happens in the world, it may change.

Because of how we’re wired, when we look at a “chance of winning” like the above, we fail to factor in this “changeability”. It’s very hard for us to even parse it: what does it mean about the eventual event we’re trying to predict that the prediction was 80%-20% yesterday and 75%-25% today? Which one is true and which one is false? Are both true?

FiveThirtyEight does have a mode called “Polls Plus” which attempts to capture the likelihood of polls being right or wrong based on historical data, but any attempt in this direction will still fail to capture the entire complexity of human behavior leading up to and on election day.

 

3. The Spread in Individual States Appears Much Smaller Than the Spread in the Overall Chances

With an understanding of the electoral college system, we can now drill down into key states and realize that the difference between the candidates appears to be much smaller than in the bottom line prediction. As an example, let’s look at Florida:

[Screenshot: Florida polls, October 31, 2016]

We see that even in polls where Clinton is winning, it’s only by a few points.

A 44%-42% lead doesn’t sound as impressive as a “75% chance of winning”, does it?

 

 

4. Different State Polls Can End Up Being “Wrong” Together

One catalyst of the financial crisis of 2008 was Collateralized Mortgage Obligations (“CMO”s).

The poorly executed idea was to group multiple subprime mortgages (those given to borrowers with low credit scores, who are more likely to default) together as one investment product.

They used statistics to claim these “packaged” mortgages carried a much lower risk together than any individual mortgage separately: if you run a simulation of how the loans will play out over their lifetimes, some borrowers will default but it’s unlikely that “many” of the mortgages will default, so the investor is still safe.

The problem was obvious in hindsight: while the simulations assumed each mortgage is independent, in reality the underlying economic conditions that cause many borrowers to default are shared by all the mortgages. When borrowers started defaulting, a death spiral followed: they all defaulted, for the same reasons.

Similarly, if there’s an underlying reason why polls would be skewed in one direction or another and/or if something will influence voters between now and election date – these things are likely to influence polls and voters in multiple states all at once.

On paper, FiveThirtyEight tries to take this into account while admitting it’s “tricky”. My gut feeling, looking at the polls in individual swing states vs. the bottom line nationwide prediction, is that this phenomenon is not being accounted for enough for it to stop being a problem.
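To see how much this matters, here’s a hedged extension of the earlier sketch (same made-up numbers) that adds a single shared polling error shifting every state’s probability in the same direction. This is one simple way to model correlated misses, not FiveThirtyEight’s actual method:

import random

# Same illustrative placeholder numbers as in the earlier sketch.
states = {
    "Florida":        (0.55, 29),
    "Pennsylvania":   (0.77, 20),
    "Ohio":           (0.35, 18),
    "North Carolina": (0.52, 15),
    "Safe Clinton":   (1.00, 217),
    "Safe Trump":     (0.00, 239),
}

SIMULATIONS = 20_000
TO_WIN = 270

def chance_of_winning(error_scale):
    """Share of simulations won when all states share one polling error."""
    wins = 0
    for _ in range(SIMULATIONS):
        # One nationwide error, applied to every state at once.
        shift = random.gauss(0, error_scale)
        electors = 0
        for p_win, votes in states.values():
            p = min(1.0, max(0.0, p_win + shift))
            if random.random() < p:
                electors += votes
        if electors >= TO_WIN:
            wins += 1
    return wins / SIMULATIONS

print(f"Independent states: {chance_of_winning(0.00):.1%}")
print(f"Correlated error:   {chance_of_winning(0.05):.1%}")

The correlated version produces a noticeably less confident favorite: when all the states can be wrong together, the bottom line drifts back toward a coin flip.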

 

5. This Time Polls Could Be Even Less Representative Than Usual

Let’s face it: the 2016 election is nothing short of extraordinary. No candidate in modern times has been nearly as “unusual” as Trump.

While all election polls contain errors of methodology:

  1. People change their minds
  2. Voters who were surveyed end up not voting
  3. A non-representative sample is used, for example because it’s easier to reach a certain type of people (e.g. older people who stay at home and answer the phone)

This time around, there’s even more cause for concern:

Due to how controversial a character Trump is, I find it very easy to imagine some voters unwilling to admit, when polled on the phone, that they’re voting for Trump. But that won’t prevent them from voting their heart on election day.

 


The Effort to Know

We live in a glorious time: information is more accessible to us than at any time in the past, by far. A middle schooler in a small town has access to more information using her phone today than a Nobel Prize winning researcher did in 1900 using the greatest library of the time.

Unfortunately, our mental capacity as humans did not progress as much, on average. This has led to a situation where we are incapable of processing the huge amount of information we’re being bombarded with.

By default, our lazy brains cope by developing “shortcut” techniques: we became skilled at skimming through hundreds of tweets, posts and headlines in a very short amount of time (I am as guilty of this as the next person).

The distributors of information are well aware of this: they honed mechanisms to catch our attention and help us consume bite sized, semi-digested “content”: click-baiting headlines, tweets, “listicles”, fact-checks, “memes”. Easily understandable, broken down, eye-catching, easily shareable.

 

I think of this type of “content” as the information equivalent of junk food: engineered to taste better and satisfy faster, tempting to consume much more than the body needs, and unfortunately not that nutritious.

But as anybody who consumed a gallon of Coke or a party-sized bag of Fritos knows: there is a price to pay.

 

When you skim over tweets, headlines in your Facebook feed or Reddit and watch 20-second videos with no sound, you’re not attaining knowledge but something else. You’re being:

  1. Served a version of the topic stripped of 90% of its substance and complexity.
  2. Manipulated by an interested party to steer your opinion in some direction.

Importantly, these are not just philosophical ramblings: if you, like me, believe that the advancement of civilization is achieved through the spread and application of knowledge, it’s clear that this process is leading us down a path toward an under-informed, unknowledgeable society that is incapable of making good decisions, which will inevitably stall progress.

 

Simply put, in tweet form: if we keep consuming information the way we are now, as a society we’ll make dumb decisions and we won’t have nice things.

 


So here are our responsibilities and the tools we must use to avoid this situation, in chronological order:

  1. The first step is to realize that the world has dramatically changed in how information is being packaged and distributed: the majority of information we access is not being actively sought out, retrieved and processed by us (what computer scientists call “pull”). It is being carefully structured, pre-processed, targeted and “pushed” out to us.
    Before: You go to a library, find a book, read the book, try to make sense of it.
    After: A carefully crafted computer algorithm sends you an impossible-to-ignore piece of content, drawing your attention to it by vibrating a supercomputer inside your pocket.
    This means information reaches us when our guards are down and we are exposed to being manipulated.
  2. We must also be constantly aware that information is always being actively positioned, packaged and manipulated by interested parties.
    This is not a new phenomenon: journalists always had opinions; no story can be told without bias and perspective.
    The new parts, however, are meaningful: the reach, influence and ability to spread manipulated content are vastly greater than in the past.
    Additionally, the identity and motivation of the influencers and manipulators are much less conspicuous than before, which makes us, the consumers, less careful and more susceptible to manipulation.
    We must approach every piece of information with high suspicion and skepticism, if we want to uncover true knowledge.
  3. Being aware of the two points above, we ought to force ourselves to study in-depth every topic of interest and piece of “news” or “information” we encounter and consume.
    The following methods of information consumption are invalid and amount to zero new knowledge attained:
    Reading a headline or title.
    Clicking through and skimming over an article.
    Reading the entire article without understanding the background of the topic and the background, opinions and biases of the author.
    Note also that if, after using one of the above inadequate methods of information consumption, you form an opinion and/or share the content on the internet, you’re doing even more damage by increasing the chances that one of your friends will fall into the same trap.

So, if the above behavior does not expand our knowledge, what does?

To actually expand our knowledge we have to:

  1. Read through a long-form piece of information (article length or more).
  2. Thoroughly research the background and general opinions of the author and her affiliations.
  3. Thoroughly research the topic and the context of the piece of information.
  4. Actively seek out and consume long-form alternative and opposing opinions on the subject presented in the original piece of content.

Only thus are we truly learning something and allowing ourselves to form an opinion.

 

Sounds like a lot of work, right? It is. But let’s not delude ourselves into thinking we can skip this work and still reap the benefits of a functioning civilization that promotes our well-being.


All It Takes to Win an Election: Scare 270,000 People in Pennsylvania

Disclaimer: this is a highly speculative, controversial-on-purpose post. The historical events and data are mostly fact. Predictions and forward-looking analysis are mine, based on gut feelings and what-if thinking, not on fact.

 

This is just a friendly reminder that in September 1999, a few months before the presidential election that brought him to power, Putin staged “terrorist attacks” in multiple cities across Russia in which 298 people lost their lives. The Chechen rebels were blamed for the bombings. Putin, campaigning on a hardline approach to the war in Chechnya won the election in March 2000 and the rest is history.

Oh what are you saying that the leader of the second most powerful nuclear power in the world killed hundreds of his own people to tip the outcome of presidential elections get out of here!


Putin has since been at the helm for 16 years and counting but that’s not our problem, our problem is right here, right now:

I would argue that the outcome of the 2016 U.S. Presidential Election can be easily swayed with one strategically placed and timed “act of terrorism” on U.S. soil or against an American target. See, it’s not that one candidate or the other is the “better anti-terrorism candidate”. All you need is for a not-too-large (see below) group of voters to be temporarily affected by a dramatic news event and vote for the candidate whose rhetoric is more plainly and one-sidedly “against terror”.

Oh but you’re being funny this is America not Russia that stuff doesn’t happen here what are you talking about we have rule of law and folks who prevent that kind of thing!

Really? Like nobody would hack a major party’s emails to influence public opinion in favor of an outspoken Putin sympathizer? No way! Don’t forget, this is the kind of stuff presidents used to quit their jobs over.

Listen there buddy, hacking emails is one thing, blowing up people is another.

Don’t forget, the potential for real terror acts is out there constantly. The reason there isn’t a bombing taking place on U.S. soil every day is that the terrorists, as a whole, are much less competent and worse funded than the government agencies combating them. And that’s a good thing. But what if a serious, professional organization, skilled in international sabotage, decides to intervene on behalf of the terrorists and just help them a little tiny bit? That’s a different story. You do remember Putin was a lieutenant colonel in the KGB.

But that’s cool, when tens of millions of people vote, reason will surely prevail over any temporary outbursts of anger.

Not so fast. The predicted gap between Trump and Clinton has been steadily shrinking over the last two weeks. As of today (9/19, source: http://projects.fivethirtyeight.com/2016-election-forecast), Clinton is predicted to win 287 electoral votes, just a 37-vote lead over Trump:

[Screenshot: FiveThirtyEight electoral vote forecast, September 19, 2016]

 

A lot? Not really. In fact, if all else remains the same, all it takes is swinging just one state like Pennsylvania, wielding 20 electors, to tip the scale in Trump’s favor.

Pennsylvania has been polling strongly in Clinton’s favor:

[Screenshot: Pennsylvania polls, September 19, 2016]

 

But think about this: Pennsylvania has 8.3M registered voters. Assuming a very high 65% turnout rate (this is, after all, a highly engaging and controversial campaign), we can expect about 5.4M votes to be cast. Looking at the poll data above, a 5% swing can change the outcome of the vote in Pennsylvania. That’s just 270,000 people.
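For the record, here’s the arithmetic behind that number as a tiny Python sanity check (the registered-voter count and the turnout rate are this post’s assumptions, not official figures):

registered_voters = 8_300_000  # Pennsylvania registered voters (assumed)
turnout_rate = 0.65            # assumed, deliberately high
swing_fraction = 0.05          # the swing suggested by the poll margins above

expected_votes = registered_voters * turnout_rate  # about 5.4 million
swing_voters = expected_votes * swing_fraction     # about 270,000

print(f"Expected votes cast: {expected_votes:,.0f}")
print(f"Voters needed to swing Pennsylvania: {swing_voters:,.0f}")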

What if there’s a terrorist bombing with multiple casualties in downtown Philadelphia two weeks before election date? Can you imagine 270,000 people in Pennsylvania voting in anger, despair and fear? I can.


Twitter Polls Should Be a Third-Party App

Twitter is rolling out its polls feature.
How many normal Twitter users care about polls? It’s obviously not going to be a big deal in terms of solving “Twitter’s problems” of not growing fast enough and not generating a ton of revenue.

[Screenshot: Twitter’s poll composer]
Twitter Polls is an ideal example of something that should just be a third-party app. In a world where you can build apps on top of Twitter that seamlessly integrate into the Twitter UI and/or live outside of it, this is an interesting product for a small team to build on top of Twitter. Everybody wins.
At the same time, their CEO is apologizing to developers.

I worked on a project built entirely on top of Twitter. It’s going to take a lot of time and effort to heal the Twitter-developer relationship but it’s worth it. I would not build a software business on top of Twitter right now, because if it’s any good there’s a good chance Twitter will just build the same thing themselves.

I think the only way for Twitter to succeed is to make it into the de-facto communication layer that allows machine-machine, machine-human and human-human communication, and to regain developers’ trust enough for them to build real businesses on top of this layer.

And they should charge developers for it.

I think there’s “common knowledge” saying that you can’t justify a company the size of Twitter (~$20B market cap) by selling APIs. You must also sell higher-order services, to larger markets (consumers, not developers). Maybe Twitter can be the one to break this mold.


Oculus VR, the Age of Makers and Growing a Business that’s Already Huge

Behind the usual hoopla over the Facebook acquisition of Oculus VR hides a truly spectacular feat of individual Making (capital M) and a huge milestone in the history of (crowd)funding and selling a business. In my mind, it even eclipses the acquisition of WhatsApp just a month ago, as well as the acquisition of Instagram (all by the same acquirer; more on that in a bit).

Palmer Luckey, the founder and inventor of the Rift, began working on the product sometime around 2009 as a student at USC. He posted about his little project on August 21st, 2009:

I am making great progress on my HMD kit! All of the hardest stuff (Optics, display panels, and interface hardware) is done, right now I am working on how it actually fits together, and figuring out the best way to make a head mount. It is going to be out of laser cut sheets of plastic that slide together and fasten with nuts and bolts. The display module is going to be detachable from the optics module, so you will be able to modify, replace, or upgrade your lenses in the future!

This is 4.5 years before selling the company for 2 billion dollars: one guy, hacking together a product. Inspiring.

After the prototype was built, the Oculus Rift was launched as a Kickstarter campaign in August 2012, with a modest goal of $100,000:

We’re here raising money on Kickstarter to build development kits of the Rift, so we can get them into the hands of developers faster. Kickstarter has proven to be an amazing platform for accelerating big and small ideas alike. We hope you share our excitement about virtual reality, the Rift, and the future of gaming.

It became one of the most successful projects on Kickstarter, raising $2.4 million.
Let’s repeat that for a second: 18 months before being sold for $2 billion, the product received its first funding of $2.4 million on a crowdfunding platform.

And a final thought goes to the company that acquired all three companies mentioned (Oculus VR, WhatsApp and Instagram). Mark Zuckerberg and his team deserve more credit than they’re getting. The common analysis seems to be that crazy money is being thrown around by companies who fear competition and fear missing out on the next big thing.

But there’s more business savvy going on: most of the value exchanging hands in the acquisitions of WhatsApp and Oculus VR ($16B and $1.6B respectively) is in Facebook stock. Facebook is using the fact that it’s a publicly traded company, and its currently very highly priced stock, in just the right way: to generate new business way into the future.

Traditionally, a company would sell stock to the public to get the cash required for its operation. Facebook is using stock, a piece of paper whose current value is based only on the expectation of future profits, to acquire the very future profits in question. No cash needed.

Assuming no major economic disasters, Facebook stock is expected to remain strong. This means more “insanely” priced acquisitions of nascent and/or massively growing businesses in the near future. Exciting times.


Apple Will Lose

As more and more sad details surface about Apple’s legal crusade, I keep asking myself why I’m still using the iPhone and don’t just switch to an Android.

Yeah, Android is still not as good, and I always told myself I’ll get an Android phone eventually, when they’re good enough.

But then it hit me:

It doesn’t matter. Apple is not going to lose merely because its customers will eventually switch to the competitors’ products. Apple is going to lose because eventually its own employees, the people who make it the greatest company in the world, will leave.

Apple is the personal creation of a great man. Perhaps the greatest man of our time. But as such, Apple also carries the seed of its own destruction. In its insatiable desire to be the greatest there’s also the insatiable desire to be worshiped and acknowledged as the greatest.

It’s not enough to win, to sell the most phones, tablets and laptops. It’s not enough to be the most valuable company in the world. Apple also wanted its competitors to bow before it, admit that Apple invented and created everything that’s good in the world, and commit an elaborate suicide ritual at Apple’s feet.

And when I say “Apple”, I mean “Steve”.


Measuring The Alcohol Content of Beer

All booze has an “Alcohol by Volume” measure specified. It’s denoted as a percentage which is supposed to tell you “how much alcohol” there is in the specific drink, or, alternatively, “how fucked up are you going to be and how fast”. Beer is typically 4%-10%, wine 12%-14%, vodka and whiskey 40% and so on.

But how do they measure this quantity? How do they know exactly how much alcohol is there in a bottle of beer?

General process

The alcohol in beer is created by fermentation: yeast eats up the sugars extracted from the grains, making alcohol (ethanol) in the process.

The density of ethanol is known, so in order to tell the amount of ethanol in the beer, we measure the overall density of the beer before and after fermentation, and then deduce how much ethanol was produced.

Density

“Density” is a measure of how “heavy” something is for a given volume. Imagine two identical boxes, one with nails and one with flowers. The box with nails will probably be heavier, so intuitively we can say that nails are more “dense” than flowers.

We use “Specific Gravity” to denote the density of beer. Specific Gravity is a measure of how dense something is relative to some “standard” density. For liquids, the standard is usually water, so water has a specific gravity of “1”. Something that’s twice as dense as water has a specific gravity of “2” and so on.

Measuring the density of beer

A tool called a hydrometer is used to measure the density of a liquid. It’s basically a sealed tube with a weight at the bottom. You fill a test jar with the liquid you’re interested in and float the hydrometer in it: the denser the liquid, the higher the hydrometer floats, and a scale on its stem reads off the specific gravity at the surface line.

The calculation

Now, since we know the density of ethanol and we have the two measures of density for the beer – before and after fermentation, we can use a simple formula to tell the Alcohol by Volume:

ABV = ( ( 1.05 × ( OG – FG ) ) / FG ) / 0.79 × 100

Where “OG” is “Original Gravity” – the density before fermentation, and “FG” is “Final Gravity”, the density after fermentation.
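As a quick sanity check, here’s the same formula as a small Python function (the sample gravities are made-up numbers for a typical pale ale):

def abv_percent(og, fg):
    """Alcohol by Volume (%) from original and final specific gravity.

    1.05 converts the drop in gravity into the weight of ethanol produced,
    dividing by FG gives alcohol by weight, and dividing by 0.79 (the
    specific gravity of ethanol) converts weight into volume.
    """
    abw = 1.05 * (og - fg) / fg  # alcohol by weight, as a fraction
    return abw / 0.79 * 100      # alcohol by volume, as a percentage

# Example: plausible measurements for a pale ale.
print(f"{abv_percent(og=1.050, fg=1.010):.1f}% ABV")  # ~5.3% ABV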


Apache with PHP on a Windows Machine


The program can’t start because LIBPQ.dll is missing from your computer. Try reinstalling the program to fix this problem.

If you’re getting the above error when starting Apache after installing Apache and PHP on your Windows machine, go to your PHP install directory (e.g. C:\Program Files (x86)\PHP) and copy the file libpq.dll into the bin directory under the Apache install directory (e.g. C:\Program Files (x86)\Apache Software Foundation\Apache2.2\bin).


The Finnish Education System

A thought-provoking article in the Atlantic about the education superpower Finland:

The small Nordic country of Finland used to be known — if it was known for anything at all — as the home of Nokia, the mobile phone giant. But lately Finland has been attracting attention on global surveys of quality of life — Newsweek ranked it number one last year — and Finland’s national education system has been receiving particular praise, because in recent years Finnish students have been turning in some of the highest test scores in the world.

The thesis is that the egalitarian approach is the main reason for the success:

In fact, since academic excellence wasn’t a particular priority on the Finnish to-do list, when Finland’s students scored so high on the first PISA survey in 2001, many Finns thought the results must be a mistake. But subsequent PISA tests confirmed that Finland — unlike, say, very similar countries such as Norway — was producing academic excellence through its particular policy focus on equity.

Not sure if this is applicable to other countries, but definitely mind-blowing.


Using Better Naming to Clarify Code

I’m a big fan of good naming in code. Here’s a recent example:

Suppose you have a unique index in a database table and you’re trusting that index to enforce no more than one record per key.

So you’re using an insert ignore into…on duplicate key update statement.

So you end up calling something like DataAccess.InsertRecord(data) or DataAccess.AddRecord(data). Looking at such code, it’s very unclear that what really happens is an insert-or-update that leaves you with only one record.

You can go the way of making your code explicit by moving the logic into your app and doing something like:

// Explicit check-then-write in application code (note: not atomic).
var record = DataAccess.GetRecord(key);
if (record == null)
    DataAccess.InsertRecord(data);
else
    DataAccess.UpdateRecord(data);

But then you’ll be losing the power of having the database do that for you.

So what I’m suggesting is just making your naming better, for example: DataAccess.ReplaceRecord(data).
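For illustration, here’s a minimal sketch of what such a well-named method might wrap, in Python against a hypothetical records table (the table, column names and DB-API connection are made up):

def replace_record(conn, key, data):
    """Insert the record, or overwrite the existing record with the same key.

    The name tells the caller exactly what happens, unlike InsertRecord
    or AddRecord.
    """
    # The unique index on record_key guarantees at most one matching row.
    conn.cursor().execute(
        "INSERT INTO records (record_key, data) VALUES (%s, %s) "
        "ON DUPLICATE KEY UPDATE data = VALUES(data)",
        (key, data),
    )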