A Few Tips on Statistics

Statistics are a valuable tool, yet frequently misused and distorted. I have often wished that statistics were a mandatory high school class, since their use is so critical to understanding the world around us. I thought I would share a few tips for better understanding statistics.

First, while statistics are often misused, they truly represent our most valuable tool for objectively assessing and quantifying issues and the nature of the world beyond our reach. The alternative is to rely on anecdotal evidence, stories that we or others have experienced. While stories can be a great way of helping us understand how situations work and of connecting with others, they tend to lack quantifiable objectivity. Stories can very easily be unrepresentative of the most common realities. We, or those we listen to, can easily have experienced unique situations that do not actually correspond with normalcy. In fact, the stories we tend to enjoy most are those that are most unusual. So the stories that are shared, whether orally or through social or news media, are often the ones that are furthest from an accurate representation of typical reality. For example, the deaths we hear the most about on the news are the ones that are most unusual and least likely to happen, precisely because those are the ones that are most interesting. The causes that in reality take the most lives are banal.

Even though statistics are our most objective tool for describing the world in quantifiable ways, there are plenty of ways we can be deceived by them. Let’s look at a couple of basic steps for filtering statistics.

The first type of statistic we often see is the kind that simply describes some issue. These statistics deal with “what”, but not “how” or “why”. They are the easiest statistics to understand since they simply describe a measurable quantity. The first thing to consider in evaluating these statistics is context. Many statistics can be completely overwhelming and difficult to comprehend. When a statistic is difficult to comprehend, that is actually a good signal that we need more context.

For example, the US national debt is currently about 17 trillion dollars. The huge size of the national debt is often talked about, but in reality this number alone lacks context. Trillions of anything are simply incomprehensible for just about any human. So how can we bring context? We could consider this at a personal level: the national debt per person is about $55,000. This is a little easier to understand (less than many mortgages, but much higher than a responsible level of consumer debt). We could also consider the statistic from a historical perspective (the debt as a percentage of GDP is higher than at most times in US history, but lower than at certain times, such as during WWII). We could also compare the US to other countries (the US is higher than most, but lower than several; for example, Japan’s is nearly twice as high). This isn’t to make any particular claim about the US national debt, but it demonstrates how we can try to find some context for statistics. The US debt is high, but without this context, numbers like the national debt are conceptually meaningless, and it is important that we find things to compare against to provide meaning.
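The per-person figure is just a division, and it can be reproduced in a few lines. This is only a sketch: the population value is my own round assumption (roughly 310 million), not a number from the text, and the result is rounded to the nearest thousand.

```javascript
var NATIONAL_DEBT = 17e12;   // dollars, from the text
var US_POPULATION = 310e6;   // assumed round figure, not from the text

// Scale an incomprehensible total down to a per-person amount.
var debtPerPerson = NATIONAL_DEBT / US_POPULATION;

// Round to the nearest thousand dollars for readability.
console.log(Math.round(debtPerPerson / 1000) * 1000); // 55000
```

The same move (dividing by population, by GDP, or by another country's figure) works for most overwhelming totals.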

The next thing to beware of in approaching these statistics is that often something is being implied. While using statistics to understand the “what”s of the world, we often want to take the next step to understand “why” things are the way they are, and “how” we can affect them. We want to understand cause and effect.

Probably the most common saying among those who deal with research and statistics is “correlation does not imply causation”. This may sound confusing, but it is a fairly simple warning. While statistics may show that two things are related, we need to be very cautious about assuming that one thing caused another thing.

Let’s consider an example. One could say that you are far more likely to die when you are in the hospital than when you are at the grocery store. Of course this is true, but does this therefore imply that hospitals are more dangerous than grocery stores? Of course not. Hospitals aren’t the cause of the deaths. The cause is the illnesses, which lead both to the deaths and to the hospital visits. We can’t assume that just because hospitals and deaths are related, one is the cause of the other.

But determining cause and effect, so that we can learn what factors lead to what results, is still extremely important, and statistics can be used to assess cause and effect. So how can we accurately determine when a statistic indicates a true cause? A statistical relationship can demonstrate cause and effect if we can show that no other factor is the real cause. Sometimes we can logically establish this. Other times we can set up experiments. In medicine (and increasingly in other fields, like microeconomics), trials are set up where different people are given or not given some intervention (like a drug), and the effects are then measured by comparing the two groups. Usually the recipients are randomly chosen. The advantage of this approach is that by randomly choosing who receives the intervention, we can be confident that no other cause systematically separates the two groups. For this reason, randomized controlled trials are used extensively to accurately assess drugs, and can be very valuable for making accurate assessments of other types of interventions.
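To make the hospital example and the value of randomization concrete, here is a small simulation sketch. Every probability in it is invented for illustration, and the function names are my own. In the simulated world, death depends only on illness. When sick people are the ones who end up "inside", the inside group looks far deadlier. When a coin flip decides who is inside, the gap should nearly vanish.

```javascript
// Small seeded PRNG (mulberry32) so the run is repeatable.
function mulberry32(seed) {
  var a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    var t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Death rate in each group when group membership is decided by `assign`.
function deathRates(assign, rand, n) {
  var counts = { inside: 0, outside: 0 };
  var deaths = { inside: 0, outside: 0 };
  for (var i = 0; i < n; i++) {
    var ill = rand() < 0.10;                   // assumed illness rate
    var died = rand() < (ill ? 0.20 : 0.001);  // death depends only on illness
    var group = assign(ill, rand) ? "inside" : "outside";
    counts[group]++;
    if (died) { deaths[group]++; }
  }
  return {
    inside: deaths.inside / counts.inside,
    outside: deaths.outside / counts.outside
  };
}

// Observational world: the ill are far more likely to be in the hospital.
function bySickness(ill, rand) { return rand() < (ill ? 0.80 : 0.01); }
// Randomized world: a coin flip decides, independent of illness.
function byCoinFlip(ill, rand) { return rand() < 0.5; }

var observed = deathRates(bySickness, mulberry32(1), 200000);
var randomized = deathRates(byCoinFlip, mulberry32(2), 200000);

console.log("observed:  ", observed.inside.toFixed(3), observed.outside.toFixed(3));
console.log("randomized:", randomized.inside.toFixed(3), randomized.outside.toFixed(3));
```

The "hospital" does nothing in either run. The only thing that changes is how people end up in it, which is exactly the factor a randomized trial takes control of.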

Outside of controlled experiments, it can be very difficult to pin down causes, because it is very difficult to be sure that no other factors are influencing results. For example, determining the effect of different educational institutions is remarkably hard because so many factors, like affluence, affect both the educational outcome and the type of institution that a parent would choose. Determining whether the institution was the deciding factor, rather than any of the vast number of other ways that wealth can be used to benefit a child’s education, is again very difficult.

When we can’t control the factors, the primary remaining technique is regression analysis. However, this is where this casual discussion of statistics quickly turns much more advanced. Suffice it to say that truly demonstrating cause and effect with real-world, uncontrolled statistics typically requires complicated analysis.
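As a rough illustration of what regression can do with the school example, here is a sketch with simulated data. All of the numbers are invented, and the helper names (`solve`, `ols`) are my own. Wealth raises both the chance of attending a private school and the test score, while the school itself is given zero effect. A regression that ignores wealth credits the school with a large effect. One that includes wealth as a second predictor should put the school's coefficient near zero.

```javascript
// Small seeded PRNG (mulberry32) so the run is repeatable.
function mulberry32(seed) {
  var a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    var t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Solve A x = b by Gaussian elimination with partial pivoting.
function solve(A, b) {
  var n = b.length, i, j, k;
  var M = A.map(function (row, r) { return row.concat([b[r]]); });
  for (i = 0; i < n; i++) {
    var p = i;
    for (k = i + 1; k < n; k++) {
      if (Math.abs(M[k][i]) > Math.abs(M[p][i])) { p = k; }
    }
    var tmp = M[i]; M[i] = M[p]; M[p] = tmp;
    for (k = i + 1; k < n; k++) {
      var f = M[k][i] / M[i][i];
      for (j = i; j <= n; j++) { M[k][j] -= f * M[i][j]; }
    }
  }
  var x = new Array(n);
  for (i = n - 1; i >= 0; i--) {
    var s = M[i][n];
    for (j = i + 1; j < n; j++) { s -= M[i][j] * x[j]; }
    x[i] = s / M[i][i];
  }
  return x;
}

// Ordinary least squares; an intercept column is added to each row of X.
// Returns [intercept, coefficient1, coefficient2, ...].
function ols(X, y) {
  var rows = X.map(function (r) { return [1].concat(r); });
  var p = rows[0].length, XtX = [], Xty = [];
  for (var a = 0; a < p; a++) {
    XtX.push([]);
    Xty.push(0);
    for (var c = 0; c < p; c++) { XtX[a].push(0); }
  }
  rows.forEach(function (r, i) {
    for (var a = 0; a < p; a++) {
      Xty[a] += r[a] * y[i];
      for (var c = 0; c < p; c++) { XtX[a][c] += r[a] * r[c]; }
    }
  });
  return solve(XtX, Xty);
}

// Simulated students: the school's true effect on the score is zero.
var rand = mulberry32(42);
var school = [], wealth = [], score = [];
for (var i = 0; i < 20000; i++) {
  var w = rand();
  wealth.push(w);
  school.push(rand() < w ? 1 : 0);                 // richer families choose private more often
  score.push(50 + 30 * w + (rand() - 0.5) * 10);   // score depends on wealth only
}

var naive = ols(school.map(function (s) { return [s]; }), score);
var adjusted = ols(school.map(function (s, k) { return [s, wealth[k]]; }), score);

console.log("school effect, ignoring wealth:       ", naive[1].toFixed(2));
console.log("school effect, controlling for wealth:", adjusted[1].toFixed(2));
```

This only works here because the confounder is known and measured. With real data, the hard part is knowing which factors to include, which is why this kind of analysis gets complicated quickly.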

To summarize the warning here, if someone is trying to convince you that one factor is the cause of another issue through statistics, unless the cause is logically provable, comes from a controlled experiment, or involves regression analysis, it is probably suspicious.

Statistics are an invaluable tool for understanding our world and for making informed decisions about how we prioritize our time and money. But in order to understand them, remember to put statistics in context, and be cautious about drawing cause and effect relationships. Hopefully these tips help make sense of statistics.

The Engineer’s Charities

There has been a growing movement away from disinterested charity toward rational and effective altruism. I wanted to write this post because I believe that we, as engineers, should be at the forefront of this movement. This post is not an attempt to persuade engineers that they should give, although most engineers are reasonably well-compensated and have the opportunity to do tremendous good through their generosity. Here I want to consider how and where we can give in a way that aligns with the values and principles of engineering.

A good engineer is a peculiar type of person. A good engineer is one who can solve problems and design solutions by applying logical, rational thinking and careful, clear analysis. Engineers are often not the originators of research, but must analyze research to intelligently determine how the research of others applies to solving real problems. Engineers must not operate solely in the abstract and theoretical, as they constantly must solve problems of the real world, and yet they still must integrate abstract theory to create designs that will work beyond just surface-level needs and that have enduring and broad application now and later.

Certainly one clear distinction between a poor engineer and a good one is that the former is content to accept whatever tool, material, or design is put before him first, perhaps because it is the most popular or was suggested by another, whereas a good engineer is a meticulous critic who knows how to carefully choose between the bad, good, better, and best options. A good engineer can clearly point to the reasons why a particular component or design is better, and precisely why and by how much it will yield better results or performance.

There is no reason we shouldn’t apply this same level of analysis and critique to our generosity and giving. Unfortunately, this is not how the world of charity generally works. Money is typically raised by making emotional appeals and leveraging networks of connections to attract donations. Most donations occur not as the result of careful analysis, seeking to find the best and most efficient opportunity to deliver a desired good for others, but as an incidental response to some plea. Consider what would happen if we applied this approach to one of our projects: rather than analyzing our options and rationally choosing the optimum components or design, we just went with whatever advertisement or buzzword we heard most recently. This would represent the epitome of lazy engineering.

As I mentioned before, there have been increasing efforts towards engaging in charity not just as a capricious way of getting rid of money and creating some temporary emotional satisfaction, but with a clear goal of helping the most people in the greatest way possible with the funds we have available to us. This movement towards effective altruism points toward the same relentless analysis of different possible mechanisms and the results they achieve that we engineers demand in the projects where we spend most of our energy. This approach demands real research into interventions and the evidence of their efficacy over emotional appeals and marketing.

With this, I want to introduce a couple of charities and meta-charities that epitomize effective, rational altruism:

A Charity Based on Research

Innovations for Poverty Action (IPA) is a charity founded by some of the best researchers in development economics. This charity has a foundation of research work, focused first on scientifically evaluating different possible interventions to objectively determine the most promising ones. IPA has been at the forefront of the movement to make extensive use of randomized controlled trials (RCTs). This form of research has been incredibly promising since it facilitates clear tests of causation. In non-randomized trials, controlling for various factors and establishing causation is incredibly difficult and prone to biases, whereas RCTs give an irrefutable cause (randomized selection) as the basis for tests.

From the results of this research, IPA has developed programs to scale up interventions that have been shown to be effective and efficient. IPA has branched off a separate organization, Evidence Action, for the funding and development of these efforts.

GiveWell – GiveWell is a charity evaluation organization that has begun to receive significant attention for its extremely detailed analysis of a select few charities and of how many lives are saved or impacted for each dollar spent. While their level of analysis is probably not unprecedented (it probably happens within the Gates Foundation, USAID, and perhaps some other foundations or agencies), their transparency almost certainly is. They have done very careful and detailed research on top, effective charities. You can visit GiveWell’s site for more information, but their current top-rated charity is the Against Malaria Foundation (AMF). From their research, it is estimated that approximately one life is saved for every $2000 donated, a remarkable return on investment, with a high level of research behind it.

For some, even engineers, this might sound excessively cool and calculating. Isn’t giving supposed to be more than just another engineering project, something emotionally and/or spiritually satisfying? I don’t believe rational giving is mutually exclusive with these other forms of satisfaction. We often apply the greatest level of rationality and thought to the things we care most about. Spiritually, at least for me as a Christ-follower called to care for others, making the actual impact of my giving on others’ lives the highest goal is the ultimate fulfillment of this spiritual pursuit. Likewise, emotionally, realizing that I have tangibly and realistically saved dozens of real, human lives, probably children who were loved as much as I love my children, is tremendously compelling. My challenge to engineers: approach giving just like you do engineering, with your best, most rational, careful analysis and concern. The impact you can make is immense.

Closure Based Instance Binding

In JavaScript, we frequently write functions where we wish to access |this| from another (typically outer) function. The |this| keyword is not inherited from outer functions the way other variables are; each function has its own |this| binding. For example, suppose we have a widget that needs to trigger some code in response to a button being pressed. We might try:

buildRendering: function(){
  this.domNode.appendChild(document.createElement("button")).onclick = function(){
    if(this.ready){
      this.showMessage();
    }
  };
},
showMessage: function(message){
...

However, this code won’t work as intended, because |this| in the inner function refers to something different (the target DOM node in this case) than it did in the outer method. Developers often use a bind() or hitch() function to force a function to have the same |this| as an outer function, fixing this code as follows (note that bind() is only available natively in ES5 and later; most JS libraries provide a function to do something similar):

buildRendering: function(){
  this.domNode.appendChild(document.createElement("button")).onclick = (function(){
    if(this.ready){
      this.showMessage();
    }
  }).bind(this);
},

Another technique used by developers is closure-based instance binding, where we assign |this| to a variable that can then be accessed by the inner function:

buildRendering: function(){
  var widget = this;
  this.domNode.appendChild(document.createElement("button")).onclick = function(){
    if(widget.ready){
      widget.showMessage();
    }
  };
},
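The three variants can be compared side by side without a browser. The sketch below is not the widget code itself: `makeNode` and `makeWidget` are hypothetical stand-ins, with `fire()` invoking the handler with |this| set to the node, as a browser does for onclick.

```javascript
// Hypothetical stand-in for a DOM node: fire() calls the handler with
// |this| set to the node.
function makeNode() {
  return {
    onclick: null,
    fire: function () { return this.onclick.call(this); }
  };
}

// Hypothetical minimal widget that counts how often showMessage() runs.
function makeWidget() {
  return { ready: true, shown: 0, showMessage: function () { this.shown++; } };
}

// 1. Plain inner function: |this| is the node, so this.ready is undefined.
var plain = makeWidget(), plainNode = makeNode();
plainNode.onclick = function () {
  if (this.ready) { this.showMessage(); }
};
plainNode.fire();

// 2. bind(): |this| inside the handler is forced to be the widget.
var bound = makeWidget(), boundNode = makeNode();
boundNode.onclick = (function () {
  if (this.ready) { this.showMessage(); }
}).bind(bound);
boundNode.fire();

// 3. Closure: the handler reads the widget from an outer variable.
var closed = makeWidget(), closedNode = makeNode();
closedNode.onclick = function () {
  if (closed.ready) { closed.showMessage(); }
};
closedNode.fire();

console.log(plain.shown, bound.shown, closed.shown); // 0 1 1
```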

Both techniques have their place, but I almost always use the latter. I wanted to give a few reasons:

  • Philosophical – JavaScript is foremost a closure-based language, with method dispatching as an additional sugared mechanism built on the foundation of the property access and call operators. This is in contrast to other languages that place object-oriented method dispatching as foundational, with closures as something built on top of it. Every function in JavaScript has a closure on some scope (even if it is the global scope), and using closures to maintain references to different instances through different function levels is the essence of idiomatic JavaScript.
  • Clarity and readability – “this” is a very generic term, and in layered functions, its ambiguity can make it difficult to determine what instance is really being referred to. Setting |this| to a variable lends itself to giving the instance a clearly articulated variable name that describes exactly what instance is being referred to, and can easily be understood by later developers without having to trace through bindings to see where |this| comes from. In the example above, in the inner function, it is clear that we are referring to the widget instance, as opposed to the node or any other instance that might get assigned to |this|.
  • Flexibility – Using closure based instance binding, we can easily reference a parent function’s instance without losing the ability to reference the inner function’s |this|. In the example above, we can access the widget instance, but we can also still use |this| to reference the target node of the event.
  • Code Size – After minification, setting up access to the parent |this| instance requires approximately the following number of bytes (this varies with minification technique, and assumes variables are reduced to 2 bytes):
    function.bind() – setup: 9, each reference: 4
    closure-based – setup: 8, each reference: 2
    Some functions like forEach, already have an argument to define |this| for the callback function, in which case they have a shorter binding setup.
  • Reduced Dependencies – Function binding entails using ES5’s bind(), which reduces compatibility to only ES5 environments, or requires reliance on a third-party library, decreasing your code’s portability and flexibility in being used in different contexts. Closure-based bindings rely on nothing but JavaScript mechanisms that have been supported by all VMs forever.
  • Performance – There are two phases of performance to test, each of which varies with our mechanism of binding. First is the setup of the function. This can be dominant if the function is not called very often. Second is the actual function invocation. Obviously this is more important if the function is called frequently. From these tests you can see the clear, and often immense, performance advantage of closure-based binding:
    Setup: http://jsperf.com/bind-vs-closure-setup
    Invocation: http://jsperf.com/bind-vs-closure-invocation
  • Of course, there are certainly situations where it is appropriate to bind. If you are going to provide a method directly as an event listener, binding is often appropriate. However, it is worth noting that I have seen plenty of instances of a method being created specifically for an event listener when an anonymous inline function (with closure-based instance binding) could easily have been used.
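Two of the points in the list above, keeping access to both instances and forEach’s own |this| argument, can be shown in one runnable sketch. `FakeButton`, `demoWidget`, and `showAll` are hypothetical names invented for this example; `FakeButton` stands in for a DOM button so the code runs outside a browser.

```javascript
// Hypothetical stand-in for a DOM button: click() invokes the handler
// with |this| set to the button, as a browser would.
function FakeButton(label) {
  this.label = label;
  this.onclick = null;
}
FakeButton.prototype.click = function () {
  return this.onclick.call(this);
};

var demoWidget = {
  ready: true,
  log: [],
  showMessage: function (text) {
    this.log.push(text);
  },
  buildRendering: function () {
    var widget = this; // closure-based instance binding
    var button = new FakeButton("save");
    button.onclick = function () {
      // |widget| is the outer instance; |this| is still the event target.
      if (widget.ready) {
        widget.showMessage("clicked " + this.label);
      }
    };
    return button;
  },
  showAll: function (labels) {
    // forEach takes the callback's |this| as its second argument,
    // so no separate bind() or closure variable is needed here.
    labels.forEach(function (label) {
      this.showMessage("item " + label);
    }, this);
  }
};

demoWidget.buildRendering().click();
demoWidget.showAll(["a", "b"]);
console.log(demoWidget.log); // [ 'clicked save', 'item a', 'item b' ]
```

With bind(), the click handler could not have read `this.label` from the button, since |this| would have been replaced by the widget.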

UNFPA

It has been estimated that at a cost of about $1.7 billion a year, modern contraceptive use prevents 105 million induced abortions and another 3.6 million infant and mother deaths (in 2004) per year, with the United Nations Population Fund (UNFPA) being a main provider of contraceptives in developing countries where access would otherwise be limited. If you believe that life begins at conception, then the UNFPA has almost certainly prevented more deaths than any other organization on earth, and with more efficiency (between $7 and $177 per death prevented) than any other organization as well. The tragic irony is that funding cuts to the program in this decade have largely been driven by pro-life organizations, the very ones that hold this view of life (due to the persistent misconception that the UNFPA advocates abortion). This isn’t intended to be a slight against the well-meaning intentions of these organizations, but a lesson that to honestly pursue a cause, we must humbly learn from empirical feedback what the effective solutions are, rather than simply pushing forward with our ideological assumptions.

You can give to UNFPA here.

Capitalism is a Tool

Free market capitalism is awesome. But capitalism is a tool. Arguing about whether it is universally better than some other tool (socialism, communism, restricted capitalism, etc.) is as foolish as arguing about whether a hammer is universally better than a screwdriver. The more relevant question is when it is appropriate (or in what form). Capitalism works fantastically well with a few conditions that create incentives that drive all parties to make decisions that benefit everyone:
1. Consumers have the ability to make a rational choice about what to consume.
2. Consumers have the freedom to choose between different products or services, creating competition that drives producers towards better value.
3. Demand drives producers to increase supply, benefiting consumers.
4. Consumers’ inability to afford a product/service does not represent an ethical violation.

When these conditions are present, capitalism generally contributes to a flourishing and just economy. For example, with cars, consumers can easily research and test drive a car to make a rational choice, they can choose between different makers and models, the demand pushes auto makers towards building more cars and increasing availability, and in general there is no basic inviolable human right to having a vehicle. This is a win-win situation: incentives benefit both producers and consumers in a relatively fair and accurate way that generally generates high and robust growth and production. Capitalism was integral to the industrial revolution, where many growing sectors matched these conditions and showcased the benefits of capitalism in spectacular fashion.

However, if any of these conditions are missing, the effects can be negative. When a condition is missing, an otherwise free market may actually create perverse incentives rather than beneficial incentives. For example, when a monopoly occurs, consumers no longer have freedom, thus anti-competition laws are enforced. While many economic sectors fulfill these conditions, it is naive to assume that all do.

Education is a sector where there is general consensus that it is ethically unacceptable to deny some children education due to economic circumstances (especially since such circumstances are usually beyond the control of the children). While a free market still exists with private education, allowing consumers to choose potentially superior services, this sector is supplemented with a public social structure because the ethical condition of capitalism is not fully realized. It is not appropriate to apply capitalism as the only tool, thereby denying some children access to education.

Health care is a sector where consumers intrinsically fall short of rational and free choice. Some services are rendered in distress or emergency, where there is little opportunity for research or alternatives. Other choices are made by specialists, where the buyer isn’t involved (there isn’t a budget/value force towards low prices) or doesn’t understand the options enough to participate. Decisions also involve short-term vs long-term benefits, where humans are notoriously poor at making rational decisions (we tend to choose short-term gains with little concern for the long term). Insurance also plays a buffering role, further removing the consumer from direct value-based decisions. With these conditions for capitalism missing, perverse incentives are present, motivating increasing health care costs rather than a reduction in costs. Predominantly privatized health care has struggled to produce good value (compared to government-provided universal health care), not because capitalism doesn’t work, but because the conditions for capitalism simply do not exist (and it’s unlikely that they can be forced to exist) in health care. This is blatantly evident in the US, one of the few developed countries still languishing with a privatized model. The US spends vastly more than any other country on health care, to the point where statisticians use it as an example of a statistical outlier, and costs are rapidly increasing (since incentives are driving prices up instead of down). In fact, the US government spends more on our privatized health care (where the majority of expenditures are private) than do countries with universal health care (where the government covers the majority of costs), all while we have shorter life spans and higher mortality rates than most other developed countries. Finally, health care also fails to meet the ethical condition. Denying basic health services due to economic circumstances is a moral failure, even if the system were working efficiently.

Computers and the Internet are also creating situations where the demand-driven supply condition is starting to evaporate in certain sectors. An example of this is the music industry. Duplication of music (and other products like software) has effectively reached zero effort. This means that supply can be almost infinite as soon as music is recorded, as it can be distributed with virtually no effort from the producer. Consequently, perverse incentives are present, leading the music industry to create arbitrary constraints (DRM). With the absence of this condition for capitalism, the music industry is incentivized to create demand by reducing/constraining supply (a negative utility to society), rather than working to increase supply (benefitting society). The solution probably is not to socialize the music industry, but we should acknowledge the suboptimal efficiency, and recognize that alternate post-capitalistic models may need to be exercised (and perhaps are already being used; many artists have opted for more of a gift-economy approach).

Of course there will be legitimate disagreements on the degree and type of different economic models needed in different sectors, but we should at least start with the right question. We must start with a proper perspective of economic models as tools, not sacred institutions, that can be applied in differing degrees in different situations, rather than falling for simplistic, overarching false dichotomies. Rather than leaning on our assumptions, the question of what tool to use should be driven by a pragmatic look at what works for the situation.

Gratefulness and Gas

A gallon of gas contains 130 million Joules of energy, equivalent to the energy of lifting a man from sea level to the top of Mt Everest about 15 times, or 31,000 Calories, which is about two weeks’ worth of food. Being able to purchase something with this much energy for less than $30 is incredible and something that we should be extremely grateful for, and something we may not always enjoy.
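These equivalences can be checked with a little arithmetic. The sketch below takes the 130 million Joule figure from the text. The rest are my own assumptions: a 100 kg man (body plus gear), a summit height of about 8,849 m, and a diet of 2,200 Calories a day. It also counts only the potential energy of the lift (mass × gravity × height) and ignores the inefficiency of a human body.

```javascript
var GALLON_JOULES = 130e6;   // from the text
var GRAVITY = 9.81;          // m/s^2
var EVEREST_METERS = 8849;   // approximate summit elevation (assumption)
var MAN_KG = 100;            // assumed mass, including gear
var JOULES_PER_KCAL = 4184;  // 1 food Calorie = 1 kcal = 4184 J
var KCAL_PER_DAY = 2200;     // assumed daily intake

// Potential energy needed to lift the man from sea level to the summit.
var liftJoules = MAN_KG * GRAVITY * EVEREST_METERS;

var lifts = GALLON_JOULES / liftJoules;
var kcal = GALLON_JOULES / JOULES_PER_KCAL;
var days = kcal / KCAL_PER_DAY;

console.log(Math.round(lifts));            // 15
console.log(Math.round(kcal / 1000) * 1000); // 31000
console.log(Math.round(days));             // 14
```

A lighter climber raises the number of lifts, so the "about 15" figure depends on the assumed mass.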

The price of gas is based on simple global supply and demand. Gas comes from oil, which is a fungible resource readily interchangeable on the global market. Supply is dominated by OPEC. The US portion of the supply is very small: it currently produces around 8% of global oil, and holds only 1.5-2% of global reserves (there is some debate on the extent of provable reserves, so it is possible we have up to a percentage point more). On the other hand, the US provides a significant part of the demand (25% of global demand), thus the US’s primary influence on prices is in demand. Economic recession decreases activity, thus decreasing demand and prices; economic growth increases prices. If you don’t like prices, just as with any other commodity, the recourse is to not buy. If you are a buyer, you are a participant in demand.

There are a few other contributions to gas prices that are either small, regional, or short-term. One is the gas tax. The federal gas tax was last raised, to 18.4 cents per gallon, in 1993 (under Clinton, following George H.W. Bush’s 1990 increase), and it has remained at that level ever since. This tax represented about 14% of the price of gas at that time. With current gas prices, this federal tax now only constitutes about 5% of the price of gas. The gas tax under the Obama administration is also at the lowest amount in constant dollars in over two decades. This is also one of the smartest taxes that we have, since it not only improves roads, but incentivizes lower consumption better than any MPG mandates can (I would love to see it increased). Oil futures and trading affect price, but this too has minimal long-term impact, and can help cope with supply fluctuations. Like other global products, inflation and exchange rates affect the current price, but the last few years have seen lower than average inflation and a general strengthening of the dollar against other currencies. Finally, there are different state and local taxes, different regional requirements on the quality of gas, and different transportation costs due to proximity to refineries and sources of production. These are the primary drivers of the differences in gas prices in different locales, but they aren’t related to the overall country-wide gas price trends. For each of these factors, the federal government either has little control or has avoided any price-increasing policy changes.
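Because the tax is a fixed 18.4 cents, its share of the pump price shrinks as prices rise. As a rough sketch, we can back out the pump prices that the 14% and 5% figures imply. The function names are mine, and the resulting prices are derived here, not quoted from any source.

```javascript
var FEDERAL_TAX = 0.184; // dollars per gallon, from the text

// Share of the pump price taken by the federal tax.
function taxShare(pricePerGallon) {
  return FEDERAL_TAX / pricePerGallon;
}

// Pump price implied by a given tax share.
function impliedPrice(share) {
  return FEDERAL_TAX / share;
}

console.log(impliedPrice(0.14).toFixed(2)); // 1.31
console.log(impliedPrice(0.05).toFixed(2)); // 3.68
```

A flat per-gallon tax therefore becomes a smaller fraction of the price whenever gas gets more expensive, without any change in policy.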

To grumble about prices and make it political is generally either ignorant (the principal way the US reduces prices is through economic recession) or manipulative. Drilling or pipelining more has a negligible effect on global supply. And the US is depleting its proven reserves at least 4-5 times faster than most countries in the world; drilling faster now as a way to avoid foreign dependence on oil just shifts even greater dependence to our next generation. As one concerned for children’s future, I don’t want to dump them further into the dead end of oil addiction.

Specifically, the supposed benefits of the Keystone pipeline towards lowering prices are particularly vacuous. Pipelines don’t produce oil, they just move it around, and as many have pointed out, Keystone will move oil to the gulf for easier export, making it likely to actually slightly increase gas prices in the midwest. The real question in regards to Keystone is simply the ethics of increasing the efficiency of tar-sand-based oil transportation versus the ecological impact.

In reality, increased gas prices are basically due to increased economic activity and recovery (the dominant supply factors are mostly out of our control), increased spring/summer driving, and supply concerns with possible conflict in Iran. The primary effect America has had is in economic recovery (and possibly we’ve played a part in some short-term supply concerns by threatening military action against Iran). You can’t eat your recovery and have your cheap oil too.

When it comes to the amazing amount of convenient energy we can buy in a gallon of gas, gratefulness is a better attitude than complaining, and reduced usage is a better response than political manipulation.

Kony 2012

A few thoughts from watching Kony 2012: This is an awesome and inspirational video and an awesome cause worth fighting for. It is fantastic to see attention brought to tragic foreign issues that affect humanity beyond what just impacts our own self-interests. I know there have been criticisms of Invisible Children, but from what I have seen they have done a great job of responding to critiques and disclosing financial and operational information. They are definitely in it to end this injustice.

However, hopefully we don’t end with Kony. Kony and the LRA are just the tip of the iceberg of global injustices. What Invisible Children has courageously demonstrated is a willingness to learn about and expose injustices that have been hidden, and to speak out about them. We are not truly following in their footsteps if all we do is share or watch a viral video. And it is a shallow pursuit of justice if we only react when there is a clear villain we can point our finger at. One of the criticisms of the film is that it oversimplifies the issue, but in reality this should be a criticism of us, and of the fact that we often won’t respond to any issue that is more complicated than what a five-year-old can digest, even for injustices that have claimed far more lives than Kony. Simplification is just what IC had to do to connect with us (and obviously it worked).

Anyway, let’s follow Invisible Children’s lead. Let’s stop Kony. Our voices do indeed matter, and make a real difference. And let’s not stop there, let’s follow their lead in truly learning about other injustices and making our voices heard. And we can indeed reshape human history, both in Uganda and in the rest of the world.

“Where you live should not determine whether you live… At the end of my life I want to say that the world we left behind is one [my child] can be proud of… A place where children no matter where they live, have a childhood free from fear.” – Jason Russell (cofounder of IC)
