The Problem of Induction

(This post is inspired by a tweet by Nathan Oseroff in which he says that we should "retire" the word induction because the word has come to mean so many different things in so many different contexts that the so-called problem of induction is just a useless collection of vaguely related concepts. This is my attempt to give a clear definition of induction and the problem everyone has with it. Yes, I'm aware of this comic.)

Deductive inference is a process by which we come to know that some proposition, Q, is true on the basis of other propositions P_1,...,P_n. The hallmark feature of deductive inference is that if it's employed in the "right" way (e.g. in a valid inference) then we can be absolutely certain that Q is true, given P_1,...,P_n.

Inductive inference (or "induction" for short) is a process by which we come to know that some proposition, Q, is true on the basis of other propositions P_1,...,P_n. In that way, deduction and induction are identical. However, unlike deductive inference, inductive inference never guarantees the truth of Q.

That's where we get the infamous "problem" of induction (which isn't actually a problem but a feature of inductive inference). The problem of induction is a challenge to tell me under what conditions, if any, I can know that Q on the basis of some propositions P_1,...,P_n. We take ourselves to know how this works for valid inferences. What about invalid inferences? Trying to answer that question is what generates the "problem" of induction.

The problem of induction is just the problem of evidential support for invalid inferences. The problem of evidential support asks why some propositions stand in a relation of evidential support to others, and whether we can tell when that relation holds and when it doesn't. We generally take ourselves to have a good answer to this question for deductive inference (most people think the evidential support relation reduces to the relation of validity), so the "problem" with induction is that we don't have a good answer for inductive inference.

As I understand it, there are only two plausible answers to the question of when we can know Q on the basis of P_1,...,P_n (in the case of induction): never and sometimes. (I'm ignoring people who say "always" because I don't think any such people exist.)

People who say "never" are skeptics about induction. People who say "sometimes" are statisticians.

The further, and I think really interesting, question here is whether an "all-purpose" answer to this question exists. Is there a unity of "confirmation" (like Carnap thought) such that deduction and induction both fall under the same broad process? If not, is there at least a good, subject-neutral answer to the question for induction (separate from the one for deduction) that we can apply in cases where deduction won't help us?

Partitions and Graphs

A partition of a set is a way of dividing that set into several non-empty, non-overlapping subsets. For example, the partitions of the set {A,B,C} are as follows:

• A|B|C
• AB|C
• AC|B
• BC|A
• ABC
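
Here's a minimal sketch of how the partitions of a small set can be enumerated recursively. The function names set_partitions and show are my own, and show sorts the blocks, so BC|A comes out as A|BC.

```python
def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (each block a list)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give it a block of its own.
        yield [[first]] + partition


def show(partition):
    """Render a partition in bar notation, e.g. 'AB|C' (blocks are sorted)."""
    return "|".join(sorted("".join(sorted(block)) for block in partition))


all_partitions = [show(p) for p in set_partitions(["A", "B", "C"])]
print(all_partitions)
```

The recursion mirrors the definition: every partition of the whole set comes from a partition of the remaining elements, with the first element either joining an existing block or starting a new one.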

The Bell numbers (OEIS/A000110) count the partitions of a set of n elements, starting from n = 0. The sequence grows pretty fast:

1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147, 115975, 678570, 4213597, 27644437, 190899322, 1382958545, ...
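
One way to reproduce this sequence is the Bell triangle. Each row starts with the last entry of the previous row, and every later entry is the sum of the entry to its left and the entry above that one. The sketch below is just one illustrative way to compute it; the name bell_numbers is mine.

```python
def bell_numbers(count):
    """Return the first `count` Bell numbers, B_0 through B_(count - 1)."""
    numbers = []
    row = [1]
    for _ in range(count):
        # The first entry of each row of the Bell triangle is a Bell number.
        numbers.append(row[0])
        # The next row starts with the last entry of the current row.
        next_row = [row[-1]]
        for value in row:
            # Each new entry is its left neighbour plus the entry above that neighbour.
            next_row.append(next_row[-1] + value)
        row = next_row
    return numbers


print(bell_numbers(16))
```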

Set partitions are interesting for a number of reasons. One reason is that the partitions of a set correspond one-to-one with the equivalence relations on it, so the Bell numbers also count equivalence relations. For a set of n elements, there are B_n ways to treat those elements as equivalent. This puts an upper bound on measurement, since we can understand these equivalence relations as modes of indistinguishability.

Another reason is that Bell numbers also count the ways n labeled nodes can be grouped into connected components. E.g. a simple graph on 3 labeled nodes has one of 5 possible component structures (here A-B-C stands for any graph that connects all three nodes):

• A B C
• A-B C
• A-C B
• B-C A
• A-B-C
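
To check the count, here's a small sketch that builds every simple graph on the labeled nodes A, B, C and groups the graphs by their connected components. The helper components and the variable names are my own. There are 8 such graphs in total, but they collapse to 5 distinct component structures, because several different edge sets connect all three nodes.

```python
from itertools import combinations


def components(nodes, edges):
    """Return the connected components of an undirected graph as a frozenset of frozensets."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links up to the representative of n's component.
        while parent[n] != n:
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)  # merge the two components

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return frozenset(frozenset(g) for g in groups.values())


nodes = ["A", "B", "C"]
possible_edges = list(combinations(nodes, 2))

# Every subset of the possible edges is one simple graph on the labeled nodes.
graphs = [
    edges
    for k in range(len(possible_edges) + 1)
    for edges in combinations(possible_edges, k)
]

structures = {components(nodes, edges) for edges in graphs}
print(len(graphs), len(structures))  # prints: 8 5
```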

Set partitions also describe the possible ways of dividing a set of events into equivalent outcomes for Bayesian learning. If I want to know how to think about the possible outcomes of some experiment, for example, I partition the set of possible observations. However, your choice of partition matters a lot to how you should set your credences. This suggests that you should follow a rule or be otherwise internally consistent in some way in how you partition the set of possible observations. What I'd like to know is whether we can give well-defined conditions under which this requirement is met.
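
Here's a toy illustration of why the choice of partition matters. It assumes a simple indifference rule that is my own choice for the example, not the only way to set credences: give every cell of the partition equal weight, and spread that weight evenly over the outcomes inside the cell. The function name indifference_credence is made up for this sketch.

```python
from fractions import Fraction


def indifference_credence(partition, event):
    """Credence in `event` when each cell gets equal weight, spread evenly within the cell."""
    weight = Fraction(1, len(partition))
    total = Fraction(0)
    for cell in partition:
        overlap = len(set(cell) & set(event))
        total += weight * Fraction(overlap, len(cell))
    return total


fine = [{"A"}, {"B"}, {"C"}]   # the partition A|B|C
coarse = [{"A"}, {"B", "C"}]   # the partition A|BC

print(indifference_credence(fine, {"A"}))    # 1/3
print(indifference_credence(coarse, {"A"}))  # 1/2
```

Same outcome A, same rule, two different credences. The only thing that changed is the partition.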

A Causal Model for the Principle of Alternative Possibilities

The Principle of Alternative Possibilities says that S is morally responsible for P only if it is possible that S did not perform P.

In his 1969 paper "Alternate Possibilities and Moral Responsibility", Harry Frankfurt introduced a much discussed counterexample to the Principle. His counterexample runs as follows:

Suppose that Smith really wants Jones to perform some action. He sets out to ensure that Jones will perform this action. Smith resolves to wait until Jones makes up his mind about what to do before he acts. Once Jones resolves what to do, Smith will step in, if necessary, to ensure that Jones performs the desired action. As Smith watches, Jones deliberates and then elects to do the action Smith desired all along!

Frankfurt concludes from this example that Jones is certainly morally responsible for his action, even though it's true that Jones could not have done otherwise.

In this post I'd like to offer an explanation for the force of the counterexample based on its causal model.

The Model

An oversimplified first pass at the causal model of the situation Jones is in looks like this:

deliberation --> action

In Frankfurt's original case, Smith observes the value that deliberation takes. If it takes a certain value (say 0), then Jones will act in the way Smith does not want. So Smith's plan is conditional: if deliberation takes the value 0, Smith steps in to ensure that action takes the desired value; if deliberation takes the value 1, Smith does nothing. This is effectively a plan to intervene on the variable action.
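
A minimal sketch of this two-variable model, under my own simplifying assumptions: both variables are binary, action simply copies deliberation when nobody interferes, and 1 is the value Smith wants. The function names are hypothetical.

```python
def action_without_smith(deliberation):
    """Structural equation (assumed): action simply copies deliberation."""
    return deliberation


def action_with_smith(deliberation):
    """Smith overrides action only when deliberation takes the undesired value 0."""
    smith_intervenes = deliberation == 0
    action = 1 if smith_intervenes else action_without_smith(deliberation)
    return action, smith_intervenes


for d in (0, 1):
    action, intervened = action_with_smith(d)
    print(f"deliberation={d} action={action} smith_intervened={intervened}")
```

Whatever value deliberation takes, action comes out as 1, so Jones could not have done otherwise. But when deliberation is 1, Smith never touches the model.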

While it's true that Jones could not have done otherwise, this fact isn't due to a feature of the causal model itself but due to the way the intervention is being applied to the model. (I'm just going to stipulate this, although you might think that if a model like this were veridical and deterministic there would be bigger threats to Jones's freedom than Smith.) The performance of the intervention is conditional on the value of deliberation. It seems to me that the force of Frankfurt's example is that Jones is morally responsible for his action (despite being unable to do otherwise) because of the value that deliberation takes.

However, suppose that Smith intervened on deliberation instead. My reaction to this intervention is that Jones isn't responsible at all. Part of this is the way intervention gets formally modeled. There's no way for Smith to observe that deliberation has taken value 0 and then intervene "before" that value is transmitted to action. The intervention and subsequent downstream effects are simultaneous. The only way to get Frankfurt's case is to suppose another exogenous variable exists:

brain activity --> deliberation --> action

Now Smith can observe the value of the brain activity variable and intervene on deliberation to prevent that value from getting passed downstream. I think the main problem with this approach is that it begins to look less and less plausible that Frankfurt's counterexample is going to do the work of undermining the Principle. If the causal model for Jones's behavior looks like the one here, then it's already false that Jones could have done otherwise.
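
The three-variable version can be sketched the same way. As before, the binary values, the copy-the-parent equations, and the name run_chain are assumptions made for illustration only.

```python
def run_chain(brain_activity, smith_present):
    """Simulate brain activity --> deliberation --> action with an optional intervention."""
    # Assumed structural equations: each variable copies its parent.
    deliberation = brain_activity
    intervened = False
    if smith_present and brain_activity == 0:
        # Smith sets deliberation directly, cutting it off from brain activity.
        deliberation = 1
        intervened = True
    action = deliberation
    return {"deliberation": deliberation, "action": action, "intervened": intervened}


for b in (0, 1):
    print(b, run_chain(b, smith_present=True))
```

In this sketch the value of action is fixed by brain activity alone when Smith is absent, and fixed at 1 when Smith is present.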

A few lingering thoughts:

1. It would be interesting to see how the model would differ in a stochastic case.
2. Is the best way to understand Smith's action as an intervention?
3. Could Smith be intervening not on a variable in the model but on the model's structure?