A recent video from Howie Hua showed how if you split a collection of numbers into equal-sized groups, then find the mean of each group, then find the mean of those means, it turns out this final answer is the same as the mean of the original collection. He was careful to say it usually does *not *work if the groups were different sizes. Which got me to wondering: just how much of an effect on the final mean-of-means can you have by splitting a collection of numbers into different-sized groups?

This is how I asked the question:

You have the whole numbers from 1 to 12.

You split them into however many groups you like of whatever sizes you like, so every number is used exactly once.

For example, you could split them like this:

{2,5},{8},{1,6,10,11},{3,4,7,9,12}.Now find the mean of each of these groups.

In the example, those means are

(2+5)/2=3.5, 8, (1+6+10+11)/4=7, (3+4+7+9+12)/5=7.Then find the mean of those means.

In the example, this mean-of-means is

(3.5+8+7+7)/4=6.375.What is the smallest you can make the final mean-of-means, by choosing your groups just right?

Having played with this question online for a day, and in person with others at two sessions of One Hundred Factorial, it has now become one of my favourite problems. It has been particularly fascinating watching other people think about the problem. It seems to defy instincts to begin with, but as people try things out, their instincts develop and they quite quickly have a conjecture about the smallest possible answer. *Proving *this really is the smallest possible answer was actually quite difficult, though I have to say I really enjoyed the process. The ideas that produced the proof came from multiple people, including me, Lyron Winderbaum, Alex Mackay and James Fitton-Gum..

I’m going to include a proof here, because I love the ideas in it so much, and because I happen to find a certain pleasure in writing up a proof.

While writing it, I was struck anew by how different the process of coming up with a proof is from writing one. You can know a thing is true, and know all the reasons why, and be able to explain to someone why, but it’s not the same thing at all as a formal written proof. A formal written proof is a thing all its own, which requires its own special thought processes different from just coming up with or explaining the ideas in a proof to someone live. For a start, a formal proof usually does not run by just giving suggestive examples like an explanation can. You have to define everything carefully and choose variables and terminology, and choose an efficient and logical order for the statements you make, which usually takes *re*ordering along the way. I happen to like this process, even though it can be hard work.

Anyway, the proof is done and I have written it below. It is slightly less formal than the traditional formal proof, but I like it. if you don’t want to see a proof right now, please look away!

In all that follows, assume that n is a fixed natural number.

**Definition 0: **

*Consider a collection of disjoint sets of numbers. When you calculate the mean of each set, and then calculate the mean of those means, the final result is called the mean-of-means of the collection of sets. *

For example, the mean-of-means of the collection of sets

{2,5},{8},{1,6,10,11},{3,4,7,9,12}

is

6.375.

Of particular interest here are the collections of sets that are a partition of the set {1,…,n}. A partition of a set is a collection of disjoint subsets which together contain all the elements of the original set. We are interested in which partition of {1,…,n} has the lowest mean-of-means.

Since the set {1,…,n} is finite, it has only finitely many possible partitions, and so there are only finitely many means-of-means. Therefore there must actually be a lowest mean-of-means.

**Lemma 1:
**

*Consider a collection of natural numbers that add to n. Of all partitions of the set {1,…,n} into subsets of those sizes, choose one with the lowest mean-of-means. In such a partition, if two numbers are in different-sized subsets, then the larger number is in the larger subset.*

For example, if you partition {1,…,12} into subsets of sizes 2,3,3,4, then, out of all the ways to arrange the numbers into sets of those sizes, a way to do it to get the smallest mean-of-means is this:

{1,2},{3,4,5},{6,7,8},{9,10,11,12}.

A different arrangement of the numbers into the same sized subsets is:

{1,2},{3,9,5},{6,7,8},{4,10,11,12}.

And that has a bigger mean-of-means.

In the first case, the mean-of-means is

1/4( 1/2 (1+2) + 1/3 (3+4+5) + 1/3 (6+7+8) + 1/4 (9+10+11+12) )

= 1/4( 1/2+2/2+3/3+4/3+5/3+6/3+7/3+8/3+9/4+10/4+11/4+12/4)

In the second case, the mean of means is

1/4( 1/2 (1+2) + 1/3 (3+9+5) + 1/3 (6+7+8) + 1/4 (4+10+11+12) )

= 1/4( 1/2+2/2+3/3+9/3+5/3+6/3+7/3+8/3+4/4+10/4+11/4+12/4)

Comparing these calculations, everything is the same except that these two terms in the first expression

4/3+9/4

become these two terms in the second

9/3+4/4.

Changing that 4/3 to 9/3 makes the total go up by 5/3. Changing that 9/4 to 4/4 makes the total go down by 5/4. But 5/3 is more than 5/4 and so the total has gone up overall.

This is the argument I’ll use in the proof, though I’ll go the other way, from a partition with the numbers in the wrong order to one with the numbers in the right order.

**Proof of Lemma 1:
**Suppose a partition of {1,…,n} into k subsets has the number p in a subset of size t and the number q>p in a different subset of size s<t. That is, the larger number is in the smaller subset.

The contribution of p and q to the mean-of-means is 1/k (p/t+q/s).

If everything in the partition is kept the same except that p and q are switched, then all numbers will make the same contribution to the mean-of-means except for p and q, whose contribution is now 1/k (p/s+q/t).

So, the change in the mean-of-means is

1/k (p/s+q/t-p/t-q/s)

= 1/k (q/t-p/t -q/s+p/s)

= 1/k ((q-p)/t-(q-p)/s)

= 1/k (q-p)(1/t-1/s)

Now t>s, so 1/t<1/s.

This means 1/t-1/s is negative, and since (q-p) is positive, the change in the mean-of-means is negative.

That is, the mean-of-means has *decreased*.

Therefore, if the partition is to produce the lowest mean-of-means, then for every pair of numbers in subsets of different sizes, the greater number must be in the larger subset. Otherwise the mean-of-means could be reduced.

**End of proof!**

Note that technically this proof does not show that there have to be consecutive numbers in each subset. For example,

{1,2},{3,5,8},{4,6,7},{9,10,11,12}

has the same mean-of-means as

{1,2},{3,4,5},{6,7,8},{9,10,11,12}.

But it doesn’t have a properly *lower* mean-of-means, so that’s ok.

**Lemma 2:
**

*Consider a collection of natural numbers that add to n. Of all partitions of the set {1,…,n} into subsets of those sizes, there is one with lowest mean-of-means where every subset has consecutive numbers.*

**Proof of Lemma 2:
**Suppose a specific partition does have the lowest mean-of-means and consider the subsets of size s. Let p be the smallest number in all the subsets of size s and let q be the largest number in all the subsets of size s.

Consider a number c with p<= c <=q. The number c cannot be in a subset larger than s, or it would have to be larger than q, by Lemma 1. The number c cannot be in a subset smaller than s, or it would have to be smaller than p, also by Lemma 1. So c must be in a subset of size s. So, every number from p to q is in a subset of size s.

If there were only one subset of size s, this shows it contains consecutive numbers. If there is more than one subset of size s, consider again the number c in one of these sets.

If there are k subsets in total, then the contribution of c to the mean-of-means is c/(sk) regardless of which subset of size s it is in. So, the numbers in the subsets of size s can be rearranged into other collections of subsets of size s without changing the mean-of-means. Therefore, it is possible to arrange them with consecutive numbers in each subset of size s.

**End of proof!!**

The next thing I want to prove uses a move like the following.

Start with these two sets

{3,4,5},{6,7,8}.

Their means are 4 and 7.

Move the 5 from one set to the other to get the new sets

{3,4},{5,6,7,8}.

Their means are 3.5 and 6.5.

By moving the largest number in one set to become the smallest number in the other, *both* means have been reduced.

It is always true that if you remove the largest number from a set (that has more than one element), the mean reduces, and it is always true that if you include an extra number in a set smaller than all the ones already there, the mean reduces, but I just feel a need to prove them to myself.

**Lemma 3:
**

*Suppose a set of numbers S has a number p larger than all the others. Then the set S\{p} has a lower mean than S.*

**Proof of Lemma 3:
**Let s be the size of S and the mean of S be σ.

The s-1 numbers in S other than p are all less than p. Therefore their total is less than (s-1)p.

So the total of S is less than (s-1)p+p=sp.

Hence the mean of S is less than sp/s=p. That is, σ<p.

The total of S is sσ, so the total of S/{p} is sσ-p.

Since p>σ, that implies sσ-p<sσ-σ=(s-1)σ,

The mean of S\{p} is

(sσ−p)/(s-1)

<(s-1)σ/(s-1)

=σ.

That is, the mean of S\{p} is less than the mean of S.

**End of proof!**

**Lemma 4:**

*Suppose T is a set of numbers and p is a number less than every number in T. Then the mean of {p}UT is less than the mean of T. *

**Proof of Lemma 4:**

Let the size of T be t and the mean of T be τ. All of the t numbers in T are greater than p, so the total of T is greater than tp. Therefore the mean of T is greater than tp/t=p. That is, τ>p.

The total of T is tτ, so the total of {p}UT is tτ+p. Since p<τ, this implies

tτ+p< tτ+τ=(t+1)τ.

The mean of {p}UT is

(tτ+p)/(t+1)

<(t+1)τ/(t+1)

=τ.

That is, the mean of {p}UT is less than the mean of T.

**End of proof!**

Now I am ready to prove a different way to reduce the mean-of-means.

**Lemma 5:
**

*Let k be a natural number with k<=n. Of all the partitions of {1,…,n} into k subsets, the one with the lowest mean-of-means has the numbers k to n all together in the same subset, and all numbers less than k each in their own subset of size 1.*

For example, when you split {1,…,12} into 4 subsets, the lowest mean-of-means belongs to {1},{2},{3},{4,…,12}.

**Proof of Lemma 5:**

Suppose {1,…,n} is partitioned into k subsets in such a way that there is a subset S with more than one number in it, and there is another subset T all of whose numbers are greater than those in S.

Let p be the largest number in S. Create a new partition by replacing S with S\{p}, replacing T with {p}UT, and leaving all other subsets unchanged. That is, move the number p from S to T.

Since the largest number p has been removed, the mean of S\{p} must be lower than that for S, by Lemma 3. Since a smaller number has been included, the mean of {p}UT must be lower than that for T, by Lemma 4.

Consider the list of k means from the sets in the new partition and compare to the list of k means from the sets in the old partition. All the means are the same except two of them, which are lower in the new partition. Therefore the total of the k means is lower in the new partition compared to the old partition. Since both lists have k means and the total has reduced, the mean-of-means must have reduced also.

Therefore, if there are any subsets with more than one number, whose numbers are below all the numbers in another subset, the mean-of-means is not the lowest possible for partitions with k subsets.

Suppose we do have the partition with k subsets that has the lowest mean-of-means out of all partitions with k subsets. It is necessarily the lowest mean-of-means for all partitions with its particular list of subset sizes.

By Lemma 1, in such a partition, all the biggest numbers are in the biggest subsets. If there are any other subsets, they must have exactly one number each, otherwise the mean-of-means could be reduced by the procedure above.

If there is more than one largest subset, the numbers in them can be rearranged without changing the mean-of-means, as shown in the proof of Lemma 2, so that the set containing n has consecutive numbers. In that case, this partition could not have the lowest mean-of-means for partitions with k subsets as it could again be reduced by the procedure above.

Since there are k subsets in total and all but one of them has size 1, there are k-1 subsets of size 1. The remaining subset contains the remaining largest numbers.

**End of proof!**

It’s worth noting at this point that the arguments above can actually work for any collection of distinct numbers, not just {1,…,n}. You can even modify them to deal with repeated numbers in the list.

I also realise that you could probably argue this in fewer lemmas, or different lemmas, but I just loved some of the arguments there so much I couldn’t bring myself to whittle it down and risk losing them. Sometimes the most efficient proof is not the most instructive or inspiring.

The final step, which does depend on the actual set being {1,…,n} is to find the optimum number of sets in the partition. To get my first answer for that, I used some calculus. But before I used calculus, I had to figure out a function for the lowest mean-of-means based on the number of subsets.

**Lemma 6:**

*Let k be a natural number and consider the partition of the set {1,…,n} which has the numbers k to n all together in the same subset, and all numbers less than k each in their own subset of size 1. That is, the partition {1},…,{k-1},{k,…,n}.*

*The mean-of-means of this partition is 1/2 (k +n/k).*

**Proof of Lemma 6:
**The sum of the numbers 1,…,k-1 is 1/2 (k-1)k.

The mean of the set {k,…,n} is (k+n)/2.

So the total of the means of the subsets in the partition {1},…,{k-1},{k,…,n} is

1/2 (k-1)k+(k+n)/2

= 1/2 ( (k-1)k+k+n)

= 1/2 ( k²-k+k+n)

= 1/2 (k²+n)

Therefore the mean-of-means of the partition is

1/k ∙1/2 (k²+n)

= 1/2 (k+n/k)

**End of Proof!**

Now I am ready to begin finding the number of subsets that has the lowest mean-of-means.

**Lemma 7:
**

*For each natural number k, consider the partition of the set {1,…,n} which has the numbers k to n all together in the same subset, and all numbers less than k each in their own subset of size 1. That is, the partition {1},…,{k-1},{k,…,n}.*

The value of k with the lowest mean-of-means is either √n if it is an integer or it is one of the integers on either side of √n.

The value of k with the lowest mean-of-means is either √n if it is an integer or it is one of the integers on either side of √n.

For example, when you partition {1,…,16}, the lowest mean-of-means happens with √16=4 subsets arranged as {1},{2},{3},{4,…,16}. When you partition {1,…,12}, since √12 is between 3 and 4, the lowest mean-of-means happens with one of either 3 groups or 4 groups.

**Proof of Lemma 7:
**By Lemma 6, the mean-of-means of such a partition is 1/2 (k+n/k)

Consider the function m(k) = 1/2 (k+n/k) for real number k>0. The minimum of this function will happen when its derivative is 0.

dm/dk= 1/2 (1-n/k²) =0

1=n/k²

k²=n

k=√n (since k is positive)

When k<√n, n/k²>1 so dm/dk=1-n/k² is negative.

When k>√n, n/k²<1 so dm/dk=1-n/k² is positive.

Thus as k increases, the function decreases until k=√n and then increases again.

If √n is itself an integer, then the minimum possible value of m occurs when k=√n.

If √n is not an integer, then the minimum value of m for integer k will occur at one of the nearest integers to √n.

**End of proof!**

The important question is, if √n is not an integer, which of the two integers on either side is the correct choice for the number of groups? Looking at the list of answers for several values of n, it appears that the correct k is actually the nearest integer to √n. That is, to find the correct k, round off √n in the usual way.

I wasn’t able to use derivatives to prove this, so I went back and re-proved the previous lemma in a different way that let me figure it out. This meant I should probably have removed Lemma 7 from the argument, but I really liked the argument I made, so I have left it in.

**Lemma 8:
**

*For each natural number k, consider the partition of the set {1,…,n} which has the numbers k to n all together in the same subset, and all numbers less than k each in their own subset of size 1. That is, the partition*{1},…,{k-1},{k,…,n}.

*The value of k with the lowest mean-of-means is ⌊√n⌉, the nearest integer to √n.*

For example, when you partition {1,…,12}, since √12≈3.46 which rounds to 3, the lowest mean-of-means happens with 3 groups.

**Proof of Lemma 8:
**By Lemma 6, the mean-of-means for this partition is 1/2 (k+n/k).

Let m(k) = 1/2 (k+n/k) for natural numbers k.

Given a natural number k, consider how much the value of m changes between k and the next number:

m(k+1)-m(k)

= 1/2 ( (k+1)+n/(k+1) ) – 1/2 (k+n/k)

= 1/2 ( k+1 + n/(k+1) – k -n/k)

= 1/2 (1 + n/(k+1)-n/k)

= 1/2 (1+n(1/(k+1)-1/k) )

= 1/2 (1-n/(k(k+1))

This is negative if k(k+1)<n and positive if k(k+1)>n. So the next value of m(k) is lower until k(k+1) ≥ n, and then after that, the next value is the same or higher.

That means the value of k with the lowest value of m(k) is the first natural value of k with k(k+1)≥n.

k(k+1) ≥ n

(k+1/2-1/2)(k+1/2+1/2) ≥ n

(k+1/2)²-(1/2)² ≥ n

(k+1/2)²-1/4 ≥ n

(k+1/2)² ≥ n +1/4

k+1/2 ≥ √(n+1/4)

k ≥ √(n+1/4) -1/2

So we are looking for the first integer equal to or greater than √(n+1/4) -1/2. Let r= √(n+1/4) -1/2. Then we are looking for ⌈r⌉, the ceiling of r.

Note that ceiling of r must be between r and r+1, so we will find bounds for r and r+1.

First notice r=√(n+1/4)-1/2 > √n-1/2

Now ⌈r⌉ >= r >√n-1/2

So ⌈r⌉>√n-1/2.

Also, from the working above that found the value of r,

(r+1/2)²=n+1/4

<n+2√n∙1/2+1

=(√n+1/2)²

So r+1/2<√n+1/2

So r<√n

Finally, begin again with

(r+1/2)²=n+1/4

(r+1-1/2)²=n+1/4

(r+1)²-2(r+1)∙1/2+1/4=n+1/4

(r+1)²-(r+1)=n

(r+1)²=n+r-1

<n+√n-1

<n+√n+1/4

=(√n+1/2)²

So r+1<√n+1/2

Now ⌈r⌉<r+1<√n+1/2

So ⌈r⌉<√n+1/2

Therefore √n-1/2<⌈r⌉<√n+1/2.

The integer ⌈r⌉ is within 1/2 of √n, which means it must be the nearest integer to √n.

That is, ⌈r⌉= ⌊√n⌉.

That is, the value of k with the lowest mean-of-means is ⌊√n⌉.

**End of proof!**

Later, I came up with a different proof of Lemma 8, but I enjoyed the inequality reasoning and algebra from the proof I first made so much I didn’t want to remove it. So here is the alternative as a bonus.

**Alternative Proof of Lemma 8:
**By Lemma 6, the mean-of-means for this partition is 1/2 (k+n/k).

Let m(k) = 1/2 (k+n/k) for natural numbers k.

Given a natural number k, consider how much the value of m changes between k and the next number:

m(k+1)-m(k)

= 1/2 ( (k+1)+n/(k+1) ) – 1/2 (k+n/k)

= 1/2 ( k+1 + n/(k+1) – k -n/k)

= 1/2 (1 + n/(k+1)-n/k)

= 1/2 (1+n(1/(k+1)-1/k) )

= 1/2 (1-n/(k(k+1))

This is negative if k(k+1)<n and positive if k(k+1)>n. So the next value of m(k) is lower until k(k+1) ≥ n, and then after that, the next value is the same or higher.

That means the value of k with the lowest value of m(k) is the first natural value of k with k(k+1)≥n.

(This is the same as the previous proof up until here.)

Suppose k is the first natural number with k(k+1)≥n.

Then k²+k≥n

k≥n-k²

n-k²≤k

So, if n is above k² then it is within k of k².

Now (k+1)² = k²+2k+1

= k²+k+k+1

≥ n+k+1

So (k+1)² – n ≥ k+1

Therefore, n is below (k+1)² and the nearest it can be to (k+1)² is k+1.

Hence n is closer to k² than to (k+1)².

If k is the first natural number with k(k+1)≥n, then k(k-1)<n.

k²-k < n

k²-k ≤ n-1 (since both k²-k and n are natural numbers)

k²-k+1 ≤ n

k²-n ≤ k-1

So if n is below k², then it is within k-1 of k².

Now (k-1)² = k²-2k+1

= k²-k-k+1

≤ n-1-k+1

= n-k

So (k-1)²-n ≤ -k

So n-(k-1)² ≥ k.

So n is above (k-1)² and the nearest it can be to (k-1)² is k.

Hence n is closer to k² than to (k-1)².

Therefore, k² is the nearest perfect square to n, and so k is the nearest whole number to √n.

**End of proof!**

The common part of the two proofs above has something interesting worth noting. When n=k(k+1), then m(k+1)-m(k)=0, so that means the two nearest values of k to √n actually both give the *same *value for the mean-of-means. For example, the number 12 is 3∙(3+1) so actually the lowest mean-of-means for n=12 happens both for 3 groups and 4 groups.

Now that we know what the partition with the lowest mean-of-means is, all that remains is to find the actual lowest value for the mean-of-means.

**Lemma 9:**

*Let k= ⌊√n⌉ and let d = n-k². That is, let d be the signed distance from the nearest perfect square to n.
Consider the partition of the set {1,…,n} which has the numbers k to n all together in the same subset, with all numbers less than k each in their own subset of size 1. That is, the partition {1},…,{k-1},{k,…,n}.
The mean-of-means of this partition is k+d/(2k).*

For example, the nearest perfect square to 12 is 9=3², and the distance from 9 to 12 is 3. So the mean-of-means of the partition {1},{2},{3,…,12} is 3+3/6 = 3.5.

For another example, the nearest perfect square to 21 is 25=5², and the distance from 25 to 21 is -4. So the mean of means of the partition {1},{2},{3},{4},{5,…,21} is 5-4/10=4.6.

**Proof of Lemma 9:
**By Lemma 6, the mean of means of the partition {1},…{k-1},{k,…,n} is

m=1/2 (k+n/k)

= 1/2 (k²+n)/k

= (k²+n)/(2k)

= (k²+k²+n-k²)/(2k)

= (2k²+d)/(2k)

= k + d/(2k)

**End of proof!**

It is worth noting that if n is itself a perfect square, then the k in Lemma 9 is exactly √n, and the distance d is 0, so the sum-of-sums of the partition {1},…,{k-1},{k,…,n} is exactly k=√n.

Now that the final answers have been found, I will put it all together into a theorem.

**Theorem 10:
**

*Out of all partitions of {1,…,n} into any number of subsets, the lowest mean-of-means possible occurs when the number of subsets is k=⌊√n ⌉, the nearest integer to the square root of n.*

*The lowest mean-of-means occurs for the partition which has the numbers k to n all together in the same subset, and all numbers less than k each in their own subset of size 1. That is, the partition is {1},…,{k-1},{k,…,n}.*

*If d = n-k², the signed distance from the nearest perfect square to n, then the lowest mean-of-means is k+d/(2k).*

**Proof of Theorem 10:
**The proof is given by all the previous lemmas. I will recap here.

Lemma 1: Out of all partitions with a specific list of subset sizes, the ones with the lowest mean-of-means have larger numbers in larger subsets

Lemma 2: Out of all partitions with a specific list of subset sizes, there is one with the lowest mean-of-means that has consecutive numbers in every subset (proven using Lemma 1).

At this point, we know how to choose a partition to get the lowest mean-of-means for any specific list of subset sizes. The next step is to find the lowest mean-of-means out of all partitions with a specific number of subsets.

Lemma 3: Removing the largest number from a set reduces its mean.

Lemma 4: Including a new smallest number in a set reduces its mean.

Lemma 5: Out of all partitions with a specific number k of subsets, the one with the lowest mean-of-means has the numbers k to n all together in the same subset, and all numbers less than k each in their own subset of size 1 (proven using Lemmas 2, 3 and 4). That is, the partition is {1},…,{k-1},{k,…,n}.

At this point, we have found a specific kind of partition which has the lowest mean-of-means for all the partitions using each specific number of subsets. The overall lowest mean-of-means must belong to one of these few partitions.

Lemma 6: The partition {1},…,{k-1},{k,…,n} has mean-of-means is 1/2 (k+n/k).

Lemma 7: Of all partitions of the form {1},…,{k-1},{k,…,n}, the lowest mean-of-means happens when k is either √n if it is an integer or it is one of the integers on either side of √n (proved using Lemma 6).

Lemma 8: Of all partitions of the form {1},…,{k-1},{k,…,n}, the lowest mean-of-means happens when k=⌊√n⌉ , the nearest integer to the square root of n (proved using Lemma 6).

This now means that the lowest mean-of-means possible for all partitions happens for this specific kind of partition with this specific number of subsets.

Lemma 9: For the partition {1},…,{k-1},{k,…,n},with k=⌊√n⌉, the mean of means is k+d/(2k), where d=n-k², the signed distance from the nearest perfect square to n.

Since this specific partition with this specific number of subsets is the one with the lowest mean-of-means, this is indeed the way to calculate the lowest mean-of-means.

**End of proof!!**

A list of examples to round it out.

- The lowest mean-of-means for partitions of {1,…,4}

happens with 2 sets {1},{2,3,4}

and the lowest mean-of-means is 2. - The lowest mean-of-means for partitions of {1,…,5}

happens with 2 sets {1},{2,3,4,5}

and the lowest mean-of-means is 2+1/4=2.25. - The lowest mean-of-means for partitions of {1,…,6}

happens with 2 sets {1},{2,…,6} and 3 sets {1},{2},{3,…,6}

and the lowest mean-of-means is 2+2/4=3-3/6=2.5. - The lowest mean-of-means for partitions of {1,…,7}

happens with 3 sets {1},{2},{3,…,7}

and the lowest mean-of-means is 3-2/6≈2.67. - The lowest mean-of-means for partitions of {1,…,8}

happens with 3 sets {1},{2},{3,…,8}

and the lowest mean-of-means is 3-1/6≈2.83. - The lowest mean-of-means for partitions of {1,…,9}

happens with 3 sets {1},{2},{3,…9}

and the lowest mean-of-means is 3. - The lowest mean-of-means for partitions of {1,…,10}

happens with 3 sets {1},{2},{3,…,10}

and the lowest mean-of-means is 3+1/6≈3.17. - The lowest mean-of-means for partitions of {1,…,11}

happens with 3 sets {1},{2},{3,…,11}

and the lowest mean-of-means is 3+2/6≈3.33. - The lowest mean-of-means for partitions of {1,…,12}

happens with 3 sets {1},{2},{3,…,12} and 4 sets {1},{2},{3},{4,…,12}

and the lowest mean-of-means is 3+3/6=4-4/8=3.5. - The lowest mean-of-means for partitions of {1,…,13}

happens with 4 sets {1},{2},{3},{4,…,13}

and the lowest mean-of-means is 4-3/8=3.625. - The lowest mean-of-means for partitions of {1,…,14}

happens with 4 sets {1},{2},{3},{4,…,14}

and the lowest mean-of-means is 4-2/8=3.75. - The lowest mean-of-means for partitions of {1,…,15}

happens with 4 sets {1},{2},{3},{4,…,15}

and the lowest mean-of-means is 4-1/8=3.875. - The lowest mean-of-means for partitions of {1,…,16}

happens with 4 sets {1},{2},{3},{4,…,16}

and the lowest mean-of-means is 4. - The lowest mean-of-means for partitions of {1,…,17}

happens with 4 sets {1},{2},{3},{4,…,17}

and the lowest mean-of-means is 4+1/8=4.125.

That’s enough. Thanks for reading, and I hope you enjoyed it.

Very fun problem, nice solution! (Noticed a typo between Lemma 9 and its proof: 3+3/6 is 3.5, not 3/5.)

Thanks James! And thanks for catching the typo — getting other people to read something is the easiest way to catch typos. 🙂

Note that because of Lemma 2, you do not yet have a proof of the complete description of the optimal sets, even though the list of examples at the end seems to suggest this is what you’re thinking of.

To fill in that gap, we just have to make the obvious observation that for each of those partitions P, there is no other partition P* that can result in P after going through the manipulations described in Lemma 2. (In fact, this is true for the partitions described in Lemma 6.)

I’m not 100% sure what you’re getting at. The lowest mean-of-means for a specific collection of subset sizes happens with bigger numbers in bigger subsets, and the proof of Lemma 2 shows that if there’s more than one set with the same size in that lowest mean-of-means partition then all the rearrangements of those numbers in sets of the same size have the same mean-of-means.

If you have the lowest mean-of-means for a specific number of subsets, it has to be the lowest mean-of-means for its own collection of subset sizes, so must be arranged like I described in Lemma 5.

I have edited the proof of Lemma 5 to make it clearer how the argument works, and this may be what you were talking about after all.

>> (…) when you split {1,…,12} into 4 subsets, the lowest mean-of-means belongs to {1},{2},{3},{5,…,12}.

This statement is false. Your split does not contain number 4.

Oops. Just a typo! I’ll go fix it now.