# EXCEL question



## TarfHead (23 Jul 2014)

I am trying to solve a puzzle using EXCEL.

The puzzle comprises a 9x9 grid, similar to, but not, a SUDOKU, grid.  Each row and each column adds up to a given number.  Similarly, both diagonals (top left to bottom right, bottom left to top right) add up to a given number.  Some cell values are given, the rest are the puzzle, i.e. populate all blank cells in the grid so that they add up to the given numbers.

Across each row and each column, numbers must be in the range 1 through 9, but not each number in the range, and numbers can be repeated.

After trying to manually set values, I decided to put my hope in randomness . In each of the blank cells, I have used the expression =RANDBETWEEN(1,9) and repetitively hit F9.

Can anyone think of a way to automate this approach, i.e. get a script or macro to repetitively refresh workbook until a condition is fulfilled ?


----------



## mathepac (23 Jul 2014)

http://www.mrexcel.com/tip109.shtml

[broken link removed]

lots of others out there as well. HTH


----------



## Orga (24 Jul 2014)

For a 9x9 grid a random approach will take a very long time to solve...something your kids will probably inherit. It would be better to take a more structured approach. It sounds like you have 9 different variables (digits 1-9) but the rules are not clear e.g. to what do they add, are rows/columns of numbers bound by rules? Have a look at matrix algebra solvers for a quicker solution.


----------



## TarfHead (24 Jul 2014)

Thanks for those replies,

Something for me to chew over when I've an hour or three free .


----------



## RichInSpirit (25 Jul 2014)

It sounds interesting all right. The older versions of Excel had visual Basic built in. 2003 is the  latest version I've seen. Don't know if the newer versions still have it.?
I think loops might be the way to go rather than random numbers. 
Eg. For next loops, do until loops etc.
There would be an awful number of loops and loops within loops. But I'd say it could be done.


----------



## dub_nerd (26 Jul 2014)

Sounds like a candidate for a backtracking algorithm. Give me the grid and the rules and I'll give you a solution (with code) in a jiffy. A fun use of backtracking I did a while back was to find all the words on a Boggle board. There are many millions of possible paths, but with a backtracking algorithm I can solve it in milliseconds.


----------



## TarfHead (28 Jul 2014)

dub_nerd said:


> Give me the grid and the rules and I'll give you a solution (with code) in a jiffy.


 
Have a look at this.

FWIW, someone I know solved the puzzle, but I am interested in knowing of other options to solve, and create , such puzzles.


----------



## dub_nerd (28 Jul 2014)

Actually, I'm not sure it's a backtracking problem after all. First guess is it's a set of 22 simultaneous equations on 34 variables that maybe can be solved by Gaussian elimination. Lemme get back to you.


----------



## dub_nerd (28 Jul 2014)

Yes, I think Gaussian elimination is the way to go. It works like this (sorry if I am teaching granny to suck eggs):

Suppose I have this pair of simultaneous equations:






It's clear I can add the two equations to eliminate y and solve for x ... just add the corresponding terms to get this:






Simplify to get:






Now that we know x, we can put its value back into either of the original equations to get 
	

	
	
		
		

		
		
	


	




.

Problem solved. Lets try a more complicated one with three simultaneous equations:






We always look for a way to eliminate one of the variables to get something simpler. In this case, looking at the x's I can see that if I take the first one and subtract twice the second, I can eliminate x. Remember, whatever I do to the x's must also be done to the other terms, so I have to take the first and subtract twice the second, like this:






Then I simplify to get:






Ok, that gives me one equation with two variables. I'll need another one if I want to be able to solve it. So I go back to the original three equations and look at the last two. I see that if I take -3 times the second and add the third, the x's will disappear again:






... which simplifies to:






So now I have two simultaneous equations in two variables:






Using the same old trick, this time I take -5 times the first, and add the second, and -- skipping to the simplification -- I get:






Finally, I have a value for one of my variables. Now I have to go all the way back to my original three equation, and replace all the y's with 3. With the y's gone I will end up with three simultaneous equations in x and z, and I can take any pair of them to solve for either x or z. Armed with _that_ value, I go back once more, substitute in, and get the last variable. In case you wondered, the correct values are:






Whew! All very tedious, but you just follow fairly simple rules to get your answer. Let's take a look at this same problem a slightly different way. The original three equations each had an x term, a y term, and a z term, and a sum on the right hand side. Did it really matter that the variables were called x, y and z? No, it would have worked out exactly the same if it was p, q, and r, or x1, x2 and x3, or anything else. The variable names don't really matter, as long as we know which positions correspond to which variables.

So here's another formulation of the same problem, with the variable names left out, and just the co-efficients retained:






Remember, the meaning of each of the three rows is still "some number of x plus some number of y plus some number of z = some total". I've also just added a label to each equation just to identify it, so the first equation is called R1. (Quick note here that if, say, z didn't appear at all in one of the equations, I could just put a zero in the corresponding column to mean "zero z's"). We'll just saying passing, although it's not important, that the above arrangement is called an _augmented matrix_.

Ok, where's all this getting us? Well, let's follow the steps that we did with the three original equations, using our new notation:






Hopefully you can see what's going on here. The middle line shows that I took the first equation and subtracted twice the original second equation. That gives me a new equation which I've given a new label to: R2a. Similarly with the third line, it's just the exact same as the second substitution step I did with the original equations.

Ok, just to show there's more than one way to skin a cat, this time I'm going to eliminate that -13y as follows:






So now I've taken 13 times R2a and subtract 6 times the original R3a to give a new R3b. From here I pretty much have a value for z, since I know 17z = 85, so z - 5. Now I can substitute that back into the second last equation above (R2a) to get -6y = -18, so y=3. And armed with both y and z, I can go back to R1 and find that x = 2.

Note the final form of the matrix above. The leading diagonal is the line from the top left of the co-efficients to the bottom right, and it splits the matrix in half. Everything to the lower left of the leading diagonal is zero. That's called an upper triangular matrix, and once we have one we have solved for at least one of our variables. That is always our aim. At each step we try to eliminate one co-efficient and make it zero. That's called _Gaussian elimination_. Once we have our upper triangular matrix we solve for all our variables as just noted, using _back substitution_. By tediously following the same simple rules we can solve large systems of simultaneous equations. Note, though, that not every system necessarily _has_ a solution. But let's skip lightly over that for now.

Now, you can easily see how this solves TarfHead's puzzle, right?

If you can, you're a better nerd than me, because we haven't even touched on that yet. We just done a few preliminaries. But we'll get there in the next post.


----------



## dub_nerd (28 Jul 2014)

So here's the grid from TarfHead's puzzle:






I've done a couple of things. I've included the row totals, but left out the column and diagonal totals for simplicity. Each blank square in the original puzzle I have labelled with a *B* for blank and a number in ascending sequence.

So let's look at the top row. We see that it is supposed to total to 47. Let's total the non-blank numbers that we have in the top row. They sum to 26. So to make the total up to 47, we must have:






Let's do the same with the second row. We get:






You can see where this is heading. When we do all the rows we will end up with ten simultaneous equations. They are all in completely different variables, though, so that doesn't help us solve anything.

But now we do the columns. We get another ten simultaneous equations. The diagonals give us two more. We've got 22 simultaneous equations in 34 variables. Is there a solution? Maybe. Or maybe there's more than one solution.

At this stage, none of the original numbers in the puzzle grid actually matter any more. We've eliminated them and just kept the labelled blanks (which have become our variable names) and their associated totals. How would we write an augmented matrix for this? To do an augmented matrix, remember, we drop the variable names and just use the coefficients. But for this to work each equation will need to reference every variable. Remember how we did that? We just put in a zero for every variable that wasn't referenced in any given equation. A variable can only appear once in a row or column, so any coefficient that isn't a zero will be a one. So we'll need 34 columns for our 34 variables, and our augmented matrix will contain 1's and 0's and totals in the right hand column like this (my equation editor breaks down on a matrix this size, so gotta use an uglier table here):



1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 21
0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 10
0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 16
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 12
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 16
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 8
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 26
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0| = 11
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0| = 18
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1| = 15
0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 13
0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0| = 14
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0| = 13
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0| = 12
0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 20
0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 16
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0| = 17
0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1| = 19
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 9
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0| = 20
0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0| = 10
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0| = 23

And that's it. Just solve for x. 

Problems like this crop up regularly in the real world, so I believe there are matrix solvers that do this sort of thing. But I don't have one. Might be able to automate it though. (I already automated producing that table). I'll have a look in the next post. One slight fly in the ointment with solving a large matrix like this is that an upper triangular form rarely pops out by accident. You end up having to move rows around to achieve this. Since each row is an independent equation, you can move it at will without affecting anything.


----------



## tallpaul (29 Jul 2014)

I think dub_nerd officially wins the "Brainiest Poster on AAM" award for 2014. Between this and his work on Prize Bonds, it is just wow!


----------



## TarfHead (29 Jul 2014)

Thank you for your efforts on 'my' behalf.  I won't even pretend that I understand each and every step that you have detailed, but I have promised myself to sit down some time soon with a pot of coffee and work my way through it.

A solution to the puzzle was identified and we were able to locate it on Sunday.


----------



## dub_nerd (29 Jul 2014)

tallpaul said:


> I think dub_nerd officially wins the "Brainiest Poster on AAM" award for 2014. Between this and his work on Prize Bonds, it is just wow!



Kind words, but you may have noticed I don't actually have an answer yet. Without that it's all just guff. 

This turns out to be quite a gnarly constraint problem. There's no instant answer based on Gaussian elimination. I could solve it by brute force. Based on the constraints in my augmented matrix there are only 249,147,904 different combinations possible in the rows, and 1,593,888,768 in the columns. That's solvable by computer in a second or two.

But I wanted something a bit more "civilised" i.e. an approach more akin to what a human might do. Currently I'm pursuing the line that row six of the augmented matrix (which also corresponds to row 6 of the puzzle) has the minimum number of possible solutions at 21. Where to next though? You want something that both minimises the number of possibilities, but also is constrained by what you've already done, i.e. something that intersects with row 6. The puzzle's leading diagonal (i.e. row 21 of the augmented matrix) has more possible solutions (36) than row 6, but would constrain three other rows and columns, so might be a better bet for a starting place. Trading off number of solutions versus number of new constraints introduced seems to be the important heuristic, but not an easy one to get a computer to understand. Not sure if I understand it myself!


----------



## 110quests (29 Jul 2014)

Isn't language a strange thing. I understand every word in dub-nerd's posts, in that I know the meaning of each one and could use each one in context. But when they are assembled as above I don't understand any part of his explanation!!


----------



## dub_nerd (29 Jul 2014)

Ok, apologies, I may have slightly overcomplicated things. That augmented matrix stuff doesn't really get us anywhere, so those big long posts were a bit unnecessary. We do still need simultaneous equations. But I'll start again from the beginning and make it nice 'n' easy:

Here's TarfHead's puzzle again, showing only row totals:







And here, again, I've labelled all the blank entries with unique labels:








And as before, this gives me a set of simultaneous equations which I get by subtracting the the totals _of the digits I'm given_ from the required row totals:







... and so on. Ten rows gives me ten equations. Ten columns gives me ten more. Two diagonals give another two. So I have a total of 22 equations which have to be satisfied simultaneously.

(I'm going to end up saying "row or column or diagonal" a lot here, but there's a mathematical word to cover all three: a _tuple_).

Let's start with the first equation, corresponding to row 1. I have to make three one-digit numbers add up to 21, and I can use the digits 1 to 9. So I could have (3 + 9 + 9) or (4 + 9 + 8) or (4 + 8 + 9) and so on. It's easy to see how you could follow a pattern to generate all the combinations. And I could figure out how many ways there are to make three digits total 21 (there are 28 ways).

So after I pick my first values, say (3 + 9 + 9), I'm going to try another tuple, and see can I find a combination that works there _given that I have set these particular values in row 1_. Which one should I try second? Ideally I should pick the one with the least different possibilities, because that way if things don't work out I should find out quickly and go back to row 1 to try my next combination. For example, having chosen (3 + 9 + 9) I could take a look at column 10. It has a total of 54, but the numbers already given sum to 34. So (B3 + B21 + B25 = 20), but I've already trialled (B3 = 9) in row 1, so now (B21 + B25 = 11) which reduces the possibilities.

The algorithm I wrote works like this. Having chosen a first tuple and tried a set of values, it looks at all the other tuples and calculates which one has the least remaining possibilities (the "most constrained") and starts trying values in that one next. Having tried a first set of values there, it finds the most constrained remaining tuple, and so on.

Now, in most cases it's going to hit a brick wall, finding that it has painted itself into a corner with no way to proceed. In that case it backtracks from the tuple on which it hit a brick wall, and goes back to the previous one it tried. It moves on to the next possible combination there, and tries to proceed again. Eventually it will either find a solution or hit a brick wall with no more values to try.

So there's the essence of the approach. Label the blanks, set up the simultaneous equations, and start trying combinations, always moving to the most constrained tuple next and backtracking when you hit a brick wall. I got some quite surprising results this way.


----------



## dub_nerd (29 Jul 2014)

Results time!

A question I didn't answer in the last post was "which tuple should you start with?". Actually, I tried all 22 of them. Some of them came up with the same solution, but most didn't. Out of 22 different starting points, I got at least 17 valid solutions. And there are doubtless lots more, given that I just take the first one I come across. So the solution isn't unique.

How many values do you have to try in order to get a solution? Surprisingly, that varied wildly depending on where you started. If trialling a set of values for one tuple counts as "one attempt", then if you were unlucky enough to decide to start on column 6, you'd take 127,167 attempts before coming up with a solution -- nearly 10 times more than the next worst.

On the other hand, if you were lucky enough to start on column 4, you'd get to the answer in just 25 attempts. Here's the grid with blanks again, followed by the 25 steps, and the solution:







Try column 4: B16 = 1; B19 = 2; B31 = 9; 
Try row 6: B20 = 1; B21 = 5; 
Try column 7: B18 = 7; B33 = 9; 
Try column 10: B3 = 6; B25 = 9; 
Try row 1: B1 = 6; B2 = 9; 
Try row 10: B32 = 1; B34 = 5; 
Try column 3: B26 = 1; B30 = 6; 
Try row 9: B29 = 3; Stuck -- go back to previous
Try column 3: B26 = 2; B30 = 5; 
Try row 9: B29 = 4; Stuck -- go back to previous
Try column 3: B26 = 3; B30 = 4; 
Try row 9: B29 = 5; 
Try anti-diagonal: B10 = 9; 
Try column 8: B7 = 1; B28 = 4; 
Try row 8: B27 = 4; 
Try lead diagonal: B4 = 1; B17 = 5; 
Try row 5: B15 = 3; 
Try column 2: B12 = 7; 
Try row 4: B13 = 1; B14 = 4; 
Try column 9: B11 = 1; B24 = 4; 
Try row 3: B8 = 1; B9 = 5; 
Try column 1: B22 = 9; 
Try row 7: B23 = 4; 
Try column 5: B5 = 6; 
Try row 2: B6 = 2; 








I won't list the one with 127,167 attempts! 

Anyway, interesting problem, thanks for posting it TarfHead. One last thing of note -- you just needed a subset of the results for your map coordinates. Presumably those would have come out the same no matter which of the many solutions you got. So those particular values are always the same. I haven't checked that, but it's  interesting. Also interesting how the people who created the puzzle figured it out.


----------



## RichInSpirit (30 Jul 2014)

Congratulations Dubnerd !


----------



## dub_nerd (30 Jul 2014)

Someone asked for the code that solved this. There's one main Java class and a number of ancillary ones:

CrossNum.java, Base9.java, Board.java, Combos.java, DigitCombo.java, Tuple.java, Variable.java.


----------



## TarfHead (5 Jan 2015)

I have a different EXCEL question, hope someone here can help.

I used to have a sheet containing the function that did what I want, but now cannot find it.  In other words I know this can be done. just don't know how.

Suppose I have a range of 25 cells (5 rows by 5 columns) which each contain one name.  Across the range 25 cells, there are 4 names used and I want a formula to count the number of times each name occurs in the range.

I know it's a form of COUNTIF, but can't recall the specification of the criteria, e.g. =COUNTIF(A1:E5,"=John"), except a cell reference would be used instead of the literal 'John'.  IIRC, it involved using a single or double ampersand (&).

Thanks


----------



## dub_nerd (7 Jan 2015)

Hi TarfHead, the following works just fine: =COUNTIF($A$1:$E$5,G1)

No ampersands involved, maybe you are thinking of the dollar symbols, for keeping the range reference fixed. That way you can list your names in, say, column G, and copy and paste the above formula into column H such that the searched range stays the same each time.

In general, a dollar in an Excel reference keeps the reference fixed instead of relative if you copy it to another cell. You can fix row and column references separately.


----------



## Crugers (8 Jan 2015)

TarfHead said:


> ...
> I know it's a form of COUNTIF, but can't recall the specification of the criteria, e.g. =COUNTIF(A1:E5,"=John"), except a cell reference would be used instead of the literal 'John'.  IIRC, it involved using a single or double ampersand (&)....



Might you be thinking of COUNTIF*S*?
The COUNTIF*S* equivalent of _=COUNTIF($A$1:$E$5,G1)_ would be =COUNTIF*S*($A$1:$E$5,*"="&*G1).


----------



## TarfHead (8 Jan 2015)

Thanks for those replies.

Yes, the single cell reference does work, so I don't need the previous solution.  Just bugs me that I can't remember, or find that document


----------

