What does the strongest search mean?

I see that when you use Stockfish to search, it says “strongest”, which I take to mean it uses the strongest build. Yet when I select the strongest build available from the drop-down menu, it regularly gives a different move.
How so?

That’s correct – Stockfish is the stronger of the two engines, and generally the latest dev build is going to be the strongest version of Stockfish.

I’m not 100% sure I understand the question. Did you mean to say that it rarely gives a different move? Like, you can calculate a move using Stockfish 8, and then do the same with the latest dev build, and it yields the same result?

I’ve only just started to study the Stockfish source code, so I can’t say exactly how they differ, but the dev build tests confirm that the latest dev build must play differently than Stockfish 8. It might be interesting to see if I can find some positions for which the latest dev build and Stockfish 8 consistently disagree on a move.

That’s what I’m going to do.

I’ll generate a bunch of positions by having Stockfish play itself a few times, and for each position, analyze it a bunch of times with both the dev build and Stockfish 8 to see if I can find disagreements.
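For anyone who wants to try the same thing, here is a rough Python sketch of the comparison step only. It assumes the best moves from each run have already been collected; the function name, the engine labels, and the sample data are all made up for illustration, and this is not the script used for the test in this thread.

```python
def consistent_disagreements(results):
    """Find positions where the engines consistently disagree.

    `results` maps a position (e.g. a FEN string) to a dict of
    {engine_name: [best move from each repeated run]}.
    A position counts only if every engine chose the same move on all
    of its own runs, and the engines' moves differ from one another.
    """
    found = []
    for position, by_engine in results.items():
        # Collapse each engine's repeated runs into the set of moves it chose.
        chosen = {name: set(moves) for name, moves in by_engine.items()}
        if any(len(moves) != 1 for moves in chosen.values()):
            continue  # some engine disagreed with itself, so skip
        distinct = {next(iter(moves)) for moves in chosen.values()}
        if len(distinct) > 1:
            found.append(position)
    return found


# Made-up data: placeholder position labels and moves, not real results.
example = {
    "pos-a": {"sf8": ["e2e4", "e2e4"], "dev": ["d2d4", "d2d4"]},
    "pos-b": {"sf8": ["e2e4", "g1f3"], "dev": ["d2d4", "d2d4"]},
    "pos-c": {"sf8": ["e2e4", "e2e4"], "dev": ["e2e4", "e2e4"]},
}
print(consistent_disagreements(example))
```

Requiring each engine to agree with itself first filters out positions where the difference is just run-to-run noise rather than a real difference between builds.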

I’ll post what I find back here.

I often misinterpret questions and go way off in the wrong direction, so just let me know if I’ve done that in this instance :slight_smile:

I have some data! I did a very crude test:

I had Stockfish 8 and the latest dev build (20171218-1532) each play itself a few times at a 30+0.3 time control, took the resulting positions, threw out the early positions, took a random sample, and had both Stockfish 8 and 20171218-1532 analyze those positions twice each.
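The "throw out the early positions, take a random sample" step could look something like the Python sketch below. The cutoff (`min_fullmove`), the sample size, and the seed are assumptions picked for illustration; the actual values used for this test aren't stated here.

```python
import random


def fullmove_number(fen):
    # The sixth space-separated FEN field is the fullmove number.
    return int(fen.split()[5])


def sample_positions(fens, min_fullmove=10, sample_size=50, seed=0):
    """Drop positions before `min_fullmove`, then take a random sample.

    All three keyword defaults are hypothetical, not the values used
    in the test described above.
    """
    eligible = [fen for fen in fens if fullmove_number(fen) >= min_fullmove]
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    return rng.sample(eligible, min(sample_size, len(eligible)))
```

Filtering on the fullmove counter is just one simple way to define "early"; anything that skips the opening moves would serve the same purpose.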

One thing I noticed: the shorter the think time, the more likely the results were to disagree. Specifically, on my laptop (the cheapest ThinkPad T-470 I could find), the engines seemed more likely to arrive at different results with a think time of 1 second and threads set to 2.

So I tried to tighten up the results by setting threads to 4 (total physical cores on the machine) and think time to 5 seconds.
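At the UCI protocol level, those two settings correspond to the `Threads` option and a `go movetime` search (in milliseconds). A minimal Python sketch of the command sequence follows; the helper name is made up, and it only builds the text an engine would be sent rather than launching one.

```python
def uci_search_commands(fen, threads=4, movetime_ms=5000):
    """Build the UCI commands for a fixed-time search of one position.

    Defaults mirror the settings described above (4 threads, 5 seconds).
    The function itself is a hypothetical helper, not part of any tool.
    """
    return [
        "uci",
        f"setoption name Threads value {threads}",
        "isready",
        f"position fen {fen}",
        f"go movetime {movetime_ms}",
    ]


for line in uci_search_commands("8/8/2bP2k1/1p5p/p1p1P1pK/P1P1B3/1P6/8 b - - 15 67"):
    print(line)
```

Because `movetime` limits wall-clock time rather than search depth, how far the engine gets depends on the hardware, and a multi-threaded search is generally not repeatable from run to run, so some disagreement between runs is expected even with the same build.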

Here’s the raw output:

output.txt (104.9 KB)

Some of the ones marked as different were not reproducible on NCM, at least with 15-second think times on the D-1540s, but many were, for example:

  • r1q4k/2p2pp1/ppp2Nbp/2b1P1B1/3R4/5Q2/PP4PP/3R2K1 w - - 0 24
  • 8/8/2bP2k1/1p5p/p1p1P1pK/P1P1B3/1P6/8 b - - 15 67
  • 2krr3/1pp3p1/p1p2p2/5b1p/2PP4/4RN1P/PP3PP1/4R1K1 b - - 1 23

I’m guessing the difference is some combination of hardware / think time, and maybe even the fact that I was just pasting the FEN into NCM as opposed to loading the entire move history for Stockfish to consider.
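To make the FEN versus move history distinction concrete: UCI lets a position be set either from a bare FEN or from the start position plus the moves played so far. With only a FEN, the engine has no record of earlier positions, so it cannot account for repetitions that happened before that point. The Python sketch below just builds the two command forms; the helper names are made up, and how any particular site or GUI actually passes positions to the engine is not shown here.

```python
def position_from_fen(fen):
    # The engine sees only this one position, with no game history.
    return f"position fen {fen}"


def position_from_history(moves):
    # The engine replays the game from the start position, so it knows
    # every earlier position. Moves are in UCI long algebraic form.
    if not moves:
        return "position startpos"
    return "position startpos moves " + " ".join(moves)


print(position_from_fen("8/8/2bP2k1/1p5p/p1p1P1pK/P1P1B3/1P6/8 b - - 15 67"))
print(position_from_history(["e2e4", "e7e5", "g1f3"]))
```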