According to the engine output text, it seems that the engine process is re-initialized for each request. This seems inefficient and makes the response latency too long. Why not just keep the process running and handle each request with a simple "position … go …" instruction sequence?
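For context, here's a minimal sketch of what the question is proposing: one long-lived engine process that handles each request with a "position … go" pair over the UCI protocol, instead of paying initialization cost every time. The `PersistentEngine` class and the stub engine are illustrative only; a real deployment would point `argv` at an actual engine binary.

```python
import subprocess
import sys

# Stub that mimics the handful of UCI responses we need, so this sketch is
# runnable without a real engine binary. A real engine (e.g. stockfish)
# would search instead of answering instantly.
STUB_ENGINE = r'''
import sys
for line in sys.stdin:
    cmd = line.strip()
    if cmd == "uci":
        print("uciok", flush=True)
    elif cmd == "isready":
        print("readyok", flush=True)
    elif cmd.startswith("go"):
        print("bestmove e2e4", flush=True)
    elif cmd == "quit":
        break
'''

class PersistentEngine:
    """Keep one engine process running; send 'position ... go' per request."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )
        self._send("uci")
        self._read_until("uciok")  # one-time initialization cost

    def _send(self, cmd):
        self.proc.stdin.write(cmd + "\n")
        self.proc.stdin.flush()

    def _read_until(self, token):
        for line in self.proc.stdout:
            if line.strip() == token:
                return

    def bestmove(self, fen, movetime_ms=1000):
        # Reset per-request engine state, then run one search.
        self._send("ucinewgame")
        self._send("isready")
        self._read_until("readyok")
        self._send(f"position fen {fen}")
        self._send(f"go movetime {movetime_ms}")
        for line in self.proc.stdout:
            if line.startswith("bestmove"):
                return line.split()[1]

    def quit(self):
        self._send("quit")
        self.proc.wait()

engine = PersistentEngine([sys.executable, "-c", STUB_ENGINE])
move = engine.bestmove(
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
)
print(move)
engine.quit()
```

The `ucinewgame` / `isready` handshake between requests is what addresses the shared-state worry mentioned below: it tells the engine to clear game-specific state without restarting the process.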
That is a very good question, and the short answer is that I don't know.
The main reason for creating a new process for each request is that it frees us from having to worry about shared process state.
We have a bunch of backend servers to handle calculation requests, so requests from a single user will likely land on different servers each time. Assuming we scope processes to users (and, necessarily, to the requested engine software), that could result in a lot of long-running processes per server.
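One way to keep "a lot of long-running processes" bounded would be a per-server pool keyed by (user, engine) with least-recently-used eviction. This is purely a hypothetical design sketch, not how the site works today; `spawn` and `kill` are stand-ins for whatever actually starts and terminates an engine process.

```python
from collections import OrderedDict

# Hypothetical per-server pool that caps how many long-running engine
# processes a server keeps around. When the cap is hit, the least-recently-
# used process is terminated to make room.
class EnginePool:
    def __init__(self, max_procs, spawn, kill):
        self.max_procs = max_procs
        self.spawn = spawn      # callable: (user, engine) -> process handle
        self.kill = kill        # callable: process handle -> None
        self.procs = OrderedDict()

    def get(self, user, engine):
        key = (user, engine)
        if key in self.procs:
            self.procs.move_to_end(key)     # mark as most recently used
            return self.procs[key]
        if len(self.procs) >= self.max_procs:
            _, victim = self.procs.popitem(last=False)  # evict LRU entry
            self.kill(victim)
        proc = self.spawn(user, engine)
        self.procs[key] = proc
        return proc

# Toy usage with stub spawn/kill so the sketch is self-contained:
killed = []
pool = EnginePool(
    max_procs=2,
    spawn=lambda user, engine: f"proc:{user}:{engine}",
    kill=killed.append,
)
pool.get("alice", "stockfish")
pool.get("bob", "stockfish")
pool.get("alice", "stockfish")   # cache hit: refreshes alice's entry
pool.get("carol", "lc0")         # evicts bob, the least recently used
print(killed)                    # -> ['proc:bob:stockfish']
```

The cap would have to be tuned against RAM, since each engine process can hold a large hash table.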
One of the features of NCM Pro is that each calculation request gets 100% of the server's CPU cycles, RAM, SSD disk I/O for syzygy tablebases, etc. So I'd need to make sure that keeping long-running processes around doesn't compromise that.
I'm wondering if maybe I can simply send a SIGSTOP to those processes after a calculation has completed, and resume them as needed?
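A quick sketch of what that would look like, assuming Linux: SIGSTOP parks the process (it keeps its memory but gets no CPU, so it can't steal cycles from the request that currently owns the server), and SIGCONT resumes it. A `sleep` process stands in for an idle engine here.

```python
import os
import signal
import subprocess
import time

# Stand-in for an idle engine process.
proc = subprocess.Popen(["sleep", "60"])

os.kill(proc.pid, signal.SIGSTOP)       # park it: no CPU until resumed
time.sleep(0.2)
with open(f"/proc/{proc.pid}/stat") as f:
    parked_state = f.read().split()[2]  # Linux: 'T' means stopped
print("state while parked:", parked_state)

os.kill(proc.pid, signal.SIGCONT)       # resume for the next request
time.sleep(0.2)
with open(f"/proc/{proc.pid}/stat") as f:
    resumed_state = f.read().split()[2] # 'S' (sleeping) once running again
print("state after resume:", resumed_state)

proc.terminate()
proc.wait()
```

One caveat: a stopped process still holds its RAM (hash tables, tablebase caches), so parking alone wouldn't protect the "100% of the server's RAM" guarantee.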
Anyway, I think we can reduce latency! Currently our load balancers, frontend servers, and database servers are at AWS, and the NCM Pro backends are ~90-100ms away at different providers. I suspect we can reduce latency by A) as you mentioned, being smarter about process creation, and B) reducing network chatter between the AWS and non-AWS servers.