Tranquility has always been memory hungry. For better performance, we have always opted for pre-computing values and processing data and storing the results for later reference rather than re-computing those values again later. As an example, the Brain in a Box and Dogma Rewrite projects in 2015 were all about computing and storing skills and their effects (i.e., the characters' brains) and transferring the computed results between solar systems instead of re-computing the brains on each entry to a new solar system. We also never clean up any memory, as the cluster node memory is reset every day anyway, which is a reliance on a daily reboot (note: we of course don't clear our DB cache memory or our Redis cache memory, but the main simulation cache memory is cleared in the reboot in each downtime).

The most memory-hungry nodes in the Tranquility cluster, the Character Services nodes that store those brains I mentioned above (among other things), were at 75% memory pressure at the end of day #2 last time, which is just below our operating tolerance of 80%. We might be able to run Tranquility for 3 days (and perhaps 7 hours more) if we were to run the cluster to a "first-node-at 100% memory usage" state, given those 2019 numbers. In 2019, the day #1 memory pressure was at 55%, but these days it is around 35% and so we want to rebase our observations.

No-downtime is a long-term goal and all our technological advances aim towards that. We have been working for a few years now on a micro-service and message bus technology platform for EVE, and started using that platform for a number of features. We now want to observe how that ecosystem holds up with no downtime of the primary game cluster, making sure no assumptions have been made about a daily downtime.

See you on Thursday, 9 September, as the second No Downtime experiment commences. And, just like last time, there will be a video coming soon. A whole lotta effort and power is needed for such a heroic stunt.
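The 2019 memory figures quoted above (55% at the end of day #1, 75% at the end of day #2) can be turned into a rough estimate of how long a run could last. The sketch below assumes memory pressure keeps growing by the same amount each day, which is an assumption for illustration rather than a measured property of the cluster. The function name `days_until_limit` is hypothetical.

```python
def days_until_limit(day1_pct: float, day2_pct: float, limit_pct: float = 100.0) -> float:
    """Estimate days of uptime until memory pressure reaches limit_pct.

    Assumes (hypothetically) that memory pressure grows linearly, using the
    end-of-day-1 and end-of-day-2 readings to fix the daily growth.
    """
    growth_per_day = day2_pct - day1_pct
    if growth_per_day <= 0:
        raise ValueError("memory pressure must grow between day 1 and day 2")
    # Start from the day-2 reading and extrapolate forward.
    return 2.0 + (limit_pct - day2_pct) / growth_per_day


# 2019 readings from the text: 55% after day #1, 75% after day #2.
full = days_until_limit(55.0, 75.0)                         # first node at 100%
tolerance = days_until_limit(55.0, 75.0, limit_pct=80.0)    # 80% operating tolerance
print(f"100% after about {full:.2f} days, 80% after about {tolerance:.2f} days")
```

Under that straight-line assumption the first node fills up after about 3.25 days, which is in the same range as the "3 days (and perhaps 7 hours more)" figure, while the 80% operating tolerance would be reached a few hours into day #3. Real memory growth is unlikely to be perfectly linear, so this is only a back-of-the-envelope check.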
Nevertheless, I wanted to start this blog with a concrete example of improvements made in downtime reduction since last time. And now another no-downtime experiment is being planned for September 9. The purpose of this second no-downtime experiment is at least four-fold:

- Verify the fixes made for the issues discovered in the previous experiment in the live production environment.
- Verify that no other code/features have regressed since last time and in general look for further issues.
- Verify that our technology platform (which you will hear more about later) is not making any downtime assumptions.

So what did we discover last time, I hear you ask?

First and foremost we discovered reliance on downtime as an event to mark the beginning of a daily cycle, and a reliance on a daily startup, such as structures not finishing 24+ hour timers and corporations not joining Faction Warfare. We fixed all those issues that we found, and those you reported to us. Now we want to verify them further (of course they have been tested but our test environments don't have Tranquility's scale) and look for more such issues.

We also observed time desynchronization (which we fixed), and significant memory usage (which we improved somewhat). The time desynchronization was a known issue, but last time we were observing whether players noticed at the end of day #2. The target for time desynchronization is a maximum of 0.5 seconds. But with newer hardware, we had been observing an end-of-run desynchronization of 2.25 seconds and - predictably - 4.5 seconds at the end of day #2 in the first no-downtime experiment in 2019. Players started to notice once the desynchronization was above 3 seconds, mostly by noting what felt like module lag or delay when their client and the node hosting their solar system disagreed significantly about when modules were cycling. Time desynchronization is now normally within 1/100 of a second, well within the maximum of 0.5 seconds.
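The 2019 desynchronization readings (2.25 seconds after one day, 4.5 seconds after two) are consistent with a steady drift of 2.25 seconds per day. As a rough sketch, assuming the drift accumulates linearly with uptime (an illustrative assumption, not a description of how the cluster actually keeps time), we can estimate when it would cross the 3-second mark at which players started to notice. The function names are hypothetical.

```python
DRIFT_PER_DAY_S = 2.25  # end-of-day-1 desynchronization observed in 2019


def drift_after(hours_up: float, drift_per_day: float = DRIFT_PER_DAY_S) -> float:
    """Desynchronization in seconds after hours_up hours, assuming linear drift."""
    return drift_per_day * hours_up / 24.0


def hours_until(threshold_s: float, drift_per_day: float = DRIFT_PER_DAY_S) -> float:
    """Hours of uptime before the drift reaches threshold_s, assuming linear drift."""
    return 24.0 * threshold_s / drift_per_day


print(drift_after(48))    # end of day #2
print(hours_until(3.0))   # when players start to notice
```

With these inputs the model reproduces the 4.5 seconds reported for the end of day #2 and puts the 3-second threshold at 32 hours of uptime, that is, partway through the second day.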
Downtime will not become much less than 160-200 seconds; instead, there must first be fewer downtimes and then none at all.
There is a (soft) lower bound of approximately 3 minutes on downtime, given its three activities (shutdown, database jobs, startup), which last approximately 1 minute each. Going below that requires fundamental changes, and the most fundamental one is still to not have any downtime at all.