Changes between Version 53 and Version 54 of Check
- Timestamp:
- Oct 29, 2010, 1:02:49 PM (14 years ago)
Legend:
- Unmodified
- Added
- Removed
- Modified
Check
v53 v54
405 405   [[Image(sh-multi island failing.png, center, 600px)]]
406 406
407     - Starting the Self-healing service changes this situation. Once a node fails, the service is able to start a node in either island. In the next screenshot, you can see how turning off the service on node 0001 in island2 makes the Self-healing service turn on node 0002 in island1:
    407 + Starting the Self-healing service changes this situation. Once a node fails, the service is able to start a node in either island. In the next screenshot, you can see how turning off the service on node 0001 in island2 (browser window) makes the Self-healing service turn on node 0002 in island1 (lower terminal):
408 408
409 409   [[Image(sh-multi island new node.png, center, 600px)]]
410 410
411     - Please note that not a single sensor reading in the test client suffered from a lack of sensors. In the next section, we look at the timings in more detail.
    411 + Please note that not a single sensor reading in the test client suffered from a lack of sensors (upper terminal). In the next section, we look at the timings in more detail.
412 412
413 413   === Performance ===
414 414
    415 + The lag between detecting a node failure and restoring the service, by turning it on in another node, is usually a few milliseconds. We have taken a few measurements, for both local islands and remote islands. In these examples, the Self-healing service is running in the same network as the Bucharest island and we introduce node failures in both islands.
    416 +
415 417   [[Image(sh-performance.png, center, 600px)]]
416 418
    419 + The data from the above screenshot is better shown in this table:
    420 +
    421 + ||= Test no. =||= Node failure =||= Detection time =||= Service restored time =||= Lag =||
    422 + || 1 || remote || || || ||
    423 + || 2 || remote || || || ||
    424 + || 3 || remote || || || ||
    425 + || 4 || remote || || || ||
    426 +
    427 + You may notice that the mean restore time for local failures is ~ and the mean restore time for remote failures is ~.
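The lag column in the Performance table is the service restored time minus the detection time, and the closing sentence compares the mean lag for local and remote failures. A minimal Python sketch of that arithmetic follows. The function names `lag_ms` and `mean_lag_by_kind` are hypothetical, and the sample values are placeholders chosen only to show the calculation; they are not the measurements from the screenshot, which this page leaves blank.

```python
from collections import defaultdict
from statistics import mean


def lag_ms(detected_ms, restored_ms):
    """Lag between detecting a node failure and restoring the service."""
    return restored_ms - detected_ms


def mean_lag_by_kind(measurements):
    """Group (kind, detected_ms, restored_ms) rows by failure kind and average the lag."""
    groups = defaultdict(list)
    for kind, detected_ms, restored_ms in measurements:
        groups[kind].append(lag_ms(detected_ms, restored_ms))
    return {kind: mean(lags) for kind, lags in groups.items()}


# Placeholder timestamps in milliseconds (assumed, NOT measured values).
sample = [
    ("local", 1000.0, 1002.0),
    ("local", 2000.0, 2004.0),
    ("remote", 3000.0, 3010.0),
    ("remote", 4000.0, 4020.0),
]

print(mean_lag_by_kind(sample))
```

Filling `sample` with the real detection and restoration times would yield the two means that the last sentence of the section leaves open.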
417 428
418 429   = Proposed Check Demonstration Sketch =