[35177] 765.17.237.150 I reasoning-budget: budget=0, forcing immediately [35177] 765.17.237.151 I reasoning-budget: forced sequence complete, done [35177] 765.17.237.259 I slot launch_slot_: id 0 | task 44677 | processing task, is_child = 0 [35177] 765.17.318.824 I slot create_check: id 0 | task 44677 | created context checkpoint 5 of 32 (pos_min = 41602, pos_max = 41602, n_tokens = 41603, size = 149.626 MiB) [35177] 765.18.772.156 I slot print_timing: id 0 | task 44677 | prompt eval time = 140.18 ms / 61 tokens ( 2.30 ms per token, 435.14 tokens per second) [35177] 765.18.772.159 I slot print_timing: id 0 | task 44677 | eval time = 1394.70 ms / 91 tokens ( 15.33 ms per token, 65.25 tokens per second) [35177] 765.18.772.159 I slot print_timing: id 0 | task 44677 | total time = 1534.88 ms / 152 tokens [35177] 765.18.772.160 I slot print_timing: id 0 | task 44677 | graphs reused = 42955 [35177] 765.18.773.358 I slot release: id 0 | task 44677 | stop processing: n_tokens = 41752, truncated = 0 [35177] 765.18.773.555 I srv update_slots: all slots are idle 790.04.294.799 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 765.19.178.487 I srv params_from_: Chat format: peg-native [35177] 765.19.228.188 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 765.19.230.596 I reasoning-budget: activated, budget=0 tokens [35177] 765.19.230.599 I reasoning-budget: budget=0, forcing immediately [35177] 765.19.230.599 I reasoning-budget: forced sequence complete, done [35177] 765.19.230.722 I slot launch_slot_: id 0 | task 44771 | processing task, is_child = 0 [35177] 765.20.815.631 I slot print_timing: id 0 | task 44771 | n_decoded = 100, tg = 66.06 t/s [35177] 765.20.951.595 I slot print_timing: id 0 | task 44771 | prompt eval time = 71.03 ms / 53 tokens ( 1.34 ms per token, 746.17 tokens per second) [35177] 765.20.951.598 I slot print_timing: id 0 | task 44771 | eval time = 1649.83 ms / 109 tokens ( 15.14 ms per token, 66.07 tokens per second) [35177] 765.20.951.598 I slot print_timing: id 0 | task 44771 | total time = 1720.86 ms / 162 tokens [35177] 765.20.951.599 I slot print_timing: id 0 | task 44771 | graphs reused = 43062 [35177] 765.20.952.810 I slot release: id 0 | task 44771 | stop processing: n_tokens = 41913, truncated = 0 [35177] 765.20.953.004 I srv update_slots: all slots are idle 790.16.432.840 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 765.31.311.453 I srv params_from_: Chat format: peg-native [35177] 765.31.362.939 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 765.31.365.321 I reasoning-budget: activated, budget=0 tokens [35177] 765.31.365.323 I reasoning-budget: budget=0, forcing immediately [35177] 765.31.365.323 I reasoning-budget: forced sequence complete, done [35177] 765.31.365.433 I slot launch_slot_: id 0 | task 44883 | processing task, is_child = 0 [35177] 765.31.448.513 I slot create_check: id 0 | task 44883 | created context checkpoint 6 of 32 (pos_min = 41914, pos_max = 41914, n_tokens = 41915, size = 149.626 MiB) [35177] 765.32.379.105 I slot print_timing: id 0 | task 44883 | prompt eval time = 141.90 ms / 61 tokens ( 2.33 ms per token, 429.89 tokens per second) [35177] 765.32.379.108 I slot print_timing: id 0 | task 44883 | eval time = 871.75 ms / 57 tokens ( 15.29 ms per token, 65.39 tokens per second) [35177] 765.32.379.109 I slot print_timing: id 0 | task 44883 | total time = 1013.65 ms / 118 tokens [35177] 765.32.379.109 I slot print_timing: id 0 | task 44883 | graphs reused = 43116 [35177] 765.32.380.334 I slot release: id 0 | task 44883 | stop processing: n_tokens = 42030, truncated = 0 [35177] 765.32.380.533 I srv update_slots: all slots are idle 790.36.594.754 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 765.51.477.428 I srv params_from_: Chat format: peg-native [35177] 765.51.528.309 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 765.51.530.815 I reasoning-budget: activated, budget=0 tokens [35177] 765.51.530.817 I reasoning-budget: budget=0, forcing immediately [35177] 765.51.530.818 I reasoning-budget: forced sequence complete, done [35177] 765.51.530.927 I slot launch_slot_: id 0 | task 44943 | processing task, is_child = 0 [35177] 765.53.087.118 I slot print_timing: id 0 | task 44943 | prompt eval time = 84.18 ms / 61 tokens ( 1.38 ms per token, 724.61 tokens per second) [35177] 765.53.087.122 I slot print_timing: id 0 | task 44943 | eval time = 1471.99 ms / 97 tokens ( 15.18 ms per token, 65.90 tokens per second) [35177] 765.53.087.122 I slot print_timing: id 0 | task 44943 | total time = 1556.17 ms / 158 tokens [35177] 765.53.087.123 I slot print_timing: id 0 | task 44943 | graphs reused = 43211 [35177] 765.53.088.362 I slot release: id 0 | task 44943 | stop processing: n_tokens = 42187, truncated = 0 [35177] 765.53.088.557 I srv update_slots: all slots are idle 790.48.570.552 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 766.03.460.695 I srv params_from_: Chat format: peg-native [35177] 766.03.517.724 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 766.03.520.100 I reasoning-budget: activated, budget=0 tokens [35177] 766.03.520.102 I reasoning-budget: budget=0, forcing immediately [35177] 766.03.520.103 I reasoning-budget: forced sequence complete, done [35177] 766.03.520.223 I slot launch_slot_: id 0 | task 45043 | processing task, is_child = 0 [35177] 766.03.607.589 I slot create_check: id 0 | task 45043 | created context checkpoint 7 of 32 (pos_min = 42188, pos_max = 42188, n_tokens = 42189, size = 149.626 MiB) [35177] 766.04.519.821 I slot print_timing: id 0 | task 45043 | prompt eval time = 146.29 ms / 61 tokens ( 2.40 ms per token, 416.98 tokens per second) [35177] 766.04.519.825 I slot print_timing: id 0 | task 45043 | eval time = 853.29 ms / 56 tokens ( 15.24 ms per token, 65.63 tokens per second) [35177] 766.04.519.826 I slot print_timing: id 0 | task 45043 | total time = 999.58 ms / 117 tokens [35177] 766.04.519.827 I slot print_timing: id 0 | task 45043 | graphs reused = 43265 [35177] 766.04.521.266 I slot release: id 0 | task 45043 | stop processing: n_tokens = 42303, truncated = 0 [35177] 766.04.521.560 I srv update_slots: all slots are idle 791.08.723.346 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 766.23.604.100 I srv params_from_: Chat format: peg-native [35177] 766.23.653.796 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 766.23.656.139 I reasoning-budget: activated, budget=0 tokens [35177] 766.23.656.141 I reasoning-budget: budget=0, forcing immediately [35177] 766.23.656.141 I reasoning-budget: forced sequence complete, done [35177] 766.23.656.250 I slot launch_slot_: id 0 | task 45102 | processing task, is_child = 0 [35177] 766.25.128.472 I slot print_timing: id 0 | task 45102 | prompt eval time = 84.35 ms / 61 tokens ( 1.38 ms per token, 723.15 tokens per second) [35177] 766.25.128.476 I slot print_timing: id 0 | task 45102 | eval time = 1387.86 ms / 92 tokens ( 15.09 ms per token, 66.29 tokens per second) [35177] 766.25.128.476 I slot print_timing: id 0 | task 45102 | total time = 1472.21 ms / 153 tokens [35177] 766.25.128.477 I slot print_timing: id 0 | task 45102 | graphs reused = 43355 [35177] 766.25.129.699 I slot release: id 0 | task 45102 | stop processing: n_tokens = 42455, truncated = 0 [35177] 766.25.129.895 I srv update_slots: all slots are idle 791.21.188.104 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 766.36.084.595 I srv params_from_: Chat format: peg-native [35177] 766.36.133.826 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 766.36.136.198 I reasoning-budget: activated, budget=0 tokens [35177] 766.36.136.200 I reasoning-budget: budget=0, forcing immediately [35177] 766.36.136.200 I reasoning-budget: forced sequence complete, done [35177] 766.36.136.315 I slot launch_slot_: id 0 | task 45197 | processing task, is_child = 0 [35177] 766.36.222.987 I slot create_check: id 0 | task 45197 | created context checkpoint 8 of 32 (pos_min = 42456, pos_max = 42456, n_tokens = 42457, size = 149.626 MiB) [35177] 766.37.126.774 I slot print_timing: id 0 | task 45197 | prompt eval time = 145.87 ms / 61 tokens ( 2.39 ms per token, 418.18 tokens per second) [35177] 766.37.126.777 I slot print_timing: id 0 | task 45197 | eval time = 844.57 ms / 56 tokens ( 15.08 ms per token, 66.31 tokens per second) [35177] 766.37.126.777 I slot print_timing: id 0 | task 45197 | total time = 990.44 ms / 117 tokens [35177] 766.37.126.778 I slot print_timing: id 0 | task 45197 | graphs reused = 43409 [35177] 766.37.128.055 I slot release: id 0 | task 45197 | stop processing: n_tokens = 42571, truncated = 0 [35177] 766.37.128.252 I srv update_slots: all slots are idle 791.52.575.578 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 767.07.484.627 I srv params_from_: Chat format: peg-native [35177] 767.07.538.664 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 767.07.541.213 I reasoning-budget: activated, budget=0 tokens [35177] 767.07.541.216 I reasoning-budget: budget=0, forcing immediately [35177] 767.07.541.216 I reasoning-budget: forced sequence complete, done [35177] 767.07.541.323 I slot launch_slot_: id 0 | task 45256 | processing task, is_child = 0 [35177] 767.08.459.526 I slot print_timing: id 0 | task 45256 | prompt eval time = 74.34 ms / 27 tokens ( 2.75 ms per token, 363.21 tokens per second) [35177] 767.08.459.529 I slot print_timing: id 0 | task 45256 | eval time = 843.85 ms / 56 tokens ( 15.07 ms per token, 66.36 tokens per second) [35177] 767.08.459.530 I slot print_timing: id 0 | task 45256 | total time = 918.19 ms / 83 tokens [35177] 767.08.459.530 I slot print_timing: id 0 | task 45256 | graphs reused = 43463 [35177] 767.08.460.766 I slot release: id 0 | task 45256 | stop processing: n_tokens = 42653, truncated = 0 [35177] 767.08.460.968 I srv update_slots: all slots are idle 791.56.389.614 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 767.11.278.235 I srv params_from_: Chat format: peg-native [35177] 767.11.331.525 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 767.11.333.844 I reasoning-budget: activated, budget=0 tokens [35177] 767.11.333.845 I reasoning-budget: budget=0, forcing immediately [35177] 767.11.333.846 I reasoning-budget: forced sequence complete, done [35177] 767.11.333.965 I slot launch_slot_: id 0 | task 45315 | processing task, is_child = 0 [35177] 767.12.917.630 I slot print_timing: id 0 | task 45315 | n_decoded = 100, tg = 66.27 t/s [35177] 767.12.932.925 I slot print_timing: id 0 | task 45315 | prompt eval time = 74.63 ms / 61 tokens ( 1.22 ms per token, 817.38 tokens per second) [35177] 767.12.932.928 I slot print_timing: id 0 | task 45315 | eval time = 1524.32 ms / 101 tokens ( 15.09 ms per token, 66.26 tokens per second) [35177] 767.12.932.929 I slot print_timing: id 0 | task 45315 | total time = 1598.95 ms / 162 tokens [35177] 767.12.932.929 I slot print_timing: id 0 | task 45315 | graphs reused = 43561 [35177] 767.12.934.407 I slot release: id 0 | task 45315 | stop processing: n_tokens = 42814, truncated = 0 [35177] 767.12.934.689 I srv update_slots: all slots are idle 792.08.414.790 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 767.23.305.297 I srv params_from_: Chat format: peg-native [35177] 767.23.355.142 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 767.23.357.684 I reasoning-budget: activated, budget=0 tokens [35177] 767.23.357.687 I reasoning-budget: budget=0, forcing immediately [35177] 767.23.357.687 I reasoning-budget: forced sequence complete, done [35177] 767.23.357.797 I slot launch_slot_: id 0 | task 45419 | processing task, is_child = 0 [35177] 767.23.440.238 I slot create_check: id 0 | task 45419 | created context checkpoint 9 of 32 (pos_min = 42815, pos_max = 42815, n_tokens = 42816, size = 149.626 MiB) [35177] 767.24.353.630 I slot print_timing: id 0 | task 45419 | prompt eval time = 141.31 ms / 61 tokens ( 2.32 ms per token, 431.67 tokens per second) [35177] 767.24.353.633 I slot print_timing: id 0 | task 45419 | eval time = 854.51 ms / 57 tokens ( 14.99 ms per token, 66.71 tokens per second) [35177] 767.24.353.634 I slot print_timing: id 0 | task 45419 | total time = 995.82 ms / 118 tokens [35177] 767.24.353.634 I slot print_timing: id 0 | task 45419 | graphs reused = 43616 [35177] 767.24.354.841 I slot release: id 0 | task 45419 | stop processing: n_tokens = 42931, truncated = 0 [35177] 767.24.355.031 I srv update_slots: all slots are idle 792.39.796.726 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 767.54.692.239 I srv params_from_: Chat format: peg-native [35177] 767.54.752.059 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 767.54.754.439 I reasoning-budget: activated, budget=0 tokens [35177] 767.54.754.441 I reasoning-budget: budget=0, forcing immediately [35177] 767.54.754.441 I reasoning-budget: forced sequence complete, done [35177] 767.54.754.558 I slot launch_slot_: id 0 | task 45479 | processing task, is_child = 0 [35177] 767.55.698.807 I slot print_timing: id 0 | task 45479 | prompt eval time = 75.22 ms / 27 tokens ( 2.79 ms per token, 358.97 tokens per second) [35177] 767.55.698.810 I slot print_timing: id 0 | task 45479 | eval time = 869.01 ms / 57 tokens ( 15.25 ms per token, 65.59 tokens per second) [35177] 767.55.698.810 I slot print_timing: id 0 | task 45479 | total time = 944.22 ms / 84 tokens [35177] 767.55.698.811 I slot print_timing: id 0 | task 45479 | graphs reused = 43670 [35177] 767.55.700.088 I slot release: id 0 | task 45479 | stop processing: n_tokens = 43014, truncated = 0 [35177] 767.55.700.284 I srv update_slots: all slots are idle 792.58.655.693 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 768.13.542.376 I srv params_from_: Chat format: peg-native [35177] 768.13.587.382 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 768.13.589.675 I reasoning-budget: activated, budget=0 tokens [35177] 768.13.589.677 I reasoning-budget: budget=0, forcing immediately [35177] 768.13.589.677 I reasoning-budget: forced sequence complete, done [35177] 768.13.589.780 I slot launch_slot_: id 0 | task 45539 | processing task, is_child = 0 [35177] 768.14.878.815 I slot print_timing: id 0 | task 45539 | prompt eval time = 84.39 ms / 60 tokens ( 1.41 ms per token, 711.02 tokens per second) [35177] 768.14.878.819 I slot print_timing: id 0 | task 45539 | eval time = 1204.63 ms / 80 tokens ( 15.06 ms per token, 66.41 tokens per second) [35177] 768.14.878.820 I slot print_timing: id 0 | task 45539 | total time = 1289.01 ms / 140 tokens [35177] 768.14.878.820 I slot print_timing: id 0 | task 45539 | graphs reused = 43748 [35177] 768.14.880.076 I slot release: id 0 | task 45539 | stop processing: n_tokens = 43153, truncated = 0 [35177] 768.14.880.272 I srv update_slots: all slots are idle 793.00.328.693 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 768.15.228.704 I srv params_from_: Chat format: peg-native [35177] 768.15.282.367 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 [35177] 768.15.285.056 I reasoning-budget: activated, budget=0 tokens [35177] 768.15.285.059 I reasoning-budget: budget=0, forcing immediately [35177] 768.15.285.059 I reasoning-budget: forced sequence complete, done [35177] 768.15.285.183 I slot launch_slot_: id 0 | task 45622 | processing task, is_child = 0 [35177] 768.15.366.817 I slot create_check: id 0 | task 45622 | created context checkpoint 10 of 32 (pos_min = 43154, pos_max = 43154, n_tokens = 43155, size = 149.626 MiB) [35177] 768.16.637.466 I slot print_timing: id 0 | task 45622 | prompt eval time = 140.72 ms / 57 tokens ( 2.47 ms per token, 405.05 tokens per second) [35177] 768.16.637.469 I slot print_timing: id 0 | task 45622 | eval time = 1211.54 ms / 80 tokens ( 15.14 ms per token, 66.03 tokens per second) [35177] 768.16.637.469 I slot print_timing: id 0 | task 45622 | total time = 1352.27 ms / 137 tokens [35177] 768.16.637.470 I slot print_timing: id 0 | task 45622 | graphs reused = 43825 [35177] 768.16.638.700 I slot release: id 0 | task 45622 | stop processing: n_tokens = 43289, truncated = 0 [35177] 768.16.638.885 I srv update_slots: all slots are idle 793.02.158.713 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 768.17.050.717 I srv params_from_: Chat format: peg-native [35177] 768.17.102.733 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.996 (> 0.100 thold), f_keep = 1.000 [35177] 768.17.105.080 I reasoning-budget: activated, budget=0 tokens [35177] 768.17.105.082 I reasoning-budget: budget=0, forcing immediately [35177] 768.17.105.082 I reasoning-budget: forced sequence complete, done [35177] 768.17.105.197 I slot launch_slot_: id 0 | task 45705 | processing task, is_child = 0 [35177] 768.17.256.573 I slot create_check: id 0 | task 45705 | created context checkpoint 11 of 32 (pos_min = 43466, pos_max = 43466, n_tokens = 43467, size = 149.626 MiB) [35177] 768.18.810.441 I slot print_timing: id 0 | task 45705 | n_decoded = 100, tg = 65.30 t/s [35177] 768.20.103.277 I slot print_timing: id 0 | task 45705 | prompt eval time = 173.74 ms / 182 tokens ( 0.95 ms per token, 1047.52 tokens per second) [35177] 768.20.103.281 I slot print_timing: id 0 | task 45705 | eval time = 2824.32 ms / 185 tokens ( 15.27 ms per token, 65.50 tokens per second) [35177] 768.20.103.281 I slot print_timing: id 0 | task 45705 | total time = 2998.06 ms / 367 tokens [35177] 768.20.103.282 I slot print_timing: id 0 | task 45705 | graphs reused = 44007 [35177] 768.20.104.558 I slot release: id 0 | task 45705 | stop processing: n_tokens = 43655, truncated = 0 [35177] 768.20.104.764 I srv update_slots: all slots are idle 793.06.770.559 I srv proxy_reques: proxying request to model qwen36-27b-neocoder-q4-preserve on port 35177 [35177] 768.21.630.462 I srv params_from_: Chat format: peg-native [35177] 768.21.680.136 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.845 (> 0.100 thold), f_keep = 0.716 [35177] 768.21.682.419 I reasoning-budget: activated, budget=0 tokens [35177] 768.21.682.421 I reasoning-budget: budget=0, forcing immediately [35177] 768.21.682.422 I reasoning-budget: forced sequence complete, done [35177] 768.21.682.531 I slot launch_slot_: id 0 | task 45893 | processing task, is_child = 0 [35177] 768.21.682.555 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [43466, 43466] against 31267... [35177] 768.21.682.557 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [43154, 43154] against 31267... [35177] 768.21.682.557 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [42815, 42815] against 31267... [35177] 768.21.682.558 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [42456, 42456] against 31267... [35177] 768.21.682.558 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [42188, 42188] against 31267... [35177] 768.21.682.559 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [41914, 41914] against 31267... [35177] 768.21.682.559 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [41602, 41602] against 31267... [35177] 768.21.682.560 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [41225, 41225] against 31267... [35177] 768.21.682.560 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [40713, 40713] against 31267... [35177] 768.21.682.560 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [40258, 40258] against 31267... [35177] 768.21.682.561 I slot update_slots: id 0 | task 45893 | Checking checkpoint with [39925, 39925] against 31267... [35177] 768.21.682.561 W slot update_slots: id 0 | task 45893 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055) [35177] 768.21.682.566 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 39925, pos_max = 39925, n_tokens = 39926, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.693.936 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 40258, pos_max = 40258, n_tokens = 40259, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.707.553 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 40713, pos_max = 40713, n_tokens = 40714, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.720.686 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 41225, pos_max = 41225, n_tokens = 41226, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.734.447 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 41602, pos_max = 41602, n_tokens = 41603, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.744.172 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 41914, pos_max = 41914, n_tokens = 41915, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.753.458 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 42188, pos_max = 42188, n_tokens = 42189, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.765.676 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 42456, pos_max = 42456, n_tokens = 42457, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.776.975 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 42815, pos_max = 42815, n_tokens = 42816, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.788.948 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 43154, pos_max = 43154, n_tokens = 43155, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.21.800.354 W slot update_slots: id 0 | task 45893 | erased invalidated context checkpoint (pos_min = 43466, pos_max = 43466, n_tokens = 43467, n_swa = 0, pos_next = 0, size = 149.626 MiB) [35177] 768.25.170.633 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 12288, progress = 0.33, t = 3.49 s / 3522.85 tokens per second [35177] 768.25.775.502 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 14336, progress = 0.39, t = 4.09 s / 3502.60 tokens per second [35177] 768.26.391.021 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 16384, progress = 0.44, t = 4.71 s / 3479.68 tokens per second [35177] 768.27.013.537 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 18432, progress = 0.50, t = 5.33 s / 3457.52 tokens per second [35177] 768.27.646.817 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 20480, progress = 0.55, t = 5.96 s / 3433.78 tokens per second [35177] 768.28.288.980 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 22528, progress = 0.61, t = 6.61 s / 3410.01 tokens per second [35177] 768.28.942.291 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 24576, progress = 0.66, t = 7.26 s / 3385.24 tokens per second [35177] 768.29.610.285 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 26624, progress = 0.72, t = 7.93 s / 3358.33 tokens per second [35177] 768.30.296.166 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 28672, progress = 0.77, t = 8.61 s / 3328.68 tokens per second [35177] 768.30.997.546 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 30720, progress = 0.83, t = 9.32 s / 3297.91 tokens per second [35177] 768.31.715.129 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 32768, progress = 0.89, t = 10.03 s / 3266.16 tokens per second [35177] 768.32.453.339 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 34816, progress = 0.94, t = 10.77 s / 3232.44 tokens per second [35177] 768.33.149.036 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 36499, progress = 0.99, t = 11.47 s / 3183.10 tokens per second [35177] 768.33.282.533 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 36975, progress = 1.00, t = 11.60 s / 3187.50 tokens per second [35177] 768.33.361.321 I slot create_check: id 0 | task 45893 | created context checkpoint 1 of 32 (pos_min = 36974, pos_max = 36974, n_tokens = 36975, size = 149.626 MiB) [35177] 768.33.381.838 I slot print_timing: id 0 | task 45893 | prompt processing, n_tokens = 37011, progress = 1.00, t = 11.70 s / 3163.52 tokens per second [35177] 768.33.491.888 I slot print_timing: id 0 | task 45893 | prompt eval time = 11730.70 ms / 37015 tokens ( 0.32 ms per token, 3155.39 tokens per second) [35177] 768.33.491.891 I slot print_timing: id 0 | task 45893 | eval time = 78.64 ms / 6 tokens ( 13.11 ms per token, 76.30 tokens per second) [35177] 768.33.491.891 I slot print_timing: id 0 | task 45893 | total time = 11809.34 ms / 37021 tokens [35177] 768.33.491.892 I slot print_timing: id 0 | task 45893 | graphs reused = 44011 [35177] 768.33.493.020 I slot release: id 0 | task 45893 | stop processing: n_tokens = 37020, truncated = 0 [35177] 768.33.493.208 I srv update_slots: all slots are idle