Command that produces this log: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 ---------------------------------------------------------------------------------------------------- > trainable params: >>> xlmr.embeddings.word_embeddings.weight: torch.Size([250002, 1024]) >>> xlmr.embeddings.position_embeddings.weight: torch.Size([514, 1024]) >>> xlmr.embeddings.token_type_embeddings.weight: torch.Size([1, 1024]) >>> xlmr.embeddings.LayerNorm.weight: torch.Size([1024]) >>> xlmr.embeddings.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.0.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.0.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.0.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.0.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.query.bias: 
torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.1.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.1.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.1.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.1.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.2.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.2.output.dense.weight: 
torch.Size([1024, 4096]) >>> xlmr.encoder.layer.2.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.2.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.3.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.3.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.3.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.3.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.output.dense.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.4.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.4.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.4.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.4.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.4.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.5.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.5.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.5.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.5.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.key.weight: torch.Size([1024, 1024]) >>> 
xlmr.encoder.layer.6.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.6.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.6.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.6.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.6.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.7.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.7.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.7.output.dense.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.7.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.7.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.8.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.8.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.8.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.8.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.LayerNorm.weight: torch.Size([1024]) >>> 
xlmr.encoder.layer.9.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.9.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.9.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.9.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.9.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.10.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.10.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.10.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.10.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.key.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.11.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.11.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.11.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.11.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.11.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.12.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.12.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.12.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.output.LayerNorm.weight: 
torch.Size([1024]) >>> xlmr.encoder.layer.12.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.13.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.13.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.13.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.13.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.LayerNorm.weight: torch.Size([1024]) >>> 
xlmr.encoder.layer.14.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.14.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.14.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.14.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.14.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.15.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.15.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.15.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.15.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.key.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.16.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.16.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.16.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.16.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.16.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.17.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.17.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.17.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.output.LayerNorm.weight: 
torch.Size([1024]) >>> xlmr.encoder.layer.17.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.18.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.18.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.18.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.18.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.LayerNorm.weight: torch.Size([1024]) >>> 
xlmr.encoder.layer.19.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.19.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.19.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.19.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.19.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.20.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.20.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.20.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.20.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.key.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.21.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.21.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.21.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.21.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.21.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.22.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.22.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.22.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.output.LayerNorm.weight: 
torch.Size([1024]) >>> xlmr.encoder.layer.22.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.23.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.23.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.23.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.23.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.pooler.dense.weight: torch.Size([1024, 1024]) >>> xlmr.pooler.dense.bias: torch.Size([1024]) >>> type_embedding.weight: torch.Size([123, 100]) >>> trans_rep.weight: torch.Size([1024, 1124]) >>> trans_rep.bias: torch.Size([1024]) >>> coref_type_ffn.weight: torch.Size([3, 4096]) >>> coref_type_ffn.bias: torch.Size([3]) n_trainable_params: 561067023, n_nontrainable_params: 0 ---------------------------------------------------------------------------------------------------- ****************************** Epoch: 0 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 
0.2 --learning_rate 9e-4 2023-01-22 19:27:13.761322: step: 4/527, loss: 0.04923645034432411 2023-01-22 19:27:14.794817: step: 8/527, loss: 0.043383460491895676 2023-01-22 19:27:15.840109: step: 12/527, loss: 0.06345069408416748 2023-01-22 19:27:16.882424: step: 16/527, loss: 0.037945427000522614 2023-01-22 19:27:17.900014: step: 20/527, loss: 0.026889178901910782 2023-01-22 19:27:18.975781: step: 24/527, loss: 0.014282737858593464 2023-01-22 19:27:20.025216: step: 28/527, loss: 0.07314743101596832 2023-01-22 19:27:21.060987: step: 32/527, loss: 0.013953915797173977 2023-01-22 19:27:22.110349: step: 36/527, loss: 0.02508673444390297 2023-01-22 19:27:23.138528: step: 40/527, loss: 0.014697683043777943 2023-01-22 19:27:24.176319: step: 44/527, loss: 0.01800478808581829 2023-01-22 19:27:25.214469: step: 48/527, loss: 0.011815366335213184 2023-01-22 19:27:26.266954: step: 52/527, loss: 0.05149368941783905 2023-01-22 19:27:27.297942: step: 56/527, loss: 0.027764907106757164 2023-01-22 19:27:28.350661: step: 60/527, loss: 0.02928166277706623 2023-01-22 19:27:29.398514: step: 64/527, loss: 0.06118535250425339 2023-01-22 19:27:30.437037: step: 68/527, loss: 0.0191913191229105 2023-01-22 19:27:31.469710: step: 72/527, loss: 0.014914190396666527 2023-01-22 19:27:32.508463: step: 76/527, loss: 0.020977135747671127 2023-01-22 19:27:33.565525: step: 80/527, loss: 0.06389444321393967 2023-01-22 19:27:34.613525: step: 84/527, loss: 0.03556324541568756 2023-01-22 19:27:35.658862: step: 88/527, loss: 0.03598419576883316 2023-01-22 19:27:36.705046: step: 92/527, loss: 0.017860865220427513 2023-01-22 19:27:37.744246: step: 96/527, loss: 0.017174694687128067 2023-01-22 19:27:38.795513: step: 100/527, loss: 0.03404431417584419 2023-01-22 19:27:39.834666: step: 104/527, loss: 0.05877925455570221 2023-01-22 19:27:40.906363: step: 108/527, loss: 0.056674402207136154 2023-01-22 19:27:41.944665: step: 112/527, loss: 0.022927330806851387 2023-01-22 19:27:42.996048: step: 116/527, loss: 
0.014627698808908463 2023-01-22 19:27:44.045040: step: 120/527, loss: 0.03617612272500992 2023-01-22 19:27:45.090405: step: 124/527, loss: 0.021311931312084198 2023-01-22 19:27:46.147813: step: 128/527, loss: 0.02233605645596981 2023-01-22 19:27:47.194025: step: 132/527, loss: 0.052028801292181015 2023-01-22 19:27:48.232524: step: 136/527, loss: 0.019956503063440323 2023-01-22 19:27:49.306726: step: 140/527, loss: 0.017463866621255875 2023-01-22 19:27:50.351136: step: 144/527, loss: 0.01615358144044876 2023-01-22 19:27:51.381477: step: 148/527, loss: 0.02121655084192753 2023-01-22 19:27:52.421033: step: 152/527, loss: 0.018111443147063255 2023-01-22 19:27:53.456992: step: 156/527, loss: 0.05822242796421051 2023-01-22 19:27:54.490183: step: 160/527, loss: 0.0 2023-01-22 19:27:55.538547: step: 164/527, loss: 0.014703494496643543 2023-01-22 19:27:56.586841: step: 168/527, loss: 0.012373027391731739 2023-01-22 19:27:57.622111: step: 172/527, loss: 0.021156037226319313 2023-01-22 19:27:58.673585: step: 176/527, loss: 0.017676973715424538 2023-01-22 19:27:59.716416: step: 180/527, loss: 0.01889359951019287 2023-01-22 19:28:00.761986: step: 184/527, loss: 0.016506966203451157 2023-01-22 19:28:01.819377: step: 188/527, loss: 0.020693611353635788 2023-01-22 19:28:02.866149: step: 192/527, loss: 0.03187645971775055 2023-01-22 19:28:03.904639: step: 196/527, loss: 0.009794244542717934 2023-01-22 19:28:04.958065: step: 200/527, loss: 0.01717703975737095 2023-01-22 19:28:06.001495: step: 204/527, loss: 0.018400736153125763 2023-01-22 19:28:07.046917: step: 208/527, loss: 0.010766998864710331 2023-01-22 19:28:08.097371: step: 212/527, loss: 0.029402419924736023 2023-01-22 19:28:09.138079: step: 216/527, loss: 0.016083508729934692 2023-01-22 19:28:10.175854: step: 220/527, loss: 0.048895031213760376 2023-01-22 19:28:11.222463: step: 224/527, loss: 0.01457796711474657 2023-01-22 19:28:12.275168: step: 228/527, loss: 0.014130916446447372 2023-01-22 19:28:13.319280: step: 232/527, 
loss: 0.03943173959851265 2023-01-22 19:28:14.377085: step: 236/527, loss: 0.024307668209075928 2023-01-22 19:28:15.428181: step: 240/527, loss: 0.04134993255138397 2023-01-22 19:28:16.477610: step: 244/527, loss: 0.0866246372461319 2023-01-22 19:28:17.535344: step: 248/527, loss: 0.04274255782365799 2023-01-22 19:28:18.580982: step: 252/527, loss: 0.02999594435095787 2023-01-22 19:28:19.646213: step: 256/527, loss: 0.014202937483787537 2023-01-22 19:28:20.701287: step: 260/527, loss: 0.019418755546212196 2023-01-22 19:28:21.743129: step: 264/527, loss: 0.01926596648991108 2023-01-22 19:28:22.797137: step: 268/527, loss: 0.0678904801607132 2023-01-22 19:28:23.844898: step: 272/527, loss: 0.0191855039447546 2023-01-22 19:28:24.877353: step: 276/527, loss: 0.00817350298166275 2023-01-22 19:28:25.924674: step: 280/527, loss: 0.025740692391991615 2023-01-22 19:28:26.978911: step: 284/527, loss: 0.02880026213824749 2023-01-22 19:28:28.040602: step: 288/527, loss: 0.009712154977023602 2023-01-22 19:28:29.112898: step: 292/527, loss: 0.05614226311445236 2023-01-22 19:28:30.169818: step: 296/527, loss: 0.007925443351268768 2023-01-22 19:28:31.221125: step: 300/527, loss: 0.011007772758603096 2023-01-22 19:28:32.265563: step: 304/527, loss: 0.02407767064869404 2023-01-22 19:28:33.318436: step: 308/527, loss: 0.04419030249118805 2023-01-22 19:28:34.372901: step: 312/527, loss: 0.03359941393136978 2023-01-22 19:28:35.429422: step: 316/527, loss: 0.07747422903776169 2023-01-22 19:28:36.482763: step: 320/527, loss: 0.01955069601535797 2023-01-22 19:28:37.534080: step: 324/527, loss: 0.051893677562475204 2023-01-22 19:28:38.584552: step: 328/527, loss: 0.013612684793770313 2023-01-22 19:28:39.642505: step: 332/527, loss: 0.012552589178085327 2023-01-22 19:28:40.687393: step: 336/527, loss: 0.03841537982225418 2023-01-22 19:28:41.751428: step: 340/527, loss: 0.010561777278780937 2023-01-22 19:28:42.797996: step: 344/527, loss: 0.04348571598529816 2023-01-22 19:28:43.855946: step: 
348/527, loss: 0.04455193132162094 2023-01-22 19:28:44.912525: step: 352/527, loss: 0.010561950504779816 2023-01-22 19:28:45.967521: step: 356/527, loss: 0.014725720509886742 2023-01-22 19:28:47.006516: step: 360/527, loss: 0.01387280784547329 2023-01-22 19:28:48.053423: step: 364/527, loss: 0.0053038764744997025 2023-01-22 19:28:49.111732: step: 368/527, loss: 0.05532926321029663 2023-01-22 19:28:50.195461: step: 372/527, loss: 0.0818491280078888 2023-01-22 19:28:51.253774: step: 376/527, loss: 0.018701819702982903 2023-01-22 19:28:52.291706: step: 380/527, loss: 0.05909799784421921 2023-01-22 19:28:53.329441: step: 384/527, loss: 0.010057435370981693 2023-01-22 19:28:54.393968: step: 388/527, loss: 0.08583433926105499 2023-01-22 19:28:55.441070: step: 392/527, loss: 0.01276600081473589 2023-01-22 19:28:56.489262: step: 396/527, loss: 0.02201969176530838 2023-01-22 19:28:57.529977: step: 400/527, loss: 0.009124068543314934 2023-01-22 19:28:58.568784: step: 404/527, loss: 0.008228073827922344 2023-01-22 19:28:59.624931: step: 408/527, loss: 0.013764871284365654 2023-01-22 19:29:00.669327: step: 412/527, loss: 0.009872580878436565 2023-01-22 19:29:01.722952: step: 416/527, loss: 0.006959422491490841 2023-01-22 19:29:02.776119: step: 420/527, loss: 0.008295686915516853 2023-01-22 19:29:03.829027: step: 424/527, loss: 0.04326401278376579 2023-01-22 19:29:04.874198: step: 428/527, loss: 0.05508603900671005 2023-01-22 19:29:05.949250: step: 432/527, loss: 0.04993807524442673 2023-01-22 19:29:07.002745: step: 436/527, loss: 0.01477036438882351 2023-01-22 19:29:08.034763: step: 440/527, loss: 0.019256919622421265 2023-01-22 19:29:09.074153: step: 444/527, loss: 0.02263311669230461 2023-01-22 19:29:10.108518: step: 448/527, loss: 0.024339186027646065 2023-01-22 19:29:11.159869: step: 452/527, loss: 0.01530923880636692 2023-01-22 19:29:12.199555: step: 456/527, loss: 0.0 2023-01-22 19:29:13.250209: step: 460/527, loss: 0.01358681172132492 2023-01-22 19:29:14.296880: step: 
464/527, loss: 0.05225981026887894 2023-01-22 19:29:15.353349: step: 468/527, loss: 0.012450088746845722 2023-01-22 19:29:16.389145: step: 472/527, loss: 0.00823119468986988 2023-01-22 19:29:17.430794: step: 476/527, loss: 0.04928778484463692 2023-01-22 19:29:18.485146: step: 480/527, loss: 0.014564726501703262 2023-01-22 19:29:19.543068: step: 484/527, loss: 0.011767190881073475 2023-01-22 19:29:20.583592: step: 488/527, loss: 0.052691712975502014 2023-01-22 19:29:21.619772: step: 492/527, loss: 0.014807982370257378 2023-01-22 19:29:22.659506: step: 496/527, loss: 0.0032415720634162426 2023-01-22 19:29:23.704207: step: 500/527, loss: 0.012801028788089752 2023-01-22 19:29:24.760391: step: 504/527, loss: 0.010832608677446842 2023-01-22 19:29:25.806699: step: 508/527, loss: 0.02535327523946762 2023-01-22 19:29:26.853642: step: 512/527, loss: 0.030488912016153336 2023-01-22 19:29:27.916115: step: 516/527, loss: 0.019557097926735878 2023-01-22 19:29:28.987447: step: 520/527, loss: 0.042833928018808365 2023-01-22 19:29:30.039706: step: 524/527, loss: 0.0021354856435209513 2023-01-22 19:29:31.083094: step: 528/527, loss: 0.012811513617634773 2023-01-22 19:29:32.125155: step: 532/527, loss: 0.009690161794424057 2023-01-22 19:29:33.166881: step: 536/527, loss: 0.061977434903383255 2023-01-22 19:29:34.204313: step: 540/527, loss: 0.01449664682149887 2023-01-22 19:29:35.261303: step: 544/527, loss: 0.019814014434814453 2023-01-22 19:29:36.336842: step: 548/527, loss: 0.05334167182445526 2023-01-22 19:29:37.383783: step: 552/527, loss: 0.015172426588833332 2023-01-22 19:29:38.447050: step: 556/527, loss: 0.01659049466252327 2023-01-22 19:29:39.494458: step: 560/527, loss: 0.010038601234555244 2023-01-22 19:29:40.545675: step: 564/527, loss: 0.012357079423964024 2023-01-22 19:29:41.594738: step: 568/527, loss: 0.010617715306580067 2023-01-22 19:29:42.655117: step: 572/527, loss: 0.06289331614971161 2023-01-22 19:29:43.702608: step: 576/527, loss: 0.007434506434947252 
2023-01-22 19:29:44.752728: step: 580/527, loss: 0.007518386468291283 2023-01-22 19:29:45.818709: step: 584/527, loss: 0.013077018782496452 2023-01-22 19:29:46.861106: step: 588/527, loss: 0.07850120216608047 2023-01-22 19:29:47.906647: step: 592/527, loss: 0.03388079255819321 2023-01-22 19:29:48.960077: step: 596/527, loss: 0.035493068397045135 2023-01-22 19:29:50.048823: step: 600/527, loss: 0.03494912013411522 2023-01-22 19:29:51.102044: step: 604/527, loss: 0.019684508442878723 2023-01-22 19:29:52.155686: step: 608/527, loss: 0.009311813861131668 2023-01-22 19:29:53.217908: step: 612/527, loss: 0.057334091514348984 2023-01-22 19:29:54.263154: step: 616/527, loss: 0.006807522848248482 2023-01-22 19:29:55.312873: step: 620/527, loss: 0.013769448734819889 2023-01-22 19:29:56.360516: step: 624/527, loss: 0.014124431647360325 2023-01-22 19:29:57.415669: step: 628/527, loss: 0.007340814918279648 2023-01-22 19:29:58.457029: step: 632/527, loss: 0.013471441343426704 2023-01-22 19:29:59.526958: step: 636/527, loss: 0.04327579587697983 2023-01-22 19:30:00.592891: step: 640/527, loss: 0.010253100655972958 2023-01-22 19:30:01.659139: step: 644/527, loss: 0.014752187766134739 2023-01-22 19:30:02.712814: step: 648/527, loss: 0.06564674526453018 2023-01-22 19:30:03.762789: step: 652/527, loss: 0.011777317151427269 2023-01-22 19:30:04.824887: step: 656/527, loss: 0.008777322247624397 2023-01-22 19:30:05.881207: step: 660/527, loss: 0.023550763726234436 2023-01-22 19:30:06.926265: step: 664/527, loss: 0.029071390628814697 2023-01-22 19:30:07.980034: step: 668/527, loss: 0.011349665001034737 2023-01-22 19:30:09.041515: step: 672/527, loss: 0.022541021928191185 2023-01-22 19:30:10.097226: step: 676/527, loss: 0.03227221593260765 2023-01-22 19:30:11.141018: step: 680/527, loss: 0.045231226831674576 2023-01-22 19:30:12.203489: step: 684/527, loss: 0.02103865146636963 2023-01-22 19:30:13.263935: step: 688/527, loss: 0.018973032012581825 2023-01-22 19:30:14.305901: step: 692/527, 
loss: 0.026441611349582672 2023-01-22 19:30:15.350893: step: 696/527, loss: 0.05431555211544037 2023-01-22 19:30:16.398641: step: 700/527, loss: 0.0534958653151989 2023-01-22 19:30:17.440237: step: 704/527, loss: 0.003933245316147804 2023-01-22 19:30:18.482753: step: 708/527, loss: 0.011422310955822468 2023-01-22 19:30:19.548111: step: 712/527, loss: 0.032320261001586914 2023-01-22 19:30:20.600026: step: 716/527, loss: 0.008242209441959858 2023-01-22 19:30:21.637689: step: 720/527, loss: 0.011768379248678684 2023-01-22 19:30:22.708528: step: 724/527, loss: 0.0736803412437439 2023-01-22 19:30:23.747109: step: 728/527, loss: 0.028010506182909012 2023-01-22 19:30:24.799683: step: 732/527, loss: 0.041286222636699677 2023-01-22 19:30:25.855999: step: 736/527, loss: 0.026666691526770592 2023-01-22 19:30:26.896250: step: 740/527, loss: 0.01516043022274971 2023-01-22 19:30:27.950544: step: 744/527, loss: 0.07718952745199203 2023-01-22 19:30:28.999427: step: 748/527, loss: 0.010381504893302917 2023-01-22 19:30:30.047690: step: 752/527, loss: 0.007148078642785549 2023-01-22 19:30:31.109353: step: 756/527, loss: 0.010911373421549797 2023-01-22 19:30:32.160826: step: 760/527, loss: 0.022200558334589005 2023-01-22 19:30:33.208641: step: 764/527, loss: 0.014294442720711231 2023-01-22 19:30:34.232393: step: 768/527, loss: 0.030063582584261894 2023-01-22 19:30:35.272816: step: 772/527, loss: 0.00841920729726553 2023-01-22 19:30:36.308238: step: 776/527, loss: 0.03538789227604866 2023-01-22 19:30:37.346780: step: 780/527, loss: 0.021538633853197098 2023-01-22 19:30:38.401963: step: 784/527, loss: 0.010347490198910236 2023-01-22 19:30:39.443205: step: 788/527, loss: 0.01486788410693407 2023-01-22 19:30:40.491897: step: 792/527, loss: 0.040787018835544586 2023-01-22 19:30:41.539652: step: 796/527, loss: 0.00893323589116335 2023-01-22 19:30:42.591218: step: 800/527, loss: 0.013890317641198635 2023-01-22 19:30:43.620047: step: 804/527, loss: 0.010352769866585732 2023-01-22 
19:30:44.662230: step: 808/527, loss: 0.040759921073913574 2023-01-22 19:30:45.705501: step: 812/527, loss: 0.018781933933496475 2023-01-22 19:30:46.742650: step: 816/527, loss: 0.014692301861941814 2023-01-22 19:30:47.790423: step: 820/527, loss: 0.00933702290058136 2023-01-22 19:30:48.848570: step: 824/527, loss: 0.023195777088403702 2023-01-22 19:30:49.899287: step: 828/527, loss: 0.006671345327049494 2023-01-22 19:30:50.930826: step: 832/527, loss: 0.017655614763498306 2023-01-22 19:30:51.995328: step: 836/527, loss: 0.03672616556286812 2023-01-22 19:30:53.041767: step: 840/527, loss: 0.008209939114749432 2023-01-22 19:30:54.099152: step: 844/527, loss: 0.009100494906306267 2023-01-22 19:30:55.142443: step: 848/527, loss: 0.033512990921735764 2023-01-22 19:30:56.184403: step: 852/527, loss: 0.007788464426994324 2023-01-22 19:30:57.226639: step: 856/527, loss: 0.016441211104393005 2023-01-22 19:30:58.274885: step: 860/527, loss: 0.0071509359404444695 2023-01-22 19:30:59.308708: step: 864/527, loss: 0.023158783093094826 2023-01-22 19:31:00.353077: step: 868/527, loss: 0.007005530409514904 2023-01-22 19:31:01.416273: step: 872/527, loss: 0.018883144482970238 2023-01-22 19:31:02.476953: step: 876/527, loss: 0.013088423758745193 2023-01-22 19:31:03.533407: step: 880/527, loss: 0.013050662353634834 2023-01-22 19:31:04.587297: step: 884/527, loss: 0.011257769539952278 2023-01-22 19:31:05.644766: step: 888/527, loss: 0.011637862771749496 2023-01-22 19:31:06.692250: step: 892/527, loss: 0.010010679252445698 2023-01-22 19:31:07.741897: step: 896/527, loss: 0.04781891033053398 2023-01-22 19:31:08.782220: step: 900/527, loss: 0.05245564505457878 2023-01-22 19:31:09.821195: step: 904/527, loss: 0.03748883679509163 2023-01-22 19:31:10.869871: step: 908/527, loss: 0.04363131523132324 2023-01-22 19:31:11.926852: step: 912/527, loss: 0.00877950806170702 2023-01-22 19:31:12.987752: step: 916/527, loss: 0.0525088869035244 2023-01-22 19:31:14.033069: step: 920/527, loss: 
0.006738531868904829 2023-01-22 19:31:15.084116: step: 924/527, loss: 0.010972203686833382 2023-01-22 19:31:16.139281: step: 928/527, loss: 0.013852679170668125 2023-01-22 19:31:17.182800: step: 932/527, loss: 0.025796569883823395 2023-01-22 19:31:18.233323: step: 936/527, loss: 0.013459406793117523 2023-01-22 19:31:19.302311: step: 940/527, loss: 0.012326393276453018 2023-01-22 19:31:20.338707: step: 944/527, loss: 0.014031559228897095 2023-01-22 19:31:21.384937: step: 948/527, loss: 0.02120394818484783 2023-01-22 19:31:22.429007: step: 952/527, loss: 0.0197740625590086 2023-01-22 19:31:23.474655: step: 956/527, loss: 0.03490378335118294 2023-01-22 19:31:24.532535: step: 960/527, loss: 0.0196097269654274 2023-01-22 19:31:25.592709: step: 964/527, loss: 0.009637261740863323 2023-01-22 19:31:26.643283: step: 968/527, loss: 0.0049220542423427105 2023-01-22 19:31:27.684733: step: 972/527, loss: 0.013535046949982643 2023-01-22 19:31:28.712990: step: 976/527, loss: 0.010385559871792793 2023-01-22 19:31:29.756170: step: 980/527, loss: 0.013344008475542068 2023-01-22 19:31:30.798043: step: 984/527, loss: 0.0060921115800738335 2023-01-22 19:31:31.845253: step: 988/527, loss: 0.013868375681340694 2023-01-22 19:31:32.905928: step: 992/527, loss: 0.013543576002120972 2023-01-22 19:31:33.947876: step: 996/527, loss: 0.016153769567608833 2023-01-22 19:31:34.991358: step: 1000/527, loss: 0.005323866847902536 2023-01-22 19:31:36.037445: step: 1004/527, loss: 0.01036792155355215 2023-01-22 19:31:37.097858: step: 1008/527, loss: 0.041970763355493546 2023-01-22 19:31:38.132818: step: 1012/527, loss: 0.024016601964831352 2023-01-22 19:31:39.167137: step: 1016/527, loss: 0.040281787514686584 2023-01-22 19:31:40.230425: step: 1020/527, loss: 0.026675747707486153 2023-01-22 19:31:41.256787: step: 1024/527, loss: 0.013697942718863487 2023-01-22 19:31:42.314759: step: 1028/527, loss: 0.039383962750434875 2023-01-22 19:31:43.353988: step: 1032/527, loss: 0.014411290176212788 2023-01-22 
19:31:44.408974: step: 1036/527, loss: 0.033551596105098724 2023-01-22 19:31:45.466973: step: 1040/527, loss: 0.010896379128098488 2023-01-22 19:31:46.517195: step: 1044/527, loss: 0.046110495924949646 2023-01-22 19:31:47.538137: step: 1048/527, loss: 0.05814679339528084 2023-01-22 19:31:48.597147: step: 1052/527, loss: 0.009706827811896801 2023-01-22 19:31:49.643313: step: 1056/527, loss: 0.016290973871946335 2023-01-22 19:31:50.676968: step: 1060/527, loss: 0.003077411325648427 2023-01-22 19:31:51.725161: step: 1064/527, loss: 0.008014618419110775 2023-01-22 19:31:52.782323: step: 1068/527, loss: 0.016418689861893654 2023-01-22 19:31:53.827527: step: 1072/527, loss: 0.009384910576045513 2023-01-22 19:31:54.891581: step: 1076/527, loss: 0.006921080872416496 2023-01-22 19:31:55.922688: step: 1080/527, loss: 0.020065046846866608 2023-01-22 19:31:56.974478: step: 1084/527, loss: 0.013070640154182911 2023-01-22 19:31:58.034973: step: 1088/527, loss: 0.02336178347468376 2023-01-22 19:31:59.087248: step: 1092/527, loss: 0.00855566468089819 2023-01-22 19:32:00.156663: step: 1096/527, loss: 0.006551303435117006 2023-01-22 19:32:01.198538: step: 1100/527, loss: 0.012348373420536518 2023-01-22 19:32:02.275035: step: 1104/527, loss: 0.017696712166070938 2023-01-22 19:32:03.309288: step: 1108/527, loss: 0.009105103090405464 2023-01-22 19:32:04.365261: step: 1112/527, loss: 0.046973537653684616 2023-01-22 19:32:05.423740: step: 1116/527, loss: 0.047595299780368805 2023-01-22 19:32:06.472756: step: 1120/527, loss: 0.012884553521871567 2023-01-22 19:32:07.516093: step: 1124/527, loss: 0.0391993410885334 2023-01-22 19:32:08.563433: step: 1128/527, loss: 0.0156722255051136 2023-01-22 19:32:09.612957: step: 1132/527, loss: 0.014547361060976982 2023-01-22 19:32:10.654095: step: 1136/527, loss: 0.07776817679405212 2023-01-22 19:32:11.718985: step: 1140/527, loss: 0.013856232166290283 2023-01-22 19:32:12.762130: step: 1144/527, loss: 0.020356912165880203 2023-01-22 19:32:13.817013: 
step: 1148/527, loss: 0.00910620205104351 2023-01-22 19:32:14.872764: step: 1152/527, loss: 0.005105924792587757 2023-01-22 19:32:15.909729: step: 1156/527, loss: 0.010484758764505386 2023-01-22 19:32:16.976729: step: 1160/527, loss: 0.03854474797844887 2023-01-22 19:32:18.022307: step: 1164/527, loss: 0.03841403126716614 2023-01-22 19:32:19.064447: step: 1168/527, loss: 0.009553241543471813 2023-01-22 19:32:20.101371: step: 1172/527, loss: 0.016660507768392563 2023-01-22 19:32:21.140823: step: 1176/527, loss: 0.00820698868483305 2023-01-22 19:32:22.184579: step: 1180/527, loss: 0.02934238128364086 2023-01-22 19:32:23.230781: step: 1184/527, loss: 0.007767180446535349 2023-01-22 19:32:24.278621: step: 1188/527, loss: 0.0075871325097978115 2023-01-22 19:32:25.311139: step: 1192/527, loss: 0.0077916705049574375 2023-01-22 19:32:26.381444: step: 1196/527, loss: 0.030841263011097908 2023-01-22 19:32:27.431966: step: 1200/527, loss: 0.03321535885334015 2023-01-22 19:32:28.468318: step: 1204/527, loss: 0.02610904909670353 2023-01-22 19:32:29.517331: step: 1208/527, loss: 0.020991189405322075 2023-01-22 19:32:30.562590: step: 1212/527, loss: 0.009386571124196053 2023-01-22 19:32:31.597716: step: 1216/527, loss: 0.023512419313192368 2023-01-22 19:32:32.625451: step: 1220/527, loss: 0.008178832940757275 2023-01-22 19:32:33.672122: step: 1224/527, loss: 0.005385445896536112 2023-01-22 19:32:34.727990: step: 1228/527, loss: 0.012898849323391914 2023-01-22 19:32:35.765520: step: 1232/527, loss: 0.012308849021792412 2023-01-22 19:32:36.808887: step: 1236/527, loss: 0.014423832297325134 2023-01-22 19:32:37.881183: step: 1240/527, loss: 0.01827123761177063 2023-01-22 19:32:38.908409: step: 1244/527, loss: 0.006917300634086132 2023-01-22 19:32:39.949513: step: 1248/527, loss: 0.009379149414598942 2023-01-22 19:32:41.003416: step: 1252/527, loss: 0.01035989262163639 2023-01-22 19:32:42.057071: step: 1256/527, loss: 0.008993702940642834 2023-01-22 19:32:43.120746: step: 1260/527, 
loss: 0.014762411825358868 2023-01-22 19:32:44.163059: step: 1264/527, loss: 0.03903532028198242 2023-01-22 19:32:45.202526: step: 1268/527, loss: 0.012304996140301228 2023-01-22 19:32:46.241397: step: 1272/527, loss: 0.020094679668545723 2023-01-22 19:32:47.277985: step: 1276/527, loss: 0.0125426622107625 2023-01-22 19:32:48.321870: step: 1280/527, loss: 0.01801511086523533 2023-01-22 19:32:49.371026: step: 1284/527, loss: 0.015352058224380016 2023-01-22 19:32:50.418692: step: 1288/527, loss: 0.0018933522514998913 2023-01-22 19:32:51.463524: step: 1292/527, loss: 0.010640609078109264 2023-01-22 19:32:52.506854: step: 1296/527, loss: 0.06180921569466591 2023-01-22 19:32:53.562867: step: 1300/527, loss: 0.04762286692857742 2023-01-22 19:32:54.603023: step: 1304/527, loss: 0.007108218967914581 2023-01-22 19:32:55.660805: step: 1308/527, loss: 0.005272018723189831 2023-01-22 19:32:56.712789: step: 1312/527, loss: 0.014031399972736835 2023-01-22 19:32:57.752178: step: 1316/527, loss: 0.03510887175798416 2023-01-22 19:32:58.797778: step: 1320/527, loss: 0.023932509124279022 2023-01-22 19:32:59.854591: step: 1324/527, loss: 0.014666752889752388 2023-01-22 19:33:00.903487: step: 1328/527, loss: 0.011219554580748081 2023-01-22 19:33:01.939688: step: 1332/527, loss: 0.009440240450203419 2023-01-22 19:33:02.996470: step: 1336/527, loss: 0.00312281702645123 2023-01-22 19:33:04.035007: step: 1340/527, loss: 0.0017904100241139531 2023-01-22 19:33:05.080152: step: 1344/527, loss: 0.029135361313819885 2023-01-22 19:33:06.127457: step: 1348/527, loss: 0.04780901223421097 2023-01-22 19:33:07.179240: step: 1352/527, loss: 0.013022257015109062 2023-01-22 19:33:08.242413: step: 1356/527, loss: 0.010202418081462383 2023-01-22 19:33:09.280899: step: 1360/527, loss: 0.013486332260072231 2023-01-22 19:33:10.359980: step: 1364/527, loss: 0.06929058581590652 2023-01-22 19:33:11.404676: step: 1368/527, loss: 0.004456627648323774 2023-01-22 19:33:12.447324: step: 1372/527, loss: 
0.03167043626308441 2023-01-22 19:33:13.488506: step: 1376/527, loss: 0.014031722210347652 2023-01-22 19:33:14.545200: step: 1380/527, loss: 0.01597975194454193 2023-01-22 19:33:15.579453: step: 1384/527, loss: 0.012782503850758076 2023-01-22 19:33:16.629929: step: 1388/527, loss: 0.007612484972923994 2023-01-22 19:33:17.682992: step: 1392/527, loss: 0.008005892857909203 2023-01-22 19:33:18.730181: step: 1396/527, loss: 0.003933346830308437 2023-01-22 19:33:19.789853: step: 1400/527, loss: 0.00893066544085741 2023-01-22 19:33:20.822741: step: 1404/527, loss: 0.015828493982553482 2023-01-22 19:33:21.886721: step: 1408/527, loss: 0.010186729021370411 2023-01-22 19:33:22.931252: step: 1412/527, loss: 0.004033392760902643 2023-01-22 19:33:23.991211: step: 1416/527, loss: 0.007016969379037619 2023-01-22 19:33:25.049586: step: 1420/527, loss: 0.0452611967921257 2023-01-22 19:33:26.140304: step: 1424/527, loss: 0.011878282763063908 2023-01-22 19:33:27.184581: step: 1428/527, loss: 0.010270990431308746 2023-01-22 19:33:28.231172: step: 1432/527, loss: 0.011314825154840946 2023-01-22 19:33:29.271664: step: 1436/527, loss: 0.04765573889017105 2023-01-22 19:33:30.310111: step: 1440/527, loss: 0.013969712890684605 2023-01-22 19:33:31.347734: step: 1444/527, loss: 0.030887247994542122 2023-01-22 19:33:32.389139: step: 1448/527, loss: 0.013999665156006813 2023-01-22 19:33:33.437831: step: 1452/527, loss: 0.022451085969805717 2023-01-22 19:33:34.485203: step: 1456/527, loss: 0.037313204258680344 2023-01-22 19:33:35.540951: step: 1460/527, loss: 0.02577519230544567 2023-01-22 19:33:36.585025: step: 1464/527, loss: 0.011492750607430935 2023-01-22 19:33:37.618774: step: 1468/527, loss: 0.006659230217337608 2023-01-22 19:33:38.656846: step: 1472/527, loss: 0.014374570921063423 2023-01-22 19:33:39.708647: step: 1476/527, loss: 0.027423938736319542 2023-01-22 19:33:40.753806: step: 1480/527, loss: 0.0052278959192335606 2023-01-22 19:33:41.792441: step: 1484/527, loss: 
0.00777996564283967 2023-01-22 19:33:42.849677: step: 1488/527, loss: 0.009974290616810322 2023-01-22 19:33:43.892823: step: 1492/527, loss: 0.030738811939954758 2023-01-22 19:33:44.915961: step: 1496/527, loss: 0.010464017279446125 2023-01-22 19:33:45.967384: step: 1500/527, loss: 0.007696117740124464 2023-01-22 19:33:47.015910: step: 1504/527, loss: 0.005756647791713476 2023-01-22 19:33:48.043675: step: 1508/527, loss: 0.006192333530634642 2023-01-22 19:33:49.088550: step: 1512/527, loss: 0.005386924371123314 2023-01-22 19:33:50.138601: step: 1516/527, loss: 0.004054374527186155 2023-01-22 19:33:51.189584: step: 1520/527, loss: 0.07301867008209229 2023-01-22 19:33:52.260364: step: 1524/527, loss: 0.04305405169725418 2023-01-22 19:33:53.312340: step: 1528/527, loss: 0.012200158089399338 2023-01-22 19:33:54.352939: step: 1532/527, loss: 0.015273768454790115 2023-01-22 19:33:55.404533: step: 1536/527, loss: 0.030957499518990517 2023-01-22 19:33:56.458348: step: 1540/527, loss: 0.01303430087864399 2023-01-22 19:33:57.484551: step: 1544/527, loss: 0.024624032899737358 2023-01-22 19:33:58.547453: step: 1548/527, loss: 0.007791030686348677 2023-01-22 19:33:59.620637: step: 1552/527, loss: 0.010313978418707848 2023-01-22 19:34:00.675418: step: 1556/527, loss: 0.009440312162041664 2023-01-22 19:34:01.708882: step: 1560/527, loss: 0.008625865913927555 2023-01-22 19:34:02.767778: step: 1564/527, loss: 0.007325490936636925 2023-01-22 19:34:03.825335: step: 1568/527, loss: 0.010690315626561642 2023-01-22 19:34:04.879217: step: 1572/527, loss: 0.018749846145510674 2023-01-22 19:34:05.921441: step: 1576/527, loss: 0.005152354948222637 2023-01-22 19:34:06.976702: step: 1580/527, loss: 0.04477600008249283 2023-01-22 19:34:08.018602: step: 1584/527, loss: 0.011427847668528557 2023-01-22 19:34:09.062171: step: 1588/527, loss: 0.007948131300508976 2023-01-22 19:34:10.110424: step: 1592/527, loss: 0.041392892599105835 2023-01-22 19:34:11.161157: step: 1596/527, loss: 
0.009622021578252316 2023-01-22 19:34:12.205930: step: 1600/527, loss: 0.0037746590096503496 2023-01-22 19:34:13.234123: step: 1604/527, loss: 0.01732400804758072 2023-01-22 19:34:14.274252: step: 1608/527, loss: 0.03905663266777992 2023-01-22 19:34:15.322863: step: 1612/527, loss: 0.02227986603975296 2023-01-22 19:34:16.364149: step: 1616/527, loss: 0.014169770292937756 2023-01-22 19:34:17.406057: step: 1620/527, loss: 0.014529128558933735 2023-01-22 19:34:18.439794: step: 1624/527, loss: 0.004460591822862625 2023-01-22 19:34:19.490367: step: 1628/527, loss: 0.02952991984784603 2023-01-22 19:34:20.542783: step: 1632/527, loss: 0.0642104521393776 2023-01-22 19:34:21.587463: step: 1636/527, loss: 0.03177327662706375 2023-01-22 19:34:22.660327: step: 1640/527, loss: 0.00895645096898079 2023-01-22 19:34:23.698824: step: 1644/527, loss: 0.011493697762489319 2023-01-22 19:34:24.742066: step: 1648/527, loss: 0.002630829345434904 2023-01-22 19:34:25.796227: step: 1652/527, loss: 0.05805065110325813 2023-01-22 19:34:26.833205: step: 1656/527, loss: 0.008323252201080322 2023-01-22 19:34:27.891028: step: 1660/527, loss: 0.00827107671648264 2023-01-22 19:34:28.930615: step: 1664/527, loss: 0.02883150614798069 2023-01-22 19:34:29.995998: step: 1668/527, loss: 0.009317230433225632 2023-01-22 19:34:31.032333: step: 1672/527, loss: 0.031653061509132385 2023-01-22 19:34:32.079388: step: 1676/527, loss: 0.013449599035084248 2023-01-22 19:34:33.107930: step: 1680/527, loss: 0.00865088403224945 2023-01-22 19:34:34.157546: step: 1684/527, loss: 0.009988667443394661 2023-01-22 19:34:35.196407: step: 1688/527, loss: 0.034307632595300674 2023-01-22 19:34:36.235883: step: 1692/527, loss: 0.008244461379945278 2023-01-22 19:34:37.271979: step: 1696/527, loss: 0.015821607783436775 2023-01-22 19:34:38.313194: step: 1700/527, loss: 0.014487474225461483 2023-01-22 19:34:39.359836: step: 1704/527, loss: 0.006990600842982531 2023-01-22 19:34:40.424070: step: 1708/527, loss: 0.0161539688706398 
2023-01-22 19:34:41.460320: step: 1712/527, loss: 0.025770502164959908 2023-01-22 19:34:42.509820: step: 1716/527, loss: 0.01005584467202425 2023-01-22 19:34:43.553481: step: 1720/527, loss: 0.0176510289311409 2023-01-22 19:34:44.614172: step: 1724/527, loss: 0.030513670295476913 2023-01-22 19:34:45.649444: step: 1728/527, loss: 0.013273200020194054 2023-01-22 19:34:46.697011: step: 1732/527, loss: 0.010769909247756004 2023-01-22 19:34:47.747632: step: 1736/527, loss: 0.053587283939123154 2023-01-22 19:34:48.782855: step: 1740/527, loss: 0.013260331936180592 2023-01-22 19:34:49.844956: step: 1744/527, loss: 0.041208021342754364 2023-01-22 19:34:50.909555: step: 1748/527, loss: 0.006117898039519787 2023-01-22 19:34:51.945768: step: 1752/527, loss: 0.04246333986520767 2023-01-22 19:34:53.006402: step: 1756/527, loss: 0.007902263663709164 2023-01-22 19:34:54.059350: step: 1760/527, loss: 0.037524960935115814 2023-01-22 19:34:55.104606: step: 1764/527, loss: 0.00913565419614315 2023-01-22 19:34:56.139898: step: 1768/527, loss: 0.021684909239411354 2023-01-22 19:34:57.185676: step: 1772/527, loss: 0.03140508010983467 2023-01-22 19:34:58.220485: step: 1776/527, loss: 0.008953028358519077 2023-01-22 19:34:59.250543: step: 1780/527, loss: 0.04349980130791664 2023-01-22 19:35:00.299211: step: 1784/527, loss: 0.008197353221476078 2023-01-22 19:35:01.337648: step: 1788/527, loss: 0.006808818783611059 2023-01-22 19:35:02.356321: step: 1792/527, loss: 0.005978906527161598 2023-01-22 19:35:03.405763: step: 1796/527, loss: 0.01945285126566887 2023-01-22 19:35:04.453711: step: 1800/527, loss: 0.007546133361756802 2023-01-22 19:35:05.513550: step: 1804/527, loss: 0.01640220545232296 2023-01-22 19:35:06.579795: step: 1808/527, loss: 0.041898034512996674 2023-01-22 19:35:07.637726: step: 1812/527, loss: 0.008640028536319733 2023-01-22 19:35:08.665855: step: 1816/527, loss: 0.017172126099467278 2023-01-22 19:35:09.705649: step: 1820/527, loss: 0.009918355382978916 2023-01-22 
19:35:10.761983: step: 1824/527, loss: 0.02356589213013649 2023-01-22 19:35:11.810404: step: 1828/527, loss: 0.021140605211257935 2023-01-22 19:35:12.867760: step: 1832/527, loss: 0.0565682053565979 2023-01-22 19:35:13.903069: step: 1836/527, loss: 0.010100433602929115 2023-01-22 19:35:14.948297: step: 1840/527, loss: 0.006153437774628401 2023-01-22 19:35:15.986790: step: 1844/527, loss: 0.03274431452155113 2023-01-22 19:35:17.032762: step: 1848/527, loss: 0.01192291546612978 2023-01-22 19:35:18.083258: step: 1852/527, loss: 0.009296722710132599 2023-01-22 19:35:19.115120: step: 1856/527, loss: 0.009591406211256981 2023-01-22 19:35:20.183158: step: 1860/527, loss: 0.026896940544247627 2023-01-22 19:35:21.218818: step: 1864/527, loss: 0.03655450791120529 2023-01-22 19:35:22.257675: step: 1868/527, loss: 0.0128747234120965 2023-01-22 19:35:23.305485: step: 1872/527, loss: 0.012429596856236458 2023-01-22 19:35:24.341166: step: 1876/527, loss: 0.009383030235767365 2023-01-22 19:35:25.382589: step: 1880/527, loss: 0.02010120078921318 2023-01-22 19:35:26.427054: step: 1884/527, loss: 0.009326232597231865 2023-01-22 19:35:27.458345: step: 1888/527, loss: 0.012102792039513588 2023-01-22 19:35:28.495400: step: 1892/527, loss: 0.006842283997684717 2023-01-22 19:35:29.568720: step: 1896/527, loss: 0.03691709414124489 2023-01-22 19:35:30.611231: step: 1900/527, loss: 0.023316722363233566 2023-01-22 19:35:31.645320: step: 1904/527, loss: 0.009631594642996788 2023-01-22 19:35:32.696767: step: 1908/527, loss: 0.015783611685037613 2023-01-22 19:35:33.737396: step: 1912/527, loss: 0.01289606187492609 2023-01-22 19:35:34.788506: step: 1916/527, loss: 0.02982412464916706 2023-01-22 19:35:35.832849: step: 1920/527, loss: 0.018698425963521004 2023-01-22 19:35:36.882214: step: 1924/527, loss: 0.0043085478246212006 2023-01-22 19:35:37.931363: step: 1928/527, loss: 0.012623300775885582 2023-01-22 19:35:38.997667: step: 1932/527, loss: 0.034485820680856705 2023-01-22 19:35:40.059568: step: 
1936/527, loss: 0.0386204868555069 2023-01-22 19:35:41.091595: step: 1940/527, loss: 0.012891515158116817 2023-01-22 19:35:42.128259: step: 1944/527, loss: 0.038250040262937546 2023-01-22 19:35:43.167997: step: 1948/527, loss: 0.011790971271693707 2023-01-22 19:35:44.212166: step: 1952/527, loss: 0.01894655078649521 2023-01-22 19:35:45.253172: step: 1956/527, loss: 0.012213597074151039 2023-01-22 19:35:46.305378: step: 1960/527, loss: 0.0093131298199296 2023-01-22 19:35:47.340336: step: 1964/527, loss: 0.007838459685444832 2023-01-22 19:35:48.394095: step: 1968/527, loss: 0.014537162147462368 2023-01-22 19:35:49.467621: step: 1972/527, loss: 0.01118561252951622 2023-01-22 19:35:50.531598: step: 1976/527, loss: 0.0030727682169526815 2023-01-22 19:35:51.587831: step: 1980/527, loss: 0.008721991442143917 2023-01-22 19:35:52.648670: step: 1984/527, loss: 0.020510070025920868 2023-01-22 19:35:53.692181: step: 1988/527, loss: 0.01436369027942419 2023-01-22 19:35:54.726856: step: 1992/527, loss: 0.03267817199230194 2023-01-22 19:35:55.782550: step: 1996/527, loss: 0.01706632599234581 2023-01-22 19:35:56.836079: step: 2000/527, loss: 0.007094400003552437 2023-01-22 19:35:57.884849: step: 2004/527, loss: 0.011377047747373581 2023-01-22 19:35:58.928931: step: 2008/527, loss: 0.026451947167515755 2023-01-22 19:35:59.974636: step: 2012/527, loss: 0.03436955064535141 2023-01-22 19:36:01.025703: step: 2016/527, loss: 0.00371461920440197 2023-01-22 19:36:02.065594: step: 2020/527, loss: 0.019599679857492447 2023-01-22 19:36:03.108034: step: 2024/527, loss: 0.01290049683302641 2023-01-22 19:36:04.155627: step: 2028/527, loss: 0.003558845492079854 2023-01-22 19:36:05.183616: step: 2032/527, loss: 0.005182476714253426 2023-01-22 19:36:06.219746: step: 2036/527, loss: 0.002554287202656269 2023-01-22 19:36:07.278889: step: 2040/527, loss: 0.011803234927356243 2023-01-22 19:36:08.335417: step: 2044/527, loss: 0.0056258318945765495 2023-01-22 19:36:09.373069: step: 2048/527, loss: 
0.02907024510204792 2023-01-22 19:36:10.417719: step: 2052/527, loss: 0.034525007009506226 2023-01-22 19:36:11.452110: step: 2056/527, loss: 0.006202684249728918 2023-01-22 19:36:12.502389: step: 2060/527, loss: 0.01596401445567608 2023-01-22 19:36:13.568149: step: 2064/527, loss: 0.052929650992155075 2023-01-22 19:36:14.628942: step: 2068/527, loss: 0.03775949031114578 2023-01-22 19:36:15.662433: step: 2072/527, loss: 0.008293618448078632 2023-01-22 19:36:16.734325: step: 2076/527, loss: 0.057668909430503845 2023-01-22 19:36:17.789625: step: 2080/527, loss: 0.0658436194062233 2023-01-22 19:36:18.824641: step: 2084/527, loss: 0.011614012531936169 2023-01-22 19:36:19.851324: step: 2088/527, loss: 0.011249169707298279 2023-01-22 19:36:20.896407: step: 2092/527, loss: 0.005233381409198046 2023-01-22 19:36:21.953314: step: 2096/527, loss: 0.031973328441381454 2023-01-22 19:36:23.013145: step: 2100/527, loss: 0.004628822207450867 2023-01-22 19:36:24.077742: step: 2104/527, loss: 0.0032797774765640497 2023-01-22 19:36:25.123134: step: 2108/527, loss: 0.01285646203905344
==================================================
Loss: 0.022
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0}
Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0}
Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3264723230548723, 'r': 0.33576470416649107, 'f1': 0.3310533191688322}, 'combined': 0.24393402465071845, 'stategy': 1, 'epoch': 0}
Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.35636473460031554, 'r': 0.2976731814040852, 'f1': 0.32438554919493273}, 'combined': 0.23257831829070652, 'stategy': 1, 'epoch': 0}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 0}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0}
Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3264723230548723, 'r': 0.33576470416649107, 'f1': 0.3310533191688322}, 'combined': 0.24393402465071845, 'stategy': 1, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.35636473460031554, 'r': 0.2976731814040852, 'f1': 0.32438554919493273}, 'combined': 0.23257831829070652, 'stategy': 1, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 0}
******************************
Epoch: 1
command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4
2023-01-22 19:39:22.613122: step: 4/527, loss: 0.018123280256986618 2023-01-22 19:39:23.645858: step: 8/527, loss: 0.01113036647439003 2023-01-22 19:39:24.672321: step: 12/527, loss: 0.011442981660366058 2023-01-22 19:39:25.719878: step: 16/527, loss: 0.01591738872230053 2023-01-22 19:39:26.757777: step: 20/527, loss: 0.011292441748082638 2023-01-22 19:39:27.781418: step: 24/527, loss: 0.010824518278241158 2023-01-22 19:39:28.836184: step: 28/527, loss: 0.007420710287988186 2023-01-22 19:39:29.879214: step: 32/527, loss: 0.013100599870085716 2023-01-22 19:39:30.919129: step: 36/527, loss: 0.02413242682814598 2023-01-22 19:39:31.974478: step: 40/527, loss: 0.02858263999223709 2023-01-22 19:39:33.012317: step: 44/527, loss: 0.004235970322042704 2023-01-22 19:39:34.051510: step: 48/527, loss: 0.00850728154182434 2023-01-22 19:39:35.102202: step: 52/527, loss: 0.01044514961540699 2023-01-22 19:39:36.142787: step: 56/527, loss: 0.008220076560974121 2023-01-22 19:39:37.197865: step: 60/527, loss: 0.014187716878950596 2023-01-22 19:39:38.234633: step: 64/527, loss: 0.005522199906408787 2023-01-22 19:39:39.291685: step: 68/527, loss: 0.010371244512498379 2023-01-22 19:39:40.330799: step: 72/527, loss: 0.00860979501157999 2023-01-22 19:39:41.376027: step: 76/527, loss: 0.010606663301587105 2023-01-22 19:39:42.410546: step: 80/527, loss: 0.011769460514187813 2023-01-22 19:39:43.457715: step: 84/527, loss: 0.04180312901735306 2023-01-22 19:39:44.499524: step: 
88/527, loss: 0.004522264935076237 2023-01-22 19:39:45.550098: step: 92/527, loss: 0.04639697074890137 2023-01-22 19:39:46.609900: step: 96/527, loss: 0.016354702413082123 2023-01-22 19:39:47.641565: step: 100/527, loss: 0.01197791751474142 2023-01-22 19:39:48.674366: step: 104/527, loss: 0.008995951153337955 2023-01-22 19:39:49.716506: step: 108/527, loss: 0.01863301731646061 2023-01-22 19:39:50.812292: step: 112/527, loss: 0.04583723843097687 2023-01-22 19:39:51.858843: step: 116/527, loss: 0.036483559757471085 2023-01-22 19:39:52.891338: step: 120/527, loss: 0.02192608267068863 2023-01-22 19:39:53.937331: step: 124/527, loss: 0.05907829478383064 2023-01-22 19:39:54.983067: step: 128/527, loss: 0.010600809939205647 2023-01-22 19:39:56.030345: step: 132/527, loss: 0.007352891843765974 2023-01-22 19:39:57.073990: step: 136/527, loss: 0.00801857840269804 2023-01-22 19:39:58.122801: step: 140/527, loss: 0.004974058363586664 2023-01-22 19:39:59.176774: step: 144/527, loss: 0.024974599480628967 2023-01-22 19:40:00.236601: step: 148/527, loss: 0.014170176349580288 2023-01-22 19:40:01.287052: step: 152/527, loss: 0.014072049409151077 2023-01-22 19:40:02.331637: step: 156/527, loss: 0.0 2023-01-22 19:40:03.384493: step: 160/527, loss: 0.005382283125072718 2023-01-22 19:40:04.428707: step: 164/527, loss: 0.015689590945839882 2023-01-22 19:40:05.487702: step: 168/527, loss: 0.03355139121413231 2023-01-22 19:40:06.551610: step: 172/527, loss: 0.01109452173113823 2023-01-22 19:40:07.600025: step: 176/527, loss: 0.010349083691835403 2023-01-22 19:40:08.636414: step: 180/527, loss: 0.008498347364366055 2023-01-22 19:40:09.673871: step: 184/527, loss: 0.013933093287050724 2023-01-22 19:40:10.737779: step: 188/527, loss: 0.008587097749114037 2023-01-22 19:40:11.780433: step: 192/527, loss: 0.017063884064555168 2023-01-22 19:40:12.831609: step: 196/527, loss: 0.0087741669267416 2023-01-22 19:40:13.878365: step: 200/527, loss: 0.012951512821018696 2023-01-22 19:40:14.926870: step: 
204/527, loss: 0.009218239225447178 2023-01-22 19:40:15.996670: step: 208/527, loss: 0.007676574867218733 2023-01-22 19:40:17.041524: step: 212/527, loss: 0.003308728104457259 2023-01-22 19:40:18.080845: step: 216/527, loss: 0.00992603600025177 2023-01-22 19:40:19.124137: step: 220/527, loss: 0.06028809770941734 2023-01-22 19:40:20.186330: step: 224/527, loss: 0.02564755082130432 2023-01-22 19:40:21.243085: step: 228/527, loss: 0.010617483407258987 2023-01-22 19:40:22.297199: step: 232/527, loss: 0.01835789903998375 2023-01-22 19:40:23.341653: step: 236/527, loss: 0.015816032886505127 2023-01-22 19:40:24.391325: step: 240/527, loss: 0.008347642607986927 2023-01-22 19:40:25.469158: step: 244/527, loss: 0.03272389993071556 2023-01-22 19:40:26.518012: step: 248/527, loss: 0.010540221817791462 2023-01-22 19:40:27.560509: step: 252/527, loss: 0.06438185274600983 2023-01-22 19:40:28.634625: step: 256/527, loss: 0.002308423165231943 2023-01-22 19:40:29.695248: step: 260/527, loss: 0.0038301192689687014 2023-01-22 19:40:30.740567: step: 264/527, loss: 0.012879110872745514 2023-01-22 19:40:31.786948: step: 268/527, loss: 0.047585975378751755 2023-01-22 19:40:32.834542: step: 272/527, loss: 0.007781265303492546 2023-01-22 19:40:33.897419: step: 276/527, loss: 0.014498968608677387 2023-01-22 19:40:34.937264: step: 280/527, loss: 0.027096187695860863 2023-01-22 19:40:35.974088: step: 284/527, loss: 0.011391350999474525 2023-01-22 19:40:37.035700: step: 288/527, loss: 0.0075580524280667305 2023-01-22 19:40:38.096008: step: 292/527, loss: 0.02844475395977497 2023-01-22 19:40:39.148019: step: 296/527, loss: 0.029594112187623978 2023-01-22 19:40:40.198774: step: 300/527, loss: 0.004208073019981384 2023-01-22 19:40:41.254455: step: 304/527, loss: 0.01020155567675829 2023-01-22 19:40:42.306435: step: 308/527, loss: 0.009681605733931065 2023-01-22 19:40:43.356007: step: 312/527, loss: 0.017227400094270706 2023-01-22 19:40:44.400723: step: 316/527, loss: 0.008385112509131432 
2023-01-22 19:40:45.447198: step: 320/527, loss: 0.0276352446526289 2023-01-22 19:40:46.479870: step: 324/527, loss: 0.014857093803584576 2023-01-22 19:40:47.539165: step: 328/527, loss: 0.014654400758445263 2023-01-22 19:40:48.580551: step: 332/527, loss: 0.003937077708542347 2023-01-22 19:40:49.624673: step: 336/527, loss: 0.006456168368458748 2023-01-22 19:40:50.679994: step: 340/527, loss: 0.009996989741921425 2023-01-22 19:40:51.724418: step: 344/527, loss: 0.047514837235212326 2023-01-22 19:40:52.801245: step: 348/527, loss: 0.006446389947086573 2023-01-22 19:40:53.843804: step: 352/527, loss: 0.03986106067895889 2023-01-22 19:40:54.887639: step: 356/527, loss: 0.013090807944536209 2023-01-22 19:40:55.931856: step: 360/527, loss: 0.007790746167302132 2023-01-22 19:40:56.976654: step: 364/527, loss: 0.004554046783596277 2023-01-22 19:40:58.049613: step: 368/527, loss: 0.014674236066639423 2023-01-22 19:40:59.103807: step: 372/527, loss: 0.026796650141477585 2023-01-22 19:41:00.157160: step: 376/527, loss: 0.02358562871813774 2023-01-22 19:41:01.195991: step: 380/527, loss: 0.044411905109882355 2023-01-22 19:41:02.239588: step: 384/527, loss: 0.012452136725187302 2023-01-22 19:41:03.285188: step: 388/527, loss: 0.028408855199813843 2023-01-22 19:41:04.342463: step: 392/527, loss: 0.04561150446534157 2023-01-22 19:41:05.380546: step: 396/527, loss: 0.008928496390581131 2023-01-22 19:41:06.432389: step: 400/527, loss: 0.014053450897336006 2023-01-22 19:41:07.491822: step: 404/527, loss: 0.010381078347563744 2023-01-22 19:41:08.524766: step: 408/527, loss: 0.005319703370332718 2023-01-22 19:41:09.570172: step: 412/527, loss: 0.008386380970478058 2023-01-22 19:41:10.618249: step: 416/527, loss: 0.04453711956739426 2023-01-22 19:41:11.656802: step: 420/527, loss: 0.011459262110292912 2023-01-22 19:41:12.728131: step: 424/527, loss: 0.04532322660088539 2023-01-22 19:41:13.778341: step: 428/527, loss: 0.032943982630968094 2023-01-22 19:41:14.831344: step: 432/527, 
loss: 0.020364120602607727 2023-01-22 19:41:15.890294: step: 436/527, loss: 0.02971181832253933 2023-01-22 19:41:16.944263: step: 440/527, loss: 0.011676590889692307 2023-01-22 19:41:17.996392: step: 444/527, loss: 0.012562346644699574 2023-01-22 19:41:19.059619: step: 448/527, loss: 0.018417175859212875 2023-01-22 19:41:20.113575: step: 452/527, loss: 0.02813078835606575 2023-01-22 19:41:21.169370: step: 456/527, loss: 0.019882671535015106 2023-01-22 19:41:22.216435: step: 460/527, loss: 0.017151357606053352 2023-01-22 19:41:23.258367: step: 464/527, loss: 0.009238461032509804 2023-01-22 19:41:24.329413: step: 468/527, loss: 0.01628982648253441 2023-01-22 19:41:25.380635: step: 472/527, loss: 0.009066357277333736 2023-01-22 19:41:26.425021: step: 476/527, loss: 0.01865309663116932 2023-01-22 19:41:27.466598: step: 480/527, loss: 0.003974127117544413 2023-01-22 19:41:28.507846: step: 484/527, loss: 0.004207334015518427 2023-01-22 19:41:29.553250: step: 488/527, loss: 0.024722395464777946 2023-01-22 19:41:30.593906: step: 492/527, loss: 0.002124141901731491 2023-01-22 19:41:31.654013: step: 496/527, loss: 0.01544024795293808 2023-01-22 19:41:32.692730: step: 500/527, loss: 0.019356610253453255 2023-01-22 19:41:33.737892: step: 504/527, loss: 0.014582685194909573 2023-01-22 19:41:34.780958: step: 508/527, loss: 0.0061416951939463615 2023-01-22 19:41:35.838983: step: 512/527, loss: 0.021815327927470207 2023-01-22 19:41:36.902260: step: 516/527, loss: 0.006505594588816166 2023-01-22 19:41:37.952415: step: 520/527, loss: 0.01506065670400858 2023-01-22 19:41:38.992042: step: 524/527, loss: 0.028149817138910294 2023-01-22 19:41:40.037440: step: 528/527, loss: 0.01869608648121357 2023-01-22 19:41:41.093743: step: 532/527, loss: 0.006509323138743639 2023-01-22 19:41:42.129985: step: 536/527, loss: 0.002958935219794512 2023-01-22 19:41:43.166095: step: 540/527, loss: 0.005710993893444538 2023-01-22 19:41:44.212984: step: 544/527, loss: 0.012775847688317299 2023-01-22 
19:41:45.264845: step: 548/527, loss: 0.01182165089994669 2023-01-22 19:41:46.321069: step: 552/527, loss: 0.0019352142699062824 2023-01-22 19:41:47.366749: step: 556/527, loss: 0.010771512985229492 2023-01-22 19:41:48.443921: step: 560/527, loss: 0.021161146461963654 2023-01-22 19:41:49.480534: step: 564/527, loss: 0.004150019027292728 2023-01-22 19:41:50.528043: step: 568/527, loss: 0.011796188540756702 2023-01-22 19:41:51.592324: step: 572/527, loss: 0.013756037689745426 2023-01-22 19:41:52.647223: step: 576/527, loss: 0.006406253203749657 2023-01-22 19:41:53.695497: step: 580/527, loss: 0.05517202988266945 2023-01-22 19:41:54.746652: step: 584/527, loss: 0.0384361632168293 2023-01-22 19:41:55.799520: step: 588/527, loss: 0.030206233263015747 2023-01-22 19:41:56.839536: step: 592/527, loss: 0.016878217458724976 2023-01-22 19:41:57.883070: step: 596/527, loss: 0.03763467073440552 2023-01-22 19:41:58.932540: step: 600/527, loss: 0.013674174435436726 2023-01-22 19:41:59.989947: step: 604/527, loss: 0.026607193052768707 2023-01-22 19:42:01.031079: step: 608/527, loss: 0.021218687295913696 2023-01-22 19:42:02.076602: step: 612/527, loss: 0.0059247203171253204 2023-01-22 19:42:03.122551: step: 616/527, loss: 0.011762079782783985 2023-01-22 19:42:04.157108: step: 620/527, loss: 0.0027541397139430046 2023-01-22 19:42:05.201968: step: 624/527, loss: 0.013390828855335712 2023-01-22 19:42:06.258418: step: 628/527, loss: 0.011553638614714146 2023-01-22 19:42:07.291118: step: 632/527, loss: 0.0 2023-01-22 19:42:08.343733: step: 636/527, loss: 0.007285550236701965 2023-01-22 19:42:09.378713: step: 640/527, loss: 0.0054983580484986305 2023-01-22 19:42:10.417390: step: 644/527, loss: 0.010017899796366692 2023-01-22 19:42:11.446347: step: 648/527, loss: 0.047052692621946335 2023-01-22 19:42:12.511010: step: 652/527, loss: 0.006029477808624506 2023-01-22 19:42:13.549356: step: 656/527, loss: 0.003855266375467181 2023-01-22 19:42:14.591335: step: 660/527, loss: 0.0611177533864975 
2023-01-22 19:42:15.645247: step: 664/527, loss: 0.00275116297416389 2023-01-22 19:42:16.680438: step: 668/527, loss: 0.0023651104420423508 2023-01-22 19:42:17.731296: step: 672/527, loss: 0.007305797655135393 2023-01-22 19:42:18.771737: step: 676/527, loss: 0.00989602506160736 2023-01-22 19:42:19.821966: step: 680/527, loss: 0.03270602971315384 2023-01-22 19:42:20.870560: step: 684/527, loss: 0.006119735073298216 2023-01-22 19:42:21.924265: step: 688/527, loss: 0.017699265852570534 2023-01-22 19:42:22.984811: step: 692/527, loss: 0.0 2023-01-22 19:42:24.039844: step: 696/527, loss: 0.008197142742574215 2023-01-22 19:42:25.105481: step: 700/527, loss: 0.008787122555077076 2023-01-22 19:42:26.163196: step: 704/527, loss: 0.009869945235550404 2023-01-22 19:42:27.203570: step: 708/527, loss: 0.007514165714383125 2023-01-22 19:42:28.242648: step: 712/527, loss: 0.018841061741113663 2023-01-22 19:42:29.276322: step: 716/527, loss: 0.007815783843398094 2023-01-22 19:42:30.334227: step: 720/527, loss: 0.023072386160492897 2023-01-22 19:42:31.400987: step: 724/527, loss: 0.021277835592627525 2023-01-22 19:42:32.460325: step: 728/527, loss: 0.008390788920223713 2023-01-22 19:42:33.503436: step: 732/527, loss: 0.013654526323080063 2023-01-22 19:42:34.561966: step: 736/527, loss: 0.006143168080598116 2023-01-22 19:42:35.602134: step: 740/527, loss: 0.004079313948750496 2023-01-22 19:42:36.653588: step: 744/527, loss: 0.006833904888480902 2023-01-22 19:42:37.706462: step: 748/527, loss: 0.036992333829402924 2023-01-22 19:42:38.744171: step: 752/527, loss: 0.019806277006864548 2023-01-22 19:42:39.806249: step: 756/527, loss: 0.03341313824057579 2023-01-22 19:42:40.856825: step: 760/527, loss: 0.0044708638451993465 2023-01-22 19:42:41.915206: step: 764/527, loss: 0.013582027517259121 2023-01-22 19:42:42.962376: step: 768/527, loss: 0.023160278797149658 2023-01-22 19:42:43.997568: step: 772/527, loss: 0.008496263064444065 2023-01-22 19:42:45.033211: step: 776/527, loss: 
0.009298603050410748 2023-01-22 19:42:46.068434: step: 780/527, loss: 0.01866867020726204 2023-01-22 19:42:47.122712: step: 784/527, loss: 0.00896376371383667 2023-01-22 19:42:48.185715: step: 788/527, loss: 0.005661291535943747 2023-01-22 19:42:49.253233: step: 792/527, loss: 0.0055063762702047825 2023-01-22 19:42:50.296786: step: 796/527, loss: 0.007555162068456411 2023-01-22 19:42:51.344823: step: 800/527, loss: 0.013754550367593765 2023-01-22 19:42:52.411074: step: 804/527, loss: 0.01965765841305256 2023-01-22 19:42:53.457163: step: 808/527, loss: 0.002758193761110306 2023-01-22 19:42:54.509700: step: 812/527, loss: 0.013669691048562527 2023-01-22 19:42:55.554471: step: 816/527, loss: 0.045681025832891464 2023-01-22 19:42:56.598449: step: 820/527, loss: 0.06644242256879807 2023-01-22 19:42:57.650582: step: 824/527, loss: 0.013010891154408455 2023-01-22 19:42:58.698278: step: 828/527, loss: 0.048347603529691696 2023-01-22 19:42:59.756785: step: 832/527, loss: 0.010545626282691956 2023-01-22 19:43:00.804792: step: 836/527, loss: 0.0030454867519438267 2023-01-22 19:43:01.841876: step: 840/527, loss: 0.00696616992354393 2023-01-22 19:43:02.880805: step: 844/527, loss: 0.0010580855887383223 2023-01-22 19:43:03.935787: step: 848/527, loss: 0.007512359414249659 2023-01-22 19:43:04.977072: step: 852/527, loss: 0.015254289843142033 2023-01-22 19:43:06.030883: step: 856/527, loss: 0.024053217843174934 2023-01-22 19:43:07.090710: step: 860/527, loss: 0.011806032620370388 2023-01-22 19:43:08.164883: step: 864/527, loss: 0.0076525830663740635 2023-01-22 19:43:09.225149: step: 868/527, loss: 0.028572598472237587 2023-01-22 19:43:10.267633: step: 872/527, loss: 0.005810061935335398 2023-01-22 19:43:11.307744: step: 876/527, loss: 0.005152808502316475 2023-01-22 19:43:12.343797: step: 880/527, loss: 0.01600806601345539 2023-01-22 19:43:13.406015: step: 884/527, loss: 0.0075842938385903835 2023-01-22 19:43:14.445915: step: 888/527, loss: 0.00941468682140112 2023-01-22 
19:43:15.501425: step: 892/527, loss: 0.026428719982504845 2023-01-22 19:43:16.550924: step: 896/527, loss: 0.0038772111292928457 2023-01-22 19:43:17.587221: step: 900/527, loss: 0.011145041324198246 2023-01-22 19:43:18.617266: step: 904/527, loss: 0.028039876371622086 2023-01-22 19:43:19.697479: step: 908/527, loss: 0.007937535643577576 2023-01-22 19:43:20.730767: step: 912/527, loss: 0.006360695231705904 2023-01-22 19:43:21.788828: step: 916/527, loss: 0.03871196135878563 2023-01-22 19:43:22.833475: step: 920/527, loss: 0.026212390512228012 2023-01-22 19:43:23.900687: step: 924/527, loss: 0.007839180529117584 2023-01-22 19:43:24.943250: step: 928/527, loss: 0.017373070120811462 2023-01-22 19:43:25.974765: step: 932/527, loss: 0.011837446130812168 2023-01-22 19:43:26.999821: step: 936/527, loss: 0.005958031862974167 2023-01-22 19:43:28.045715: step: 940/527, loss: 0.01136813685297966 2023-01-22 19:43:29.109391: step: 944/527, loss: 0.010612448677420616 2023-01-22 19:43:30.164897: step: 948/527, loss: 0.010316536761820316 2023-01-22 19:43:31.214057: step: 952/527, loss: 0.0356927253305912 2023-01-22 19:43:32.250616: step: 956/527, loss: 0.048618774861097336 2023-01-22 19:43:33.301561: step: 960/527, loss: 0.008397760801017284 2023-01-22 19:43:34.359000: step: 964/527, loss: 0.027125095948576927 2023-01-22 19:43:35.401527: step: 968/527, loss: 0.003724482608959079 2023-01-22 19:43:36.446785: step: 972/527, loss: 0.005296440329402685 2023-01-22 19:43:37.481214: step: 976/527, loss: 0.009009646251797676 2023-01-22 19:43:38.520386: step: 980/527, loss: 0.00929985474795103 2023-01-22 19:43:39.564340: step: 984/527, loss: 0.007095534820109606 2023-01-22 19:43:40.635661: step: 988/527, loss: 0.007457053754478693 2023-01-22 19:43:41.676120: step: 992/527, loss: 0.0038232351653277874 2023-01-22 19:43:42.723518: step: 996/527, loss: 0.009703759104013443 2023-01-22 19:43:43.764859: step: 1000/527, loss: 0.008962608873844147 2023-01-22 19:43:44.823049: step: 1004/527, loss: 
0.004763863980770111 2023-01-22 19:43:45.869840: step: 1008/527, loss: 0.008955919183790684 2023-01-22 19:43:46.921886: step: 1012/527, loss: 0.03379448503255844 2023-01-22 19:43:47.950643: step: 1016/527, loss: 0.009278319776058197 2023-01-22 19:43:48.993183: step: 1020/527, loss: 0.003386962693184614 2023-01-22 19:43:50.041605: step: 1024/527, loss: 0.008770664222538471 2023-01-22 19:43:51.084701: step: 1028/527, loss: 0.008762028999626637 2023-01-22 19:43:52.132628: step: 1032/527, loss: 0.013162856921553612 2023-01-22 19:43:53.189290: step: 1036/527, loss: 0.040115103125572205 2023-01-22 19:43:54.230062: step: 1040/527, loss: 0.013916801661252975 2023-01-22 19:43:55.280670: step: 1044/527, loss: 0.005811354145407677 2023-01-22 19:43:56.354317: step: 1048/527, loss: 0.0283705722540617 2023-01-22 19:43:57.415814: step: 1052/527, loss: 0.008610613644123077 2023-01-22 19:43:58.467065: step: 1056/527, loss: 0.012159034609794617 2023-01-22 19:43:59.510990: step: 1060/527, loss: 0.007214654702693224 2023-01-22 19:44:00.554289: step: 1064/527, loss: 0.009670075960457325 2023-01-22 19:44:01.594028: step: 1068/527, loss: 0.020627478137612343 2023-01-22 19:44:02.648462: step: 1072/527, loss: 0.008537794463336468 2023-01-22 19:44:03.702582: step: 1076/527, loss: 0.004947997163981199 2023-01-22 19:44:04.749612: step: 1080/527, loss: 0.012593166902661324 2023-01-22 19:44:05.804814: step: 1084/527, loss: 0.03236313536763191 2023-01-22 19:44:06.840415: step: 1088/527, loss: 0.01142879854887724 2023-01-22 19:44:07.888663: step: 1092/527, loss: 0.009611567482352257 2023-01-22 19:44:08.933529: step: 1096/527, loss: 0.006485611200332642 2023-01-22 19:44:09.970717: step: 1100/527, loss: 0.0049271974712610245 2023-01-22 19:44:11.002719: step: 1104/527, loss: 0.054322656244039536 2023-01-22 19:44:12.054735: step: 1108/527, loss: 0.015809351578354836 2023-01-22 19:44:13.099062: step: 1112/527, loss: 0.008611029013991356 2023-01-22 19:44:14.138703: step: 1116/527, loss: 
0.012601575814187527 2023-01-22 19:44:15.185493: step: 1120/527, loss: 0.006335206795483828 2023-01-22 19:44:16.242376: step: 1124/527, loss: 0.01283422764390707 2023-01-22 19:44:17.297105: step: 1128/527, loss: 0.025943251326680183 2023-01-22 19:44:18.356459: step: 1132/527, loss: 0.05394890159368515 2023-01-22 19:44:19.396606: step: 1136/527, loss: 0.0010803146287798882 2023-01-22 19:44:20.457450: step: 1140/527, loss: 0.021296994760632515 2023-01-22 19:44:21.513888: step: 1144/527, loss: 0.008542295545339584 2023-01-22 19:44:22.568731: step: 1148/527, loss: 0.008017108775675297 2023-01-22 19:44:23.637338: step: 1152/527, loss: 0.004137382842600346 2023-01-22 19:44:24.691415: step: 1156/527, loss: 0.010905577801167965 2023-01-22 19:44:25.731823: step: 1160/527, loss: 0.002994515234604478 2023-01-22 19:44:26.774161: step: 1164/527, loss: 0.010548352263867855 2023-01-22 19:44:27.818122: step: 1168/527, loss: 0.011764622293412685 2023-01-22 19:44:28.884580: step: 1172/527, loss: 0.008668174967169762 2023-01-22 19:44:29.919994: step: 1176/527, loss: 0.008785519748926163 2023-01-22 19:44:30.978294: step: 1180/527, loss: 0.007309382315725088 2023-01-22 19:44:32.018415: step: 1184/527, loss: 0.009179351851344109 2023-01-22 19:44:33.059331: step: 1188/527, loss: 0.024498607963323593 2023-01-22 19:44:34.100052: step: 1192/527, loss: 0.010385243222117424 2023-01-22 19:44:35.144234: step: 1196/527, loss: 0.007233790121972561 2023-01-22 19:44:36.197781: step: 1200/527, loss: 0.012378869578242302 2023-01-22 19:44:37.256353: step: 1204/527, loss: 0.006361040752381086 2023-01-22 19:44:38.309830: step: 1208/527, loss: 0.053797416388988495 2023-01-22 19:44:39.360892: step: 1212/527, loss: 0.007483770605176687 2023-01-22 19:44:40.419267: step: 1216/527, loss: 0.023426569998264313 2023-01-22 19:44:41.472314: step: 1220/527, loss: 0.06588196754455566 2023-01-22 19:44:42.515642: step: 1224/527, loss: 0.03428163751959801 2023-01-22 19:44:43.568864: step: 1228/527, loss: 
0.007833165116608143 2023-01-22 19:44:44.630319: step: 1232/527, loss: 0.03375722095370293 2023-01-22 19:44:45.662744: step: 1236/527, loss: 0.007485273759812117 2023-01-22 19:44:46.695563: step: 1240/527, loss: 0.030452409759163857 2023-01-22 19:44:47.726700: step: 1244/527, loss: 0.022304274141788483 2023-01-22 19:44:48.761429: step: 1248/527, loss: 0.011296301148831844 2023-01-22 19:44:49.791239: step: 1252/527, loss: 0.031903501600027084 2023-01-22 19:44:50.848810: step: 1256/527, loss: 0.004422258585691452 2023-01-22 19:44:51.896336: step: 1260/527, loss: 0.04615252465009689 2023-01-22 19:44:52.933732: step: 1264/527, loss: 0.017315508797764778 2023-01-22 19:44:53.984441: step: 1268/527, loss: 0.01905614510178566 2023-01-22 19:44:55.029906: step: 1272/527, loss: 0.03857870772480965 2023-01-22 19:44:56.078815: step: 1276/527, loss: 0.0055539365857839584 2023-01-22 19:44:57.115033: step: 1280/527, loss: 0.010360308922827244 2023-01-22 19:44:58.157740: step: 1284/527, loss: 0.011691239662468433 2023-01-22 19:44:59.199904: step: 1288/527, loss: 0.004534219857305288 2023-01-22 19:45:00.241073: step: 1292/527, loss: 0.011701731011271477 2023-01-22 19:45:01.292355: step: 1296/527, loss: 0.016721613705158234 2023-01-22 19:45:02.338540: step: 1300/527, loss: 0.004042757209390402 2023-01-22 19:45:03.405262: step: 1304/527, loss: 0.010980832390487194 2023-01-22 19:45:04.445052: step: 1308/527, loss: 0.004228595644235611 2023-01-22 19:45:05.484843: step: 1312/527, loss: 0.022516494616866112 2023-01-22 19:45:06.533126: step: 1316/527, loss: 0.02876768447458744 2023-01-22 19:45:07.576868: step: 1320/527, loss: 0.01709855906665325 2023-01-22 19:45:08.631234: step: 1324/527, loss: 0.011386437341570854 2023-01-22 19:45:09.669954: step: 1328/527, loss: 0.007310559507459402 2023-01-22 19:45:10.689990: step: 1332/527, loss: 0.01878192462027073 2023-01-22 19:45:11.754795: step: 1336/527, loss: 0.020790914073586464 2023-01-22 19:45:12.809495: step: 1340/527, loss: 
0.006903341971337795 2023-01-22 19:45:13.857418: step: 1344/527, loss: 0.002603257307782769 2023-01-22 19:45:14.907884: step: 1348/527, loss: 0.004824474919587374 2023-01-22 19:45:15.954142: step: 1352/527, loss: 0.009527553804218769 2023-01-22 19:45:17.001544: step: 1356/527, loss: 0.0046661836095154285 2023-01-22 19:45:18.046896: step: 1360/527, loss: 0.0019602004904299974 2023-01-22 19:45:19.096434: step: 1364/527, loss: 0.005691582802683115 2023-01-22 19:45:20.148884: step: 1368/527, loss: 0.08224951475858688 2023-01-22 19:45:21.201332: step: 1372/527, loss: 0.007733750622719526 2023-01-22 19:45:22.246434: step: 1376/527, loss: 0.0047766296193003654 2023-01-22 19:45:23.312197: step: 1380/527, loss: 0.006940816529095173 2023-01-22 19:45:24.361793: step: 1384/527, loss: 0.012327251955866814 2023-01-22 19:45:25.418941: step: 1388/527, loss: 0.0059159486554563046 2023-01-22 19:45:26.465266: step: 1392/527, loss: 0.005582594778388739 2023-01-22 19:45:27.513813: step: 1396/527, loss: 0.010335014201700687 2023-01-22 19:45:28.558637: step: 1400/527, loss: 0.01100069098174572 2023-01-22 19:45:29.604117: step: 1404/527, loss: 0.014025130309164524 2023-01-22 19:45:30.662841: step: 1408/527, loss: 0.024998778477311134 2023-01-22 19:45:31.709210: step: 1412/527, loss: 0.005577197764068842 2023-01-22 19:45:32.747302: step: 1416/527, loss: 0.014429930597543716 2023-01-22 19:45:33.805672: step: 1420/527, loss: 0.017160825431346893 2023-01-22 19:45:34.847131: step: 1424/527, loss: 0.016941893845796585 2023-01-22 19:45:35.896897: step: 1428/527, loss: 0.012483743950724602 2023-01-22 19:45:36.945719: step: 1432/527, loss: 0.003954943735152483 2023-01-22 19:45:37.983149: step: 1436/527, loss: 0.009415700100362301 2023-01-22 19:45:39.032043: step: 1440/527, loss: 0.009498503059148788 2023-01-22 19:45:40.071902: step: 1444/527, loss: 0.010833312757313251 2023-01-22 19:45:41.135414: step: 1448/527, loss: 0.0066607436165213585 2023-01-22 19:45:42.180267: step: 1452/527, loss: 
0.006890186574310064 2023-01-22 19:45:43.231344: step: 1456/527, loss: 0.0044693974778056145 2023-01-22 19:45:44.268944: step: 1460/527, loss: 0.0029753760900348425 2023-01-22 19:45:45.310004: step: 1464/527, loss: 0.034443605691194534 2023-01-22 19:45:46.351107: step: 1468/527, loss: 0.00898395199328661 2023-01-22 19:45:47.385282: step: 1472/527, loss: 0.008948897942900658 2023-01-22 19:45:48.445220: step: 1476/527, loss: 0.012350128963589668 2023-01-22 19:45:49.528280: step: 1480/527, loss: 0.03569713234901428 2023-01-22 19:45:50.571493: step: 1484/527, loss: 0.005233981646597385 2023-01-22 19:45:51.615538: step: 1488/527, loss: 0.015533572062849998 2023-01-22 19:45:52.673525: step: 1492/527, loss: 0.008105774410068989 2023-01-22 19:45:53.718081: step: 1496/527, loss: 0.0019224288407713175 2023-01-22 19:45:54.764448: step: 1500/527, loss: 0.012415507808327675 2023-01-22 19:45:55.810380: step: 1504/527, loss: 0.0042087240144610405 2023-01-22 19:45:56.857175: step: 1508/527, loss: 0.012882072478532791 2023-01-22 19:45:57.903862: step: 1512/527, loss: 0.012384502217173576 2023-01-22 19:45:58.951828: step: 1516/527, loss: 0.005620758514851332 2023-01-22 19:45:59.993659: step: 1520/527, loss: 0.008363313972949982 2023-01-22 19:46:01.036764: step: 1524/527, loss: 0.008510236628353596 2023-01-22 19:46:02.073993: step: 1528/527, loss: 0.009194700978696346 2023-01-22 19:46:03.134209: step: 1532/527, loss: 0.005406866781413555 2023-01-22 19:46:04.186754: step: 1536/527, loss: 0.005108795594424009 2023-01-22 19:46:05.232986: step: 1540/527, loss: 0.02588803693652153 2023-01-22 19:46:06.260986: step: 1544/527, loss: 0.0032069990411400795 2023-01-22 19:46:07.328749: step: 1548/527, loss: 0.018077164888381958 2023-01-22 19:46:08.374274: step: 1552/527, loss: 0.007549144793301821 2023-01-22 19:46:09.416106: step: 1556/527, loss: 0.006199760362505913 2023-01-22 19:46:10.465335: step: 1560/527, loss: 0.030280020087957382 2023-01-22 19:46:11.507721: step: 1564/527, loss: 
0.021325180307030678 2023-01-22 19:46:12.548262: step: 1568/527, loss: 0.008346924558281898 2023-01-22 19:46:13.595488: step: 1572/527, loss: 0.004945177119225264 2023-01-22 19:46:14.643698: step: 1576/527, loss: 0.004650773946195841 2023-01-22 19:46:15.684963: step: 1580/527, loss: 0.03521714359521866 2023-01-22 19:46:16.739751: step: 1584/527, loss: 0.004980199970304966 2023-01-22 19:46:17.792214: step: 1588/527, loss: 0.011413614265620708 2023-01-22 19:46:18.828705: step: 1592/527, loss: 0.01686055399477482 2023-01-22 19:46:19.879910: step: 1596/527, loss: 0.03376294672489166 2023-01-22 19:46:20.919931: step: 1600/527, loss: 0.008297530002892017 2023-01-22 19:46:21.965837: step: 1604/527, loss: 0.023515764623880386 2023-01-22 19:46:23.007061: step: 1608/527, loss: 0.007583301048725843 2023-01-22 19:46:24.044823: step: 1612/527, loss: 0.0026325020007789135 2023-01-22 19:46:25.087194: step: 1616/527, loss: 0.0022506117820739746 2023-01-22 19:46:26.139073: step: 1620/527, loss: 0.008329613134264946 2023-01-22 19:46:27.204937: step: 1624/527, loss: 0.00849849358201027 2023-01-22 19:46:28.273834: step: 1628/527, loss: 0.008249761536717415 2023-01-22 19:46:29.311278: step: 1632/527, loss: 0.060595810413360596 2023-01-22 19:46:30.366385: step: 1636/527, loss: 0.006368107162415981 2023-01-22 19:46:31.425989: step: 1640/527, loss: 0.010098547674715519 2023-01-22 19:46:32.477639: step: 1644/527, loss: 0.012657450512051582 2023-01-22 19:46:33.523563: step: 1648/527, loss: 0.027670621871948242 2023-01-22 19:46:34.575883: step: 1652/527, loss: 0.015922911465168 2023-01-22 19:46:35.621155: step: 1656/527, loss: 0.010136590339243412 2023-01-22 19:46:36.658539: step: 1660/527, loss: 0.029393425211310387 2023-01-22 19:46:37.709199: step: 1664/527, loss: 0.025644952431321144 2023-01-22 19:46:38.758018: step: 1668/527, loss: 0.009974253363907337 2023-01-22 19:46:39.802916: step: 1672/527, loss: 0.05640428513288498 2023-01-22 19:46:40.840812: step: 1676/527, loss: 
0.004832874517887831 2023-01-22 19:46:41.882008: step: 1680/527, loss: 0.007441962603479624 2023-01-22 19:46:42.925582: step: 1684/527, loss: 0.018733033910393715 2023-01-22 19:46:43.965968: step: 1688/527, loss: 0.0010351308155804873 2023-01-22 19:46:45.006471: step: 1692/527, loss: 0.00944183487445116 2023-01-22 19:46:46.069882: step: 1696/527, loss: 0.006964751984924078 2023-01-22 19:46:47.117453: step: 1700/527, loss: 0.005171839613467455 2023-01-22 19:46:48.166048: step: 1704/527, loss: 0.01986236497759819 2023-01-22 19:46:49.247660: step: 1708/527, loss: 0.009714815765619278 2023-01-22 19:46:50.288405: step: 1712/527, loss: 0.010378447361290455 2023-01-22 19:46:51.323815: step: 1716/527, loss: 0.00603974936529994 2023-01-22 19:46:52.361625: step: 1720/527, loss: 0.02692074328660965 2023-01-22 19:46:53.403967: step: 1724/527, loss: 0.012999355792999268 2023-01-22 19:46:54.450183: step: 1728/527, loss: 0.009751521982252598 2023-01-22 19:46:55.489213: step: 1732/527, loss: 0.0057746293023228645 2023-01-22 19:46:56.527554: step: 1736/527, loss: 0.011407960206270218 2023-01-22 19:46:57.572566: step: 1740/527, loss: 0.02211788110435009 2023-01-22 19:46:58.625550: step: 1744/527, loss: 0.009267452172935009 2023-01-22 19:46:59.677910: step: 1748/527, loss: 0.0103457598015666 2023-01-22 19:47:00.738688: step: 1752/527, loss: 0.004588868468999863 2023-01-22 19:47:01.780398: step: 1756/527, loss: 0.00400989456102252 2023-01-22 19:47:02.810467: step: 1760/527, loss: 0.015373006463050842 2023-01-22 19:47:03.851967: step: 1764/527, loss: 0.004751627333462238 2023-01-22 19:47:04.904743: step: 1768/527, loss: 0.009738551452755928 2023-01-22 19:47:05.951678: step: 1772/527, loss: 0.005140881985425949 2023-01-22 19:47:06.999084: step: 1776/527, loss: 0.002203483134508133 2023-01-22 19:47:08.056616: step: 1780/527, loss: 0.006194181274622679 2023-01-22 19:47:09.113388: step: 1784/527, loss: 0.006157819181680679 2023-01-22 19:47:10.156716: step: 1788/527, loss: 
0.003330419072881341 2023-01-22 19:47:11.233002: step: 1792/527, loss: 0.011604590341448784 2023-01-22 19:47:12.269704: step: 1796/527, loss: 0.03503284603357315 2023-01-22 19:47:13.319934: step: 1800/527, loss: 0.00614347355440259 2023-01-22 19:47:14.367702: step: 1804/527, loss: 0.01627521961927414 2023-01-22 19:47:15.394992: step: 1808/527, loss: 0.015544203110039234 2023-01-22 19:47:16.450526: step: 1812/527, loss: 0.029974795877933502 2023-01-22 19:47:17.495436: step: 1816/527, loss: 0.015867266803979874 2023-01-22 19:47:18.535765: step: 1820/527, loss: 0.013727860525250435 2023-01-22 19:47:19.581697: step: 1824/527, loss: 0.0074148159474134445 2023-01-22 19:47:20.622620: step: 1828/527, loss: 0.031863220036029816 2023-01-22 19:47:21.666580: step: 1832/527, loss: 0.007135330233722925 2023-01-22 19:47:22.711024: step: 1836/527, loss: 0.0053549036383628845 2023-01-22 19:47:23.765997: step: 1840/527, loss: 0.0060627879574894905 2023-01-22 19:47:24.801879: step: 1844/527, loss: 0.01659911684691906 2023-01-22 19:47:25.838834: step: 1848/527, loss: 0.006961342878639698 2023-01-22 19:47:26.883785: step: 1852/527, loss: 0.00940422248095274 2023-01-22 19:47:27.924350: step: 1856/527, loss: 0.02290002815425396 2023-01-22 19:47:28.971452: step: 1860/527, loss: 0.014562358148396015 2023-01-22 19:47:29.998941: step: 1864/527, loss: 0.0039009125903248787 2023-01-22 19:47:31.063883: step: 1868/527, loss: 0.010911677032709122 2023-01-22 19:47:32.095553: step: 1872/527, loss: 0.0071758064441382885 2023-01-22 19:47:33.146320: step: 1876/527, loss: 0.020267516374588013 2023-01-22 19:47:34.193610: step: 1880/527, loss: 0.0289051104336977 2023-01-22 19:47:35.248071: step: 1884/527, loss: 0.0031412984244525433 2023-01-22 19:47:36.295741: step: 1888/527, loss: 0.003647695994004607 2023-01-22 19:47:37.340762: step: 1892/527, loss: 0.00880490243434906 2023-01-22 19:47:38.375629: step: 1896/527, loss: 0.002776987385004759 2023-01-22 19:47:39.413606: step: 1900/527, loss: 
0.008150040172040462 2023-01-22 19:47:40.449226: step: 1904/527, loss: 0.005806076806038618 2023-01-22 19:47:41.528754: step: 1908/527, loss: 0.005653920117765665 2023-01-22 19:47:42.558146: step: 1912/527, loss: 0.013935298658907413 2023-01-22 19:47:43.594565: step: 1916/527, loss: 0.010857168585062027 2023-01-22 19:47:44.645780: step: 1920/527, loss: 0.009352513588964939 2023-01-22 19:47:45.689335: step: 1924/527, loss: 0.03028571791946888 2023-01-22 19:47:46.732509: step: 1928/527, loss: 0.006492685992270708 2023-01-22 19:47:47.775818: step: 1932/527, loss: 0.07627073675394058 2023-01-22 19:47:48.815123: step: 1936/527, loss: 0.0060774837620556355 2023-01-22 19:47:49.870997: step: 1940/527, loss: 0.02620861679315567 2023-01-22 19:47:50.924807: step: 1944/527, loss: 0.017924658954143524 2023-01-22 19:47:51.973667: step: 1948/527, loss: 0.027919495478272438 2023-01-22 19:47:53.006151: step: 1952/527, loss: 0.004058958496898413 2023-01-22 19:47:54.059889: step: 1956/527, loss: 0.008040946908295155 2023-01-22 19:47:55.100903: step: 1960/527, loss: 0.012648562900722027 2023-01-22 19:47:56.138517: step: 1964/527, loss: 0.017326952889561653 2023-01-22 19:47:57.192593: step: 1968/527, loss: 0.031215587630867958 2023-01-22 19:47:58.246144: step: 1972/527, loss: 0.02237948402762413 2023-01-22 19:47:59.289575: step: 1976/527, loss: 0.006214444525539875 2023-01-22 19:48:00.332715: step: 1980/527, loss: 0.02078646421432495 2023-01-22 19:48:01.385815: step: 1984/527, loss: 0.012756140902638435 2023-01-22 19:48:02.433880: step: 1988/527, loss: 0.005657571833580732 2023-01-22 19:48:03.476763: step: 1992/527, loss: 0.013083589263260365 2023-01-22 19:48:04.514975: step: 1996/527, loss: 0.011768629774451256 2023-01-22 19:48:05.558280: step: 2000/527, loss: 0.007825766690075397 2023-01-22 19:48:06.600065: step: 2004/527, loss: 0.03417177125811577 2023-01-22 19:48:07.644350: step: 2008/527, loss: 0.027935318648815155 2023-01-22 19:48:08.678294: step: 2012/527, loss: 
0.02431550808250904 2023-01-22 19:48:09.730998: step: 2016/527, loss: 0.03146444633603096 2023-01-22 19:48:10.775017: step: 2020/527, loss: 0.0378304049372673 2023-01-22 19:48:11.827854: step: 2024/527, loss: 0.005935895722359419 2023-01-22 19:48:12.869515: step: 2028/527, loss: 0.0038613921497017145 2023-01-22 19:48:13.927989: step: 2032/527, loss: 0.008751094341278076 2023-01-22 19:48:14.979930: step: 2036/527, loss: 0.030986562371253967 2023-01-22 19:48:16.023871: step: 2040/527, loss: 0.010361522436141968 2023-01-22 19:48:17.073454: step: 2044/527, loss: 0.004071739036589861 2023-01-22 19:48:18.130595: step: 2048/527, loss: 0.011194843798875809 2023-01-22 19:48:19.182359: step: 2052/527, loss: 0.017472080886363983 2023-01-22 19:48:20.221828: step: 2056/527, loss: 0.003715306054800749 2023-01-22 19:48:21.269333: step: 2060/527, loss: 0.006260259076952934 2023-01-22 19:48:22.309806: step: 2064/527, loss: 0.0017354401061311364 2023-01-22 19:48:23.382455: step: 2068/527, loss: 0.0698724240064621 2023-01-22 19:48:24.443838: step: 2072/527, loss: 0.004602520726621151 2023-01-22 19:48:25.494569: step: 2076/527, loss: 0.009425695054233074 2023-01-22 19:48:26.544327: step: 2080/527, loss: 0.03477642685174942 2023-01-22 19:48:27.585649: step: 2084/527, loss: 0.01011382881551981 2023-01-22 19:48:28.616393: step: 2088/527, loss: 0.018314184620976448 2023-01-22 19:48:29.652500: step: 2092/527, loss: 0.010672174394130707 2023-01-22 19:48:30.703750: step: 2096/527, loss: 0.012546362355351448 2023-01-22 19:48:31.756681: step: 2100/527, loss: 0.0014359191991388798 2023-01-22 19:48:32.805059: step: 2104/527, loss: 0.02782692387700081 2023-01-22 19:48:33.851067: step: 2108/527, loss: 0.019237518310546875 ================================================== Loss: 0.015 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3234346539162113, 'r': 0.33693666982922205, 'f1': 0.33004763011152416}, 'combined': 
0.24319299060849148, 'stategy': 1, 'epoch': 1}
Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33822594476589063, 'r': 0.3105529129214087, 'f1': 0.323799245700047}, 'combined': 0.20723151724803004, 'stategy': 1, 'epoch': 1}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3244479100509166, 'r': 0.3552304442113451, 'f1': 0.33914210887568635}, 'combined': 0.24989418548734782, 'stategy': 1, 'epoch': 1}
Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3571283740040584, 'r': 0.31687026638905547, 'f1': 0.33579700677067537}, 'combined': 0.2149100843332322, 'stategy': 1, 'epoch': 1}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3287984773683285, 'r': 0.32817457133916655, 'f1': 0.32848622810207173}, 'combined': 0.2420424838646844, 'stategy': 1, 'epoch': 1}
Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3646947426906274, 'r': 0.3013128538335666, 'f1': 0.3299878688222119}, 'combined': 0.2365950757593218, 'stategy': 1, 'epoch': 1}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 1}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 1}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 1}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0}
Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3264723230548723, 'r': 0.33576470416649107, 'f1': 0.3310533191688322}, 'combined': 0.24393402465071845, 'stategy': 1, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.35636473460031554, 'r': 0.2976731814040852, 'f1': 0.32438554919493273}, 'combined': 0.23257831829070652, 'stategy': 1, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 
0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 0} ****************************** Epoch: 2 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 19:51:04.374163: step: 4/527, loss: 0.0019472651183605194 2023-01-22 19:51:05.412989: step: 8/527, loss: 0.00045936647802591324 2023-01-22 19:51:06.455137: step: 12/527, loss: 0.05795268714427948 2023-01-22 19:51:07.484083: step: 16/527, loss: 0.01626136526465416 2023-01-22 19:51:08.523431: step: 20/527, loss: 0.003307248931378126 2023-01-22 19:51:09.575910: step: 24/527, loss: 0.012959376908838749 2023-01-22 19:51:10.605268: step: 28/527, loss: 0.04898662492632866 2023-01-22 19:51:11.637534: step: 32/527, loss: 0.0029615783132612705 2023-01-22 19:51:12.683488: step: 36/527, loss: 0.006359742023050785 2023-01-22 19:51:13.714659: step: 40/527, loss: 0.006178473122417927 2023-01-22 19:51:14.749503: step: 44/527, loss: 0.008812231943011284 2023-01-22 19:51:15.805159: step: 48/527, loss: 0.005887409206479788 2023-01-22 19:51:16.858641: step: 52/527, loss: 0.016075002029538155 2023-01-22 19:51:17.900990: step: 56/527, loss: 0.005801203195005655 2023-01-22 19:51:18.941813: step: 60/527, loss: 0.03931571543216705 2023-01-22 19:51:19.981560: step: 64/527, loss: 0.035839393734931946 2023-01-22 19:51:21.037756: step: 68/527, loss: 0.007224041037261486 2023-01-22 19:51:22.070924: step: 72/527, loss: 0.003933871164917946 2023-01-22 19:51:23.112204: step: 76/527, loss: 0.00912049226462841 2023-01-22 19:51:24.177723: step: 80/527, loss: 0.017245899885892868 2023-01-22 19:51:25.213101: step: 84/527, loss: 0.01848674565553665 2023-01-22 19:51:26.250410: step: 88/527, loss: 0.001915311673656106 2023-01-22 19:51:27.301142: step: 92/527, 
loss: 0.06305068731307983 2023-01-22 19:51:28.346474: step: 96/527, loss: 0.009344151243567467 2023-01-22 19:51:29.385767: step: 100/527, loss: 0.019177034497261047 2023-01-22 19:51:30.428237: step: 104/527, loss: 0.00846317782998085 2023-01-22 19:51:31.466475: step: 108/527, loss: 0.0064727808348834515 2023-01-22 19:51:32.522478: step: 112/527, loss: 0.010484704747796059 2023-01-22 19:51:33.569679: step: 116/527, loss: 0.01720639131963253 2023-01-22 19:51:34.618906: step: 120/527, loss: 0.00674505066126585 2023-01-22 19:51:35.668114: step: 124/527, loss: 0.008513525128364563 2023-01-22 19:51:36.712111: step: 128/527, loss: 0.011099004186689854 2023-01-22 19:51:37.753667: step: 132/527, loss: 0.0030008687172085047 2023-01-22 19:51:38.791813: step: 136/527, loss: 0.01110610831528902 2023-01-22 19:51:39.845729: step: 140/527, loss: 0.006258530542254448 2023-01-22 19:51:40.888997: step: 144/527, loss: 0.005188685841858387 2023-01-22 19:51:41.944020: step: 148/527, loss: 0.00519842654466629 2023-01-22 19:51:42.993467: step: 152/527, loss: 0.011154260486364365 2023-01-22 19:51:44.044192: step: 156/527, loss: 0.021539486944675446 2023-01-22 19:51:45.114945: step: 160/527, loss: 0.010539108887314796 2023-01-22 19:51:46.166422: step: 164/527, loss: 0.017954345792531967 2023-01-22 19:51:47.227974: step: 168/527, loss: 0.040802549570798874 2023-01-22 19:51:48.270860: step: 172/527, loss: 0.00432178657501936 2023-01-22 19:51:49.311604: step: 176/527, loss: 0.0024467159528285265 2023-01-22 19:51:50.347442: step: 180/527, loss: 0.004625953733921051 2023-01-22 19:51:51.395455: step: 184/527, loss: 0.03450179845094681 2023-01-22 19:51:52.441901: step: 188/527, loss: 0.019679058343172073 2023-01-22 19:51:53.491913: step: 192/527, loss: 0.003656997112557292 2023-01-22 19:51:54.525939: step: 196/527, loss: 0.01148401852697134 2023-01-22 19:51:55.594932: step: 200/527, loss: 0.03636989742517471 2023-01-22 19:51:56.628322: step: 204/527, loss: 0.007308941334486008 2023-01-22 
19:51:57.664257: step: 208/527, loss: 0.0032253314275294542 2023-01-22 19:51:58.711860: step: 212/527, loss: 0.02569805644452572 2023-01-22 19:51:59.755108: step: 216/527, loss: 0.04698735848069191 2023-01-22 19:52:00.831429: step: 220/527, loss: 0.011449437588453293 2023-01-22 19:52:01.876440: step: 224/527, loss: 0.0006532570696435869 2023-01-22 19:52:02.925919: step: 228/527, loss: 0.010402513667941093 2023-01-22 19:52:03.984066: step: 232/527, loss: 0.010985426604747772 2023-01-22 19:52:05.020751: step: 236/527, loss: 0.0016882745549082756 2023-01-22 19:52:06.074626: step: 240/527, loss: 0.011718044988811016 2023-01-22 19:52:07.101617: step: 244/527, loss: 0.013160007074475288 2023-01-22 19:52:08.148638: step: 248/527, loss: 0.00553869316354394 2023-01-22 19:52:09.200308: step: 252/527, loss: 0.010326913557946682 2023-01-22 19:52:10.260408: step: 256/527, loss: 0.006913777906447649 2023-01-22 19:52:11.315280: step: 260/527, loss: 0.0071021574549376965 2023-01-22 19:52:12.358517: step: 264/527, loss: 0.007434515282511711 2023-01-22 19:52:13.412796: step: 268/527, loss: 0.03991125524044037 2023-01-22 19:52:14.465795: step: 272/527, loss: 0.013365563936531544 2023-01-22 19:52:15.500985: step: 276/527, loss: 0.023147787898778915 2023-01-22 19:52:16.550832: step: 280/527, loss: 0.011969683691859245 2023-01-22 19:52:17.599397: step: 284/527, loss: 0.0029195528477430344 2023-01-22 19:52:18.637673: step: 288/527, loss: 0.007126101292669773 2023-01-22 19:52:19.709597: step: 292/527, loss: 0.009884514845907688 2023-01-22 19:52:20.764573: step: 296/527, loss: 0.006057575345039368 2023-01-22 19:52:21.823205: step: 300/527, loss: 0.005265767220407724 2023-01-22 19:52:22.879761: step: 304/527, loss: 0.010103418491780758 2023-01-22 19:52:23.929769: step: 308/527, loss: 0.006355458404868841 2023-01-22 19:52:24.977037: step: 312/527, loss: 0.007941097021102905 2023-01-22 19:52:26.036668: step: 316/527, loss: 0.005205498076975346 2023-01-22 19:52:27.088602: step: 320/527, loss: 
0.003815067233517766 2023-01-22 19:52:28.163127: step: 324/527, loss: 0.03367021679878235 2023-01-22 19:52:29.206526: step: 328/527, loss: 0.012014445848762989 2023-01-22 19:52:30.279324: step: 332/527, loss: 0.015222628600895405 2023-01-22 19:52:31.331455: step: 336/527, loss: 0.03171848878264427 2023-01-22 19:52:32.400575: step: 340/527, loss: 0.010363505221903324 2023-01-22 19:52:33.459903: step: 344/527, loss: 0.007276894990354776 2023-01-22 19:52:34.514471: step: 348/527, loss: 0.0567045696079731 2023-01-22 19:52:35.569815: step: 352/527, loss: 0.026962192729115486 2023-01-22 19:52:36.609519: step: 356/527, loss: 0.02342049777507782 2023-01-22 19:52:37.668515: step: 360/527, loss: 0.005566221196204424 2023-01-22 19:52:38.727835: step: 364/527, loss: 0.02161421813070774 2023-01-22 19:52:39.799790: step: 368/527, loss: 0.007915153168141842 2023-01-22 19:52:40.852153: step: 372/527, loss: 0.004455730319023132 2023-01-22 19:52:41.912418: step: 376/527, loss: 0.009517781436443329 2023-01-22 19:52:42.968618: step: 380/527, loss: 0.006966050714254379 2023-01-22 19:52:44.017385: step: 384/527, loss: 0.004054506774991751 2023-01-22 19:52:45.081819: step: 388/527, loss: 0.007609184365719557 2023-01-22 19:52:46.141844: step: 392/527, loss: 0.012637414038181305 2023-01-22 19:52:47.212827: step: 396/527, loss: 0.021564841270446777 2023-01-22 19:52:48.252943: step: 400/527, loss: 0.00016283965669572353 2023-01-22 19:52:49.310528: step: 404/527, loss: 0.006162731908261776 2023-01-22 19:52:50.379725: step: 408/527, loss: 0.06655421108007431 2023-01-22 19:52:51.444571: step: 412/527, loss: 0.0010615808423608541 2023-01-22 19:52:52.490026: step: 416/527, loss: 0.0028288750909268856 2023-01-22 19:52:53.555130: step: 420/527, loss: 0.004220309667289257 2023-01-22 19:52:54.606886: step: 424/527, loss: 0.021024566143751144 2023-01-22 19:52:55.659494: step: 428/527, loss: 0.01765807531774044 2023-01-22 19:52:56.714016: step: 432/527, loss: 0.014151964336633682 2023-01-22 
19:52:57.766395: step: 436/527, loss: 0.006327706854790449 2023-01-22 19:52:58.828724: step: 440/527, loss: 0.00396690284833312 2023-01-22 19:52:59.892730: step: 444/527, loss: 0.026631010696291924 2023-01-22 19:53:00.947941: step: 448/527, loss: 0.011762662790715694 2023-01-22 19:53:02.002298: step: 452/527, loss: 0.01275633554905653 2023-01-22 19:53:03.073312: step: 456/527, loss: 0.005071002058684826 2023-01-22 19:53:04.120558: step: 460/527, loss: 0.016328396275639534 2023-01-22 19:53:05.153183: step: 464/527, loss: 0.012322410941123962 2023-01-22 19:53:06.206338: step: 468/527, loss: 0.006799472030252218 2023-01-22 19:53:07.269569: step: 472/527, loss: 0.0020863860845565796 2023-01-22 19:53:08.331007: step: 476/527, loss: 0.009194653481245041 2023-01-22 19:53:09.366884: step: 480/527, loss: 0.011204421520233154 2023-01-22 19:53:10.411583: step: 484/527, loss: 0.031243499368429184 2023-01-22 19:53:11.479178: step: 488/527, loss: 0.028876738622784615 2023-01-22 19:53:12.540190: step: 492/527, loss: 0.020245103165507317 2023-01-22 19:53:13.598599: step: 496/527, loss: 0.009308438748121262 2023-01-22 19:53:14.659690: step: 500/527, loss: 0.004317193757742643 2023-01-22 19:53:15.712669: step: 504/527, loss: 0.007558333687484264 2023-01-22 19:53:16.740789: step: 508/527, loss: 0.006202024407684803 2023-01-22 19:53:17.794202: step: 512/527, loss: 0.00624776491895318 2023-01-22 19:53:18.856158: step: 516/527, loss: 0.003422696143388748 2023-01-22 19:53:19.949727: step: 520/527, loss: 0.008958730846643448 2023-01-22 19:53:20.992306: step: 524/527, loss: 0.008507025428116322 2023-01-22 19:53:22.056475: step: 528/527, loss: 0.007898330688476562 2023-01-22 19:53:23.107121: step: 532/527, loss: 0.006204289849847555 2023-01-22 19:53:24.152193: step: 536/527, loss: 9.811071504373103e-05 2023-01-22 19:53:25.197328: step: 540/527, loss: 0.001846045721322298 2023-01-22 19:53:26.241322: step: 544/527, loss: 0.005641768220812082 2023-01-22 19:53:27.285528: step: 548/527, loss: 
0.0037336130626499653 2023-01-22 19:53:28.330966: step: 552/527, loss: 0.00901532731950283 2023-01-22 19:53:29.391309: step: 556/527, loss: 0.005902671720832586 2023-01-22 19:53:30.447239: step: 560/527, loss: 0.06052519008517265 2023-01-22 19:53:31.495068: step: 564/527, loss: 0.027774371206760406 2023-01-22 19:53:32.556039: step: 568/527, loss: 0.005201473832130432 2023-01-22 19:53:33.599700: step: 572/527, loss: 0.02104785107076168 2023-01-22 19:53:34.646504: step: 576/527, loss: 0.00717878108844161 2023-01-22 19:53:35.696704: step: 580/527, loss: 0.04445670545101166 2023-01-22 19:53:36.754047: step: 584/527, loss: 0.008279498666524887 2023-01-22 19:53:37.805365: step: 588/527, loss: 0.004925409331917763 2023-01-22 19:53:38.879136: step: 592/527, loss: 0.009139998815953732 2023-01-22 19:53:39.928488: step: 596/527, loss: 0.037103522568941116 2023-01-22 19:53:40.983330: step: 600/527, loss: 0.012650898657739162 2023-01-22 19:53:42.034956: step: 604/527, loss: 0.023029830306768417 2023-01-22 19:53:43.081042: step: 608/527, loss: 0.06605393439531326 2023-01-22 19:53:44.135261: step: 612/527, loss: 0.004582384135574102 2023-01-22 19:53:45.211292: step: 616/527, loss: 0.014879208989441395 2023-01-22 19:53:46.247073: step: 620/527, loss: 0.004906008951365948 2023-01-22 19:53:47.306659: step: 624/527, loss: 0.028087178245186806 2023-01-22 19:53:48.361145: step: 628/527, loss: 0.01640423573553562 2023-01-22 19:53:49.428958: step: 632/527, loss: 0.009729445911943913 2023-01-22 19:53:50.465104: step: 636/527, loss: 0.009771459735929966 2023-01-22 19:53:51.509996: step: 640/527, loss: 0.017573563382029533 2023-01-22 19:53:52.567917: step: 644/527, loss: 0.005665437784045935 2023-01-22 19:53:53.620771: step: 648/527, loss: 0.04363008588552475 2023-01-22 19:53:54.673438: step: 652/527, loss: 0.0037811200600117445 2023-01-22 19:53:55.702129: step: 656/527, loss: 0.003920457325875759 2023-01-22 19:53:56.764683: step: 660/527, loss: 0.024555912241339684 2023-01-22 
19:53:57.829562: step: 664/527, loss: 0.01844160072505474 2023-01-22 19:53:58.877834: step: 668/527, loss: 0.04845099523663521 2023-01-22 19:53:59.918890: step: 672/527, loss: 0.016640199348330498 2023-01-22 19:54:00.971595: step: 676/527, loss: 0.0036800731904804707 2023-01-22 19:54:02.036656: step: 680/527, loss: 0.04842953011393547 2023-01-22 19:54:03.096512: step: 684/527, loss: 0.009288104251027107 2023-01-22 19:54:04.146259: step: 688/527, loss: 0.007285847328603268 2023-01-22 19:54:05.201058: step: 692/527, loss: 0.006700329482555389 2023-01-22 19:54:06.258035: step: 696/527, loss: 0.00467517739161849 2023-01-22 19:54:07.309033: step: 700/527, loss: 0.0001013468427117914 2023-01-22 19:54:08.353268: step: 704/527, loss: 0.015326598659157753 2023-01-22 19:54:09.407611: step: 708/527, loss: 0.00843038596212864 2023-01-22 19:54:10.457454: step: 712/527, loss: 0.003256844822317362 2023-01-22 19:54:11.508202: step: 716/527, loss: 0.013553074561059475 2023-01-22 19:54:12.538873: step: 720/527, loss: 0.008905477821826935 2023-01-22 19:54:13.593499: step: 724/527, loss: 0.012882758863270283 2023-01-22 19:54:14.647757: step: 728/527, loss: 0.01840660721063614 2023-01-22 19:54:15.701173: step: 732/527, loss: 0.005589759908616543 2023-01-22 19:54:16.742429: step: 736/527, loss: 0.00702686095610261 2023-01-22 19:54:17.793479: step: 740/527, loss: 0.1297561228275299 2023-01-22 19:54:18.849176: step: 744/527, loss: 0.01522014383226633 2023-01-22 19:54:19.894865: step: 748/527, loss: 0.024061452597379684 2023-01-22 19:54:20.951553: step: 752/527, loss: 0.006916760932654142 2023-01-22 19:54:22.004129: step: 756/527, loss: 0.03815501183271408 2023-01-22 19:54:23.053263: step: 760/527, loss: 0.045862630009651184 2023-01-22 19:54:24.095947: step: 764/527, loss: 0.016699789091944695 2023-01-22 19:54:25.149195: step: 768/527, loss: 0.012677228078246117 2023-01-22 19:54:26.208484: step: 772/527, loss: 0.001255549374036491 2023-01-22 19:54:27.251845: step: 776/527, loss: 
0.026327967643737793 2023-01-22 19:54:28.300370: step: 780/527, loss: 0.003769800765439868 2023-01-22 19:54:29.339359: step: 784/527, loss: 0.06169474869966507 2023-01-22 19:54:30.376306: step: 788/527, loss: 0.0110657112672925 2023-01-22 19:54:31.423904: step: 792/527, loss: 0.019507996737957 2023-01-22 19:54:32.471435: step: 796/527, loss: 0.04768926650285721 2023-01-22 19:54:33.525839: step: 800/527, loss: 0.004048857372254133 2023-01-22 19:54:34.562409: step: 804/527, loss: 0.01741715706884861 2023-01-22 19:54:35.617600: step: 808/527, loss: 0.009874519892036915 2023-01-22 19:54:36.679745: step: 812/527, loss: 0.022808825597167015 2023-01-22 19:54:37.741684: step: 816/527, loss: 0.004506480414420366 2023-01-22 19:54:38.809732: step: 820/527, loss: 0.023001598194241524 2023-01-22 19:54:39.861670: step: 824/527, loss: 0.0019482597708702087 2023-01-22 19:54:40.929308: step: 828/527, loss: 0.06149015575647354 2023-01-22 19:54:41.964206: step: 832/527, loss: 0.0024485127069056034 2023-01-22 19:54:43.011778: step: 836/527, loss: 0.009832807816565037 2023-01-22 19:54:44.061463: step: 840/527, loss: 0.00894632376730442 2023-01-22 19:54:45.109035: step: 844/527, loss: 0.005954229738563299 2023-01-22 19:54:46.170334: step: 848/527, loss: 0.002683592028915882 2023-01-22 19:54:47.220221: step: 852/527, loss: 0.0031669861637055874 2023-01-22 19:54:48.275103: step: 856/527, loss: 0.00842226855456829 2023-01-22 19:54:49.331781: step: 860/527, loss: 0.01319480873644352 2023-01-22 19:54:50.409400: step: 864/527, loss: 0.013089342042803764 2023-01-22 19:54:51.471377: step: 868/527, loss: 0.016493460163474083 2023-01-22 19:54:52.519547: step: 872/527, loss: 0.010519245639443398 2023-01-22 19:54:53.546101: step: 876/527, loss: 0.00758923776447773 2023-01-22 19:54:54.607030: step: 880/527, loss: 0.007590882480144501 2023-01-22 19:54:55.638374: step: 884/527, loss: 0.018082482740283012 2023-01-22 19:54:56.685471: step: 888/527, loss: 0.005745083559304476 2023-01-22 19:54:57.734457: 
step: 892/527, loss: 0.004090290050953627 2023-01-22 19:54:58.779976: step: 896/527, loss: 0.005211257375776768 2023-01-22 19:54:59.832551: step: 900/527, loss: 0.009369587525725365 2023-01-22 19:55:00.876037: step: 904/527, loss: 0.004668599460273981 2023-01-22 19:55:01.925194: step: 908/527, loss: 0.00914757139980793 2023-01-22 19:55:02.958907: step: 912/527, loss: 0.007890195585787296 2023-01-22 19:55:04.006728: step: 916/527, loss: 0.003551984904333949 2023-01-22 19:55:05.060783: step: 920/527, loss: 0.005148402415215969 2023-01-22 19:55:06.096077: step: 924/527, loss: 0.015609413385391235 2023-01-22 19:55:07.142916: step: 928/527, loss: 0.005506484303623438 2023-01-22 19:55:08.190020: step: 932/527, loss: 0.012508809566497803 2023-01-22 19:55:09.231241: step: 936/527, loss: 0.002940128790214658 2023-01-22 19:55:10.290162: step: 940/527, loss: 0.006656321696937084 2023-01-22 19:55:11.354086: step: 944/527, loss: 0.003383379429578781 2023-01-22 19:55:12.406013: step: 948/527, loss: 0.019145991653203964 2023-01-22 19:55:13.446462: step: 952/527, loss: 0.008929871954023838 2023-01-22 19:55:14.494609: step: 956/527, loss: 0.016706202179193497 2023-01-22 19:55:15.548965: step: 960/527, loss: 0.03805391117930412 2023-01-22 19:55:16.618557: step: 964/527, loss: 0.002713154535740614 2023-01-22 19:55:17.668355: step: 968/527, loss: 0.012656132690608501 2023-01-22 19:55:18.718386: step: 972/527, loss: 0.03401487320661545 2023-01-22 19:55:19.787018: step: 976/527, loss: 0.013186130672693253 2023-01-22 19:55:20.841297: step: 980/527, loss: 0.0037962726783007383 2023-01-22 19:55:21.873948: step: 984/527, loss: 0.012958310544490814 2023-01-22 19:55:22.928394: step: 988/527, loss: 0.0232698954641819 2023-01-22 19:55:23.977296: step: 992/527, loss: 0.0036712519358843565 2023-01-22 19:55:25.019942: step: 996/527, loss: 0.01900840364396572 2023-01-22 19:55:26.088162: step: 1000/527, loss: 0.00459720753133297 2023-01-22 19:55:27.127212: step: 1004/527, loss: 0.002389513188973069 
2023-01-22 19:55:28.174625: step: 1008/527, loss: 0.0050201681442558765 2023-01-22 19:55:29.216273: step: 1012/527, loss: 0.016077689826488495 2023-01-22 19:55:30.268576: step: 1016/527, loss: 0.02259223908185959 2023-01-22 19:55:31.317185: step: 1020/527, loss: 0.023427749052643776 2023-01-22 19:55:32.368073: step: 1024/527, loss: 0.014956757426261902 2023-01-22 19:55:33.434870: step: 1028/527, loss: 0.02330690436065197 2023-01-22 19:55:34.492319: step: 1032/527, loss: 0.00952431745827198 2023-01-22 19:55:35.535553: step: 1036/527, loss: 0.009679542854428291 2023-01-22 19:55:36.573657: step: 1040/527, loss: 0.0 2023-01-22 19:55:37.635190: step: 1044/527, loss: 0.0035289146471768618 2023-01-22 19:55:38.674355: step: 1048/527, loss: 0.013830331154167652 2023-01-22 19:55:39.712401: step: 1052/527, loss: 0.0029604050796478987 2023-01-22 19:55:40.755402: step: 1056/527, loss: 0.009815986268222332 2023-01-22 19:55:41.803371: step: 1060/527, loss: 0.001073924358934164 2023-01-22 19:55:42.841746: step: 1064/527, loss: 0.004243234638124704 2023-01-22 19:55:43.862657: step: 1068/527, loss: 0.007456034421920776 2023-01-22 19:55:44.923219: step: 1072/527, loss: 0.006499788723886013 2023-01-22 19:55:45.969796: step: 1076/527, loss: 0.002173013985157013 2023-01-22 19:55:47.010230: step: 1080/527, loss: 0.004147028550505638 2023-01-22 19:55:48.059410: step: 1084/527, loss: 0.0065850671380758286 2023-01-22 19:55:49.101841: step: 1088/527, loss: 0.0026100939139723778 2023-01-22 19:55:50.178845: step: 1092/527, loss: 0.011555547825992107 2023-01-22 19:55:51.232823: step: 1096/527, loss: 0.003990706522017717 2023-01-22 19:55:52.287649: step: 1100/527, loss: 0.004512408282607794 2023-01-22 19:55:53.324273: step: 1104/527, loss: 0.0035088045988231897 2023-01-22 19:55:54.376172: step: 1108/527, loss: 0.002885064808651805 2023-01-22 19:55:55.414349: step: 1112/527, loss: 0.019411331042647362 2023-01-22 19:55:56.459325: step: 1116/527, loss: 0.00392560288310051 2023-01-22 
19:55:57.534933: step: 1120/527, loss: 0.046864207834005356 2023-01-22 19:55:58.563242: step: 1124/527, loss: 0.0027047258336097 2023-01-22 19:55:59.602934: step: 1128/527, loss: 0.0007161131361499429 2023-01-22 19:56:00.642693: step: 1132/527, loss: 0.008406261913478374 2023-01-22 19:56:01.693299: step: 1136/527, loss: 0.01717405579984188 2023-01-22 19:56:02.754446: step: 1140/527, loss: 0.011404656805098057 2023-01-22 19:56:03.799063: step: 1144/527, loss: 0.004440920427441597 2023-01-22 19:56:04.851793: step: 1148/527, loss: 0.05285952240228653 2023-01-22 19:56:05.888663: step: 1152/527, loss: 0.008084303699433804 2023-01-22 19:56:06.941434: step: 1156/527, loss: 0.008709576912224293 2023-01-22 19:56:07.994076: step: 1160/527, loss: 0.006324164569377899 2023-01-22 19:56:09.034682: step: 1164/527, loss: 0.011261779814958572 2023-01-22 19:56:10.103715: step: 1168/527, loss: 0.03855516016483307 2023-01-22 19:56:11.157023: step: 1172/527, loss: 0.003255103714764118 2023-01-22 19:56:12.183925: step: 1176/527, loss: 0.003080525901168585 2023-01-22 19:56:13.251186: step: 1180/527, loss: 0.0041559552773833275 2023-01-22 19:56:14.287457: step: 1184/527, loss: 0.00935401488095522 2023-01-22 19:56:15.347619: step: 1188/527, loss: 0.003595223417505622 2023-01-22 19:56:16.398769: step: 1192/527, loss: 0.007169822230935097 2023-01-22 19:56:17.447143: step: 1196/527, loss: 0.031760334968566895 2023-01-22 19:56:18.504451: step: 1200/527, loss: 0.003564560553058982 2023-01-22 19:56:19.560487: step: 1204/527, loss: 0.011569914408028126 2023-01-22 19:56:20.610352: step: 1208/527, loss: 0.0045539080165326595 2023-01-22 19:56:21.648891: step: 1212/527, loss: 0.0023483119439333677 2023-01-22 19:56:22.690822: step: 1216/527, loss: 0.03838162496685982 2023-01-22 19:56:23.741185: step: 1220/527, loss: 0.009832527488470078 2023-01-22 19:56:24.778560: step: 1224/527, loss: 0.0030852321069687605 2023-01-22 19:56:25.847197: step: 1228/527, loss: 0.004961112514138222 2023-01-22 
19:56:26.890560: step: 1232/527, loss: 0.0038893776945769787 2023-01-22 19:56:27.931873: step: 1236/527, loss: 0.061964794993400574 2023-01-22 19:56:28.976412: step: 1240/527, loss: 0.0016118728090077639 2023-01-22 19:56:30.016803: step: 1244/527, loss: 0.008791719563305378 2023-01-22 19:56:31.060749: step: 1248/527, loss: 0.00736392242833972 2023-01-22 19:56:32.092065: step: 1252/527, loss: 0.0373251847922802 2023-01-22 19:56:33.136966: step: 1256/527, loss: 0.013917316682636738 2023-01-22 19:56:34.197460: step: 1260/527, loss: 0.003297120798379183 2023-01-22 19:56:35.251556: step: 1264/527, loss: 0.01039439719170332 2023-01-22 19:56:36.308835: step: 1268/527, loss: 0.02753548137843609 2023-01-22 19:56:37.353809: step: 1272/527, loss: 0.007985890842974186 2023-01-22 19:56:38.398773: step: 1276/527, loss: 0.0016036515589803457 2023-01-22 19:56:39.445917: step: 1280/527, loss: 0.010330391116440296 2023-01-22 19:56:40.517058: step: 1284/527, loss: 0.003874722635373473 2023-01-22 19:56:41.567585: step: 1288/527, loss: 0.015858067199587822 2023-01-22 19:56:42.617890: step: 1292/527, loss: 0.004830340389162302 2023-01-22 19:56:43.670288: step: 1296/527, loss: 0.06678352504968643 2023-01-22 19:56:44.725660: step: 1300/527, loss: 0.07401052862405777 2023-01-22 19:56:45.770847: step: 1304/527, loss: 0.005815865937620401 2023-01-22 19:56:46.828577: step: 1308/527, loss: 0.074469655752182 2023-01-22 19:56:47.869425: step: 1312/527, loss: 0.007657233159989119 2023-01-22 19:56:48.900689: step: 1316/527, loss: 0.010126876644790173 2023-01-22 19:56:49.961811: step: 1320/527, loss: 0.026489200070500374 2023-01-22 19:56:51.040491: step: 1324/527, loss: 0.012346141040325165 2023-01-22 19:56:52.111644: step: 1328/527, loss: 0.06684651970863342 2023-01-22 19:56:53.178478: step: 1332/527, loss: 0.023989371955394745 2023-01-22 19:56:54.237323: step: 1336/527, loss: 0.005770236719399691 2023-01-22 19:56:55.279476: step: 1340/527, loss: 0.0052342722192406654 2023-01-22 19:56:56.329159: 
step: 1344/527, loss: 0.008701438084244728 2023-01-22 19:56:57.364950: step: 1348/527, loss: 0.014546336606144905 2023-01-22 19:56:58.418395: step: 1352/527, loss: 0.0050030602142214775 2023-01-22 19:56:59.462685: step: 1356/527, loss: 0.004134844057261944 2023-01-22 19:57:00.517646: step: 1360/527, loss: 0.01821618527173996 2023-01-22 19:57:01.564754: step: 1364/527, loss: 0.029602590948343277 2023-01-22 19:57:02.607512: step: 1368/527, loss: 0.006892577279359102 2023-01-22 19:57:03.652979: step: 1372/527, loss: 0.0030683078803122044 2023-01-22 19:57:04.697844: step: 1376/527, loss: 0.05501647666096687 2023-01-22 19:57:05.733491: step: 1380/527, loss: 0.0 2023-01-22 19:57:06.783450: step: 1384/527, loss: 0.014835318550467491 2023-01-22 19:57:07.834038: step: 1388/527, loss: 0.005086015444248915 2023-01-22 19:57:08.884028: step: 1392/527, loss: 0.035345274955034256 2023-01-22 19:57:09.924148: step: 1396/527, loss: 0.00954096857458353 2023-01-22 19:57:10.971926: step: 1400/527, loss: 0.0013969441642984748 2023-01-22 19:57:12.020483: step: 1404/527, loss: 0.008144868537783623 2023-01-22 19:57:13.075259: step: 1408/527, loss: 0.0046132588759064674 2023-01-22 19:57:14.125634: step: 1412/527, loss: 0.0046321372501552105 2023-01-22 19:57:15.189542: step: 1416/527, loss: 0.0024720369838178158 2023-01-22 19:57:16.227938: step: 1420/527, loss: 0.00758796650916338 2023-01-22 19:57:17.277242: step: 1424/527, loss: 0.019266853109002113 2023-01-22 19:57:18.319384: step: 1428/527, loss: 0.014469382353127003 2023-01-22 19:57:19.377203: step: 1432/527, loss: 0.027585215866565704 2023-01-22 19:57:20.416649: step: 1436/527, loss: 0.022586941719055176 2023-01-22 19:57:21.461837: step: 1440/527, loss: 0.007138972170650959 2023-01-22 19:57:22.516276: step: 1444/527, loss: 0.0023658229038119316 2023-01-22 19:57:23.571751: step: 1448/527, loss: 0.018535802140831947 2023-01-22 19:57:24.636141: step: 1452/527, loss: 0.030143490061163902 2023-01-22 19:57:25.676735: step: 1456/527, loss: 
0.002750793006271124 2023-01-22 19:57:26.729253: step: 1460/527, loss: 0.0014810208231210709 2023-01-22 19:57:27.811842: step: 1464/527, loss: 0.09174622595310211 2023-01-22 19:57:28.856997: step: 1468/527, loss: 0.0077330670319497585 2023-01-22 19:57:29.895794: step: 1472/527, loss: 0.003374515101313591 2023-01-22 19:57:30.942682: step: 1476/527, loss: 0.014382085762917995 2023-01-22 19:57:31.985970: step: 1480/527, loss: 0.006805216893553734 2023-01-22 19:57:33.042431: step: 1484/527, loss: 0.013255644589662552 2023-01-22 19:57:34.088503: step: 1488/527, loss: 0.007177384570240974 2023-01-22 19:57:35.147937: step: 1492/527, loss: 0.006798482034355402 2023-01-22 19:57:36.190304: step: 1496/527, loss: 0.013003462925553322 2023-01-22 19:57:37.228859: step: 1500/527, loss: 0.001318597118370235 2023-01-22 19:57:38.297984: step: 1504/527, loss: 0.011287938803434372 2023-01-22 19:57:39.346780: step: 1508/527, loss: 0.008135458454489708 2023-01-22 19:57:40.398408: step: 1512/527, loss: 0.016993878409266472 2023-01-22 19:57:41.448399: step: 1516/527, loss: 0.07533727586269379 2023-01-22 19:57:42.487829: step: 1520/527, loss: 0.006875397637486458 2023-01-22 19:57:43.529127: step: 1524/527, loss: 0.005762930028140545 2023-01-22 19:57:44.573828: step: 1528/527, loss: 0.01687450520694256 2023-01-22 19:57:45.620905: step: 1532/527, loss: 0.00984887219965458 2023-01-22 19:57:46.662138: step: 1536/527, loss: 0.0028607877902686596 2023-01-22 19:57:47.719803: step: 1540/527, loss: 0.03572121635079384 2023-01-22 19:57:48.778125: step: 1544/527, loss: 0.00899457186460495 2023-01-22 19:57:49.816098: step: 1548/527, loss: 0.039776477962732315 2023-01-22 19:57:50.861483: step: 1552/527, loss: 0.034703560173511505 2023-01-22 19:57:51.899400: step: 1556/527, loss: 0.00732477568089962 2023-01-22 19:57:52.932904: step: 1560/527, loss: 0.00390958646312356 2023-01-22 19:57:53.982021: step: 1564/527, loss: 0.009502626024186611 2023-01-22 19:57:55.033648: step: 1568/527, loss: 
0.009714369662106037 2023-01-22 19:57:56.094488: step: 1572/527, loss: 0.021448219195008278 2023-01-22 19:57:57.142349: step: 1576/527, loss: 0.011181551963090897 2023-01-22 19:57:58.191781: step: 1580/527, loss: 0.009241553954780102 2023-01-22 19:57:59.225050: step: 1584/527, loss: 0.01548182312399149 2023-01-22 19:58:00.268592: step: 1588/527, loss: 0.01882026344537735 2023-01-22 19:58:01.317723: step: 1592/527, loss: 0.005637936294078827 2023-01-22 19:58:02.370040: step: 1596/527, loss: 0.02417459525167942 2023-01-22 19:58:03.416567: step: 1600/527, loss: 0.003551296191290021 2023-01-22 19:58:04.483194: step: 1604/527, loss: 0.02125997096300125 2023-01-22 19:58:05.553291: step: 1608/527, loss: 0.012055809609591961 2023-01-22 19:58:06.624375: step: 1612/527, loss: 0.01648874022066593 2023-01-22 19:58:07.677393: step: 1616/527, loss: 0.004450049716979265 2023-01-22 19:58:08.718049: step: 1620/527, loss: 0.001215534983202815 2023-01-22 19:58:09.762427: step: 1624/527, loss: 0.005571991670876741 2023-01-22 19:58:10.813570: step: 1628/527, loss: 0.005596946459263563 2023-01-22 19:58:11.857946: step: 1632/527, loss: 0.008297860622406006 2023-01-22 19:58:12.925026: step: 1636/527, loss: 0.007273584604263306 2023-01-22 19:58:13.971827: step: 1640/527, loss: 0.013721923343837261 2023-01-22 19:58:15.021745: step: 1644/527, loss: 0.008100690320134163 2023-01-22 19:58:16.069652: step: 1648/527, loss: 0.013942421413958073 2023-01-22 19:58:17.144169: step: 1652/527, loss: 0.0038877883926033974 2023-01-22 19:58:18.181614: step: 1656/527, loss: 0.058846235275268555 2023-01-22 19:58:19.243308: step: 1660/527, loss: 0.013884962536394596 2023-01-22 19:58:20.274780: step: 1664/527, loss: 0.05978230759501457 2023-01-22 19:58:21.350894: step: 1668/527, loss: 0.02195640467107296 2023-01-22 19:58:22.400844: step: 1672/527, loss: 0.012425919063389301 2023-01-22 19:58:23.444536: step: 1676/527, loss: 0.002420072676613927 2023-01-22 19:58:24.493103: step: 1680/527, loss: 
0.02230508252978325 2023-01-22 19:58:25.532603: step: 1684/527, loss: 0.010975967161357403 2023-01-22 19:58:26.592467: step: 1688/527, loss: 0.009344184771180153 2023-01-22 19:58:27.635305: step: 1692/527, loss: 0.007283311802893877 2023-01-22 19:58:28.711273: step: 1696/527, loss: 0.02210157737135887 2023-01-22 19:58:29.758456: step: 1700/527, loss: 0.004351050592958927 2023-01-22 19:58:30.797269: step: 1704/527, loss: 0.003865842241793871 2023-01-22 19:58:31.865365: step: 1708/527, loss: 0.018277203664183617 2023-01-22 19:58:32.906105: step: 1712/527, loss: 0.0022287829779088497 2023-01-22 19:58:33.986521: step: 1716/527, loss: 0.0048011732287704945 2023-01-22 19:58:35.025869: step: 1720/527, loss: 0.007409593090415001 2023-01-22 19:58:36.068036: step: 1724/527, loss: 0.005694955121725798 2023-01-22 19:58:37.104677: step: 1728/527, loss: 0.011186397634446621 2023-01-22 19:58:38.134637: step: 1732/527, loss: 0.033404815942049026 2023-01-22 19:58:39.196835: step: 1736/527, loss: 0.007245366461575031 2023-01-22 19:58:40.247974: step: 1740/527, loss: 0.01462315022945404 2023-01-22 19:58:41.299924: step: 1744/527, loss: 0.007646861020475626 2023-01-22 19:58:42.355450: step: 1748/527, loss: 0.0016970261931419373 2023-01-22 19:58:43.411947: step: 1752/527, loss: 0.00301720155403018 2023-01-22 19:58:44.466282: step: 1756/527, loss: 0.023948779329657555 2023-01-22 19:58:45.531063: step: 1760/527, loss: 0.011112038046121597 2023-01-22 19:58:46.564524: step: 1764/527, loss: 0.011977504938840866 2023-01-22 19:58:47.617278: step: 1768/527, loss: 0.017544664442539215 2023-01-22 19:58:48.650525: step: 1772/527, loss: 0.00577186606824398 2023-01-22 19:58:49.707943: step: 1776/527, loss: 0.008312851190567017 2023-01-22 19:58:50.763699: step: 1780/527, loss: 0.005504704546183348 2023-01-22 19:58:51.819088: step: 1784/527, loss: 0.0010715055977925658 2023-01-22 19:58:52.864324: step: 1788/527, loss: 0.01258862018585205 2023-01-22 19:58:53.920971: step: 1792/527, loss: 
0.021817810833454132 2023-01-22 19:58:54.972355: step: 1796/527, loss: 0.007140064146369696 2023-01-22 19:58:56.025270: step: 1800/527, loss: 0.0007058614864945412 2023-01-22 19:58:57.078152: step: 1804/527, loss: 0.013723025098443031 2023-01-22 19:58:58.138235: step: 1808/527, loss: 0.015082364901900291 2023-01-22 19:58:59.191003: step: 1812/527, loss: 0.00910787470638752 2023-01-22 19:59:00.235416: step: 1816/527, loss: 0.011241174302995205 2023-01-22 19:59:01.284694: step: 1820/527, loss: 0.01689973846077919 2023-01-22 19:59:02.368047: step: 1824/527, loss: 0.0022004563361406326 2023-01-22 19:59:03.422790: step: 1828/527, loss: 0.03286900371313095 2023-01-22 19:59:04.480300: step: 1832/527, loss: 0.014647014439105988 2023-01-22 19:59:05.539189: step: 1836/527, loss: 0.004199439659714699 2023-01-22 19:59:06.572376: step: 1840/527, loss: 0.0033072875812649727 2023-01-22 19:59:07.633632: step: 1844/527, loss: 0.13581690192222595 2023-01-22 19:59:08.689563: step: 1848/527, loss: 0.004670616239309311 2023-01-22 19:59:09.746011: step: 1852/527, loss: 0.010398413054645061 2023-01-22 19:59:10.797867: step: 1856/527, loss: 0.05213936045765877 2023-01-22 19:59:11.857300: step: 1860/527, loss: 0.009890140034258366 2023-01-22 19:59:12.897562: step: 1864/527, loss: 0.007445559371262789 2023-01-22 19:59:13.954287: step: 1868/527, loss: 0.0048018814995884895 2023-01-22 19:59:14.996966: step: 1872/527, loss: 0.003319947514683008 2023-01-22 19:59:16.029377: step: 1876/527, loss: 0.030779991298913956 2023-01-22 19:59:17.070702: step: 1880/527, loss: 0.0026039485819637775 2023-01-22 19:59:18.134503: step: 1884/527, loss: 0.010190390981733799 2023-01-22 19:59:19.188968: step: 1888/527, loss: 0.013070452958345413 2023-01-22 19:59:20.278154: step: 1892/527, loss: 0.007194320671260357 2023-01-22 19:59:21.324020: step: 1896/527, loss: 0.0035049982834607363 2023-01-22 19:59:22.367878: step: 1900/527, loss: 0.07489101588726044 2023-01-22 19:59:23.422833: step: 1904/527, loss: 
0.014729148708283901 2023-01-22 19:59:24.469436: step: 1908/527, loss: 0.0036900367122143507 2023-01-22 19:59:25.509324: step: 1912/527, loss: 0.0036355936899781227 2023-01-22 19:59:26.562286: step: 1916/527, loss: 0.0054442123509943485 2023-01-22 19:59:27.599766: step: 1920/527, loss: 0.00244968943297863 2023-01-22 19:59:28.648166: step: 1924/527, loss: 0.015027009882032871 2023-01-22 19:59:29.695138: step: 1928/527, loss: 0.003041490912437439 2023-01-22 19:59:30.747217: step: 1932/527, loss: 0.004020232707262039 2023-01-22 19:59:31.778019: step: 1936/527, loss: 0.07933308184146881 2023-01-22 19:59:32.824473: step: 1940/527, loss: 0.013274340890347958 2023-01-22 19:59:33.867381: step: 1944/527, loss: 0.01796991191804409 2023-01-22 19:59:34.927147: step: 1948/527, loss: 0.024167299270629883 2023-01-22 19:59:35.984714: step: 1952/527, loss: 0.009892450645565987 2023-01-22 19:59:37.040280: step: 1956/527, loss: 0.007570433896034956 2023-01-22 19:59:38.091400: step: 1960/527, loss: 0.07348640263080597 2023-01-22 19:59:39.160489: step: 1964/527, loss: 0.006582705304026604 2023-01-22 19:59:40.220150: step: 1968/527, loss: 0.030393436551094055 2023-01-22 19:59:41.275604: step: 1972/527, loss: 0.0058537875302135944 2023-01-22 19:59:42.326051: step: 1976/527, loss: 0.0026069083251059055 2023-01-22 19:59:43.364835: step: 1980/527, loss: 0.009423689916729927 2023-01-22 19:59:44.419051: step: 1984/527, loss: 0.022480305284261703 2023-01-22 19:59:45.458815: step: 1988/527, loss: 0.030856041237711906 2023-01-22 19:59:46.515417: step: 1992/527, loss: 0.004838964436203241 2023-01-22 19:59:47.558591: step: 1996/527, loss: 0.005152531433850527 2023-01-22 19:59:48.601125: step: 2000/527, loss: 0.005777544807642698 2023-01-22 19:59:49.660694: step: 2004/527, loss: 0.014499610289931297 2023-01-22 19:59:50.703377: step: 2008/527, loss: 0.0042851450853049755 2023-01-22 19:59:51.759837: step: 2012/527, loss: 0.012328427284955978 2023-01-22 19:59:52.798913: step: 2016/527, loss: 
0.016944842413067818 2023-01-22 19:59:53.829755: step: 2020/527, loss: 0.007113997358828783 2023-01-22 19:59:54.875585: step: 2024/527, loss: 0.05141862481832504 2023-01-22 19:59:55.921959: step: 2028/527, loss: 0.012867379933595657 2023-01-22 19:59:56.956777: step: 2032/527, loss: 0.004897923208773136 2023-01-22 19:59:58.007821: step: 2036/527, loss: 0.00516819953918457 2023-01-22 19:59:59.038258: step: 2040/527, loss: 0.016002006828784943 2023-01-22 20:00:00.083633: step: 2044/527, loss: 0.005539006553590298 2023-01-22 20:00:01.133176: step: 2048/527, loss: 0.007042177952826023 2023-01-22 20:00:02.195843: step: 2052/527, loss: 0.02467612735927105 2023-01-22 20:00:03.249144: step: 2056/527, loss: 0.009669848717749119 2023-01-22 20:00:04.287161: step: 2060/527, loss: 0.002288132905960083 2023-01-22 20:00:05.322386: step: 2064/527, loss: 0.014058349654078484 2023-01-22 20:00:06.368301: step: 2068/527, loss: 0.006001413334161043 2023-01-22 20:00:07.438756: step: 2072/527, loss: 0.026808515191078186 2023-01-22 20:00:08.485551: step: 2076/527, loss: 0.005903674755245447 2023-01-22 20:00:09.531982: step: 2080/527, loss: 0.00975488405674696 2023-01-22 20:00:10.592121: step: 2084/527, loss: 0.00612968485802412 2023-01-22 20:00:11.645323: step: 2088/527, loss: 0.012655384838581085 2023-01-22 20:00:12.682095: step: 2092/527, loss: 0.017377832904458046 2023-01-22 20:00:13.715366: step: 2096/527, loss: 0.00624031713232398 2023-01-22 20:00:14.759598: step: 2100/527, loss: 0.005781632848083973 2023-01-22 20:00:15.791294: step: 2104/527, loss: 3.921493771485984e-06 2023-01-22 20:00:16.830658: step: 2108/527, loss: 0.012473038397729397 ================================================== Loss: 0.014 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3258496806569343, 'r': 0.33883420303605316, 'f1': 0.3322151162790698}, 'combined': 0.24479008567931457, 'stategy': 1, 'epoch': 2} Test Chinese: {'template': 
{'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.34233743704903313, 'r': 0.30717004578854157, 'f1': 0.32380167740047505}, 'combined': 0.207233073536304, 'stategy': 1, 'epoch': 2}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33179762539295776, 'r': 0.3594998939267151, 'f1': 0.3450937050990508}, 'combined': 0.2542795721782479, 'stategy': 1, 'epoch': 2}
Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3607296104330796, 'r': 0.30891572093451003, 'f1': 0.3328181126620578}, 'combined': 0.21300359210371697, 'stategy': 1, 'epoch': 2}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33005343338881826, 'r': 0.32817457133916655, 'f1': 0.3291113208291927}, 'combined': 0.24250307850572092, 'stategy': 1, 'epoch': 2}
Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.36864655184161255, 'r': 0.2945147156660016, 'f1': 0.32743720032062296}, 'combined': 0.23476629456950326, 'stategy': 1, 'epoch': 2}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 2}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 2}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 2}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0}
Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3264723230548723, 'r': 0.33576470416649107, 'f1': 0.3310533191688322}, 'combined': 0.24393402465071845, 'stategy': 1, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.35636473460031554, 'r': 0.2976731814040852, 'f1': 0.32438554919493273}, 'combined': 0.23257831829070652, 'stategy': 1, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156,
'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 0} ****************************** Epoch: 3 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 20:02:41.326461: step: 4/527, loss: 0.09630347788333893 2023-01-22 20:02:42.372222: step: 8/527, loss: 0.004018387757241726 2023-01-22 20:02:43.430737: step: 12/527, loss: 0.006106141023337841 2023-01-22 20:02:44.477532: step: 16/527, loss: 0.019484447315335274 2023-01-22 20:02:45.534231: step: 20/527, loss: 0.025568706914782524 2023-01-22 20:02:46.591144: step: 24/527, loss: 0.005435064435005188 2023-01-22 20:02:47.647197: step: 28/527, loss: 0.002151063410565257 2023-01-22 20:02:48.686988: step: 32/527, loss: 0.009927656501531601 2023-01-22 20:02:49.714448: step: 36/527, loss: 0.005331465974450111 2023-01-22 20:02:50.761066: step: 40/527, loss: 0.013922012411057949 2023-01-22 20:02:51.796275: step: 44/527, loss: 0.006039150059223175 2023-01-22 20:02:52.850454: step: 48/527, loss: 0.001960847992449999 2023-01-22 20:02:53.886413: step: 52/527, loss: 0.011658263392746449 2023-01-22 20:02:54.927392: step: 56/527, loss: 0.047435592859983444 2023-01-22 20:02:55.957022: step: 60/527, loss: 0.007725862320512533 2023-01-22 20:02:57.007900: step: 64/527, loss: 0.004883904475718737 2023-01-22 20:02:58.052849: step: 68/527, loss: 0.010088262148201466 2023-01-22 20:02:59.090993: step: 72/527, loss: 0.001696677994914353 2023-01-22 20:03:00.118785: step: 76/527, loss: 0.0034419659059494734 2023-01-22 20:03:01.166557: step: 80/527, loss: 0.0023205492179840803 2023-01-22 20:03:02.220010: step: 84/527, loss: 0.0010177545482292771 2023-01-22 20:03:03.256847: step: 88/527, loss: 0.0009474693215452135 2023-01-22 20:03:04.315982: step: 92/527, loss: 0.005022803321480751 2023-01-22 20:03:05.374532: step: 
96/527, loss: 0.005060776136815548 2023-01-22 20:03:06.437684: step: 100/527, loss: 0.01681423746049404 2023-01-22 20:03:07.497064: step: 104/527, loss: 0.01177146378904581 2023-01-22 20:03:08.556284: step: 108/527, loss: 0.007815971039235592 2023-01-22 20:03:09.586627: step: 112/527, loss: 0.007773871533572674 2023-01-22 20:03:10.659840: step: 116/527, loss: 0.0027232191059738398 2023-01-22 20:03:11.724262: step: 120/527, loss: 0.005991565063595772 2023-01-22 20:03:12.764549: step: 124/527, loss: 0.00684939231723547 2023-01-22 20:03:13.795520: step: 128/527, loss: 0.0034862826578319073 2023-01-22 20:03:14.834956: step: 132/527, loss: 0.006345641799271107 2023-01-22 20:03:15.889988: step: 136/527, loss: 0.026764852926135063 2023-01-22 20:03:16.950791: step: 140/527, loss: 0.006506483536213636 2023-01-22 20:03:17.991790: step: 144/527, loss: 0.0008612524252384901 2023-01-22 20:03:19.031263: step: 148/527, loss: 0.005222842562943697 2023-01-22 20:03:20.078319: step: 152/527, loss: 0.00484373839572072 2023-01-22 20:03:21.119511: step: 156/527, loss: 0.006098178215324879 2023-01-22 20:03:22.163685: step: 160/527, loss: 0.004398141521960497 2023-01-22 20:03:23.201700: step: 164/527, loss: 0.010773123241961002 2023-01-22 20:03:24.252504: step: 168/527, loss: 0.036310240626335144 2023-01-22 20:03:25.314453: step: 172/527, loss: 0.007324503269046545 2023-01-22 20:03:26.368656: step: 176/527, loss: 0.005582513753324747 2023-01-22 20:03:27.405536: step: 180/527, loss: 0.011807762086391449 2023-01-22 20:03:28.480434: step: 184/527, loss: 0.0037986496463418007 2023-01-22 20:03:29.521336: step: 188/527, loss: 0.0023038529325276613 2023-01-22 20:03:30.565365: step: 192/527, loss: 0.0028449594974517822 2023-01-22 20:03:31.610856: step: 196/527, loss: 0.006153217051178217 2023-01-22 20:03:32.668150: step: 200/527, loss: 0.01319731306284666 2023-01-22 20:03:33.717802: step: 204/527, loss: 0.006297122221440077 2023-01-22 20:03:34.774108: step: 208/527, loss: 0.010150426998734474 
2023-01-22 20:03:35.818856: step: 212/527, loss: 0.013856690376996994 2023-01-22 20:03:36.854264: step: 216/527, loss: 0.0013324968749657273 2023-01-22 20:03:37.893605: step: 220/527, loss: 0.010310085490345955 2023-01-22 20:03:38.940716: step: 224/527, loss: 0.007142497692257166 2023-01-22 20:03:39.999330: step: 228/527, loss: 0.002759611699730158 2023-01-22 20:03:41.039388: step: 232/527, loss: 0.005071406718343496 2023-01-22 20:03:42.080137: step: 236/527, loss: 0.013709788210690022 2023-01-22 20:03:43.137773: step: 240/527, loss: 0.00850035808980465 2023-01-22 20:03:44.177471: step: 244/527, loss: 0.009937750175595284 2023-01-22 20:03:45.236523: step: 248/527, loss: 0.008300106041133404 2023-01-22 20:03:46.288892: step: 252/527, loss: 0.023348016664385796 2023-01-22 20:03:47.359059: step: 256/527, loss: 0.01973486691713333 2023-01-22 20:03:48.411729: step: 260/527, loss: 0.00919699389487505 2023-01-22 20:03:49.469447: step: 264/527, loss: 0.00831583235412836 2023-01-22 20:03:50.520179: step: 268/527, loss: 0.002271933713927865 2023-01-22 20:03:51.564128: step: 272/527, loss: 0.009893148206174374 2023-01-22 20:03:52.605341: step: 276/527, loss: 0.005620964802801609 2023-01-22 20:03:53.667950: step: 280/527, loss: 0.025527391582727432 2023-01-22 20:03:54.744922: step: 284/527, loss: 0.00900374073535204 2023-01-22 20:03:55.811405: step: 288/527, loss: 0.005033768247812986 2023-01-22 20:03:56.851360: step: 292/527, loss: 0.022912397980690002 2023-01-22 20:03:57.912321: step: 296/527, loss: 0.005532699637115002 2023-01-22 20:03:58.975237: step: 300/527, loss: 0.004621663596481085 2023-01-22 20:04:00.030341: step: 304/527, loss: 0.0020022594835609198 2023-01-22 20:04:01.079003: step: 308/527, loss: 0.015827590599656105 2023-01-22 20:04:02.133596: step: 312/527, loss: 0.0014097942039370537 2023-01-22 20:04:03.198584: step: 316/527, loss: 0.012368621304631233 2023-01-22 20:04:04.260455: step: 320/527, loss: 0.004982339218258858 2023-01-22 20:04:05.300020: step: 
324/527, loss: 0.03387031704187393 2023-01-22 20:04:06.348906: step: 328/527, loss: 0.01007978618144989 2023-01-22 20:04:07.387010: step: 332/527, loss: 0.02267213724553585 2023-01-22 20:04:08.459627: step: 336/527, loss: 0.005016049835830927 2023-01-22 20:04:09.524841: step: 340/527, loss: 0.02270680107176304 2023-01-22 20:04:10.577588: step: 344/527, loss: 0.02230987511575222 2023-01-22 20:04:11.636625: step: 348/527, loss: 0.024212274700403214 2023-01-22 20:04:12.696860: step: 352/527, loss: 0.006584585178643465 2023-01-22 20:04:13.730087: step: 356/527, loss: 0.004892353899776936 2023-01-22 20:04:14.789153: step: 360/527, loss: 0.0075839790515601635 2023-01-22 20:04:15.835518: step: 364/527, loss: 0.006894493941217661 2023-01-22 20:04:16.901922: step: 368/527, loss: 0.019412977620959282 2023-01-22 20:04:17.942384: step: 372/527, loss: 0.0022092692088335752 2023-01-22 20:04:18.997770: step: 376/527, loss: 0.0050852615386247635 2023-01-22 20:04:20.049342: step: 380/527, loss: 0.002591255586594343 2023-01-22 20:04:21.120799: step: 384/527, loss: 0.06252925843000412 2023-01-22 20:04:22.172715: step: 388/527, loss: 0.008522537536919117 2023-01-22 20:04:23.244921: step: 392/527, loss: 0.008968872018158436 2023-01-22 20:04:24.286098: step: 396/527, loss: 0.00962995458394289 2023-01-22 20:04:25.341387: step: 400/527, loss: 0.008363430388271809 2023-01-22 20:04:26.399933: step: 404/527, loss: 0.01252360362559557 2023-01-22 20:04:27.445284: step: 408/527, loss: 0.007110966369509697 2023-01-22 20:04:28.499156: step: 412/527, loss: 0.006380102597177029 2023-01-22 20:04:29.556333: step: 416/527, loss: 0.006093526259064674 2023-01-22 20:04:30.610521: step: 420/527, loss: 0.011622357182204723 2023-01-22 20:04:31.665962: step: 424/527, loss: 0.0031945989467203617 2023-01-22 20:04:32.704512: step: 428/527, loss: 0.01747400499880314 2023-01-22 20:04:33.765758: step: 432/527, loss: 0.045519184321165085 2023-01-22 20:04:34.823472: step: 436/527, loss: 0.006821592804044485 
2023-01-22 20:04:35.868408: step: 440/527, loss: 0.004795973189175129 2023-01-22 20:04:36.907664: step: 444/527, loss: 0.005497786216437817 2023-01-22 20:04:37.964239: step: 448/527, loss: 0.006155397742986679 2023-01-22 20:04:39.012729: step: 452/527, loss: 0.005620886571705341 2023-01-22 20:04:40.079985: step: 456/527, loss: 0.009695395827293396 2023-01-22 20:04:41.155361: step: 460/527, loss: 0.02128414995968342 2023-01-22 20:04:42.219496: step: 464/527, loss: 0.004674040712416172 2023-01-22 20:04:43.256709: step: 468/527, loss: 0.015705401077866554 2023-01-22 20:04:44.291585: step: 472/527, loss: 0.0018358565866947174 2023-01-22 20:04:45.338269: step: 476/527, loss: 0.0009739425731822848 2023-01-22 20:04:46.382190: step: 480/527, loss: 0.035310838371515274 2023-01-22 20:04:47.438265: step: 484/527, loss: 0.006547639146447182 2023-01-22 20:04:48.490229: step: 488/527, loss: 0.0026764969807118177 2023-01-22 20:04:49.558069: step: 492/527, loss: 0.002714770380407572 2023-01-22 20:04:50.619420: step: 496/527, loss: 0.02095515839755535 2023-01-22 20:04:51.673399: step: 500/527, loss: 0.006102669518440962 2023-01-22 20:04:52.731309: step: 504/527, loss: 0.02948002889752388 2023-01-22 20:04:53.788953: step: 508/527, loss: 0.005745660979300737 2023-01-22 20:04:54.862652: step: 512/527, loss: 0.013627566397190094 2023-01-22 20:04:55.920593: step: 516/527, loss: 0.009460663422942162 2023-01-22 20:04:56.968637: step: 520/527, loss: 0.008031336590647697 2023-01-22 20:04:58.033014: step: 524/527, loss: 0.0041961390525102615 2023-01-22 20:04:59.077893: step: 528/527, loss: 0.00910942256450653 2023-01-22 20:05:00.141683: step: 532/527, loss: 0.004254691768437624 2023-01-22 20:05:01.212637: step: 536/527, loss: 0.0016187336295843124 2023-01-22 20:05:02.266263: step: 540/527, loss: 0.0005065691657364368 2023-01-22 20:05:03.315496: step: 544/527, loss: 0.02607870101928711 2023-01-22 20:05:04.373545: step: 548/527, loss: 0.0027424455620348454 2023-01-22 20:05:05.414352: step: 
552/527, loss: 0.005031141918152571 2023-01-22 20:05:06.466184: step: 556/527, loss: 0.006300052627921104 2023-01-22 20:05:07.513823: step: 560/527, loss: 0.022951407358050346 2023-01-22 20:05:08.584771: step: 564/527, loss: 0.028079045936465263 2023-01-22 20:05:09.642227: step: 568/527, loss: 0.0240947138518095 2023-01-22 20:05:10.701883: step: 572/527, loss: 0.013987342827022076 2023-01-22 20:05:11.755351: step: 576/527, loss: 0.014969791285693645 2023-01-22 20:05:12.803554: step: 580/527, loss: 0.013831953518092632 2023-01-22 20:05:13.871767: step: 584/527, loss: 0.013960978016257286 2023-01-22 20:05:14.913064: step: 588/527, loss: 0.005384617485105991 2023-01-22 20:05:15.968398: step: 592/527, loss: 0.049049802124500275 2023-01-22 20:05:17.026321: step: 596/527, loss: 0.004875629674643278 2023-01-22 20:05:18.076887: step: 600/527, loss: 0.01216600276529789 2023-01-22 20:05:19.124464: step: 604/527, loss: 0.03795206546783447 2023-01-22 20:05:20.194463: step: 608/527, loss: 0.009664355777204037 2023-01-22 20:05:21.248275: step: 612/527, loss: 0.012557614594697952 2023-01-22 20:05:22.301151: step: 616/527, loss: 0.0034800933208316565 2023-01-22 20:05:23.358150: step: 620/527, loss: 0.005707267206162214 2023-01-22 20:05:24.399470: step: 624/527, loss: 0.01435763668268919 2023-01-22 20:05:25.440396: step: 628/527, loss: 0.009481802582740784 2023-01-22 20:05:26.491311: step: 632/527, loss: 0.012051774188876152 2023-01-22 20:05:27.548446: step: 636/527, loss: 0.002101621124893427 2023-01-22 20:05:28.607856: step: 640/527, loss: 0.007650560233741999 2023-01-22 20:05:29.663047: step: 644/527, loss: 0.024623781442642212 2023-01-22 20:05:30.711708: step: 648/527, loss: 0.00864055659621954 2023-01-22 20:05:31.769818: step: 652/527, loss: 0.013074246235191822 2023-01-22 20:05:32.842550: step: 656/527, loss: 0.004558406304568052 2023-01-22 20:05:33.902614: step: 660/527, loss: 0.002944357693195343 2023-01-22 20:05:34.959855: step: 664/527, loss: 0.008235105313360691 
2023-01-22 20:05:36.014568: step: 668/527, loss: 0.008779531344771385 2023-01-22 20:05:37.060277: step: 672/527, loss: 0.008774541318416595 2023-01-22 20:05:38.116779: step: 676/527, loss: 0.00882588978856802 2023-01-22 20:05:39.169003: step: 680/527, loss: 0.006947447080165148 2023-01-22 20:05:40.229239: step: 684/527, loss: 0.002728177234530449 2023-01-22 20:05:41.287961: step: 688/527, loss: 0.005696764215826988 2023-01-22 20:05:42.332868: step: 692/527, loss: 0.0032642583828419447 2023-01-22 20:05:43.376106: step: 696/527, loss: 0.006180267781019211 2023-01-22 20:05:44.408647: step: 700/527, loss: 0.013511242344975471 2023-01-22 20:05:45.473673: step: 704/527, loss: 0.01107818353921175 2023-01-22 20:05:46.518802: step: 708/527, loss: 0.0003384547890163958 2023-01-22 20:05:47.570939: step: 712/527, loss: 0.006935957819223404 2023-01-22 20:05:48.609083: step: 716/527, loss: 0.0007469377596862614 2023-01-22 20:05:49.697319: step: 720/527, loss: 0.007006959989666939 2023-01-22 20:05:50.743437: step: 724/527, loss: 0.015508010052144527 2023-01-22 20:05:51.793243: step: 728/527, loss: 0.00976771954447031 2023-01-22 20:05:52.851887: step: 732/527, loss: 0.005833568051457405 2023-01-22 20:05:53.902362: step: 736/527, loss: 0.016178976744413376 2023-01-22 20:05:54.957999: step: 740/527, loss: 0.005664953030645847 2023-01-22 20:05:56.007062: step: 744/527, loss: 0.009345514699816704 2023-01-22 20:05:57.059379: step: 748/527, loss: 0.029329312965273857 2023-01-22 20:05:58.111217: step: 752/527, loss: 0.013494308106601238 2023-01-22 20:05:59.146310: step: 756/527, loss: 0.008023953065276146 2023-01-22 20:06:00.187034: step: 760/527, loss: 0.004013955593109131 2023-01-22 20:06:01.235409: step: 764/527, loss: 0.0053697009570896626 2023-01-22 20:06:02.295331: step: 768/527, loss: 0.01831982284784317 2023-01-22 20:06:03.342416: step: 772/527, loss: 0.006706717889755964 2023-01-22 20:06:04.387296: step: 776/527, loss: 0.006412239279597998 2023-01-22 20:06:05.446965: step: 
780/527, loss: 0.010564597323536873 2023-01-22 20:06:06.506112: step: 784/527, loss: 0.023201294243335724 2023-01-22 20:06:07.555360: step: 788/527, loss: 0.008991423062980175 2023-01-22 20:06:08.617990: step: 792/527, loss: 0.004017589148133993 2023-01-22 20:06:09.656309: step: 796/527, loss: 7.491168798878789e-05 2023-01-22 20:06:10.726015: step: 800/527, loss: 0.01232368964701891 2023-01-22 20:06:11.792997: step: 804/527, loss: 0.0076598916202783585 2023-01-22 20:06:12.859330: step: 808/527, loss: 0.019467106088995934 2023-01-22 20:06:13.908368: step: 812/527, loss: 0.005291010718792677 2023-01-22 20:06:14.977466: step: 816/527, loss: 0.054278384894132614 2023-01-22 20:06:16.016517: step: 820/527, loss: 0.010627686977386475 2023-01-22 20:06:17.060787: step: 824/527, loss: 0.029858341440558434 2023-01-22 20:06:18.111978: step: 828/527, loss: 0.002811152022331953 2023-01-22 20:06:19.169603: step: 832/527, loss: 0.018511833623051643 2023-01-22 20:06:20.221350: step: 836/527, loss: 0.02764853835105896 2023-01-22 20:06:21.269284: step: 840/527, loss: 0.022996187210083008 2023-01-22 20:06:22.322005: step: 844/527, loss: 0.006607236806303263 2023-01-22 20:06:23.383224: step: 848/527, loss: 0.0020034622866660357 2023-01-22 20:06:24.438935: step: 852/527, loss: 0.005831459537148476 2023-01-22 20:06:25.517594: step: 856/527, loss: 0.0005701023619621992 2023-01-22 20:06:26.561319: step: 860/527, loss: 0.0021332784090191126 2023-01-22 20:06:27.605059: step: 864/527, loss: 0.004820593632757664 2023-01-22 20:06:28.674646: step: 868/527, loss: 0.00851569976657629 2023-01-22 20:06:29.727447: step: 872/527, loss: 0.0057282340712845325 2023-01-22 20:06:30.797624: step: 876/527, loss: 0.03575843200087547 2023-01-22 20:06:31.842972: step: 880/527, loss: 0.007071408908814192 2023-01-22 20:06:32.891102: step: 884/527, loss: 0.010876198299229145 2023-01-22 20:06:33.941122: step: 888/527, loss: 0.0015082152094691992 2023-01-22 20:06:34.998413: step: 892/527, loss: 0.04648592323064804 
2023-01-22 20:06:36.048782: step: 896/527, loss: 0.003673673141747713 2023-01-22 20:06:37.089748: step: 900/527, loss: 0.011082727462053299 2023-01-22 20:06:38.152485: step: 904/527, loss: 0.009784271940588951 2023-01-22 20:06:39.222710: step: 908/527, loss: 0.0042418260127305984 2023-01-22 20:06:40.279171: step: 912/527, loss: 0.005253738723695278 2023-01-22 20:06:41.341774: step: 916/527, loss: 0.002525955904275179 2023-01-22 20:06:42.406064: step: 920/527, loss: 0.005559364799410105 2023-01-22 20:06:43.467655: step: 924/527, loss: 0.001615817192941904 2023-01-22 20:06:44.511734: step: 928/527, loss: 0.011296793818473816 2023-01-22 20:06:45.579248: step: 932/527, loss: 0.007280383259057999 2023-01-22 20:06:46.618033: step: 936/527, loss: 0.007404201664030552 2023-01-22 20:06:47.670043: step: 940/527, loss: 0.00373828480951488 2023-01-22 20:06:48.721285: step: 944/527, loss: 0.014845043420791626 2023-01-22 20:06:49.786983: step: 948/527, loss: 0.003047631587833166 2023-01-22 20:06:50.846220: step: 952/527, loss: 0.07062729448080063 2023-01-22 20:06:51.884194: step: 956/527, loss: 0.005586340092122555 2023-01-22 20:06:52.939910: step: 960/527, loss: 0.007183740381151438 2023-01-22 20:06:53.971476: step: 964/527, loss: 0.00403355248272419 2023-01-22 20:06:55.011744: step: 968/527, loss: 0.005623816046863794 2023-01-22 20:06:56.063849: step: 972/527, loss: 0.011466557160019875 2023-01-22 20:06:57.117356: step: 976/527, loss: 0.02611096389591694 2023-01-22 20:06:58.156995: step: 980/527, loss: 0.004564451519399881 2023-01-22 20:06:59.206335: step: 984/527, loss: 0.0065962281078100204 2023-01-22 20:07:00.257255: step: 988/527, loss: 0.031127341091632843 2023-01-22 20:07:01.297609: step: 992/527, loss: 0.029385266825556755 2023-01-22 20:07:02.337757: step: 996/527, loss: 0.004823361989110708 2023-01-22 20:07:03.394435: step: 1000/527, loss: 0.009636526927351952 2023-01-22 20:07:04.451188: step: 1004/527, loss: 0.009387986734509468 2023-01-22 20:07:05.493962: step: 
1008/527, loss: 0.002747615799307823 2023-01-22 20:07:06.543655: step: 1012/527, loss: 0.005756690166890621 2023-01-22 20:07:07.606907: step: 1016/527, loss: 0.010362193919718266 2023-01-22 20:07:08.668760: step: 1020/527, loss: 0.0043361494317650795 2023-01-22 20:07:09.726456: step: 1024/527, loss: 0.014931879006326199 2023-01-22 20:07:10.778938: step: 1028/527, loss: 0.03934931010007858 2023-01-22 20:07:11.829588: step: 1032/527, loss: 0.007010089699178934 2023-01-22 20:07:12.890294: step: 1036/527, loss: 0.02363082766532898 2023-01-22 20:07:13.945439: step: 1040/527, loss: 0.003583756275475025 2023-01-22 20:07:14.985237: step: 1044/527, loss: 0.02431095950305462 2023-01-22 20:07:16.030256: step: 1048/527, loss: 0.005139993038028479 2023-01-22 20:07:17.088560: step: 1052/527, loss: 0.01207831222563982 2023-01-22 20:07:18.136417: step: 1056/527, loss: 0.025138530880212784 2023-01-22 20:07:19.222745: step: 1060/527, loss: 0.011888965964317322 2023-01-22 20:07:20.274288: step: 1064/527, loss: 0.019171904772520065 2023-01-22 20:07:21.330692: step: 1068/527, loss: 0.013876696117222309 2023-01-22 20:07:22.367904: step: 1072/527, loss: 0.015924135223031044 2023-01-22 20:07:23.425033: step: 1076/527, loss: 0.04972352832555771 2023-01-22 20:07:24.458544: step: 1080/527, loss: 0.00970274768769741 2023-01-22 20:07:25.528502: step: 1084/527, loss: 0.005475292447954416 2023-01-22 20:07:26.581454: step: 1088/527, loss: 0.004668317269533873 2023-01-22 20:07:27.631330: step: 1092/527, loss: 0.011428778059780598 2023-01-22 20:07:28.676343: step: 1096/527, loss: 0.01907227374613285 2023-01-22 20:07:29.733879: step: 1100/527, loss: 0.0066899326629936695 2023-01-22 20:07:30.800579: step: 1104/527, loss: 0.008204030804336071 2023-01-22 20:07:31.852876: step: 1108/527, loss: 0.003582809353247285 2023-01-22 20:07:32.902938: step: 1112/527, loss: 0.009270605631172657 2023-01-22 20:07:33.958436: step: 1116/527, loss: 0.004008301068097353 2023-01-22 20:07:35.001685: step: 1120/527, loss: 
0.034290336072444916 2023-01-22 20:07:36.049241: step: 1124/527, loss: 0.02331523410975933 2023-01-22 20:07:37.096379: step: 1128/527, loss: 0.0029394710436463356 2023-01-22 20:07:38.162892: step: 1132/527, loss: 0.021229039877653122 2023-01-22 20:07:39.211661: step: 1136/527, loss: 0.013098455965518951 2023-01-22 20:07:40.260174: step: 1140/527, loss: 0.004252019338309765 2023-01-22 20:07:41.308053: step: 1144/527, loss: 0.004101179540157318 2023-01-22 20:07:42.366260: step: 1148/527, loss: 0.01199449971318245 2023-01-22 20:07:43.409228: step: 1152/527, loss: 0.009503703564405441 2023-01-22 20:07:44.463303: step: 1156/527, loss: 0.00757455313578248 2023-01-22 20:07:45.512589: step: 1160/527, loss: 0.034820638597011566 2023-01-22 20:07:46.573170: step: 1164/527, loss: 0.027094146236777306 2023-01-22 20:07:47.616303: step: 1168/527, loss: 0.0037842022720724344 2023-01-22 20:07:48.665875: step: 1172/527, loss: 0.014584150165319443 2023-01-22 20:07:49.724402: step: 1176/527, loss: 0.011031536385416985 2023-01-22 20:07:50.776956: step: 1180/527, loss: 0.0002956017560791224 2023-01-22 20:07:51.836789: step: 1184/527, loss: 0.005388931836932898 2023-01-22 20:07:52.917030: step: 1188/527, loss: 0.004438537638634443 2023-01-22 20:07:53.978718: step: 1192/527, loss: 0.0017783530056476593 2023-01-22 20:07:55.030627: step: 1196/527, loss: 0.0026618139818310738 2023-01-22 20:07:56.083444: step: 1200/527, loss: 0.0013757027918472886 2023-01-22 20:07:57.141135: step: 1204/527, loss: 0.0038849753327667713 2023-01-22 20:07:58.180362: step: 1208/527, loss: 0.003766994457691908 2023-01-22 20:07:59.236515: step: 1212/527, loss: 0.0011678735027089715 2023-01-22 20:08:00.282635: step: 1216/527, loss: 0.0021237481851130724 2023-01-22 20:08:01.315841: step: 1220/527, loss: 0.009384707547724247 2023-01-22 20:08:02.353457: step: 1224/527, loss: 0.005486843641847372 2023-01-22 20:08:03.396633: step: 1228/527, loss: 0.004588239826261997 2023-01-22 20:08:04.445060: step: 1232/527, loss: 
0.007448290474712849 2023-01-22 20:08:05.486952: step: 1236/527, loss: 0.017806917428970337 2023-01-22 20:08:06.540294: step: 1240/527, loss: 0.03841574490070343 2023-01-22 20:08:07.585876: step: 1244/527, loss: 0.02057504653930664 2023-01-22 20:08:08.627956: step: 1248/527, loss: 0.004881734028458595 2023-01-22 20:08:09.671834: step: 1252/527, loss: 0.03089827299118042 2023-01-22 20:08:10.724963: step: 1256/527, loss: 0.0021302737295627594 2023-01-22 20:08:11.774565: step: 1260/527, loss: 0.0062431758269667625 2023-01-22 20:08:12.807746: step: 1264/527, loss: 0.0011526638409122825 2023-01-22 20:08:13.852587: step: 1268/527, loss: 0.008987120352685452 2023-01-22 20:08:14.904700: step: 1272/527, loss: 0.0035955901257693768 2023-01-22 20:08:15.959054: step: 1276/527, loss: 0.02776988409459591 2023-01-22 20:08:17.007471: step: 1280/527, loss: 0.011404940858483315 2023-01-22 20:08:18.051757: step: 1284/527, loss: 0.00906613189727068 2023-01-22 20:08:19.098087: step: 1288/527, loss: 0.013027424924075603 2023-01-22 20:08:20.140207: step: 1292/527, loss: 0.009509925730526447 2023-01-22 20:08:21.182557: step: 1296/527, loss: 0.024461472406983376 2023-01-22 20:08:22.226208: step: 1300/527, loss: 0.0008931338088586926 2023-01-22 20:08:23.292801: step: 1304/527, loss: 0.00578805897384882 2023-01-22 20:08:24.341401: step: 1308/527, loss: 0.011017491109669209 2023-01-22 20:08:25.384421: step: 1312/527, loss: 0.0020963025745004416 2023-01-22 20:08:26.442841: step: 1316/527, loss: 0.007990492507815361 2023-01-22 20:08:27.480648: step: 1320/527, loss: 0.03873471915721893 2023-01-22 20:08:28.537834: step: 1324/527, loss: 0.0013404254568740726 2023-01-22 20:08:29.576422: step: 1328/527, loss: 0.007413769606500864 2023-01-22 20:08:30.633354: step: 1332/527, loss: 0.018406588584184647 2023-01-22 20:08:31.668364: step: 1336/527, loss: 0.0036480906419456005 2023-01-22 20:08:32.720499: step: 1340/527, loss: 0.0018968930235132575 2023-01-22 20:08:33.773024: step: 1344/527, loss: 
0.004612892400473356 2023-01-22 20:08:34.814047: step: 1348/527, loss: 0.003450157353654504 2023-01-22 20:08:35.872900: step: 1352/527, loss: 0.018316416069865227 2023-01-22 20:08:36.932779: step: 1356/527, loss: 0.003896051086485386 2023-01-22 20:08:37.961348: step: 1360/527, loss: 0.006394432857632637 2023-01-22 20:08:39.002134: step: 1364/527, loss: 0.010783486999571323 2023-01-22 20:08:40.037751: step: 1368/527, loss: 0.007230890914797783 2023-01-22 20:08:41.082613: step: 1372/527, loss: 0.008847628720104694 2023-01-22 20:08:42.123058: step: 1376/527, loss: 9.944938938133419e-05 2023-01-22 20:08:43.188677: step: 1380/527, loss: 0.01847342774271965 2023-01-22 20:08:44.251901: step: 1384/527, loss: 0.006782527081668377 2023-01-22 20:08:45.303393: step: 1388/527, loss: 0.009193900972604752 2023-01-22 20:08:46.334872: step: 1392/527, loss: 0.0022746522445231676 2023-01-22 20:08:47.382522: step: 1396/527, loss: 0.0030602654442191124 2023-01-22 20:08:48.438361: step: 1400/527, loss: 0.006369085982441902 2023-01-22 20:08:49.494157: step: 1404/527, loss: 0.02380537986755371 2023-01-22 20:08:50.546128: step: 1408/527, loss: 0.0026016035117208958 2023-01-22 20:08:51.588531: step: 1412/527, loss: 0.0034504702780395746 2023-01-22 20:08:52.657429: step: 1416/527, loss: 0.00810646079480648 2023-01-22 20:08:53.715048: step: 1420/527, loss: 0.027019396424293518 2023-01-22 20:08:54.772695: step: 1424/527, loss: 0.01575886830687523 2023-01-22 20:08:55.823550: step: 1428/527, loss: 0.007510844152420759 2023-01-22 20:08:56.884651: step: 1432/527, loss: 0.011695628985762596 2023-01-22 20:08:57.950472: step: 1436/527, loss: 0.0044456482864916325 2023-01-22 20:08:59.012000: step: 1440/527, loss: 0.01765453815460205 2023-01-22 20:09:00.048750: step: 1444/527, loss: 0.005490010604262352 2023-01-22 20:09:01.097863: step: 1448/527, loss: 0.001851929584518075 2023-01-22 20:09:02.134887: step: 1452/527, loss: 0.015075921081006527 2023-01-22 20:09:03.192787: step: 1456/527, loss: 
0.0053201522678136826 2023-01-22 20:09:04.231412: step: 1460/527, loss: 0.005342629738152027 2023-01-22 20:09:05.277558: step: 1464/527, loss: 0.01672927848994732 2023-01-22 20:09:06.318223: step: 1468/527, loss: 0.006210397928953171 2023-01-22 20:09:07.390082: step: 1472/527, loss: 0.050433091819286346 2023-01-22 20:09:08.436264: step: 1476/527, loss: 0.015234251506626606 2023-01-22 20:09:09.477835: step: 1480/527, loss: 0.0038743913173675537 2023-01-22 20:09:10.499332: step: 1484/527, loss: 0.01153487153351307 2023-01-22 20:09:11.560272: step: 1488/527, loss: 0.012831402011215687 2023-01-22 20:09:12.606313: step: 1492/527, loss: 0.0058450717478990555 2023-01-22 20:09:13.676633: step: 1496/527, loss: 0.0267823226749897 2023-01-22 20:09:14.721081: step: 1500/527, loss: 0.0058557698503136635 2023-01-22 20:09:15.776215: step: 1504/527, loss: 0.005743971094489098 2023-01-22 20:09:16.816201: step: 1508/527, loss: 0.0 2023-01-22 20:09:17.875658: step: 1512/527, loss: 0.016628021374344826 2023-01-22 20:09:18.906603: step: 1516/527, loss: 0.005140840541571379 2023-01-22 20:09:19.943406: step: 1520/527, loss: 0.004369328264147043 2023-01-22 20:09:21.004290: step: 1524/527, loss: 0.003731008153408766 2023-01-22 20:09:22.062196: step: 1528/527, loss: 0.03829382359981537 2023-01-22 20:09:23.108479: step: 1532/527, loss: 0.008014494553208351 2023-01-22 20:09:24.162311: step: 1536/527, loss: 0.011885719373822212 2023-01-22 20:09:25.224509: step: 1540/527, loss: 0.01657886803150177 2023-01-22 20:09:26.273458: step: 1544/527, loss: 0.01222853735089302 2023-01-22 20:09:27.318848: step: 1548/527, loss: 0.0029392321594059467 2023-01-22 20:09:28.357310: step: 1552/527, loss: 0.004848426673561335 2023-01-22 20:09:29.395891: step: 1556/527, loss: 0.02532697655260563 2023-01-22 20:09:30.459825: step: 1560/527, loss: 0.01223083771765232 2023-01-22 20:09:31.512697: step: 1564/527, loss: 0.0010586688295006752 2023-01-22 20:09:32.538701: step: 1568/527, loss: 0.0027050855569541454 
2023-01-22 20:09:33.583552: step: 1572/527, loss: 0.002986507024616003 2023-01-22 20:09:34.641822: step: 1576/527, loss: 0.005963047035038471 2023-01-22 20:09:35.697240: step: 1580/527, loss: 0.011919020675122738 2023-01-22 20:09:36.747703: step: 1584/527, loss: 0.010144336149096489 2023-01-22 20:09:37.804080: step: 1588/527, loss: 0.11049667745828629 2023-01-22 20:09:38.850256: step: 1592/527, loss: 0.0020202547311782837 2023-01-22 20:09:39.899285: step: 1596/527, loss: 0.003238413482904434 2023-01-22 20:09:40.934486: step: 1600/527, loss: 0.013794923201203346 2023-01-22 20:09:41.997319: step: 1604/527, loss: 0.007212123367935419 2023-01-22 20:09:43.059258: step: 1608/527, loss: 0.01122779306024313 2023-01-22 20:09:44.112015: step: 1612/527, loss: 0.035339292138814926 2023-01-22 20:09:45.152374: step: 1616/527, loss: 0.0023843515664339066 2023-01-22 20:09:46.199596: step: 1620/527, loss: 0.009645218960940838 2023-01-22 20:09:47.257702: step: 1624/527, loss: 0.011995306238532066 2023-01-22 20:09:48.310000: step: 1628/527, loss: 0.013134126551449299 2023-01-22 20:09:49.363790: step: 1632/527, loss: 0.02185082621872425 2023-01-22 20:09:50.399525: step: 1636/527, loss: 0.03338497877120972 2023-01-22 20:09:51.468917: step: 1640/527, loss: 0.0027892731595784426 2023-01-22 20:09:52.507038: step: 1644/527, loss: 0.004808598663657904 2023-01-22 20:09:53.563856: step: 1648/527, loss: 0.003547914791852236 2023-01-22 20:09:54.599748: step: 1652/527, loss: 0.0038693707901984453 2023-01-22 20:09:55.661652: step: 1656/527, loss: 0.015623710118234158 2023-01-22 20:09:56.711380: step: 1660/527, loss: 0.014417893253266811 2023-01-22 20:09:57.755936: step: 1664/527, loss: 8.103435538941994e-05 2023-01-22 20:09:58.814422: step: 1668/527, loss: 0.0063257645815610886 2023-01-22 20:09:59.858719: step: 1672/527, loss: 0.00885890331119299 2023-01-22 20:10:00.898679: step: 1676/527, loss: 0.005163947585970163 2023-01-22 20:10:01.956976: step: 1680/527, loss: 0.003962541464716196 2023-01-22 
20:10:02.992418: step: 1684/527, loss: 0.0019523502560332417 2023-01-22 20:10:04.038060: step: 1688/527, loss: 0.00751158781349659 2023-01-22 20:10:05.075367: step: 1692/527, loss: 0.00586579879745841 2023-01-22 20:10:06.123808: step: 1696/527, loss: 0.03698648512363434 2023-01-22 20:10:07.152283: step: 1700/527, loss: 0.005073432344943285 2023-01-22 20:10:08.200648: step: 1704/527, loss: 0.04573269188404083 2023-01-22 20:10:09.246421: step: 1708/527, loss: 0.03874180093407631 2023-01-22 20:10:10.290417: step: 1712/527, loss: 0.009860971011221409 2023-01-22 20:10:11.355340: step: 1716/527, loss: 0.018232179805636406 2023-01-22 20:10:12.401391: step: 1720/527, loss: 0.008927026763558388 2023-01-22 20:10:13.463084: step: 1724/527, loss: 0.010142615996301174 2023-01-22 20:10:14.515014: step: 1728/527, loss: 0.008999330922961235 2023-01-22 20:10:15.563147: step: 1732/527, loss: 0.004921245388686657 2023-01-22 20:10:16.639109: step: 1736/527, loss: 0.00727794598788023 2023-01-22 20:10:17.687033: step: 1740/527, loss: 0.05593428388237953 2023-01-22 20:10:18.745894: step: 1744/527, loss: 0.007851863279938698 2023-01-22 20:10:19.799714: step: 1748/527, loss: 0.008311571553349495 2023-01-22 20:10:20.841470: step: 1752/527, loss: 0.03206690773367882 2023-01-22 20:10:21.897585: step: 1756/527, loss: 0.013040123507380486 2023-01-22 20:10:22.931908: step: 1760/527, loss: 0.02845063805580139 2023-01-22 20:10:23.974632: step: 1764/527, loss: 0.006320222280919552 2023-01-22 20:10:25.033674: step: 1768/527, loss: 0.009972857311367989 2023-01-22 20:10:26.072178: step: 1772/527, loss: 0.003992057871073484 2023-01-22 20:10:27.127247: step: 1776/527, loss: 0.0038140942342579365 2023-01-22 20:10:28.183387: step: 1780/527, loss: 0.004188275430351496 2023-01-22 20:10:29.221279: step: 1784/527, loss: 0.008014023303985596 2023-01-22 20:10:30.271735: step: 1788/527, loss: 0.009456348605453968 2023-01-22 20:10:31.328639: step: 1792/527, loss: 0.007598466239869595 2023-01-22 20:10:32.396981: 
step: 1796/527, loss: 0.006078129168599844 2023-01-22 20:10:33.436470: step: 1800/527, loss: 0.026917781680822372 2023-01-22 20:10:34.487230: step: 1804/527, loss: 0.010735039599239826 2023-01-22 20:10:35.540088: step: 1808/527, loss: 0.004990758839994669 2023-01-22 20:10:36.583824: step: 1812/527, loss: 0.002181178657338023 2023-01-22 20:10:37.636499: step: 1816/527, loss: 0.0036130601074546576 2023-01-22 20:10:38.694342: step: 1820/527, loss: 0.004037720616906881 2023-01-22 20:10:39.739259: step: 1824/527, loss: 0.0014690338866785169 2023-01-22 20:10:40.787882: step: 1828/527, loss: 0.005429030396044254 2023-01-22 20:10:41.834161: step: 1832/527, loss: 0.0015013277297839522 2023-01-22 20:10:42.887146: step: 1836/527, loss: 0.010830406099557877 2023-01-22 20:10:43.927149: step: 1840/527, loss: 0.0026708939112722874 2023-01-22 20:10:44.977734: step: 1844/527, loss: 0.008172836154699326 2023-01-22 20:10:46.032383: step: 1848/527, loss: 0.03005950339138508 2023-01-22 20:10:47.089405: step: 1852/527, loss: 0.00440314831212163 2023-01-22 20:10:48.128829: step: 1856/527, loss: 1.2473748938646168e-05 2023-01-22 20:10:49.168692: step: 1860/527, loss: 0.025359120219945908 2023-01-22 20:10:50.204156: step: 1864/527, loss: 0.02318563126027584 2023-01-22 20:10:51.263106: step: 1868/527, loss: 0.0008550297934561968 2023-01-22 20:10:52.320052: step: 1872/527, loss: 0.0002936632663477212 2023-01-22 20:10:53.359616: step: 1876/527, loss: 0.017690308392047882 2023-01-22 20:10:54.420193: step: 1880/527, loss: 0.007762353401631117 2023-01-22 20:10:55.470848: step: 1884/527, loss: 0.005471579264849424 2023-01-22 20:10:56.514764: step: 1888/527, loss: 0.0071036312729120255 2023-01-22 20:10:57.566992: step: 1892/527, loss: 0.01841791905462742 2023-01-22 20:10:58.618242: step: 1896/527, loss: 0.04170481488108635 2023-01-22 20:10:59.670388: step: 1900/527, loss: 0.03876521438360214 2023-01-22 20:11:00.724183: step: 1904/527, loss: 0.0042602806352078915 2023-01-22 20:11:01.783401: step: 
1908/527, loss: 0.0024893295485526323 2023-01-22 20:11:02.840160: step: 1912/527, loss: 0.01631828211247921 2023-01-22 20:11:03.870199: step: 1916/527, loss: 0.018093852326273918 2023-01-22 20:11:04.918137: step: 1920/527, loss: 0.014130670577287674 2023-01-22 20:11:05.980815: step: 1924/527, loss: 0.0051779416389763355 2023-01-22 20:11:07.010997: step: 1928/527, loss: 0.007823098450899124 2023-01-22 20:11:08.057105: step: 1932/527, loss: 0.0035974006168544292 2023-01-22 20:11:09.113240: step: 1936/527, loss: 0.011179525405168533 2023-01-22 20:11:10.162677: step: 1940/527, loss: 0.0007153789047151804 2023-01-22 20:11:11.217814: step: 1944/527, loss: 0.006225020624697208 2023-01-22 20:11:12.248355: step: 1948/527, loss: 0.01100581232458353 2023-01-22 20:11:13.298546: step: 1952/527, loss: 0.0048622265458106995 2023-01-22 20:11:14.357253: step: 1956/527, loss: 0.00982161145657301 2023-01-22 20:11:15.405537: step: 1960/527, loss: 0.010374289005994797 2023-01-22 20:11:16.442683: step: 1964/527, loss: 0.006359482649713755 2023-01-22 20:11:17.480032: step: 1968/527, loss: 0.03307987377047539 2023-01-22 20:11:18.520649: step: 1972/527, loss: 0.00792419258505106 2023-01-22 20:11:19.555634: step: 1976/527, loss: 0.013120067305862904 2023-01-22 20:11:20.601015: step: 1980/527, loss: 0.0016312769148498774 2023-01-22 20:11:21.656317: step: 1984/527, loss: 0.0024427808821201324 2023-01-22 20:11:22.738629: step: 1988/527, loss: 0.01781153678894043 2023-01-22 20:11:23.778793: step: 1992/527, loss: 0.004233812913298607 2023-01-22 20:11:24.828044: step: 1996/527, loss: 0.005162104032933712 2023-01-22 20:11:25.874519: step: 2000/527, loss: 0.004399866797029972 2023-01-22 20:11:26.928167: step: 2004/527, loss: 0.0024165452923625708 2023-01-22 20:11:27.962181: step: 2008/527, loss: 0.01303536631166935 2023-01-22 20:11:29.003472: step: 2012/527, loss: 0.00421428307890892 2023-01-22 20:11:30.046469: step: 2016/527, loss: 0.005972134880721569 2023-01-22 20:11:31.113184: step: 2020/527, 
loss: 0.002349003218114376 2023-01-22 20:11:32.167873: step: 2024/527, loss: 0.012383715249598026 2023-01-22 20:11:33.206101: step: 2028/527, loss: 0.004792159888893366 2023-01-22 20:11:34.254574: step: 2032/527, loss: 0.041699036955833435 2023-01-22 20:11:35.300329: step: 2036/527, loss: 0.006186272948980331 2023-01-22 20:11:36.352915: step: 2040/527, loss: 0.0013455228181555867 2023-01-22 20:11:37.400746: step: 2044/527, loss: 0.003466531168669462 2023-01-22 20:11:38.458302: step: 2048/527, loss: 0.008250934071838856 2023-01-22 20:11:39.514946: step: 2052/527, loss: 0.029534438624978065 2023-01-22 20:11:40.563420: step: 2056/527, loss: 0.008110631257295609 2023-01-22 20:11:41.605646: step: 2060/527, loss: 0.002982160309329629 2023-01-22 20:11:42.637367: step: 2064/527, loss: 0.006174801383167505 2023-01-22 20:11:43.705221: step: 2068/527, loss: 0.0038659879937767982 2023-01-22 20:11:44.748299: step: 2072/527, loss: 0.005116707645356655 2023-01-22 20:11:45.778316: step: 2076/527, loss: 0.006999629084020853 2023-01-22 20:11:46.827899: step: 2080/527, loss: 0.011560342274606228 2023-01-22 20:11:47.885297: step: 2084/527, loss: 0.023093916475772858 2023-01-22 20:11:48.931195: step: 2088/527, loss: 0.0035630809143185616 2023-01-22 20:11:49.984321: step: 2092/527, loss: 0.0036325466353446245 2023-01-22 20:11:51.030008: step: 2096/527, loss: 0.0018851844361051917 2023-01-22 20:11:52.075962: step: 2100/527, loss: 0.010529978200793266 2023-01-22 20:11:53.119363: step: 2104/527, loss: 0.0022613334003835917 2023-01-22 20:11:54.164931: step: 2108/527, loss: 0.05637083202600479 ================================================== Loss: 0.011 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3232776032315979, 'r': 0.3416805028462998, 'f1': 0.3322244003690037}, 'combined': 0.24479692658768692, 'stategy': 1, 'epoch': 3} Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 
0.6399999999999999}, 'slot': {'p': 0.3367795932148915, 'r': 0.30830640942490517, 'f1': 0.3219146182889375}, 'combined': 0.20602535570491998, 'stategy': 1, 'epoch': 3} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3203312716684298, 'r': 0.3604486605301307, 'f1': 0.33920793589174797}, 'combined': 0.24994268960444585, 'stategy': 1, 'epoch': 3} Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.35874336385196404, 'r': 0.3173248118436009, 'f1': 0.3367653574799431}, 'combined': 0.21552982878716354, 'stategy': 1, 'epoch': 3} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32866854820290164, 'r': 0.33116318613992557, 'f1': 0.3299111514097179}, 'combined': 0.24309242735452896, 'stategy': 1, 'epoch': 3} Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3637014265884931, 'r': 0.2951971542465294, 'f1': 0.32588816927868997}, 'combined': 0.23365566853943812, 'stategy': 1, 'epoch': 3} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 3} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 3} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 3} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 
0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0} Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0} Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3264723230548723, 'r': 0.33576470416649107, 'f1': 0.3310533191688322}, 'combined': 0.24393402465071845, 'stategy': 1, 'epoch': 0} Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.35636473460031554, 'r': 0.2976731814040852, 'f1': 0.32438554919493273}, 'combined': 0.23257831829070652, 'stategy': 1, 'epoch': 0} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 
'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 0} ****************************** Epoch: 4 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 20:14:23.979932: step: 4/527, loss: 0.004798795562237501 2023-01-22 20:14:25.033508: step: 8/527, loss: 0.0029309168457984924 2023-01-22 20:14:26.053170: step: 12/527, loss: 0.0011004777625203133 2023-01-22 20:14:27.095565: step: 16/527, loss: 0.014637584798038006 2023-01-22 20:14:28.128746: step: 20/527, loss: 0.01262011006474495 2023-01-22 20:14:29.198404: step: 24/527, loss: 0.01471824012696743 2023-01-22 20:14:30.256246: step: 28/527, loss: 0.01483780425041914 2023-01-22 20:14:31.302765: step: 32/527, loss: 0.005570894572883844 2023-01-22 20:14:32.358509: step: 36/527, loss: 0.0041198041290044785 2023-01-22 20:14:33.395877: step: 40/527, loss: 0.001607234706170857 2023-01-22 20:14:34.443025: step: 44/527, loss: 0.005927626509219408 2023-01-22 20:14:35.498022: step: 48/527, loss: 0.01730796881020069 2023-01-22 20:14:36.535066: step: 52/527, loss: 0.00280939182266593 2023-01-22 20:14:37.594037: step: 56/527, loss: 0.008762883022427559 2023-01-22 20:14:38.638468: step: 60/527, loss: 0.03321302309632301 2023-01-22 20:14:39.667085: step: 64/527, loss: 0.0030898000113666058 2023-01-22 20:14:40.706548: step: 68/527, loss: 0.005210083909332752 2023-01-22 20:14:41.744418: step: 72/527, loss: 0.015607021749019623 2023-01-22 20:14:42.797889: step: 76/527, loss: 0.03765184059739113 2023-01-22 20:14:43.865213: step: 80/527, loss: 0.006425447762012482 2023-01-22 20:14:44.907701: step: 84/527, loss: 0.007916656322777271 2023-01-22 20:14:45.950139: step: 88/527, loss: 0.003766612382605672 2023-01-22 20:14:47.006463: step: 92/527, loss: 0.010192527435719967 2023-01-22 20:14:48.062321: step: 96/527, loss: 0.0017012403113767505 2023-01-22 20:14:49.105351: 
step: 100/527, loss: 0.007971801795065403 2023-01-22 20:14:50.164776: step: 104/527, loss: 0.0047422912903130054 2023-01-22 20:14:51.221318: step: 108/527, loss: 0.006395967677235603 2023-01-22 20:14:52.258536: step: 112/527, loss: 0.011883337050676346 2023-01-22 20:14:53.299024: step: 116/527, loss: 0.005027411971241236 2023-01-22 20:14:54.362085: step: 120/527, loss: 0.0009183065267279744 2023-01-22 20:14:55.408275: step: 124/527, loss: 0.020124750211834908 2023-01-22 20:14:56.453317: step: 128/527, loss: 0.004020323511213064 2023-01-22 20:14:57.507992: step: 132/527, loss: 0.006376670673489571 2023-01-22 20:14:58.562137: step: 136/527, loss: 0.003969069104641676 2023-01-22 20:14:59.612576: step: 140/527, loss: 0.004379172809422016 2023-01-22 20:15:00.656705: step: 144/527, loss: 0.034320998936891556 2023-01-22 20:15:01.704436: step: 148/527, loss: 0.004560271743685007 2023-01-22 20:15:02.762852: step: 152/527, loss: 0.011144979856908321 2023-01-22 20:15:03.832036: step: 156/527, loss: 0.02316906489431858 2023-01-22 20:15:04.870029: step: 160/527, loss: 0.00314704654738307 2023-01-22 20:15:05.915696: step: 164/527, loss: 0.05408405140042305 2023-01-22 20:15:06.954182: step: 168/527, loss: 0.0041894447058439255 2023-01-22 20:15:07.999942: step: 172/527, loss: 0.02124343253672123 2023-01-22 20:15:09.056632: step: 176/527, loss: 0.007926053367555141 2023-01-22 20:15:10.092795: step: 180/527, loss: 0.025341147556900978 2023-01-22 20:15:11.150224: step: 184/527, loss: 0.01771632768213749 2023-01-22 20:15:12.196443: step: 188/527, loss: 0.003960701171308756 2023-01-22 20:15:13.238580: step: 192/527, loss: 0.013658388517796993 2023-01-22 20:15:14.297990: step: 196/527, loss: 0.012370912358164787 2023-01-22 20:15:15.344784: step: 200/527, loss: 0.01231129840016365 2023-01-22 20:15:16.401343: step: 204/527, loss: 0.000923582527320832 2023-01-22 20:15:17.445131: step: 208/527, loss: 0.015850689262151718 2023-01-22 20:15:18.498747: step: 212/527, loss: 0.050579167902469635 
2023-01-22 20:15:19.551540: step: 216/527, loss: 0.0015516100684180856 2023-01-22 20:15:20.608207: step: 220/527, loss: 0.003035517642274499 2023-01-22 20:15:21.648353: step: 224/527, loss: 0.004313284531235695 2023-01-22 20:15:22.696131: step: 228/527, loss: 0.009695312939584255 2023-01-22 20:15:23.744117: step: 232/527, loss: 0.022842252627015114 2023-01-22 20:15:24.784449: step: 236/527, loss: 0.0045544798485934734 2023-01-22 20:15:25.846419: step: 240/527, loss: 0.027661394327878952 2023-01-22 20:15:26.900219: step: 244/527, loss: 0.01109894085675478 2023-01-22 20:15:27.951582: step: 248/527, loss: 0.03341621533036232 2023-01-22 20:15:29.003864: step: 252/527, loss: 0.012176484800875187 2023-01-22 20:15:30.060891: step: 256/527, loss: 0.004430671222507954 2023-01-22 20:15:31.135762: step: 260/527, loss: 0.003769872710108757 2023-01-22 20:15:32.182023: step: 264/527, loss: 0.0009316790965385735 2023-01-22 20:15:33.211119: step: 268/527, loss: 0.017801864072680473 2023-01-22 20:15:34.248616: step: 272/527, loss: 0.009023908525705338 2023-01-22 20:15:35.302006: step: 276/527, loss: 0.004498614463955164 2023-01-22 20:15:36.333914: step: 280/527, loss: 0.0053957137279212475 2023-01-22 20:15:37.383006: step: 284/527, loss: 0.012265692465007305 2023-01-22 20:15:38.434038: step: 288/527, loss: 0.04098077490925789 2023-01-22 20:15:39.488204: step: 292/527, loss: 0.0038089097943156958 2023-01-22 20:15:40.546303: step: 296/527, loss: 0.00184653012547642 2023-01-22 20:15:41.595305: step: 300/527, loss: 0.013015697710216045 2023-01-22 20:15:42.647482: step: 304/527, loss: 0.013965689577162266 2023-01-22 20:15:43.686009: step: 308/527, loss: 0.020143380388617516 2023-01-22 20:15:44.745203: step: 312/527, loss: 0.0039999233558773994 2023-01-22 20:15:45.780330: step: 316/527, loss: 0.0055448804050683975 2023-01-22 20:15:46.843606: step: 320/527, loss: 0.006654916796833277 2023-01-22 20:15:47.901407: step: 324/527, loss: 0.004255559295415878 2023-01-22 20:15:48.952808: step: 
328/527, loss: 0.005554490722715855 2023-01-22 20:15:50.007968: step: 332/527, loss: 0.004568501841276884 2023-01-22 20:15:51.046513: step: 336/527, loss: 0.005421648267656565 2023-01-22 20:15:52.098236: step: 340/527, loss: 0.0110981035977602 2023-01-22 20:15:53.135819: step: 344/527, loss: 0.008874951861798763 2023-01-22 20:15:54.169538: step: 348/527, loss: 0.0033424256835132837 2023-01-22 20:15:55.219117: step: 352/527, loss: 0.01518099382519722 2023-01-22 20:15:56.272437: step: 356/527, loss: 0.007731577381491661 2023-01-22 20:15:57.317497: step: 360/527, loss: 0.013318195939064026 2023-01-22 20:15:58.362984: step: 364/527, loss: 0.0269757229834795 2023-01-22 20:15:59.427076: step: 368/527, loss: 0.0055317347869277 2023-01-22 20:16:00.484231: step: 372/527, loss: 0.01124410331249237 2023-01-22 20:16:01.533713: step: 376/527, loss: 0.0033939657732844353 2023-01-22 20:16:02.576562: step: 380/527, loss: 0.004648104310035706 2023-01-22 20:16:03.623062: step: 384/527, loss: 0.012351693585515022 2023-01-22 20:16:04.688720: step: 388/527, loss: 0.034307222813367844 2023-01-22 20:16:05.745907: step: 392/527, loss: 0.02188608981668949 2023-01-22 20:16:06.812240: step: 396/527, loss: 0.00450518075376749 2023-01-22 20:16:07.849874: step: 400/527, loss: 0.002653697971254587 2023-01-22 20:16:08.886837: step: 404/527, loss: 3.119441316812299e-05 2023-01-22 20:16:09.935356: step: 408/527, loss: 0.0036495416425168514 2023-01-22 20:16:10.992039: step: 412/527, loss: 0.0007335481350310147 2023-01-22 20:16:12.052585: step: 416/527, loss: 0.03987220674753189 2023-01-22 20:16:13.102988: step: 420/527, loss: 0.009604169055819511 2023-01-22 20:16:14.154605: step: 424/527, loss: 0.008431042544543743 2023-01-22 20:16:15.213839: step: 428/527, loss: 0.033479176461696625 2023-01-22 20:16:16.262329: step: 432/527, loss: 0.027300521731376648 2023-01-22 20:16:17.306762: step: 436/527, loss: 0.007673206273466349 2023-01-22 20:16:18.359565: step: 440/527, loss: 0.004509297665208578 
2023-01-22 20:16:19.405086: step: 444/527, loss: 0.009341058321297169 2023-01-22 20:16:20.454773: step: 448/527, loss: 0.002177347196266055 2023-01-22 20:16:21.502022: step: 452/527, loss: 0.02137337252497673 2023-01-22 20:16:22.540368: step: 456/527, loss: 0.0025187835562974215 2023-01-22 20:16:23.587583: step: 460/527, loss: 0.01424535270780325 2023-01-22 20:16:24.638531: step: 464/527, loss: 0.02961236983537674 2023-01-22 20:16:25.687747: step: 468/527, loss: 0.014148181304335594 2023-01-22 20:16:26.753319: step: 472/527, loss: 0.003883884521201253 2023-01-22 20:16:27.808837: step: 476/527, loss: 0.013511565513908863 2023-01-22 20:16:28.855889: step: 480/527, loss: 0.010334148071706295 2023-01-22 20:16:29.904327: step: 484/527, loss: 0.0026350081898272038 2023-01-22 20:16:30.946781: step: 488/527, loss: 0.016213631257414818 2023-01-22 20:16:31.998941: step: 492/527, loss: 0.006265159696340561 2023-01-22 20:16:33.033896: step: 496/527, loss: 0.003069190541282296 2023-01-22 20:16:34.092696: step: 500/527, loss: 0.006416722200810909 2023-01-22 20:16:35.134776: step: 504/527, loss: 0.0011410184670239687 2023-01-22 20:16:36.163955: step: 508/527, loss: 0.02092679776251316 2023-01-22 20:16:37.218448: step: 512/527, loss: 0.00888666883111 2023-01-22 20:16:38.270464: step: 516/527, loss: 0.008873362094163895 2023-01-22 20:16:39.322448: step: 520/527, loss: 0.008417917415499687 2023-01-22 20:16:40.368103: step: 524/527, loss: 0.005173789337277412 2023-01-22 20:16:41.417320: step: 528/527, loss: 0.009263423271477222 2023-01-22 20:16:42.466264: step: 532/527, loss: 0.011409972794353962 2023-01-22 20:16:43.523637: step: 536/527, loss: 0.030294368043541908 2023-01-22 20:16:44.562393: step: 540/527, loss: 0.004879303276538849 2023-01-22 20:16:45.605704: step: 544/527, loss: 0.04425245523452759 2023-01-22 20:16:46.647042: step: 548/527, loss: 0.0044807796366512775 2023-01-22 20:16:47.700845: step: 552/527, loss: 0.02414599061012268 2023-01-22 20:16:48.748885: step: 556/527, 
loss: 0.005646579433232546 2023-01-22 20:16:49.788338: step: 560/527, loss: 0.02515777014195919 2023-01-22 20:16:50.840439: step: 564/527, loss: 0.018965693190693855 2023-01-22 20:16:51.897392: step: 568/527, loss: 0.0115940161049366 2023-01-22 20:16:52.965356: step: 572/527, loss: 0.021791767328977585 2023-01-22 20:16:54.003294: step: 576/527, loss: 0.007542330306023359 2023-01-22 20:16:55.061752: step: 580/527, loss: 0.010915880091488361 2023-01-22 20:16:56.114124: step: 584/527, loss: 0.0016205202555283904 2023-01-22 20:16:57.178267: step: 588/527, loss: 0.004844542592763901 2023-01-22 20:16:58.235870: step: 592/527, loss: 0.004172665532678366 2023-01-22 20:16:59.289365: step: 596/527, loss: 0.005101096350699663 2023-01-22 20:17:00.343565: step: 600/527, loss: 0.0026512285694479942 2023-01-22 20:17:01.379888: step: 604/527, loss: 0.0021779858507215977 2023-01-22 20:17:02.425032: step: 608/527, loss: 0.005923046264797449 2023-01-22 20:17:03.476007: step: 612/527, loss: 0.0003572655259631574 2023-01-22 20:17:04.535098: step: 616/527, loss: 0.005643285345286131 2023-01-22 20:17:05.585346: step: 620/527, loss: 0.003275799797847867 2023-01-22 20:17:06.640774: step: 624/527, loss: 0.017841234803199768 2023-01-22 20:17:07.688006: step: 628/527, loss: 0.008541619405150414 2023-01-22 20:17:08.724446: step: 632/527, loss: 0.0028953668661415577 2023-01-22 20:17:09.765564: step: 636/527, loss: 0.0056040650233626366 2023-01-22 20:17:10.828850: step: 640/527, loss: 0.003257660660892725 2023-01-22 20:17:11.899282: step: 644/527, loss: 0.011811007745563984 2023-01-22 20:17:12.956405: step: 648/527, loss: 0.002472582971677184 2023-01-22 20:17:13.996651: step: 652/527, loss: 0.035941798239946365 2023-01-22 20:17:15.048543: step: 656/527, loss: 0.0047346120700240135 2023-01-22 20:17:16.100563: step: 660/527, loss: 0.02992715686559677 2023-01-22 20:17:17.143206: step: 664/527, loss: 0.002437981776893139 2023-01-22 20:17:18.182805: step: 668/527, loss: 0.0063609168864786625 
2023-01-22 20:17:19.258329: step: 672/527, loss: 0.004914470948278904 2023-01-22 20:17:20.305443: step: 676/527, loss: 0.0037450063973665237 2023-01-22 20:17:21.357650: step: 680/527, loss: 0.007549316622316837 2023-01-22 20:17:22.418098: step: 684/527, loss: 0.0010297569679096341 2023-01-22 20:17:23.465781: step: 688/527, loss: 0.003464050590991974 2023-01-22 20:17:24.519684: step: 692/527, loss: 0.014582104049623013 2023-01-22 20:17:25.592982: step: 696/527, loss: 0.004058813210576773 2023-01-22 20:17:26.645182: step: 700/527, loss: 0.016312744468450546 2023-01-22 20:17:27.693044: step: 704/527, loss: 0.015867892652750015 2023-01-22 20:17:28.759920: step: 708/527, loss: 0.00382168497890234 2023-01-22 20:17:29.804578: step: 712/527, loss: 0.004260249435901642 2023-01-22 20:17:30.853680: step: 716/527, loss: 0.006121458951383829 2023-01-22 20:17:31.903325: step: 720/527, loss: 0.004989981651306152 2023-01-22 20:17:32.955035: step: 724/527, loss: 0.011474031955003738 2023-01-22 20:17:33.991552: step: 728/527, loss: 0.001981210894882679 2023-01-22 20:17:35.050727: step: 732/527, loss: 0.016398560255765915 2023-01-22 20:17:36.128211: step: 736/527, loss: 0.033520765602588654 2023-01-22 20:17:37.173426: step: 740/527, loss: 0.010036283172667027 2023-01-22 20:17:38.217990: step: 744/527, loss: 0.006581769324839115 2023-01-22 20:17:39.273573: step: 748/527, loss: 0.0044938949868083 2023-01-22 20:17:40.317068: step: 752/527, loss: 0.00012804719153791666 2023-01-22 20:17:41.381923: step: 756/527, loss: 0.0044247424229979515 2023-01-22 20:17:42.433284: step: 760/527, loss: 0.0003996891318820417 2023-01-22 20:17:43.509431: step: 764/527, loss: 0.016622699797153473 2023-01-22 20:17:44.562267: step: 768/527, loss: 0.02555469051003456 2023-01-22 20:17:45.619628: step: 772/527, loss: 0.008782824501395226 2023-01-22 20:17:46.666303: step: 776/527, loss: 0.019817352294921875 2023-01-22 20:17:47.708354: step: 780/527, loss: 0.012991409748792648 2023-01-22 20:17:48.777995: step: 
784/527, loss: 0.007173345889896154 2023-01-22 20:17:49.829452: step: 788/527, loss: 0.006991872098296881 2023-01-22 20:17:50.873608: step: 792/527, loss: 0.025888260453939438 2023-01-22 20:17:51.920439: step: 796/527, loss: 0.01478448137640953 2023-01-22 20:17:52.987179: step: 800/527, loss: 0.007971160113811493 2023-01-22 20:17:54.046354: step: 804/527, loss: 0.01862054504454136 2023-01-22 20:17:55.106272: step: 808/527, loss: 0.007140390574932098 2023-01-22 20:17:56.181928: step: 812/527, loss: 0.013608364388346672 2023-01-22 20:17:57.231533: step: 816/527, loss: 0.0107984384521842 2023-01-22 20:17:58.288410: step: 820/527, loss: 0.028694182634353638 2023-01-22 20:17:59.348199: step: 824/527, loss: 0.007930814288556576 2023-01-22 20:18:00.412860: step: 828/527, loss: 0.022879892960190773 2023-01-22 20:18:01.449754: step: 832/527, loss: 0.014855567365884781 2023-01-22 20:18:02.501074: step: 836/527, loss: 0.024786897003650665 2023-01-22 20:18:03.566570: step: 840/527, loss: 0.006193472072482109 2023-01-22 20:18:04.613792: step: 844/527, loss: 0.029054781422019005 2023-01-22 20:18:05.636658: step: 848/527, loss: 0.016979951411485672 2023-01-22 20:18:06.683455: step: 852/527, loss: 0.011195342987775803 2023-01-22 20:18:07.730503: step: 856/527, loss: 0.0110057033598423 2023-01-22 20:18:08.783502: step: 860/527, loss: 0.005805686116218567 2023-01-22 20:18:09.832150: step: 864/527, loss: 0.0028822796884924173 2023-01-22 20:18:10.889212: step: 868/527, loss: 0.017254924401640892 2023-01-22 20:18:11.933859: step: 872/527, loss: 0.06004353240132332 2023-01-22 20:18:12.977425: step: 876/527, loss: 0.014923245646059513 2023-01-22 20:18:14.020779: step: 880/527, loss: 0.008303902111947536 2023-01-22 20:18:15.065763: step: 884/527, loss: 0.007086020428687334 2023-01-22 20:18:16.109495: step: 888/527, loss: 0.007097942288964987 2023-01-22 20:18:17.151807: step: 892/527, loss: 0.002108887070789933 2023-01-22 20:18:18.181872: step: 896/527, loss: 0.007246498018503189 
2023-01-22 20:18:19.254146: step: 900/527, loss: 0.03540923818945885 2023-01-22 20:18:20.298234: step: 904/527, loss: 0.014381722547113895 2023-01-22 20:18:21.336671: step: 908/527, loss: 0.00739369448274374 2023-01-22 20:18:22.376159: step: 912/527, loss: 0.012750803492963314 2023-01-22 20:18:23.443736: step: 916/527, loss: 0.007424303330481052 2023-01-22 20:18:24.488586: step: 920/527, loss: 0.02110300585627556 2023-01-22 20:18:25.523271: step: 924/527, loss: 0.013303983956575394 2023-01-22 20:18:26.586162: step: 928/527, loss: 0.006884288974106312 2023-01-22 20:18:27.652507: step: 932/527, loss: 0.011327382177114487 2023-01-22 20:18:28.700549: step: 936/527, loss: 0.002374954055994749 2023-01-22 20:18:29.757324: step: 940/527, loss: 0.00717934500426054 2023-01-22 20:18:30.807358: step: 944/527, loss: 0.02322743646800518 2023-01-22 20:18:31.853221: step: 948/527, loss: 0.006958703976124525 2023-01-22 20:18:32.895536: step: 952/527, loss: 0.0018137339502573013 2023-01-22 20:18:33.939176: step: 956/527, loss: 0.006922414526343346 2023-01-22 20:18:34.996334: step: 960/527, loss: 0.008034479804337025 2023-01-22 20:18:36.036957: step: 964/527, loss: 0.005844409111887217 2023-01-22 20:18:37.090336: step: 968/527, loss: 0.03279508650302887 2023-01-22 20:18:38.128088: step: 972/527, loss: 0.022560451179742813 2023-01-22 20:18:39.185669: step: 976/527, loss: 0.002702136058360338 2023-01-22 20:18:40.259376: step: 980/527, loss: 0.032179586589336395 2023-01-22 20:18:41.310426: step: 984/527, loss: 0.0238084327429533 2023-01-22 20:18:42.359395: step: 988/527, loss: 0.016198376193642616 2023-01-22 20:18:43.403983: step: 992/527, loss: 0.00468091294169426 2023-01-22 20:18:44.455559: step: 996/527, loss: 0.005705349612981081 2023-01-22 20:18:45.509362: step: 1000/527, loss: 0.012686754576861858 2023-01-22 20:18:46.549906: step: 1004/527, loss: 0.0036708025727421045 2023-01-22 20:18:47.608016: step: 1008/527, loss: 0.011974627152085304 2023-01-22 20:18:48.650719: step: 1012/527, 
loss: 0.0050416444428265095 2023-01-22 20:18:49.712709: step: 1016/527, loss: 0.005563591606914997 2023-01-22 20:18:50.754603: step: 1020/527, loss: 0.00893318559974432 2023-01-22 20:18:51.812059: step: 1024/527, loss: 0.004403574857860804 2023-01-22 20:18:52.847145: step: 1028/527, loss: 0.006151125766336918 2023-01-22 20:18:53.877349: step: 1032/527, loss: 0.00041893532034009695 2023-01-22 20:18:54.935407: step: 1036/527, loss: 0.004295796155929565 2023-01-22 20:18:55.987612: step: 1040/527, loss: 0.008454680442810059 2023-01-22 20:18:57.039159: step: 1044/527, loss: 0.02144450508058071 2023-01-22 20:18:58.084656: step: 1048/527, loss: 0.013071775436401367 2023-01-22 20:18:59.132723: step: 1052/527, loss: 0.007843488827347755 2023-01-22 20:19:00.175677: step: 1056/527, loss: 0.02198074199259281 2023-01-22 20:19:01.232762: step: 1060/527, loss: 0.006096964236348867 2023-01-22 20:19:02.297176: step: 1064/527, loss: 0.011427431367337704 2023-01-22 20:19:03.329557: step: 1068/527, loss: 0.01927776262164116 2023-01-22 20:19:04.376642: step: 1072/527, loss: 0.005857015494257212 2023-01-22 20:19:05.440604: step: 1076/527, loss: 0.002251280937343836 2023-01-22 20:19:06.477817: step: 1080/527, loss: 0.0239707138389349 2023-01-22 20:19:07.542325: step: 1084/527, loss: 0.022794032469391823 2023-01-22 20:19:08.581587: step: 1088/527, loss: 0.0021768512669950724 2023-01-22 20:19:09.616632: step: 1092/527, loss: 0.0073958998546004295 2023-01-22 20:19:10.671957: step: 1096/527, loss: 0.09977622330188751 2023-01-22 20:19:11.711863: step: 1100/527, loss: 0.00838471483439207 2023-01-22 20:19:12.760223: step: 1104/527, loss: 0.037144169211387634 2023-01-22 20:19:13.820736: step: 1108/527, loss: 0.0156878512352705 2023-01-22 20:19:14.898595: step: 1112/527, loss: 0.001348948571830988 2023-01-22 20:19:15.963694: step: 1116/527, loss: 0.02171880006790161 2023-01-22 20:19:17.000681: step: 1120/527, loss: 0.007988469675183296 2023-01-22 20:19:18.057380: step: 1124/527, loss: 
0.007513338699936867 2023-01-22 20:19:19.121778: step: 1128/527, loss: 0.005066821817308664 2023-01-22 20:19:20.206428: step: 1132/527, loss: 0.006134570576250553 2023-01-22 20:19:21.253487: step: 1136/527, loss: 0.024731654673814774 2023-01-22 20:19:22.288503: step: 1140/527, loss: 0.0020646771881729364 2023-01-22 20:19:23.353234: step: 1144/527, loss: 0.012673301622271538 2023-01-22 20:19:24.385975: step: 1148/527, loss: 0.010123465210199356 2023-01-22 20:19:25.448103: step: 1152/527, loss: 0.006663178559392691 2023-01-22 20:19:26.492217: step: 1156/527, loss: 0.008794235065579414 2023-01-22 20:19:27.546549: step: 1160/527, loss: 0.0056288582272827625 2023-01-22 20:19:28.612692: step: 1164/527, loss: 0.007989570498466492 2023-01-22 20:19:29.660280: step: 1168/527, loss: 0.009098177775740623 2023-01-22 20:19:30.701672: step: 1172/527, loss: 0.008912991732358932 2023-01-22 20:19:31.750749: step: 1176/527, loss: 0.004241116810590029 2023-01-22 20:19:32.796017: step: 1180/527, loss: 0.004711063113063574 2023-01-22 20:19:33.867613: step: 1184/527, loss: 0.0161958746612072 2023-01-22 20:19:34.921439: step: 1188/527, loss: 0.03359711915254593 2023-01-22 20:19:35.970339: step: 1192/527, loss: 0.0030436310917139053 2023-01-22 20:19:37.033234: step: 1196/527, loss: 0.014609329402446747 2023-01-22 20:19:38.080030: step: 1200/527, loss: 0.0043011498637497425 2023-01-22 20:19:39.131267: step: 1204/527, loss: 0.012104524299502373 2023-01-22 20:19:40.198788: step: 1208/527, loss: 0.002537710592150688 2023-01-22 20:19:41.231412: step: 1212/527, loss: 0.006075495854020119 2023-01-22 20:19:42.278422: step: 1216/527, loss: 0.03430553898215294 2023-01-22 20:19:43.327026: step: 1220/527, loss: 0.009878157638013363 2023-01-22 20:19:44.376868: step: 1224/527, loss: 0.0021241381764411926 2023-01-22 20:19:45.423524: step: 1228/527, loss: 0.016044380143284798 2023-01-22 20:19:46.467376: step: 1232/527, loss: 0.03437155857682228 2023-01-22 20:19:47.508631: step: 1236/527, loss: 
0.009187907911837101 2023-01-22 20:19:48.558512: step: 1240/527, loss: 0.026836981996893883 2023-01-22 20:19:49.598282: step: 1244/527, loss: 0.02395525947213173 2023-01-22 20:19:50.640679: step: 1248/527, loss: 0.037901222705841064 2023-01-22 20:19:51.706319: step: 1252/527, loss: 0.0072151361964643 2023-01-22 20:19:52.747666: step: 1256/527, loss: 0.011878948658704758 2023-01-22 20:19:53.782167: step: 1260/527, loss: 0.03061460517346859 2023-01-22 20:19:54.839042: step: 1264/527, loss: 0.03187352418899536 2023-01-22 20:19:55.905297: step: 1268/527, loss: 0.014175578020513058 2023-01-22 20:19:56.968397: step: 1272/527, loss: 0.016100579872727394 2023-01-22 20:19:58.020561: step: 1276/527, loss: 0.004758130759000778 2023-01-22 20:19:59.082434: step: 1280/527, loss: 0.009700021706521511 2023-01-22 20:20:00.137618: step: 1284/527, loss: 0.013814649544656277 2023-01-22 20:20:01.206149: step: 1288/527, loss: 0.008729923516511917 2023-01-22 20:20:02.269311: step: 1292/527, loss: 0.0033721658401191235 2023-01-22 20:20:03.338508: step: 1296/527, loss: 0.03465705364942551 2023-01-22 20:20:04.380766: step: 1300/527, loss: 0.0037103756330907345 2023-01-22 20:20:05.433441: step: 1304/527, loss: 0.009560495615005493 2023-01-22 20:20:06.472732: step: 1308/527, loss: 0.00837091263383627 2023-01-22 20:20:07.514197: step: 1312/527, loss: 0.00782929826527834 2023-01-22 20:20:08.569870: step: 1316/527, loss: 0.00788565631955862 2023-01-22 20:20:09.617687: step: 1320/527, loss: 0.03325852006673813 2023-01-22 20:20:10.675906: step: 1324/527, loss: 0.013403902761638165 2023-01-22 20:20:11.718208: step: 1328/527, loss: 0.001752275973558426 2023-01-22 20:20:12.777982: step: 1332/527, loss: 0.015953881666064262 2023-01-22 20:20:13.819741: step: 1336/527, loss: 0.03616141527891159 2023-01-22 20:20:14.864937: step: 1340/527, loss: 0.04026893898844719 2023-01-22 20:20:15.916805: step: 1344/527, loss: 0.01275169663131237 2023-01-22 20:20:16.980638: step: 1348/527, loss: 0.008130903355777264 
2023-01-22 20:20:18.035629: step: 1352/527, loss: 0.0037108901888132095 2023-01-22 20:20:19.087378: step: 1356/527, loss: 0.026075756177306175 2023-01-22 20:20:20.135788: step: 1360/527, loss: 0.0015174155123531818 2023-01-22 20:20:21.178340: step: 1364/527, loss: 0.007370895706117153 2023-01-22 20:20:22.231752: step: 1368/527, loss: 0.014556348323822021 2023-01-22 20:20:23.273744: step: 1372/527, loss: 0.009829215705394745 2023-01-22 20:20:24.348165: step: 1376/527, loss: 0.006865905597805977 2023-01-22 20:20:25.419624: step: 1380/527, loss: 0.007092609070241451 2023-01-22 20:20:26.481261: step: 1384/527, loss: 0.037890467792749405 2023-01-22 20:20:27.522104: step: 1388/527, loss: 0.016621911898255348 2023-01-22 20:20:28.562512: step: 1392/527, loss: 0.0073476312682032585 2023-01-22 20:20:29.617635: step: 1396/527, loss: 0.017868949100375175 2023-01-22 20:20:30.660757: step: 1400/527, loss: 0.01745520532131195 2023-01-22 20:20:31.707282: step: 1404/527, loss: 0.01043170876801014 2023-01-22 20:20:32.772203: step: 1408/527, loss: 0.033055443316698074 2023-01-22 20:20:33.837455: step: 1412/527, loss: 0.06796393543481827 2023-01-22 20:20:34.882067: step: 1416/527, loss: 0.005566603038460016 2023-01-22 20:20:35.936021: step: 1420/527, loss: 0.0189081858843565 2023-01-22 20:20:36.981221: step: 1424/527, loss: 0.013220678083598614 2023-01-22 20:20:38.032610: step: 1428/527, loss: 0.010770805180072784 2023-01-22 20:20:39.064539: step: 1432/527, loss: 0.0138105982914567 2023-01-22 20:20:40.109366: step: 1436/527, loss: 0.010265029966831207 2023-01-22 20:20:41.162494: step: 1440/527, loss: 0.08747374266386032 2023-01-22 20:20:42.226141: step: 1444/527, loss: 0.004809635691344738 2023-01-22 20:20:43.268462: step: 1448/527, loss: 0.0032816233579069376 2023-01-22 20:20:44.331961: step: 1452/527, loss: 0.0036028623580932617 2023-01-22 20:20:45.386696: step: 1456/527, loss: 0.009788953699171543 2023-01-22 20:20:46.428299: step: 1460/527, loss: 0.009355951100587845 2023-01-22 
20:20:47.487722: step: 1464/527, loss: 0.005973074119538069 2023-01-22 20:20:48.527593: step: 1468/527, loss: 0.04549450799822807 2023-01-22 20:20:49.582888: step: 1472/527, loss: 0.024780411273241043 2023-01-22 20:20:50.630105: step: 1476/527, loss: 0.028518397361040115 2023-01-22 20:20:51.701563: step: 1480/527, loss: 0.002183921867981553 2023-01-22 20:20:52.744308: step: 1484/527, loss: 0.010395008139312267 2023-01-22 20:20:53.796569: step: 1488/527, loss: 0.0060562510043382645 2023-01-22 20:20:54.847274: step: 1492/527, loss: 0.007435582112520933 2023-01-22 20:20:55.895683: step: 1496/527, loss: 0.0027978983707726 2023-01-22 20:20:56.935280: step: 1500/527, loss: 0.006862413138151169 2023-01-22 20:20:57.972998: step: 1504/527, loss: 0.011185969226062298 2023-01-22 20:20:59.004465: step: 1508/527, loss: 0.02064533531665802 2023-01-22 20:21:00.052065: step: 1512/527, loss: 0.021287381649017334 2023-01-22 20:21:01.098152: step: 1516/527, loss: 0.008476035669445992 2023-01-22 20:21:02.148535: step: 1520/527, loss: 0.004403266590088606 2023-01-22 20:21:03.193053: step: 1524/527, loss: 0.003930867649614811 2023-01-22 20:21:04.243117: step: 1528/527, loss: 0.005387548822909594 2023-01-22 20:21:05.283014: step: 1532/527, loss: 0.004389140289276838 2023-01-22 20:21:06.344798: step: 1536/527, loss: 0.010187477804720402 2023-01-22 20:21:07.382973: step: 1540/527, loss: 0.02456819824874401 2023-01-22 20:21:08.429575: step: 1544/527, loss: 0.004405968822538853 2023-01-22 20:21:09.471841: step: 1548/527, loss: 0.03128984570503235 2023-01-22 20:21:10.526442: step: 1552/527, loss: 0.005347860511392355 2023-01-22 20:21:11.590894: step: 1556/527, loss: 0.02187165431678295 2023-01-22 20:21:12.644684: step: 1560/527, loss: 0.0018834836082533002 2023-01-22 20:21:13.687382: step: 1564/527, loss: 0.0028059252072125673 2023-01-22 20:21:14.750642: step: 1568/527, loss: 0.036271851509809494 2023-01-22 20:21:15.794672: step: 1572/527, loss: 9.282708924729377e-05 2023-01-22 
20:21:16.855850: step: 1576/527, loss: 0.006122369319200516 2023-01-22 20:21:17.898311: step: 1580/527, loss: 0.0030074601527303457 2023-01-22 20:21:18.948097: step: 1584/527, loss: 0.01225368957966566 2023-01-22 20:21:19.986914: step: 1588/527, loss: 0.009725715965032578 2023-01-22 20:21:21.027909: step: 1592/527, loss: 0.03296318277716637 2023-01-22 20:21:22.075097: step: 1596/527, loss: 0.0022004502825438976 2023-01-22 20:21:23.127652: step: 1600/527, loss: 0.012196688912808895 2023-01-22 20:21:24.154735: step: 1604/527, loss: 0.0054717231541872025 2023-01-22 20:21:25.214870: step: 1608/527, loss: 0.020669111981987953 2023-01-22 20:21:26.259140: step: 1612/527, loss: 0.014488261193037033 2023-01-22 20:21:27.311279: step: 1616/527, loss: 0.012031824328005314 2023-01-22 20:21:28.361287: step: 1620/527, loss: 0.003091371851041913 2023-01-22 20:21:29.411654: step: 1624/527, loss: 0.013397125527262688 2023-01-22 20:21:30.472120: step: 1628/527, loss: 0.009961170144379139 2023-01-22 20:21:31.526632: step: 1632/527, loss: 0.0032252143137156963 2023-01-22 20:21:32.580065: step: 1636/527, loss: 0.083075612783432 2023-01-22 20:21:33.623206: step: 1640/527, loss: 0.0017018612707033753 2023-01-22 20:21:34.659841: step: 1644/527, loss: 0.002244790317490697 2023-01-22 20:21:35.721409: step: 1648/527, loss: 0.019958626478910446 2023-01-22 20:21:36.769869: step: 1652/527, loss: 0.0018631464336067438 2023-01-22 20:21:37.810804: step: 1656/527, loss: 0.004147750791162252 2023-01-22 20:21:38.861759: step: 1660/527, loss: 0.005792287643998861 2023-01-22 20:21:39.910206: step: 1664/527, loss: 0.004970904439687729 2023-01-22 20:21:40.957908: step: 1668/527, loss: 0.03062223643064499 2023-01-22 20:21:41.997911: step: 1672/527, loss: 0.005043509881943464 2023-01-22 20:21:43.042170: step: 1676/527, loss: 0.0039184922352433205 2023-01-22 20:21:44.094449: step: 1680/527, loss: 0.023870836943387985 2023-01-22 20:21:45.144482: step: 1684/527, loss: 0.0137215880677104 2023-01-22 
20:21:46.191063: step: 1688/527, loss: 0.016430791467428207 2023-01-22 20:21:47.250840: step: 1692/527, loss: 0.07737080752849579 2023-01-22 20:21:48.296328: step: 1696/527, loss: 0.01367567852139473 2023-01-22 20:21:49.354301: step: 1700/527, loss: 0.03803643211722374 2023-01-22 20:21:50.392405: step: 1704/527, loss: 0.0014364771777763963 2023-01-22 20:21:51.425489: step: 1708/527, loss: 0.022895200178027153 2023-01-22 20:21:52.479794: step: 1712/527, loss: 0.009638975374400616 2023-01-22 20:21:53.532798: step: 1716/527, loss: 0.007268958725035191 2023-01-22 20:21:54.579589: step: 1720/527, loss: 0.008098487742245197 2023-01-22 20:21:55.627573: step: 1724/527, loss: 0.009427439421415329 2023-01-22 20:21:56.673681: step: 1728/527, loss: 0.00845731794834137 2023-01-22 20:21:57.722777: step: 1732/527, loss: 0.003358046058565378 2023-01-22 20:21:58.778040: step: 1736/527, loss: 0.00312783638946712 2023-01-22 20:21:59.832422: step: 1740/527, loss: 0.0025260988622903824 2023-01-22 20:22:00.878666: step: 1744/527, loss: 0.006439708638936281 2023-01-22 20:22:01.934613: step: 1748/527, loss: 0.008595471270382404 2023-01-22 20:22:02.988736: step: 1752/527, loss: 0.02937445417046547 2023-01-22 20:22:04.045207: step: 1756/527, loss: 0.07778147608041763 2023-01-22 20:22:05.115586: step: 1760/527, loss: 0.007810445036739111 2023-01-22 20:22:06.168220: step: 1764/527, loss: 0.016113916411995888 2023-01-22 20:22:07.204755: step: 1768/527, loss: 0.015464873984456062 2023-01-22 20:22:08.250339: step: 1772/527, loss: 0.05303625389933586 2023-01-22 20:22:09.287373: step: 1776/527, loss: 0.0011791265569627285 2023-01-22 20:22:10.333305: step: 1780/527, loss: 0.019628018140792847 2023-01-22 20:22:11.372282: step: 1784/527, loss: 0.002410472836345434 2023-01-22 20:22:12.415947: step: 1788/527, loss: 0.019561175256967545 2023-01-22 20:22:13.469134: step: 1792/527, loss: 0.007290417794138193 2023-01-22 20:22:14.522418: step: 1796/527, loss: 0.013378962874412537 2023-01-22 20:22:15.564951: 
step: 1800/527, loss: 0.003619763534516096 2023-01-22 20:22:16.612038: step: 1804/527, loss: 0.005326869431883097 2023-01-22 20:22:17.650835: step: 1808/527, loss: 0.007123375777155161 2023-01-22 20:22:18.697783: step: 1812/527, loss: 0.001479431870393455 2023-01-22 20:22:19.731953: step: 1816/527, loss: 0.0002616850833874196 2023-01-22 20:22:20.788935: step: 1820/527, loss: 0.004525614436715841 2023-01-22 20:22:21.845464: step: 1824/527, loss: 0.0013356241397559643 2023-01-22 20:22:22.880729: step: 1828/527, loss: 0.00701129250228405 2023-01-22 20:22:23.929748: step: 1832/527, loss: 0.006185004487633705 2023-01-22 20:22:24.975418: step: 1836/527, loss: 0.005583008285611868 2023-01-22 20:22:26.037229: step: 1840/527, loss: 0.00713853957131505 2023-01-22 20:22:27.111595: step: 1844/527, loss: 0.005338139832019806 2023-01-22 20:22:28.154695: step: 1848/527, loss: 0.004853971302509308 2023-01-22 20:22:29.195871: step: 1852/527, loss: 0.01282561756670475 2023-01-22 20:22:30.256753: step: 1856/527, loss: 0.009962816722691059 2023-01-22 20:22:31.304017: step: 1860/527, loss: 0.0063874083571136 2023-01-22 20:22:32.368229: step: 1864/527, loss: 0.012606088072061539 2023-01-22 20:22:33.430441: step: 1868/527, loss: 0.003858232870697975 2023-01-22 20:22:34.477573: step: 1872/527, loss: 0.027438918128609657 2023-01-22 20:22:35.528742: step: 1876/527, loss: 0.001566319027915597 2023-01-22 20:22:36.614595: step: 1880/527, loss: 0.004437906201928854 2023-01-22 20:22:37.675529: step: 1884/527, loss: 0.011134578846395016 2023-01-22 20:22:38.721251: step: 1888/527, loss: 0.013183685950934887 2023-01-22 20:22:39.775851: step: 1892/527, loss: 0.011707274243235588 2023-01-22 20:22:40.842802: step: 1896/527, loss: 0.005555757321417332 2023-01-22 20:22:41.889749: step: 1900/527, loss: 0.016193009912967682 2023-01-22 20:22:42.921365: step: 1904/527, loss: 0.0006685860571451485 2023-01-22 20:22:43.969464: step: 1908/527, loss: 0.006723953410983086 2023-01-22 20:22:45.020041: step: 
1912/527, loss: 0.0076876431703567505 2023-01-22 20:22:46.079294: step: 1916/527, loss: 0.006427178159356117 2023-01-22 20:22:47.141935: step: 1920/527, loss: 0.005880905780941248 2023-01-22 20:22:48.188498: step: 1924/527, loss: 0.0012676960323005915 2023-01-22 20:22:49.258255: step: 1928/527, loss: 0.008009851910173893 2023-01-22 20:22:50.303899: step: 1932/527, loss: 0.0434921570122242 2023-01-22 20:22:51.352225: step: 1936/527, loss: 0.0025629119481891394 2023-01-22 20:22:52.396513: step: 1940/527, loss: 0.006032139994204044 2023-01-22 20:22:53.430021: step: 1944/527, loss: 0.008024443872272968 2023-01-22 20:22:54.497311: step: 1948/527, loss: 0.006688220892101526 2023-01-22 20:22:55.538451: step: 1952/527, loss: 0.007636374793946743 2023-01-22 20:22:56.569770: step: 1956/527, loss: 0.018973510712385178 2023-01-22 20:22:57.620971: step: 1960/527, loss: 0.00806934479624033 2023-01-22 20:22:58.681320: step: 1964/527, loss: 0.003991689067333937 2023-01-22 20:22:59.708110: step: 1968/527, loss: 0.005725575610995293 2023-01-22 20:23:00.756439: step: 1972/527, loss: 0.006628451868891716 2023-01-22 20:23:01.814366: step: 1976/527, loss: 0.004891253542155027 2023-01-22 20:23:02.875016: step: 1980/527, loss: 0.03173601254820824 2023-01-22 20:23:03.927079: step: 1984/527, loss: 0.0022919070906937122 2023-01-22 20:23:04.970035: step: 1988/527, loss: 0.002828119555488229 2023-01-22 20:23:06.017893: step: 1992/527, loss: 0.04499104246497154 2023-01-22 20:23:07.068645: step: 1996/527, loss: 0.0010336079867556691 2023-01-22 20:23:08.141607: step: 2000/527, loss: 0.009758410975337029 2023-01-22 20:23:09.179004: step: 2004/527, loss: 0.008675065822899342 2023-01-22 20:23:10.213267: step: 2008/527, loss: 0.012204213067889214 2023-01-22 20:23:11.268794: step: 2012/527, loss: 0.005348739214241505 2023-01-22 20:23:12.327683: step: 2016/527, loss: 0.006154006812721491 2023-01-22 20:23:13.386202: step: 2020/527, loss: 0.008105491288006306 2023-01-22 20:23:14.429382: step: 2024/527, 
loss: 0.0480058379471302 2023-01-22 20:23:15.503300: step: 2028/527, loss: 0.0035810123663395643 2023-01-22 20:23:16.551544: step: 2032/527, loss: 0.009700424037873745 2023-01-22 20:23:17.600791: step: 2036/527, loss: 0.010781696066260338 2023-01-22 20:23:18.637848: step: 2040/527, loss: 0.01747804507613182 2023-01-22 20:23:19.713193: step: 2044/527, loss: 0.03458770364522934 2023-01-22 20:23:20.771070: step: 2048/527, loss: 0.0025242650881409645 2023-01-22 20:23:21.820255: step: 2052/527, loss: 0.021751495078206062 2023-01-22 20:23:22.870259: step: 2056/527, loss: 0.009766235947608948 2023-01-22 20:23:23.927049: step: 2060/527, loss: 0.013502796180546284 2023-01-22 20:23:24.963889: step: 2064/527, loss: 0.0034496246371418238 2023-01-22 20:23:26.014043: step: 2068/527, loss: 0.033799927681684494 2023-01-22 20:23:27.054678: step: 2072/527, loss: 0.0007875370210967958 2023-01-22 20:23:28.105546: step: 2076/527, loss: 0.013467349112033844 2023-01-22 20:23:29.173954: step: 2080/527, loss: 0.020308133214712143 2023-01-22 20:23:30.220293: step: 2084/527, loss: 0.0029298237059265375 2023-01-22 20:23:31.258389: step: 2088/527, loss: 0.0046996851451694965 2023-01-22 20:23:32.294070: step: 2092/527, loss: 0.017699599266052246 2023-01-22 20:23:33.336588: step: 2096/527, loss: 0.011798987165093422 2023-01-22 20:23:34.370751: step: 2100/527, loss: 0.016439277678728104 2023-01-22 20:23:35.411675: step: 2104/527, loss: 0.0310671404004097 2023-01-22 20:23:36.456444: step: 2108/527, loss: 0.010649049654603004 ================================================== Loss: 0.013 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3239921171171171, 'r': 0.34120611954459207, 'f1': 0.33237638632162664}, 'combined': 0.24490891623698804, 'stategy': 1, 'epoch': 4} Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33875319144775856, 'r': 
0.30918927655777234, 'f1': 0.32329677206611174}, 'combined': 0.20690993412231148, 'stategy': 1, 'epoch': 4} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3217204172638955, 'r': 0.3571279774181762, 'f1': 0.33850079873989003}, 'combined': 0.24942164117676105, 'stategy': 1, 'epoch': 4} Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3552706105510955, 'r': 0.3116692174380065, 'f1': 0.33204468685889316}, 'combined': 0.2125085995896916, 'stategy': 1, 'epoch': 4} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33260491322446273, 'r': 0.33386717095965995, 'f1': 0.3332348467722363}, 'combined': 0.2455414660427004, 'stategy': 1, 'epoch': 4} Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3675928546671409, 'r': 0.298355619984613, 'f1': 0.32937501392575563}, 'combined': 0.23615567036186255, 'stategy': 1, 'epoch': 4} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.261437908496732, 'r': 0.38095238095238093, 'f1': 0.31007751937984496}, 'combined': 0.20671834625322996, 'stategy': 1, 'epoch': 4} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 4} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'stategy': 1, 'epoch': 4} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 
0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0} Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0} Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3264723230548723, 'r': 0.33576470416649107, 'f1': 0.3310533191688322}, 'combined': 0.24393402465071845, 'stategy': 1, 'epoch': 0} Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.35636473460031554, 'r': 0.2976731814040852, 'f1': 0.32438554919493273}, 'combined': 0.23257831829070652, 'stategy': 1, 'epoch': 0} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 0} 
****************************** Epoch: 5 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 20:26:04.291837: step: 4/527, loss: 0.002969354623928666 2023-01-22 20:26:05.327711: step: 8/527, loss: 0.0014827997656539083 2023-01-22 20:26:06.369550: step: 12/527, loss: 0.011391973122954369 2023-01-22 20:26:07.409277: step: 16/527, loss: 0.004701155703514814 2023-01-22 20:26:08.456210: step: 20/527, loss: 0.002939260099083185 2023-01-22 20:26:09.499968: step: 24/527, loss: 0.007404353469610214 2023-01-22 20:26:10.532499: step: 28/527, loss: 0.008736512623727322 2023-01-22 20:26:11.579628: step: 32/527, loss: 0.002266091527417302 2023-01-22 20:26:12.622380: step: 36/527, loss: 0.008729124441742897 2023-01-22 20:26:13.672046: step: 40/527, loss: 0.005469411611557007 2023-01-22 20:26:14.730510: step: 44/527, loss: 0.007562476675957441 2023-01-22 20:26:15.785360: step: 48/527, loss: 0.003686920739710331 2023-01-22 20:26:16.826535: step: 52/527, loss: 0.0032590485643595457 2023-01-22 20:26:17.866003: step: 56/527, loss: 0.013321204110980034 2023-01-22 20:26:18.905769: step: 60/527, loss: 0.020682601258158684 2023-01-22 20:26:19.956514: step: 64/527, loss: 0.05404923856258392 2023-01-22 20:26:21.009363: step: 68/527, loss: 0.0010306095937266946 2023-01-22 20:26:22.046762: step: 72/527, loss: 0.024613196030259132 2023-01-22 20:26:23.092742: step: 76/527, loss: 0.005776167381554842 2023-01-22 20:26:24.155945: step: 80/527, loss: 0.015013959258794785 2023-01-22 20:26:25.216859: step: 84/527, loss: 0.019363850355148315 2023-01-22 20:26:26.254687: step: 88/527, loss: 0.004050699528306723 2023-01-22 20:26:27.292492: step: 92/527, loss: 0.03389830142259598 2023-01-22 20:26:28.338337: step: 96/527, loss: 0.0010759391589090228 2023-01-22 20:26:29.385951: step: 100/527, loss: 0.0017122612334787846 2023-01-22 
20:26:30.433506: step: 104/527, loss: 0.0018421270651742816 2023-01-22 20:26:31.482062: step: 108/527, loss: 0.0018057593842968345 2023-01-22 20:26:32.545100: step: 112/527, loss: 0.03548183664679527 2023-01-22 20:26:33.586624: step: 116/527, loss: 0.0017532685305923223 2023-01-22 20:26:34.623589: step: 120/527, loss: 0.011274599470198154 2023-01-22 20:26:35.683361: step: 124/527, loss: 0.017829814925789833 2023-01-22 20:26:36.727620: step: 128/527, loss: 0.0006709989975206554 2023-01-22 20:26:37.775051: step: 132/527, loss: 0.014907999895513058 2023-01-22 20:26:38.804208: step: 136/527, loss: 0.01871870458126068 2023-01-22 20:26:39.858040: step: 140/527, loss: 0.005816675256937742 2023-01-22 20:26:40.901296: step: 144/527, loss: 0.008847100660204887 2023-01-22 20:26:41.954579: step: 148/527, loss: 0.0017618443816900253 2023-01-22 20:26:42.989547: step: 152/527, loss: 0.010032327845692635 2023-01-22 20:26:44.034182: step: 156/527, loss: 0.021026909351348877 2023-01-22 20:26:45.076130: step: 160/527, loss: 0.010059729218482971 2023-01-22 20:26:46.122371: step: 164/527, loss: 0.013219148851931095 2023-01-22 20:26:47.170383: step: 168/527, loss: 0.0032328632660210133 2023-01-22 20:26:48.223307: step: 172/527, loss: 0.0002496513770893216 2023-01-22 20:26:49.303488: step: 176/527, loss: 0.0064623611979186535 2023-01-22 20:26:50.350067: step: 180/527, loss: 0.029678309336304665 2023-01-22 20:26:51.398501: step: 184/527, loss: 0.01414361409842968 2023-01-22 20:26:52.452269: step: 188/527, loss: 0.011568525806069374 2023-01-22 20:26:53.508725: step: 192/527, loss: 0.006473075598478317 2023-01-22 20:26:54.561786: step: 196/527, loss: 0.015940334647893906 2023-01-22 20:26:55.596448: step: 200/527, loss: 0.009600615128874779 2023-01-22 20:26:56.655990: step: 204/527, loss: 0.007409001234918833 2023-01-22 20:26:57.701607: step: 208/527, loss: 0.004151436500251293 2023-01-22 20:26:58.746928: step: 212/527, loss: 0.005052375141531229 2023-01-22 20:26:59.801434: step: 216/527, 
loss: 0.007848276756703854 2023-01-22 20:27:00.836935: step: 220/527, loss: 0.007075367961078882 2023-01-22 20:27:01.883554: step: 224/527, loss: 0.03806260600686073 2023-01-22 20:27:02.928776: step: 228/527, loss: 0.002090335823595524 2023-01-22 20:27:03.975905: step: 232/527, loss: 0.004145875573158264 2023-01-22 20:27:05.031945: step: 236/527, loss: 0.0048092626966536045 2023-01-22 20:27:06.089653: step: 240/527, loss: 0.006860788911581039 2023-01-22 20:27:07.123920: step: 244/527, loss: 0.011368861421942711 2023-01-22 20:27:08.157368: step: 248/527, loss: 0.02221207693219185 2023-01-22 20:27:09.194314: step: 252/527, loss: 0.005869068671017885 2023-01-22 20:27:10.231949: step: 256/527, loss: 0.0024019372649490833 2023-01-22 20:27:11.287936: step: 260/527, loss: 0.009915287606418133 2023-01-22 20:27:12.317029: step: 264/527, loss: 0.0054763746447861195 2023-01-22 20:27:13.377404: step: 268/527, loss: 0.008758329786360264 2023-01-22 20:27:14.410576: step: 272/527, loss: 0.007811664137989283 2023-01-22 20:27:15.457275: step: 276/527, loss: 0.03288523852825165 2023-01-22 20:27:16.496190: step: 280/527, loss: 0.01274183765053749 2023-01-22 20:27:17.545294: step: 284/527, loss: 0.011776238679885864 2023-01-22 20:27:18.588667: step: 288/527, loss: 0.0012714744079858065 2023-01-22 20:27:19.649356: step: 292/527, loss: 0.005683154799044132 2023-01-22 20:27:20.704788: step: 296/527, loss: 0.0038965095300227404 2023-01-22 20:27:21.760125: step: 300/527, loss: 0.009876362048089504 2023-01-22 20:27:22.800165: step: 304/527, loss: 0.0007933435845188797 2023-01-22 20:27:23.850373: step: 308/527, loss: 0.009363846853375435 2023-01-22 20:27:24.900435: step: 312/527, loss: 0.03849930316209793 2023-01-22 20:27:25.949162: step: 316/527, loss: 0.02594965510070324 2023-01-22 20:27:26.988678: step: 320/527, loss: 0.0011536639649420977 2023-01-22 20:27:28.031113: step: 324/527, loss: 0.01028281357139349 2023-01-22 20:27:29.090699: step: 328/527, loss: 0.011264101602137089 2023-01-22 
20:27:30.115839: step: 332/527, loss: 0.0010639885440468788 2023-01-22 20:27:31.177188: step: 336/527, loss: 0.03970203548669815 2023-01-22 20:27:32.245023: step: 340/527, loss: 0.006717577110975981 2023-01-22 20:27:33.305853: step: 344/527, loss: 0.002548638731241226 2023-01-22 20:27:34.351343: step: 348/527, loss: 0.020918942987918854 2023-01-22 20:27:35.418554: step: 352/527, loss: 0.03664001449942589 2023-01-22 20:27:36.472308: step: 356/527, loss: 0.0031418537255376577 2023-01-22 20:27:37.516303: step: 360/527, loss: 0.013708369806408882 2023-01-22 20:27:38.571616: step: 364/527, loss: 0.007406481541693211 2023-01-22 20:27:39.614622: step: 368/527, loss: 0.02865022048354149 2023-01-22 20:27:40.675341: step: 372/527, loss: 0.008868347853422165 2023-01-22 20:27:41.721155: step: 376/527, loss: 0.006646595895290375 2023-01-22 20:27:42.771620: step: 380/527, loss: 0.004173017106950283 2023-01-22 20:27:43.824012: step: 384/527, loss: 0.0068471902050077915 2023-01-22 20:27:44.866563: step: 388/527, loss: 0.007518232800066471 2023-01-22 20:27:45.928179: step: 392/527, loss: 0.003517703851684928 2023-01-22 20:27:46.979500: step: 396/527, loss: 0.0074419742450118065 2023-01-22 20:27:48.043085: step: 400/527, loss: 0.03165902569890022 2023-01-22 20:27:49.087506: step: 404/527, loss: 0.003697504522278905 2023-01-22 20:27:50.139389: step: 408/527, loss: 0.0033218746539205313 2023-01-22 20:27:51.196234: step: 412/527, loss: 0.010349982418119907 2023-01-22 20:27:52.253935: step: 416/527, loss: 0.0014420952647924423 2023-01-22 20:27:53.323230: step: 420/527, loss: 0.040701959282159805 2023-01-22 20:27:54.379097: step: 424/527, loss: 0.009987237863242626 2023-01-22 20:27:55.415966: step: 428/527, loss: 0.006658504717051983 2023-01-22 20:27:56.463783: step: 432/527, loss: 0.006679753307253122 2023-01-22 20:27:57.510154: step: 436/527, loss: 0.03932838886976242 2023-01-22 20:27:58.553636: step: 440/527, loss: 0.0023192455992102623 2023-01-22 20:27:59.627597: step: 444/527, loss: 
0.03153736889362335 2023-01-22 20:28:00.668561: step: 448/527, loss: 0.0013293507508933544 2023-01-22 20:28:01.711532: step: 452/527, loss: 0.052417173981666565 2023-01-22 20:28:02.761837: step: 456/527, loss: 0.005422278307378292 2023-01-22 20:28:03.806143: step: 460/527, loss: 0.006223623640835285 2023-01-22 20:28:04.862970: step: 464/527, loss: 0.014947882853448391 2023-01-22 20:28:05.915868: step: 468/527, loss: 0.008250672370195389 2023-01-22 20:28:06.969868: step: 472/527, loss: 0.018661482259631157 2023-01-22 20:28:08.033276: step: 476/527, loss: 0.00026580668054521084 2023-01-22 20:28:09.087112: step: 480/527, loss: 0.02620423398911953 2023-01-22 20:28:10.147459: step: 484/527, loss: 0.012208274565637112 2023-01-22 20:28:11.187976: step: 488/527, loss: 0.013318589888513088 2023-01-22 20:28:12.237337: step: 492/527, loss: 0.000285789486952126 2023-01-22 20:28:13.292332: step: 496/527, loss: 0.009143110364675522 2023-01-22 20:28:14.335696: step: 500/527, loss: 0.0045474739745259285 2023-01-22 20:28:15.370120: step: 504/527, loss: 0.0029171151109039783 2023-01-22 20:28:16.428940: step: 508/527, loss: 0.01945706084370613 2023-01-22 20:28:17.480893: step: 512/527, loss: 0.0071182711981236935 2023-01-22 20:28:18.525105: step: 516/527, loss: 0.006222281139343977 2023-01-22 20:28:19.574396: step: 520/527, loss: 0.01364617794752121 2023-01-22 20:28:20.617965: step: 524/527, loss: 0.009088690392673016 2023-01-22 20:28:21.675225: step: 528/527, loss: 0.015786701813340187 2023-01-22 20:28:22.729724: step: 532/527, loss: 0.0033564644400030375 2023-01-22 20:28:23.778737: step: 536/527, loss: 0.008154332637786865 2023-01-22 20:28:24.820390: step: 540/527, loss: 0.0022767100017517805 2023-01-22 20:28:25.880023: step: 544/527, loss: 0.016711972653865814 2023-01-22 20:28:26.951524: step: 548/527, loss: 0.007074535824358463 2023-01-22 20:28:28.021514: step: 552/527, loss: 0.0381755530834198 2023-01-22 20:28:29.065290: step: 556/527, loss: 0.005773419979959726 2023-01-22 
20:28:30.112269: step: 560/527, loss: 0.001738192979246378 2023-01-22 20:28:31.176609: step: 564/527, loss: 0.0025373923126608133 2023-01-22 20:28:32.226848: step: 568/527, loss: 0.023184970021247864 2023-01-22 20:28:33.277031: step: 572/527, loss: 0.0022909336257725954 2023-01-22 20:28:34.322727: step: 576/527, loss: 0.04253803566098213 2023-01-22 20:28:35.372889: step: 580/527, loss: 0.021489733830094337 2023-01-22 20:28:36.418243: step: 584/527, loss: 0.0005806525005027652 2023-01-22 20:28:37.458867: step: 588/527, loss: 0.007432470563799143 2023-01-22 20:28:38.511082: step: 592/527, loss: 0.012636465951800346 2023-01-22 20:28:39.558204: step: 596/527, loss: 0.010633801110088825 2023-01-22 20:28:40.614644: step: 600/527, loss: 0.0061531951650977135 2023-01-22 20:28:41.666081: step: 604/527, loss: 0.017504367977380753 2023-01-22 20:28:42.701069: step: 608/527, loss: 0.026440482586622238 2023-01-22 20:28:43.742806: step: 612/527, loss: 0.009087401442229748 2023-01-22 20:28:44.793908: step: 616/527, loss: 0.0044678207486867905 2023-01-22 20:28:45.842415: step: 620/527, loss: 0.011231588199734688 2023-01-22 20:28:46.877320: step: 624/527, loss: 0.0037329308688640594 2023-01-22 20:28:47.919513: step: 628/527, loss: 0.005498392041772604 2023-01-22 20:28:48.978730: step: 632/527, loss: 0.012921069748699665 2023-01-22 20:28:50.043876: step: 636/527, loss: 0.029064837843179703 2023-01-22 20:28:51.098755: step: 640/527, loss: 0.004471190273761749 2023-01-22 20:28:52.145159: step: 644/527, loss: 0.007544382940977812 2023-01-22 20:28:53.193346: step: 648/527, loss: 0.007371215149760246 2023-01-22 20:28:54.238141: step: 652/527, loss: 0.0022037853486835957 2023-01-22 20:28:55.283919: step: 656/527, loss: 0.007532479707151651 2023-01-22 20:28:56.328336: step: 660/527, loss: 0.0028670087922364473 2023-01-22 20:28:57.371674: step: 664/527, loss: 0.00627627968788147 2023-01-22 20:28:58.419616: step: 668/527, loss: 0.006216144654899836 2023-01-22 20:28:59.478228: step: 672/527, 
loss: 0.01613222062587738 2023-01-22 20:29:00.541675: step: 676/527, loss: 0.0036915522068738937 2023-01-22 20:29:01.595593: step: 680/527, loss: 0.013016798533499241 2023-01-22 20:29:02.650457: step: 684/527, loss: 0.005806453060358763 2023-01-22 20:29:03.680855: step: 688/527, loss: 0.0005305635277181864 2023-01-22 20:29:04.744913: step: 692/527, loss: 0.00021035685495007783 2023-01-22 20:29:05.799208: step: 696/527, loss: 0.004059563856571913 2023-01-22 20:29:06.837523: step: 700/527, loss: 0.014852388761937618 2023-01-22 20:29:07.894387: step: 704/527, loss: 0.009128103032708168 2023-01-22 20:29:08.942291: step: 708/527, loss: 0.01652321219444275 2023-01-22 20:29:09.996206: step: 712/527, loss: 0.02944299951195717 2023-01-22 20:29:11.044027: step: 716/527, loss: 0.003906652331352234 2023-01-22 20:29:12.096079: step: 720/527, loss: 0.007499282713979483 2023-01-22 20:29:13.157857: step: 724/527, loss: 0.009264912456274033 2023-01-22 20:29:14.206609: step: 728/527, loss: 0.01113964430987835 2023-01-22 20:29:15.250712: step: 732/527, loss: 0.008038428612053394 2023-01-22 20:29:16.293106: step: 736/527, loss: 0.03619419038295746 2023-01-22 20:29:17.341706: step: 740/527, loss: 0.002730597974732518 2023-01-22 20:29:18.404804: step: 744/527, loss: 0.003389413934201002 2023-01-22 20:29:19.470537: step: 748/527, loss: 0.011952356435358524 2023-01-22 20:29:20.541035: step: 752/527, loss: 0.012719747610390186 2023-01-22 20:29:21.586666: step: 756/527, loss: 0.0021842161659151316 2023-01-22 20:29:22.627615: step: 760/527, loss: 0.007026479579508305 2023-01-22 20:29:23.682816: step: 764/527, loss: 0.0064879958517849445 2023-01-22 20:29:24.723745: step: 768/527, loss: 0.0022806536871939898 2023-01-22 20:29:25.754853: step: 772/527, loss: 0.006598201580345631 2023-01-22 20:29:26.834127: step: 776/527, loss: 0.005487419664859772 2023-01-22 20:29:27.892222: step: 780/527, loss: 0.0011780932545661926 2023-01-22 20:29:28.945680: step: 784/527, loss: 0.002447796519845724 
2023-01-22 20:29:29.988055: step: 788/527, loss: 0.011724085547029972 2023-01-22 20:29:31.037111: step: 792/527, loss: 0.010715875774621964 2023-01-22 20:29:32.079018: step: 796/527, loss: 0.004847763571888208 2023-01-22 20:29:33.120582: step: 800/527, loss: 0.0019091146532446146 2023-01-22 20:29:34.166426: step: 804/527, loss: 0.00435184221714735 2023-01-22 20:29:35.207657: step: 808/527, loss: 0.006244426593184471 2023-01-22 20:29:36.264397: step: 812/527, loss: 0.0171041302382946 2023-01-22 20:29:37.334459: step: 816/527, loss: 0.03753026947379112 2023-01-22 20:29:38.380014: step: 820/527, loss: 0.0028294953517615795 2023-01-22 20:29:39.445853: step: 824/527, loss: 0.006130837369710207 2023-01-22 20:29:40.496759: step: 828/527, loss: 0.0036286322865635157 2023-01-22 20:29:41.562481: step: 832/527, loss: 0.010490822605788708 2023-01-22 20:29:42.597934: step: 836/527, loss: 0.017498401924967766 2023-01-22 20:29:43.666120: step: 840/527, loss: 0.002916432451456785 2023-01-22 20:29:44.706244: step: 844/527, loss: 0.013589801266789436 2023-01-22 20:29:45.761889: step: 848/527, loss: 0.005216329358518124 2023-01-22 20:29:46.800188: step: 852/527, loss: 0.002796849934384227 2023-01-22 20:29:47.863466: step: 856/527, loss: 0.019581666216254234 2023-01-22 20:29:48.910077: step: 860/527, loss: 0.011264914646744728 2023-01-22 20:29:49.956453: step: 864/527, loss: 0.008222275413572788 2023-01-22 20:29:51.016779: step: 868/527, loss: 0.018901150673627853 2023-01-22 20:29:52.055331: step: 872/527, loss: 0.018296141177415848 2023-01-22 20:29:53.114691: step: 876/527, loss: 0.0031639791559427977 2023-01-22 20:29:54.166652: step: 880/527, loss: 0.025872010737657547 2023-01-22 20:29:55.214115: step: 884/527, loss: 0.002451978623867035 2023-01-22 20:29:56.263494: step: 888/527, loss: 0.005923663266003132 2023-01-22 20:29:57.308376: step: 892/527, loss: 0.016273170709609985 2023-01-22 20:29:58.380266: step: 896/527, loss: 0.00237936619669199 2023-01-22 20:29:59.443815: step: 
900/527, loss: 0.02248707227408886 2023-01-22 20:30:00.504564: step: 904/527, loss: 0.004151094704866409 2023-01-22 20:30:01.572012: step: 908/527, loss: 0.0021933824755251408 2023-01-22 20:30:02.614016: step: 912/527, loss: 0.002742059761658311 2023-01-22 20:30:03.645196: step: 916/527, loss: 0.002953851129859686 2023-01-22 20:30:04.711813: step: 920/527, loss: 0.0033365164417773485 2023-01-22 20:30:05.770035: step: 924/527, loss: 0.004420835059136152 2023-01-22 20:30:06.844140: step: 928/527, loss: 0.013445116579532623 2023-01-22 20:30:07.898474: step: 932/527, loss: 7.957030175020918e-05 2023-01-22 20:30:08.948963: step: 936/527, loss: 0.003749594558030367 2023-01-22 20:30:10.011703: step: 940/527, loss: 0.017982684075832367 2023-01-22 20:30:11.069649: step: 944/527, loss: 0.01481990609318018 2023-01-22 20:30:12.113207: step: 948/527, loss: 0.0020884007681161165 2023-01-22 20:30:13.170212: step: 952/527, loss: 0.004878515377640724 2023-01-22 20:30:14.213617: step: 956/527, loss: 0.00833574403077364 2023-01-22 20:30:15.256955: step: 960/527, loss: 0.004025594796985388 2023-01-22 20:30:16.295928: step: 964/527, loss: 0.036562662571668625 2023-01-22 20:30:17.381716: step: 968/527, loss: 0.0011795436730608344 2023-01-22 20:30:18.425721: step: 972/527, loss: 0.026035521179437637 2023-01-22 20:30:19.494925: step: 976/527, loss: 0.0139453811571002 2023-01-22 20:30:20.540375: step: 980/527, loss: 0.0044447993859648705 2023-01-22 20:30:21.584838: step: 984/527, loss: 0.008876659907400608 2023-01-22 20:30:22.620689: step: 988/527, loss: 0.00867005530744791 2023-01-22 20:30:23.656133: step: 992/527, loss: 0.004711296409368515 2023-01-22 20:30:24.712038: step: 996/527, loss: 0.004192500840872526 2023-01-22 20:30:25.753048: step: 1000/527, loss: 0.0024711224250495434 2023-01-22 20:30:26.822395: step: 1004/527, loss: 0.0039834328927099705 2023-01-22 20:30:27.853586: step: 1008/527, loss: 0.0023470204323530197 2023-01-22 20:30:28.906782: step: 1012/527, loss: 
0.006675094366073608 2023-01-22 20:30:29.965520: step: 1016/527, loss: 0.031667426228523254 2023-01-22 20:30:31.005640: step: 1020/527, loss: 0.003976329229772091 2023-01-22 20:30:32.059377: step: 1024/527, loss: 0.005665579345077276 2023-01-22 20:30:33.096158: step: 1028/527, loss: 0.026949133723974228 2023-01-22 20:30:34.144844: step: 1032/527, loss: 0.00815789494663477 2023-01-22 20:30:35.202481: step: 1036/527, loss: 0.004347816109657288 2023-01-22 20:30:36.243283: step: 1040/527, loss: 0.022784659639000893 2023-01-22 20:30:37.304543: step: 1044/527, loss: 0.010411013849079609 2023-01-22 20:30:38.372841: step: 1048/527, loss: 0.0279465951025486 2023-01-22 20:30:39.424597: step: 1052/527, loss: 0.0038170020561665297 2023-01-22 20:30:40.477862: step: 1056/527, loss: 0.004797334782779217 2023-01-22 20:30:41.532263: step: 1060/527, loss: 0.013142174109816551 2023-01-22 20:30:42.577300: step: 1064/527, loss: 0.011441509239375591 2023-01-22 20:30:43.638205: step: 1068/527, loss: 0.010451341979205608 2023-01-22 20:30:44.677377: step: 1072/527, loss: 0.0277873482555151 2023-01-22 20:30:45.719993: step: 1076/527, loss: 0.018156759440898895 2023-01-22 20:30:46.781341: step: 1080/527, loss: 0.014458867721259594 2023-01-22 20:30:47.829944: step: 1084/527, loss: 0.01924041286110878 2023-01-22 20:30:48.883125: step: 1088/527, loss: 0.005845629144459963 2023-01-22 20:30:49.970370: step: 1092/527, loss: 0.011642617173492908 2023-01-22 20:30:51.029999: step: 1096/527, loss: 0.02554919384419918 2023-01-22 20:30:52.071906: step: 1100/527, loss: 0.0011409070575609803 2023-01-22 20:30:53.117592: step: 1104/527, loss: 0.0065157730132341385 2023-01-22 20:30:54.163961: step: 1108/527, loss: 0.020355820655822754 2023-01-22 20:30:55.237418: step: 1112/527, loss: 0.011787410825490952 2023-01-22 20:30:56.298333: step: 1116/527, loss: 0.027298949658870697 2023-01-22 20:30:57.347908: step: 1120/527, loss: 0.021569516509771347 2023-01-22 20:30:58.406880: step: 1124/527, loss: 
0.007379552815109491 2023-01-22 20:30:59.461725: step: 1128/527, loss: 0.008127819746732712 2023-01-22 20:31:00.530044: step: 1132/527, loss: 0.01808037795126438 2023-01-22 20:31:01.583898: step: 1136/527, loss: 0.008086105808615685 2023-01-22 20:31:02.643405: step: 1140/527, loss: 0.004028093535453081 2023-01-22 20:31:03.684737: step: 1144/527, loss: 0.009277253411710262 2023-01-22 20:31:04.747195: step: 1148/527, loss: 0.005190738011151552 2023-01-22 20:31:05.798663: step: 1152/527, loss: 0.005365185905247927 2023-01-22 20:31:06.870971: step: 1156/527, loss: 0.009627090767025948 2023-01-22 20:31:07.927894: step: 1160/527, loss: 0.022511985152959824 2023-01-22 20:31:08.983628: step: 1164/527, loss: 0.00419845525175333 2023-01-22 20:31:10.046725: step: 1168/527, loss: 0.002840465633198619 2023-01-22 20:31:11.092471: step: 1172/527, loss: 0.0045182122848927975 2023-01-22 20:31:12.136316: step: 1176/527, loss: 0.022903745993971825 2023-01-22 20:31:13.194069: step: 1180/527, loss: 0.003719200612977147 2023-01-22 20:31:14.241213: step: 1184/527, loss: 0.01360474992543459 2023-01-22 20:31:15.286451: step: 1188/527, loss: 0.0035302340984344482 2023-01-22 20:31:16.348359: step: 1192/527, loss: 0.023574169725179672 2023-01-22 20:31:17.407810: step: 1196/527, loss: 0.038673803210258484 2023-01-22 20:31:18.448930: step: 1200/527, loss: 0.0092678964138031 2023-01-22 20:31:19.477367: step: 1204/527, loss: 0.0035165001172572374 2023-01-22 20:31:20.517220: step: 1208/527, loss: 5.5617874750168994e-05 2023-01-22 20:31:21.582010: step: 1212/527, loss: 0.010505360551178455 2023-01-22 20:31:22.630028: step: 1216/527, loss: 0.04568411037325859 2023-01-22 20:31:23.675394: step: 1220/527, loss: 0.019272875040769577 2023-01-22 20:31:24.704370: step: 1224/527, loss: 0.011340446770191193 2023-01-22 20:31:25.750973: step: 1228/527, loss: 0.01919662021100521 2023-01-22 20:31:26.823223: step: 1232/527, loss: 0.05767522752285004 2023-01-22 20:31:27.863080: step: 1236/527, loss: 
0.029466670006513596 2023-01-22 20:31:28.925514: step: 1240/527, loss: 0.007538609206676483 2023-01-22 20:31:29.972390: step: 1244/527, loss: 0.006212037988007069 2023-01-22 20:31:31.015564: step: 1248/527, loss: 0.009144243784248829 2023-01-22 20:31:32.071750: step: 1252/527, loss: 0.0007239718688651919 2023-01-22 20:31:33.128539: step: 1256/527, loss: 0.021091515198349953 2023-01-22 20:31:34.177600: step: 1260/527, loss: 0.006575802341103554 2023-01-22 20:31:35.216615: step: 1264/527, loss: 0.010765178129076958 2023-01-22 20:31:36.262912: step: 1268/527, loss: 0.008394895121455193 2023-01-22 20:31:37.305315: step: 1272/527, loss: 3.491679672151804e-05 2023-01-22 20:31:38.377766: step: 1276/527, loss: 0.019259685650467873 2023-01-22 20:31:39.427894: step: 1280/527, loss: 0.01469599362462759 2023-01-22 20:31:40.472836: step: 1284/527, loss: 0.01065925695002079 2023-01-22 20:31:41.520276: step: 1288/527, loss: 0.012464450672268867 2023-01-22 20:31:42.582173: step: 1292/527, loss: 0.026256101205945015 2023-01-22 20:31:43.613220: step: 1296/527, loss: 0.002674462739378214 2023-01-22 20:31:44.669516: step: 1300/527, loss: 0.0031843220349401236 2023-01-22 20:31:45.721863: step: 1304/527, loss: 0.005963290110230446 2023-01-22 20:31:46.771946: step: 1308/527, loss: 0.002744704717770219 2023-01-22 20:31:47.810978: step: 1312/527, loss: 0.037120621651411057 2023-01-22 20:31:48.868520: step: 1316/527, loss: 0.00917765125632286 2023-01-22 20:31:49.952720: step: 1320/527, loss: 0.012225555256009102 2023-01-22 20:31:50.995989: step: 1324/527, loss: 0.004405536223202944 2023-01-22 20:31:52.027740: step: 1328/527, loss: 0.003313221503049135 2023-01-22 20:31:53.062775: step: 1332/527, loss: 0.000770586309954524 2023-01-22 20:31:54.107473: step: 1336/527, loss: 0.004324547480791807 2023-01-22 20:31:55.136915: step: 1340/527, loss: 0.002289133844897151 2023-01-22 20:31:56.182297: step: 1344/527, loss: 0.002626942005008459 2023-01-22 20:31:57.231533: step: 1348/527, loss: 
0.0017340255435556173 2023-01-22 20:31:58.288519: step: 1352/527, loss: 0.0002952470094896853 2023-01-22 20:31:59.329554: step: 1356/527, loss: 0.0034138751216232777 2023-01-22 20:32:00.393590: step: 1360/527, loss: 0.007477168459445238 2023-01-22 20:32:01.456992: step: 1364/527, loss: 0.003875673282891512 2023-01-22 20:32:02.509336: step: 1368/527, loss: 0.013424725271761417 2023-01-22 20:32:03.575204: step: 1372/527, loss: 0.003630690276622772 2023-01-22 20:32:04.614950: step: 1376/527, loss: 0.0028609067667275667 2023-01-22 20:32:05.651087: step: 1380/527, loss: 0.011709722690284252 2023-01-22 20:32:06.713386: step: 1384/527, loss: 0.005834583193063736 2023-01-22 20:32:07.756064: step: 1388/527, loss: 0.04574430361390114 2023-01-22 20:32:08.797301: step: 1392/527, loss: 0.02212393283843994 2023-01-22 20:32:09.851275: step: 1396/527, loss: 0.0039162845350801945 2023-01-22 20:32:10.884744: step: 1400/527, loss: 0.0017135880189016461 2023-01-22 20:32:11.957107: step: 1404/527, loss: 0.02152174711227417 2023-01-22 20:32:12.999462: step: 1408/527, loss: 0.004200255032628775 2023-01-22 20:32:14.053087: step: 1412/527, loss: 0.007566146086901426 2023-01-22 20:32:15.132553: step: 1416/527, loss: 0.03885927423834801 2023-01-22 20:32:16.187619: step: 1420/527, loss: 0.006391413044184446 2023-01-22 20:32:17.242633: step: 1424/527, loss: 0.00448015658184886 2023-01-22 20:32:18.279318: step: 1428/527, loss: 0.026661232113838196 2023-01-22 20:32:19.327214: step: 1432/527, loss: 0.01487383246421814 2023-01-22 20:32:20.368212: step: 1436/527, loss: 0.004046610556542873 2023-01-22 20:32:21.415924: step: 1440/527, loss: 0.005410411395132542 2023-01-22 20:32:22.456710: step: 1444/527, loss: 0.004408024251461029 2023-01-22 20:32:23.506565: step: 1448/527, loss: 0.016233105212450027 2023-01-22 20:32:24.562828: step: 1452/527, loss: 0.01613014005124569 2023-01-22 20:32:25.619913: step: 1456/527, loss: 0.008741861209273338 2023-01-22 20:32:26.676374: step: 1460/527, loss: 
0.010356834158301353 2023-01-22 20:32:27.735998: step: 1464/527, loss: 0.002057206118479371 2023-01-22 20:32:28.785846: step: 1468/527, loss: 0.0026880325749516487 2023-01-22 20:32:29.830234: step: 1472/527, loss: 0.002831254852935672 2023-01-22 20:32:30.901335: step: 1476/527, loss: 0.002983893733471632 2023-01-22 20:32:31.975185: step: 1480/527, loss: 0.0180650781840086 2023-01-22 20:32:33.017221: step: 1484/527, loss: 0.04800155386328697 2023-01-22 20:32:34.055791: step: 1488/527, loss: 0.004443780984729528 2023-01-22 20:32:35.107690: step: 1492/527, loss: 0.005661570001393557 2023-01-22 20:32:36.198906: step: 1496/527, loss: 0.007349614053964615 2023-01-22 20:32:37.241359: step: 1500/527, loss: 0.011952241882681847 2023-01-22 20:32:38.279577: step: 1504/527, loss: 0.00030225442606024444 2023-01-22 20:32:39.335333: step: 1508/527, loss: 0.01005775947123766 2023-01-22 20:32:40.382389: step: 1512/527, loss: 0.008232921361923218 2023-01-22 20:32:41.434682: step: 1516/527, loss: 0.006454213988035917 2023-01-22 20:32:42.485428: step: 1520/527, loss: 0.0024536894634366035 2023-01-22 20:32:43.511758: step: 1524/527, loss: 8.150745270540938e-05 2023-01-22 20:32:44.577473: step: 1528/527, loss: 0.0038888035342097282 2023-01-22 20:32:45.615534: step: 1532/527, loss: 0.00791660975664854 2023-01-22 20:32:46.674637: step: 1536/527, loss: 0.0034462171606719494 2023-01-22 20:32:47.726403: step: 1540/527, loss: 0.004976110532879829 2023-01-22 20:32:48.784437: step: 1544/527, loss: 0.013882022351026535 2023-01-22 20:32:49.838172: step: 1548/527, loss: 0.01641835644841194 2023-01-22 20:32:50.881321: step: 1552/527, loss: 0.006974536459892988 2023-01-22 20:32:51.919064: step: 1556/527, loss: 0.008180052973330021 2023-01-22 20:32:52.981163: step: 1560/527, loss: 0.030093254521489143 2023-01-22 20:32:54.024365: step: 1564/527, loss: 0.030447788536548615 2023-01-22 20:32:55.076989: step: 1568/527, loss: 0.007571370340883732 2023-01-22 20:32:56.125738: step: 1572/527, loss: 
0.005400706082582474 2023-01-22 20:32:57.187450: step: 1576/527, loss: 0.005711423698812723 2023-01-22 20:32:58.223856: step: 1580/527, loss: 0.018246019259095192 2023-01-22 20:32:59.288633: step: 1584/527, loss: 0.031007826328277588 2023-01-22 20:33:00.345869: step: 1588/527, loss: 0.003460594918578863 2023-01-22 20:33:01.405156: step: 1592/527, loss: 0.05547678843140602 2023-01-22 20:33:02.471971: step: 1596/527, loss: 0.006731846369802952 2023-01-22 20:33:03.519302: step: 1600/527, loss: 0.028741005808115005 2023-01-22 20:33:04.569843: step: 1604/527, loss: 0.010584404692053795 2023-01-22 20:33:05.621214: step: 1608/527, loss: 0.0035399007610976696 2023-01-22 20:33:06.664847: step: 1612/527, loss: 0.003735631238669157 2023-01-22 20:33:07.717440: step: 1616/527, loss: 0.005376417655497789 2023-01-22 20:33:08.762501: step: 1620/527, loss: 0.023900914937257767 2023-01-22 20:33:09.801044: step: 1624/527, loss: 0.009086580947041512 2023-01-22 20:33:10.845311: step: 1628/527, loss: 0.00670094508677721 2023-01-22 20:33:11.889513: step: 1632/527, loss: 0.0018897296395152807 2023-01-22 20:33:12.950534: step: 1636/527, loss: 0.0021955417469143867 2023-01-22 20:33:13.998249: step: 1640/527, loss: 0.0009489068761467934 2023-01-22 20:33:15.028855: step: 1644/527, loss: 0.008215104229748249 2023-01-22 20:33:16.079098: step: 1648/527, loss: 0.007600032724440098 2023-01-22 20:33:17.128806: step: 1652/527, loss: 0.011808929964900017 2023-01-22 20:33:18.168876: step: 1656/527, loss: 0.003579444717615843 2023-01-22 20:33:19.239626: step: 1660/527, loss: 0.004173597786575556 2023-01-22 20:33:20.287168: step: 1664/527, loss: 0.014817196875810623 2023-01-22 20:33:21.330484: step: 1668/527, loss: 0.006623163819313049 2023-01-22 20:33:22.383213: step: 1672/527, loss: 0.00447028037160635 2023-01-22 20:33:23.423148: step: 1676/527, loss: 0.011469677090644836 2023-01-22 20:33:24.452808: step: 1680/527, loss: 0.004856418818235397 2023-01-22 20:33:25.489887: step: 1684/527, loss: 
0.0012031658552587032 2023-01-22 20:33:26.541150: step: 1688/527, loss: 0.019825154915452003 2023-01-22 20:33:27.588574: step: 1692/527, loss: 0.0031039812602102757 2023-01-22 20:33:28.644583: step: 1696/527, loss: 0.013692040927708149 2023-01-22 20:33:29.698107: step: 1700/527, loss: 0.0028177835047245026 2023-01-22 20:33:30.751272: step: 1704/527, loss: 0.006275123916566372 2023-01-22 20:33:31.785922: step: 1708/527, loss: 0.03189357370138168 2023-01-22 20:33:32.821618: step: 1712/527, loss: 0.004919901955872774 2023-01-22 20:33:33.883388: step: 1716/527, loss: 0.004597228951752186 2023-01-22 20:33:34.934960: step: 1720/527, loss: 0.005968692246824503 2023-01-22 20:33:36.001636: step: 1724/527, loss: 0.04046793654561043 2023-01-22 20:33:37.055201: step: 1728/527, loss: 0.008305750787258148 2023-01-22 20:33:38.094171: step: 1732/527, loss: 0.0064603895880281925 2023-01-22 20:33:39.143008: step: 1736/527, loss: 0.007813424803316593 2023-01-22 20:33:40.199237: step: 1740/527, loss: 0.0038077947683632374 2023-01-22 20:33:41.264816: step: 1744/527, loss: 0.005299651529639959 2023-01-22 20:33:42.316062: step: 1748/527, loss: 0.009291463531553745 2023-01-22 20:33:43.356199: step: 1752/527, loss: 0.0017409819411113858 2023-01-22 20:33:44.399459: step: 1756/527, loss: 0.007004793733358383 2023-01-22 20:33:45.464035: step: 1760/527, loss: 0.009343842044472694 2023-01-22 20:33:46.511275: step: 1764/527, loss: 0.009979411959648132 2023-01-22 20:33:47.546366: step: 1768/527, loss: 0.0006102912011556327 2023-01-22 20:33:48.606693: step: 1772/527, loss: 0.016799774020910263 2023-01-22 20:33:49.651264: step: 1776/527, loss: 0.0033510886132717133 2023-01-22 20:33:50.692271: step: 1780/527, loss: 0.012840097770094872 2023-01-22 20:33:51.753833: step: 1784/527, loss: 0.0252709798514843 2023-01-22 20:33:52.804965: step: 1788/527, loss: 0.0019171589519828558 2023-01-22 20:33:53.860135: step: 1792/527, loss: 0.007466321811079979 2023-01-22 20:33:54.905721: step: 1796/527, loss: 
0.028352845460176468 2023-01-22 20:33:55.954433: step: 1800/527, loss: 0.020462721586227417 2023-01-22 20:33:57.008291: step: 1804/527, loss: 0.001348096877336502 2023-01-22 20:33:58.057650: step: 1808/527, loss: 0.004780034068971872 2023-01-22 20:33:59.111970: step: 1812/527, loss: 0.008875470608472824 2023-01-22 20:34:00.165145: step: 1816/527, loss: 0.0059275031089782715 2023-01-22 20:34:01.200648: step: 1820/527, loss: 0.027109917253255844 2023-01-22 20:34:02.238170: step: 1824/527, loss: 0.005415304563939571 2023-01-22 20:34:03.289973: step: 1828/527, loss: 0.05976094678044319 2023-01-22 20:34:04.329667: step: 1832/527, loss: 0.00021260854555293918 2023-01-22 20:34:05.383361: step: 1836/527, loss: 0.010727842338383198 2023-01-22 20:34:06.442169: step: 1840/527, loss: 0.018707668408751488 2023-01-22 20:34:07.472914: step: 1844/527, loss: 0.0036897333338856697 2023-01-22 20:34:08.535551: step: 1848/527, loss: 0.001390081481076777 2023-01-22 20:34:09.576261: step: 1852/527, loss: 0.005566315725445747 2023-01-22 20:34:10.625521: step: 1856/527, loss: 0.0016809921944513917 2023-01-22 20:34:11.671940: step: 1860/527, loss: 0.008079471997916698 2023-01-22 20:34:12.731466: step: 1864/527, loss: 0.004517318680882454 2023-01-22 20:34:13.781169: step: 1868/527, loss: 0.03713999316096306 2023-01-22 20:34:14.838344: step: 1872/527, loss: 0.005344726145267487 2023-01-22 20:34:15.898247: step: 1876/527, loss: 0.013714388012886047 2023-01-22 20:34:16.942030: step: 1880/527, loss: 0.01340159960091114 2023-01-22 20:34:17.996701: step: 1884/527, loss: 0.004132885951548815 2023-01-22 20:34:19.044568: step: 1888/527, loss: 0.02024313248693943 2023-01-22 20:34:20.118752: step: 1892/527, loss: 0.005223647691309452 2023-01-22 20:34:21.161841: step: 1896/527, loss: 0.028457213193178177 2023-01-22 20:34:22.205908: step: 1900/527, loss: 0.012500789947807789 2023-01-22 20:34:23.245785: step: 1904/527, loss: 0.002309531206265092 2023-01-22 20:34:24.281403: step: 1908/527, loss: 
0.003618075279518962 2023-01-22 20:34:25.319154: step: 1912/527, loss: 0.004269478842616081 2023-01-22 20:34:26.377818: step: 1916/527, loss: 0.0013688202016055584 2023-01-22 20:34:27.420488: step: 1920/527, loss: 0.009433871135115623 2023-01-22 20:34:28.467898: step: 1924/527, loss: 0.007213959936052561 2023-01-22 20:34:29.513974: step: 1928/527, loss: 0.002427024533972144 2023-01-22 20:34:30.545888: step: 1932/527, loss: 0.0053839352913200855 2023-01-22 20:34:31.602481: step: 1936/527, loss: 0.01548903901129961 2023-01-22 20:34:32.678365: step: 1940/527, loss: 0.018061330541968346 2023-01-22 20:34:33.738610: step: 1944/527, loss: 0.000532692123670131 2023-01-22 20:34:34.804604: step: 1948/527, loss: 0.0012011309154331684 2023-01-22 20:34:35.850375: step: 1952/527, loss: 0.02080696076154709 2023-01-22 20:34:36.898220: step: 1956/527, loss: 0.003533650655299425 2023-01-22 20:34:37.958038: step: 1960/527, loss: 0.010113691911101341 2023-01-22 20:34:39.020396: step: 1964/527, loss: 0.004794816952198744 2023-01-22 20:34:40.061708: step: 1968/527, loss: 0.0 2023-01-22 20:34:41.116264: step: 1972/527, loss: 0.02904944308102131 2023-01-22 20:34:42.161470: step: 1976/527, loss: 0.003015698166564107 2023-01-22 20:34:43.195540: step: 1980/527, loss: 0.0051984768360853195 2023-01-22 20:34:44.256276: step: 1984/527, loss: 0.0041220299899578094 2023-01-22 20:34:45.292079: step: 1988/527, loss: 0.01986858807504177 2023-01-22 20:34:46.338116: step: 1992/527, loss: 0.010533158667385578 2023-01-22 20:34:47.400703: step: 1996/527, loss: 0.004473670851439238 2023-01-22 20:34:48.461367: step: 2000/527, loss: 0.0014370603021234274 2023-01-22 20:34:49.554280: step: 2004/527, loss: 0.0035524177365005016 2023-01-22 20:34:50.611648: step: 2008/527, loss: 0.0029565240256488323 2023-01-22 20:34:51.658807: step: 2012/527, loss: 0.0009394868393428624 2023-01-22 20:34:52.697672: step: 2016/527, loss: 0.006197627633810043 2023-01-22 20:34:53.733443: step: 2020/527, loss: 0.008275207132101059 
2023-01-22 20:34:54.770179: step: 2024/527, loss: 0.0004387865774333477 2023-01-22 20:34:55.823562: step: 2028/527, loss: 0.008020678535103798 2023-01-22 20:34:56.878501: step: 2032/527, loss: 0.012076794169843197 2023-01-22 20:34:57.921375: step: 2036/527, loss: 0.0034120480995625257 2023-01-22 20:34:58.963055: step: 2040/527, loss: 0.0007011427660472691 2023-01-22 20:35:00.019326: step: 2044/527, loss: 0.0012681630905717611 2023-01-22 20:35:01.049833: step: 2048/527, loss: 0.0023309236858040094 2023-01-22 20:35:02.107634: step: 2052/527, loss: 0.000602313200943172 2023-01-22 20:35:03.164324: step: 2056/527, loss: 0.025826985016465187 2023-01-22 20:35:04.211669: step: 2060/527, loss: 0.0006361486157402396 2023-01-22 20:35:05.264699: step: 2064/527, loss: 0.004090974107384682 2023-01-22 20:35:06.314827: step: 2068/527, loss: 0.0165651086717844 2023-01-22 20:35:07.375633: step: 2072/527, loss: 0.012894713319838047 2023-01-22 20:35:08.426897: step: 2076/527, loss: 0.003250683192163706 2023-01-22 20:35:09.471653: step: 2080/527, loss: 0.0031449878588318825 2023-01-22 20:35:10.525582: step: 2084/527, loss: 0.033035244792699814 2023-01-22 20:35:11.576085: step: 2088/527, loss: 0.00939682312309742 2023-01-22 20:35:12.633765: step: 2092/527, loss: 0.0061929416842758656 2023-01-22 20:35:13.684891: step: 2096/527, loss: 0.004464729223400354 2023-01-22 20:35:14.733496: step: 2100/527, loss: 0.016868475824594498 2023-01-22 20:35:15.778544: step: 2104/527, loss: 0.013949783518910408 2023-01-22 20:35:16.820220: step: 2108/527, loss: 0.0008946279413066804 ================================================== Loss: 0.011 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32439424818840584, 'r': 0.3397829696394687, 'f1': 0.33191033364226136}, 'combined': 0.24456550899956098, 'stategy': 1, 'epoch': 5} Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 
'slot': {'p': 0.3422170033979775, 'r': 0.30830640942490517, 'f1': 0.32437785783586387}, 'combined': 0.20760182901495283, 'stategy': 1, 'epoch': 5}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3257477367406913, 'r': 0.35665359411646846, 'f1': 0.3405008045278603}, 'combined': 0.2508953296521076, 'stategy': 1, 'epoch': 5}
Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.36108364243455804, 'r': 0.3111884482072373, 'f1': 0.3342844658476182}, 'combined': 0.2139420581424756, 'stategy': 1, 'epoch': 5}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33163756944163336, 'r': 0.33163756944163336, 'f1': 0.33163756944163336}, 'combined': 0.24436452485172983, 'stategy': 1, 'epoch': 5}
Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3694882106357715, 'r': 0.2945147156660016, 'f1': 0.32776878229563117}, 'combined': 0.23500403258932048, 'stategy': 1, 'epoch': 5}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 5}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 5}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 5}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0}
Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33163756944163336, 'r': 0.33163756944163336, 'f1': 0.33163756944163336}, 'combined': 0.24436452485172983, 'stategy': 1, 'epoch': 5}
Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3694882106357715, 'r': 0.2945147156660016, 'f1': 0.32776878229563117}, 'combined': 
0.23500403258932048, 'stategy': 1, 'epoch': 5} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 5} ****************************** Epoch: 6 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 20:37:50.214573: step: 4/527, loss: 0.013858789578080177 2023-01-22 20:37:51.253567: step: 8/527, loss: 0.0068472097627818584 2023-01-22 20:37:52.286058: step: 12/527, loss: 0.008142342790961266 2023-01-22 20:37:53.324291: step: 16/527, loss: 0.001789297442883253 2023-01-22 20:37:54.400673: step: 20/527, loss: 0.006914615631103516 2023-01-22 20:37:55.451185: step: 24/527, loss: 0.005576504860073328 2023-01-22 20:37:56.509048: step: 28/527, loss: 0.019201723858714104 2023-01-22 20:37:57.567122: step: 32/527, loss: 0.0037800746504217386 2023-01-22 20:37:58.611585: step: 36/527, loss: 0.013856600970029831 2023-01-22 20:37:59.650502: step: 40/527, loss: 0.004672815091907978 2023-01-22 20:38:00.691976: step: 44/527, loss: 0.005283959675580263 2023-01-22 20:38:01.756724: step: 48/527, loss: 0.0046623265370726585 2023-01-22 20:38:02.818271: step: 52/527, loss: 0.027216605842113495 2023-01-22 20:38:03.870468: step: 56/527, loss: 0.008942375890910625 2023-01-22 20:38:04.904924: step: 60/527, loss: 0.005253983661532402 2023-01-22 20:38:05.952492: step: 64/527, loss: 0.0028702174313366413 2023-01-22 20:38:06.982048: step: 68/527, loss: 0.012408256530761719 2023-01-22 20:38:08.021623: step: 72/527, loss: 0.00613422179594636 2023-01-22 20:38:09.052814: step: 76/527, loss: 0.0018351746257394552 2023-01-22 20:38:10.103710: step: 80/527, loss: 0.0019695486407727003 2023-01-22 20:38:11.167507: step: 84/527, loss: 0.0032592942006886005 2023-01-22 
20:38:12.219645: step: 88/527, loss: 0.00019935319141950458 2023-01-22 20:38:13.274656: step: 92/527, loss: 0.007793416269123554 2023-01-22 20:38:14.321847: step: 96/527, loss: 0.0067306384444236755 2023-01-22 20:38:15.366854: step: 100/527, loss: 0.009403027594089508 2023-01-22 20:38:16.427392: step: 104/527, loss: 0.0064580426551401615 2023-01-22 20:38:17.472447: step: 108/527, loss: 0.0035173215437680483 2023-01-22 20:38:18.520447: step: 112/527, loss: 0.00279689091257751 2023-01-22 20:38:19.567246: step: 116/527, loss: 0.004287382122129202 2023-01-22 20:38:20.604742: step: 120/527, loss: 0.001108821015805006 2023-01-22 20:38:21.642868: step: 124/527, loss: 0.010869753547012806 2023-01-22 20:38:22.675706: step: 128/527, loss: 0.003302973695099354 2023-01-22 20:38:23.713332: step: 132/527, loss: 0.0060366056859493256 2023-01-22 20:38:24.751400: step: 136/527, loss: 0.003964675590395927 2023-01-22 20:38:25.807512: step: 140/527, loss: 0.01599006913602352 2023-01-22 20:38:26.875395: step: 144/527, loss: 0.009253821335732937 2023-01-22 20:38:27.913362: step: 148/527, loss: 0.0041815536096692085 2023-01-22 20:38:28.966020: step: 152/527, loss: 0.00996509101241827 2023-01-22 20:38:30.012979: step: 156/527, loss: 0.026652518659830093 2023-01-22 20:38:31.058348: step: 160/527, loss: 0.002544883405789733 2023-01-22 20:38:32.103411: step: 164/527, loss: 0.0016231551999226213 2023-01-22 20:38:33.175387: step: 168/527, loss: 0.0036847686860710382 2023-01-22 20:38:34.226684: step: 172/527, loss: 0.01465566549450159 2023-01-22 20:38:35.264228: step: 176/527, loss: 0.006843749899417162 2023-01-22 20:38:36.311400: step: 180/527, loss: 0.00491265719756484 2023-01-22 20:38:37.370073: step: 184/527, loss: 0.009041664190590382 2023-01-22 20:38:38.422467: step: 188/527, loss: 0.030131850391626358 2023-01-22 20:38:39.462246: step: 192/527, loss: 0.0046157860197126865 2023-01-22 20:38:40.501635: step: 196/527, loss: 0.008553413674235344 2023-01-22 20:38:41.553670: step: 200/527, loss: 
0.0022844148334115744 2023-01-22 20:38:42.586340: step: 204/527, loss: 0.0013727594632655382 2023-01-22 20:38:43.635028: step: 208/527, loss: 0.012246148660779 2023-01-22 20:38:44.666920: step: 212/527, loss: 0.005377107299864292 2023-01-22 20:38:45.713151: step: 216/527, loss: 0.0049127694219350815 2023-01-22 20:38:46.774342: step: 220/527, loss: 0.007973438128829002 2023-01-22 20:38:47.822986: step: 224/527, loss: 0.0065606338903307915 2023-01-22 20:38:48.857716: step: 228/527, loss: 0.01133472379297018 2023-01-22 20:38:49.924918: step: 232/527, loss: 0.007218982558697462 2023-01-22 20:38:50.965044: step: 236/527, loss: 0.004554699175059795 2023-01-22 20:38:52.013937: step: 240/527, loss: 0.008885643444955349 2023-01-22 20:38:53.042082: step: 244/527, loss: 0.006284310016781092 2023-01-22 20:38:54.069780: step: 248/527, loss: 0.007876320742070675 2023-01-22 20:38:55.119519: step: 252/527, loss: 0.005983966402709484 2023-01-22 20:38:56.171231: step: 256/527, loss: 0.009778381325304508 2023-01-22 20:38:57.213947: step: 260/527, loss: 0.009121174924075603 2023-01-22 20:38:58.262810: step: 264/527, loss: 0.0055168066173791885 2023-01-22 20:38:59.326263: step: 268/527, loss: 0.014901691116392612 2023-01-22 20:39:00.378800: step: 272/527, loss: 0.0025227766018360853 2023-01-22 20:39:01.421586: step: 276/527, loss: 0.003489930648356676 2023-01-22 20:39:02.459190: step: 280/527, loss: 0.00649976497516036 2023-01-22 20:39:03.512500: step: 284/527, loss: 0.01988983154296875 2023-01-22 20:39:04.559892: step: 288/527, loss: 0.005568318068981171 2023-01-22 20:39:05.603488: step: 292/527, loss: 0.006438949145376682 2023-01-22 20:39:06.672764: step: 296/527, loss: 0.006255479995161295 2023-01-22 20:39:07.733423: step: 300/527, loss: 0.006532568950206041 2023-01-22 20:39:08.798835: step: 304/527, loss: 0.004828070290386677 2023-01-22 20:39:09.841786: step: 308/527, loss: 0.0060275159776210785 2023-01-22 20:39:10.905324: step: 312/527, loss: 0.00839760061353445 2023-01-22 
20:39:11.960308: step: 316/527, loss: 0.0012062633177265525 2023-01-22 20:39:13.019939: step: 320/527, loss: 0.006844904739409685 2023-01-22 20:39:14.075285: step: 324/527, loss: 0.0021191516425460577 2023-01-22 20:39:15.126416: step: 328/527, loss: 0.007696196436882019 2023-01-22 20:39:16.167752: step: 332/527, loss: 0.03252635896205902 2023-01-22 20:39:17.226449: step: 336/527, loss: 0.0032477176282554865 2023-01-22 20:39:18.271503: step: 340/527, loss: 0.034856900572776794 2023-01-22 20:39:19.316342: step: 344/527, loss: 0.01766323670744896 2023-01-22 20:39:20.379765: step: 348/527, loss: 0.007777311839163303 2023-01-22 20:39:21.417289: step: 352/527, loss: 0.004600877873599529 2023-01-22 20:39:22.466763: step: 356/527, loss: 0.004632095340639353 2023-01-22 20:39:23.519640: step: 360/527, loss: 0.014119184575974941 2023-01-22 20:39:24.613059: step: 364/527, loss: 0.003761056810617447 2023-01-22 20:39:25.656096: step: 368/527, loss: 0.0042406306602060795 2023-01-22 20:39:26.693113: step: 372/527, loss: 0.026668522506952286 2023-01-22 20:39:27.743241: step: 376/527, loss: 0.004697869531810284 2023-01-22 20:39:28.808829: step: 380/527, loss: 0.006828742101788521 2023-01-22 20:39:29.867581: step: 384/527, loss: 0.01649160124361515 2023-01-22 20:39:30.916976: step: 388/527, loss: 0.013972537592053413 2023-01-22 20:39:31.989591: step: 392/527, loss: 0.007497820537537336 2023-01-22 20:39:33.038217: step: 396/527, loss: 0.005453927908092737 2023-01-22 20:39:34.087992: step: 400/527, loss: 0.010237008333206177 2023-01-22 20:39:35.148505: step: 404/527, loss: 0.01906120590865612 2023-01-22 20:39:36.209572: step: 408/527, loss: 9.83189747785218e-05 2023-01-22 20:39:37.255833: step: 412/527, loss: 0.008188781328499317 2023-01-22 20:39:38.306430: step: 416/527, loss: 0.002651694929227233 2023-01-22 20:39:39.345802: step: 420/527, loss: 0.0013871266273781657 2023-01-22 20:39:40.405689: step: 424/527, loss: 0.0038687598425894976 2023-01-22 20:39:41.456793: step: 428/527, loss: 
0.005508510395884514 2023-01-22 20:39:42.540045: step: 432/527, loss: 0.012011495418846607 2023-01-22 20:39:43.573102: step: 436/527, loss: 0.0023733838461339474 2023-01-22 20:39:44.622427: step: 440/527, loss: 0.0031815289985388517 2023-01-22 20:39:45.666503: step: 444/527, loss: 0.002654285402968526 2023-01-22 20:39:46.717543: step: 448/527, loss: 0.006022052373737097 2023-01-22 20:39:47.788809: step: 452/527, loss: 0.01670355349779129 2023-01-22 20:39:48.837951: step: 456/527, loss: 0.006755279377102852 2023-01-22 20:39:49.884688: step: 460/527, loss: 0.0013523250818252563 2023-01-22 20:39:50.930437: step: 464/527, loss: 0.015229029580950737 2023-01-22 20:39:51.983176: step: 468/527, loss: 0.007500459440052509 2023-01-22 20:39:53.029891: step: 472/527, loss: 0.004672660026699305 2023-01-22 20:39:54.090939: step: 476/527, loss: 0.02393902651965618 2023-01-22 20:39:55.134199: step: 480/527, loss: 0.002292448654770851 2023-01-22 20:39:56.185815: step: 484/527, loss: 0.002667705761268735 2023-01-22 20:39:57.227447: step: 488/527, loss: 0.005085463635623455 2023-01-22 20:39:58.284783: step: 492/527, loss: 0.010912981815636158 2023-01-22 20:39:59.340798: step: 496/527, loss: 0.0022410873789340258 2023-01-22 20:40:00.381969: step: 500/527, loss: 0.04836438223719597 2023-01-22 20:40:01.437739: step: 504/527, loss: 0.0013247053138911724 2023-01-22 20:40:02.489940: step: 508/527, loss: 0.006348397117108107 2023-01-22 20:40:03.539619: step: 512/527, loss: 0.0023427847772836685 2023-01-22 20:40:04.579972: step: 516/527, loss: 0.004499959293752909 2023-01-22 20:40:05.638320: step: 520/527, loss: 0.006391513627022505 2023-01-22 20:40:06.690841: step: 524/527, loss: 0.00026770145632326603 2023-01-22 20:40:07.733123: step: 528/527, loss: 0.023344792425632477 2023-01-22 20:40:08.775805: step: 532/527, loss: 0.004374243319034576 2023-01-22 20:40:09.837098: step: 536/527, loss: 0.017439447343349457 2023-01-22 20:40:10.890046: step: 540/527, loss: 0.013942568562924862 2023-01-22 
20:40:11.937029: step: 544/527, loss: 0.00022667463053949177 2023-01-22 20:40:12.979980: step: 548/527, loss: 0.00016721135762054473 2023-01-22 20:40:14.033861: step: 552/527, loss: 0.003708152798935771 2023-01-22 20:40:15.094982: step: 556/527, loss: 0.004756446927785873 2023-01-22 20:40:16.133175: step: 560/527, loss: 0.003914811182767153 2023-01-22 20:40:17.183223: step: 564/527, loss: 0.005228899419307709 2023-01-22 20:40:18.241217: step: 568/527, loss: 0.014796995557844639 2023-01-22 20:40:19.309908: step: 572/527, loss: 0.0010173649061471224 2023-01-22 20:40:20.352770: step: 576/527, loss: 0.0005553365917876363 2023-01-22 20:40:21.393000: step: 580/527, loss: 0.017911789938807487 2023-01-22 20:40:22.441521: step: 584/527, loss: 0.006082690320909023 2023-01-22 20:40:23.499345: step: 588/527, loss: 0.11595837026834488 2023-01-22 20:40:24.573770: step: 592/527, loss: 0.0053380210883915424 2023-01-22 20:40:25.637318: step: 596/527, loss: 0.00829209852963686 2023-01-22 20:40:26.675661: step: 600/527, loss: 2.9707933208555914e-05 2023-01-22 20:40:27.735814: step: 604/527, loss: 0.004472649190574884 2023-01-22 20:40:28.784975: step: 608/527, loss: 0.0002699148317333311 2023-01-22 20:40:29.852450: step: 612/527, loss: 0.0035728083457797766 2023-01-22 20:40:30.906939: step: 616/527, loss: 0.006894966587424278 2023-01-22 20:40:31.949831: step: 620/527, loss: 0.0008049603784456849 2023-01-22 20:40:33.007009: step: 624/527, loss: 0.04674700275063515 2023-01-22 20:40:34.062627: step: 628/527, loss: 0.014055876061320305 2023-01-22 20:40:35.122011: step: 632/527, loss: 0.006032452918589115 2023-01-22 20:40:36.163847: step: 636/527, loss: 0.0017511112382635474 2023-01-22 20:40:37.207784: step: 640/527, loss: 0.005384983029216528 2023-01-22 20:40:38.249833: step: 644/527, loss: 0.012973129749298096 2023-01-22 20:40:39.306388: step: 648/527, loss: 0.004904575180262327 2023-01-22 20:40:40.366406: step: 652/527, loss: 0.032769348472356796 2023-01-22 20:40:41.417925: step: 
656/527, loss: 0.019441980868577957 2023-01-22 20:40:42.491930: step: 660/527, loss: 0.0044271028600633144 2023-01-22 20:40:43.528650: step: 664/527, loss: 0.0037613767199218273 2023-01-22 20:40:44.572539: step: 668/527, loss: 0.0008808940183371305 2023-01-22 20:40:45.640876: step: 672/527, loss: 0.011488381773233414 2023-01-22 20:40:46.696932: step: 676/527, loss: 0.009461192414164543 2023-01-22 20:40:47.759231: step: 680/527, loss: 0.005549016874283552 2023-01-22 20:40:48.821848: step: 684/527, loss: 0.0047487132251262665 2023-01-22 20:40:49.871286: step: 688/527, loss: 0.07491440325975418 2023-01-22 20:40:50.937298: step: 692/527, loss: 0.010135111398994923 2023-01-22 20:40:51.980533: step: 696/527, loss: 0.0026387826073914766 2023-01-22 20:40:53.041905: step: 700/527, loss: 0.003873219480738044 2023-01-22 20:40:54.082032: step: 704/527, loss: 0.0010550337610766292 2023-01-22 20:40:55.129468: step: 708/527, loss: 0.010220732539892197 2023-01-22 20:40:56.173206: step: 712/527, loss: 0.0015176909510046244 2023-01-22 20:40:57.219292: step: 716/527, loss: 0.008627385832369328 2023-01-22 20:40:58.266357: step: 720/527, loss: 0.003731341799721122 2023-01-22 20:40:59.338319: step: 724/527, loss: 0.01755712553858757 2023-01-22 20:41:00.386164: step: 728/527, loss: 0.004385507199913263 2023-01-22 20:41:01.454478: step: 732/527, loss: 0.007302356883883476 2023-01-22 20:41:02.511175: step: 736/527, loss: 0.006042297929525375 2023-01-22 20:41:03.551919: step: 740/527, loss: 0.012654879130423069 2023-01-22 20:41:04.606250: step: 744/527, loss: 0.007096535060554743 2023-01-22 20:41:05.649337: step: 748/527, loss: 0.008909719996154308 2023-01-22 20:41:06.692863: step: 752/527, loss: 0.008092156611382961 2023-01-22 20:41:07.731112: step: 756/527, loss: 0.00423313956707716 2023-01-22 20:41:08.766624: step: 760/527, loss: 0.002946165855973959 2023-01-22 20:41:09.813942: step: 764/527, loss: 0.010203739628195763 2023-01-22 20:41:10.883277: step: 768/527, loss: 0.006518718786537647 
2023-01-22 20:41:11.927859: step: 772/527, loss: 0.011371956206858158 2023-01-22 20:41:12.990235: step: 776/527, loss: 0.007888561114668846 2023-01-22 20:41:14.029853: step: 780/527, loss: 0.012847617268562317 2023-01-22 20:41:15.077040: step: 784/527, loss: 0.00215515517629683 2023-01-22 20:41:16.131373: step: 788/527, loss: 0.013022433035075665 2023-01-22 20:41:17.185803: step: 792/527, loss: 0.008794841356575489 2023-01-22 20:41:18.255743: step: 796/527, loss: 0.20729830861091614 2023-01-22 20:41:19.310226: step: 800/527, loss: 0.013999639078974724 2023-01-22 20:41:20.363871: step: 804/527, loss: 0.007690807338804007 2023-01-22 20:41:21.429288: step: 808/527, loss: 0.05223587527871132 2023-01-22 20:41:22.490327: step: 812/527, loss: 0.0006392710492946208 2023-01-22 20:41:23.553262: step: 816/527, loss: 0.011065471917390823 2023-01-22 20:41:24.608983: step: 820/527, loss: 0.00035611592466011643 2023-01-22 20:41:25.651232: step: 824/527, loss: 0.03008735366165638 2023-01-22 20:41:26.704996: step: 828/527, loss: 0.03518468886613846 2023-01-22 20:41:27.749803: step: 832/527, loss: 0.010908438824117184 2023-01-22 20:41:28.792808: step: 836/527, loss: 0.001667057629674673 2023-01-22 20:41:29.850480: step: 840/527, loss: 0.00141559645999223 2023-01-22 20:41:30.904538: step: 844/527, loss: 0.0023408320266753435 2023-01-22 20:41:31.955702: step: 848/527, loss: 0.006133314687758684 2023-01-22 20:41:33.008791: step: 852/527, loss: 0.038530729711055756 2023-01-22 20:41:34.059854: step: 856/527, loss: 0.0035190805792808533 2023-01-22 20:41:35.108740: step: 860/527, loss: 0.011946484446525574 2023-01-22 20:41:36.153138: step: 864/527, loss: 0.010769824497401714 2023-01-22 20:41:37.199167: step: 868/527, loss: 0.004497906658798456 2023-01-22 20:41:38.263262: step: 872/527, loss: 0.0001892131840577349 2023-01-22 20:41:39.313280: step: 876/527, loss: 0.019124925136566162 2023-01-22 20:41:40.368038: step: 880/527, loss: 0.0088470708578825 2023-01-22 20:41:41.422771: step: 
884/527, loss: 0.0115622254088521 2023-01-22 20:41:42.465291: step: 888/527, loss: 0.0007585367420688272 2023-01-22 20:41:43.516896: step: 892/527, loss: 0.004425411578267813 2023-01-22 20:41:44.560156: step: 896/527, loss: 0.008273812010884285 2023-01-22 20:41:45.628479: step: 900/527, loss: 0.0007577762007713318 2023-01-22 20:41:46.679350: step: 904/527, loss: 0.0016283057630062103 2023-01-22 20:41:47.736013: step: 908/527, loss: 0.007221805397421122 2023-01-22 20:41:48.777092: step: 912/527, loss: 0.00942810345441103 2023-01-22 20:41:49.814857: step: 916/527, loss: 0.009683789685368538 2023-01-22 20:41:50.862744: step: 920/527, loss: 0.0033346693962812424 2023-01-22 20:41:51.904550: step: 924/527, loss: 0.011278386227786541 2023-01-22 20:41:52.944794: step: 928/527, loss: 0.008082011714577675 2023-01-22 20:41:54.009464: step: 932/527, loss: 0.008215578272938728 2023-01-22 20:41:55.069835: step: 936/527, loss: 0.004424274433404207 2023-01-22 20:41:56.105697: step: 940/527, loss: 0.009950798936188221 2023-01-22 20:41:57.157471: step: 944/527, loss: 0.00863655749708414 2023-01-22 20:41:58.215820: step: 948/527, loss: 0.003876417176797986 2023-01-22 20:41:59.252618: step: 952/527, loss: 0.008662656880915165 2023-01-22 20:42:00.294566: step: 956/527, loss: 0.004613472148776054 2023-01-22 20:42:01.339844: step: 960/527, loss: 0.005549674388021231 2023-01-22 20:42:02.411183: step: 964/527, loss: 0.03802313655614853 2023-01-22 20:42:03.461980: step: 968/527, loss: 0.0015504133189097047 2023-01-22 20:42:04.504468: step: 972/527, loss: 0.01773657463490963 2023-01-22 20:42:05.543654: step: 976/527, loss: 0.00028582499362528324 2023-01-22 20:42:06.595712: step: 980/527, loss: 0.004120856523513794 2023-01-22 20:42:07.654934: step: 984/527, loss: 0.003703651251271367 2023-01-22 20:42:08.708237: step: 988/527, loss: 0.0038197690155357122 2023-01-22 20:42:09.758284: step: 992/527, loss: 0.0008174747345037758 2023-01-22 20:42:10.802498: step: 996/527, loss: 0.013780878856778145 
2023-01-22 20:42:11.860004: step: 1000/527, loss: 0.009317596442997456 2023-01-22 20:42:12.908355: step: 1004/527, loss: 0.010493717156350613 2023-01-22 20:42:13.962737: step: 1008/527, loss: 0.019394738599658012 2023-01-22 20:42:14.995838: step: 1012/527, loss: 0.02577967382967472 2023-01-22 20:42:16.053519: step: 1016/527, loss: 0.005452234763652086 2023-01-22 20:42:17.110533: step: 1020/527, loss: 0.010642754845321178 2023-01-22 20:42:18.155751: step: 1024/527, loss: 0.0095218475908041 2023-01-22 20:42:19.252454: step: 1028/527, loss: 0.04922255128622055 2023-01-22 20:42:20.290556: step: 1032/527, loss: 0.00043056276626884937 2023-01-22 20:42:21.318750: step: 1036/527, loss: 0.003438494633883238 2023-01-22 20:42:22.362256: step: 1040/527, loss: 0.004783006850630045 2023-01-22 20:42:23.411628: step: 1044/527, loss: 0.01521259918808937 2023-01-22 20:42:24.451408: step: 1048/527, loss: 0.006692052818834782 2023-01-22 20:42:25.522233: step: 1052/527, loss: 0.0197843499481678 2023-01-22 20:42:26.562712: step: 1056/527, loss: 0.004113410599529743 2023-01-22 20:42:27.614898: step: 1060/527, loss: 0.018774723634123802 2023-01-22 20:42:28.666621: step: 1064/527, loss: 0.01182416919618845 2023-01-22 20:42:29.726854: step: 1068/527, loss: 0.006834171712398529 2023-01-22 20:42:30.779104: step: 1072/527, loss: 0.005259782075881958 2023-01-22 20:42:31.828277: step: 1076/527, loss: 0.0082195233553648 2023-01-22 20:42:32.877041: step: 1080/527, loss: 0.010673260316252708 2023-01-22 20:42:33.920950: step: 1084/527, loss: 0.010740244761109352 2023-01-22 20:42:34.980830: step: 1088/527, loss: 0.03633970767259598 2023-01-22 20:42:36.032043: step: 1092/527, loss: 0.004128754138946533 2023-01-22 20:42:37.069792: step: 1096/527, loss: 0.009518708102405071 2023-01-22 20:42:38.118722: step: 1100/527, loss: 0.012892210856080055 2023-01-22 20:42:39.163400: step: 1104/527, loss: 0.003520045895129442 2023-01-22 20:42:40.219794: step: 1108/527, loss: 0.01902119815349579 2023-01-22 
20:42:41.265695: step: 1112/527, loss: 0.008588296361267567 2023-01-22 20:42:42.325002: step: 1116/527, loss: 0.0016149862203747034 2023-01-22 20:42:43.359605: step: 1120/527, loss: 0.009680958464741707 2023-01-22 20:42:44.407639: step: 1124/527, loss: 0.004695790354162455 2023-01-22 20:42:45.461517: step: 1128/527, loss: 0.017347292974591255 2023-01-22 20:42:46.526994: step: 1132/527, loss: 0.004620944615453482 2023-01-22 20:42:47.570706: step: 1136/527, loss: 0.02449961006641388 2023-01-22 20:42:48.627481: step: 1140/527, loss: 0.012774487026035786 2023-01-22 20:42:49.709700: step: 1144/527, loss: 0.05937916412949562 2023-01-22 20:42:50.752486: step: 1148/527, loss: 0.011631275527179241 2023-01-22 20:42:51.824985: step: 1152/527, loss: 0.012058062478899956 2023-01-22 20:42:52.861353: step: 1156/527, loss: 0.008152112364768982 2023-01-22 20:42:53.911575: step: 1160/527, loss: 0.008222196251153946 2023-01-22 20:42:54.951816: step: 1164/527, loss: 0.0014271700056269765 2023-01-22 20:42:56.020262: step: 1168/527, loss: 0.01853320747613907 2023-01-22 20:42:57.062927: step: 1172/527, loss: 0.003556971438229084 2023-01-22 20:42:58.106195: step: 1176/527, loss: 0.008396762423217297 2023-01-22 20:42:59.167066: step: 1180/527, loss: 0.011722382158041 2023-01-22 20:43:00.218967: step: 1184/527, loss: 0.007182334549725056 2023-01-22 20:43:01.264484: step: 1188/527, loss: 0.006055328529328108 2023-01-22 20:43:02.307903: step: 1192/527, loss: 0.036495935171842575 2023-01-22 20:43:03.346413: step: 1196/527, loss: 0.02186264470219612 2023-01-22 20:43:04.398157: step: 1200/527, loss: 0.01584651879966259 2023-01-22 20:43:05.452601: step: 1204/527, loss: 0.010809837840497494 2023-01-22 20:43:06.502806: step: 1208/527, loss: 0.02692943997681141 2023-01-22 20:43:07.564484: step: 1212/527, loss: 0.009454113431274891 2023-01-22 20:43:08.608411: step: 1216/527, loss: 0.027783891186118126 2023-01-22 20:43:09.648094: step: 1220/527, loss: 0.004693866707384586 2023-01-22 20:43:10.718222: 
step: 1224/527, loss: 0.00023963444982655346 2023-01-22 20:43:11.748424: step: 1228/527, loss: 0.0038243255112320185 2023-01-22 20:43:12.792817: step: 1232/527, loss: 0.02387555129826069 2023-01-22 20:43:13.841867: step: 1236/527, loss: 0.014292039908468723 2023-01-22 20:43:14.901615: step: 1240/527, loss: 0.004833715967833996 2023-01-22 20:43:15.943556: step: 1244/527, loss: 0.002711265115067363 2023-01-22 20:43:17.011963: step: 1248/527, loss: 0.007548751775175333 2023-01-22 20:43:18.072681: step: 1252/527, loss: 0.00460277684032917 2023-01-22 20:43:19.133255: step: 1256/527, loss: 0.004603673703968525 2023-01-22 20:43:20.201339: step: 1260/527, loss: 0.007028146181255579 2023-01-22 20:43:21.240851: step: 1264/527, loss: 0.010227174498140812 2023-01-22 20:43:22.308794: step: 1268/527, loss: 0.004150801338255405 2023-01-22 20:43:23.353795: step: 1272/527, loss: 0.007186429109424353 2023-01-22 20:43:24.412252: step: 1276/527, loss: 0.02736535668373108 2023-01-22 20:43:25.477990: step: 1280/527, loss: 0.005742392502725124 2023-01-22 20:43:26.525223: step: 1284/527, loss: 0.005569592118263245 2023-01-22 20:43:27.588581: step: 1288/527, loss: 0.0033510392531752586 2023-01-22 20:43:28.637641: step: 1292/527, loss: 0.005017543211579323 2023-01-22 20:43:29.678925: step: 1296/527, loss: 0.01486192923039198 2023-01-22 20:43:30.751130: step: 1300/527, loss: 0.00042533804662525654 2023-01-22 20:43:31.798419: step: 1304/527, loss: 0.0162766445428133 2023-01-22 20:43:32.863146: step: 1308/527, loss: 0.008644442074000835 2023-01-22 20:43:33.908759: step: 1312/527, loss: 0.004043469671159983 2023-01-22 20:43:34.958211: step: 1316/527, loss: 0.016971854493021965 2023-01-22 20:43:36.009433: step: 1320/527, loss: 0.02391149289906025 2023-01-22 20:43:37.061936: step: 1324/527, loss: 0.0059237172827124596 2023-01-22 20:43:38.098883: step: 1328/527, loss: 0.040738124400377274 2023-01-22 20:43:39.165202: step: 1332/527, loss: 0.005882258526980877 2023-01-22 20:43:40.208911: step: 
1336/527, loss: 0.006508353166282177 2023-01-22 20:43:41.268818: step: 1340/527, loss: 0.004140972625464201 2023-01-22 20:43:42.324004: step: 1344/527, loss: 0.0037541233468800783 2023-01-22 20:43:43.381097: step: 1348/527, loss: 0.006124555133283138 2023-01-22 20:43:44.417294: step: 1352/527, loss: 0.0029107732698321342 2023-01-22 20:43:45.462729: step: 1356/527, loss: 0.06534484028816223 2023-01-22 20:43:46.507476: step: 1360/527, loss: 0.00041893587331287563 2023-01-22 20:43:47.537536: step: 1364/527, loss: 0.0002086635649902746 2023-01-22 20:43:48.599825: step: 1368/527, loss: 0.010120702907443047 2023-01-22 20:43:49.649751: step: 1372/527, loss: 0.011724220588803291 2023-01-22 20:43:50.690961: step: 1376/527, loss: 0.0452674962580204 2023-01-22 20:43:51.755196: step: 1380/527, loss: 0.004746389575302601 2023-01-22 20:43:52.795084: step: 1384/527, loss: 0.00955195352435112 2023-01-22 20:43:53.839191: step: 1388/527, loss: 0.0017424571560695767 2023-01-22 20:43:54.881681: step: 1392/527, loss: 0.004315395839512348 2023-01-22 20:43:55.944015: step: 1396/527, loss: 0.007179904729127884 2023-01-22 20:43:57.010230: step: 1400/527, loss: 0.006254470907151699 2023-01-22 20:43:58.076294: step: 1404/527, loss: 0.011929718777537346 2023-01-22 20:43:59.133093: step: 1408/527, loss: 0.02120901457965374 2023-01-22 20:44:00.193031: step: 1412/527, loss: 0.01178006362169981 2023-01-22 20:44:01.244846: step: 1416/527, loss: 0.013863730244338512 2023-01-22 20:44:02.296545: step: 1420/527, loss: 0.029085958376526833 2023-01-22 20:44:03.352614: step: 1424/527, loss: 0.014001265168190002 2023-01-22 20:44:04.420272: step: 1428/527, loss: 0.010790448635816574 2023-01-22 20:44:05.478433: step: 1432/527, loss: 0.008747573010623455 2023-01-22 20:44:06.530210: step: 1436/527, loss: 0.03057769685983658 2023-01-22 20:44:07.566102: step: 1440/527, loss: 0.002285042544826865 2023-01-22 20:44:08.616578: step: 1444/527, loss: 0.017266442999243736 2023-01-22 20:44:09.647564: step: 1448/527, 
loss: 0.0036021436098963022 2023-01-22 20:44:10.685521: step: 1452/527, loss: 0.009646909311413765 2023-01-22 20:44:11.717516: step: 1456/527, loss: 0.0003939413873013109 2023-01-22 20:44:12.742280: step: 1460/527, loss: 0.016654180362820625 2023-01-22 20:44:13.796651: step: 1464/527, loss: 0.038487453013658524 2023-01-22 20:44:14.850149: step: 1468/527, loss: 0.017812369391322136 2023-01-22 20:44:15.905199: step: 1472/527, loss: 0.020881012082099915 2023-01-22 20:44:16.943764: step: 1476/527, loss: 0.01195050310343504 2023-01-22 20:44:17.988489: step: 1480/527, loss: 0.008777450770139694 2023-01-22 20:44:19.039440: step: 1484/527, loss: 0.011199723929166794 2023-01-22 20:44:20.120444: step: 1488/527, loss: 0.0025973960291594267 2023-01-22 20:44:21.173915: step: 1492/527, loss: 0.0035221234429627657 2023-01-22 20:44:22.223364: step: 1496/527, loss: 0.02600690722465515 2023-01-22 20:44:23.268967: step: 1500/527, loss: 0.010533664375543594 2023-01-22 20:44:24.325263: step: 1504/527, loss: 0.0397987961769104 2023-01-22 20:44:25.362489: step: 1508/527, loss: 0.018840234726667404 2023-01-22 20:44:26.401218: step: 1512/527, loss: 0.0033310847356915474 2023-01-22 20:44:27.452450: step: 1516/527, loss: 0.0038425689563155174 2023-01-22 20:44:28.506784: step: 1520/527, loss: 0.003512369003146887 2023-01-22 20:44:29.560561: step: 1524/527, loss: 0.005522511899471283 2023-01-22 20:44:30.619295: step: 1528/527, loss: 0.002252943115308881 2023-01-22 20:44:31.650439: step: 1532/527, loss: 0.018201924860477448 2023-01-22 20:44:32.724407: step: 1536/527, loss: 0.0021684232633560896 2023-01-22 20:44:33.770823: step: 1540/527, loss: 0.006839650683104992 2023-01-22 20:44:34.816555: step: 1544/527, loss: 0.006699761375784874 2023-01-22 20:44:35.874320: step: 1548/527, loss: 0.002429414540529251 2023-01-22 20:44:36.939615: step: 1552/527, loss: 0.00799989141523838 2023-01-22 20:44:37.991324: step: 1556/527, loss: 0.007017431780695915 2023-01-22 20:44:39.033019: step: 1560/527, loss: 
0.008746856823563576 2023-01-22 20:44:40.076649: step: 1564/527, loss: 0.009376038797199726 2023-01-22 20:44:41.130961: step: 1568/527, loss: 0.01515976246446371 2023-01-22 20:44:42.175611: step: 1572/527, loss: 0.008484846912324429 2023-01-22 20:44:43.231590: step: 1576/527, loss: 0.0073716407641768456 2023-01-22 20:44:44.273903: step: 1580/527, loss: 0.002795808482915163 2023-01-22 20:44:45.343882: step: 1584/527, loss: 0.05331748351454735 2023-01-22 20:44:46.396113: step: 1588/527, loss: 0.004696378484368324 2023-01-22 20:44:47.449927: step: 1592/527, loss: 0.028546925634145737 2023-01-22 20:44:48.492043: step: 1596/527, loss: 0.023770490661263466 2023-01-22 20:44:49.555157: step: 1600/527, loss: 0.016821762546896935 2023-01-22 20:44:50.588340: step: 1604/527, loss: 0.005958883091807365 2023-01-22 20:44:51.636515: step: 1608/527, loss: 0.011343484744429588 2023-01-22 20:44:52.692506: step: 1612/527, loss: 0.04513514041900635 2023-01-22 20:44:53.749687: step: 1616/527, loss: 0.002865853952243924 2023-01-22 20:44:54.793649: step: 1620/527, loss: 0.020046016201376915 2023-01-22 20:44:55.836548: step: 1624/527, loss: 0.005779031198471785 2023-01-22 20:44:56.881216: step: 1628/527, loss: 0.024139486253261566 2023-01-22 20:44:57.952096: step: 1632/527, loss: 0.009461689740419388 2023-01-22 20:44:59.021695: step: 1636/527, loss: 0.018036510795354843 2023-01-22 20:45:00.056462: step: 1640/527, loss: 0.0014199659926816821 2023-01-22 20:45:01.097564: step: 1644/527, loss: 0.002625318244099617 2023-01-22 20:45:02.154295: step: 1648/527, loss: 0.011786160059273243 2023-01-22 20:45:03.200674: step: 1652/527, loss: 0.0008031951729208231 2023-01-22 20:45:04.247636: step: 1656/527, loss: 0.0030371220782399178 2023-01-22 20:45:05.296397: step: 1660/527, loss: 0.03089003451168537 2023-01-22 20:45:06.347608: step: 1664/527, loss: 0.002912268042564392 2023-01-22 20:45:07.423731: step: 1668/527, loss: 0.016414275392889977 2023-01-22 20:45:08.470630: step: 1672/527, loss: 
0.0059144445694983006 2023-01-22 20:45:09.508167: step: 1676/527, loss: 0.005163245834410191 2023-01-22 20:45:10.562998: step: 1680/527, loss: 0.017710890620946884 2023-01-22 20:45:11.603640: step: 1684/527, loss: 0.001954286592081189 2023-01-22 20:45:12.636533: step: 1688/527, loss: 0.00947886798530817 2023-01-22 20:45:13.671491: step: 1692/527, loss: 0.0005294825532473624 2023-01-22 20:45:14.709247: step: 1696/527, loss: 0.010331869125366211 2023-01-22 20:45:15.746756: step: 1700/527, loss: 0.016443584114313126 2023-01-22 20:45:16.784937: step: 1704/527, loss: 0.003135921899229288 2023-01-22 20:45:17.847458: step: 1708/527, loss: 0.0030218912288546562 2023-01-22 20:45:18.884967: step: 1712/527, loss: 0.005403540097177029 2023-01-22 20:45:19.957900: step: 1716/527, loss: 0.003915396519005299 2023-01-22 20:45:21.003820: step: 1720/527, loss: 0.00950311403721571 2023-01-22 20:45:22.047779: step: 1724/527, loss: 0.00293614505790174 2023-01-22 20:45:23.101774: step: 1728/527, loss: 0.0018410051707178354 2023-01-22 20:45:24.138018: step: 1732/527, loss: 0.0005031333421356976 2023-01-22 20:45:25.190927: step: 1736/527, loss: 0.01653796061873436 2023-01-22 20:45:26.232592: step: 1740/527, loss: 0.012764845974743366 2023-01-22 20:45:27.279942: step: 1744/527, loss: 0.006373863201588392 2023-01-22 20:45:28.339313: step: 1748/527, loss: 0.006958916783332825 2023-01-22 20:45:29.393724: step: 1752/527, loss: 0.007993902079761028 2023-01-22 20:45:30.430547: step: 1756/527, loss: 0.0008073113858699799 2023-01-22 20:45:31.480877: step: 1760/527, loss: 0.0011528790928423405 2023-01-22 20:45:32.524564: step: 1764/527, loss: 0.002510812832042575 2023-01-22 20:45:33.573670: step: 1768/527, loss: 0.010897216387093067 2023-01-22 20:45:34.648710: step: 1772/527, loss: 0.02104314975440502 2023-01-22 20:45:35.677253: step: 1776/527, loss: 0.0071724653244018555 2023-01-22 20:45:36.734067: step: 1780/527, loss: 0.006791979540139437 2023-01-22 20:45:37.769312: step: 1784/527, loss: 
0.004590075463056564 2023-01-22 20:45:38.823099: step: 1788/527, loss: 0.0061460682190954685 2023-01-22 20:45:39.883034: step: 1792/527, loss: 0.009351025335490704 2023-01-22 20:45:40.934415: step: 1796/527, loss: 0.00035061786184087396 2023-01-22 20:45:41.998206: step: 1800/527, loss: 0.004905285779386759 2023-01-22 20:45:43.059996: step: 1804/527, loss: 0.005148687399923801 2023-01-22 20:45:44.114712: step: 1808/527, loss: 0.009701180271804333 2023-01-22 20:45:45.152849: step: 1812/527, loss: 0.0033806965220719576 2023-01-22 20:45:46.222169: step: 1816/527, loss: 0.006968039553612471 2023-01-22 20:45:47.263470: step: 1820/527, loss: 0.015023082494735718 2023-01-22 20:45:48.304267: step: 1824/527, loss: 0.006000255234539509 2023-01-22 20:45:49.366728: step: 1828/527, loss: 0.008121452294290066 2023-01-22 20:45:50.415400: step: 1832/527, loss: 0.05960588529706001 2023-01-22 20:45:51.451708: step: 1836/527, loss: 0.0023274431005120277 2023-01-22 20:45:52.501570: step: 1840/527, loss: 0.039414361119270325 2023-01-22 20:45:53.562944: step: 1844/527, loss: 0.00644992059096694 2023-01-22 20:45:54.608871: step: 1848/527, loss: 0.006524163763970137 2023-01-22 20:45:55.677390: step: 1852/527, loss: 0.01013212651014328 2023-01-22 20:45:56.731642: step: 1856/527, loss: 0.004018633626401424 2023-01-22 20:45:57.779615: step: 1860/527, loss: 0.014596362598240376 2023-01-22 20:45:58.820850: step: 1864/527, loss: 0.025225821882486343 2023-01-22 20:45:59.873023: step: 1868/527, loss: 0.005242825020104647 2023-01-22 20:46:00.918047: step: 1872/527, loss: 0.01516877394169569 2023-01-22 20:46:01.972805: step: 1876/527, loss: 0.014488687738776207 2023-01-22 20:46:03.007229: step: 1880/527, loss: 0.005859555676579475 2023-01-22 20:46:04.054614: step: 1884/527, loss: 0.007024526596069336 2023-01-22 20:46:05.109101: step: 1888/527, loss: 0.015668893232941628 2023-01-22 20:46:06.155511: step: 1892/527, loss: 0.000999676762148738 2023-01-22 20:46:07.218177: step: 1896/527, loss: 
0.0033800743985921144 2023-01-22 20:46:08.251218: step: 1900/527, loss: 0.013505561277270317 2023-01-22 20:46:09.304294: step: 1904/527, loss: 0.005636982619762421 2023-01-22 20:46:10.356162: step: 1908/527, loss: 0.012868654914200306 2023-01-22 20:46:11.433362: step: 1912/527, loss: 0.0019239624962210655 2023-01-22 20:46:12.476380: step: 1916/527, loss: 0.012410313822329044 2023-01-22 20:46:13.525012: step: 1920/527, loss: 0.0013808478834107518 2023-01-22 20:46:14.571801: step: 1924/527, loss: 0.0061982120387256145 2023-01-22 20:46:15.607240: step: 1928/527, loss: 0.004160887096077204 2023-01-22 20:46:16.684359: step: 1932/527, loss: 0.007977792993187904 2023-01-22 20:46:17.730125: step: 1936/527, loss: 0.006800683680921793 2023-01-22 20:46:18.763444: step: 1940/527, loss: 0.0003702831454575062 2023-01-22 20:46:19.817614: step: 1944/527, loss: 0.008752088062465191 2023-01-22 20:46:20.874967: step: 1948/527, loss: 0.006395608186721802 2023-01-22 20:46:21.931932: step: 1952/527, loss: 0.007251636125147343 2023-01-22 20:46:22.978129: step: 1956/527, loss: 0.0034479154273867607 2023-01-22 20:46:24.026464: step: 1960/527, loss: 0.005253272131085396 2023-01-22 20:46:25.074697: step: 1964/527, loss: 0.0028642623219639063 2023-01-22 20:46:26.126041: step: 1968/527, loss: 0.0029721830505877733 2023-01-22 20:46:27.183108: step: 1972/527, loss: 0.003342901123687625 2023-01-22 20:46:28.257498: step: 1976/527, loss: 0.08128899335861206 2023-01-22 20:46:29.316683: step: 1980/527, loss: 0.00876810122281313 2023-01-22 20:46:30.375977: step: 1984/527, loss: 0.0020359300542622805 2023-01-22 20:46:31.418565: step: 1988/527, loss: 0.0035185536835342646 2023-01-22 20:46:32.469470: step: 1992/527, loss: 0.006642031483352184 2023-01-22 20:46:33.523260: step: 1996/527, loss: 0.0020395193714648485 2023-01-22 20:46:34.605927: step: 2000/527, loss: 0.006527036428451538 2023-01-22 20:46:35.645463: step: 2004/527, loss: 0.00270862621255219 2023-01-22 20:46:36.705286: step: 2008/527, loss: 
0.022922977805137634 2023-01-22 20:46:37.755357: step: 2012/527, loss: 0.002328132512047887 2023-01-22 20:46:38.812706: step: 2016/527, loss: 0.000278402934782207 2023-01-22 20:46:39.858715: step: 2020/527, loss: 0.0019245247822254896 2023-01-22 20:46:40.914337: step: 2024/527, loss: 0.0017612857045605779 2023-01-22 20:46:41.965984: step: 2028/527, loss: 0.00990452989935875 2023-01-22 20:46:43.016510: step: 2032/527, loss: 0.005405002739280462 2023-01-22 20:46:44.064021: step: 2036/527, loss: 0.016722535714507103 2023-01-22 20:46:45.122266: step: 2040/527, loss: 0.02127009630203247 2023-01-22 20:46:46.169354: step: 2044/527, loss: 0.00259059458039701 2023-01-22 20:46:47.215636: step: 2048/527, loss: 0.012533239088952541 2023-01-22 20:46:48.260921: step: 2052/527, loss: 0.012309527024626732 2023-01-22 20:46:49.317435: step: 2056/527, loss: 0.0042281183414161205 2023-01-22 20:46:50.360797: step: 2060/527, loss: 0.0022847955115139484 2023-01-22 20:46:51.420471: step: 2064/527, loss: 0.00895863026380539 2023-01-22 20:46:52.472760: step: 2068/527, loss: 0.012750847265124321 2023-01-22 20:46:53.523022: step: 2072/527, loss: 0.007263650186359882 2023-01-22 20:46:54.574971: step: 2076/527, loss: 0.006345091387629509 2023-01-22 20:46:55.613044: step: 2080/527, loss: 0.004227068740874529 2023-01-22 20:46:56.662444: step: 2084/527, loss: 0.0013252180069684982 2023-01-22 20:46:57.720402: step: 2088/527, loss: 0.014781714417040348 2023-01-22 20:46:58.756648: step: 2092/527, loss: 0.002819240093231201 2023-01-22 20:46:59.801594: step: 2096/527, loss: 0.00046975380973890424 2023-01-22 20:47:00.842025: step: 2100/527, loss: 0.015434990637004375 2023-01-22 20:47:01.893377: step: 2104/527, loss: 0.0014861068921163678 2023-01-22 20:47:02.946299: step: 2108/527, loss: 0.005742221605032682 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 
0.32602840909090913, 'r': 0.3402573529411765, 'f1': 0.33299094707520893}, 'combined': 0.24536175047646971, 'stategy': 1, 'epoch': 6} Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3386648798144771, 'r': 0.30941654928504503, 'f1': 0.3233807165924461}, 'combined': 0.20696365861916546, 'stategy': 1, 'epoch': 6} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32227130838934737, 'r': 0.3571279774181762, 'f1': 0.33880547992687465}, 'combined': 0.2496461431040129, 'stategy': 1, 'epoch': 6} Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.357010474361371, 'r': 0.31416921743800647, 'f1': 0.33422257174256004}, 'combined': 0.21390244591523838, 'stategy': 1, 'epoch': 6} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33228787707526664, 'r': 0.33291840435624437, 'f1': 0.33260284188766026}, 'combined': 0.2450757782330128, 'stategy': 1, 'epoch': 6} Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3655014852890142, 'r': 0.2969907428235575, 'f1': 0.3277036409267968}, 'combined': 0.2349573274569487, 'stategy': 1, 'epoch': 6} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 6} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 6} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 
'epoch': 6} New best russian model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0} Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0} Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33228787707526664, 'r': 0.33291840435624437, 'f1': 0.33260284188766026}, 'combined': 0.2450757782330128, 'stategy': 1, 'epoch': 6} Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3655014852890142, 'r': 0.2969907428235575, 'f1': 
0.3277036409267968}, 'combined': 0.2349573274569487, 'stategy': 1, 'epoch': 6} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 6} ****************************** Epoch: 7 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 20:49:41.112211: step: 4/527, loss: 0.0011762861395254731 2023-01-22 20:49:42.164146: step: 8/527, loss: 0.005770666524767876 2023-01-22 20:49:43.205145: step: 12/527, loss: 0.0058799730613827705 2023-01-22 20:49:44.239668: step: 16/527, loss: 0.011273756623268127 2023-01-22 20:49:45.288704: step: 20/527, loss: 0.0026257140561938286 2023-01-22 20:49:46.322600: step: 24/527, loss: 0.00032926027779467404 2023-01-22 20:49:47.360166: step: 28/527, loss: 0.0006809047190472484 2023-01-22 20:49:48.412972: step: 32/527, loss: 0.004259904380887747 2023-01-22 20:49:49.478962: step: 36/527, loss: 0.0050410921685397625 2023-01-22 20:49:50.513816: step: 40/527, loss: 0.018239814788103104 2023-01-22 20:49:51.551027: step: 44/527, loss: 0.004635321907699108 2023-01-22 20:49:52.593179: step: 48/527, loss: 0.023965172469615936 2023-01-22 20:49:53.630049: step: 52/527, loss: 0.0016597952926531434 2023-01-22 20:49:54.670209: step: 56/527, loss: 0.00301774637773633 2023-01-22 20:49:55.722606: step: 60/527, loss: 0.008930178359150887 2023-01-22 20:49:56.768914: step: 64/527, loss: 0.0018494927790015936 2023-01-22 20:49:57.831255: step: 68/527, loss: 0.004239272326231003 2023-01-22 20:49:58.891718: step: 72/527, loss: 0.006199757102876902 2023-01-22 20:49:59.928423: step: 76/527, loss: 0.009897388517856598 2023-01-22 20:50:00.964230: step: 80/527, loss: 0.0020823294762521982 2023-01-22 20:50:02.020227: step: 84/527, loss: 
0.005397001747041941 2023-01-22 20:50:03.064912: step: 88/527, loss: 0.0019322435837239027 2023-01-22 20:50:04.100509: step: 92/527, loss: 0.00641339085996151 2023-01-22 20:50:05.138549: step: 96/527, loss: 0.002924043918028474 2023-01-22 20:50:06.200650: step: 100/527, loss: 0.0010368796065449715 2023-01-22 20:50:07.242204: step: 104/527, loss: 0.0012904191389679909 2023-01-22 20:50:08.290875: step: 108/527, loss: 0.011828163638710976 2023-01-22 20:50:09.343185: step: 112/527, loss: 0.007874327711760998 2023-01-22 20:50:10.395094: step: 116/527, loss: 0.03741808608174324 2023-01-22 20:50:11.438618: step: 120/527, loss: 0.0041434974409639835 2023-01-22 20:50:12.482588: step: 124/527, loss: 0.015231378376483917 2023-01-22 20:50:13.524940: step: 128/527, loss: 0.0068403431214392185 2023-01-22 20:50:14.560413: step: 132/527, loss: 0.01907469518482685 2023-01-22 20:50:15.624822: step: 136/527, loss: 0.0038294813130050898 2023-01-22 20:50:16.679282: step: 140/527, loss: 0.0294569730758667 2023-01-22 20:50:17.737622: step: 144/527, loss: 0.016039030626416206 2023-01-22 20:50:18.785064: step: 148/527, loss: 0.004499699920415878 2023-01-22 20:50:19.857106: step: 152/527, loss: 0.004218806512653828 2023-01-22 20:50:20.902098: step: 156/527, loss: 0.003108569886535406 2023-01-22 20:50:21.939679: step: 160/527, loss: 0.0011567205656319857 2023-01-22 20:50:22.974967: step: 164/527, loss: 0.012250793166458607 2023-01-22 20:50:24.023371: step: 168/527, loss: 0.001836702460423112 2023-01-22 20:50:25.086107: step: 172/527, loss: 0.015831436961889267 2023-01-22 20:50:26.133649: step: 176/527, loss: 0.058130908757448196 2023-01-22 20:50:27.170761: step: 180/527, loss: 0.012560397386550903 2023-01-22 20:50:28.211031: step: 184/527, loss: 0.011316278018057346 2023-01-22 20:50:29.257807: step: 188/527, loss: 0.006172460038214922 2023-01-22 20:50:30.300758: step: 192/527, loss: 0.004307650029659271 2023-01-22 20:50:31.331954: step: 196/527, loss: 0.017391672357916832 2023-01-22 
20:50:32.389466: step: 200/527, loss: 0.029506081715226173 2023-01-22 20:50:33.444210: step: 204/527, loss: 0.010138626210391521 2023-01-22 20:50:34.503074: step: 208/527, loss: 0.005423405673354864 2023-01-22 20:50:35.555487: step: 212/527, loss: 0.012503663077950478 2023-01-22 20:50:36.613326: step: 216/527, loss: 0.0118255615234375 2023-01-22 20:50:37.655464: step: 220/527, loss: 0.0021275379694998264 2023-01-22 20:50:38.703804: step: 224/527, loss: 0.005360076203942299 2023-01-22 20:50:39.757731: step: 228/527, loss: 0.004129832144826651 2023-01-22 20:50:40.798737: step: 232/527, loss: 0.003537638345733285 2023-01-22 20:50:41.852191: step: 236/527, loss: 0.016127267852425575 2023-01-22 20:50:42.896945: step: 240/527, loss: 0.008914195001125336 2023-01-22 20:50:43.936474: step: 244/527, loss: 0.006991427391767502 2023-01-22 20:50:44.989768: step: 248/527, loss: 0.014179665595293045 2023-01-22 20:50:46.027901: step: 252/527, loss: 0.0028274371288716793 2023-01-22 20:50:47.071027: step: 256/527, loss: 0.02464323490858078 2023-01-22 20:50:48.133938: step: 260/527, loss: 0.004434527363628149 2023-01-22 20:50:49.173482: step: 264/527, loss: 0.004369425121694803 2023-01-22 20:50:50.200292: step: 268/527, loss: 0.001455672667361796 2023-01-22 20:50:51.238788: step: 272/527, loss: 0.0025727252941578627 2023-01-22 20:50:52.297585: step: 276/527, loss: 0.015328909270465374 2023-01-22 20:50:53.342929: step: 280/527, loss: 0.0028654129710048437 2023-01-22 20:50:54.385509: step: 284/527, loss: 0.0019427868537604809 2023-01-22 20:50:55.473866: step: 288/527, loss: 0.004314064979553223 2023-01-22 20:50:56.526399: step: 292/527, loss: 0.004622200969606638 2023-01-22 20:50:57.582660: step: 296/527, loss: 0.001130652497522533 2023-01-22 20:50:58.639204: step: 300/527, loss: 0.004627163987606764 2023-01-22 20:50:59.685044: step: 304/527, loss: 0.006596438121050596 2023-01-22 20:51:00.724599: step: 308/527, loss: 0.0031702998094260693 2023-01-22 20:51:01.765265: step: 312/527, 
loss: 0.05189930647611618 2023-01-22 20:51:02.818415: step: 316/527, loss: 0.014070438221096992 2023-01-22 20:51:03.877135: step: 320/527, loss: 0.033695101737976074 2023-01-22 20:51:04.941040: step: 324/527, loss: 0.0010572454193606973 2023-01-22 20:51:05.988071: step: 328/527, loss: 0.0032614138908684254 2023-01-22 20:51:07.036741: step: 332/527, loss: 0.002125186612829566 2023-01-22 20:51:08.090974: step: 336/527, loss: 0.013929967768490314 2023-01-22 20:51:09.140133: step: 340/527, loss: 0.007350783795118332 2023-01-22 20:51:10.189440: step: 344/527, loss: 0.0029599317349493504 2023-01-22 20:51:11.230830: step: 348/527, loss: 0.0033247913233935833 2023-01-22 20:51:12.269284: step: 352/527, loss: 0.022639818489551544 2023-01-22 20:51:13.321290: step: 356/527, loss: 0.03472037985920906 2023-01-22 20:51:14.361112: step: 360/527, loss: 0.009670925326645374 2023-01-22 20:51:15.423477: step: 364/527, loss: 0.007359437178820372 2023-01-22 20:51:16.458659: step: 368/527, loss: 0.0007571582100354135 2023-01-22 20:51:17.507582: step: 372/527, loss: 0.01009386032819748 2023-01-22 20:51:18.563255: step: 376/527, loss: 0.011879819445312023 2023-01-22 20:51:19.611317: step: 380/527, loss: 0.002377040684223175 2023-01-22 20:51:20.661397: step: 384/527, loss: 0.004287994932383299 2023-01-22 20:51:21.718873: step: 388/527, loss: 0.014191503636538982 2023-01-22 20:51:22.768512: step: 392/527, loss: 0.005081395152956247 2023-01-22 20:51:23.812310: step: 396/527, loss: 0.005458964500576258 2023-01-22 20:51:24.862582: step: 400/527, loss: 0.011647974140942097 2023-01-22 20:51:25.934572: step: 404/527, loss: 0.04495551809668541 2023-01-22 20:51:26.984902: step: 408/527, loss: 0.001693369005806744 2023-01-22 20:51:28.024319: step: 412/527, loss: 0.0005285778315737844 2023-01-22 20:51:29.100040: step: 416/527, loss: 0.011131852865219116 2023-01-22 20:51:30.174890: step: 420/527, loss: 0.015228178352117538 2023-01-22 20:51:31.226299: step: 424/527, loss: 0.004915312398225069 2023-01-22 
20:51:32.274045: step: 428/527, loss: 0.0009033419191837311 2023-01-22 20:51:33.322296: step: 432/527, loss: 0.005241208244115114 2023-01-22 20:51:34.384805: step: 436/527, loss: 0.004751232452690601 2023-01-22 20:51:35.440068: step: 440/527, loss: 0.002280104672536254 2023-01-22 20:51:36.470208: step: 444/527, loss: 0.00037919796886853874 2023-01-22 20:51:37.536649: step: 448/527, loss: 0.008740345016121864 2023-01-22 20:51:38.582390: step: 452/527, loss: 0.005086951889097691 2023-01-22 20:51:39.637810: step: 456/527, loss: 0.001967105781659484 2023-01-22 20:51:40.660835: step: 460/527, loss: 0.018408872187137604 2023-01-22 20:51:41.711230: step: 464/527, loss: 0.016983093693852425 2023-01-22 20:51:42.771950: step: 468/527, loss: 0.004702928941696882 2023-01-22 20:51:43.805866: step: 472/527, loss: 0.0006862246082164347 2023-01-22 20:51:44.859061: step: 476/527, loss: 0.015025509521365166 2023-01-22 20:51:45.918768: step: 480/527, loss: 0.0058099376037716866 2023-01-22 20:51:46.975244: step: 484/527, loss: 3.218735218979418e-05 2023-01-22 20:51:48.024532: step: 488/527, loss: 0.017115792259573936 2023-01-22 20:51:49.077904: step: 492/527, loss: 0.021023690700531006 2023-01-22 20:51:50.135333: step: 496/527, loss: 0.0056746965274214745 2023-01-22 20:51:51.200862: step: 500/527, loss: 0.0018623566720634699 2023-01-22 20:51:52.276451: step: 504/527, loss: 0.020436029881238937 2023-01-22 20:51:53.318922: step: 508/527, loss: 0.0036087525077164173 2023-01-22 20:51:54.366265: step: 512/527, loss: 0.0015722350217401981 2023-01-22 20:51:55.402636: step: 516/527, loss: 0.0060618286952376366 2023-01-22 20:51:56.454643: step: 520/527, loss: 0.014606723561882973 2023-01-22 20:51:57.507063: step: 524/527, loss: 0.03179058060050011 2023-01-22 20:51:58.559153: step: 528/527, loss: 0.007102565374225378 2023-01-22 20:51:59.610613: step: 532/527, loss: 0.002537408145144582 2023-01-22 20:52:00.649789: step: 536/527, loss: 0.0037960298359394073 2023-01-22 20:52:01.713360: step: 
540/527, loss: 0.0036290937568992376 2023-01-22 20:52:02.781893: step: 544/527, loss: 0.006896753795444965 2023-01-22 20:52:03.823952: step: 548/527, loss: 0.01382986456155777 2023-01-22 20:52:04.897997: step: 552/527, loss: 0.020476488396525383 2023-01-22 20:52:05.951177: step: 556/527, loss: 0.007108153309673071 2023-01-22 20:52:07.000325: step: 560/527, loss: 0.0004783706972375512 2023-01-22 20:52:08.046489: step: 564/527, loss: 0.002413227455690503 2023-01-22 20:52:09.099213: step: 568/527, loss: 0.005749106407165527 2023-01-22 20:52:10.162957: step: 572/527, loss: 0.0029543943237513304 2023-01-22 20:52:11.226600: step: 576/527, loss: 0.0054740700870752335 2023-01-22 20:52:12.279456: step: 580/527, loss: 0.004652662668377161 2023-01-22 20:52:13.346147: step: 584/527, loss: 0.009162668138742447 2023-01-22 20:52:14.399926: step: 588/527, loss: 0.01541721448302269 2023-01-22 20:52:15.440317: step: 592/527, loss: 0.004740078002214432 2023-01-22 20:52:16.521958: step: 596/527, loss: 0.014616711065173149 2023-01-22 20:52:17.564036: step: 600/527, loss: 0.0007777657592669129 2023-01-22 20:52:18.612644: step: 604/527, loss: 0.0008933142526075244 2023-01-22 20:52:19.706010: step: 608/527, loss: 0.015309763140976429 2023-01-22 20:52:20.782539: step: 612/527, loss: 0.020909041166305542 2023-01-22 20:52:21.822928: step: 616/527, loss: 0.00640073511749506 2023-01-22 20:52:22.883669: step: 620/527, loss: 0.005111007019877434 2023-01-22 20:52:23.931281: step: 624/527, loss: 0.019747916609048843 2023-01-22 20:52:24.986277: step: 628/527, loss: 0.008677047677338123 2023-01-22 20:52:26.025390: step: 632/527, loss: 0.001905033364892006 2023-01-22 20:52:27.093821: step: 636/527, loss: 0.011477400548756123 2023-01-22 20:52:28.154491: step: 640/527, loss: 0.013894579373300076 2023-01-22 20:52:29.220319: step: 644/527, loss: 0.001782936160452664 2023-01-22 20:52:30.264217: step: 648/527, loss: 0.008497907780110836 2023-01-22 20:52:31.313395: step: 652/527, loss: 0.005031628534197807 
2023-01-22 20:52:32.369369: step: 656/527, loss: 0.00017212475358974189 2023-01-22 20:52:33.433881: step: 660/527, loss: 0.007511868141591549 2023-01-22 20:52:34.492657: step: 664/527, loss: 0.006619738414883614 2023-01-22 20:52:35.547803: step: 668/527, loss: 0.006116640754044056 2023-01-22 20:52:36.633170: step: 672/527, loss: 0.004539389628916979 2023-01-22 20:52:37.693861: step: 676/527, loss: 0.010941772721707821 2023-01-22 20:52:38.748632: step: 680/527, loss: 0.012777542695403099 2023-01-22 20:52:39.801959: step: 684/527, loss: 0.028100110590457916 2023-01-22 20:52:40.855777: step: 688/527, loss: 0.0022654919885098934 2023-01-22 20:52:41.909903: step: 692/527, loss: 0.0014765422092750669 2023-01-22 20:52:42.947854: step: 696/527, loss: 0.0004512390587478876 2023-01-22 20:52:44.002903: step: 700/527, loss: 0.026108723133802414 2023-01-22 20:52:45.065484: step: 704/527, loss: 0.0021571919787675142 2023-01-22 20:52:46.107319: step: 708/527, loss: 0.034226201474666595 2023-01-22 20:52:47.168627: step: 712/527, loss: 0.015098475851118565 2023-01-22 20:52:48.220528: step: 716/527, loss: 0.03442610800266266 2023-01-22 20:52:49.284699: step: 720/527, loss: 0.004797043744474649 2023-01-22 20:52:50.342781: step: 724/527, loss: 2.5107931378443027e-07 2023-01-22 20:52:51.377045: step: 728/527, loss: 0.0004034257435705513 2023-01-22 20:52:52.438576: step: 732/527, loss: 0.0008951377240009606 2023-01-22 20:52:53.500198: step: 736/527, loss: 0.007962681353092194 2023-01-22 20:52:54.544252: step: 740/527, loss: 0.007262140978127718 2023-01-22 20:52:55.592710: step: 744/527, loss: 0.020501291379332542 2023-01-22 20:52:56.660665: step: 748/527, loss: 0.00503790145739913 2023-01-22 20:52:57.729961: step: 752/527, loss: 0.011922625824809074 2023-01-22 20:52:58.784985: step: 756/527, loss: 0.015006141737103462 2023-01-22 20:52:59.844482: step: 760/527, loss: 0.004677725490182638 2023-01-22 20:53:00.905288: step: 764/527, loss: 0.0007494300953112543 2023-01-22 20:53:01.951584: 
step: 768/527, loss: 0.0038482944946736097 2023-01-22 20:53:03.001458: step: 772/527, loss: 0.0001417314779246226 2023-01-22 20:53:04.059771: step: 776/527, loss: 0.029271438717842102 2023-01-22 20:53:05.118436: step: 780/527, loss: 0.007724877446889877 2023-01-22 20:53:06.164338: step: 784/527, loss: 0.00010587528231553733 2023-01-22 20:53:07.214447: step: 788/527, loss: 0.0071006715297698975 2023-01-22 20:53:08.270368: step: 792/527, loss: 0.007228069007396698 2023-01-22 20:53:09.331396: step: 796/527, loss: 0.007850813679397106 2023-01-22 20:53:10.385386: step: 800/527, loss: 0.010215306654572487 2023-01-22 20:53:11.440750: step: 804/527, loss: 0.005301190540194511 2023-01-22 20:53:12.491393: step: 808/527, loss: 0.0041169882752001286 2023-01-22 20:53:13.538686: step: 812/527, loss: 0.006976403295993805 2023-01-22 20:53:14.597835: step: 816/527, loss: 0.0020775748416781425 2023-01-22 20:53:15.636612: step: 820/527, loss: 0.01090807281434536 2023-01-22 20:53:16.694841: step: 824/527, loss: 0.007368314079940319 2023-01-22 20:53:17.739407: step: 828/527, loss: 0.0047179278917610645 2023-01-22 20:53:18.813182: step: 832/527, loss: 0.0035620173439383507 2023-01-22 20:53:19.866092: step: 836/527, loss: 0.0265518631786108 2023-01-22 20:53:20.913915: step: 840/527, loss: 0.03120836615562439 2023-01-22 20:53:21.972681: step: 844/527, loss: 0.04151497036218643 2023-01-22 20:53:23.024852: step: 848/527, loss: 0.009576591663062572 2023-01-22 20:53:24.072792: step: 852/527, loss: 0.006244817283004522 2023-01-22 20:53:25.109214: step: 856/527, loss: 0.0016189676243811846 2023-01-22 20:53:26.182651: step: 860/527, loss: 0.026952020823955536 2023-01-22 20:53:27.227931: step: 864/527, loss: 0.0037856451235711575 2023-01-22 20:53:28.277673: step: 868/527, loss: 0.007088650017976761 2023-01-22 20:53:29.337399: step: 872/527, loss: 0.0004930454306304455 2023-01-22 20:53:30.401381: step: 876/527, loss: 0.013458835892379284 2023-01-22 20:53:31.439446: step: 880/527, loss: 
0.006491994950920343 2023-01-22 20:53:32.514010: step: 884/527, loss: 0.00951909739524126 2023-01-22 20:53:33.569892: step: 888/527, loss: 0.006218143738806248 2023-01-22 20:53:34.630920: step: 892/527, loss: 0.010673061944544315 2023-01-22 20:53:35.701218: step: 896/527, loss: 0.0011394878383725882 2023-01-22 20:53:36.758107: step: 900/527, loss: 0.004675300791859627 2023-01-22 20:53:37.818792: step: 904/527, loss: 0.011332347057759762 2023-01-22 20:53:38.872889: step: 908/527, loss: 0.08071253448724747 2023-01-22 20:53:39.912845: step: 912/527, loss: 0.002875578124076128 2023-01-22 20:53:40.970063: step: 916/527, loss: 0.0056468416005373 2023-01-22 20:53:42.026917: step: 920/527, loss: 0.017186572775244713 2023-01-22 20:53:43.092655: step: 924/527, loss: 0.018323130905628204 2023-01-22 20:53:44.142184: step: 928/527, loss: 0.0195182953029871 2023-01-22 20:53:45.184873: step: 932/527, loss: 0.022910984233021736 2023-01-22 20:53:46.246059: step: 936/527, loss: 0.014719245955348015 2023-01-22 20:53:47.297345: step: 940/527, loss: 0.004016861319541931 2023-01-22 20:53:48.354348: step: 944/527, loss: 0.004240079782903194 2023-01-22 20:53:49.393430: step: 948/527, loss: 0.011007502675056458 2023-01-22 20:53:50.457745: step: 952/527, loss: 0.004683236591517925 2023-01-22 20:53:51.490410: step: 956/527, loss: 0.022693928331136703 2023-01-22 20:53:52.556179: step: 960/527, loss: 0.0053064958192408085 2023-01-22 20:53:53.603450: step: 964/527, loss: 0.012264814227819443 2023-01-22 20:53:54.654302: step: 968/527, loss: 0.006496482063084841 2023-01-22 20:53:55.722625: step: 972/527, loss: 0.007213902194052935 2023-01-22 20:53:56.770066: step: 976/527, loss: 0.003221244551241398 2023-01-22 20:53:57.813944: step: 980/527, loss: 0.018335863947868347 2023-01-22 20:53:58.871398: step: 984/527, loss: 0.0014510139590129256 2023-01-22 20:53:59.932651: step: 988/527, loss: 0.00018173780699726194 2023-01-22 20:54:00.969264: step: 992/527, loss: 0.0005547582404688001 2023-01-22 
20:54:02.020022: step: 996/527, loss: 0.007558836601674557 2023-01-22 20:54:03.080644: step: 1000/527, loss: 0.005875860340893269 2023-01-22 20:54:04.133627: step: 1004/527, loss: 0.003022552467882633 2023-01-22 20:54:05.185060: step: 1008/527, loss: 0.0037222157698124647 2023-01-22 20:54:06.241254: step: 1012/527, loss: 0.007213362492620945 2023-01-22 20:54:07.274670: step: 1016/527, loss: 0.005431391764432192 2023-01-22 20:54:08.306982: step: 1020/527, loss: 0.006150540895760059 2023-01-22 20:54:09.362287: step: 1024/527, loss: 0.026522129774093628 2023-01-22 20:54:10.408093: step: 1028/527, loss: 0.010076455771923065 2023-01-22 20:54:11.459300: step: 1032/527, loss: 0.007794274017214775 2023-01-22 20:54:12.496985: step: 1036/527, loss: 0.004307042341679335 2023-01-22 20:54:13.544272: step: 1040/527, loss: 0.004661035258322954 2023-01-22 20:54:14.609734: step: 1044/527, loss: 0.009102068841457367 2023-01-22 20:54:15.674812: step: 1048/527, loss: 0.009107415564358234 2023-01-22 20:54:16.735338: step: 1052/527, loss: 0.007682840805500746 2023-01-22 20:54:17.782068: step: 1056/527, loss: 0.007839532569050789 2023-01-22 20:54:18.840759: step: 1060/527, loss: 0.00691550737246871 2023-01-22 20:54:19.898716: step: 1064/527, loss: 0.01588447019457817 2023-01-22 20:54:20.934384: step: 1068/527, loss: 0.016561010852456093 2023-01-22 20:54:21.980845: step: 1072/527, loss: 0.0023054606281220913 2023-01-22 20:54:23.038570: step: 1076/527, loss: 0.005138486158102751 2023-01-22 20:54:24.086711: step: 1080/527, loss: 0.006244129966944456 2023-01-22 20:54:25.125003: step: 1084/527, loss: 0.004642792046070099 2023-01-22 20:54:26.190669: step: 1088/527, loss: 0.010316764935851097 2023-01-22 20:54:27.238868: step: 1092/527, loss: 0.006066053174436092 2023-01-22 20:54:28.288861: step: 1096/527, loss: 0.0024034250527620316 2023-01-22 20:54:29.332084: step: 1100/527, loss: 0.004848657175898552 2023-01-22 20:54:30.387000: step: 1104/527, loss: 0.000994514673948288 2023-01-22 
20:54:31.424260: step: 1108/527, loss: 0.006069227121770382 2023-01-22 20:54:32.493211: step: 1112/527, loss: 0.0020929903257638216 2023-01-22 20:54:33.532752: step: 1116/527, loss: 0.000433559063822031 2023-01-22 20:54:34.593510: step: 1120/527, loss: 0.00041445757960900664 2023-01-22 20:54:35.644298: step: 1124/527, loss: 0.026016153395175934 2023-01-22 20:54:36.681806: step: 1128/527, loss: 0.0061644334346055984 2023-01-22 20:54:37.731688: step: 1132/527, loss: 0.004554815124720335 2023-01-22 20:54:38.808907: step: 1136/527, loss: 0.0036312779411673546 2023-01-22 20:54:39.870765: step: 1140/527, loss: 0.004106924869120121 2023-01-22 20:54:40.918947: step: 1144/527, loss: 0.0017886483110487461 2023-01-22 20:54:41.972044: step: 1148/527, loss: 0.0015938644064590335 2023-01-22 20:54:43.016457: step: 1152/527, loss: 0.0014809290878474712 2023-01-22 20:54:44.060976: step: 1156/527, loss: 0.016352303326129913 2023-01-22 20:54:45.102756: step: 1160/527, loss: 0.002711722394451499 2023-01-22 20:54:46.158483: step: 1164/527, loss: 0.01169298030436039 2023-01-22 20:54:47.222856: step: 1168/527, loss: 0.0002884409623220563 2023-01-22 20:54:48.265956: step: 1172/527, loss: 0.0029927738942205906 2023-01-22 20:54:49.324075: step: 1176/527, loss: 0.011302296072244644 2023-01-22 20:54:50.371717: step: 1180/527, loss: 0.00686067808419466 2023-01-22 20:54:51.416412: step: 1184/527, loss: 0.006412389222532511 2023-01-22 20:54:52.439845: step: 1188/527, loss: 0.0036091282963752747 2023-01-22 20:54:53.489003: step: 1192/527, loss: 0.009405727498233318 2023-01-22 20:54:54.525724: step: 1196/527, loss: 0.0026019210927188396 2023-01-22 20:54:55.580143: step: 1200/527, loss: 0.003222639672458172 2023-01-22 20:54:56.657439: step: 1204/527, loss: 0.006527747493237257 2023-01-22 20:54:57.706700: step: 1208/527, loss: 0.002040441380813718 2023-01-22 20:54:58.760653: step: 1212/527, loss: 0.005301251076161861 2023-01-22 20:54:59.801878: step: 1216/527, loss: 0.016271648928523064 2023-01-22 
20:55:00.849511: step: 1220/527, loss: 0.006276706699281931 2023-01-22 20:55:01.892486: step: 1224/527, loss: 0.0022712545469403267 2023-01-22 20:55:02.932033: step: 1228/527, loss: 0.002332814736291766 2023-01-22 20:55:03.992632: step: 1232/527, loss: 0.011018120683729649 2023-01-22 20:55:05.051330: step: 1236/527, loss: 0.003257921664044261 2023-01-22 20:55:06.101880: step: 1240/527, loss: 0.010572624392807484 2023-01-22 20:55:07.150811: step: 1244/527, loss: 0.004362326581031084 2023-01-22 20:55:08.194565: step: 1248/527, loss: 0.0012781393015757203 2023-01-22 20:55:09.245143: step: 1252/527, loss: 0.0020749021787196398 2023-01-22 20:55:10.294375: step: 1256/527, loss: 0.009615735150873661 2023-01-22 20:55:11.383538: step: 1260/527, loss: 0.0025882520712912083 2023-01-22 20:55:12.425410: step: 1264/527, loss: 0.010402072221040726 2023-01-22 20:55:13.472635: step: 1268/527, loss: 0.003054817672818899 2023-01-22 20:55:14.506750: step: 1272/527, loss: 0.0019948140252381563 2023-01-22 20:55:15.554099: step: 1276/527, loss: 0.005946854595094919 2023-01-22 20:55:16.603490: step: 1280/527, loss: 0.006230685394257307 2023-01-22 20:55:17.660131: step: 1284/527, loss: 0.017413675785064697 2023-01-22 20:55:18.727873: step: 1288/527, loss: 0.05065235123038292 2023-01-22 20:55:19.789200: step: 1292/527, loss: 0.009509514085948467 2023-01-22 20:55:20.833176: step: 1296/527, loss: 0.0018403837457299232 2023-01-22 20:55:21.887907: step: 1300/527, loss: 0.002236853586509824 2023-01-22 20:55:22.922471: step: 1304/527, loss: 0.002076870994642377 2023-01-22 20:55:23.969103: step: 1308/527, loss: 0.017090918496251106 2023-01-22 20:55:25.033296: step: 1312/527, loss: 2.083394065266475e-05 2023-01-22 20:55:26.061846: step: 1316/527, loss: 0.003389152232557535 2023-01-22 20:55:27.129192: step: 1320/527, loss: 0.016382068395614624 2023-01-22 20:55:28.201269: step: 1324/527, loss: 0.007264185231178999 2023-01-22 20:55:29.248778: step: 1328/527, loss: 0.0006205785903148353 2023-01-22 
20:55:30.272610: step: 1332/527, loss: 0.000725000980310142 2023-01-22 20:55:31.322747: step: 1336/527, loss: 0.0017693579429760575 2023-01-22 20:55:32.372540: step: 1340/527, loss: 0.0035823797807097435 2023-01-22 20:55:33.419116: step: 1344/527, loss: 0.012691323645412922 2023-01-22 20:55:34.462862: step: 1348/527, loss: 0.0054375301115214825 2023-01-22 20:55:35.503772: step: 1352/527, loss: 0.03330446034669876 2023-01-22 20:55:36.551405: step: 1356/527, loss: 0.012865869328379631 2023-01-22 20:55:37.609223: step: 1360/527, loss: 0.012280241586267948 2023-01-22 20:55:38.665009: step: 1364/527, loss: 0.023513102903962135 2023-01-22 20:55:39.708187: step: 1368/527, loss: 0.013600676320493221 2023-01-22 20:55:40.760233: step: 1372/527, loss: 0.005515237804502249 2023-01-22 20:55:41.798863: step: 1376/527, loss: 0.006135640665888786 2023-01-22 20:55:42.848829: step: 1380/527, loss: 0.0054725236259400845 2023-01-22 20:55:43.904176: step: 1384/527, loss: 0.010193255729973316 2023-01-22 20:55:44.953287: step: 1388/527, loss: 0.00300167896784842 2023-01-22 20:55:46.008736: step: 1392/527, loss: 0.0035158854443579912 2023-01-22 20:55:47.065297: step: 1396/527, loss: 0.011393156833946705 2023-01-22 20:55:48.110076: step: 1400/527, loss: 0.007708531338721514 2023-01-22 20:55:49.152978: step: 1404/527, loss: 0.029896896332502365 2023-01-22 20:55:50.220540: step: 1408/527, loss: 0.0033632079139351845 2023-01-22 20:55:51.264350: step: 1412/527, loss: 0.0020611356012523174 2023-01-22 20:55:52.311806: step: 1416/527, loss: 0.004310361109673977 2023-01-22 20:55:53.360535: step: 1420/527, loss: 0.010673295706510544 2023-01-22 20:55:54.399424: step: 1424/527, loss: 0.006751564797013998 2023-01-22 20:55:55.458001: step: 1428/527, loss: 0.0068667116574943066 2023-01-22 20:55:56.498171: step: 1432/527, loss: 0.007486337795853615 2023-01-22 20:55:57.551918: step: 1436/527, loss: 0.006137306336313486 2023-01-22 20:55:58.614335: step: 1440/527, loss: 0.0018077391432598233 2023-01-22 
20:55:59.682996: step: 1444/527, loss: 0.009121482260525227 2023-01-22 20:56:00.739035: step: 1448/527, loss: 0.0013772554229944944 2023-01-22 20:56:01.799770: step: 1452/527, loss: 0.0028811676893383265 2023-01-22 20:56:02.856616: step: 1456/527, loss: 0.007387702353298664 2023-01-22 20:56:03.896798: step: 1460/527, loss: 0.0002977935073431581 2023-01-22 20:56:04.951012: step: 1464/527, loss: 0.00801931880414486 2023-01-22 20:56:06.013649: step: 1468/527, loss: 0.007619825657457113 2023-01-22 20:56:07.073197: step: 1472/527, loss: 0.009034757502377033 2023-01-22 20:56:08.132075: step: 1476/527, loss: 0.020904745906591415 2023-01-22 20:56:09.169313: step: 1480/527, loss: 0.008191004395484924 2023-01-22 20:56:10.216730: step: 1484/527, loss: 0.003263077698647976 2023-01-22 20:56:11.279567: step: 1488/527, loss: 0.003987972624599934 2023-01-22 20:56:12.336349: step: 1492/527, loss: 0.017899535596370697 2023-01-22 20:56:13.388880: step: 1496/527, loss: 0.002941242652013898 2023-01-22 20:56:14.438386: step: 1500/527, loss: 0.003999364096671343 2023-01-22 20:56:15.511163: step: 1504/527, loss: 0.0040275040082633495 2023-01-22 20:56:16.567435: step: 1508/527, loss: 0.009376759640872478 2023-01-22 20:56:17.616713: step: 1512/527, loss: 0.0031801059376448393 2023-01-22 20:56:18.672792: step: 1516/527, loss: 0.00037648380384780467 2023-01-22 20:56:19.768441: step: 1520/527, loss: 0.005084240809082985 2023-01-22 20:56:20.817001: step: 1524/527, loss: 0.002808620920404792 2023-01-22 20:56:21.844821: step: 1528/527, loss: 0.002762235002592206 2023-01-22 20:56:22.914193: step: 1532/527, loss: 0.00031852329266257584 2023-01-22 20:56:23.955311: step: 1536/527, loss: 0.0029655955731868744 2023-01-22 20:56:25.013312: step: 1540/527, loss: 0.009475357830524445 2023-01-22 20:56:26.068018: step: 1544/527, loss: 0.004274454433470964 2023-01-22 20:56:27.118141: step: 1548/527, loss: 0.00024244235828518867 2023-01-22 20:56:28.175331: step: 1552/527, loss: 0.00658113369718194 2023-01-22 
20:56:29.224057: step: 1556/527, loss: 0.0017427315469831228 2023-01-22 20:56:30.262354: step: 1560/527, loss: 0.010992297902703285 2023-01-22 20:56:31.310429: step: 1564/527, loss: 0.004258877597749233 2023-01-22 20:56:32.362292: step: 1568/527, loss: 0.00022623898985330015 2023-01-22 20:56:33.416980: step: 1572/527, loss: 0.0011873444309458137 2023-01-22 20:56:34.462870: step: 1576/527, loss: 0.0016188404988497496 2023-01-22 20:56:35.504302: step: 1580/527, loss: 0.004137265495955944 2023-01-22 20:56:36.545381: step: 1584/527, loss: 0.005658352747559547 2023-01-22 20:56:37.582935: step: 1588/527, loss: 0.0010449644178152084 2023-01-22 20:56:38.626031: step: 1592/527, loss: 0.01422981545329094 2023-01-22 20:56:39.674578: step: 1596/527, loss: 0.007538523990660906 2023-01-22 20:56:40.724852: step: 1600/527, loss: 0.007514684461057186 2023-01-22 20:56:41.792055: step: 1604/527, loss: 4.9162977120431606e-06 2023-01-22 20:56:42.855447: step: 1608/527, loss: 0.001795978401787579 2023-01-22 20:56:43.917362: step: 1612/527, loss: 0.005608080420643091 2023-01-22 20:56:44.956070: step: 1616/527, loss: 0.0015663618687540293 2023-01-22 20:56:46.000080: step: 1620/527, loss: 0.0012878417037427425 2023-01-22 20:56:47.039334: step: 1624/527, loss: 0.007327280007302761 2023-01-22 20:56:48.090309: step: 1628/527, loss: 0.0031818028073757887 2023-01-22 20:56:49.142318: step: 1632/527, loss: 0.006334662437438965 2023-01-22 20:56:50.189655: step: 1636/527, loss: 0.0005831404705531895 2023-01-22 20:56:51.239363: step: 1640/527, loss: 0.004275789484381676 2023-01-22 20:56:52.290443: step: 1644/527, loss: 0.01633210852742195 2023-01-22 20:56:53.342818: step: 1648/527, loss: 0.005840673111379147 2023-01-22 20:56:54.405642: step: 1652/527, loss: 0.006150856614112854 2023-01-22 20:56:55.442522: step: 1656/527, loss: 0.004936764948070049 2023-01-22 20:56:56.485104: step: 1660/527, loss: 0.0006148935062810779 2023-01-22 20:56:57.528297: step: 1664/527, loss: 0.0020212149247527122 2023-01-22 
20:56:58.581049: step: 1668/527, loss: 0.003940522205084562 2023-01-22 20:56:59.622353: step: 1672/527, loss: 0.006931124720722437 2023-01-22 20:57:00.671444: step: 1676/527, loss: 0.0016041100025177002 2023-01-22 20:57:01.714096: step: 1680/527, loss: 0.007625493686646223 2023-01-22 20:57:02.759779: step: 1684/527, loss: 0.003768304595723748 2023-01-22 20:57:03.803537: step: 1688/527, loss: 0.008148561231791973 2023-01-22 20:57:04.853420: step: 1692/527, loss: 0.003449991811066866 2023-01-22 20:57:05.900935: step: 1696/527, loss: 0.0007971778977662325 2023-01-22 20:57:06.949633: step: 1700/527, loss: 0.028176097199320793 2023-01-22 20:57:07.981559: step: 1704/527, loss: 0.014186665415763855 2023-01-22 20:57:09.022416: step: 1708/527, loss: 0.0055346558801829815 2023-01-22 20:57:10.070765: step: 1712/527, loss: 0.005703628994524479 2023-01-22 20:57:11.127437: step: 1716/527, loss: 0.004380775149911642 2023-01-22 20:57:12.162076: step: 1720/527, loss: 0.0010369113879278302 2023-01-22 20:57:13.207870: step: 1724/527, loss: 0.000326198321999982 2023-01-22 20:57:14.253768: step: 1728/527, loss: 0.01635780744254589 2023-01-22 20:57:15.291590: step: 1732/527, loss: 0.021077606827020645 2023-01-22 20:57:16.333399: step: 1736/527, loss: 0.022376619279384613 2023-01-22 20:57:17.378217: step: 1740/527, loss: 0.001262744190171361 2023-01-22 20:57:18.426496: step: 1744/527, loss: 0.011908268555998802 2023-01-22 20:57:19.476188: step: 1748/527, loss: 0.004396419040858746 2023-01-22 20:57:20.532293: step: 1752/527, loss: 0.026702603325247765 2023-01-22 20:57:21.575645: step: 1756/527, loss: 0.0011463734554126859 2023-01-22 20:57:22.610311: step: 1760/527, loss: 0.010404621250927448 2023-01-22 20:57:23.656963: step: 1764/527, loss: 0.010583776980638504 2023-01-22 20:57:24.701888: step: 1768/527, loss: 0.0028614166658371687 2023-01-22 20:57:25.739942: step: 1772/527, loss: 0.004903014283627272 2023-01-22 20:57:26.803204: step: 1776/527, loss: 0.003185364417731762 2023-01-22 
20:57:27.853271: step: 1780/527, loss: 0.008488611318171024 2023-01-22 20:57:28.899337: step: 1784/527, loss: 0.006479854229837656 2023-01-22 20:57:29.946914: step: 1788/527, loss: 0.0031652546022087336 2023-01-22 20:57:30.994874: step: 1792/527, loss: 0.003490198403596878 2023-01-22 20:57:32.043559: step: 1796/527, loss: 0.020055042579770088 2023-01-22 20:57:33.095914: step: 1800/527, loss: 0.0017858686624094844 2023-01-22 20:57:34.162177: step: 1804/527, loss: 0.0051439846865832806 2023-01-22 20:57:35.214704: step: 1808/527, loss: 0.008592815138399601 2023-01-22 20:57:36.248909: step: 1812/527, loss: 0.007867495529353619 2023-01-22 20:57:37.309756: step: 1816/527, loss: 0.0036260022316128016 2023-01-22 20:57:38.361634: step: 1820/527, loss: 0.002685365965589881 2023-01-22 20:57:39.428033: step: 1824/527, loss: 0.004280132241547108 2023-01-22 20:57:40.478177: step: 1828/527, loss: 0.005354311782866716 2023-01-22 20:57:41.507983: step: 1832/527, loss: 0.0 2023-01-22 20:57:42.558403: step: 1836/527, loss: 0.00031337421387434006 2023-01-22 20:57:43.606327: step: 1840/527, loss: 0.0018453020602464676 2023-01-22 20:57:44.680873: step: 1844/527, loss: 0.0030670773703604937 2023-01-22 20:57:45.730068: step: 1848/527, loss: 0.011090303771197796 2023-01-22 20:57:46.784253: step: 1852/527, loss: 0.00362679036334157 2023-01-22 20:57:47.816548: step: 1856/527, loss: 0.014624183066189289 2023-01-22 20:57:48.860442: step: 1860/527, loss: 0.0017266328213736415 2023-01-22 20:57:49.931321: step: 1864/527, loss: 0.02116183564066887 2023-01-22 20:57:50.997515: step: 1868/527, loss: 0.009328898973762989 2023-01-22 20:57:52.057054: step: 1872/527, loss: 0.054570522159338 2023-01-22 20:57:53.107696: step: 1876/527, loss: 0.02399817295372486 2023-01-22 20:57:54.142749: step: 1880/527, loss: 0.0032696735579520464 2023-01-22 20:57:55.194910: step: 1884/527, loss: 0.007266837637871504 2023-01-22 20:57:56.238424: step: 1888/527, loss: 0.0030580516904592514 2023-01-22 20:57:57.304427: step: 
1892/527, loss: 0.00226209731772542 2023-01-22 20:57:58.365043: step: 1896/527, loss: 0.009014708921313286 2023-01-22 20:57:59.414555: step: 1900/527, loss: 0.00284867687150836 2023-01-22 20:58:00.454047: step: 1904/527, loss: 0.011430088430643082 2023-01-22 20:58:01.503706: step: 1908/527, loss: 0.008479653857648373 2023-01-22 20:58:02.535512: step: 1912/527, loss: 0.002097503514960408 2023-01-22 20:58:03.592978: step: 1916/527, loss: 0.0004132471513003111 2023-01-22 20:58:04.634873: step: 1920/527, loss: 0.00039731647120788693 2023-01-22 20:58:05.692433: step: 1924/527, loss: 0.00582444341853261 2023-01-22 20:58:06.734436: step: 1928/527, loss: 0.004440548829734325 2023-01-22 20:58:07.779384: step: 1932/527, loss: 0.013903653249144554 2023-01-22 20:58:08.838473: step: 1936/527, loss: 0.02670356258749962 2023-01-22 20:58:09.896211: step: 1940/527, loss: 0.0012797055533155799 2023-01-22 20:58:10.949088: step: 1944/527, loss: 0.005963573232293129 2023-01-22 20:58:12.000344: step: 1948/527, loss: 0.0010697349207475781 2023-01-22 20:58:13.055128: step: 1952/527, loss: 0.009538499638438225 2023-01-22 20:58:14.099496: step: 1956/527, loss: 0.009736491367220879 2023-01-22 20:58:15.145443: step: 1960/527, loss: 0.023740118369460106 2023-01-22 20:58:16.182924: step: 1964/527, loss: 0.00720672681927681 2023-01-22 20:58:17.234902: step: 1968/527, loss: 0.018328851088881493 2023-01-22 20:58:18.294660: step: 1972/527, loss: 0.015286648645997047 2023-01-22 20:58:19.352346: step: 1976/527, loss: 0.00725268991664052 2023-01-22 20:58:20.394373: step: 1980/527, loss: 0.007340042851865292 2023-01-22 20:58:21.442940: step: 1984/527, loss: 0.005392418708652258 2023-01-22 20:58:22.487061: step: 1988/527, loss: 0.003717937506735325 2023-01-22 20:58:23.532207: step: 1992/527, loss: 0.008614039048552513 2023-01-22 20:58:24.577367: step: 1996/527, loss: 0.014852678403258324 2023-01-22 20:58:25.619756: step: 2000/527, loss: 0.0013763955794274807 2023-01-22 20:58:26.658425: step: 2004/527, 
loss: 0.0002763493394013494 2023-01-22 20:58:27.699607: step: 2008/527, loss: 0.0011422018287703395 2023-01-22 20:58:28.731527: step: 2012/527, loss: 0.0031692483462393284 2023-01-22 20:58:29.783652: step: 2016/527, loss: 0.007393590174615383 2023-01-22 20:58:30.840713: step: 2020/527, loss: 0.006015173625200987 2023-01-22 20:58:31.884513: step: 2024/527, loss: 0.003322953823953867 2023-01-22 20:58:32.940835: step: 2028/527, loss: 0.005143163725733757 2023-01-22 20:58:34.002703: step: 2032/527, loss: 0.0035218680277466774 2023-01-22 20:58:35.050282: step: 2036/527, loss: 0.00514903012663126 2023-01-22 20:58:36.100305: step: 2040/527, loss: 0.0094820661470294 2023-01-22 20:58:37.127325: step: 2044/527, loss: 0.009241255931556225 2023-01-22 20:58:38.169926: step: 2048/527, loss: 0.0016327640041708946 2023-01-22 20:58:39.213302: step: 2052/527, loss: 0.005420347210019827 2023-01-22 20:58:40.264123: step: 2056/527, loss: 0.002747166668996215 2023-01-22 20:58:41.303102: step: 2060/527, loss: 0.02144642546772957 2023-01-22 20:58:42.345702: step: 2064/527, loss: 0.004442389588803053 2023-01-22 20:58:43.384298: step: 2068/527, loss: 0.008240175433456898 2023-01-22 20:58:44.432203: step: 2072/527, loss: 0.0016872554551810026 2023-01-22 20:58:45.458804: step: 2076/527, loss: 0.004576928913593292 2023-01-22 20:58:46.502066: step: 2080/527, loss: 0.0008002108079381287 2023-01-22 20:58:47.544034: step: 2084/527, loss: 0.0009320533135905862 2023-01-22 20:58:48.598000: step: 2088/527, loss: 0.006594611331820488 2023-01-22 20:58:49.655287: step: 2092/527, loss: 0.0017086324514821172 2023-01-22 20:58:50.691427: step: 2096/527, loss: 0.000981308170594275 2023-01-22 20:58:51.742355: step: 2100/527, loss: 0.003248268272727728 2023-01-22 20:58:52.772913: step: 2104/527, loss: 0.0008998570265248418 2023-01-22 20:58:53.815734: step: 2108/527, loss: 0.002022647997364402 ================================================== Loss: 0.008 -------------------- Dev Chinese: {'template': {'p': 1.0, 
'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32502820397111915, 'r': 0.3416805028462998, 'f1': 0.3331463922294172}, 'combined': 0.2454762890111495, 'stategy': 1, 'epoch': 7}
Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3386370503673957, 'r': 0.30785186397035974, 'f1': 0.3225114765403769}, 'combined': 0.20640734498584118, 'stategy': 1, 'epoch': 7}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32767430257625235, 'r': 0.361871810435254, 'f1': 0.3439250569871576}, 'combined': 0.25341846304316873, 'stategy': 1, 'epoch': 7}
Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3576479137504792, 'r': 0.3127793572981464, 'f1': 0.3337122143821154}, 'combined': 0.21357581720455382, 'stategy': 1, 'epoch': 7}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3289981150578922, 'r': 0.33149525445112105, 'f1': 0.3302419642641603}, 'combined': 0.24333618419464442, 'stategy': 1, 'epoch': 7}
Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3671608483843696, 'r': 0.2963345518807423, 'f1': 0.3279674446293412}, 'combined': 0.23514646973424463, 'stategy': 1, 'epoch': 7}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 7}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 7}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 7}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0}
Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33228787707526664, 'r': 0.33291840435624437, 'f1': 0.33260284188766026}, 'combined': 0.2450757782330128, 'stategy': 1, 'epoch': 6}
Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3655014852890142,
'r': 0.2969907428235575, 'f1': 0.3277036409267968}, 'combined': 0.2349573274569487, 'stategy': 1, 'epoch': 6} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 6} ****************************** Epoch: 8 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 21:01:22.624010: step: 4/527, loss: 0.0002330032002646476 2023-01-22 21:01:23.694575: step: 8/527, loss: 0.02729787304997444 2023-01-22 21:01:24.743710: step: 12/527, loss: 0.007976001128554344 2023-01-22 21:01:25.770115: step: 16/527, loss: 0.0023132420610636473 2023-01-22 21:01:26.803197: step: 20/527, loss: 0.004748682025820017 2023-01-22 21:01:27.841863: step: 24/527, loss: 0.004335749428719282 2023-01-22 21:01:28.886382: step: 28/527, loss: 0.0025923559442162514 2023-01-22 21:01:29.911045: step: 32/527, loss: 0.0049667623825371265 2023-01-22 21:01:30.966407: step: 36/527, loss: 0.012910578399896622 2023-01-22 21:01:31.999325: step: 40/527, loss: 0.0002899013343267143 2023-01-22 21:01:33.049003: step: 44/527, loss: 0.01253580953925848 2023-01-22 21:01:34.120870: step: 48/527, loss: 0.024305060505867004 2023-01-22 21:01:35.152795: step: 52/527, loss: 0.002811727812513709 2023-01-22 21:01:36.192362: step: 56/527, loss: 0.00012486337800510228 2023-01-22 21:01:37.228684: step: 60/527, loss: 0.0034543753135949373 2023-01-22 21:01:38.274243: step: 64/527, loss: 0.011078727431595325 2023-01-22 21:01:39.312975: step: 68/527, loss: 0.013558818027377129 2023-01-22 21:01:40.350003: step: 72/527, loss: 0.10520170629024506 2023-01-22 21:01:41.403590: step: 76/527, loss: 0.021485585719347 2023-01-22 21:01:42.440038: step: 80/527, loss: 0.002736761001870036 2023-01-22 21:01:43.476611: 
step: 84/527, loss: 0.005472817458212376 2023-01-22 21:01:44.521425: step: 88/527, loss: 0.01194784976541996 2023-01-22 21:01:45.572767: step: 92/527, loss: 0.0010049795964732766 2023-01-22 21:01:46.611854: step: 96/527, loss: 0.0065386053174734116 2023-01-22 21:01:47.666196: step: 100/527, loss: 0.003975197207182646 2023-01-22 21:01:48.707497: step: 104/527, loss: 0.0062440973706543446 2023-01-22 21:01:49.747362: step: 108/527, loss: 0.0007234518998302519 2023-01-22 21:01:50.793239: step: 112/527, loss: 0.004400024190545082 2023-01-22 21:01:51.836583: step: 116/527, loss: 0.0057484060525894165 2023-01-22 21:01:52.884836: step: 120/527, loss: 0.004360474180430174 2023-01-22 21:01:53.926442: step: 124/527, loss: 0.012358917854726315 2023-01-22 21:01:54.977240: step: 128/527, loss: 0.00728977657854557 2023-01-22 21:01:56.010930: step: 132/527, loss: 0.03357089310884476 2023-01-22 21:01:57.063918: step: 136/527, loss: 0.0019295202801004052 2023-01-22 21:01:58.100595: step: 140/527, loss: 0.021667294204235077 2023-01-22 21:01:59.171356: step: 144/527, loss: 0.0050947656854987144 2023-01-22 21:02:00.226269: step: 148/527, loss: 0.006025749258697033 2023-01-22 21:02:01.272692: step: 152/527, loss: 0.020102210342884064 2023-01-22 21:02:02.326280: step: 156/527, loss: 0.014660214073956013 2023-01-22 21:02:03.367886: step: 160/527, loss: 0.011936459690332413 2023-01-22 21:02:04.418687: step: 164/527, loss: 0.005218774080276489 2023-01-22 21:02:05.459753: step: 168/527, loss: 0.001708473777398467 2023-01-22 21:02:06.504746: step: 172/527, loss: 0.0016623052069917321 2023-01-22 21:02:07.537677: step: 176/527, loss: 0.00282743270508945 2023-01-22 21:02:08.589838: step: 180/527, loss: 0.02042945846915245 2023-01-22 21:02:09.627483: step: 184/527, loss: 0.0031801247969269753 2023-01-22 21:02:10.656124: step: 188/527, loss: 0.0024131627287715673 2023-01-22 21:02:11.696275: step: 192/527, loss: 0.01198330894112587 2023-01-22 21:02:12.737287: step: 196/527, loss: 
0.008296951651573181 2023-01-22 21:02:13.778447: step: 200/527, loss: 0.0021316383499652147 2023-01-22 21:02:14.812376: step: 204/527, loss: 0.001449936768040061 2023-01-22 21:02:15.863253: step: 208/527, loss: 0.008732173591852188 2023-01-22 21:02:16.911291: step: 212/527, loss: 0.0013532432494685054 2023-01-22 21:02:17.957976: step: 216/527, loss: 0.0106819411739707 2023-01-22 21:02:18.992297: step: 220/527, loss: 0.0022255731746554375 2023-01-22 21:02:20.022994: step: 224/527, loss: 0.005370576400309801 2023-01-22 21:02:21.065114: step: 228/527, loss: 0.004589673597365618 2023-01-22 21:02:22.107010: step: 232/527, loss: 0.0036165020428597927 2023-01-22 21:02:23.143595: step: 236/527, loss: 0.0008657953003421426 2023-01-22 21:02:24.179641: step: 240/527, loss: 0.0029934286139905453 2023-01-22 21:02:25.234808: step: 244/527, loss: 0.006988136097788811 2023-01-22 21:02:26.282452: step: 248/527, loss: 0.002905474277213216 2023-01-22 21:02:27.336897: step: 252/527, loss: 0.0031236233189702034 2023-01-22 21:02:28.389268: step: 256/527, loss: 0.002505676122382283 2023-01-22 21:02:29.447562: step: 260/527, loss: 0.004616268910467625 2023-01-22 21:02:30.486042: step: 264/527, loss: 0.0005057173548266292 2023-01-22 21:02:31.562404: step: 268/527, loss: 0.00617041764780879 2023-01-22 21:02:32.601929: step: 272/527, loss: 0.00039459115942008793 2023-01-22 21:02:33.640665: step: 276/527, loss: 0.0009348386665806174 2023-01-22 21:02:34.681239: step: 280/527, loss: 0.0071175373159348965 2023-01-22 21:02:35.752155: step: 284/527, loss: 0.00035052213934250176 2023-01-22 21:02:36.791282: step: 288/527, loss: 0.005999334622174501 2023-01-22 21:02:37.836131: step: 292/527, loss: 0.006068206857889891 2023-01-22 21:02:38.886209: step: 296/527, loss: 0.022090910002589226 2023-01-22 21:02:39.928901: step: 300/527, loss: 0.0031325744930654764 2023-01-22 21:02:40.985193: step: 304/527, loss: 0.0035281346645206213 2023-01-22 21:02:42.051021: step: 308/527, loss: 0.0017932542832568288 
2023-01-22 21:02:43.097980: step: 312/527, loss: 0.0 2023-01-22 21:02:44.143535: step: 316/527, loss: 0.005270319525152445 2023-01-22 21:02:45.209704: step: 320/527, loss: 0.004883794113993645 2023-01-22 21:02:46.261726: step: 324/527, loss: 0.0069733248092234135 2023-01-22 21:02:47.307859: step: 328/527, loss: 0.01385743822902441 2023-01-22 21:02:48.365014: step: 332/527, loss: 0.0051896171644330025 2023-01-22 21:02:49.430086: step: 336/527, loss: 0.002243006369099021 2023-01-22 21:02:50.473776: step: 340/527, loss: 0.007538812234997749 2023-01-22 21:02:51.531070: step: 344/527, loss: 7.229208858916536e-05 2023-01-22 21:02:52.577859: step: 348/527, loss: 0.01012858934700489 2023-01-22 21:02:53.629261: step: 352/527, loss: 0.018005182966589928 2023-01-22 21:02:54.685306: step: 356/527, loss: 8.664518099976704e-05 2023-01-22 21:02:55.750426: step: 360/527, loss: 0.0052614095620810986 2023-01-22 21:02:56.791168: step: 364/527, loss: 0.012073944322764874 2023-01-22 21:02:57.839204: step: 368/527, loss: 4.259724300936796e-05 2023-01-22 21:02:58.890694: step: 372/527, loss: 0.006876377854496241 2023-01-22 21:02:59.947884: step: 376/527, loss: 0.006834926083683968 2023-01-22 21:03:00.995272: step: 380/527, loss: 0.029001053422689438 2023-01-22 21:03:02.036152: step: 384/527, loss: 0.00043874207767657936 2023-01-22 21:03:03.089233: step: 388/527, loss: 0.001796349766664207 2023-01-22 21:03:04.158021: step: 392/527, loss: 0.011168502271175385 2023-01-22 21:03:05.204247: step: 396/527, loss: 0.0008980839047580957 2023-01-22 21:03:06.251131: step: 400/527, loss: 0.002164802746847272 2023-01-22 21:03:07.294576: step: 404/527, loss: 0.0066799866035580635 2023-01-22 21:03:08.374972: step: 408/527, loss: 0.008634811267256737 2023-01-22 21:03:09.447054: step: 412/527, loss: 0.0018915216205641627 2023-01-22 21:03:10.502802: step: 416/527, loss: 0.0006427945918403566 2023-01-22 21:03:11.547732: step: 420/527, loss: 0.00903363898396492 2023-01-22 21:03:12.591455: step: 424/527, 
loss: 0.007423573173582554 2023-01-22 21:03:13.628132: step: 428/527, loss: 0.0012286821147426963 2023-01-22 21:03:14.686566: step: 432/527, loss: 0.0008254091953858733 2023-01-22 21:03:15.737970: step: 436/527, loss: 0.0021102188620716333 2023-01-22 21:03:16.774294: step: 440/527, loss: 0.0007236019591800869 2023-01-22 21:03:17.838084: step: 444/527, loss: 0.011712201870977879 2023-01-22 21:03:18.884616: step: 448/527, loss: 0.021262140944600105 2023-01-22 21:03:19.965901: step: 452/527, loss: 0.012126308865845203 2023-01-22 21:03:21.012728: step: 456/527, loss: 0.00015690227155573666 2023-01-22 21:03:22.057362: step: 460/527, loss: 0.008970070630311966 2023-01-22 21:03:23.112876: step: 464/527, loss: 0.0036351275630295277 2023-01-22 21:03:24.153881: step: 468/527, loss: 7.463943620678037e-05 2023-01-22 21:03:25.213739: step: 472/527, loss: 0.0003010346554219723 2023-01-22 21:03:26.249054: step: 476/527, loss: 0.002852163976058364 2023-01-22 21:03:27.296838: step: 480/527, loss: 0.006341664586216211 2023-01-22 21:03:28.360430: step: 484/527, loss: 0.016170799732208252 2023-01-22 21:03:29.442243: step: 488/527, loss: 0.002987401094287634 2023-01-22 21:03:30.492006: step: 492/527, loss: 0.0036651454865932465 2023-01-22 21:03:31.535688: step: 496/527, loss: 0.005586111918091774 2023-01-22 21:03:32.599262: step: 500/527, loss: 0.019877005368471146 2023-01-22 21:03:33.640736: step: 504/527, loss: 0.004212568514049053 2023-01-22 21:03:34.684965: step: 508/527, loss: 0.011822903528809547 2023-01-22 21:03:35.742066: step: 512/527, loss: 0.0007722167647443712 2023-01-22 21:03:36.798426: step: 516/527, loss: 0.012896863743662834 2023-01-22 21:03:37.864798: step: 520/527, loss: 0.01329719740897417 2023-01-22 21:03:38.919156: step: 524/527, loss: 0.00451135216280818 2023-01-22 21:03:39.961961: step: 528/527, loss: 0.00040527002420276403 2023-01-22 21:03:41.003964: step: 532/527, loss: 0.004665345884859562 2023-01-22 21:03:42.075853: step: 536/527, loss: 0.008312974125146866 
2023-01-22 21:03:43.105654: step: 540/527, loss: 0.010481510311365128 2023-01-22 21:03:44.174675: step: 544/527, loss: 0.006175869144499302 2023-01-22 21:03:45.238039: step: 548/527, loss: 0.002698532771319151 2023-01-22 21:03:46.287154: step: 552/527, loss: 0.008815715089440346 2023-01-22 21:03:47.327982: step: 556/527, loss: 0.0076772840693593025 2023-01-22 21:03:48.386469: step: 560/527, loss: 0.00502409553155303 2023-01-22 21:03:49.467145: step: 564/527, loss: 0.007560611702501774 2023-01-22 21:03:50.511350: step: 568/527, loss: 0.02690347097814083 2023-01-22 21:03:51.547370: step: 572/527, loss: 0.007281936705112457 2023-01-22 21:03:52.605098: step: 576/527, loss: 0.007622469682246447 2023-01-22 21:03:53.660848: step: 580/527, loss: 0.003367589320987463 2023-01-22 21:03:54.725203: step: 584/527, loss: 0.0024943158496171236 2023-01-22 21:03:55.769448: step: 588/527, loss: 0.00034372886875644326 2023-01-22 21:03:56.822155: step: 592/527, loss: 0.0071462118066847324 2023-01-22 21:03:57.870055: step: 596/527, loss: 0.0013170083984732628 2023-01-22 21:03:58.918977: step: 600/527, loss: 0.009740952402353287 2023-01-22 21:03:59.971908: step: 604/527, loss: 0.0035525483544915915 2023-01-22 21:04:01.033046: step: 608/527, loss: 0.030997183173894882 2023-01-22 21:04:02.075093: step: 612/527, loss: 1.1262402040301822e-05 2023-01-22 21:04:03.148939: step: 616/527, loss: 0.015109583735466003 2023-01-22 21:04:04.204117: step: 620/527, loss: 0.030643979087471962 2023-01-22 21:04:05.266560: step: 624/527, loss: 0.005113973747938871 2023-01-22 21:04:06.311556: step: 628/527, loss: 3.4255677746841684e-05 2023-01-22 21:04:07.360019: step: 632/527, loss: 0.02006569318473339 2023-01-22 21:04:08.402900: step: 636/527, loss: 0.0061117433942854404 2023-01-22 21:04:09.443408: step: 640/527, loss: 0.02205202728509903 2023-01-22 21:04:10.481512: step: 644/527, loss: 0.010564395226538181 2023-01-22 21:04:11.525457: step: 648/527, loss: 0.0041475845500826836 2023-01-22 21:04:12.599359: 
step: 652/527, loss: 0.00854248832911253 2023-01-22 21:04:13.651729: step: 656/527, loss: 0.004570312332361937 2023-01-22 21:04:14.709119: step: 660/527, loss: 0.021101294085383415 2023-01-22 21:04:15.757861: step: 664/527, loss: 0.004836041480302811 2023-01-22 21:04:16.811035: step: 668/527, loss: 0.0071386490017175674 2023-01-22 21:04:17.864966: step: 672/527, loss: 0.008905834518373013 2023-01-22 21:04:18.907108: step: 676/527, loss: 0.0029244543984532356 2023-01-22 21:04:19.955131: step: 680/527, loss: 0.05654022842645645 2023-01-22 21:04:21.005017: step: 684/527, loss: 0.003175996942445636 2023-01-22 21:04:22.061939: step: 688/527, loss: 0.010086827911436558 2023-01-22 21:04:23.115304: step: 692/527, loss: 0.007584595121443272 2023-01-22 21:04:24.156161: step: 696/527, loss: 0.001266351668164134 2023-01-22 21:04:25.202415: step: 700/527, loss: 0.00592766422778368 2023-01-22 21:04:26.240893: step: 704/527, loss: 0.0015044219326227903 2023-01-22 21:04:27.294017: step: 708/527, loss: 0.0006652773590758443 2023-01-22 21:04:28.341659: step: 712/527, loss: 0.0027493576053529978 2023-01-22 21:04:29.387524: step: 716/527, loss: 0.003988585900515318 2023-01-22 21:04:30.453732: step: 720/527, loss: 0.0007629691390320659 2023-01-22 21:04:31.515882: step: 724/527, loss: 0.002542082918807864 2023-01-22 21:04:32.564168: step: 728/527, loss: 0.0038389023393392563 2023-01-22 21:04:33.592492: step: 732/527, loss: 0.0005127739277668297 2023-01-22 21:04:34.662158: step: 736/527, loss: 0.00039715145248919725 2023-01-22 21:04:35.730487: step: 740/527, loss: 0.004576391074806452 2023-01-22 21:04:36.773985: step: 744/527, loss: 3.2848427508724853e-05 2023-01-22 21:04:37.817848: step: 748/527, loss: 0.0052435859106481075 2023-01-22 21:04:38.866876: step: 752/527, loss: 0.005188755225390196 2023-01-22 21:04:39.918866: step: 756/527, loss: 0.02018502727150917 2023-01-22 21:04:40.972036: step: 760/527, loss: 0.008143670856952667 2023-01-22 21:04:42.025788: step: 764/527, loss: 
0.010538049042224884 2023-01-22 21:04:43.078508: step: 768/527, loss: 0.03353271260857582 2023-01-22 21:04:44.131803: step: 772/527, loss: 0.0014808360720053315 2023-01-22 21:04:45.187611: step: 776/527, loss: 0.012612264603376389 2023-01-22 21:04:46.238873: step: 780/527, loss: 0.017173869535326958 2023-01-22 21:04:47.287582: step: 784/527, loss: 0.003489251248538494 2023-01-22 21:04:48.318988: step: 788/527, loss: 0.0020209914073348045 2023-01-22 21:04:49.374476: step: 792/527, loss: 0.0037294598296284676 2023-01-22 21:04:50.426264: step: 796/527, loss: 0.004305354785174131 2023-01-22 21:04:51.480296: step: 800/527, loss: 0.00795734953135252 2023-01-22 21:04:52.536972: step: 804/527, loss: 0.002077963203191757 2023-01-22 21:04:53.610132: step: 808/527, loss: 0.0212895218282938 2023-01-22 21:04:54.672633: step: 812/527, loss: 0.006212583743035793 2023-01-22 21:04:55.734961: step: 816/527, loss: 0.05978365242481232 2023-01-22 21:04:56.790601: step: 820/527, loss: 0.0009756953804753721 2023-01-22 21:04:57.855276: step: 824/527, loss: 0.010629642754793167 2023-01-22 21:04:58.914542: step: 828/527, loss: 0.0069140540435910225 2023-01-22 21:04:59.961345: step: 832/527, loss: 0.003601226955652237 2023-01-22 21:05:01.037520: step: 836/527, loss: 0.0057035028003156185 2023-01-22 21:05:02.092567: step: 840/527, loss: 0.003114989958703518 2023-01-22 21:05:03.144211: step: 844/527, loss: 0.00820672232657671 2023-01-22 21:05:04.210792: step: 848/527, loss: 0.0012793459463864565 2023-01-22 21:05:05.276988: step: 852/527, loss: 0.004029910080134869 2023-01-22 21:05:06.346701: step: 856/527, loss: 0.0008026692667044699 2023-01-22 21:05:07.406953: step: 860/527, loss: 0.004268052522093058 2023-01-22 21:05:08.458517: step: 864/527, loss: 0.0040172613225877285 2023-01-22 21:05:09.511383: step: 868/527, loss: 0.007618907373398542 2023-01-22 21:05:10.567423: step: 872/527, loss: 0.005499640479683876 2023-01-22 21:05:11.627854: step: 876/527, loss: 0.0024879355914890766 2023-01-22 
21:05:12.667952: step: 880/527, loss: 0.006155849434435368 2023-01-22 21:05:13.725635: step: 884/527, loss: 0.001059104222804308 2023-01-22 21:05:14.779188: step: 888/527, loss: 0.0014829837018623948 2023-01-22 21:05:15.820434: step: 892/527, loss: 0.049139536917209625 2023-01-22 21:05:16.871844: step: 896/527, loss: 0.001243019476532936 2023-01-22 21:05:17.932763: step: 900/527, loss: 0.007693006657063961 2023-01-22 21:05:18.972105: step: 904/527, loss: 0.0015366185689345002 2023-01-22 21:05:20.044969: step: 908/527, loss: 0.006956764031201601 2023-01-22 21:05:21.094658: step: 912/527, loss: 0.006277000997215509 2023-01-22 21:05:22.127947: step: 916/527, loss: 0.00041099803638644516 2023-01-22 21:05:23.176071: step: 920/527, loss: 0.006701781414449215 2023-01-22 21:05:24.226619: step: 924/527, loss: 0.0040606423281133175 2023-01-22 21:05:25.268018: step: 928/527, loss: 0.006661850959062576 2023-01-22 21:05:26.303686: step: 932/527, loss: 0.0011129322228953242 2023-01-22 21:05:27.351355: step: 936/527, loss: 0.006584456190466881 2023-01-22 21:05:28.387679: step: 940/527, loss: 0.003968577831983566 2023-01-22 21:05:29.445890: step: 944/527, loss: 0.0001048057310981676 2023-01-22 21:05:30.481159: step: 948/527, loss: 0.0034072063863277435 2023-01-22 21:05:31.556696: step: 952/527, loss: 0.018250852823257446 2023-01-22 21:05:32.603596: step: 956/527, loss: 0.0015229685232043266 2023-01-22 21:05:33.648863: step: 960/527, loss: 0.002207589102908969 2023-01-22 21:05:34.699918: step: 964/527, loss: 0.0075803580693900585 2023-01-22 21:05:35.759729: step: 968/527, loss: 0.005760957952588797 2023-01-22 21:05:36.805754: step: 972/527, loss: 3.1093827601580415e-06 2023-01-22 21:05:37.864735: step: 976/527, loss: 0.0047851852141320705 2023-01-22 21:05:38.924463: step: 980/527, loss: 0.041994061321020126 2023-01-22 21:05:39.973894: step: 984/527, loss: 0.002797425724565983 2023-01-22 21:05:41.020233: step: 988/527, loss: 0.0030285988468676805 2023-01-22 21:05:42.060735: step: 
992/527, loss: 0.0021650884300470352 2023-01-22 21:05:43.108617: step: 996/527, loss: 0.005847195629030466 2023-01-22 21:05:44.156786: step: 1000/527, loss: 0.0027541974559426308 2023-01-22 21:05:45.209806: step: 1004/527, loss: 0.005877191666513681 2023-01-22 21:05:46.253440: step: 1008/527, loss: 0.018481340259313583 2023-01-22 21:05:47.292604: step: 1012/527, loss: 0.004654380958527327 2023-01-22 21:05:48.339697: step: 1016/527, loss: 0.004897973965853453 2023-01-22 21:05:49.384440: step: 1020/527, loss: 0.01480427011847496 2023-01-22 21:05:50.427429: step: 1024/527, loss: 0.0007993084145709872 2023-01-22 21:05:51.475525: step: 1028/527, loss: 0.001427156268619001 2023-01-22 21:05:52.534585: step: 1032/527, loss: 0.0023678552825003862 2023-01-22 21:05:53.604918: step: 1036/527, loss: 0.004320325795561075 2023-01-22 21:05:54.656328: step: 1040/527, loss: 0.008059259504079819 2023-01-22 21:05:55.709660: step: 1044/527, loss: 0.014456305652856827 2023-01-22 21:05:56.762218: step: 1048/527, loss: 0.012549159117043018 2023-01-22 21:05:57.821686: step: 1052/527, loss: 0.005220066290348768 2023-01-22 21:05:58.860083: step: 1056/527, loss: 0.02834226004779339 2023-01-22 21:05:59.914660: step: 1060/527, loss: 0.0029862169176340103 2023-01-22 21:06:00.964696: step: 1064/527, loss: 0.0010300502181053162 2023-01-22 21:06:02.005190: step: 1068/527, loss: 0.010499897412955761 2023-01-22 21:06:03.047949: step: 1072/527, loss: 0.000737980124540627 2023-01-22 21:06:04.089004: step: 1076/527, loss: 0.004008077550679445 2023-01-22 21:06:05.135456: step: 1080/527, loss: 0.002080274512991309 2023-01-22 21:06:06.183702: step: 1084/527, loss: 0.01271986123174429 2023-01-22 21:06:07.232078: step: 1088/527, loss: 0.0023002137895673513 2023-01-22 21:06:08.285942: step: 1092/527, loss: 0.0036973075475543737 2023-01-22 21:06:09.330015: step: 1096/527, loss: 0.00861350167542696 2023-01-22 21:06:10.378496: step: 1100/527, loss: 0.007053459994494915 2023-01-22 21:06:11.413873: step: 1104/527, 
loss: 0.002832916099578142 2023-01-22 21:06:12.473127: step: 1108/527, loss: 0.013927474617958069 2023-01-22 21:06:13.517953: step: 1112/527, loss: 0.014119810424745083 2023-01-22 21:06:14.553608: step: 1116/527, loss: 0.0002748257538769394 2023-01-22 21:06:15.597916: step: 1120/527, loss: 0.00318628060631454 2023-01-22 21:06:16.642519: step: 1124/527, loss: 0.005412677302956581 2023-01-22 21:06:17.717399: step: 1128/527, loss: 0.005162232555449009 2023-01-22 21:06:18.765546: step: 1132/527, loss: 0.02338983491063118 2023-01-22 21:06:19.830926: step: 1136/527, loss: 0.03006056323647499 2023-01-22 21:06:20.889919: step: 1140/527, loss: 0.03350941464304924 2023-01-22 21:06:21.937383: step: 1144/527, loss: 0.0013721227878704667 2023-01-22 21:06:23.001661: step: 1148/527, loss: 0.01300261914730072 2023-01-22 21:06:24.047744: step: 1152/527, loss: 0.007490198593586683 2023-01-22 21:06:25.121979: step: 1156/527, loss: 0.004557053092867136 2023-01-22 21:06:26.177115: step: 1160/527, loss: 0.002955510513857007 2023-01-22 21:06:27.234047: step: 1164/527, loss: 0.0034884281922131777 2023-01-22 21:06:28.287797: step: 1168/527, loss: 0.012828486040234566 2023-01-22 21:06:29.326571: step: 1172/527, loss: 0.004694011993706226 2023-01-22 21:06:30.373866: step: 1176/527, loss: 0.005613983608782291 2023-01-22 21:06:31.420978: step: 1180/527, loss: 0.001298802555538714 2023-01-22 21:06:32.466348: step: 1184/527, loss: 0.005122896749526262 2023-01-22 21:06:33.524053: step: 1188/527, loss: 0.0038382215425372124 2023-01-22 21:06:34.558292: step: 1192/527, loss: 0.006829976104199886 2023-01-22 21:06:35.600884: step: 1196/527, loss: 0.007122765760868788 2023-01-22 21:06:36.645928: step: 1200/527, loss: 0.013456560671329498 2023-01-22 21:06:37.698375: step: 1204/527, loss: 0.0008182553574442863 2023-01-22 21:06:38.748018: step: 1208/527, loss: 0.004961424972862005 2023-01-22 21:06:39.793312: step: 1212/527, loss: 0.006390864495187998 2023-01-22 21:06:40.838396: step: 1216/527, loss: 
0.0012007488403469324 2023-01-22 21:06:41.890117: step: 1220/527, loss: 0.004697028547525406 2023-01-22 21:06:42.930586: step: 1224/527, loss: 0.005180972628295422 2023-01-22 21:06:43.991765: step: 1228/527, loss: 0.00177439721301198 2023-01-22 21:06:45.042165: step: 1232/527, loss: 0.00016822277393657714 2023-01-22 21:06:46.101921: step: 1236/527, loss: 0.00426302757114172 2023-01-22 21:06:47.145318: step: 1240/527, loss: 0.0056867473758757114 2023-01-22 21:06:48.185030: step: 1244/527, loss: 0.007592365611344576 2023-01-22 21:06:49.252866: step: 1248/527, loss: 0.02699047513306141 2023-01-22 21:06:50.272820: step: 1252/527, loss: 0.0001514601317467168 2023-01-22 21:06:51.314593: step: 1256/527, loss: 0.005626342259347439 2023-01-22 21:06:52.359759: step: 1260/527, loss: 0.007088867481797934 2023-01-22 21:06:53.408160: step: 1264/527, loss: 0.10504268109798431 2023-01-22 21:06:54.451941: step: 1268/527, loss: 0.011039432138204575 2023-01-22 21:06:55.502230: step: 1272/527, loss: 0.0011690608225762844 2023-01-22 21:06:56.573019: step: 1276/527, loss: 0.005887324456125498 2023-01-22 21:06:57.608311: step: 1280/527, loss: 0.013546577654778957 2023-01-22 21:06:58.653600: step: 1284/527, loss: 0.0008591560181230307 2023-01-22 21:06:59.717544: step: 1288/527, loss: 0.008857784792780876 2023-01-22 21:07:00.771118: step: 1292/527, loss: 0.0036294113378971815 2023-01-22 21:07:01.822164: step: 1296/527, loss: 0.007951263338327408 2023-01-22 21:07:02.865155: step: 1300/527, loss: 0.004809096455574036 2023-01-22 21:07:03.903124: step: 1304/527, loss: 0.0024872305803000927 2023-01-22 21:07:04.952407: step: 1308/527, loss: 0.012335199862718582 2023-01-22 21:07:06.026405: step: 1312/527, loss: 0.0016196668148040771 2023-01-22 21:07:07.085687: step: 1316/527, loss: 0.016291413456201553 2023-01-22 21:07:08.134195: step: 1320/527, loss: 0.0013489486882463098 2023-01-22 21:07:09.185685: step: 1324/527, loss: 0.020952150225639343 2023-01-22 21:07:10.242876: step: 1328/527, loss: 
0.004605205729603767 2023-01-22 21:07:11.293163: step: 1332/527, loss: 0.008913586847484112 2023-01-22 21:07:12.338592: step: 1336/527, loss: 0.007838152348995209 2023-01-22 21:07:13.392802: step: 1340/527, loss: 0.027775771915912628 2023-01-22 21:07:14.454901: step: 1344/527, loss: 0.003997542429715395 2023-01-22 21:07:15.493438: step: 1348/527, loss: 0.007928516715765 2023-01-22 21:07:16.543460: step: 1352/527, loss: 0.0024348124861717224 2023-01-22 21:07:17.600322: step: 1356/527, loss: 0.008911040611565113 2023-01-22 21:07:18.647534: step: 1360/527, loss: 0.004522170405834913 2023-01-22 21:07:19.715188: step: 1364/527, loss: 0.016758102923631668 2023-01-22 21:07:20.759790: step: 1368/527, loss: 0.02025187388062477 2023-01-22 21:07:21.797868: step: 1372/527, loss: 0.004952143877744675 2023-01-22 21:07:22.846886: step: 1376/527, loss: 0.0029432408045977354 2023-01-22 21:07:23.901690: step: 1380/527, loss: 0.017527470365166664 2023-01-22 21:07:24.949760: step: 1384/527, loss: 0.002615528181195259 2023-01-22 21:07:26.001333: step: 1388/527, loss: 0.004539438523352146 2023-01-22 21:07:27.049565: step: 1392/527, loss: 0.0361332893371582 2023-01-22 21:07:28.110554: step: 1396/527, loss: 0.004856666550040245 2023-01-22 21:07:29.172862: step: 1400/527, loss: 0.06768810003995895 2023-01-22 21:07:30.229770: step: 1404/527, loss: 0.0186097864061594 2023-01-22 21:07:31.297677: step: 1408/527, loss: 0.012118324637413025 2023-01-22 21:07:32.349594: step: 1412/527, loss: 0.00044537955545820296 2023-01-22 21:07:33.401597: step: 1416/527, loss: 0.00634266110137105 2023-01-22 21:07:34.455880: step: 1420/527, loss: 0.004623272456228733 2023-01-22 21:07:35.482935: step: 1424/527, loss: 0.0037482138723134995 2023-01-22 21:07:36.529848: step: 1428/527, loss: 0.012909426353871822 2023-01-22 21:07:37.585736: step: 1432/527, loss: 0.012598827481269836 2023-01-22 21:07:38.638753: step: 1436/527, loss: 0.0020017647184431553 2023-01-22 21:07:39.684445: step: 1440/527, loss: 
0.006603737827390432 2023-01-22 21:07:40.736111: step: 1444/527, loss: 0.016444412991404533 2023-01-22 21:07:41.801978: step: 1448/527, loss: 0.009363116696476936 2023-01-22 21:07:42.844011: step: 1452/527, loss: 0.001328960177488625 2023-01-22 21:07:43.890655: step: 1456/527, loss: 0.02919343113899231 2023-01-22 21:07:44.950107: step: 1460/527, loss: 0.010756390169262886 2023-01-22 21:07:46.014167: step: 1464/527, loss: 0.004611361771821976 2023-01-22 21:07:47.066182: step: 1468/527, loss: 0.006537396926432848 2023-01-22 21:07:48.130223: step: 1472/527, loss: 0.023674989119172096 2023-01-22 21:07:49.165653: step: 1476/527, loss: 0.001452240627259016 2023-01-22 21:07:50.227907: step: 1480/527, loss: 0.0025078426115214825 2023-01-22 21:07:51.272550: step: 1484/527, loss: 0.0019193928455933928 2023-01-22 21:07:52.326341: step: 1488/527, loss: 0.007772236131131649 2023-01-22 21:07:53.377110: step: 1492/527, loss: 0.005323073826730251 2023-01-22 21:07:54.444569: step: 1496/527, loss: 0.010731426998972893 2023-01-22 21:07:55.497827: step: 1500/527, loss: 0.0007174506899900734 2023-01-22 21:07:56.542531: step: 1504/527, loss: 0.0017662025056779385 2023-01-22 21:07:57.601456: step: 1508/527, loss: 0.0021301519591361284 2023-01-22 21:07:58.651322: step: 1512/527, loss: 0.019619354978203773 2023-01-22 21:07:59.707239: step: 1516/527, loss: 0.014806658960878849 2023-01-22 21:08:00.752595: step: 1520/527, loss: 0.0057207257486879826 2023-01-22 21:08:01.806173: step: 1524/527, loss: 0.004767339210957289 2023-01-22 21:08:02.852700: step: 1528/527, loss: 0.007127249613404274 2023-01-22 21:08:03.878265: step: 1532/527, loss: 0.00903788860887289 2023-01-22 21:08:04.928505: step: 1536/527, loss: 0.0018995794234797359 2023-01-22 21:08:05.978629: step: 1540/527, loss: 0.009936696849763393 2023-01-22 21:08:07.034582: step: 1544/527, loss: 0.002004675567150116 2023-01-22 21:08:08.096006: step: 1548/527, loss: 0.01250398624688387 2023-01-22 21:08:09.130040: step: 1552/527, loss: 
0.014228587038815022 2023-01-22 21:08:10.193866: step: 1556/527, loss: 0.0029483947437256575 2023-01-22 21:08:11.235306: step: 1560/527, loss: 0.00852295383810997 2023-01-22 21:08:12.275723: step: 1564/527, loss: 0.006583585869520903 2023-01-22 21:08:13.328775: step: 1568/527, loss: 0.007297124247997999 2023-01-22 21:08:14.368050: step: 1572/527, loss: 0.0007486153044737875 2023-01-22 21:08:15.426246: step: 1576/527, loss: 0.0025018302258104086 2023-01-22 21:08:16.498486: step: 1580/527, loss: 0.002946759108453989 2023-01-22 21:08:17.552865: step: 1584/527, loss: 0.006021034903824329 2023-01-22 21:08:18.606270: step: 1588/527, loss: 0.001694349106401205 2023-01-22 21:08:19.657347: step: 1592/527, loss: 0.008421924896538258 2023-01-22 21:08:20.707052: step: 1596/527, loss: 0.003631877712905407 2023-01-22 21:08:21.766581: step: 1600/527, loss: 0.005108369514346123 2023-01-22 21:08:22.806570: step: 1604/527, loss: 0.003427752060815692 2023-01-22 21:08:23.863755: step: 1608/527, loss: 0.00750616192817688 2023-01-22 21:08:24.902206: step: 1612/527, loss: 0.017545530572533607 2023-01-22 21:08:25.946264: step: 1616/527, loss: 0.004852135665714741 2023-01-22 21:08:26.993043: step: 1620/527, loss: 0.011480141431093216 2023-01-22 21:08:28.042700: step: 1624/527, loss: 0.00036228299723006785 2023-01-22 21:08:29.081606: step: 1628/527, loss: 0.0027326145209372044 2023-01-22 21:08:30.132888: step: 1632/527, loss: 0.006708966102451086 2023-01-22 21:08:31.185632: step: 1636/527, loss: 0.004288940690457821 2023-01-22 21:08:32.249661: step: 1640/527, loss: 0.004090540111064911 2023-01-22 21:08:33.304914: step: 1644/527, loss: 0.01779811829328537 2023-01-22 21:08:34.372312: step: 1648/527, loss: 0.0066120317205786705 2023-01-22 21:08:35.426607: step: 1652/527, loss: 0.05339784547686577 2023-01-22 21:08:36.469118: step: 1656/527, loss: 0.012658665888011456 2023-01-22 21:08:37.510056: step: 1660/527, loss: 0.003409152152016759 2023-01-22 21:08:38.562943: step: 1664/527, loss: 
0.004463388584554195 2023-01-22 21:08:39.601327: step: 1668/527, loss: 0.007682070601731539 2023-01-22 21:08:40.647186: step: 1672/527, loss: 0.0011302304919809103 2023-01-22 21:08:41.681742: step: 1676/527, loss: 0.0012633507139980793 2023-01-22 21:08:42.740424: step: 1680/527, loss: 0.0004005008959211409 2023-01-22 21:08:43.792719: step: 1684/527, loss: 0.004677613731473684 2023-01-22 21:08:44.848613: step: 1688/527, loss: 0.008651215583086014 2023-01-22 21:08:45.899830: step: 1692/527, loss: 0.005553642753511667 2023-01-22 21:08:46.948537: step: 1696/527, loss: 0.007715283893048763 2023-01-22 21:08:47.997489: step: 1700/527, loss: 0.003041085321456194 2023-01-22 21:08:49.049469: step: 1704/527, loss: 0.009844905696809292 2023-01-22 21:08:50.109594: step: 1708/527, loss: 0.003409659257158637 2023-01-22 21:08:51.136073: step: 1712/527, loss: 0.0015394684160128236 2023-01-22 21:08:52.199692: step: 1716/527, loss: 0.0003158093895763159 2023-01-22 21:08:53.258049: step: 1720/527, loss: 0.009782586246728897 2023-01-22 21:08:54.297535: step: 1724/527, loss: 0.0077540758065879345 2023-01-22 21:08:55.343188: step: 1728/527, loss: 0.010061799548566341 2023-01-22 21:08:56.398103: step: 1732/527, loss: 0.0055779218673706055 2023-01-22 21:08:57.438702: step: 1736/527, loss: 0.003011499298736453 2023-01-22 21:08:58.495728: step: 1740/527, loss: 0.004066950641572475 2023-01-22 21:08:59.544987: step: 1744/527, loss: 0.005180165637284517 2023-01-22 21:09:00.596689: step: 1748/527, loss: 0.019502606242895126 2023-01-22 21:09:01.636369: step: 1752/527, loss: 0.006202231626957655 2023-01-22 21:09:02.689361: step: 1756/527, loss: 0.0028805765323340893 2023-01-22 21:09:03.738839: step: 1760/527, loss: 0.006118957884609699 2023-01-22 21:09:04.784798: step: 1764/527, loss: 0.008253831416368484 2023-01-22 21:09:05.825472: step: 1768/527, loss: 0.007253426592797041 2023-01-22 21:09:06.878457: step: 1772/527, loss: 0.010060278698801994 2023-01-22 21:09:07.943299: step: 1776/527, loss: 
0.014416994526982307 2023-01-22 21:09:08.977734: step: 1780/527, loss: 0.003989460878074169 2023-01-22 21:09:10.025023: step: 1784/527, loss: 0.0040052211843431 2023-01-22 21:09:11.065716: step: 1788/527, loss: 0.006135378964245319 2023-01-22 21:09:12.128406: step: 1792/527, loss: 0.007094192318618298 2023-01-22 21:09:13.171300: step: 1796/527, loss: 0.009925086051225662 2023-01-22 21:09:14.214326: step: 1800/527, loss: 0.00942633580416441 2023-01-22 21:09:15.260976: step: 1804/527, loss: 0.002414081012830138 2023-01-22 21:09:16.316342: step: 1808/527, loss: 0.0015909569337964058 2023-01-22 21:09:17.370260: step: 1812/527, loss: 0.006630251184105873 2023-01-22 21:09:18.409987: step: 1816/527, loss: 0.012628388591110706 2023-01-22 21:09:19.473534: step: 1820/527, loss: 0.0032436393667012453 2023-01-22 21:09:20.512253: step: 1824/527, loss: 0.006815536879003048 2023-01-22 21:09:21.572251: step: 1828/527, loss: 0.005601773504167795 2023-01-22 21:09:22.618195: step: 1832/527, loss: 0.0029775493312627077 2023-01-22 21:09:23.691703: step: 1836/527, loss: 0.0008988206391222775 2023-01-22 21:09:24.728376: step: 1840/527, loss: 0.005202633794397116 2023-01-22 21:09:25.774660: step: 1844/527, loss: 0.004864899441599846 2023-01-22 21:09:26.824218: step: 1848/527, loss: 0.015622783452272415 2023-01-22 21:09:27.870484: step: 1852/527, loss: 0.007590119261294603 2023-01-22 21:09:28.920448: step: 1856/527, loss: 0.0016814471455290914 2023-01-22 21:09:29.972566: step: 1860/527, loss: 0.01379324309527874 2023-01-22 21:09:31.020700: step: 1864/527, loss: 0.0002621396561153233 2023-01-22 21:09:32.072018: step: 1868/527, loss: 0.03247030824422836 2023-01-22 21:09:33.135863: step: 1872/527, loss: 0.0045582083985209465 2023-01-22 21:09:34.189359: step: 1876/527, loss: 0.013034965842962265 2023-01-22 21:09:35.251351: step: 1880/527, loss: 0.004103172104805708 2023-01-22 21:09:36.291388: step: 1884/527, loss: 0.002278828527778387 2023-01-22 21:09:37.351059: step: 1888/527, loss: 
0.0071377526037395 2023-01-22 21:09:38.395054: step: 1892/527, loss: 0.00052174850134179 2023-01-22 21:09:39.455111: step: 1896/527, loss: 0.017427073791623116 2023-01-22 21:09:40.500003: step: 1900/527, loss: 0.002051716670393944 2023-01-22 21:09:41.547725: step: 1904/527, loss: 0.03487671539187431 2023-01-22 21:09:42.594907: step: 1908/527, loss: 0.0019895327277481556 2023-01-22 21:09:43.653324: step: 1912/527, loss: 0.0014655434060841799 2023-01-22 21:09:44.707509: step: 1916/527, loss: 0.005020969547331333 2023-01-22 21:09:45.746945: step: 1920/527, loss: 0.007100861053913832 2023-01-22 21:09:46.803844: step: 1924/527, loss: 0.00624576210975647 2023-01-22 21:09:47.860632: step: 1928/527, loss: 0.0026606405153870583 2023-01-22 21:09:48.909884: step: 1932/527, loss: 0.0025374190881848335 2023-01-22 21:09:49.955221: step: 1936/527, loss: 0.013194175437092781 2023-01-22 21:09:50.999871: step: 1940/527, loss: 0.043300073593854904 2023-01-22 21:09:52.040561: step: 1944/527, loss: 0.003854151349514723 2023-01-22 21:09:53.093185: step: 1948/527, loss: 0.006117967888712883 2023-01-22 21:09:54.129081: step: 1952/527, loss: 0.0012772480258718133 2023-01-22 21:09:55.178040: step: 1956/527, loss: 0.006637756712734699 2023-01-22 21:09:56.224368: step: 1960/527, loss: 0.013307631947100163 2023-01-22 21:09:57.287465: step: 1964/527, loss: 0.007159658707678318 2023-01-22 21:09:58.340941: step: 1968/527, loss: 0.0021677978802472353 2023-01-22 21:09:59.403867: step: 1972/527, loss: 0.0038607243914157152 2023-01-22 21:10:00.442692: step: 1976/527, loss: 0.004574859514832497 2023-01-22 21:10:01.496275: step: 1980/527, loss: 0.008794196881353855 2023-01-22 21:10:02.533254: step: 1984/527, loss: 6.689530709991232e-05 2023-01-22 21:10:03.579903: step: 1988/527, loss: 0.0054607815109193325 2023-01-22 21:10:04.625439: step: 1992/527, loss: 0.004425434861332178 2023-01-22 21:10:05.660534: step: 1996/527, loss: 0.0022823847830295563 2023-01-22 21:10:06.702408: step: 2000/527, loss: 
0.0010275078238919377 2023-01-22 21:10:07.737453: step: 2004/527, loss: 0.007139250636100769 2023-01-22 21:10:08.784729: step: 2008/527, loss: 0.0002385387779213488 2023-01-22 21:10:09.841665: step: 2012/527, loss: 0.003703586058691144 2023-01-22 21:10:10.897043: step: 2016/527, loss: 0.0005519840051420033 2023-01-22 21:10:11.958155: step: 2020/527, loss: 0.0845273956656456 2023-01-22 21:10:12.998316: step: 2024/527, loss: 0.0003239141951780766 2023-01-22 21:10:14.046059: step: 2028/527, loss: 0.004487867932766676 2023-01-22 21:10:15.095340: step: 2032/527, loss: 0.006838872097432613 2023-01-22 21:10:16.142054: step: 2036/527, loss: 0.006086735520511866 2023-01-22 21:10:17.190260: step: 2040/527, loss: 0.00017563503934070468 2023-01-22 21:10:18.230781: step: 2044/527, loss: 0.016819581389427185 2023-01-22 21:10:19.302648: step: 2048/527, loss: 0.0034136290196329355 2023-01-22 21:10:20.351805: step: 2052/527, loss: 0.001498789875768125 2023-01-22 21:10:21.393043: step: 2056/527, loss: 0.004052781034260988 2023-01-22 21:10:22.447505: step: 2060/527, loss: 0.02138926461338997 2023-01-22 21:10:23.501012: step: 2064/527, loss: 0.0009345430880784988 2023-01-22 21:10:24.546366: step: 2068/527, loss: 0.01055892277508974 2023-01-22 21:10:25.581236: step: 2072/527, loss: 0.007733316160738468 2023-01-22 21:10:26.627866: step: 2076/527, loss: 0.01906740292906761 2023-01-22 21:10:27.662446: step: 2080/527, loss: 0.009845261462032795 2023-01-22 21:10:28.714802: step: 2084/527, loss: 0.0012843109434470534 2023-01-22 21:10:29.751528: step: 2088/527, loss: 0.0008479171083308756 2023-01-22 21:10:30.800930: step: 2092/527, loss: 0.004049117676913738 2023-01-22 21:10:31.848426: step: 2096/527, loss: 0.03851398080587387 2023-01-22 21:10:32.917243: step: 2100/527, loss: 0.004781814757734537 2023-01-22 21:10:33.965905: step: 2104/527, loss: 0.009083278477191925 2023-01-22 21:10:35.040705: step: 2108/527, loss: 0.012636136263608932 ================================================== Loss: 
0.008
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32396018683274025, 'r': 0.3454755692599621, 'f1': 0.3343721303948577}, 'combined': 0.24637946450147408, 'stategy': 1, 'epoch': 8}
Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3330581204403501, 'r': 0.307624591243087, 'f1': 0.3198365315381812}, 'combined': 0.20469538018443595, 'stategy': 1, 'epoch': 8}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3220745706365417, 'r': 0.3642437269437929, 'f1': 0.34186365823575937}, 'combined': 0.25189953764740164, 'stategy': 1, 'epoch': 8}
Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.35371676120727746, 'r': 0.31641572093451, 'f1': 0.33402811231090307}, 'combined': 0.21377799187897792, 'stategy': 1, 'epoch': 8}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3295121025991449, 'r': 0.33576470416649107, 'f1': 0.33260902085665567}, 'combined': 0.24508033115753575, 'stategy': 1, 'epoch': 8}
Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3605647599524678, 'r': 0.29724446998811266, 'f1': 0.3258570299420806}, 'combined': 0.23363334222262386, 'stategy': 1, 'epoch': 8}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.261437908496732, 'r': 0.38095238095238093, 'f1': 0.31007751937984496}, 'combined': 0.20671834625322996, 'stategy': 1, 'epoch': 8}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31666666666666665, 'r': 0.41304347826086957, 'f1': 0.3584905660377358}, 'combined': 0.1792452830188679, 'stategy': 1, 'epoch': 8}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 8}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0}
Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3295121025991449, 'r': 0.33576470416649107, 'f1': 0.33260902085665567}, 'combined': 0.24508033115753575, 'stategy': 1, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3605647599524678, 'r': 0.29724446998811266, 'f1': 0.3258570299420806}, 'combined': 0.23363334222262386, 'stategy': 1, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 8}
******************************
Epoch: 9
command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4
2023-01-22 21:13:13.513255: step: 4/527, loss: 0.0033048205077648163 2023-01-22 21:13:14.552022: step: 8/527, loss: 0.00628558499738574 2023-01-22 21:13:15.584987: step: 12/527, loss: 0.007464157417416573 2023-01-22 21:13:16.641477: step: 16/527, loss: 0.010390263982117176 2023-01-22 21:13:17.693308: step: 20/527, loss: 0.03318922221660614 2023-01-22 21:13:18.722192: step: 24/527, loss: 0.005193404853343964 2023-01-22 21:13:19.786754: step: 28/527, loss: 0.0054178559221327305 2023-01-22 21:13:20.825514: step: 32/527, loss: 0.0016802771715447307 2023-01-22 21:13:21.847290: step: 36/527, loss: 0.008208749815821648 2023-01-22 21:13:22.895933: step: 40/527, loss: 0.004966137930750847 2023-01-22 21:13:23.947792: step: 44/527, loss: 0.00980713777244091 2023-01-22 21:13:24.980688: step: 48/527, loss: 0.001211044960655272 2023-01-22 21:13:26.020836: step: 52/527, loss: 0.001335397595539689 2023-01-22 21:13:27.062462: step: 56/527, loss: 0.005293728783726692 2023-01-22 21:13:28.109294: step: 60/527, loss: 0.014024567790329456 2023-01-22 21:13:29.169971: step: 64/527, loss: 0.0037151824217289686 2023-01-22 21:13:30.217829: step: 68/527, loss: 0.004241329617798328 2023-01-22 21:13:31.273670: step: 72/527, loss: 0.010848058387637138 2023-01-22 21:13:32.301725: step: 76/527, loss: 0.0004890532582066953 2023-01-22 
21:13:33.351498: step: 80/527, loss: 0.0344138965010643 2023-01-22 21:13:34.399729: step: 84/527, loss: 0.008191194385290146 2023-01-22 21:13:35.426609: step: 88/527, loss: 0.0031900103203952312 2023-01-22 21:13:36.473647: step: 92/527, loss: 0.003622998483479023 2023-01-22 21:13:37.512939: step: 96/527, loss: 0.002374161034822464 2023-01-22 21:13:38.564162: step: 100/527, loss: 7.176781946327537e-05 2023-01-22 21:13:39.593688: step: 104/527, loss: 6.124939682194963e-05 2023-01-22 21:13:40.634217: step: 108/527, loss: 0.0010666617890819907 2023-01-22 21:13:41.673520: step: 112/527, loss: 0.0009427742334082723 2023-01-22 21:13:42.720192: step: 116/527, loss: 0.015784695744514465 2023-01-22 21:13:43.760806: step: 120/527, loss: 0.0035453890450298786 2023-01-22 21:13:44.811704: step: 124/527, loss: 0.00014572578947991133 2023-01-22 21:13:45.855020: step: 128/527, loss: 0.0005226345383562148 2023-01-22 21:13:46.901506: step: 132/527, loss: 0.020005855709314346 2023-01-22 21:13:47.947703: step: 136/527, loss: 0.0069860536605119705 2023-01-22 21:13:48.989546: step: 140/527, loss: 0.0015370113542303443 2023-01-22 21:13:50.045910: step: 144/527, loss: 0.005408475641161203 2023-01-22 21:13:51.087334: step: 148/527, loss: 0.010170998051762581 2023-01-22 21:13:52.137349: step: 152/527, loss: 0.007378180045634508 2023-01-22 21:13:53.179624: step: 156/527, loss: 0.007173619233071804 2023-01-22 21:13:54.240750: step: 160/527, loss: 0.010527165606617928 2023-01-22 21:13:55.295437: step: 164/527, loss: 0.00205438956618309 2023-01-22 21:13:56.350228: step: 168/527, loss: 0.009719911031425 2023-01-22 21:13:57.390154: step: 172/527, loss: 0.0027593474369496107 2023-01-22 21:13:58.447787: step: 176/527, loss: 0.0027350790333002806 2023-01-22 21:13:59.486217: step: 180/527, loss: 0.0025922656059265137 2023-01-22 21:14:00.538177: step: 184/527, loss: 0.015137199312448502 2023-01-22 21:14:01.596742: step: 188/527, loss: 0.004287356976419687 2023-01-22 21:14:02.633695: step: 192/527, 
loss: 0.004597049672156572 2023-01-22 21:14:03.679950: step: 196/527, loss: 0.0009718707296997309 2023-01-22 21:14:04.744826: step: 200/527, loss: 0.0006976012955419719 2023-01-22 21:14:05.798539: step: 204/527, loss: 0.008843767456710339 2023-01-22 21:14:06.857146: step: 208/527, loss: 0.0010709972120821476 2023-01-22 21:14:07.903477: step: 212/527, loss: 0.0030473654624074697 2023-01-22 21:14:08.961256: step: 216/527, loss: 0.0059176222421228886 2023-01-22 21:14:10.005220: step: 220/527, loss: 0.001009593834169209 2023-01-22 21:14:11.054281: step: 224/527, loss: 0.0021999201271682978 2023-01-22 21:14:12.107944: step: 228/527, loss: 0.0015889824135228992 2023-01-22 21:14:13.157458: step: 232/527, loss: 0.011037415824830532 2023-01-22 21:14:14.207104: step: 236/527, loss: 0.008738135918974876 2023-01-22 21:14:15.254457: step: 240/527, loss: 0.00613620737567544 2023-01-22 21:14:16.344168: step: 244/527, loss: 0.07120721787214279 2023-01-22 21:14:17.385988: step: 248/527, loss: 0.004037360195070505 2023-01-22 21:14:18.423148: step: 252/527, loss: 0.0010949140414595604 2023-01-22 21:14:19.471105: step: 256/527, loss: 0.009065071120858192 2023-01-22 21:14:20.524065: step: 260/527, loss: 0.005262174177914858 2023-01-22 21:14:21.569175: step: 264/527, loss: 0.0025205330457538366 2023-01-22 21:14:22.626742: step: 268/527, loss: 0.004450581502169371 2023-01-22 21:14:23.667992: step: 272/527, loss: 0.013686439022421837 2023-01-22 21:14:24.711390: step: 276/527, loss: 0.0041384524665772915 2023-01-22 21:14:25.753879: step: 280/527, loss: 0.006845667026937008 2023-01-22 21:14:26.794818: step: 284/527, loss: 0.0023614733945578337 2023-01-22 21:14:27.836904: step: 288/527, loss: 0.007462221663445234 2023-01-22 21:14:28.900560: step: 292/527, loss: 0.0022659345995634794 2023-01-22 21:14:29.947049: step: 296/527, loss: 0.0024779837112873793 2023-01-22 21:14:31.008129: step: 300/527, loss: 0.0007155893836170435 2023-01-22 21:14:32.045311: step: 304/527, loss: 0.02356657385826111 
2023-01-22 21:14:33.095081: step: 308/527, loss: 0.0005279480828903615 2023-01-22 21:14:34.142740: step: 312/527, loss: 0.006099306978285313 2023-01-22 21:14:35.206769: step: 316/527, loss: 0.007159698288887739 2023-01-22 21:14:36.251362: step: 320/527, loss: 0.0013627164298668504 2023-01-22 21:14:37.301688: step: 324/527, loss: 0.03577382490038872 2023-01-22 21:14:38.357993: step: 328/527, loss: 0.03740757703781128 2023-01-22 21:14:39.417600: step: 332/527, loss: 0.012488479726016521 2023-01-22 21:14:40.479012: step: 336/527, loss: 0.0020703908521682024 2023-01-22 21:14:41.543296: step: 340/527, loss: 0.00265825679525733 2023-01-22 21:14:42.594183: step: 344/527, loss: 0.0023790725972503424 2023-01-22 21:14:43.654247: step: 348/527, loss: 0.007340741343796253 2023-01-22 21:14:44.717707: step: 352/527, loss: 0.005207661539316177 2023-01-22 21:14:45.766136: step: 356/527, loss: 0.009817021898925304 2023-01-22 21:14:46.813474: step: 360/527, loss: 0.006574298720806837 2023-01-22 21:14:47.865249: step: 364/527, loss: 0.023271994665265083 2023-01-22 21:14:48.919961: step: 368/527, loss: 0.007112201768904924 2023-01-22 21:14:49.981560: step: 372/527, loss: 0.0051622046157717705 2023-01-22 21:14:51.044066: step: 376/527, loss: 0.0008659258019179106 2023-01-22 21:14:52.087698: step: 380/527, loss: 0.0018084509065374732 2023-01-22 21:14:53.125911: step: 384/527, loss: 0.0023707428481429815 2023-01-22 21:14:54.181187: step: 388/527, loss: 0.0011350124841555953 2023-01-22 21:14:55.236161: step: 392/527, loss: 0.011461051180958748 2023-01-22 21:14:56.274781: step: 396/527, loss: 0.0008804863318800926 2023-01-22 21:14:57.314373: step: 400/527, loss: 0.001423258800059557 2023-01-22 21:14:58.364465: step: 404/527, loss: 0.005903357174247503 2023-01-22 21:14:59.426896: step: 408/527, loss: 0.008300085552036762 2023-01-22 21:15:00.490710: step: 412/527, loss: 0.0011669457890093327 2023-01-22 21:15:01.563846: step: 416/527, loss: 0.006710781715810299 2023-01-22 21:15:02.623867: 
step: 420/527, loss: 0.004249363671988249 2023-01-22 21:15:03.682595: step: 424/527, loss: 0.002789535094052553 2023-01-22 21:15:04.730022: step: 428/527, loss: 0.005826150998473167 2023-01-22 21:15:05.783837: step: 432/527, loss: 0.001404089154675603 2023-01-22 21:15:06.836298: step: 436/527, loss: 0.00032881333027035 2023-01-22 21:15:07.886867: step: 440/527, loss: 0.0017484960844740272 2023-01-22 21:15:08.932792: step: 444/527, loss: 0.002080990467220545 2023-01-22 21:15:09.991000: step: 448/527, loss: 0.00020153906370978802 2023-01-22 21:15:11.048160: step: 452/527, loss: 0.0009347113082185388 2023-01-22 21:15:12.096825: step: 456/527, loss: 0.01221492700278759 2023-01-22 21:15:13.143721: step: 460/527, loss: 0.001383670256473124 2023-01-22 21:15:14.190931: step: 464/527, loss: 0.00013337848940864205 2023-01-22 21:15:15.248274: step: 468/527, loss: 0.0018460775027051568 2023-01-22 21:15:16.309412: step: 472/527, loss: 0.0016555074835196137 2023-01-22 21:15:17.371929: step: 476/527, loss: 8.989281195681542e-05 2023-01-22 21:15:18.414901: step: 480/527, loss: 0.00475011533126235 2023-01-22 21:15:19.507420: step: 484/527, loss: 0.005712158977985382 2023-01-22 21:15:20.573694: step: 488/527, loss: 0.003547893837094307 2023-01-22 21:15:21.637067: step: 492/527, loss: 0.0014360366621986032 2023-01-22 21:15:22.690951: step: 496/527, loss: 8.79975559655577e-05 2023-01-22 21:15:23.766173: step: 500/527, loss: 0.0028699792455881834 2023-01-22 21:15:24.815720: step: 504/527, loss: 0.007351300213485956 2023-01-22 21:15:25.866428: step: 508/527, loss: 0.01670556142926216 2023-01-22 21:15:26.947275: step: 512/527, loss: 0.003492743708193302 2023-01-22 21:15:27.999129: step: 516/527, loss: 0.0033423209097236395 2023-01-22 21:15:29.041644: step: 520/527, loss: 0.0024150048848241568 2023-01-22 21:15:30.118197: step: 524/527, loss: 0.004328024107962847 2023-01-22 21:15:31.185897: step: 528/527, loss: 0.006088695488870144 2023-01-22 21:15:32.233848: step: 532/527, loss: 
0.009847326204180717 2023-01-22 21:15:33.288226: step: 536/527, loss: 0.0009617412579245865 2023-01-22 21:15:34.346454: step: 540/527, loss: 0.006474452558904886 2023-01-22 21:15:35.394603: step: 544/527, loss: 0.0002648832742124796 2023-01-22 21:15:36.449644: step: 548/527, loss: 0.0005481508560478687 2023-01-22 21:15:37.496559: step: 552/527, loss: 0.028697669506072998 2023-01-22 21:15:38.551137: step: 556/527, loss: 0.000871647906024009 2023-01-22 21:15:39.607944: step: 560/527, loss: 0.007545448839664459 2023-01-22 21:15:40.658730: step: 564/527, loss: 0.02357676438987255 2023-01-22 21:15:41.726968: step: 568/527, loss: 0.0023842155933380127 2023-01-22 21:15:42.790980: step: 572/527, loss: 0.0032975059002637863 2023-01-22 21:15:43.861426: step: 576/527, loss: 0.005622240714728832 2023-01-22 21:15:44.906123: step: 580/527, loss: 0.007460377179086208 2023-01-22 21:15:45.967134: step: 584/527, loss: 0.0036258345935493708 2023-01-22 21:15:47.023451: step: 588/527, loss: 0.004793180152773857 2023-01-22 21:15:48.070360: step: 592/527, loss: 0.005957207642495632 2023-01-22 21:15:49.109105: step: 596/527, loss: 0.016912993043661118 2023-01-22 21:15:50.203848: step: 600/527, loss: 0.005824789870530367 2023-01-22 21:15:51.265273: step: 604/527, loss: 0.022178053855895996 2023-01-22 21:15:52.325574: step: 608/527, loss: 0.000452561245765537 2023-01-22 21:15:53.384790: step: 612/527, loss: 0.0013217199593782425 2023-01-22 21:15:54.437820: step: 616/527, loss: 0.0008828747668303549 2023-01-22 21:15:55.490996: step: 620/527, loss: 0.0014094137586653233 2023-01-22 21:15:56.531599: step: 624/527, loss: 0.004334430210292339 2023-01-22 21:15:57.589215: step: 628/527, loss: 0.0017066608415916562 2023-01-22 21:15:58.659252: step: 632/527, loss: 0.002489479025825858 2023-01-22 21:15:59.707716: step: 636/527, loss: 0.0006815826054662466 2023-01-22 21:16:00.759246: step: 640/527, loss: 0.0053624981082975864 2023-01-22 21:16:01.829921: step: 644/527, loss: 0.009374787099659443 
2023-01-22 21:16:02.890635: step: 648/527, loss: 0.005790805909782648 2023-01-22 21:16:03.957487: step: 652/527, loss: 0.006711133755743504 2023-01-22 21:16:05.024249: step: 656/527, loss: 0.009350858628749847 2023-01-22 21:16:06.077605: step: 660/527, loss: 0.0009404016309417784 2023-01-22 21:16:07.128067: step: 664/527, loss: 0.008918608538806438 2023-01-22 21:16:08.176686: step: 668/527, loss: 0.004295982886105776 2023-01-22 21:16:09.232023: step: 672/527, loss: 2.8885775464004837e-05 2023-01-22 21:16:10.264806: step: 676/527, loss: 0.0020936280488967896 2023-01-22 21:16:11.325167: step: 680/527, loss: 0.005032512824982405 2023-01-22 21:16:12.376946: step: 684/527, loss: 0.004130632616579533 2023-01-22 21:16:13.438489: step: 688/527, loss: 0.006223151460289955 2023-01-22 21:16:14.492514: step: 692/527, loss: 0.016337711364030838 2023-01-22 21:16:15.558966: step: 696/527, loss: 0.0030684652738273144 2023-01-22 21:16:16.610576: step: 700/527, loss: 0.006968655623495579 2023-01-22 21:16:17.647856: step: 704/527, loss: 0.0009030302753672004 2023-01-22 21:16:18.712992: step: 708/527, loss: 0.0012635773746296763 2023-01-22 21:16:19.790723: step: 712/527, loss: 0.0052895741537213326 2023-01-22 21:16:20.847938: step: 716/527, loss: 0.0007400992326438427 2023-01-22 21:16:21.911912: step: 720/527, loss: 0.0027009667828679085 2023-01-22 21:16:22.966941: step: 724/527, loss: 0.0011684899218380451 2023-01-22 21:16:24.018984: step: 728/527, loss: 0.044344719499349594 2023-01-22 21:16:25.065769: step: 732/527, loss: 0.0008587951306253672 2023-01-22 21:16:26.115752: step: 736/527, loss: 0.010753421112895012 2023-01-22 21:16:27.166950: step: 740/527, loss: 0.005611707456409931 2023-01-22 21:16:28.212725: step: 744/527, loss: 0.003588673658668995 2023-01-22 21:16:29.257191: step: 748/527, loss: 0.0005281084449961782 2023-01-22 21:16:30.323862: step: 752/527, loss: 0.00015799708489794284 2023-01-22 21:16:31.384192: step: 756/527, loss: 0.013051873072981834 2023-01-22 
21:16:32.445715: step: 760/527, loss: 0.0017645241459831595 2023-01-22 21:16:33.499577: step: 764/527, loss: 0.0030068105552345514 2023-01-22 21:16:34.551467: step: 768/527, loss: 0.0032245442271232605 2023-01-22 21:16:35.618448: step: 772/527, loss: 0.001608636579476297 2023-01-22 21:16:36.675870: step: 776/527, loss: 0.0009087428916245699 2023-01-22 21:16:37.730914: step: 780/527, loss: 0.005056384485214949 2023-01-22 21:16:38.780495: step: 784/527, loss: 0.0043554785661399364 2023-01-22 21:16:39.827323: step: 788/527, loss: 0.0022038696333765984 2023-01-22 21:16:40.880356: step: 792/527, loss: 0.007971453480422497 2023-01-22 21:16:41.935940: step: 796/527, loss: 0.0022209854796528816 2023-01-22 21:16:42.982906: step: 800/527, loss: 0.0019030345138162374 2023-01-22 21:16:44.033050: step: 804/527, loss: 0.007975144311785698 2023-01-22 21:16:45.084960: step: 808/527, loss: 0.003980494569987059 2023-01-22 21:16:46.142621: step: 812/527, loss: 0.0034533338621258736 2023-01-22 21:16:47.185056: step: 816/527, loss: 0.00048152636736631393 2023-01-22 21:16:48.247038: step: 820/527, loss: 0.00901760533452034 2023-01-22 21:16:49.310943: step: 824/527, loss: 0.0010012831771746278 2023-01-22 21:16:50.365820: step: 828/527, loss: 0.014228510670363903 2023-01-22 21:16:51.435532: step: 832/527, loss: 0.0024030162021517754 2023-01-22 21:16:52.498926: step: 836/527, loss: 0.004572403151541948 2023-01-22 21:16:53.555352: step: 840/527, loss: 0.0006582144997082651 2023-01-22 21:16:54.599536: step: 844/527, loss: 0.01826358400285244 2023-01-22 21:16:55.652933: step: 848/527, loss: 0.004884585738182068 2023-01-22 21:16:56.703367: step: 852/527, loss: 0.005931575316935778 2023-01-22 21:16:57.757270: step: 856/527, loss: 0.0034075970761477947 2023-01-22 21:16:58.802099: step: 860/527, loss: 0.000262924178969115 2023-01-22 21:16:59.855945: step: 864/527, loss: 0.00541821401566267 2023-01-22 21:17:00.893067: step: 868/527, loss: 0.0029333201237022877 2023-01-22 21:17:01.951792: step: 
872/527, loss: 0.01131183747202158 2023-01-22 21:17:03.014007: step: 876/527, loss: 0.002358927857130766 2023-01-22 21:17:04.066476: step: 880/527, loss: 0.011612359434366226 2023-01-22 21:17:05.128802: step: 884/527, loss: 0.00248313439078629 2023-01-22 21:17:06.181276: step: 888/527, loss: 0.001263869577087462 2023-01-22 21:17:07.242323: step: 892/527, loss: 0.0037806208711117506 2023-01-22 21:17:08.285996: step: 896/527, loss: 0.0002942613500636071 2023-01-22 21:17:09.353360: step: 900/527, loss: 0.03475181758403778 2023-01-22 21:17:10.420624: step: 904/527, loss: 0.0011011157184839249 2023-01-22 21:17:11.478613: step: 908/527, loss: 0.0038861355278640985 2023-01-22 21:17:12.540723: step: 912/527, loss: 0.0029995073564350605 2023-01-22 21:17:13.598639: step: 916/527, loss: 0.003938739653676748 2023-01-22 21:17:14.659630: step: 920/527, loss: 0.0034913131967186928 2023-01-22 21:17:15.707724: step: 924/527, loss: 0.022138547152280807 2023-01-22 21:17:16.765742: step: 928/527, loss: 0.01640618033707142 2023-01-22 21:17:17.814442: step: 932/527, loss: 0.0062112645246088505 2023-01-22 21:17:18.869562: step: 936/527, loss: 0.0028167078271508217 2023-01-22 21:17:19.955227: step: 940/527, loss: 0.016081981360912323 2023-01-22 21:17:21.001824: step: 944/527, loss: 0.00893818773329258 2023-01-22 21:17:22.057315: step: 948/527, loss: 0.0034072105772793293 2023-01-22 21:17:23.121972: step: 952/527, loss: 0.004192746710032225 2023-01-22 21:17:24.179008: step: 956/527, loss: 0.001134840422309935 2023-01-22 21:17:25.221657: step: 960/527, loss: 0.019354024901986122 2023-01-22 21:17:26.274153: step: 964/527, loss: 0.010273844003677368 2023-01-22 21:17:27.334652: step: 968/527, loss: 0.0009943839395418763 2023-01-22 21:17:28.384354: step: 972/527, loss: 0.004493022337555885 2023-01-22 21:17:29.428787: step: 976/527, loss: 0.0005187156493775547 2023-01-22 21:17:30.485824: step: 980/527, loss: 0.0060578277334570885 2023-01-22 21:17:31.532869: step: 984/527, loss: 
0.006456491071730852 2023-01-22 21:17:32.579709: step: 988/527, loss: 0.0006316354847513139 2023-01-22 21:17:33.642438: step: 992/527, loss: 0.007543027400970459 2023-01-22 21:17:34.697262: step: 996/527, loss: 0.0046927956864237785 2023-01-22 21:17:35.747328: step: 1000/527, loss: 0.001250214409083128 2023-01-22 21:17:36.829238: step: 1004/527, loss: 0.0040900446474552155 2023-01-22 21:17:37.884803: step: 1008/527, loss: 0.011389343068003654 2023-01-22 21:17:38.937882: step: 1012/527, loss: 0.004603247623890638 2023-01-22 21:17:39.988808: step: 1016/527, loss: 0.002595828380435705 2023-01-22 21:17:41.034769: step: 1020/527, loss: 0.001541301142424345 2023-01-22 21:17:42.076081: step: 1024/527, loss: 6.932113319635391e-05 2023-01-22 21:17:43.149365: step: 1028/527, loss: 0.0052374848164618015 2023-01-22 21:17:44.191412: step: 1032/527, loss: 0.0 2023-01-22 21:17:45.235262: step: 1036/527, loss: 0.00318225403316319 2023-01-22 21:17:46.299073: step: 1040/527, loss: 0.007270502857863903 2023-01-22 21:17:47.353649: step: 1044/527, loss: 0.007681084331125021 2023-01-22 21:17:48.406219: step: 1048/527, loss: 0.007212923374027014 2023-01-22 21:17:49.468822: step: 1052/527, loss: 4.61352028651163e-05 2023-01-22 21:17:50.524572: step: 1056/527, loss: 0.003815811825916171 2023-01-22 21:17:51.563101: step: 1060/527, loss: 0.0013724776217713952 2023-01-22 21:17:52.627254: step: 1064/527, loss: 0.009887893684208393 2023-01-22 21:17:53.665959: step: 1068/527, loss: 0.00273287040181458 2023-01-22 21:17:54.726211: step: 1072/527, loss: 0.0027575402054935694 2023-01-22 21:17:55.779474: step: 1076/527, loss: 0.005254405550658703 2023-01-22 21:17:56.822754: step: 1080/527, loss: 0.012286016717553139 2023-01-22 21:17:57.881445: step: 1084/527, loss: 0.003363200929015875 2023-01-22 21:17:58.934071: step: 1088/527, loss: 0.0010884279618039727 2023-01-22 21:18:00.001057: step: 1092/527, loss: 0.001632597646676004 2023-01-22 21:18:01.038675: step: 1096/527, loss: 0.00014103917055763304 
2023-01-22 21:18:02.088733: step: 1100/527, loss: 0.00603564502671361 2023-01-22 21:18:03.134833: step: 1104/527, loss: 0.0013685551239177585 2023-01-22 21:18:04.185987: step: 1108/527, loss: 0.020622577518224716 2023-01-22 21:18:05.229767: step: 1112/527, loss: 0.006159561220556498 2023-01-22 21:18:06.269496: step: 1116/527, loss: 0.004934003110975027 2023-01-22 21:18:07.313035: step: 1120/527, loss: 0.014519993215799332 2023-01-22 21:18:08.356593: step: 1124/527, loss: 0.00016985707043204457 2023-01-22 21:18:09.425404: step: 1128/527, loss: 0.00581841915845871 2023-01-22 21:18:10.491886: step: 1132/527, loss: 0.0029912195168435574 2023-01-22 21:18:11.539288: step: 1136/527, loss: 0.0021725844126194715 2023-01-22 21:18:12.590186: step: 1140/527, loss: 0.005850040819495916 2023-01-22 21:18:13.641876: step: 1144/527, loss: 0.007465075701475143 2023-01-22 21:18:14.692251: step: 1148/527, loss: 0.010512011125683784 2023-01-22 21:18:15.729994: step: 1152/527, loss: 0.003384977113455534 2023-01-22 21:18:16.777139: step: 1156/527, loss: 3.9365568227367476e-05 2023-01-22 21:18:17.825149: step: 1160/527, loss: 0.0033887599129229784 2023-01-22 21:18:18.862308: step: 1164/527, loss: 0.011170794256031513 2023-01-22 21:18:19.935359: step: 1168/527, loss: 0.0012819372350350022 2023-01-22 21:18:20.988067: step: 1172/527, loss: 0.002545284805819392 2023-01-22 21:18:22.048726: step: 1176/527, loss: 0.006260955706238747 2023-01-22 21:18:23.092515: step: 1180/527, loss: 0.003953002858906984 2023-01-22 21:18:24.154606: step: 1184/527, loss: 0.0008687982917763293 2023-01-22 21:18:25.215741: step: 1188/527, loss: 0.001278851181268692 2023-01-22 21:18:26.276854: step: 1192/527, loss: 0.003125661052763462 2023-01-22 21:18:27.318809: step: 1196/527, loss: 0.010133283212780952 2023-01-22 21:18:28.369159: step: 1200/527, loss: 0.011354834772646427 2023-01-22 21:18:29.406988: step: 1204/527, loss: 0.008050438947975636 2023-01-22 21:18:30.476333: step: 1208/527, loss: 0.015564941801130772 
2023-01-22 21:18:31.536859: step: 1212/527, loss: 0.006989758461713791 2023-01-22 21:18:32.593105: step: 1216/527, loss: 0.006502440664917231 2023-01-22 21:18:33.644435: step: 1220/527, loss: 0.004202271346002817 2023-01-22 21:18:34.683976: step: 1224/527, loss: 0.00974033959209919 2023-01-22 21:18:35.736987: step: 1228/527, loss: 0.001350743230432272 2023-01-22 21:18:36.797037: step: 1232/527, loss: 0.006581700872629881 2023-01-22 21:18:37.848176: step: 1236/527, loss: 0.015475451946258545 2023-01-22 21:18:38.902010: step: 1240/527, loss: 0.022281644865870476 2023-01-22 21:18:39.947347: step: 1244/527, loss: 0.006544162053614855 2023-01-22 21:18:41.004095: step: 1248/527, loss: 0.001436798251233995 2023-01-22 21:18:42.049762: step: 1252/527, loss: 0.021127983927726746 2023-01-22 21:18:43.096671: step: 1256/527, loss: 0.011949064210057259 2023-01-22 21:18:44.143109: step: 1260/527, loss: 0.0021717033814638853 2023-01-22 21:18:45.218438: step: 1264/527, loss: 0.01362568698823452 2023-01-22 21:18:46.282442: step: 1268/527, loss: 0.009508670307695866 2023-01-22 21:18:47.327550: step: 1272/527, loss: 0.0018951075617223978 2023-01-22 21:18:48.372555: step: 1276/527, loss: 0.0011392015730962157 2023-01-22 21:18:49.413116: step: 1280/527, loss: 0.0015228339470922947 2023-01-22 21:18:50.448848: step: 1284/527, loss: 0.008156024850904942 2023-01-22 21:18:51.504224: step: 1288/527, loss: 0.004517382942140102 2023-01-22 21:18:52.566105: step: 1292/527, loss: 0.004823583178222179 2023-01-22 21:18:53.613603: step: 1296/527, loss: 0.0001608405145816505 2023-01-22 21:18:54.667314: step: 1300/527, loss: 0.008443241007626057 2023-01-22 21:18:55.733010: step: 1304/527, loss: 0.006816533859819174 2023-01-22 21:18:56.784803: step: 1308/527, loss: 0.0001046990291797556 2023-01-22 21:18:57.868006: step: 1312/527, loss: 0.008965989574790001 2023-01-22 21:18:58.921273: step: 1316/527, loss: 0.005940592382103205 2023-01-22 21:18:59.978516: step: 1320/527, loss: 0.0053385584615170956 
2023-01-22 21:19:01.031413: step: 1324/527, loss: 0.0005323129007592797 2023-01-22 21:19:02.067602: step: 1328/527, loss: 0.0018906351178884506 2023-01-22 21:19:03.130440: step: 1332/527, loss: 0.006390294060111046 2023-01-22 21:19:04.177472: step: 1336/527, loss: 0.016296381130814552 2023-01-22 21:19:05.209866: step: 1340/527, loss: 0.0012550114188343287 2023-01-22 21:19:06.257239: step: 1344/527, loss: 0.002112034475430846 2023-01-22 21:19:07.300665: step: 1348/527, loss: 0.0056297811679542065 2023-01-22 21:19:08.347911: step: 1352/527, loss: 0.007862898521125317 2023-01-22 21:19:09.386604: step: 1356/527, loss: 0.015588123351335526 2023-01-22 21:19:10.431485: step: 1360/527, loss: 0.0 2023-01-22 21:19:11.466119: step: 1364/527, loss: 0.003212908050045371 2023-01-22 21:19:12.506666: step: 1368/527, loss: 0.009797049686312675 2023-01-22 21:19:13.566195: step: 1372/527, loss: 0.016829384490847588 2023-01-22 21:19:14.617520: step: 1376/527, loss: 0.005274294409900904 2023-01-22 21:19:15.680263: step: 1380/527, loss: 0.00189267098903656 2023-01-22 21:19:16.722439: step: 1384/527, loss: 0.015568006783723831 2023-01-22 21:19:17.785153: step: 1388/527, loss: 0.0030747901182621717 2023-01-22 21:19:18.843251: step: 1392/527, loss: 0.003674580017104745 2023-01-22 21:19:19.882439: step: 1396/527, loss: 0.002436636947095394 2023-01-22 21:19:20.941912: step: 1400/527, loss: 0.0010753895621746778 2023-01-22 21:19:21.998901: step: 1404/527, loss: 0.005425342358648777 2023-01-22 21:19:23.042436: step: 1408/527, loss: 0.0031529467087239027 2023-01-22 21:19:24.108041: step: 1412/527, loss: 0.0017186481272801757 2023-01-22 21:19:25.169063: step: 1416/527, loss: 0.008601337671279907 2023-01-22 21:19:26.211716: step: 1420/527, loss: 0.0001440748164895922 2023-01-22 21:19:27.268370: step: 1424/527, loss: 0.004143062513321638 2023-01-22 21:19:28.313940: step: 1428/527, loss: 0.004578811582177877 2023-01-22 21:19:29.347465: step: 1432/527, loss: 0.010508562438189983 2023-01-22 
21:19:30.402465: step: 1436/527, loss: 0.005084532778710127 2023-01-22 21:19:31.432648: step: 1440/527, loss: 0.003376667620614171 2023-01-22 21:19:32.482961: step: 1444/527, loss: 0.007349275052547455 2023-01-22 21:19:33.541964: step: 1448/527, loss: 0.007827515713870525 2023-01-22 21:19:34.592997: step: 1452/527, loss: 0.001971168676391244 2023-01-22 21:19:35.635068: step: 1456/527, loss: 0.0033010949846357107 2023-01-22 21:19:36.697165: step: 1460/527, loss: 0.004628949332982302 2023-01-22 21:19:37.737223: step: 1464/527, loss: 0.005636187270283699 2023-01-22 21:19:38.810651: step: 1468/527, loss: 0.016175074502825737 2023-01-22 21:19:39.871797: step: 1472/527, loss: 0.006416806019842625 2023-01-22 21:19:40.923446: step: 1476/527, loss: 0.00404331972822547 2023-01-22 21:19:41.970667: step: 1480/527, loss: 0.01362985372543335 2023-01-22 21:19:43.018236: step: 1484/527, loss: 0.005481289699673653 2023-01-22 21:19:44.077834: step: 1488/527, loss: 0.007824812084436417 2023-01-22 21:19:45.138407: step: 1492/527, loss: 0.006577885709702969 2023-01-22 21:19:46.174278: step: 1496/527, loss: 0.00042355613550171256 2023-01-22 21:19:47.221131: step: 1500/527, loss: 0.03957362845540047 2023-01-22 21:19:48.275519: step: 1504/527, loss: 0.006651520729064941 2023-01-22 21:19:49.327095: step: 1508/527, loss: 0.003347818274050951 2023-01-22 21:19:50.400875: step: 1512/527, loss: 0.00016294789384119213 2023-01-22 21:19:51.453949: step: 1516/527, loss: 0.003195906290784478 2023-01-22 21:19:52.498898: step: 1520/527, loss: 0.00042527110781520605 2023-01-22 21:19:53.554727: step: 1524/527, loss: 0.004982766229659319 2023-01-22 21:19:54.607516: step: 1528/527, loss: 0.012321083806455135 2023-01-22 21:19:55.666467: step: 1532/527, loss: 0.009371409192681313 2023-01-22 21:19:56.710872: step: 1536/527, loss: 0.009588467888534069 2023-01-22 21:19:57.751991: step: 1540/527, loss: 0.0042380825616419315 2023-01-22 21:19:58.794191: step: 1544/527, loss: 1.6275606640192564e-06 2023-01-22 
21:19:59.857105: step: 1548/527, loss: 0.01996329426765442 2023-01-22 21:20:00.906894: step: 1552/527, loss: 0.0034044182393699884 2023-01-22 21:20:01.962241: step: 1556/527, loss: 0.0027883679140359163 2023-01-22 21:20:03.010869: step: 1560/527, loss: 0.007521283347159624 2023-01-22 21:20:04.071604: step: 1564/527, loss: 0.010301162488758564 2023-01-22 21:20:05.137921: step: 1568/527, loss: 0.007357570342719555 2023-01-22 21:20:06.179034: step: 1572/527, loss: 0.0035316043067723513 2023-01-22 21:20:07.228457: step: 1576/527, loss: 0.00499480776488781 2023-01-22 21:20:08.284018: step: 1580/527, loss: 0.0066254339180886745 2023-01-22 21:20:09.333951: step: 1584/527, loss: 0.005504979752004147 2023-01-22 21:20:10.367809: step: 1588/527, loss: 0.002800211776047945 2023-01-22 21:20:11.402354: step: 1592/527, loss: 0.009911361150443554 2023-01-22 21:20:12.463708: step: 1596/527, loss: 0.0003656733315438032 2023-01-22 21:20:13.505339: step: 1600/527, loss: 0.009688540361821651 2023-01-22 21:20:14.540201: step: 1604/527, loss: 0.014820481650531292 2023-01-22 21:20:15.579519: step: 1608/527, loss: 0.0014237827854231 2023-01-22 21:20:16.640143: step: 1612/527, loss: 0.007423287723213434 2023-01-22 21:20:17.692404: step: 1616/527, loss: 0.002564249327406287 2023-01-22 21:20:18.748681: step: 1620/527, loss: 0.005924270488321781 2023-01-22 21:20:19.794356: step: 1624/527, loss: 0.0025675161741673946 2023-01-22 21:20:20.861093: step: 1628/527, loss: 0.008801406249403954 2023-01-22 21:20:21.899777: step: 1632/527, loss: 0.0040179709903895855 2023-01-22 21:20:22.957987: step: 1636/527, loss: 0.003670003730803728 2023-01-22 21:20:24.010959: step: 1640/527, loss: 0.002694187220185995 2023-01-22 21:20:25.065351: step: 1644/527, loss: 0.0025451581459492445 2023-01-22 21:20:26.131388: step: 1648/527, loss: 0.007700404152274132 2023-01-22 21:20:27.184659: step: 1652/527, loss: 0.0026077507063746452 2023-01-22 21:20:28.229845: step: 1656/527, loss: 0.001982457237318158 2023-01-22 
21:20:29.269814: step: 1660/527, loss: 0.005754484795033932 2023-01-22 21:20:30.327521: step: 1664/527, loss: 0.011494054459035397 2023-01-22 21:20:31.370047: step: 1668/527, loss: 0.0015631213318556547 2023-01-22 21:20:32.427734: step: 1672/527, loss: 0.0017745968652889132 2023-01-22 21:20:33.471708: step: 1676/527, loss: 0.008026758208870888 2023-01-22 21:20:34.529683: step: 1680/527, loss: 0.000972359033767134 2023-01-22 21:20:35.585708: step: 1684/527, loss: 0.005759639199823141 2023-01-22 21:20:36.638334: step: 1688/527, loss: 0.0030640270560979843 2023-01-22 21:20:37.683974: step: 1692/527, loss: 0.004121731501072645 2023-01-22 21:20:38.750748: step: 1696/527, loss: 0.00786714255809784 2023-01-22 21:20:39.799351: step: 1700/527, loss: 0.0027322557289153337 2023-01-22 21:20:40.861896: step: 1704/527, loss: 0.00435422221198678 2023-01-22 21:20:41.902846: step: 1708/527, loss: 0.004531675949692726 2023-01-22 21:20:42.954651: step: 1712/527, loss: 0.003216907847672701 2023-01-22 21:20:43.991873: step: 1716/527, loss: 0.0037044119089841843 2023-01-22 21:20:45.065443: step: 1720/527, loss: 0.00047670950880274177 2023-01-22 21:20:46.114561: step: 1724/527, loss: 0.0023558507673442364 2023-01-22 21:20:47.151308: step: 1728/527, loss: 0.0002946642925962806 2023-01-22 21:20:48.181335: step: 1732/527, loss: 0.0003873909590765834 2023-01-22 21:20:49.257474: step: 1736/527, loss: 0.0026340989861637354 2023-01-22 21:20:50.305169: step: 1740/527, loss: 0.00020999550179112703 2023-01-22 21:20:51.336761: step: 1744/527, loss: 0.0033446657471358776 2023-01-22 21:20:52.408692: step: 1748/527, loss: 0.003600065829232335 2023-01-22 21:20:53.470897: step: 1752/527, loss: 0.019987093284726143 2023-01-22 21:20:54.518221: step: 1756/527, loss: 0.0019899429753422737 2023-01-22 21:20:55.575481: step: 1760/527, loss: 0.01258633378893137 2023-01-22 21:20:56.636791: step: 1764/527, loss: 0.005223503801971674 2023-01-22 21:20:57.695351: step: 1768/527, loss: 0.005940215662121773 2023-01-22 
21:20:58.738282: step: 1772/527, loss: 0.0013130708830431104 2023-01-22 21:20:59.784402: step: 1776/527, loss: 0.0008674848359078169 2023-01-22 21:21:00.851293: step: 1780/527, loss: 0.0004331193631514907 2023-01-22 21:21:01.906652: step: 1784/527, loss: 0.010653001256287098 2023-01-22 21:21:02.961805: step: 1788/527, loss: 0.005102533381432295 2023-01-22 21:21:04.011081: step: 1792/527, loss: 0.005774473771452904 2023-01-22 21:21:05.054788: step: 1796/527, loss: 0.000965410319622606 2023-01-22 21:21:06.100344: step: 1800/527, loss: 0.010976944118738174 2023-01-22 21:21:07.156569: step: 1804/527, loss: 0.011997640132904053 2023-01-22 21:21:08.201005: step: 1808/527, loss: 0.004539198242127895 2023-01-22 21:21:09.255055: step: 1812/527, loss: 0.002840488450601697 2023-01-22 21:21:10.308544: step: 1816/527, loss: 0.0035524554550647736 2023-01-22 21:21:11.361279: step: 1820/527, loss: 0.0060480451211333275 2023-01-22 21:21:12.417917: step: 1824/527, loss: 0.0031858596485108137 2023-01-22 21:21:13.465135: step: 1828/527, loss: 0.005063018295913935 2023-01-22 21:21:14.501505: step: 1832/527, loss: 0.0026580675039440393 2023-01-22 21:21:15.540025: step: 1836/527, loss: 0.0028464419301599264 2023-01-22 21:21:16.580442: step: 1840/527, loss: 0.021969856694340706 2023-01-22 21:21:17.623421: step: 1844/527, loss: 0.016569070518016815 2023-01-22 21:21:18.656840: step: 1848/527, loss: 0.00024191215925384313 2023-01-22 21:21:19.725796: step: 1852/527, loss: 0.0017284497153013945 2023-01-22 21:21:20.774440: step: 1856/527, loss: 0.00017486634897068143 2023-01-22 21:21:21.826494: step: 1860/527, loss: 0.0003898689174093306 2023-01-22 21:21:22.885221: step: 1864/527, loss: 0.003978152759373188 2023-01-22 21:21:23.931635: step: 1868/527, loss: 0.006101039703935385 2023-01-22 21:21:24.982677: step: 1872/527, loss: 0.00025843296316452324 2023-01-22 21:21:26.019962: step: 1876/527, loss: 0.001974471379071474 2023-01-22 21:21:27.060234: step: 1880/527, loss: 0.003903137519955635 
2023-01-22 21:21:28.106766: step: 1884/527, loss: 0.003618246875703335 2023-01-22 21:21:29.156342: step: 1888/527, loss: 0.001595757552422583 2023-01-22 21:21:30.213997: step: 1892/527, loss: 0.005377558991312981 2023-01-22 21:21:31.265069: step: 1896/527, loss: 0.007170847151428461 2023-01-22 21:21:32.322177: step: 1900/527, loss: 0.008971121162176132 2023-01-22 21:21:33.364241: step: 1904/527, loss: 0.003187012393027544 2023-01-22 21:21:34.399850: step: 1908/527, loss: 0.018539823591709137 2023-01-22 21:21:35.436564: step: 1912/527, loss: 0.004290503915399313 2023-01-22 21:21:36.489267: step: 1916/527, loss: 0.0031834153924137354 2023-01-22 21:21:37.541360: step: 1920/527, loss: 0.004645330831408501 2023-01-22 21:21:38.606527: step: 1924/527, loss: 0.005643555428832769 2023-01-22 21:21:39.661283: step: 1928/527, loss: 0.003038665046915412 2023-01-22 21:21:40.721259: step: 1932/527, loss: 0.0023239408619701862 2023-01-22 21:21:41.765499: step: 1936/527, loss: 0.00981674063950777 2023-01-22 21:21:42.819899: step: 1940/527, loss: 0.004298528656363487 2023-01-22 21:21:43.867172: step: 1944/527, loss: 0.002680855803191662 2023-01-22 21:21:44.907761: step: 1948/527, loss: 0.005807960871607065 2023-01-22 21:21:45.961161: step: 1952/527, loss: 0.0013155624037608504 2023-01-22 21:21:46.997260: step: 1956/527, loss: 0.005734990816563368 2023-01-22 21:21:48.035391: step: 1960/527, loss: 0.00031206291168928146 2023-01-22 21:21:49.114994: step: 1964/527, loss: 0.0018663842929527164 2023-01-22 21:21:50.201920: step: 1968/527, loss: 0.0022151663433760405 2023-01-22 21:21:51.235157: step: 1972/527, loss: 0.00681777810677886 2023-01-22 21:21:52.287355: step: 1976/527, loss: 0.0012103980407118797 2023-01-22 21:21:53.336474: step: 1980/527, loss: 0.009966288693249226 2023-01-22 21:21:54.378099: step: 1984/527, loss: 0.0021340358071029186 2023-01-22 21:21:55.445255: step: 1988/527, loss: 0.0022295245435088873 2023-01-22 21:21:56.484540: step: 1992/527, loss: 0.002826056443154812 
2023-01-22 21:21:57.524772: step: 1996/527, loss: 0.0019474141299724579 2023-01-22 21:21:58.571797: step: 2000/527, loss: 0.008456511422991753 2023-01-22 21:21:59.652983: step: 2004/527, loss: 0.0041686114855110645 2023-01-22 21:22:00.685385: step: 2008/527, loss: 0.0017365453531965613 2023-01-22 21:22:01.730033: step: 2012/527, loss: 9.989827231038362e-05 2023-01-22 21:22:02.783366: step: 2016/527, loss: 0.003863633843138814 2023-01-22 21:22:03.841861: step: 2020/527, loss: 0.001388088334351778 2023-01-22 21:22:04.882059: step: 2024/527, loss: 9.909391519613564e-05 2023-01-22 21:22:05.943122: step: 2028/527, loss: 0.015627026557922363 2023-01-22 21:22:07.015518: step: 2032/527, loss: 0.0035361130721867085 2023-01-22 21:22:08.044558: step: 2036/527, loss: 0.001129011856392026 2023-01-22 21:22:09.097245: step: 2040/527, loss: 0.0007500606006942689 2023-01-22 21:22:10.138212: step: 2044/527, loss: 0.007517545484006405 2023-01-22 21:22:11.175269: step: 2048/527, loss: 0.0008999849087558687 2023-01-22 21:22:12.213924: step: 2052/527, loss: 0.0019135841866955161 2023-01-22 21:22:13.258803: step: 2056/527, loss: 0.00427060155197978 2023-01-22 21:22:14.309449: step: 2060/527, loss: 0.00032574342912994325 2023-01-22 21:22:15.372898: step: 2064/527, loss: 0.008994975127279758 2023-01-22 21:22:16.430190: step: 2068/527, loss: 0.0009440046269446611 2023-01-22 21:22:17.472824: step: 2072/527, loss: 0.0038226398173719645 2023-01-22 21:22:18.529457: step: 2076/527, loss: 0.01862499676644802 2023-01-22 21:22:19.582728: step: 2080/527, loss: 0.015223889611661434 2023-01-22 21:22:20.629898: step: 2084/527, loss: 0.00031723108259029686 2023-01-22 21:22:21.675239: step: 2088/527, loss: 0.00397729454562068 2023-01-22 21:22:22.735817: step: 2092/527, loss: 0.003177271457388997 2023-01-22 21:22:23.785655: step: 2096/527, loss: 0.00018040844588540494 2023-01-22 21:22:24.830471: step: 2100/527, loss: 0.004522176459431648 2023-01-22 21:22:25.875654: step: 2104/527, loss: 
0.0021267361007630825 2023-01-22 21:22:26.942247: step: 2108/527, loss: 0.004273226950317621 ================================================== Loss: 0.006 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32193110412926396, 'r': 0.3402573529411765, 'f1': 0.33084063653136536}, 'combined': 0.24377731112837447, 'stategy': 1, 'epoch': 9} Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3402749053493572, 'r': 0.3130529129214087, 'f1': 0.32609678429313405}, 'combined': 0.20870194194760575, 'stategy': 1, 'epoch': 9} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31137448501578935, 'r': 0.35096099449597506, 'f1': 0.3299847352352879}, 'combined': 0.24314664701547525, 'stategy': 1, 'epoch': 9} Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3554174001845758, 'r': 0.3182601265289156, 'f1': 0.3358140423806304}, 'combined': 0.2149209871236034, 'stategy': 1, 'epoch': 9} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3291707300107707, 'r': 0.33291840435624437, 'f1': 0.3310339605580015}, 'combined': 0.24391976041115898, 'stategy': 1, 'epoch': 9} Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3643725955343912, 'r': 0.2997204971456685, 'f1': 0.32889947714736867}, 'combined': 0.23581471946415114, 'stategy': 1, 'epoch': 9} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.261437908496732, 'r': 0.38095238095238093, 'f1': 0.31007751937984496}, 'combined': 0.20671834625322996, 'stategy': 1, 'epoch': 9} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31666666666666665, 'r': 0.41304347826086957, 'f1': 
0.3584905660377358}, 'combined': 0.1792452830188679, 'stategy': 1, 'epoch': 9} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'stategy': 1, 'epoch': 9} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0} Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0} Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3295121025991449, 'r': 0.33576470416649107, 'f1': 
0.33260902085665567}, 'combined': 0.24508033115753575, 'stategy': 1, 'epoch': 8} Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3605647599524678, 'r': 0.29724446998811266, 'f1': 0.3258570299420806}, 'combined': 0.23363334222262386, 'stategy': 1, 'epoch': 8} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 8} ****************************** Epoch: 10 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 21:24:57.465759: step: 4/527, loss: 0.004833657760173082 2023-01-22 21:24:58.501015: step: 8/527, loss: 0.0011364136589691043 2023-01-22 21:24:59.551686: step: 12/527, loss: 0.013693591579794884 2023-01-22 21:25:00.599693: step: 16/527, loss: 0.0008982678409665823 2023-01-22 21:25:01.642370: step: 20/527, loss: 0.00028102987562306225 2023-01-22 21:25:02.675168: step: 24/527, loss: 0.0021157904993742704 2023-01-22 21:25:03.718626: step: 28/527, loss: 0.0020598529372364283 2023-01-22 21:25:04.747778: step: 32/527, loss: 0.005269910208880901 2023-01-22 21:25:05.796182: step: 36/527, loss: 0.03047555685043335 2023-01-22 21:25:06.821684: step: 40/527, loss: 0.008462157100439072 2023-01-22 21:25:07.846671: step: 44/527, loss: 0.0009237699559889734 2023-01-22 21:25:08.882347: step: 48/527, loss: 0.014638824388384819 2023-01-22 21:25:09.924351: step: 52/527, loss: 0.0074201286770403385 2023-01-22 21:25:10.964525: step: 56/527, loss: 0.007807761896401644 2023-01-22 21:25:12.022021: step: 60/527, loss: 0.009378915652632713 2023-01-22 21:25:13.072332: step: 64/527, loss: 0.0028345270548015833 2023-01-22 21:25:14.119125: step: 68/527, loss: 0.004703216720372438 
2023-01-22 21:25:15.154301: step: 72/527, loss: 0.010960305109620094 2023-01-22 21:25:16.202081: step: 76/527, loss: 0.007687015924602747 2023-01-22 21:25:17.249137: step: 80/527, loss: 0.001322070718742907 2023-01-22 21:25:18.287162: step: 84/527, loss: 0.0022883673664182425 2023-01-22 21:25:19.333292: step: 88/527, loss: 0.004564776550978422 2023-01-22 21:25:20.373490: step: 92/527, loss: 0.0018911577062681317 2023-01-22 21:25:21.403461: step: 96/527, loss: 0.019945869222283363 2023-01-22 21:25:22.443511: step: 100/527, loss: 0.0005262373015284538 2023-01-22 21:25:23.478372: step: 104/527, loss: 0.002583642490208149 2023-01-22 21:25:24.526644: step: 108/527, loss: 0.0010474671144038439 2023-01-22 21:25:25.576505: step: 112/527, loss: 1.6051326383603737e-05 2023-01-22 21:25:26.620990: step: 116/527, loss: 0.0030734159518033266 2023-01-22 21:25:27.651212: step: 120/527, loss: 0.00039913030923344195 2023-01-22 21:25:28.689774: step: 124/527, loss: 0.00012458326818887144 2023-01-22 21:25:29.727561: step: 128/527, loss: 0.007980993948876858 2023-01-22 21:25:30.776351: step: 132/527, loss: 0.0007757146959193051 2023-01-22 21:25:31.819595: step: 136/527, loss: 0.0032367203384637833 2023-01-22 21:25:32.870802: step: 140/527, loss: 0.008006197400391102 2023-01-22 21:25:33.926239: step: 144/527, loss: 0.0037750983610749245 2023-01-22 21:25:34.970013: step: 148/527, loss: 0.004418736323714256 2023-01-22 21:25:36.009839: step: 152/527, loss: 0.0016289552440866828 2023-01-22 21:25:37.046358: step: 156/527, loss: 0.019678598269820213 2023-01-22 21:25:38.083396: step: 160/527, loss: 0.0011051876936107874 2023-01-22 21:25:39.121608: step: 164/527, loss: 0.0004329594084993005 2023-01-22 21:25:40.169833: step: 168/527, loss: 0.00012706460256595165 2023-01-22 21:25:41.217197: step: 172/527, loss: 0.01562930829823017 2023-01-22 21:25:42.258167: step: 176/527, loss: 0.004352135583758354 2023-01-22 21:25:43.302790: step: 180/527, loss: 0.003814413445070386 2023-01-22 21:25:44.347741: 
step: 184/527, loss: 0.00027251176652498543 2023-01-22 21:25:45.405146: step: 188/527, loss: 0.0033607804216444492 2023-01-22 21:25:46.452656: step: 192/527, loss: 0.007782275322824717 2023-01-22 21:25:47.504096: step: 196/527, loss: 0.003576585790142417 2023-01-22 21:25:48.560159: step: 200/527, loss: 2.0078901741271693e-07 2023-01-22 21:25:49.611172: step: 204/527, loss: 0.002374565228819847 2023-01-22 21:25:50.667246: step: 208/527, loss: 0.0038562549743801355 2023-01-22 21:25:51.704618: step: 212/527, loss: 0.0022288477048277855 2023-01-22 21:25:52.751465: step: 216/527, loss: 0.009281554259359837 2023-01-22 21:25:53.799418: step: 220/527, loss: 0.00013035137089900672 2023-01-22 21:25:54.859565: step: 224/527, loss: 3.689919685712084e-05 2023-01-22 21:25:55.899236: step: 228/527, loss: 0.0007885729428380728 2023-01-22 21:25:56.947573: step: 232/527, loss: 0.0016048562247306108 2023-01-22 21:25:57.984362: step: 236/527, loss: 0.02389618195593357 2023-01-22 21:25:59.048648: step: 240/527, loss: 0.007483481429517269 2023-01-22 21:26:00.084999: step: 244/527, loss: 0.010217053815722466 2023-01-22 21:26:01.129829: step: 248/527, loss: 0.00013255391968414187 2023-01-22 21:26:02.180209: step: 252/527, loss: 0.005191694479435682 2023-01-22 21:26:03.223981: step: 256/527, loss: 1.3778627362626139e-05 2023-01-22 21:26:04.269987: step: 260/527, loss: 0.00015251721197273582 2023-01-22 21:26:05.325481: step: 264/527, loss: 0.027187911793589592 2023-01-22 21:26:06.359823: step: 268/527, loss: 0.007595769129693508 2023-01-22 21:26:07.418423: step: 272/527, loss: 0.002322762506082654 2023-01-22 21:26:08.482573: step: 276/527, loss: 0.002794540487229824 2023-01-22 21:26:09.534455: step: 280/527, loss: 0.007562355138361454 2023-01-22 21:26:10.573239: step: 284/527, loss: 0.00033715058816596866 2023-01-22 21:26:11.614632: step: 288/527, loss: 0.0005360583891160786 2023-01-22 21:26:12.648832: step: 292/527, loss: 0.008844222873449326 2023-01-22 21:26:13.700828: step: 296/527, 
loss: 0.0033350863959640265 2023-01-22 21:26:14.749134: step: 300/527, loss: 0.003205646062269807 2023-01-22 21:26:15.801956: step: 304/527, loss: 0.022825386375188828 2023-01-22 21:26:16.849724: step: 308/527, loss: 0.0027564482297748327 2023-01-22 21:26:17.909977: step: 312/527, loss: 0.011012528091669083 2023-01-22 21:26:18.955122: step: 316/527, loss: 0.000799546658527106 2023-01-22 21:26:20.023160: step: 320/527, loss: 0.0071962978690862656 2023-01-22 21:26:21.075158: step: 324/527, loss: 0.0036637221928685904 2023-01-22 21:26:22.112220: step: 328/527, loss: 0.003913890570402145 2023-01-22 21:26:23.148746: step: 332/527, loss: 0.004625939764082432 2023-01-22 21:26:24.196483: step: 336/527, loss: 0.001403993577696383 2023-01-22 21:26:25.250473: step: 340/527, loss: 0.0035943130496889353 2023-01-22 21:26:26.318819: step: 344/527, loss: 0.002903914311900735 2023-01-22 21:26:27.369177: step: 348/527, loss: 0.018021633848547935 2023-01-22 21:26:28.430460: step: 352/527, loss: 0.002456091344356537 2023-01-22 21:26:29.492989: step: 356/527, loss: 0.017133845016360283 2023-01-22 21:26:30.544349: step: 360/527, loss: 0.0025656798388808966 2023-01-22 21:26:31.606723: step: 364/527, loss: 0.005487536080181599 2023-01-22 21:26:32.656123: step: 368/527, loss: 0.0010569763835519552 2023-01-22 21:26:33.694468: step: 372/527, loss: 0.01532845851033926 2023-01-22 21:26:34.733658: step: 376/527, loss: 0.004308891948312521 2023-01-22 21:26:35.769682: step: 380/527, loss: 0.002837874460965395 2023-01-22 21:26:36.832756: step: 384/527, loss: 0.00013713547377847135 2023-01-22 21:26:37.878636: step: 388/527, loss: 0.012571699917316437 2023-01-22 21:26:38.958511: step: 392/527, loss: 0.002484510187059641 2023-01-22 21:26:40.006956: step: 396/527, loss: 0.0013878004392609 2023-01-22 21:26:41.058800: step: 400/527, loss: 0.0014320421032607555 2023-01-22 21:26:42.101436: step: 404/527, loss: 0.002912610536441207 2023-01-22 21:26:43.158224: step: 408/527, loss: 0.0035548831801861525 
2023-01-22 21:26:44.203711: step: 412/527, loss: 0.0009113316191360354 2023-01-22 21:26:45.257463: step: 416/527, loss: 0.004987087566405535 2023-01-22 21:26:46.303545: step: 420/527, loss: 0.004824623465538025 2023-01-22 21:26:47.365283: step: 424/527, loss: 0.0003809690533671528 2023-01-22 21:26:48.408359: step: 428/527, loss: 5.815789336338639e-05 2023-01-22 21:26:49.471876: step: 432/527, loss: 0.002271481789648533 2023-01-22 21:26:50.536023: step: 436/527, loss: 0.003358307993039489 2023-01-22 21:26:51.589452: step: 440/527, loss: 0.015740172937512398 2023-01-22 21:26:52.648077: step: 444/527, loss: 0.0017715157009661198 2023-01-22 21:26:53.714460: step: 448/527, loss: 0.0017089046305045485 2023-01-22 21:26:54.773399: step: 452/527, loss: 0.0003556696465238929 2023-01-22 21:26:55.840052: step: 456/527, loss: 0.003091268241405487 2023-01-22 21:26:56.880385: step: 460/527, loss: 0.0024387831799685955 2023-01-22 21:26:57.923841: step: 464/527, loss: 0.004320340696722269 2023-01-22 21:26:58.981691: step: 468/527, loss: 0.004508669953793287 2023-01-22 21:27:00.033783: step: 472/527, loss: 0.0001858569448813796 2023-01-22 21:27:01.085918: step: 476/527, loss: 0.00457730982452631 2023-01-22 21:27:02.130592: step: 480/527, loss: 0.001630950951948762 2023-01-22 21:27:03.181003: step: 484/527, loss: 0.0008800184587016702 2023-01-22 21:27:04.229625: step: 488/527, loss: 0.0023561110720038414 2023-01-22 21:27:05.275420: step: 492/527, loss: 0.00024463661247864366 2023-01-22 21:27:06.327263: step: 496/527, loss: 0.0067946636117994785 2023-01-22 21:27:07.364332: step: 500/527, loss: 0.004996356088668108 2023-01-22 21:27:08.411909: step: 504/527, loss: 0.0007243560976348817 2023-01-22 21:27:09.458799: step: 508/527, loss: 0.00022676971275359392 2023-01-22 21:27:10.508848: step: 512/527, loss: 0.00014773133443668485 2023-01-22 21:27:11.569383: step: 516/527, loss: 0.005032065324485302 2023-01-22 21:27:12.611816: step: 520/527, loss: 0.00022982760856393725 2023-01-22 
21:27:13.663518: step: 524/527, loss: 0.0025566464755684137 2023-01-22 21:27:14.724773: step: 528/527, loss: 0.00025172450114041567 2023-01-22 21:27:15.789425: step: 532/527, loss: 0.0019041320774704218 2023-01-22 21:27:16.840848: step: 536/527, loss: 0.017800897359848022 2023-01-22 21:27:17.917165: step: 540/527, loss: 0.004606978967785835 2023-01-22 21:27:18.980519: step: 544/527, loss: 0.0055726878345012665 2023-01-22 21:27:20.029725: step: 548/527, loss: 0.0058105806820094585 2023-01-22 21:27:21.085379: step: 552/527, loss: 0.003775089280679822 2023-01-22 21:27:22.132183: step: 556/527, loss: 0.0029681515879929066 2023-01-22 21:27:23.187775: step: 560/527, loss: 0.00196833279915154 2023-01-22 21:27:24.246183: step: 564/527, loss: 0.00226177042350173 2023-01-22 21:27:25.317942: step: 568/527, loss: 0.0015703982207924128 2023-01-22 21:27:26.367612: step: 572/527, loss: 0.006242364179342985 2023-01-22 21:27:27.421635: step: 576/527, loss: 0.002695418195798993 2023-01-22 21:27:28.471625: step: 580/527, loss: 0.0025354200042784214 2023-01-22 21:27:29.520712: step: 584/527, loss: 0.0 2023-01-22 21:27:30.576506: step: 588/527, loss: 0.005481021478772163 2023-01-22 21:27:31.624654: step: 592/527, loss: 0.0020466595888137817 2023-01-22 21:27:32.687676: step: 596/527, loss: 9.231808689946774e-06 2023-01-22 21:27:33.750973: step: 600/527, loss: 0.005630808882415295 2023-01-22 21:27:34.806409: step: 604/527, loss: 0.012374069541692734 2023-01-22 21:27:35.858545: step: 608/527, loss: 0.002124031074345112 2023-01-22 21:27:36.919661: step: 612/527, loss: 0.003565231105312705 2023-01-22 21:27:37.992637: step: 616/527, loss: 0.0006387982866726816 2023-01-22 21:27:39.052382: step: 620/527, loss: 0.001555423135869205 2023-01-22 21:27:40.106839: step: 624/527, loss: 0.01116778701543808 2023-01-22 21:27:41.174063: step: 628/527, loss: 0.0018999361200258136 2023-01-22 21:27:42.212726: step: 632/527, loss: 0.0003884321195073426 2023-01-22 21:27:43.265986: step: 636/527, loss: 
0.005087822675704956 2023-01-22 21:27:44.315940: step: 640/527, loss: 0.003662888426333666 2023-01-22 21:27:45.357959: step: 644/527, loss: 0.00792395044118166 2023-01-22 21:27:46.401445: step: 648/527, loss: 0.00112702208571136 2023-01-22 21:27:47.469761: step: 652/527, loss: 0.002395574701949954 2023-01-22 21:27:48.521661: step: 656/527, loss: 0.01734435185790062 2023-01-22 21:27:49.578188: step: 660/527, loss: 0.006966793909668922 2023-01-22 21:27:50.627320: step: 664/527, loss: 0.0061254785396158695 2023-01-22 21:27:51.686884: step: 668/527, loss: 0.0013585977721959352 2023-01-22 21:27:52.746606: step: 672/527, loss: 0.012014559470117092 2023-01-22 21:27:53.786926: step: 676/527, loss: 0.0003197441983502358 2023-01-22 21:27:54.844386: step: 680/527, loss: 0.004277748055756092 2023-01-22 21:27:55.898187: step: 684/527, loss: 0.0005192296812310815 2023-01-22 21:27:56.952913: step: 688/527, loss: 0.0035792121198028326 2023-01-22 21:27:58.008804: step: 692/527, loss: 0.0015736257191747427 2023-01-22 21:27:59.063472: step: 696/527, loss: 0.0007043908117339015 2023-01-22 21:28:00.117505: step: 700/527, loss: 0.004315359517931938 2023-01-22 21:28:01.193906: step: 704/527, loss: 0.0061894189566373825 2023-01-22 21:28:02.251905: step: 708/527, loss: 0.007439267821609974 2023-01-22 21:28:03.297004: step: 712/527, loss: 0.004487603437155485 2023-01-22 21:28:04.346249: step: 716/527, loss: 0.005449655000120401 2023-01-22 21:28:05.395615: step: 720/527, loss: 0.0030286421533674 2023-01-22 21:28:06.465928: step: 724/527, loss: 0.00014306108641903847 2023-01-22 21:28:07.503820: step: 728/527, loss: 0.0013849989045411348 2023-01-22 21:28:08.563717: step: 732/527, loss: 7.165545684983954e-05 2023-01-22 21:28:09.608605: step: 736/527, loss: 0.0041424003429710865 2023-01-22 21:28:10.656931: step: 740/527, loss: 0.003762218402698636 2023-01-22 21:28:11.710752: step: 744/527, loss: 0.0012876322725787759 2023-01-22 21:28:12.757652: step: 748/527, loss: 0.002330939983949065 
2023-01-22 21:28:13.809594: step: 752/527, loss: 0.01054556854069233 2023-01-22 21:28:14.856892: step: 756/527, loss: 0.0018435473321005702 2023-01-22 21:28:15.903401: step: 760/527, loss: 0.002357003279030323 2023-01-22 21:28:16.961625: step: 764/527, loss: 0.0005506074521690607 2023-01-22 21:28:18.015286: step: 768/527, loss: 0.004188058897852898 2023-01-22 21:28:19.066073: step: 772/527, loss: 0.0008292871643789113 2023-01-22 21:28:20.118456: step: 776/527, loss: 0.0024797916412353516 2023-01-22 21:28:21.168705: step: 780/527, loss: 0.004266451112926006 2023-01-22 21:28:22.212814: step: 784/527, loss: 0.06176600977778435 2023-01-22 21:28:23.268474: step: 788/527, loss: 0.0036624919157475233 2023-01-22 21:28:24.315180: step: 792/527, loss: 0.004840458743274212 2023-01-22 21:28:25.375214: step: 796/527, loss: 0.0023256016429513693 2023-01-22 21:28:26.440633: step: 800/527, loss: 0.00648619094863534 2023-01-22 21:28:27.509754: step: 804/527, loss: 0.0068627409636974335 2023-01-22 21:28:28.569770: step: 808/527, loss: 0.001807958702556789 2023-01-22 21:28:29.611446: step: 812/527, loss: 0.0010146456770598888 2023-01-22 21:28:30.690207: step: 816/527, loss: 0.011444950476288795 2023-01-22 21:28:31.750378: step: 820/527, loss: 0.004351429175585508 2023-01-22 21:28:32.817110: step: 824/527, loss: 0.003175140591338277 2023-01-22 21:28:33.878803: step: 828/527, loss: 0.002956350101158023 2023-01-22 21:28:34.936474: step: 832/527, loss: 0.001291818916797638 2023-01-22 21:28:35.996348: step: 836/527, loss: 0.005952575244009495 2023-01-22 21:28:37.068321: step: 840/527, loss: 0.002884318819269538 2023-01-22 21:28:38.152525: step: 844/527, loss: 0.02117699384689331 2023-01-22 21:28:39.199523: step: 848/527, loss: 0.009070572443306446 2023-01-22 21:28:40.251982: step: 852/527, loss: 0.002150001935660839 2023-01-22 21:28:41.309509: step: 856/527, loss: 0.0005330175627022982 2023-01-22 21:28:42.349205: step: 860/527, loss: 0.0032782384660094976 2023-01-22 21:28:43.417503: step: 
864/527, loss: 0.00023411009169649333 2023-01-22 21:28:44.459013: step: 868/527, loss: 0.0014934978680685163 2023-01-22 21:28:45.519217: step: 872/527, loss: 0.0005229530506767333 2023-01-22 21:28:46.577874: step: 876/527, loss: 1.0392669537395705e-05 2023-01-22 21:28:47.615850: step: 880/527, loss: 0.0076842415146529675 2023-01-22 21:28:48.667615: step: 884/527, loss: 0.001741165528073907 2023-01-22 21:28:49.724380: step: 888/527, loss: 0.0063182697631418705 2023-01-22 21:28:50.780883: step: 892/527, loss: 0.013243038207292557 2023-01-22 21:28:51.843707: step: 896/527, loss: 0.029148589819669724 2023-01-22 21:28:52.883678: step: 900/527, loss: 0.005447585601359606 2023-01-22 21:28:53.930358: step: 904/527, loss: 0.01837785914540291 2023-01-22 21:28:54.978201: step: 908/527, loss: 0.0020566792227327824 2023-01-22 21:28:56.017850: step: 912/527, loss: 0.003916104324162006 2023-01-22 21:28:57.089702: step: 916/527, loss: 0.002611628035083413 2023-01-22 21:28:58.131531: step: 920/527, loss: 0.0010981709929183125 2023-01-22 21:28:59.196827: step: 924/527, loss: 0.00297788018360734 2023-01-22 21:29:00.254918: step: 928/527, loss: 0.0043276394717395306 2023-01-22 21:29:01.308554: step: 932/527, loss: 0.0031342708971351385 2023-01-22 21:29:02.361488: step: 936/527, loss: 0.0028929393738508224 2023-01-22 21:29:03.408280: step: 940/527, loss: 0.004579306114464998 2023-01-22 21:29:04.474373: step: 944/527, loss: 0.0030227333772927523 2023-01-22 21:29:05.534254: step: 948/527, loss: 0.004377361387014389 2023-01-22 21:29:06.578598: step: 952/527, loss: 0.010050162672996521 2023-01-22 21:29:07.633184: step: 956/527, loss: 0.00290341186337173 2023-01-22 21:29:08.690309: step: 960/527, loss: 0.00043574688606895506 2023-01-22 21:29:09.744000: step: 964/527, loss: 0.018408535048365593 2023-01-22 21:29:10.807775: step: 968/527, loss: 0.022258492186665535 2023-01-22 21:29:11.847314: step: 972/527, loss: 0.007590233348309994 2023-01-22 21:29:12.901541: step: 976/527, loss: 
0.007303510792553425 2023-01-22 21:29:13.973753: step: 980/527, loss: 0.003910573199391365 2023-01-22 21:29:15.029329: step: 984/527, loss: 0.00604192353785038 2023-01-22 21:29:16.064645: step: 988/527, loss: 0.0021004544105380774 2023-01-22 21:29:17.123813: step: 992/527, loss: 0.004176327958703041 2023-01-22 21:29:18.191927: step: 996/527, loss: 0.005303438752889633 2023-01-22 21:29:19.257719: step: 1000/527, loss: 0.03525322303175926 2023-01-22 21:29:20.318187: step: 1004/527, loss: 0.005037392023950815 2023-01-22 21:29:21.372304: step: 1008/527, loss: 0.017468789592385292 2023-01-22 21:29:22.423814: step: 1012/527, loss: 0.004671341739594936 2023-01-22 21:29:23.480829: step: 1016/527, loss: 0.004056756384670734 2023-01-22 21:29:24.516954: step: 1020/527, loss: 0.00670308293774724 2023-01-22 21:29:25.559971: step: 1024/527, loss: 0.009855917654931545 2023-01-22 21:29:26.624306: step: 1028/527, loss: 0.0028353857342153788 2023-01-22 21:29:27.669518: step: 1032/527, loss: 0.0053049251437187195 2023-01-22 21:29:28.710188: step: 1036/527, loss: 0.0013734814710915089 2023-01-22 21:29:29.741473: step: 1040/527, loss: 0.008751742541790009 2023-01-22 21:29:30.790009: step: 1044/527, loss: 0.012873737141489983 2023-01-22 21:29:31.858005: step: 1048/527, loss: 0.0014865464763715863 2023-01-22 21:29:32.914765: step: 1052/527, loss: 0.0015398594550788403 2023-01-22 21:29:33.970950: step: 1056/527, loss: 0.0001848602551035583 2023-01-22 21:29:35.015751: step: 1060/527, loss: 0.0016110041178762913 2023-01-22 21:29:36.066863: step: 1064/527, loss: 0.005330005194991827 2023-01-22 21:29:37.116785: step: 1068/527, loss: 0.002192337065935135 2023-01-22 21:29:38.151904: step: 1072/527, loss: 0.0031953558791428804 2023-01-22 21:29:39.196677: step: 1076/527, loss: 0.002988615073263645 2023-01-22 21:29:40.247413: step: 1080/527, loss: 0.001418066443875432 2023-01-22 21:29:41.272866: step: 1084/527, loss: 0.0003354892542120069 2023-01-22 21:29:42.322808: step: 1088/527, loss: 
0.0049008517526090145 2023-01-22 21:29:43.395762: step: 1092/527, loss: 0.004286483861505985 2023-01-22 21:29:44.461683: step: 1096/527, loss: 0.0003213726740796119 2023-01-22 21:29:45.503658: step: 1100/527, loss: 0.00027230611885897815 2023-01-22 21:29:46.562939: step: 1104/527, loss: 0.0024362134281545877 2023-01-22 21:29:47.627595: step: 1108/527, loss: 0.0031442472245544195 2023-01-22 21:29:48.673133: step: 1112/527, loss: 0.00042018588283099234 2023-01-22 21:29:49.723721: step: 1116/527, loss: 0.00230475515127182 2023-01-22 21:29:50.775892: step: 1120/527, loss: 0.0073813265189528465 2023-01-22 21:29:51.819394: step: 1124/527, loss: 0.0018790271133184433 2023-01-22 21:29:52.892970: step: 1128/527, loss: 0.009314043447375298 2023-01-22 21:29:53.939022: step: 1132/527, loss: 0.003457744373008609 2023-01-22 21:29:54.985981: step: 1136/527, loss: 0.00036238873144611716 2023-01-22 21:29:56.028955: step: 1140/527, loss: 0.004901781212538481 2023-01-22 21:29:57.113289: step: 1144/527, loss: 0.00927270483225584 2023-01-22 21:29:58.152038: step: 1148/527, loss: 0.009323596023023129 2023-01-22 21:29:59.199788: step: 1152/527, loss: 0.00012607510143425316 2023-01-22 21:30:00.262550: step: 1156/527, loss: 0.003918622620403767 2023-01-22 21:30:01.303331: step: 1160/527, loss: 0.0005425811978057027 2023-01-22 21:30:02.360398: step: 1164/527, loss: 0.0023384541273117065 2023-01-22 21:30:03.437428: step: 1168/527, loss: 0.004741017706692219 2023-01-22 21:30:04.480693: step: 1172/527, loss: 0.0015756604261696339 2023-01-22 21:30:05.531433: step: 1176/527, loss: 0.0015137273585423827 2023-01-22 21:30:06.580947: step: 1180/527, loss: 0.00488596735522151 2023-01-22 21:30:07.635461: step: 1184/527, loss: 0.005109952297061682 2023-01-22 21:30:08.697488: step: 1188/527, loss: 0.01632414013147354 2023-01-22 21:30:09.751970: step: 1192/527, loss: 0.007905122824013233 2023-01-22 21:30:10.805582: step: 1196/527, loss: 0.0050056991167366505 2023-01-22 21:30:11.860360: step: 1200/527, 
loss: 0.003703346010297537 2023-01-22 21:30:12.917083: step: 1204/527, loss: 0.02175176329910755 2023-01-22 21:30:13.966585: step: 1208/527, loss: 0.007691751234233379 2023-01-22 21:30:15.025901: step: 1212/527, loss: 0.00457302201539278 2023-01-22 21:30:16.068123: step: 1216/527, loss: 0.004537994973361492 2023-01-22 21:30:17.119135: step: 1220/527, loss: 3.433651363593526e-05 2023-01-22 21:30:18.164848: step: 1224/527, loss: 0.00292415963485837 2023-01-22 21:30:19.239684: step: 1228/527, loss: 0.01906370185315609 2023-01-22 21:30:20.288462: step: 1232/527, loss: 0.011376405134797096 2023-01-22 21:30:21.332147: step: 1236/527, loss: 0.0011853516334667802 2023-01-22 21:30:22.383839: step: 1240/527, loss: 0.028676720336079597 2023-01-22 21:30:23.430800: step: 1244/527, loss: 0.0014396763872355223 2023-01-22 21:30:24.483111: step: 1248/527, loss: 0.012249977327883244 2023-01-22 21:30:25.530818: step: 1252/527, loss: 0.00428337138146162 2023-01-22 21:30:26.590959: step: 1256/527, loss: 0.015964098274707794 2023-01-22 21:30:27.642378: step: 1260/527, loss: 0.01603492721915245 2023-01-22 21:30:28.692684: step: 1264/527, loss: 0.003757587866857648 2023-01-22 21:30:29.749761: step: 1268/527, loss: 0.004188335966318846 2023-01-22 21:30:30.801418: step: 1272/527, loss: 0.0010117783676832914 2023-01-22 21:30:31.873269: step: 1276/527, loss: 0.002257630694657564 2023-01-22 21:30:32.934618: step: 1280/527, loss: 0.0014187361812219024 2023-01-22 21:30:34.018918: step: 1284/527, loss: 0.005581737495958805 2023-01-22 21:30:35.079666: step: 1288/527, loss: 0.0009192198631353676 2023-01-22 21:30:36.132193: step: 1292/527, loss: 0.0068116397596895695 2023-01-22 21:30:37.196932: step: 1296/527, loss: 0.0033853058703243732 2023-01-22 21:30:38.259064: step: 1300/527, loss: 0.007141647394746542 2023-01-22 21:30:39.314910: step: 1304/527, loss: 0.0015104282647371292 2023-01-22 21:30:40.359692: step: 1308/527, loss: 8.462648838758469e-05 2023-01-22 21:30:41.428185: step: 1312/527, loss: 
0.017198268324136734 2023-01-22 21:30:42.477072: step: 1316/527, loss: 4.7995759814511985e-05 2023-01-22 21:30:43.525791: step: 1320/527, loss: 0.00940526183694601 2023-01-22 21:30:44.593010: step: 1324/527, loss: 0.0032857030164450407 2023-01-22 21:30:45.654352: step: 1328/527, loss: 0.004373821895569563 2023-01-22 21:30:46.735081: step: 1332/527, loss: 0.0030779403168708086 2023-01-22 21:30:47.784508: step: 1336/527, loss: 0.003492555348202586 2023-01-22 21:30:48.827087: step: 1340/527, loss: 0.0011717055458575487 2023-01-22 21:30:49.880055: step: 1344/527, loss: 0.008130822330713272 2023-01-22 21:30:50.927083: step: 1348/527, loss: 0.009108836762607098 2023-01-22 21:30:51.974446: step: 1352/527, loss: 0.0008072527125477791 2023-01-22 21:30:53.021538: step: 1356/527, loss: 0.014012214727699757 2023-01-22 21:30:54.081002: step: 1360/527, loss: 0.011362689547240734 2023-01-22 21:30:55.134792: step: 1364/527, loss: 0.003630140097811818 2023-01-22 21:30:56.168340: step: 1368/527, loss: 0.0042950985953211784 2023-01-22 21:30:57.227171: step: 1372/527, loss: 0.0006355448858812451 2023-01-22 21:30:58.262018: step: 1376/527, loss: 0.0014422856038436294 2023-01-22 21:30:59.303254: step: 1380/527, loss: 0.0030392364133149385 2023-01-22 21:31:00.367004: step: 1384/527, loss: 0.004173503257334232 2023-01-22 21:31:01.428143: step: 1388/527, loss: 0.004709865897893906 2023-01-22 21:31:02.476127: step: 1392/527, loss: 0.0019907085224986076 2023-01-22 21:31:03.532062: step: 1396/527, loss: 0.005064527038484812 2023-01-22 21:31:04.587861: step: 1400/527, loss: 0.0015091727254912257 2023-01-22 21:31:05.635385: step: 1404/527, loss: 0.0030564062763005495 2023-01-22 21:31:06.681125: step: 1408/527, loss: 0.0012647153344005346 2023-01-22 21:31:07.724177: step: 1412/527, loss: 0.00022612253087572753 2023-01-22 21:31:08.770451: step: 1416/527, loss: 0.004740377888083458 2023-01-22 21:31:09.813067: step: 1420/527, loss: 0.0037263089325278997 2023-01-22 21:31:10.870413: step: 1424/527, 
loss: 0.0009227913105860353 2023-01-22 21:31:11.925830: step: 1428/527, loss: 0.006331590469926596 2023-01-22 21:31:12.971060: step: 1432/527, loss: 0.02578769437968731 2023-01-22 21:31:14.020988: step: 1436/527, loss: 0.00416067149490118 2023-01-22 21:31:15.079837: step: 1440/527, loss: 0.0 2023-01-22 21:31:16.110603: step: 1444/527, loss: 0.003131650621071458 2023-01-22 21:31:17.165761: step: 1448/527, loss: 0.042015351355075836 2023-01-22 21:31:18.210593: step: 1452/527, loss: 0.04288274049758911 2023-01-22 21:31:19.279748: step: 1456/527, loss: 0.0007513285963796079 2023-01-22 21:31:20.334033: step: 1460/527, loss: 0.004160204436630011 2023-01-22 21:31:21.388140: step: 1464/527, loss: 0.01646789163351059 2023-01-22 21:31:22.443751: step: 1468/527, loss: 0.006657102610915899 2023-01-22 21:31:23.497451: step: 1472/527, loss: 0.0161454938352108 2023-01-22 21:31:24.564849: step: 1476/527, loss: 0.0019072722643613815 2023-01-22 21:31:25.611542: step: 1480/527, loss: 0.0007143720868043602 2023-01-22 21:31:26.680076: step: 1484/527, loss: 0.027458615601062775 2023-01-22 21:31:27.755606: step: 1488/527, loss: 0.0007634982466697693 2023-01-22 21:31:28.795930: step: 1492/527, loss: 0.007612764835357666 2023-01-22 21:31:29.830164: step: 1496/527, loss: 0.010697748512029648 2023-01-22 21:31:30.886628: step: 1500/527, loss: 0.004767022095620632 2023-01-22 21:31:31.922737: step: 1504/527, loss: 0.003955787979066372 2023-01-22 21:31:32.962906: step: 1508/527, loss: 0.0014415780315175653 2023-01-22 21:31:34.014593: step: 1512/527, loss: 0.0030549075454473495 2023-01-22 21:31:35.060585: step: 1516/527, loss: 0.002484110416844487 2023-01-22 21:31:36.112756: step: 1520/527, loss: 0.0045503368601202965 2023-01-22 21:31:37.167058: step: 1524/527, loss: 0.008399713784456253 2023-01-22 21:31:38.206680: step: 1528/527, loss: 0.0025790345389395952 2023-01-22 21:31:39.244789: step: 1532/527, loss: 0.0015924338949844241 2023-01-22 21:31:40.293823: step: 1536/527, loss: 
0.000288614712189883 2023-01-22 21:31:41.347149: step: 1540/527, loss: 0.0005859467783011496 2023-01-22 21:31:42.400081: step: 1544/527, loss: 0.013887022621929646 2023-01-22 21:31:43.454465: step: 1548/527, loss: 0.0018600106704980135 2023-01-22 21:31:44.505889: step: 1552/527, loss: 0.008629896678030491 2023-01-22 21:31:45.561334: step: 1556/527, loss: 0.0015319508966058493 2023-01-22 21:31:46.605352: step: 1560/527, loss: 0.004325922578573227 2023-01-22 21:31:47.650840: step: 1564/527, loss: 0.0072137825191020966 2023-01-22 21:31:48.697543: step: 1568/527, loss: 0.005579197313636541 2023-01-22 21:31:49.761452: step: 1572/527, loss: 0.000697479525115341 2023-01-22 21:31:50.808818: step: 1576/527, loss: 0.001068268553353846 2023-01-22 21:31:51.856915: step: 1580/527, loss: 0.005155193153768778 2023-01-22 21:31:52.890915: step: 1584/527, loss: 0.001507272943854332 2023-01-22 21:31:53.944511: step: 1588/527, loss: 0.004740849602967501 2023-01-22 21:31:54.991864: step: 1592/527, loss: 0.008445720188319683 2023-01-22 21:31:56.040344: step: 1596/527, loss: 0.001141395652666688 2023-01-22 21:31:57.076422: step: 1600/527, loss: 0.0009254674077965319 2023-01-22 21:31:58.131049: step: 1604/527, loss: 0.004802384413778782 2023-01-22 21:31:59.169738: step: 1608/527, loss: 0.0007032358553260565 2023-01-22 21:32:00.201721: step: 1612/527, loss: 0.00888037495315075 2023-01-22 21:32:01.250396: step: 1616/527, loss: 0.003867973340675235 2023-01-22 21:32:02.294175: step: 1620/527, loss: 0.001151920179836452 2023-01-22 21:32:03.342784: step: 1624/527, loss: 0.0069147152826189995 2023-01-22 21:32:04.383549: step: 1628/527, loss: 0.00042818597285076976 2023-01-22 21:32:05.433804: step: 1632/527, loss: 0.000888426264282316 2023-01-22 21:32:06.489165: step: 1636/527, loss: 0.00192177458666265 2023-01-22 21:32:07.544062: step: 1640/527, loss: 0.005601990036666393 2023-01-22 21:32:08.583102: step: 1644/527, loss: 0.01739492267370224 2023-01-22 21:32:09.639591: step: 1648/527, loss: 
0.0011741763446480036 2023-01-22 21:32:10.692472: step: 1652/527, loss: 0.03272141516208649 2023-01-22 21:32:11.738340: step: 1656/527, loss: 0.01118404883891344 2023-01-22 21:32:12.781592: step: 1660/527, loss: 0.008036890998482704 2023-01-22 21:32:13.837788: step: 1664/527, loss: 0.002747524296864867 2023-01-22 21:32:14.890165: step: 1668/527, loss: 0.0027344771660864353 2023-01-22 21:32:15.940311: step: 1672/527, loss: 0.0010497388429939747 2023-01-22 21:32:16.993636: step: 1676/527, loss: 0.004042036831378937 2023-01-22 21:32:18.047453: step: 1680/527, loss: 0.01404307596385479 2023-01-22 21:32:19.088114: step: 1684/527, loss: 0.0013216814259067178 2023-01-22 21:32:20.134509: step: 1688/527, loss: 0.0011174700921401381 2023-01-22 21:32:21.194399: step: 1692/527, loss: 0.0011807511327788234 2023-01-22 21:32:22.235980: step: 1696/527, loss: 0.0033853594213724136 2023-01-22 21:32:23.286215: step: 1700/527, loss: 0.004801849834620953 2023-01-22 21:32:24.318596: step: 1704/527, loss: 0.00011519994586706161 2023-01-22 21:32:25.368562: step: 1708/527, loss: 0.025350751355290413 2023-01-22 21:32:26.434879: step: 1712/527, loss: 0.03762112930417061 2023-01-22 21:32:27.491244: step: 1716/527, loss: 0.0035009244456887245 2023-01-22 21:32:28.546222: step: 1720/527, loss: 0.0019672783091664314 2023-01-22 21:32:29.611909: step: 1724/527, loss: 0.0012699570506811142 2023-01-22 21:32:30.659667: step: 1728/527, loss: 0.001014609937556088 2023-01-22 21:32:31.709374: step: 1732/527, loss: 0.004141919314861298 2023-01-22 21:32:32.754079: step: 1736/527, loss: 0.006008971948176622 2023-01-22 21:32:33.824408: step: 1740/527, loss: 0.017608124762773514 2023-01-22 21:32:34.890555: step: 1744/527, loss: 0.0002709919062908739 2023-01-22 21:32:35.942001: step: 1748/527, loss: 0.015857547521591187 2023-01-22 21:32:36.997814: step: 1752/527, loss: 0.010998151265084743 2023-01-22 21:32:38.039347: step: 1756/527, loss: 0.001594930305145681 2023-01-22 21:32:39.116227: step: 1760/527, loss: 
0.0054114446975290775 2023-01-22 21:32:40.165220: step: 1764/527, loss: 0.00354445306584239 2023-01-22 21:32:41.205870: step: 1768/527, loss: 0.004710740875452757 2023-01-22 21:32:42.268829: step: 1772/527, loss: 0.005172974895685911 2023-01-22 21:32:43.320286: step: 1776/527, loss: 0.006135033909231424 2023-01-22 21:32:44.378559: step: 1780/527, loss: 3.558532989700325e-05 2023-01-22 21:32:45.440705: step: 1784/527, loss: 0.0026814830489456654 2023-01-22 21:32:46.481770: step: 1788/527, loss: 0.0025886159855872393 2023-01-22 21:32:47.534692: step: 1792/527, loss: 0.0030488884076476097 2023-01-22 21:32:48.598095: step: 1796/527, loss: 0.0024280319921672344 2023-01-22 21:32:49.672234: step: 1800/527, loss: 0.0014610904036089778 2023-01-22 21:32:50.733310: step: 1804/527, loss: 0.05553878843784332 2023-01-22 21:32:51.777537: step: 1808/527, loss: 0.000567455543205142 2023-01-22 21:32:52.823320: step: 1812/527, loss: 0.004510463681071997 2023-01-22 21:32:53.879225: step: 1816/527, loss: 0.0047310334630310535 2023-01-22 21:32:54.933112: step: 1820/527, loss: 0.003958914428949356 2023-01-22 21:32:55.979367: step: 1824/527, loss: 0.017910659313201904 2023-01-22 21:32:57.019822: step: 1828/527, loss: 0.010180640034377575 2023-01-22 21:32:58.066602: step: 1832/527, loss: 0.0016555368201807141 2023-01-22 21:32:59.119467: step: 1836/527, loss: 0.005525160115212202 2023-01-22 21:33:00.177487: step: 1840/527, loss: 0.0062970309518277645 2023-01-22 21:33:01.224319: step: 1844/527, loss: 0.002770808292552829 2023-01-22 21:33:02.284298: step: 1848/527, loss: 0.001972703728824854 2023-01-22 21:33:03.331239: step: 1852/527, loss: 0.00023240938025992364 2023-01-22 21:33:04.376998: step: 1856/527, loss: 0.00856590922921896 2023-01-22 21:33:05.436154: step: 1860/527, loss: 0.004154106602072716 2023-01-22 21:33:06.480922: step: 1864/527, loss: 0.005443803034722805 2023-01-22 21:33:07.516637: step: 1868/527, loss: 0.0017447288846597075 2023-01-22 21:33:08.566671: step: 1872/527, loss: 
0.00017517953529022634 2023-01-22 21:33:09.612418: step: 1876/527, loss: 0.0005201484309509397 2023-01-22 21:33:10.666644: step: 1880/527, loss: 0.0008196663111448288 2023-01-22 21:33:11.705360: step: 1884/527, loss: 0.003418135456740856 2023-01-22 21:33:12.771288: step: 1888/527, loss: 0.015642037615180016 2023-01-22 21:33:13.825188: step: 1892/527, loss: 0.006341900676488876 2023-01-22 21:33:14.875035: step: 1896/527, loss: 0.008579635992646217 2023-01-22 21:33:15.952826: step: 1900/527, loss: 0.01056369673460722 2023-01-22 21:33:17.003951: step: 1904/527, loss: 0.0050481874495744705 2023-01-22 21:33:18.040506: step: 1908/527, loss: 0.007153413724154234 2023-01-22 21:33:19.087878: step: 1912/527, loss: 0.006582628004252911 2023-01-22 21:33:20.148904: step: 1916/527, loss: 0.006622140295803547 2023-01-22 21:33:21.209170: step: 1920/527, loss: 0.005359706934541464 2023-01-22 21:33:22.266072: step: 1924/527, loss: 0.007415145635604858 2023-01-22 21:33:23.327833: step: 1928/527, loss: 0.04922202602028847 2023-01-22 21:33:24.370896: step: 1932/527, loss: 0.003483665408566594 2023-01-22 21:33:25.432951: step: 1936/527, loss: 0.006572945509105921 2023-01-22 21:33:26.487488: step: 1940/527, loss: 0.004526928532868624 2023-01-22 21:33:27.547574: step: 1944/527, loss: 0.005747731775045395 2023-01-22 21:33:28.603066: step: 1948/527, loss: 0.0029546748846769333 2023-01-22 21:33:29.646770: step: 1952/527, loss: 0.005977029446512461 2023-01-22 21:33:30.711575: step: 1956/527, loss: 0.008019886910915375 2023-01-22 21:33:31.763146: step: 1960/527, loss: 0.0030083435121923685 2023-01-22 21:33:32.814722: step: 1964/527, loss: 0.0013906039530411363 2023-01-22 21:33:33.860795: step: 1968/527, loss: 0.009009996429085732 2023-01-22 21:33:34.898346: step: 1972/527, loss: 0.023654289543628693 2023-01-22 21:33:35.973340: step: 1976/527, loss: 0.0029636144172400236 2023-01-22 21:33:37.021242: step: 1980/527, loss: 0.00697800749912858 2023-01-22 21:33:38.071221: step: 1984/527, loss: 
0.0019071944989264011 2023-01-22 21:33:39.117108: step: 1988/527, loss: 0.002265317365527153 2023-01-22 21:33:40.159773: step: 1992/527, loss: 0.0013044317020103335 2023-01-22 21:33:41.212983: step: 1996/527, loss: 0.001633031410165131 2023-01-22 21:33:42.253377: step: 2000/527, loss: 0.0031928825192153454 2023-01-22 21:33:43.307849: step: 2004/527, loss: 0.0016749455826357007 2023-01-22 21:33:44.369200: step: 2008/527, loss: 0.004191864747554064 2023-01-22 21:33:45.405360: step: 2012/527, loss: 1.1844836990348995e-05 2023-01-22 21:33:46.445394: step: 2016/527, loss: 0.0002957037650048733 2023-01-22 21:33:47.493732: step: 2020/527, loss: 0.0029167046304792166 2023-01-22 21:33:48.523690: step: 2024/527, loss: 0.0004001189663540572 2023-01-22 21:33:49.570242: step: 2028/527, loss: 0.00020117717212997377 2023-01-22 21:33:50.611771: step: 2032/527, loss: 0.010397354140877724 2023-01-22 21:33:51.647243: step: 2036/527, loss: 0.006372228730469942 2023-01-22 21:33:52.703503: step: 2040/527, loss: 0.006984546780586243 2023-01-22 21:33:53.755725: step: 2044/527, loss: 0.013308129273355007 2023-01-22 21:33:54.808472: step: 2048/527, loss: 0.007427938748151064 2023-01-22 21:33:55.859064: step: 2052/527, loss: 0.00517807062715292 2023-01-22 21:33:56.914806: step: 2056/527, loss: 0.003048629965633154 2023-01-22 21:33:57.948215: step: 2060/527, loss: 0.001386835239827633 2023-01-22 21:33:58.983186: step: 2064/527, loss: 7.517064659623429e-07 2023-01-22 21:34:00.018035: step: 2068/527, loss: 0.000144703546538949 2023-01-22 21:34:01.069952: step: 2072/527, loss: 0.005975764710456133 2023-01-22 21:34:02.136371: step: 2076/527, loss: 0.006867633201181889 2023-01-22 21:34:03.178608: step: 2080/527, loss: 0.004074363503605127 2023-01-22 21:34:04.237084: step: 2084/527, loss: 0.0009200856438837945 2023-01-22 21:34:05.302463: step: 2088/527, loss: 0.07849445939064026 2023-01-22 21:34:06.345141: step: 2092/527, loss: 0.0019149655709043145 2023-01-22 21:34:07.408864: step: 2096/527, loss: 
0.006038849242031574 2023-01-22 21:34:08.480256: step: 2100/527, loss: 0.009350918233394623 2023-01-22 21:34:09.523713: step: 2104/527, loss: 0.05594834312796593 2023-01-22 21:34:10.567242: step: 2108/527, loss: 0.01078872475773096
==================================================
Loss: 0.006
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3232546542553192, 'r': 0.3459499525616698, 'f1': 0.33421746104491296}, 'combined': 0.2462654976120411, 'stategy': 1, 'epoch': 10}
Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3386891773755409, 'r': 0.31282564019413595, 'f1': 0.32524404935118106}, 'combined': 0.20815619158475585, 'stategy': 1, 'epoch': 10}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.314403401163692, 'r': 0.3585511273232996, 'f1': 0.33502915620457246}, 'combined': 0.24686358878231654, 'stategy': 1, 'epoch': 10}
Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3536407504383829, 'r': 0.31666921743800647, 'f1': 0.33413538530628983}, 'combined': 0.21384664659602545, 'stategy': 1, 'epoch': 10}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3290390317764698, 'r': 0.3359070191570034, 'f1': 0.3324375569873066}, 'combined': 0.244953989359068, 'stategy': 1, 'epoch': 10}
Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3623960772372005, 'r': 0.30040293572619625, 'f1': 0.3285003247393927}, 'combined': 0.23552853471880988, 'stategy': 1, 'epoch': 10}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 10}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 10}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'stategy': 1, 'epoch': 10}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0}
Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3295121025991449, 'r': 0.33576470416649107, 'f1': 0.33260902085665567}, 'combined': 0.24508033115753575, 'stategy': 1, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3605647599524678, 'r': 0.29724446998811266, 'f1': 0.3258570299420806}, 'combined': 0.23363334222262386, 'stategy': 1, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 8}
******************************
Epoch: 11
command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4
2023-01-22 21:36:44.188308: step: 4/527, loss: 0.0043526035733520985 2023-01-22 21:36:45.239703: step: 8/527, loss: 0.0023075598292052746 2023-01-22 21:36:46.297389: step: 12/527, loss: 0.0019912654533982277 2023-01-22 21:36:47.342957: step: 16/527, loss: 0.012579189613461494 2023-01-22 21:36:48.398587: step: 20/527, loss: 0.002191648120060563 2023-01-22 21:36:49.458517: step: 24/527, loss: 0.005206662230193615 2023-01-22 21:36:50.490553: step: 28/527, loss: 9.666178812040016e-06 2023-01-22 21:36:51.527742: step: 32/527, loss: 0.002191325882449746 2023-01-22 21:36:52.580570: step: 36/527, loss: 0.00243182759732008 2023-01-22 21:36:53.620028: step: 40/527, loss: 0.002461265306919813 2023-01-22 21:36:54.662254: step: 44/527, loss: 0.0010654813377186656 2023-01-22 21:36:55.714834: step: 48/527, loss: 0.0029627038165926933 2023-01-22 21:36:56.758185: step: 52/527, loss: 0.0016816186252981424 2023-01-22 21:36:57.814956: step: 56/527, loss: 0.011506606824696064 2023-01-22 21:36:58.856763: step: 60/527, loss: 0.0017763753421604633 2023-01-22
21:36:59.896405: step: 64/527, loss: 0.0005496774101629853 2023-01-22 21:37:00.930426: step: 68/527, loss: 0.006504182703793049 2023-01-22 21:37:01.988175: step: 72/527, loss: 0.0038948948495090008 2023-01-22 21:37:03.021366: step: 76/527, loss: 0.007334815338253975 2023-01-22 21:37:04.059555: step: 80/527, loss: 0.0004767587233800441 2023-01-22 21:37:05.117304: step: 84/527, loss: 0.0004721745499409735 2023-01-22 21:37:06.162779: step: 88/527, loss: 0.008727424778044224 2023-01-22 21:37:07.200599: step: 92/527, loss: 0.004020388703793287 2023-01-22 21:37:08.254234: step: 96/527, loss: 0.0025085019879043102 2023-01-22 21:37:09.305280: step: 100/527, loss: 0.0015356676885858178 2023-01-22 21:37:10.355988: step: 104/527, loss: 0.008122754283249378 2023-01-22 21:37:11.393287: step: 108/527, loss: 0.0042010401375591755 2023-01-22 21:37:12.447322: step: 112/527, loss: 0.010724453255534172 2023-01-22 21:37:13.512697: step: 116/527, loss: 0.005484436172991991 2023-01-22 21:37:14.556933: step: 120/527, loss: 0.003741045016795397 2023-01-22 21:37:15.590213: step: 124/527, loss: 0.00021263000962790102 2023-01-22 21:37:16.651585: step: 128/527, loss: 0.00572225684300065 2023-01-22 21:37:17.701162: step: 132/527, loss: 0.002912192139774561 2023-01-22 21:37:18.761928: step: 136/527, loss: 0.0014544990845024586 2023-01-22 21:37:19.802005: step: 140/527, loss: 0.0021721383091062307 2023-01-22 21:37:20.841690: step: 144/527, loss: 0.0064918044954538345 2023-01-22 21:37:21.884777: step: 148/527, loss: 0.02143397368490696 2023-01-22 21:37:22.908714: step: 152/527, loss: 0.0012229107087478042 2023-01-22 21:37:23.970024: step: 156/527, loss: 0.009608855471014977 2023-01-22 21:37:25.037773: step: 160/527, loss: 0.010895432904362679 2023-01-22 21:37:26.072383: step: 164/527, loss: 4.52004860562738e-05 2023-01-22 21:37:27.111534: step: 168/527, loss: 0.00017657164426054806 2023-01-22 21:37:28.140342: step: 172/527, loss: 0.004457660019397736 2023-01-22 21:37:29.181675: step: 176/527, 
loss: 0.010944776237010956 2023-01-22 21:37:30.219342: step: 180/527, loss: 0.005464617628604174 2023-01-22 21:37:31.260753: step: 184/527, loss: 0.0021822936832904816 2023-01-22 21:37:32.306588: step: 188/527, loss: 0.0020280620083212852 2023-01-22 21:37:33.366738: step: 192/527, loss: 0.005777272861450911 2023-01-22 21:37:34.409896: step: 196/527, loss: 0.0025322050787508488 2023-01-22 21:37:35.462879: step: 200/527, loss: 0.0053939190693199635 2023-01-22 21:37:36.533573: step: 204/527, loss: 0.005446792580187321 2023-01-22 21:37:37.578890: step: 208/527, loss: 0.005624681245535612 2023-01-22 21:37:38.634534: step: 212/527, loss: 0.0015547068323940039 2023-01-22 21:37:39.677165: step: 216/527, loss: 0.0020389468409121037 2023-01-22 21:37:40.707694: step: 220/527, loss: 0.0009207907714881003 2023-01-22 21:37:41.771154: step: 224/527, loss: 0.004429956432431936 2023-01-22 21:37:42.838809: step: 228/527, loss: 0.009660227224230766 2023-01-22 21:37:43.877850: step: 232/527, loss: 0.0010038953041657805 2023-01-22 21:37:44.931985: step: 236/527, loss: 0.0038022464141249657 2023-01-22 21:37:45.974526: step: 240/527, loss: 0.0037014468107372522 2023-01-22 21:37:47.009523: step: 244/527, loss: 0.0024868869222700596 2023-01-22 21:37:48.056421: step: 248/527, loss: 0.00047362822806462646 2023-01-22 21:37:49.117032: step: 252/527, loss: 0.02443956770002842 2023-01-22 21:37:50.190836: step: 256/527, loss: 0.005628102924674749 2023-01-22 21:37:51.242488: step: 260/527, loss: 0.0009129595127888024 2023-01-22 21:37:52.280878: step: 264/527, loss: 0.0025695806834846735 2023-01-22 21:37:53.332853: step: 268/527, loss: 0.004168948158621788 2023-01-22 21:37:54.384816: step: 272/527, loss: 0.001953809754922986 2023-01-22 21:37:55.427523: step: 276/527, loss: 0.0073697008192539215 2023-01-22 21:37:56.469440: step: 280/527, loss: 0.00013159932859707624 2023-01-22 21:37:57.517736: step: 284/527, loss: 0.0006879745051264763 2023-01-22 21:37:58.564416: step: 288/527, loss: 
0.001803342835046351 2023-01-22 21:37:59.613475: step: 292/527, loss: 0.0006572796846739948 2023-01-22 21:38:00.665021: step: 296/527, loss: 0.0022013038396835327 2023-01-22 21:38:01.719537: step: 300/527, loss: 0.0012773108901455998 2023-01-22 21:38:02.790716: step: 304/527, loss: 2.185352786909789e-05 2023-01-22 21:38:03.842905: step: 308/527, loss: 0.004841454327106476 2023-01-22 21:38:04.903255: step: 312/527, loss: 0.02062935009598732 2023-01-22 21:38:05.988849: step: 316/527, loss: 0.008016521111130714 2023-01-22 21:38:07.039384: step: 320/527, loss: 0.002767492551356554 2023-01-22 21:38:08.100966: step: 324/527, loss: 0.0008753962465561926 2023-01-22 21:38:09.162999: step: 328/527, loss: 0.0010826876387000084 2023-01-22 21:38:10.209636: step: 332/527, loss: 0.006960335653275251 2023-01-22 21:38:11.269485: step: 336/527, loss: 0.00998025480657816 2023-01-22 21:38:12.325472: step: 340/527, loss: 0.012459678575396538 2023-01-22 21:38:13.384219: step: 344/527, loss: 0.006665992550551891 2023-01-22 21:38:14.441813: step: 348/527, loss: 0.0006250516162253916 2023-01-22 21:38:15.481217: step: 352/527, loss: 0.000537982617970556 2023-01-22 21:38:16.542102: step: 356/527, loss: 0.005042615812271833 2023-01-22 21:38:17.587263: step: 360/527, loss: 0.011064927093684673 2023-01-22 21:38:18.624586: step: 364/527, loss: 0.0042867474257946014 2023-01-22 21:38:19.700569: step: 368/527, loss: 1.4619061403209344e-05 2023-01-22 21:38:20.761667: step: 372/527, loss: 0.007952438667416573 2023-01-22 21:38:21.809937: step: 376/527, loss: 0.0026353937573730946 2023-01-22 21:38:22.865868: step: 380/527, loss: 0.0005096009699627757 2023-01-22 21:38:23.931000: step: 384/527, loss: 0.003215527394786477 2023-01-22 21:38:24.978217: step: 388/527, loss: 0.0015521569876000285 2023-01-22 21:38:26.031744: step: 392/527, loss: 0.017297156155109406 2023-01-22 21:38:27.082870: step: 396/527, loss: 0.0006201857468113303 2023-01-22 21:38:28.130840: step: 400/527, loss: 0.004360498394817114 
2023-01-22 21:38:29.186794: step: 404/527, loss: 0.006486785132437944 2023-01-22 21:38:30.231283: step: 408/527, loss: 0.008071518503129482 2023-01-22 21:38:31.291751: step: 412/527, loss: 0.0024542007595300674 2023-01-22 21:38:32.357189: step: 416/527, loss: 0.0007408526726067066 2023-01-22 21:38:33.392734: step: 420/527, loss: 0.0022004658821970224 2023-01-22 21:38:34.443792: step: 424/527, loss: 0.0023798802867531776 2023-01-22 21:38:35.490760: step: 428/527, loss: 0.0014789014821872115 2023-01-22 21:38:36.557673: step: 432/527, loss: 0.0024607426021248102 2023-01-22 21:38:37.612156: step: 436/527, loss: 0.0018823177088052034 2023-01-22 21:38:38.676517: step: 440/527, loss: 0.009442086331546307 2023-01-22 21:38:39.724594: step: 444/527, loss: 0.0041308957152068615 2023-01-22 21:38:40.790698: step: 448/527, loss: 0.003608011407777667 2023-01-22 21:38:41.878844: step: 452/527, loss: 0.005427576135843992 2023-01-22 21:38:42.933585: step: 456/527, loss: 0.007602985482662916 2023-01-22 21:38:43.981205: step: 460/527, loss: 0.011451112106442451 2023-01-22 21:38:45.039140: step: 464/527, loss: 0.0002630161470733583 2023-01-22 21:38:46.071486: step: 468/527, loss: 0.0030766415875405073 2023-01-22 21:38:47.119213: step: 472/527, loss: 0.005387390032410622 2023-01-22 21:38:48.171198: step: 476/527, loss: 0.0015508156502619386 2023-01-22 21:38:49.248028: step: 480/527, loss: 0.007273146882653236 2023-01-22 21:38:50.292180: step: 484/527, loss: 0.002591175027191639 2023-01-22 21:38:51.341429: step: 488/527, loss: 0.0002340820647077635 2023-01-22 21:38:52.396732: step: 492/527, loss: 0.01507216040045023 2023-01-22 21:38:53.471856: step: 496/527, loss: 0.011896253563463688 2023-01-22 21:38:54.523888: step: 500/527, loss: 0.0026756778825074434 2023-01-22 21:38:55.577961: step: 504/527, loss: 0.001898287096992135 2023-01-22 21:38:56.637204: step: 508/527, loss: 0.002242229413241148 2023-01-22 21:38:57.695607: step: 512/527, loss: 0.002066503744572401 2023-01-22 21:38:58.742203: 
step: 516/527, loss: 0.0019434496061876416 2023-01-22 21:38:59.790975: step: 520/527, loss: 0.002433086046949029 2023-01-22 21:39:00.851433: step: 524/527, loss: 0.0019725847523659468 2023-01-22 21:39:01.886621: step: 528/527, loss: 0.0021631342824548483 2023-01-22 21:39:02.932961: step: 532/527, loss: 0.0002122735750162974 2023-01-22 21:39:03.988363: step: 536/527, loss: 0.003666941076517105 2023-01-22 21:39:05.043833: step: 540/527, loss: 0.0038509280420839787 2023-01-22 21:39:06.081670: step: 544/527, loss: 0.0008640268933959305 2023-01-22 21:39:07.132762: step: 548/527, loss: 0.0010096246842294931 2023-01-22 21:39:08.190923: step: 552/527, loss: 0.0038988187443464994 2023-01-22 21:39:09.247503: step: 556/527, loss: 0.0017615047981962562 2023-01-22 21:39:10.294260: step: 560/527, loss: 0.0041091907769441605 2023-01-22 21:39:11.345861: step: 564/527, loss: 0.06354616582393646 2023-01-22 21:39:12.404985: step: 568/527, loss: 0.0022531517315655947 2023-01-22 21:39:13.446276: step: 572/527, loss: 0.009618941694498062 2023-01-22 21:39:14.492581: step: 576/527, loss: 0.002705459948629141 2023-01-22 21:39:15.557658: step: 580/527, loss: 0.010746491141617298 2023-01-22 21:39:16.621362: step: 584/527, loss: 0.0004461625940166414 2023-01-22 21:39:17.675476: step: 588/527, loss: 0.001540738856419921 2023-01-22 21:39:18.746949: step: 592/527, loss: 0.004156941082328558 2023-01-22 21:39:19.795300: step: 596/527, loss: 0.0029327841475605965 2023-01-22 21:39:20.860215: step: 600/527, loss: 0.002596890786662698 2023-01-22 21:39:21.915695: step: 604/527, loss: 0.0054204463958740234 2023-01-22 21:39:22.964950: step: 608/527, loss: 0.0019639183301478624 2023-01-22 21:39:24.014173: step: 612/527, loss: 0.0015900291036814451 2023-01-22 21:39:25.077139: step: 616/527, loss: 0.004255213309079409 2023-01-22 21:39:26.116879: step: 620/527, loss: 0.014682373963296413 2023-01-22 21:39:27.169569: step: 624/527, loss: 0.0022347094491124153 2023-01-22 21:39:28.210650: step: 628/527, loss: 
0.0032233865931630135 2023-01-22 21:39:29.263198: step: 632/527, loss: 0.00254452764056623 2023-01-22 21:39:30.325036: step: 636/527, loss: 0.002465637866407633 2023-01-22 21:39:31.395227: step: 640/527, loss: 0.00031317968387156725 2023-01-22 21:39:32.460289: step: 644/527, loss: 0.009111504070460796 2023-01-22 21:39:33.510890: step: 648/527, loss: 2.2168167561176233e-05 2023-01-22 21:39:34.582678: step: 652/527, loss: 0.005279306787997484 2023-01-22 21:39:35.639584: step: 656/527, loss: 0.0 2023-01-22 21:39:36.709003: step: 660/527, loss: 0.013738441281020641 2023-01-22 21:39:37.774046: step: 664/527, loss: 0.0006977031007409096 2023-01-22 21:39:38.826584: step: 668/527, loss: 0.0022768783383071423 2023-01-22 21:39:39.879765: step: 672/527, loss: 0.0042579504661262035 2023-01-22 21:39:40.941228: step: 676/527, loss: 0.0034059248864650726 2023-01-22 21:39:42.017488: step: 680/527, loss: 0.0010519471252337098 2023-01-22 21:39:43.078361: step: 684/527, loss: 0.005364408250898123 2023-01-22 21:39:44.133851: step: 688/527, loss: 0.0002478122478350997 2023-01-22 21:39:45.182626: step: 692/527, loss: 0.0005271011614240706 2023-01-22 21:39:46.243982: step: 696/527, loss: 0.0017791267018765211 2023-01-22 21:39:47.309501: step: 700/527, loss: 0.0022672144696116447 2023-01-22 21:39:48.354721: step: 704/527, loss: 0.003363115945830941 2023-01-22 21:39:49.407145: step: 708/527, loss: 0.005192456301301718 2023-01-22 21:39:50.470636: step: 712/527, loss: 0.0006168190157040954 2023-01-22 21:39:51.526638: step: 716/527, loss: 0.0028047868981957436 2023-01-22 21:39:52.571686: step: 720/527, loss: 0.001376513042487204 2023-01-22 21:39:53.629309: step: 724/527, loss: 0.00010021023626904935 2023-01-22 21:39:54.697164: step: 728/527, loss: 0.0011807429837062955 2023-01-22 21:39:55.749707: step: 732/527, loss: 0.00048385339323431253 2023-01-22 21:39:56.803683: step: 736/527, loss: 0.00017284642672166228 2023-01-22 21:39:57.840240: step: 740/527, loss: 0.00017498839588370174 2023-01-22 
21:39:58.890644: step: 744/527, loss: 0.005302312783896923 2023-01-22 21:39:59.934038: step: 748/527, loss: 0.0014845379628241062 2023-01-22 21:40:00.994974: step: 752/527, loss: 0.01043526828289032 2023-01-22 21:40:02.046614: step: 756/527, loss: 0.0024213765282183886 2023-01-22 21:40:03.086863: step: 760/527, loss: 0.0011087879538536072 2023-01-22 21:40:04.147920: step: 764/527, loss: 0.00528079504147172 2023-01-22 21:40:05.190014: step: 768/527, loss: 6.313556514214724e-05 2023-01-22 21:40:06.242135: step: 772/527, loss: 0.0012994182761758566 2023-01-22 21:40:07.294414: step: 776/527, loss: 0.003603674005717039 2023-01-22 21:40:08.342645: step: 780/527, loss: 3.5160126572009176e-05 2023-01-22 21:40:09.398621: step: 784/527, loss: 0.006024331320077181 2023-01-22 21:40:10.442269: step: 788/527, loss: 0.00414207624271512 2023-01-22 21:40:11.527328: step: 792/527, loss: 0.000872282253112644 2023-01-22 21:40:12.596578: step: 796/527, loss: 0.0005333773442544043 2023-01-22 21:40:13.650385: step: 800/527, loss: 0.002157765906304121 2023-01-22 21:40:14.704242: step: 804/527, loss: 0.02121039852499962 2023-01-22 21:40:15.761340: step: 808/527, loss: 0.0024337389040738344 2023-01-22 21:40:16.820154: step: 812/527, loss: 0.0003536268195603043 2023-01-22 21:40:17.864984: step: 816/527, loss: 0.00033942601294256747 2023-01-22 21:40:18.910793: step: 820/527, loss: 0.006443643476814032 2023-01-22 21:40:19.961943: step: 824/527, loss: 0.0020841641817241907 2023-01-22 21:40:21.006405: step: 828/527, loss: 0.0006504695629701018 2023-01-22 21:40:22.057202: step: 832/527, loss: 0.013514254242181778 2023-01-22 21:40:23.109520: step: 836/527, loss: 0.006408346351236105 2023-01-22 21:40:24.158942: step: 840/527, loss: 0.009259113110601902 2023-01-22 21:40:25.209006: step: 844/527, loss: 0.03093450888991356 2023-01-22 21:40:26.272071: step: 848/527, loss: 0.006811351515352726 2023-01-22 21:40:27.337886: step: 852/527, loss: 0.0011162091977894306 2023-01-22 21:40:28.379334: step: 
856/527, loss: 0.0030295744072645903 2023-01-22 21:40:29.426466: step: 860/527, loss: 0.002891014562919736 2023-01-22 21:40:30.485026: step: 864/527, loss: 0.001489816466346383 2023-01-22 21:40:31.554758: step: 868/527, loss: 0.0006352232885546982 2023-01-22 21:40:32.601552: step: 872/527, loss: 0.005036013666540384 2023-01-22 21:40:33.668677: step: 876/527, loss: 0.004445035010576248 2023-01-22 21:40:34.729954: step: 880/527, loss: 0.03216918557882309 2023-01-22 21:40:35.771365: step: 884/527, loss: 0.006238726433366537 2023-01-22 21:40:36.816949: step: 888/527, loss: 0.0025143115781247616 2023-01-22 21:40:37.889128: step: 892/527, loss: 0.010279372334480286 2023-01-22 21:40:38.925987: step: 896/527, loss: 0.0053979442454874516 2023-01-22 21:40:39.993654: step: 900/527, loss: 0.00488645862787962 2023-01-22 21:40:41.046806: step: 904/527, loss: 0.013648072257637978 2023-01-22 21:40:42.089451: step: 908/527, loss: 0.003178500337526202 2023-01-22 21:40:43.143478: step: 912/527, loss: 1.1075104339397512e-05 2023-01-22 21:40:44.203309: step: 916/527, loss: 0.0012122975895181298 2023-01-22 21:40:45.268962: step: 920/527, loss: 0.01264969538897276 2023-01-22 21:40:46.319111: step: 924/527, loss: 0.0016744674649089575 2023-01-22 21:40:47.371962: step: 928/527, loss: 0.0019454541616141796 2023-01-22 21:40:48.418934: step: 932/527, loss: 0.0008535322267562151 2023-01-22 21:40:49.470252: step: 936/527, loss: 0.011696591973304749 2023-01-22 21:40:50.522589: step: 940/527, loss: 0.012296185828745365 2023-01-22 21:40:51.596280: step: 944/527, loss: 0.0055004507303237915 2023-01-22 21:40:52.654943: step: 948/527, loss: 0.047577809542417526 2023-01-22 21:40:53.703667: step: 952/527, loss: 0.009579057805240154 2023-01-22 21:40:54.770273: step: 956/527, loss: 0.0096200630068779 2023-01-22 21:40:55.822226: step: 960/527, loss: 0.0025058332830667496 2023-01-22 21:40:56.887006: step: 964/527, loss: 0.005407311022281647 2023-01-22 21:40:57.935321: step: 968/527, loss: 
0.00035533253685571253 2023-01-22 21:40:58.988183: step: 972/527, loss: 0.003501267870888114 2023-01-22 21:41:00.039783: step: 976/527, loss: 0.00415364233776927 2023-01-22 21:41:01.091059: step: 980/527, loss: 0.0027824013959616423 2023-01-22 21:41:02.151100: step: 984/527, loss: 0.001865458209067583 2023-01-22 21:41:03.211848: step: 988/527, loss: 0.004339116159826517 2023-01-22 21:41:04.251758: step: 992/527, loss: 0.006068643648177385 2023-01-22 21:41:05.308599: step: 996/527, loss: 0.003729678923264146 2023-01-22 21:41:06.352486: step: 1000/527, loss: 0.004127271473407745 2023-01-22 21:41:07.401129: step: 1004/527, loss: 0.001441058237105608 2023-01-22 21:41:08.462970: step: 1008/527, loss: 0.02191963605582714 2023-01-22 21:41:09.520129: step: 1012/527, loss: 0.0009864723542705178 2023-01-22 21:41:10.587215: step: 1016/527, loss: 0.006173207890242338 2023-01-22 21:41:11.659985: step: 1020/527, loss: 0.004782783333212137 2023-01-22 21:41:12.699384: step: 1024/527, loss: 0.00021754145564045757 2023-01-22 21:41:13.757982: step: 1028/527, loss: 0.017618104815483093 2023-01-22 21:41:14.807084: step: 1032/527, loss: 0.000496803259011358 2023-01-22 21:41:15.864232: step: 1036/527, loss: 0.0011742659844458103 2023-01-22 21:41:16.917947: step: 1040/527, loss: 0.0009261745144613087 2023-01-22 21:41:17.971196: step: 1044/527, loss: 0.05287783965468407 2023-01-22 21:41:19.013523: step: 1048/527, loss: 0.0008137134136632085 2023-01-22 21:41:20.057305: step: 1052/527, loss: 0.02463652566075325 2023-01-22 21:41:21.111554: step: 1056/527, loss: 0.00018551468383520842 2023-01-22 21:41:22.155825: step: 1060/527, loss: 0.011669578962028027 2023-01-22 21:41:23.212162: step: 1064/527, loss: 0.004068742040544748 2023-01-22 21:41:24.258841: step: 1068/527, loss: 0.0034586458932608366 2023-01-22 21:41:25.313983: step: 1072/527, loss: 0.004754193127155304 2023-01-22 21:41:26.378039: step: 1076/527, loss: 0.006056899204850197 2023-01-22 21:41:27.420687: step: 1080/527, loss: 
0.0077989427372813225 2023-01-22 21:41:28.473166: step: 1084/527, loss: 0.0011496876832097769 2023-01-22 21:41:29.525505: step: 1088/527, loss: 0.007621351163834333 2023-01-22 21:41:30.566357: step: 1092/527, loss: 0.0016001664334908128 2023-01-22 21:41:31.614480: step: 1096/527, loss: 0.0009155957377515733 2023-01-22 21:41:32.670918: step: 1100/527, loss: 0.0011686549987643957 2023-01-22 21:41:33.726982: step: 1104/527, loss: 0.004579978995025158 2023-01-22 21:41:34.778949: step: 1108/527, loss: 0.0007663153228349984 2023-01-22 21:41:35.844005: step: 1112/527, loss: 0.0001973050821106881 2023-01-22 21:41:36.896292: step: 1116/527, loss: 0.005455676931887865 2023-01-22 21:41:37.940196: step: 1120/527, loss: 0.0017178180860355496 2023-01-22 21:41:38.995786: step: 1124/527, loss: 0.0034558384213596582 2023-01-22 21:41:40.066082: step: 1128/527, loss: 0.0024458167608827353 2023-01-22 21:41:41.119255: step: 1132/527, loss: 0.0037678834050893784 2023-01-22 21:41:42.180499: step: 1136/527, loss: 0.007445403374731541 2023-01-22 21:41:43.227384: step: 1140/527, loss: 0.00038117272197268903 2023-01-22 21:41:44.275903: step: 1144/527, loss: 0.005706341937184334 2023-01-22 21:41:45.329070: step: 1148/527, loss: 0.0030229093972593546 2023-01-22 21:41:46.395559: step: 1152/527, loss: 0.006949125323444605 2023-01-22 21:41:47.451812: step: 1156/527, loss: 0.008406054228544235 2023-01-22 21:41:48.499723: step: 1160/527, loss: 0.00869454350322485 2023-01-22 21:41:49.559633: step: 1164/527, loss: 0.010242755524814129 2023-01-22 21:41:50.606881: step: 1168/527, loss: 0.002660030033439398 2023-01-22 21:41:51.671297: step: 1172/527, loss: 0.008592398837208748 2023-01-22 21:41:52.707744: step: 1176/527, loss: 0.008411075919866562 2023-01-22 21:41:53.779413: step: 1180/527, loss: 0.01542226318269968 2023-01-22 21:41:54.848759: step: 1184/527, loss: 0.008960546925663948 2023-01-22 21:41:55.897584: step: 1188/527, loss: 0.006315885577350855 2023-01-22 21:41:56.952093: step: 1192/527, loss: 
0.007986439391970634 2023-01-22 21:41:58.014340: step: 1196/527, loss: 0.005045855883508921 2023-01-22 21:41:59.070048: step: 1200/527, loss: 0.0020497306250035763 2023-01-22 21:42:00.114733: step: 1204/527, loss: 0.007824450731277466 2023-01-22 21:42:01.178388: step: 1208/527, loss: 0.005716984160244465 2023-01-22 21:42:02.228669: step: 1212/527, loss: 0.0019280440174043179 2023-01-22 21:42:03.269938: step: 1216/527, loss: 0.01253961119800806 2023-01-22 21:42:04.328801: step: 1220/527, loss: 0.02188253402709961 2023-01-22 21:42:05.379430: step: 1224/527, loss: 0.00019145748228766024 2023-01-22 21:42:06.435390: step: 1228/527, loss: 0.003739667357876897 2023-01-22 21:42:07.488861: step: 1232/527, loss: 0.020722458139061928 2023-01-22 21:42:08.547889: step: 1236/527, loss: 0.021839477121829987 2023-01-22 21:42:09.596577: step: 1240/527, loss: 0.005633394233882427 2023-01-22 21:42:10.640906: step: 1244/527, loss: 0.0013857269659638405 2023-01-22 21:42:11.693930: step: 1248/527, loss: 0.008683423511683941 2023-01-22 21:42:12.761919: step: 1252/527, loss: 0.006105201318860054 2023-01-22 21:42:13.811035: step: 1256/527, loss: 3.8500882510561496e-05 2023-01-22 21:42:14.858639: step: 1260/527, loss: 0.0009859050624072552 2023-01-22 21:42:15.889117: step: 1264/527, loss: 0.0005384812830016017 2023-01-22 21:42:16.939471: step: 1268/527, loss: 0.01527114026248455 2023-01-22 21:42:17.991208: step: 1272/527, loss: 0.013376198709011078 2023-01-22 21:42:19.028914: step: 1276/527, loss: 0.004428771790117025 2023-01-22 21:42:20.092789: step: 1280/527, loss: 0.0067210509441792965 2023-01-22 21:42:21.153086: step: 1284/527, loss: 0.00804841797798872 2023-01-22 21:42:22.202143: step: 1288/527, loss: 0.0025202713441103697 2023-01-22 21:42:23.258285: step: 1292/527, loss: 0.0035377750173211098 2023-01-22 21:42:24.302340: step: 1296/527, loss: 0.0032801826018840075 2023-01-22 21:42:25.345429: step: 1300/527, loss: 0.0031494859140366316 2023-01-22 21:42:26.409376: step: 1304/527, loss: 
0.0032288830261677504 2023-01-22 21:42:27.450297: step: 1308/527, loss: 0.004604824353009462 2023-01-22 21:42:28.505683: step: 1312/527, loss: 0.002553367055952549 2023-01-22 21:42:29.561861: step: 1316/527, loss: 0.017564352601766586 2023-01-22 21:42:30.620531: step: 1320/527, loss: 0.010695389471948147 2023-01-22 21:42:31.671063: step: 1324/527, loss: 0.0026875792536884546 2023-01-22 21:42:32.733860: step: 1328/527, loss: 0.012935048900544643 2023-01-22 21:42:33.786788: step: 1332/527, loss: 0.030531177297234535 2023-01-22 21:42:34.825927: step: 1336/527, loss: 0.0018842265708371997 2023-01-22 21:42:35.879608: step: 1340/527, loss: 0.0009928299114108086 2023-01-22 21:42:36.935714: step: 1344/527, loss: 0.08753965049982071 2023-01-22 21:42:37.991268: step: 1348/527, loss: 0.00033283248194493353 2023-01-22 21:42:39.048455: step: 1352/527, loss: 0.0019588300492614508 2023-01-22 21:42:40.103668: step: 1356/527, loss: 0.0076537844724953175 2023-01-22 21:42:41.157802: step: 1360/527, loss: 0.0036690644919872284 2023-01-22 21:42:42.199827: step: 1364/527, loss: 0.0029536555521190166 2023-01-22 21:42:43.258745: step: 1368/527, loss: 0.004815292079001665 2023-01-22 21:42:44.295545: step: 1372/527, loss: 0.005803946405649185 2023-01-22 21:42:45.346187: step: 1376/527, loss: 0.0026500229723751545 2023-01-22 21:42:46.382519: step: 1380/527, loss: 0.00893679540604353 2023-01-22 21:42:47.431178: step: 1384/527, loss: 0.0014254737179726362 2023-01-22 21:42:48.482088: step: 1388/527, loss: 0.002962946891784668 2023-01-22 21:42:49.535470: step: 1392/527, loss: 0.0010838632006198168 2023-01-22 21:42:50.584471: step: 1396/527, loss: 0.0016842987388372421 2023-01-22 21:42:51.628790: step: 1400/527, loss: 0.0007154577760957181 2023-01-22 21:42:52.681644: step: 1404/527, loss: 0.009588307701051235 2023-01-22 21:42:53.733746: step: 1408/527, loss: 0.006933812517672777 2023-01-22 21:42:54.785850: step: 1412/527, loss: 0.001652062637731433 2023-01-22 21:42:55.844548: step: 1416/527, 
loss: 0.0175021942704916 2023-01-22 21:42:56.884971: step: 1420/527, loss: 0.007716262713074684 2023-01-22 21:42:57.934590: step: 1424/527, loss: 0.0044432831928133965 2023-01-22 21:42:58.985157: step: 1428/527, loss: 0.00327146053314209 2023-01-22 21:43:00.022020: step: 1432/527, loss: 7.69982289057225e-05 2023-01-22 21:43:01.083219: step: 1436/527, loss: 0.015095869079232216 2023-01-22 21:43:02.152221: step: 1440/527, loss: 0.005716430023312569 2023-01-22 21:43:03.202382: step: 1444/527, loss: 0.014631562866270542 2023-01-22 21:43:04.252437: step: 1448/527, loss: 0.00968735758215189 2023-01-22 21:43:05.295885: step: 1452/527, loss: 0.0002707011008169502 2023-01-22 21:43:06.346867: step: 1456/527, loss: 0.013000497594475746 2023-01-22 21:43:07.405560: step: 1460/527, loss: 0.005486202891916037 2023-01-22 21:43:08.477744: step: 1464/527, loss: 3.6374767660163343e-06 2023-01-22 21:43:09.539348: step: 1468/527, loss: 0.0021635941229760647 2023-01-22 21:43:10.576917: step: 1472/527, loss: 0.0036381848622113466 2023-01-22 21:43:11.622381: step: 1476/527, loss: 0.0100374361500144 2023-01-22 21:43:12.676558: step: 1480/527, loss: 0.00924387015402317 2023-01-22 21:43:13.720318: step: 1484/527, loss: 0.0057376474142074585 2023-01-22 21:43:14.768736: step: 1488/527, loss: 0.002657095668837428 2023-01-22 21:43:15.818998: step: 1492/527, loss: 0.010364815592765808 2023-01-22 21:43:16.869931: step: 1496/527, loss: 0.003379128174856305 2023-01-22 21:43:17.924467: step: 1500/527, loss: 0.00571554247289896 2023-01-22 21:43:18.973084: step: 1504/527, loss: 0.0032677853014320135 2023-01-22 21:43:20.020606: step: 1508/527, loss: 5.5866941693238914e-05 2023-01-22 21:43:21.070268: step: 1512/527, loss: 0.000800703011918813 2023-01-22 21:43:22.108316: step: 1516/527, loss: 0.0036062246654182673 2023-01-22 21:43:23.161788: step: 1520/527, loss: 0.002597344573587179 2023-01-22 21:43:24.217550: step: 1524/527, loss: 0.00915090087801218 2023-01-22 21:43:25.263665: step: 1528/527, loss: 
0.003382542170584202 2023-01-22 21:43:26.310324: step: 1532/527, loss: 0.00884362030774355 2023-01-22 21:43:27.360574: step: 1536/527, loss: 0.008724585175514221 2023-01-22 21:43:28.412106: step: 1540/527, loss: 0.011519985273480415 2023-01-22 21:43:29.445054: step: 1544/527, loss: 0.0047426363453269005 2023-01-22 21:43:30.487452: step: 1548/527, loss: 0.010204588994383812 2023-01-22 21:43:31.569986: step: 1552/527, loss: 0.013213660567998886 2023-01-22 21:43:32.609001: step: 1556/527, loss: 0.0013982506934553385 2023-01-22 21:43:33.655762: step: 1560/527, loss: 0.0009502901812084019 2023-01-22 21:43:34.715315: step: 1564/527, loss: 0.0009792763739824295 2023-01-22 21:43:35.753861: step: 1568/527, loss: 0.0003333989589009434 2023-01-22 21:43:36.819590: step: 1572/527, loss: 3.688324795803055e-05 2023-01-22 21:43:37.868995: step: 1576/527, loss: 0.006057139951735735 2023-01-22 21:43:38.915361: step: 1580/527, loss: 0.004469339735805988 2023-01-22 21:43:39.966367: step: 1584/527, loss: 0.0034449677914381027 2023-01-22 21:43:41.016477: step: 1588/527, loss: 0.007865716703236103 2023-01-22 21:43:42.070245: step: 1592/527, loss: 0.0003162748762406409 2023-01-22 21:43:43.122484: step: 1596/527, loss: 0.010795745067298412 2023-01-22 21:43:44.170029: step: 1600/527, loss: 0.0028588275890797377 2023-01-22 21:43:45.214243: step: 1604/527, loss: 0.0023156236857175827 2023-01-22 21:43:46.255743: step: 1608/527, loss: 0.009255464188754559 2023-01-22 21:43:47.316891: step: 1612/527, loss: 0.021274641156196594 2023-01-22 21:43:48.355756: step: 1616/527, loss: 0.014407042413949966 2023-01-22 21:43:49.431209: step: 1620/527, loss: 0.014995587058365345 2023-01-22 21:43:50.492819: step: 1624/527, loss: 2.666874206624925e-05 2023-01-22 21:43:51.543916: step: 1628/527, loss: 0.038134440779685974 2023-01-22 21:43:52.606371: step: 1632/527, loss: 0.020040687173604965 2023-01-22 21:43:53.685497: step: 1636/527, loss: 0.005989911966025829 2023-01-22 21:43:54.748935: step: 1640/527, loss: 
0.002983721671625972 2023-01-22 21:43:55.804569: step: 1644/527, loss: 0.015931863337755203 2023-01-22 21:43:56.847475: step: 1648/527, loss: 0.0021516289561986923 2023-01-22 21:43:57.901362: step: 1652/527, loss: 0.02262815274298191 2023-01-22 21:43:58.950240: step: 1656/527, loss: 0.0024789704475551844 2023-01-22 21:44:00.011368: step: 1660/527, loss: 0.017011869698762894 2023-01-22 21:44:01.066463: step: 1664/527, loss: 0.034133926033973694 2023-01-22 21:44:02.109711: step: 1668/527, loss: 0.023153837770223618 2023-01-22 21:44:03.159015: step: 1672/527, loss: 0.011968724429607391 2023-01-22 21:44:04.201991: step: 1676/527, loss: 0.005910203792154789 2023-01-22 21:44:05.268039: step: 1680/527, loss: 0.0025056428276002407 2023-01-22 21:44:06.307053: step: 1684/527, loss: 0.0 2023-01-22 21:44:07.373225: step: 1688/527, loss: 0.003497748402878642 2023-01-22 21:44:08.421349: step: 1692/527, loss: 0.0023310959804803133 2023-01-22 21:44:09.467326: step: 1696/527, loss: 0.003858437528833747 2023-01-22 21:44:10.514251: step: 1700/527, loss: 0.005956585053354502 2023-01-22 21:44:11.567743: step: 1704/527, loss: 0.04720918834209442 2023-01-22 21:44:12.616168: step: 1708/527, loss: 0.005280202720314264 2023-01-22 21:44:13.677138: step: 1712/527, loss: 0.004693740513175726 2023-01-22 21:44:14.720566: step: 1716/527, loss: 0.011267592199146748 2023-01-22 21:44:15.769580: step: 1720/527, loss: 0.0010155562777072191 2023-01-22 21:44:16.819851: step: 1724/527, loss: 0.004556070081889629 2023-01-22 21:44:17.877371: step: 1728/527, loss: 0.008643990382552147 2023-01-22 21:44:18.941747: step: 1732/527, loss: 0.012633326463401318 2023-01-22 21:44:19.997249: step: 1736/527, loss: 0.008252976462244987 2023-01-22 21:44:21.049406: step: 1740/527, loss: 0.0008520284900441766 2023-01-22 21:44:22.096969: step: 1744/527, loss: 0.0012677937047556043 2023-01-22 21:44:23.160485: step: 1748/527, loss: 0.0010938954073935747 2023-01-22 21:44:24.202541: step: 1752/527, loss: 0.011090392246842384 
2023-01-22 21:44:25.249612: step: 1756/527, loss: 0.0 2023-01-22 21:44:26.280105: step: 1760/527, loss: 0.009472419507801533 2023-01-22 21:44:27.316796: step: 1764/527, loss: 0.0035333549603819847 2023-01-22 21:44:28.373824: step: 1768/527, loss: 0.009755841456353664 2023-01-22 21:44:29.422362: step: 1772/527, loss: 0.005978620611131191 2023-01-22 21:44:30.483117: step: 1776/527, loss: 0.004433545283973217 2023-01-22 21:44:31.521969: step: 1780/527, loss: 0.003584163961932063 2023-01-22 21:44:32.589125: step: 1784/527, loss: 0.0029540956020355225 2023-01-22 21:44:33.633214: step: 1788/527, loss: 0.005422653630375862 2023-01-22 21:44:34.683715: step: 1792/527, loss: 0.0014488842571154237 2023-01-22 21:44:35.719934: step: 1796/527, loss: 8.900416105461773e-06 2023-01-22 21:44:36.772743: step: 1800/527, loss: 0.013409025967121124 2023-01-22 21:44:37.810698: step: 1804/527, loss: 0.0002003545523621142 2023-01-22 21:44:38.871365: step: 1808/527, loss: 0.00795035157352686 2023-01-22 21:44:39.928319: step: 1812/527, loss: 0.02246682345867157 2023-01-22 21:44:40.986162: step: 1816/527, loss: 0.010253187268972397 2023-01-22 21:44:42.040071: step: 1820/527, loss: 0.02941504307091236 2023-01-22 21:44:43.081551: step: 1824/527, loss: 1.8626450382086546e-09 2023-01-22 21:44:44.125086: step: 1828/527, loss: 9.63321053859545e-06 2023-01-22 21:44:45.167568: step: 1832/527, loss: 0.0004840172769036144 2023-01-22 21:44:46.216439: step: 1836/527, loss: 0.0017071020556613803 2023-01-22 21:44:47.246594: step: 1840/527, loss: 0.005629593972116709 2023-01-22 21:44:48.294154: step: 1844/527, loss: 0.0036395310889929533 2023-01-22 21:44:49.342263: step: 1848/527, loss: 0.016531143337488174 2023-01-22 21:44:50.379213: step: 1852/527, loss: 0.0003981611516792327 2023-01-22 21:44:51.427014: step: 1856/527, loss: 0.002332820789888501 2023-01-22 21:44:52.495524: step: 1860/527, loss: 0.004095232114195824 2023-01-22 21:44:53.560659: step: 1864/527, loss: 0.00022128420823719352 2023-01-22 
21:44:54.621148: step: 1868/527, loss: 0.0004512519226409495 2023-01-22 21:44:55.687260: step: 1872/527, loss: 0.07282253354787827 2023-01-22 21:44:56.723696: step: 1876/527, loss: 0.006818498019129038 2023-01-22 21:44:57.775232: step: 1880/527, loss: 0.000982586294412613 2023-01-22 21:44:58.826973: step: 1884/527, loss: 0.0008853274630382657 2023-01-22 21:44:59.903236: step: 1888/527, loss: 0.003433793317526579 2023-01-22 21:45:00.959342: step: 1892/527, loss: 0.006716647185385227 2023-01-22 21:45:02.009362: step: 1896/527, loss: 0.0017021086532622576 2023-01-22 21:45:03.051782: step: 1900/527, loss: 0.00382750341668725 2023-01-22 21:45:04.097110: step: 1904/527, loss: 0.0021611934062093496 2023-01-22 21:45:05.146961: step: 1908/527, loss: 0.011224180459976196 2023-01-22 21:45:06.211765: step: 1912/527, loss: 0.008924460969865322 2023-01-22 21:45:07.262420: step: 1916/527, loss: 0.007336306385695934 2023-01-22 21:45:08.318327: step: 1920/527, loss: 0.001474024960771203 2023-01-22 21:45:09.356272: step: 1924/527, loss: 0.007357365917414427 2023-01-22 21:45:10.398604: step: 1928/527, loss: 0.00017077891970984638 2023-01-22 21:45:11.444837: step: 1932/527, loss: 0.006330475211143494 2023-01-22 21:45:12.489239: step: 1936/527, loss: 0.007395185064524412 2023-01-22 21:45:13.556412: step: 1940/527, loss: 0.00021674064919352531 2023-01-22 21:45:14.606777: step: 1944/527, loss: 0.01223550271242857 2023-01-22 21:45:15.659192: step: 1948/527, loss: 0.12603196501731873 2023-01-22 21:45:16.702261: step: 1952/527, loss: 0.0003739767416846007 2023-01-22 21:45:17.768421: step: 1956/527, loss: 0.0018462217412889004 2023-01-22 21:45:18.827279: step: 1960/527, loss: 0.006662336643785238 2023-01-22 21:45:19.874766: step: 1964/527, loss: 0.003399134613573551 2023-01-22 21:45:20.933708: step: 1968/527, loss: 0.02375180274248123 2023-01-22 21:45:21.989491: step: 1972/527, loss: 0.002270856872200966 2023-01-22 21:45:23.039145: step: 1976/527, loss: 0.006422302220016718 2023-01-22 
21:45:24.096459: step: 1980/527, loss: 0.04472963139414787 2023-01-22 21:45:25.151436: step: 1984/527, loss: 0.006703260354697704 2023-01-22 21:45:26.207590: step: 1988/527, loss: 0.00034308084286749363 2023-01-22 21:45:27.258806: step: 1992/527, loss: 0.00012687459820881486 2023-01-22 21:45:28.301034: step: 1996/527, loss: 0.0035361372865736485 2023-01-22 21:45:29.354654: step: 2000/527, loss: 0.006403637584298849 2023-01-22 21:45:30.408968: step: 2004/527, loss: 0.0037350535858422518 2023-01-22 21:45:31.471813: step: 2008/527, loss: 0.0022046868689358234 2023-01-22 21:45:32.531585: step: 2012/527, loss: 0.001493731397204101 2023-01-22 21:45:33.570948: step: 2016/527, loss: 0.0017241832101717591 2023-01-22 21:45:34.610918: step: 2020/527, loss: 0.0029128578025847673 2023-01-22 21:45:35.678807: step: 2024/527, loss: 0.0016437429003417492 2023-01-22 21:45:36.729057: step: 2028/527, loss: 0.005673205945640802 2023-01-22 21:45:37.769060: step: 2032/527, loss: 0.002791368868201971 2023-01-22 21:45:38.818028: step: 2036/527, loss: 0.0009393185609951615 2023-01-22 21:45:39.879745: step: 2040/527, loss: 0.0020621151197701693 2023-01-22 21:45:40.951473: step: 2044/527, loss: 0.007466120179742575 2023-01-22 21:45:42.000046: step: 2048/527, loss: 0.009691721759736538 2023-01-22 21:45:43.078236: step: 2052/527, loss: 0.005073025822639465 2023-01-22 21:45:44.116534: step: 2056/527, loss: 0.0054930225014686584 2023-01-22 21:45:45.160284: step: 2060/527, loss: 0.007369012571871281 2023-01-22 21:45:46.205044: step: 2064/527, loss: 0.0012249633437022567 2023-01-22 21:45:47.268221: step: 2068/527, loss: 0.0011255674762651324 2023-01-22 21:45:48.310154: step: 2072/527, loss: 0.002401367761194706 2023-01-22 21:45:49.359055: step: 2076/527, loss: 0.0006509974482469261 2023-01-22 21:45:50.399724: step: 2080/527, loss: 0.0022482527419924736 2023-01-22 21:45:51.450202: step: 2084/527, loss: 0.0009043293539434671 2023-01-22 21:45:52.501579: step: 2088/527, loss: 0.00888520572334528 
2023-01-22 21:45:53.556537: step: 2092/527, loss: 0.002215207554399967
2023-01-22 21:45:54.601746: step: 2096/527, loss: 0.00013173124170862138
2023-01-22 21:45:55.660768: step: 2100/527, loss: 0.0006686710985377431
2023-01-22 21:45:56.712993: step: 2104/527, loss: 0.0010806603822857141
2023-01-22 21:45:57.762821: step: 2108/527, loss: 0.025548676028847694
==================================================
Loss: 0.006
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3237264362657092, 'r': 0.3421548861480076, 'f1': 0.33268565498154984}, 'combined': 0.24513679840745775, 'stategy': 1, 'epoch': 11}
Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3431424917547756, 'r': 0.3135074583759541, 'f1': 0.32765625103425133}, 'combined': 0.20970000066192082, 'stategy': 1, 'epoch': 11}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3233555926984361, 'r': 0.3613974271335462, 'f1': 0.34131979229279363}, 'combined': 0.2514987943210058, 'stategy': 1, 'epoch': 11}
Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.36036791237789995, 'r': 0.31712376289255195, 'f1': 0.33736570520484255}, 'combined': 0.2159140513310992, 'stategy': 1, 'epoch': 11}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3319773567844166, 'r': 0.33386717095965995, 'f1': 0.3329195820165389}, 'combined': 0.24530916569639705, 'stategy': 1, 'epoch': 11}
Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.36875593136201523, 'r': 0.3013128538335666, 'f1': 0.3316402867932796}, 'combined': 0.23777982826687974, 'stategy': 1, 'epoch': 11}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 11}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 11}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 11}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0}
Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3319773567844166, 'r': 0.33386717095965995, 'f1': 0.3329195820165389}, 'combined': 0.24530916569639705, 'stategy': 1, 'epoch': 11}
Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.36875593136201523, 'r': 0.3013128538335666, 'f1': 0.3316402867932796}, 'combined': 0.23777982826687974, 'stategy': 1, 'epoch': 11}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 11}
******************************
Epoch: 12
command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4
2023-01-22 21:48:34.167392: step: 4/527, loss: 0.007619322277605534
2023-01-22 21:48:35.213354: step: 8/527, loss: 0.005011274013668299
2023-01-22 21:48:36.253853: step: 12/527, loss: 0.010464202612638474
2023-01-22 21:48:37.287467: step: 16/527, loss: 0.005930216051638126
2023-01-22 21:48:38.323995: step: 20/527, loss: 0.0049021546728909016
2023-01-22 21:48:39.355217: step: 24/527, loss: 0.000686881598085165
2023-01-22 21:48:40.391665: step: 28/527, loss: 2.834871156665031e-05
2023-01-22 21:48:41.440865: step: 32/527, loss: 0.0025016057770699263
2023-01-22 21:48:42.477299: step: 36/527, loss: 0.0003890406514983624
2023-01-22 21:48:43.528174: step: 40/527, loss: 0.0021663017105311155
2023-01-22 21:48:44.572027: step: 44/527, loss: 0.0022376582492142916
2023-01-22 21:48:45.623261: step: 48/527, loss: 0.003644646843895316
2023-01-22 21:48:46.669330: step: 52/527, loss: 
0.0018524817423895001 2023-01-22 21:48:47.725845: step: 56/527, loss: 0.003466264344751835 2023-01-22 21:48:48.764728: step: 60/527, loss: 0.00041731770033948123 2023-01-22 21:48:49.806882: step: 64/527, loss: 0.0034938897006213665 2023-01-22 21:48:50.831786: step: 68/527, loss: 0.003241848200559616 2023-01-22 21:48:51.884478: step: 72/527, loss: 0.005263487342745066 2023-01-22 21:48:52.925433: step: 76/527, loss: 0.00880734995007515 2023-01-22 21:48:53.954376: step: 80/527, loss: 0.0012290183221921325 2023-01-22 21:48:54.985417: step: 84/527, loss: 0.005908619612455368 2023-01-22 21:48:56.029811: step: 88/527, loss: 0.0076753199100494385 2023-01-22 21:48:57.087852: step: 92/527, loss: 0.014141412451863289 2023-01-22 21:48:58.144486: step: 96/527, loss: 0.0033978584688156843 2023-01-22 21:48:59.204133: step: 100/527, loss: 0.00019484110816847533 2023-01-22 21:49:00.250254: step: 104/527, loss: 0.003944715950638056 2023-01-22 21:49:01.298671: step: 108/527, loss: 0.0020148225594311953 2023-01-22 21:49:02.354512: step: 112/527, loss: 0.005602751858532429 2023-01-22 21:49:03.401457: step: 116/527, loss: 0.0020726402290165424 2023-01-22 21:49:04.453467: step: 120/527, loss: 0.00225808285176754 2023-01-22 21:49:05.505810: step: 124/527, loss: 0.001250948291271925 2023-01-22 21:49:06.550093: step: 128/527, loss: 0.03231208398938179 2023-01-22 21:49:07.593201: step: 132/527, loss: 0.001055886852554977 2023-01-22 21:49:08.634215: step: 136/527, loss: 0.018326491117477417 2023-01-22 21:49:09.690376: step: 140/527, loss: 0.0023968794848769903 2023-01-22 21:49:10.734635: step: 144/527, loss: 0.0008112549548968673 2023-01-22 21:49:11.784278: step: 148/527, loss: 0.002209191909059882 2023-01-22 21:49:12.819175: step: 152/527, loss: 0.0023432201705873013 2023-01-22 21:49:13.846507: step: 156/527, loss: 0.009897831827402115 2023-01-22 21:49:14.902725: step: 160/527, loss: 0.004624166525900364 2023-01-22 21:49:15.931212: step: 164/527, loss: 0.007766325492411852 2023-01-22 
21:49:16.993027: step: 168/527, loss: 0.005127053242176771 2023-01-22 21:49:18.026843: step: 172/527, loss: 0.008108958601951599 2023-01-22 21:49:19.068335: step: 176/527, loss: 0.0010407913941890001 2023-01-22 21:49:20.111974: step: 180/527, loss: 0.0027981160674244165 2023-01-22 21:49:21.164052: step: 184/527, loss: 0.0007031296263448894 2023-01-22 21:49:22.208196: step: 188/527, loss: 0.0025272162165492773 2023-01-22 21:49:23.259152: step: 192/527, loss: 0.00016711748321540654 2023-01-22 21:49:24.290114: step: 196/527, loss: 1.261873148905579e-05 2023-01-22 21:49:25.344075: step: 200/527, loss: 0.0031699826940894127 2023-01-22 21:49:26.375344: step: 204/527, loss: 0.0019168907310813665 2023-01-22 21:49:27.413353: step: 208/527, loss: 0.001322442665696144 2023-01-22 21:49:28.472768: step: 212/527, loss: 0.014579384587705135 2023-01-22 21:49:29.508755: step: 216/527, loss: 0.0036112028174102306 2023-01-22 21:49:30.559449: step: 220/527, loss: 0.02042965032160282 2023-01-22 21:49:31.623893: step: 224/527, loss: 0.0037181549705564976 2023-01-22 21:49:32.691726: step: 228/527, loss: 0.003545230021700263 2023-01-22 21:49:33.733814: step: 232/527, loss: 0.00016019395843613893 2023-01-22 21:49:34.786312: step: 236/527, loss: 0.005952728446573019 2023-01-22 21:49:35.839821: step: 240/527, loss: 0.0032371790148317814 2023-01-22 21:49:36.890978: step: 244/527, loss: 0.0030368592124432325 2023-01-22 21:49:37.946003: step: 248/527, loss: 0.001996848499402404 2023-01-22 21:49:38.986663: step: 252/527, loss: 0.0027342450339347124 2023-01-22 21:49:40.042978: step: 256/527, loss: 0.0003024438628926873 2023-01-22 21:49:41.103254: step: 260/527, loss: 0.003370189107954502 2023-01-22 21:49:42.156188: step: 264/527, loss: 0.0047250972129404545 2023-01-22 21:49:43.206333: step: 268/527, loss: 0.004215765278786421 2023-01-22 21:49:44.253742: step: 272/527, loss: 0.000495637534186244 2023-01-22 21:49:45.313984: step: 276/527, loss: 0.010135481134057045 2023-01-22 21:49:46.356666: step: 
280/527, loss: 0.0017686393111944199 2023-01-22 21:49:47.408685: step: 284/527, loss: 0.009661720134317875 2023-01-22 21:49:48.464082: step: 288/527, loss: 0.00840539950877428 2023-01-22 21:49:49.546429: step: 292/527, loss: 0.0058349426835775375 2023-01-22 21:49:50.588328: step: 296/527, loss: 0.002013220451772213 2023-01-22 21:49:51.671882: step: 300/527, loss: 0.0010815411806106567 2023-01-22 21:49:52.714046: step: 304/527, loss: 0.0010543304961174726 2023-01-22 21:49:53.755637: step: 308/527, loss: 0.0016437410376966 2023-01-22 21:49:54.802101: step: 312/527, loss: 0.0036136480048298836 2023-01-22 21:49:55.855028: step: 316/527, loss: 0.008142869919538498 2023-01-22 21:49:56.899909: step: 320/527, loss: 0.0016600724775344133 2023-01-22 21:49:57.947826: step: 324/527, loss: 0.0037983183283358812 2023-01-22 21:49:59.001858: step: 328/527, loss: 0.0028913291171193123 2023-01-22 21:50:00.051911: step: 332/527, loss: 0.010967910289764404 2023-01-22 21:50:01.105805: step: 336/527, loss: 0.018941832706332207 2023-01-22 21:50:02.164109: step: 340/527, loss: 0.0037054328713566065 2023-01-22 21:50:03.222305: step: 344/527, loss: 0.00010407343506813049 2023-01-22 21:50:04.273346: step: 348/527, loss: 0.0033731076400727034 2023-01-22 21:50:05.317872: step: 352/527, loss: 0.003104487666860223 2023-01-22 21:50:06.367546: step: 356/527, loss: 0.002900853054597974 2023-01-22 21:50:07.421352: step: 360/527, loss: 0.0010574869811534882 2023-01-22 21:50:08.482045: step: 364/527, loss: 0.00032582986750639975 2023-01-22 21:50:09.533165: step: 368/527, loss: 0.0008039302774704993 2023-01-22 21:50:10.590878: step: 372/527, loss: 0.0005461260443553329 2023-01-22 21:50:11.648213: step: 376/527, loss: 0.002027584007009864 2023-01-22 21:50:12.719498: step: 380/527, loss: 0.0006290959427133203 2023-01-22 21:50:13.771551: step: 384/527, loss: 0.005801249761134386 2023-01-22 21:50:14.839208: step: 388/527, loss: 0.005603089462965727 2023-01-22 21:50:15.891905: step: 392/527, loss: 
0.008031330071389675 2023-01-22 21:50:16.946612: step: 396/527, loss: 0.003185323206707835 2023-01-22 21:50:17.997369: step: 400/527, loss: 0.0005803056410513818 2023-01-22 21:50:19.049943: step: 404/527, loss: 0.009660094976425171 2023-01-22 21:50:20.108355: step: 408/527, loss: 6.6267039073864e-06 2023-01-22 21:50:21.160797: step: 412/527, loss: 0.0030033146031200886 2023-01-22 21:50:22.214344: step: 416/527, loss: 9.05255728866905e-05 2023-01-22 21:50:23.278305: step: 420/527, loss: 0.01953630894422531 2023-01-22 21:50:24.334065: step: 424/527, loss: 0.0003336553636472672 2023-01-22 21:50:25.396152: step: 428/527, loss: 0.0024180535692721605 2023-01-22 21:50:26.469739: step: 432/527, loss: 0.004492941312491894 2023-01-22 21:50:27.525932: step: 436/527, loss: 0.0039434912614524364 2023-01-22 21:50:28.569027: step: 440/527, loss: 0.00026417331537231803 2023-01-22 21:50:29.623149: step: 444/527, loss: 0.001785318716429174 2023-01-22 21:50:30.683850: step: 448/527, loss: 0.0029834997840225697 2023-01-22 21:50:31.729250: step: 452/527, loss: 0.005437122192233801 2023-01-22 21:50:32.781806: step: 456/527, loss: 0.003403643611818552 2023-01-22 21:50:33.835174: step: 460/527, loss: 0.003753100521862507 2023-01-22 21:50:34.875016: step: 464/527, loss: 0.010964242741465569 2023-01-22 21:50:35.934206: step: 468/527, loss: 0.011434354819357395 2023-01-22 21:50:36.988949: step: 472/527, loss: 9.458821295993403e-05 2023-01-22 21:50:38.051430: step: 476/527, loss: 0.002572170225903392 2023-01-22 21:50:39.105804: step: 480/527, loss: 0.00017540385306347162 2023-01-22 21:50:40.176758: step: 484/527, loss: 0.005281957797706127 2023-01-22 21:50:41.221469: step: 488/527, loss: 0.004321059677749872 2023-01-22 21:50:42.280472: step: 492/527, loss: 0.002527425065636635 2023-01-22 21:50:43.321568: step: 496/527, loss: 0.010908177122473717 2023-01-22 21:50:44.380140: step: 500/527, loss: 0.0002724303340073675 2023-01-22 21:50:45.437221: step: 504/527, loss: 0.0001880708005046472 
2023-01-22 21:50:46.489543: step: 508/527, loss: 0.0011116022942587733 2023-01-22 21:50:47.573288: step: 512/527, loss: 0.034606996923685074 2023-01-22 21:50:48.638863: step: 516/527, loss: 0.005533210933208466 2023-01-22 21:50:49.713713: step: 520/527, loss: 0.0018650231650099158 2023-01-22 21:50:50.772157: step: 524/527, loss: 0.003876227419823408 2023-01-22 21:50:51.835979: step: 528/527, loss: 0.002837592037394643 2023-01-22 21:50:52.880906: step: 532/527, loss: 0.008782900869846344 2023-01-22 21:50:53.944974: step: 536/527, loss: 0.004077088087797165 2023-01-22 21:50:55.021328: step: 540/527, loss: 0.002331398893147707 2023-01-22 21:50:56.072359: step: 544/527, loss: 0.014300593174993992 2023-01-22 21:50:57.128075: step: 548/527, loss: 0.0003353523788973689 2023-01-22 21:50:58.176339: step: 552/527, loss: 0.010083137080073357 2023-01-22 21:50:59.241644: step: 556/527, loss: 0.01390083134174347 2023-01-22 21:51:00.315252: step: 560/527, loss: 0.0004276926629245281 2023-01-22 21:51:01.370550: step: 564/527, loss: 0.007978932932019234 2023-01-22 21:51:02.431813: step: 568/527, loss: 0.061783596873283386 2023-01-22 21:51:03.492626: step: 572/527, loss: 0.0029477113857865334 2023-01-22 21:51:04.564386: step: 576/527, loss: 0.005164923146367073 2023-01-22 21:51:05.625660: step: 580/527, loss: 0.006764416582882404 2023-01-22 21:51:06.672021: step: 584/527, loss: 0.0011231843382120132 2023-01-22 21:51:07.719515: step: 588/527, loss: 0.0010017876047641039 2023-01-22 21:51:08.769887: step: 592/527, loss: 0.004956105258315802 2023-01-22 21:51:09.821216: step: 596/527, loss: 0.0014087819727137685 2023-01-22 21:51:10.887639: step: 600/527, loss: 0.010600045323371887 2023-01-22 21:51:11.940627: step: 604/527, loss: 0.00046931247925385833 2023-01-22 21:51:13.006145: step: 608/527, loss: 0.006964202504605055 2023-01-22 21:51:14.043819: step: 612/527, loss: 0.0026729705277830362 2023-01-22 21:51:15.089392: step: 616/527, loss: 0.0 2023-01-22 21:51:16.152475: step: 620/527, 
loss: 0.0014651769306510687 2023-01-22 21:51:17.216646: step: 624/527, loss: 0.00011985751916654408 2023-01-22 21:51:18.270904: step: 628/527, loss: 8.332579454872757e-05 2023-01-22 21:51:19.333650: step: 632/527, loss: 0.002799414563924074 2023-01-22 21:51:20.380892: step: 636/527, loss: 0.0007252858486026525 2023-01-22 21:51:21.420167: step: 640/527, loss: 0.0027340000960975885 2023-01-22 21:51:22.481211: step: 644/527, loss: 0.0022030228283256292 2023-01-22 21:51:23.567171: step: 648/527, loss: 0.002237871289253235 2023-01-22 21:51:24.632046: step: 652/527, loss: 0.0017605915199965239 2023-01-22 21:51:25.678963: step: 656/527, loss: 0.0027358876541256905 2023-01-22 21:51:26.736824: step: 660/527, loss: 0.004454437177628279 2023-01-22 21:51:27.789822: step: 664/527, loss: 0.002043404383584857 2023-01-22 21:51:28.843502: step: 668/527, loss: 0.00566945131868124 2023-01-22 21:51:29.887380: step: 672/527, loss: 0.009340194053947926 2023-01-22 21:51:30.949000: step: 676/527, loss: 0.008716157637536526 2023-01-22 21:51:32.003539: step: 680/527, loss: 0.021269943565130234 2023-01-22 21:51:33.067023: step: 684/527, loss: 0.0001597239461261779 2023-01-22 21:51:34.141485: step: 688/527, loss: 0.008474440313875675 2023-01-22 21:51:35.191998: step: 692/527, loss: 0.005262911319732666 2023-01-22 21:51:36.247483: step: 696/527, loss: 0.003303262172266841 2023-01-22 21:51:37.305877: step: 700/527, loss: 0.011588364839553833 2023-01-22 21:51:38.363019: step: 704/527, loss: 0.001971333986148238 2023-01-22 21:51:39.425292: step: 708/527, loss: 0.000131136694108136 2023-01-22 21:51:40.469097: step: 712/527, loss: 0.013067991472780704 2023-01-22 21:51:41.526129: step: 716/527, loss: 0.010408147238194942 2023-01-22 21:51:42.598905: step: 720/527, loss: 0.005840797442942858 2023-01-22 21:51:43.657420: step: 724/527, loss: 0.0010175163624808192 2023-01-22 21:51:44.706197: step: 728/527, loss: 0.003741353750228882 2023-01-22 21:51:45.757452: step: 732/527, loss: 0.0028500924818217754 
2023-01-22 21:51:46.824435: step: 736/527, loss: 0.013644230552017689 2023-01-22 21:51:47.886907: step: 740/527, loss: 0.004073255229741335 2023-01-22 21:51:48.945301: step: 744/527, loss: 0.002723877551034093 2023-01-22 21:51:50.005684: step: 748/527, loss: 0.005280986428260803 2023-01-22 21:51:51.072236: step: 752/527, loss: 0.008070048876106739 2023-01-22 21:51:52.128488: step: 756/527, loss: 0.0037095113657414913 2023-01-22 21:51:53.176271: step: 760/527, loss: 0.0026983425486832857 2023-01-22 21:51:54.212248: step: 764/527, loss: 0.0002139671560144052 2023-01-22 21:51:55.254122: step: 768/527, loss: 6.828071946074488e-06 2023-01-22 21:51:56.324373: step: 772/527, loss: 0.013572442345321178 2023-01-22 21:51:57.380454: step: 776/527, loss: 0.003095896914601326 2023-01-22 21:51:58.437049: step: 780/527, loss: 0.010774532333016396 2023-01-22 21:51:59.513079: step: 784/527, loss: 0.008064789697527885 2023-01-22 21:52:00.562357: step: 788/527, loss: 0.006426256150007248 2023-01-22 21:52:01.601984: step: 792/527, loss: 5.289889770665468e-08 2023-01-22 21:52:02.665775: step: 796/527, loss: 0.0007420787587761879 2023-01-22 21:52:03.727010: step: 800/527, loss: 0.006745407823473215 2023-01-22 21:52:04.787915: step: 804/527, loss: 0.001343736657872796 2023-01-22 21:52:05.836776: step: 808/527, loss: 0.00011648951476672664 2023-01-22 21:52:06.911917: step: 812/527, loss: 8.425141277257353e-05 2023-01-22 21:52:07.970223: step: 816/527, loss: 0.011028146371245384 2023-01-22 21:52:09.024327: step: 820/527, loss: 0.005218501202762127 2023-01-22 21:52:10.073283: step: 824/527, loss: 0.014840721152722836 2023-01-22 21:52:11.134507: step: 828/527, loss: 7.502801963710226e-06 2023-01-22 21:52:12.207216: step: 832/527, loss: 0.0010546231642365456 2023-01-22 21:52:13.261436: step: 836/527, loss: 0.000891447183676064 2023-01-22 21:52:14.332016: step: 840/527, loss: 0.01074120495468378 2023-01-22 21:52:15.373138: step: 844/527, loss: 0.004925885703414679 2023-01-22 21:52:16.419159: 
step: 848/527, loss: 0.024480372667312622 2023-01-22 21:52:17.462175: step: 852/527, loss: 0.010799067094922066 2023-01-22 21:52:18.515296: step: 856/527, loss: 0.0032312916591763496 2023-01-22 21:52:19.559186: step: 860/527, loss: 0.0007931143627502024 2023-01-22 21:52:20.592770: step: 864/527, loss: 0.011013202369213104 2023-01-22 21:52:21.633707: step: 868/527, loss: 0.005142250098288059 2023-01-22 21:52:22.707648: step: 872/527, loss: 0.005670530721545219 2023-01-22 21:52:23.758292: step: 876/527, loss: 0.004671367816627026 2023-01-22 21:52:24.822353: step: 880/527, loss: 0.000959213706664741 2023-01-22 21:52:25.891273: step: 884/527, loss: 0.006375072058290243 2023-01-22 21:52:26.947925: step: 888/527, loss: 0.03924502432346344 2023-01-22 21:52:28.028221: step: 892/527, loss: 0.005271376576274633 2023-01-22 21:52:29.073911: step: 896/527, loss: 0.0026893827598541975 2023-01-22 21:52:30.124319: step: 900/527, loss: 0.026787694543600082 2023-01-22 21:52:31.157868: step: 904/527, loss: 0.001120428554713726 2023-01-22 21:52:32.214076: step: 908/527, loss: 0.0062246788293123245 2023-01-22 21:52:33.260934: step: 912/527, loss: 0.005336942616850138 2023-01-22 21:52:34.316898: step: 916/527, loss: 0.0066856094636023045 2023-01-22 21:52:35.366268: step: 920/527, loss: 0.008613454177975655 2023-01-22 21:52:36.414553: step: 924/527, loss: 0.007455340586602688 2023-01-22 21:52:37.484825: step: 928/527, loss: 0.009863438084721565 2023-01-22 21:52:38.530149: step: 932/527, loss: 0.00613413518294692 2023-01-22 21:52:39.583642: step: 936/527, loss: 0.001392399426549673 2023-01-22 21:52:40.632098: step: 940/527, loss: 0.026021713390946388 2023-01-22 21:52:41.693120: step: 944/527, loss: 0.01770760491490364 2023-01-22 21:52:42.748559: step: 948/527, loss: 0.001510394155047834 2023-01-22 21:52:43.796818: step: 952/527, loss: 0.0011424239492043853 2023-01-22 21:52:44.844489: step: 956/527, loss: 0.0014410935109481215 2023-01-22 21:52:45.883066: step: 960/527, loss: 
0.008240103721618652 2023-01-22 21:52:46.940481: step: 964/527, loss: 0.00755185866728425 2023-01-22 21:52:47.991361: step: 968/527, loss: 0.017570147290825844 2023-01-22 21:52:49.055290: step: 972/527, loss: 0.0007693552761338651 2023-01-22 21:52:50.109832: step: 976/527, loss: 0.01756575144827366 2023-01-22 21:52:51.153335: step: 980/527, loss: 0.0020110842306166887 2023-01-22 21:52:52.217355: step: 984/527, loss: 0.03568197041749954 2023-01-22 21:52:53.270789: step: 988/527, loss: 0.003300323849543929 2023-01-22 21:52:54.319707: step: 992/527, loss: 0.0 2023-01-22 21:52:55.384230: step: 996/527, loss: 0.00432737497612834 2023-01-22 21:52:56.432228: step: 1000/527, loss: 0.008678669109940529 2023-01-22 21:52:57.507305: step: 1004/527, loss: 0.0007442276692017913 2023-01-22 21:52:58.544868: step: 1008/527, loss: 0.003179546445608139 2023-01-22 21:52:59.602244: step: 1012/527, loss: 0.013082008808851242 2023-01-22 21:53:00.648078: step: 1016/527, loss: 0.0035641989670693874 2023-01-22 21:53:01.702552: step: 1020/527, loss: 0.003280684817582369 2023-01-22 21:53:02.752226: step: 1024/527, loss: 0.005581994540989399 2023-01-22 21:53:03.814114: step: 1028/527, loss: 0.0002273622085340321 2023-01-22 21:53:04.871380: step: 1032/527, loss: 0.005484454333782196 2023-01-22 21:53:05.913522: step: 1036/527, loss: 0.0009807702153921127 2023-01-22 21:53:06.974701: step: 1040/527, loss: 0.0002478167007211596 2023-01-22 21:53:08.018373: step: 1044/527, loss: 0.00014708031085319817 2023-01-22 21:53:09.069551: step: 1048/527, loss: 0.003240023972466588 2023-01-22 21:53:10.127447: step: 1052/527, loss: 0.0016205157153308392 2023-01-22 21:53:11.183664: step: 1056/527, loss: 0.011959646828472614 2023-01-22 21:53:12.229039: step: 1060/527, loss: 0.0013968355488032103 2023-01-22 21:53:13.275109: step: 1064/527, loss: 0.004498990252614021 2023-01-22 21:53:14.318527: step: 1068/527, loss: 0.0038817673921585083 2023-01-22 21:53:15.386106: step: 1072/527, loss: 0.002550655510276556 
2023-01-22 21:53:16.436466: step: 1076/527, loss: 0.007548233028501272 2023-01-22 21:53:17.482227: step: 1080/527, loss: 0.007035511080175638 2023-01-22 21:53:18.532784: step: 1084/527, loss: 0.0053145186975598335 2023-01-22 21:53:19.582029: step: 1088/527, loss: 0.010060529224574566 2023-01-22 21:53:20.630297: step: 1092/527, loss: 0.02936890535056591 2023-01-22 21:53:21.665227: step: 1096/527, loss: 0.008506453596055508 2023-01-22 21:53:22.718174: step: 1100/527, loss: 4.393312337924726e-05 2023-01-22 21:53:23.778058: step: 1104/527, loss: 0.0011915852082893252 2023-01-22 21:53:24.834617: step: 1108/527, loss: 0.004857954103499651 2023-01-22 21:53:25.886140: step: 1112/527, loss: 0.021052666008472443 2023-01-22 21:53:26.927354: step: 1116/527, loss: 0.001208293717354536 2023-01-22 21:53:27.984236: step: 1120/527, loss: 0.005073441658169031 2023-01-22 21:53:29.029207: step: 1124/527, loss: 0.0018664358649402857 2023-01-22 21:53:30.088222: step: 1128/527, loss: 0.008032937534153461 2023-01-22 21:53:31.141803: step: 1132/527, loss: 0.0012455241521820426 2023-01-22 21:53:32.192012: step: 1136/527, loss: 0.009927546605467796 2023-01-22 21:53:33.236666: step: 1140/527, loss: 0.005889351014047861 2023-01-22 21:53:34.285815: step: 1144/527, loss: 0.0036864392459392548 2023-01-22 21:53:35.332162: step: 1148/527, loss: 0.00022810317750554532 2023-01-22 21:53:36.389104: step: 1152/527, loss: 0.004580994602292776 2023-01-22 21:53:37.443899: step: 1156/527, loss: 0.00515025295317173 2023-01-22 21:53:38.491216: step: 1160/527, loss: 0.0056029437109827995 2023-01-22 21:53:39.554064: step: 1164/527, loss: 0.001773229567334056 2023-01-22 21:53:40.621896: step: 1168/527, loss: 0.005558577831834555 2023-01-22 21:53:41.685172: step: 1172/527, loss: 0.001237261458300054 2023-01-22 21:53:42.728925: step: 1176/527, loss: 0.010743043385446072 2023-01-22 21:53:43.782423: step: 1180/527, loss: 0.010249967686831951 2023-01-22 21:53:44.829834: step: 1184/527, loss: 0.010876132175326347 
2023-01-22 21:53:45.871334: step: 1188/527, loss: 0.0008464209968224168 2023-01-22 21:53:46.932750: step: 1192/527, loss: 0.002634631237015128 2023-01-22 21:53:47.977308: step: 1196/527, loss: 0.013846343383193016 2023-01-22 21:53:49.033092: step: 1200/527, loss: 0.002021439140662551 2023-01-22 21:53:50.103733: step: 1204/527, loss: 0.009759816341102123 2023-01-22 21:53:51.145832: step: 1208/527, loss: 0.0010770554654300213 2023-01-22 21:53:52.206094: step: 1212/527, loss: 0.00042460692930035293 2023-01-22 21:53:53.269396: step: 1216/527, loss: 0.0009167081443592906 2023-01-22 21:53:54.319667: step: 1220/527, loss: 0.0001087384152924642 2023-01-22 21:53:55.370738: step: 1224/527, loss: 0.004979348741471767 2023-01-22 21:53:56.398349: step: 1228/527, loss: 3.911551971214067e-08 2023-01-22 21:53:57.452224: step: 1232/527, loss: 0.0026819936465471983 2023-01-22 21:53:58.515170: step: 1236/527, loss: 0.03157117962837219 2023-01-22 21:53:59.564567: step: 1240/527, loss: 0.0099434033036232 2023-01-22 21:54:00.600600: step: 1244/527, loss: 0.010909296572208405 2023-01-22 21:54:01.673314: step: 1248/527, loss: 0.0003375186352059245 2023-01-22 21:54:02.737898: step: 1252/527, loss: 0.009656017646193504 2023-01-22 21:54:03.779972: step: 1256/527, loss: 0.007953505031764507 2023-01-22 21:54:04.833308: step: 1260/527, loss: 0.000923018204048276 2023-01-22 21:54:05.876407: step: 1264/527, loss: 0.0007859493489377201 2023-01-22 21:54:06.941652: step: 1268/527, loss: 0.00659319618716836 2023-01-22 21:54:08.000337: step: 1272/527, loss: 0.005343630909919739 2023-01-22 21:54:09.058624: step: 1276/527, loss: 0.0026983299758285284 2023-01-22 21:54:10.115248: step: 1280/527, loss: 0.004379732999950647 2023-01-22 21:54:11.171532: step: 1284/527, loss: 0.005964973941445351 2023-01-22 21:54:12.230574: step: 1288/527, loss: 0.016330217942595482 2023-01-22 21:54:13.281367: step: 1292/527, loss: 0.014876273460686207 2023-01-22 21:54:14.334859: step: 1296/527, loss: 0.0007380720926448703 
2023-01-22 21:54:15.397223: step: 1300/527, loss: 0.001835113624110818 2023-01-22 21:54:16.468669: step: 1304/527, loss: 0.006024368107318878 2023-01-22 21:54:17.518889: step: 1308/527, loss: 0.006383189000189304 2023-01-22 21:54:18.562882: step: 1312/527, loss: 0.0031941768247634172 2023-01-22 21:54:19.608627: step: 1316/527, loss: 0.0003681556845549494 2023-01-22 21:54:20.653666: step: 1320/527, loss: 0.005898060742765665 2023-01-22 21:54:21.703613: step: 1324/527, loss: 0.0002862276742234826 2023-01-22 21:54:22.765990: step: 1328/527, loss: 0.0005839465302415192 2023-01-22 21:54:23.811595: step: 1332/527, loss: 0.005359964445233345 2023-01-22 21:54:24.857686: step: 1336/527, loss: 0.0004397186858113855 2023-01-22 21:54:25.897867: step: 1340/527, loss: 0.02376655861735344 2023-01-22 21:54:26.941648: step: 1344/527, loss: 0.004204253200441599 2023-01-22 21:54:27.989679: step: 1348/527, loss: 0.003517820965498686 2023-01-22 21:54:29.037917: step: 1352/527, loss: 0.003728880314156413 2023-01-22 21:54:30.080386: step: 1356/527, loss: 0.002548638265579939 2023-01-22 21:54:31.117176: step: 1360/527, loss: 0.0005844628321938217 2023-01-22 21:54:32.171743: step: 1364/527, loss: 0.0010275651002302766 2023-01-22 21:54:33.231620: step: 1368/527, loss: 0.004601018503308296 2023-01-22 21:54:34.290174: step: 1372/527, loss: 0.01771286316215992 2023-01-22 21:54:35.321933: step: 1376/527, loss: 6.794201681259437e-07 2023-01-22 21:54:36.375898: step: 1380/527, loss: 0.004671482834964991 2023-01-22 21:54:37.430116: step: 1384/527, loss: 0.00662675965577364 2023-01-22 21:54:38.500619: step: 1388/527, loss: 0.006327507551759481 2023-01-22 21:54:39.543487: step: 1392/527, loss: 0.006102381274104118 2023-01-22 21:54:40.597161: step: 1396/527, loss: 0.006286056712269783 2023-01-22 21:54:41.661034: step: 1400/527, loss: 0.0021835649386048317 2023-01-22 21:54:42.705130: step: 1404/527, loss: 0.011712766252458096 2023-01-22 21:54:43.748398: step: 1408/527, loss: 0.0 2023-01-22 
21:54:44.815542: step: 1412/527, loss: 0.006227034144103527 2023-01-22 21:54:45.858966: step: 1416/527, loss: 0.004639971069991589 2023-01-22 21:54:46.915125: step: 1420/527, loss: 0.0032423525117337704 2023-01-22 21:54:47.977527: step: 1424/527, loss: 6.131846475909697e-06 2023-01-22 21:54:49.004377: step: 1428/527, loss: 0.003998455125838518 2023-01-22 21:54:50.057363: step: 1432/527, loss: 0.009648646228015423 2023-01-22 21:54:51.120176: step: 1436/527, loss: 0.004501263611018658 2023-01-22 21:54:52.165942: step: 1440/527, loss: 0.00038856532773934305 2023-01-22 21:54:53.217979: step: 1444/527, loss: 0.0039667836390435696 2023-01-22 21:54:54.260644: step: 1448/527, loss: 0.0003916067071259022 2023-01-22 21:54:55.307416: step: 1452/527, loss: 0.0006529375095851719 2023-01-22 21:54:56.359463: step: 1456/527, loss: 0.0014505956787616014 2023-01-22 21:54:57.404121: step: 1460/527, loss: 0.0055475723929703236 2023-01-22 21:54:58.464707: step: 1464/527, loss: 0.018042156472802162 2023-01-22 21:54:59.520842: step: 1468/527, loss: 0.004778545815497637 2023-01-22 21:55:00.568811: step: 1472/527, loss: 0.0015699844807386398 2023-01-22 21:55:01.624567: step: 1476/527, loss: 0.005329108331352472 2023-01-22 21:55:02.680747: step: 1480/527, loss: 0.003789538284763694 2023-01-22 21:55:03.731028: step: 1484/527, loss: 0.00029249233193695545 2023-01-22 21:55:04.795754: step: 1488/527, loss: 0.013323414139449596 2023-01-22 21:55:05.830747: step: 1492/527, loss: 0.00016153963224496692 2023-01-22 21:55:06.887793: step: 1496/527, loss: 0.005620854906737804 2023-01-22 21:55:07.941428: step: 1500/527, loss: 0.004473687149584293 2023-01-22 21:55:08.985309: step: 1504/527, loss: 0.012032121419906616 2023-01-22 21:55:10.038259: step: 1508/527, loss: 0.00602019764482975 2023-01-22 21:55:11.091338: step: 1512/527, loss: 0.000493289902806282 2023-01-22 21:55:12.144807: step: 1516/527, loss: 0.025164194405078888 2023-01-22 21:55:13.181002: step: 1520/527, loss: 0.0006302927504293621 
2023-01-22 21:55:14.231540: step: 1524/527, loss: 0.0003442306478973478 2023-01-22 21:55:15.281066: step: 1528/527, loss: 0.000919578829780221 2023-01-22 21:55:16.334742: step: 1532/527, loss: 0.002080704551190138 2023-01-22 21:55:17.380586: step: 1536/527, loss: 0.00034298747777938843 2023-01-22 21:55:18.434761: step: 1540/527, loss: 0.0028959952760487795 2023-01-22 21:55:19.487473: step: 1544/527, loss: 0.0007091228035278618 2023-01-22 21:55:20.537992: step: 1548/527, loss: 0.003251128364354372 2023-01-22 21:55:21.575577: step: 1552/527, loss: 0.0013161341194063425 2023-01-22 21:55:22.634742: step: 1556/527, loss: 0.0005920772673562169 2023-01-22 21:55:23.677394: step: 1560/527, loss: 0.00036366027779877186 2023-01-22 21:55:24.720013: step: 1564/527, loss: 0.0036223935894668102 2023-01-22 21:55:25.764416: step: 1568/527, loss: 0.006960890721529722 2023-01-22 21:55:26.836155: step: 1572/527, loss: 0.01042530033737421 2023-01-22 21:55:27.883567: step: 1576/527, loss: 0.0 2023-01-22 21:55:28.929301: step: 1580/527, loss: 0.0010661283740773797 2023-01-22 21:55:29.972739: step: 1584/527, loss: 0.03677280247211456 2023-01-22 21:55:31.028513: step: 1588/527, loss: 0.001318818423897028 2023-01-22 21:55:32.071473: step: 1592/527, loss: 0.00014960371481720358 2023-01-22 21:55:33.140852: step: 1596/527, loss: 0.009377187117934227 2023-01-22 21:55:34.199263: step: 1600/527, loss: 0.004860251676291227 2023-01-22 21:55:35.238121: step: 1604/527, loss: 8.306769450427964e-05 2023-01-22 21:55:36.280547: step: 1608/527, loss: 0.002958378754556179 2023-01-22 21:55:37.331989: step: 1612/527, loss: 0.010559827089309692 2023-01-22 21:55:38.372257: step: 1616/527, loss: 0.0012714501935988665 2023-01-22 21:55:39.411914: step: 1620/527, loss: 0.0003085932694375515 2023-01-22 21:55:40.468557: step: 1624/527, loss: 0.0036669885739684105 2023-01-22 21:55:41.526446: step: 1628/527, loss: 0.0044968402944505215 2023-01-22 21:55:42.568377: step: 1632/527, loss: 0.004391274880617857 2023-01-22 
21:55:43.605863: step: 1636/527, loss: 0.0003657602646853775 2023-01-22 21:55:44.658506: step: 1640/527, loss: 0.0005517423851415515 2023-01-22 21:55:45.697998: step: 1644/527, loss: 0.008376365527510643 2023-01-22 21:55:46.743137: step: 1648/527, loss: 0.000498554261866957 2023-01-22 21:55:47.789516: step: 1652/527, loss: 0.0016641139518469572 2023-01-22 21:55:48.848021: step: 1656/527, loss: 0.0016360287554562092 2023-01-22 21:55:49.948418: step: 1660/527, loss: 2.3104164938558824e-05 2023-01-22 21:55:50.998815: step: 1664/527, loss: 3.766153326978383e-07 2023-01-22 21:55:52.045776: step: 1668/527, loss: 0.000569180294405669 2023-01-22 21:55:53.103034: step: 1672/527, loss: 0.006972864270210266 2023-01-22 21:55:54.170521: step: 1676/527, loss: 0.0032602990977466106 2023-01-22 21:55:55.210010: step: 1680/527, loss: 0.015513749793171883 2023-01-22 21:55:56.270970: step: 1684/527, loss: 0.00356738967821002 2023-01-22 21:55:57.315556: step: 1688/527, loss: 0.004443651530891657 2023-01-22 21:55:58.370067: step: 1692/527, loss: 0.0014096458908170462 2023-01-22 21:55:59.433330: step: 1696/527, loss: 0.01340903714299202 2023-01-22 21:56:00.503310: step: 1700/527, loss: 0.005242456216365099 2023-01-22 21:56:01.550922: step: 1704/527, loss: 0.0007230520131997764 2023-01-22 21:56:02.597432: step: 1708/527, loss: 0.024347959086298943 2023-01-22 21:56:03.651406: step: 1712/527, loss: 0.002077270532026887 2023-01-22 21:56:04.702670: step: 1716/527, loss: 0.0036248930264264345 2023-01-22 21:56:05.752058: step: 1720/527, loss: 0.004900793079286814 2023-01-22 21:56:06.800112: step: 1724/527, loss: 0.0042923809960484505 2023-01-22 21:56:07.840414: step: 1728/527, loss: 0.0019345434848219156 2023-01-22 21:56:08.892028: step: 1732/527, loss: 0.0027310873847454786 2023-01-22 21:56:09.942525: step: 1736/527, loss: 0.005460775922983885 2023-01-22 21:56:11.000299: step: 1740/527, loss: 0.01553793903440237 2023-01-22 21:56:12.049944: step: 1744/527, loss: 0.000788163160905242 2023-01-22 
21:56:13.089488: step: 1748/527, loss: 0.0005869740853086114 2023-01-22 21:56:14.140940: step: 1752/527, loss: 0.0021611275151371956 2023-01-22 21:56:15.200832: step: 1756/527, loss: 0.02036934532225132 2023-01-22 21:56:16.242459: step: 1760/527, loss: 0.0011394410394132137 2023-01-22 21:56:17.302345: step: 1764/527, loss: 0.0004443641228135675 2023-01-22 21:56:18.381659: step: 1768/527, loss: 0.0020099368412047625 2023-01-22 21:56:19.462359: step: 1772/527, loss: 0.006575481500476599 2023-01-22 21:56:20.526532: step: 1776/527, loss: 0.001949939876794815 2023-01-22 21:56:21.563994: step: 1780/527, loss: 0.0010158447548747063 2023-01-22 21:56:22.606315: step: 1784/527, loss: 0.001385521492920816 2023-01-22 21:56:23.656120: step: 1788/527, loss: 0.0015559963649138808 2023-01-22 21:56:24.701897: step: 1792/527, loss: 0.00010657988605089486 2023-01-22 21:56:25.754156: step: 1796/527, loss: 0.017425308004021645 2023-01-22 21:56:26.804979: step: 1800/527, loss: 0.012123212218284607 2023-01-22 21:56:27.861157: step: 1804/527, loss: 0.00530658895149827 2023-01-22 21:56:28.919075: step: 1808/527, loss: 0.001094582723453641 2023-01-22 21:56:29.956999: step: 1812/527, loss: 0.00027986016357317567 2023-01-22 21:56:31.006154: step: 1816/527, loss: 0.0004496570909395814 2023-01-22 21:56:32.047867: step: 1820/527, loss: 0.0002132516383426264 2023-01-22 21:56:33.109384: step: 1824/527, loss: 0.0047562518157064915 2023-01-22 21:56:34.159476: step: 1828/527, loss: 0.00100448087323457 2023-01-22 21:56:35.229422: step: 1832/527, loss: 0.005904084537178278 2023-01-22 21:56:36.283615: step: 1836/527, loss: 0.008406324312090874 2023-01-22 21:56:37.328962: step: 1840/527, loss: 0.00170095672365278 2023-01-22 21:56:38.382514: step: 1844/527, loss: 0.018389154225587845 2023-01-22 21:56:39.442229: step: 1848/527, loss: 0.007688975892961025 2023-01-22 21:56:40.508459: step: 1852/527, loss: 0.0021261251531541348 2023-01-22 21:56:41.559201: step: 1856/527, loss: 0.0011606162879616022 2023-01-22 
21:56:42.600957: step: 1860/527, loss: 0.0007913933368399739 2023-01-22 21:56:43.662998: step: 1864/527, loss: 0.005645437631756067 2023-01-22 21:56:44.724856: step: 1868/527, loss: 0.004314650781452656 2023-01-22 21:56:45.780849: step: 1872/527, loss: 0.0006230800063349307 2023-01-22 21:56:46.837369: step: 1876/527, loss: 0.003027730155736208 2023-01-22 21:56:47.886512: step: 1880/527, loss: 0.0027935155667364597 2023-01-22 21:56:48.946279: step: 1884/527, loss: 0.0033874711953103542 2023-01-22 21:56:49.992030: step: 1888/527, loss: 0.0004291182558517903 2023-01-22 21:56:51.030244: step: 1892/527, loss: 0.005484543740749359 2023-01-22 21:56:52.063951: step: 1896/527, loss: 0.0005467801238410175 2023-01-22 21:56:53.145936: step: 1900/527, loss: 0.0017408350249752402 2023-01-22 21:56:54.195349: step: 1904/527, loss: 0.00661687646061182 2023-01-22 21:56:55.269000: step: 1908/527, loss: 0.004403543658554554 2023-01-22 21:56:56.322881: step: 1912/527, loss: 0.004321379121392965 2023-01-22 21:56:57.366819: step: 1916/527, loss: 0.005980567075312138 2023-01-22 21:56:58.401325: step: 1920/527, loss: 0.00015836532111279666 2023-01-22 21:56:59.451656: step: 1924/527, loss: 0.0017614540411159396 2023-01-22 21:57:00.521025: step: 1928/527, loss: 0.005670016165822744 2023-01-22 21:57:01.568773: step: 1932/527, loss: 0.012916527688503265 2023-01-22 21:57:02.615317: step: 1936/527, loss: 0.002517060609534383 2023-01-22 21:57:03.664205: step: 1940/527, loss: 0.002991416957229376 2023-01-22 21:57:04.723749: step: 1944/527, loss: 0.001949058030731976 2023-01-22 21:57:05.784203: step: 1948/527, loss: 0.007145324721932411 2023-01-22 21:57:06.834968: step: 1952/527, loss: 0.025982271879911423 2023-01-22 21:57:07.880598: step: 1956/527, loss: 0.004791861865669489 2023-01-22 21:57:08.942417: step: 1960/527, loss: 0.004748423118144274 2023-01-22 21:57:09.984830: step: 1964/527, loss: 0.0008997737313620746 2023-01-22 21:57:11.037065: step: 1968/527, loss: 0.0038349067326635122 2023-01-22 
21:57:12.097900: step: 1972/527, loss: 0.0122023681178689 2023-01-22 21:57:13.149300: step: 1976/527, loss: 0.004679364152252674 2023-01-22 21:57:14.192027: step: 1980/527, loss: 0.003038185415789485 2023-01-22 21:57:15.251886: step: 1984/527, loss: 0.016883159056305885 2023-01-22 21:57:16.306323: step: 1988/527, loss: 0.03122745454311371 2023-01-22 21:57:17.360961: step: 1992/527, loss: 0.0015301862731575966 2023-01-22 21:57:18.416060: step: 1996/527, loss: 0.00023105574655346572 2023-01-22 21:57:19.474625: step: 2000/527, loss: 0.016663668677210808 2023-01-22 21:57:20.532588: step: 2004/527, loss: 0.031381573528051376 2023-01-22 21:57:21.566383: step: 2008/527, loss: 0.001840645563788712 2023-01-22 21:57:22.621780: step: 2012/527, loss: 0.02474190667271614 2023-01-22 21:57:23.672341: step: 2016/527, loss: 0.005741769913583994 2023-01-22 21:57:24.719181: step: 2020/527, loss: 0.00014354994345922023 2023-01-22 21:57:25.757009: step: 2024/527, loss: 0.013231417164206505 2023-01-22 21:57:26.801374: step: 2028/527, loss: 0.002164565958082676 2023-01-22 21:57:27.854566: step: 2032/527, loss: 0.001787404646165669 2023-01-22 21:57:28.913638: step: 2036/527, loss: 0.00047837159945629537 2023-01-22 21:57:29.961017: step: 2040/527, loss: 0.0003927461802959442 2023-01-22 21:57:31.005231: step: 2044/527, loss: 0.002115048933774233 2023-01-22 21:57:32.046773: step: 2048/527, loss: 0.001096814638003707 2023-01-22 21:57:33.121199: step: 2052/527, loss: 0.006315071601420641 2023-01-22 21:57:34.176897: step: 2056/527, loss: 0.0017492563929408789 2023-01-22 21:57:35.225541: step: 2060/527, loss: 0.009951732121407986 2023-01-22 21:57:36.278540: step: 2064/527, loss: 0.005994449369609356 2023-01-22 21:57:37.327627: step: 2068/527, loss: 0.0024943272583186626 2023-01-22 21:57:38.382971: step: 2072/527, loss: 0.0025179623626172543 2023-01-22 21:57:39.433390: step: 2076/527, loss: 0.04178306460380554 2023-01-22 21:57:40.468752: step: 2080/527, loss: 0.0008779895724728703 2023-01-22 
21:57:41.514233: step: 2084/527, loss: 0.0015356200747191906 2023-01-22 21:57:42.565413: step: 2088/527, loss: 0.012282346375286579 2023-01-22 21:57:43.628226: step: 2092/527, loss: 0.010887503623962402 2023-01-22 21:57:44.668416: step: 2096/527, loss: 0.002622022060677409 2023-01-22 21:57:45.745267: step: 2100/527, loss: 0.0020813436713069677 2023-01-22 21:57:46.787078: step: 2104/527, loss: 0.004574185702949762 2023-01-22 21:57:47.844938: step: 2108/527, loss: 0.012098906561732292 ================================================== Loss: 0.005 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3242597197106691, 'r': 0.3402573529411765, 'f1': 0.3320659722222222}, 'combined': 0.24468019005847952, 'stategy': 1, 'epoch': 12} Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3369774967557251, 'r': 0.31032564019413594, 'f1': 0.32310289087889216}, 'combined': 0.20678585016249096, 'stategy': 1, 'epoch': 12} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3174133176559913, 'r': 0.35475606090963735, 'f1': 0.3350473908591019}, 'combined': 0.24687702484354876, 'stategy': 1, 'epoch': 12} Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3581183706652577, 'r': 0.3170975391163282, 'f1': 0.33636190263062776}, 'combined': 0.21527161768360173, 'stategy': 1, 'epoch': 12} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3283796975483849, 'r': 0.33149525445112105, 'f1': 0.32993012104955766}, 'combined': 0.24310640498388458, 'stategy': 1, 'epoch': 12} Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.36488091818121077, 'r': 0.29881057903829816, 'f1': 0.32855710491554746}, 'combined': 
0.23556924503378876, 'stategy': 1, 'epoch': 12} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.261437908496732, 'r': 0.38095238095238093, 'f1': 0.31007751937984496}, 'combined': 0.20671834625322996, 'stategy': 1, 'epoch': 12} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 12} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'stategy': 1, 'epoch': 12} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0} Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0} Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 
'stategy': 1, 'epoch': 0} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3319773567844166, 'r': 0.33386717095965995, 'f1': 0.3329195820165389}, 'combined': 0.24530916569639705, 'stategy': 1, 'epoch': 11} Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.36875593136201523, 'r': 0.3013128538335666, 'f1': 0.3316402867932796}, 'combined': 0.23777982826687974, 'stategy': 1, 'epoch': 11} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 11} ****************************** Epoch: 13 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 22:00:16.963099: step: 4/527, loss: 0.00047966433339752257 2023-01-22 22:00:18.011588: step: 8/527, loss: 0.12679523229599 2023-01-22 22:00:19.052819: step: 12/527, loss: 0.005931123625487089 2023-01-22 22:00:20.099828: step: 16/527, loss: 0.006797137204557657 2023-01-22 22:00:21.155694: step: 20/527, loss: 0.0046298932284116745 2023-01-22 22:00:22.201302: step: 24/527, loss: 0.00021893317170906812 2023-01-22 22:00:23.223547: step: 28/527, loss: 0.0015516445273533463 2023-01-22 22:00:24.265417: step: 32/527, loss: 0.005914092529565096 2023-01-22 22:00:25.307203: step: 36/527, loss: 0.0003633471787907183 2023-01-22 22:00:26.376421: step: 40/527, loss: 0.0033023653086274862 2023-01-22 22:00:27.425332: step: 44/527, loss: 0.0027280061040073633 2023-01-22 
22:00:28.472405: step: 48/527, loss: 0.003142612287774682 2023-01-22 22:00:29.510131: step: 52/527, loss: 0.0048146843910217285 2023-01-22 22:00:30.565324: step: 56/527, loss: 0.0027329442091286182 2023-01-22 22:00:31.605225: step: 60/527, loss: 0.00626673549413681 2023-01-22 22:00:32.653624: step: 64/527, loss: 0.011355612426996231 2023-01-22 22:00:33.694255: step: 68/527, loss: 0.006362869404256344 2023-01-22 22:00:34.735474: step: 72/527, loss: 0.0031870645470917225 2023-01-22 22:00:35.769062: step: 76/527, loss: 0.0051883733831346035 2023-01-22 22:00:36.803850: step: 80/527, loss: 0.0031207758001983166 2023-01-22 22:00:37.850097: step: 84/527, loss: 5.671257895301096e-05 2023-01-22 22:00:38.888212: step: 88/527, loss: 0.0016243119025602937 2023-01-22 22:00:39.935019: step: 92/527, loss: 0.0014474587514996529 2023-01-22 22:00:40.970385: step: 96/527, loss: 0.0004649146576412022 2023-01-22 22:00:42.018945: step: 100/527, loss: 0.0012905211187899113 2023-01-22 22:00:43.074265: step: 104/527, loss: 0.004177778493613005 2023-01-22 22:00:44.116140: step: 108/527, loss: 0.003395121544599533 2023-01-22 22:00:45.159697: step: 112/527, loss: 0.009098974987864494 2023-01-22 22:00:46.204026: step: 116/527, loss: 0.006990011315792799 2023-01-22 22:00:47.275883: step: 120/527, loss: 0.017712267115712166 2023-01-22 22:00:48.324686: step: 124/527, loss: 0.002568305702880025 2023-01-22 22:00:49.365636: step: 128/527, loss: 0.02214146964251995 2023-01-22 22:00:50.418689: step: 132/527, loss: 0.0020087631419301033 2023-01-22 22:00:51.473679: step: 136/527, loss: 0.0016258673276752234 2023-01-22 22:00:52.504979: step: 140/527, loss: 0.006468077190220356 2023-01-22 22:00:53.557807: step: 144/527, loss: 0.004438657313585281 2023-01-22 22:00:54.592615: step: 148/527, loss: 0.019502557814121246 2023-01-22 22:00:55.637843: step: 152/527, loss: 0.0023263217881321907 2023-01-22 22:00:56.707531: step: 156/527, loss: 0.010414495132863522 2023-01-22 22:00:57.749748: step: 160/527, loss: 
0.010374199599027634 2023-01-22 22:00:58.787760: step: 164/527, loss: 0.00040748337050899863 2023-01-22 22:00:59.828971: step: 168/527, loss: 0.012541718780994415 2023-01-22 22:01:00.881737: step: 172/527, loss: 0.002299034036695957 2023-01-22 22:01:01.918510: step: 176/527, loss: 0.0919279158115387 2023-01-22 22:01:02.973597: step: 180/527, loss: 2.3659185899305157e-06 2023-01-22 22:01:04.006094: step: 184/527, loss: 0.001996793784201145 2023-01-22 22:01:05.054484: step: 188/527, loss: 0.008163457736372948 2023-01-22 22:01:06.109630: step: 192/527, loss: 0.002465104917064309 2023-01-22 22:01:07.167056: step: 196/527, loss: 0.005428884644061327 2023-01-22 22:01:08.217333: step: 200/527, loss: 0.003700327593833208 2023-01-22 22:01:09.266267: step: 204/527, loss: 0.0035851493012160063 2023-01-22 22:01:10.304315: step: 208/527, loss: 0.0030758383218199015 2023-01-22 22:01:11.344535: step: 212/527, loss: 0.00011749286204576492 2023-01-22 22:01:12.392525: step: 216/527, loss: 0.0030023108702152967 2023-01-22 22:01:13.434556: step: 220/527, loss: 0.00010859598114620894 2023-01-22 22:01:14.485895: step: 224/527, loss: 0.002482845913618803 2023-01-22 22:01:15.538551: step: 228/527, loss: 0.0008771735010668635 2023-01-22 22:01:16.580407: step: 232/527, loss: 0.0006597733008675277 2023-01-22 22:01:17.658951: step: 236/527, loss: 0.0026779011823236942 2023-01-22 22:01:18.698714: step: 240/527, loss: 0.004704420454800129 2023-01-22 22:01:19.753603: step: 244/527, loss: 0.0011582281440496445 2023-01-22 22:01:20.800166: step: 248/527, loss: 0.004860404413193464 2023-01-22 22:01:21.846348: step: 252/527, loss: 0.0033757255878299475 2023-01-22 22:01:22.893669: step: 256/527, loss: 0.005320724565535784 2023-01-22 22:01:23.947991: step: 260/527, loss: 0.002112730871886015 2023-01-22 22:01:24.978886: step: 264/527, loss: 0.00735299801453948 2023-01-22 22:01:26.024930: step: 268/527, loss: 0.00027587651857174933 2023-01-22 22:01:27.064460: step: 272/527, loss: 0.0026921499520540237 
2023-01-22 22:01:28.107803: step: 276/527, loss: 0.007181172259151936 2023-01-22 22:01:29.139542: step: 280/527, loss: 0.0008219439769163728 2023-01-22 22:01:30.191638: step: 284/527, loss: 0.0031848468352109194 2023-01-22 22:01:31.252088: step: 288/527, loss: 0.012136109173297882 2023-01-22 22:01:32.305070: step: 292/527, loss: 0.0013794435653835535 2023-01-22 22:01:33.353862: step: 296/527, loss: 0.0041949208825826645 2023-01-22 22:01:34.396602: step: 300/527, loss: 0.005405834876000881 2023-01-22 22:01:35.462741: step: 304/527, loss: 0.014825263060629368 2023-01-22 22:01:36.501357: step: 308/527, loss: 0.002000151900574565 2023-01-22 22:01:37.581925: step: 312/527, loss: 0.001867753453552723 2023-01-22 22:01:38.644298: step: 316/527, loss: 0.006596806459128857 2023-01-22 22:01:39.687173: step: 320/527, loss: 0.000626065768301487 2023-01-22 22:01:40.761064: step: 324/527, loss: 0.0012385525042191148 2023-01-22 22:01:41.807750: step: 328/527, loss: 0.009825015440583229 2023-01-22 22:01:42.856820: step: 332/527, loss: 0.00412285840138793 2023-01-22 22:01:43.911362: step: 336/527, loss: 3.9639278838876635e-05 2023-01-22 22:01:44.972966: step: 340/527, loss: 0.0099570881575346 2023-01-22 22:01:46.002237: step: 344/527, loss: 0.0038917972706258297 2023-01-22 22:01:47.059321: step: 348/527, loss: 0.005468165036290884 2023-01-22 22:01:48.117239: step: 352/527, loss: 0.016557637602090836 2023-01-22 22:01:49.167480: step: 356/527, loss: 0.02072323113679886 2023-01-22 22:01:50.221325: step: 360/527, loss: 0.00039233139250427485 2023-01-22 22:01:51.273898: step: 364/527, loss: 0.00015779142268002033 2023-01-22 22:01:52.321262: step: 368/527, loss: 0.004482398275285959 2023-01-22 22:01:53.360329: step: 372/527, loss: 0.00025002469192259014 2023-01-22 22:01:54.432567: step: 376/527, loss: 0.011141078546643257 2023-01-22 22:01:55.502132: step: 380/527, loss: 0.0016537305200472474 2023-01-22 22:01:56.571484: step: 384/527, loss: 0.00703157065436244 2023-01-22 22:01:57.633493: 
step: 388/527, loss: 0.000436004571383819 2023-01-22 22:01:58.710648: step: 392/527, loss: 0.004214850720018148 2023-01-22 22:01:59.772044: step: 396/527, loss: 0.0041285110637545586 2023-01-22 22:02:00.831160: step: 400/527, loss: 0.0056310053914785385 2023-01-22 22:02:01.886683: step: 404/527, loss: 0.003791823284700513 2023-01-22 22:02:02.938656: step: 408/527, loss: 0.0010202594567090273 2023-01-22 22:02:03.981461: step: 412/527, loss: 0.000974743627011776 2023-01-22 22:02:05.042069: step: 416/527, loss: 0.049156296998262405 2023-01-22 22:02:06.092866: step: 420/527, loss: 0.004324521869421005 2023-01-22 22:02:07.146200: step: 424/527, loss: 0.0003029952058568597 2023-01-22 22:02:08.220146: step: 428/527, loss: 0.001004286459647119 2023-01-22 22:02:09.265220: step: 432/527, loss: 0.007104435004293919 2023-01-22 22:02:10.313065: step: 436/527, loss: 0.0003585568629205227 2023-01-22 22:02:11.376899: step: 440/527, loss: 0.0039218151941895485 2023-01-22 22:02:12.414724: step: 444/527, loss: 0.01423337496817112 2023-01-22 22:02:13.459479: step: 448/527, loss: 0.0015733834588900208 2023-01-22 22:02:14.514203: step: 452/527, loss: 0.004013628698885441 2023-01-22 22:02:15.584434: step: 456/527, loss: 0.003590354695916176 2023-01-22 22:02:16.623765: step: 460/527, loss: 0.008774403482675552 2023-01-22 22:02:17.682821: step: 464/527, loss: 0.000348780769854784 2023-01-22 22:02:18.718284: step: 468/527, loss: 8.557185356039554e-05 2023-01-22 22:02:19.794949: step: 472/527, loss: 0.0004322062595747411 2023-01-22 22:02:20.847476: step: 476/527, loss: 0.0011674977140501142 2023-01-22 22:02:21.897219: step: 480/527, loss: 0.0034263022243976593 2023-01-22 22:02:22.953191: step: 484/527, loss: 0.0002531966019887477 2023-01-22 22:02:24.018401: step: 488/527, loss: 0.0034450851380825043 2023-01-22 22:02:25.082416: step: 492/527, loss: 0.002794889733195305 2023-01-22 22:02:26.135649: step: 496/527, loss: 0.006746026687324047 2023-01-22 22:02:27.192659: step: 500/527, loss: 
0.005788684822618961 2023-01-22 22:02:28.239663: step: 504/527, loss: 0.0032613726798444986 2023-01-22 22:02:29.281208: step: 508/527, loss: 0.014688258990645409 2023-01-22 22:02:30.339567: step: 512/527, loss: 0.00046264310367405415 2023-01-22 22:02:31.388519: step: 516/527, loss: 0.0029617997352033854 2023-01-22 22:02:32.444617: step: 520/527, loss: 0.0032303743064403534 2023-01-22 22:02:33.477777: step: 524/527, loss: 6.718641088809818e-05 2023-01-22 22:02:34.518763: step: 528/527, loss: 0.002313081407919526 2023-01-22 22:02:35.565800: step: 532/527, loss: 6.916584970895201e-05 2023-01-22 22:02:36.607648: step: 536/527, loss: 0.00029460343648679554 2023-01-22 22:02:37.668506: step: 540/527, loss: 0.011031667701900005 2023-01-22 22:02:38.742571: step: 544/527, loss: 0.004498671740293503 2023-01-22 22:02:39.800593: step: 548/527, loss: 0.001042934600263834 2023-01-22 22:02:40.861558: step: 552/527, loss: 0.0025397450663149357 2023-01-22 22:02:41.919742: step: 556/527, loss: 0.00864452589303255 2023-01-22 22:02:42.974894: step: 560/527, loss: 0.0022043746430426836 2023-01-22 22:02:44.037198: step: 564/527, loss: 0.00486398721113801 2023-01-22 22:02:45.108663: step: 568/527, loss: 0.0031027125660330057 2023-01-22 22:02:46.140651: step: 572/527, loss: 0.00041250884532928467 2023-01-22 22:02:47.184920: step: 576/527, loss: 0.0039843907579779625 2023-01-22 22:02:48.248440: step: 580/527, loss: 3.87683576263953e-05 2023-01-22 22:02:49.316275: step: 584/527, loss: 0.0005696567241102457 2023-01-22 22:02:50.386476: step: 588/527, loss: 0.009698409587144852 2023-01-22 22:02:51.427529: step: 592/527, loss: 0.004264072980731726 2023-01-22 22:02:52.491195: step: 596/527, loss: 0.0003863092861138284 2023-01-22 22:02:53.541107: step: 600/527, loss: 0.005400056950747967 2023-01-22 22:02:54.584234: step: 604/527, loss: 5.98683618591167e-05 2023-01-22 22:02:55.641069: step: 608/527, loss: 0.0002464533317834139 2023-01-22 22:02:56.700036: step: 612/527, loss: 0.002514599123969674 
2023-01-22 22:02:57.762205: step: 616/527, loss: 0.002618327271193266 2023-01-22 22:02:58.816268: step: 620/527, loss: 0.003062386065721512 2023-01-22 22:02:59.879780: step: 624/527, loss: 0.0033916831016540527 2023-01-22 22:03:00.928962: step: 628/527, loss: 0.01238814927637577 2023-01-22 22:03:01.983128: step: 632/527, loss: 0.005368268582969904 2023-01-22 22:03:03.029155: step: 636/527, loss: 0.001293156761676073 2023-01-22 22:03:04.087890: step: 640/527, loss: 0.005739733576774597 2023-01-22 22:03:05.132777: step: 644/527, loss: 0.00011344454105710611 2023-01-22 22:03:06.169947: step: 648/527, loss: 0.00027888751355931163 2023-01-22 22:03:07.242325: step: 652/527, loss: 0.001399198197759688 2023-01-22 22:03:08.290877: step: 656/527, loss: 0.0036111478693783283 2023-01-22 22:03:09.339398: step: 660/527, loss: 6.151192064862698e-05 2023-01-22 22:03:10.389003: step: 664/527, loss: 0.004169187508523464 2023-01-22 22:03:11.453639: step: 668/527, loss: 0.03293965011835098 2023-01-22 22:03:12.521986: step: 672/527, loss: 0.026547370478510857 2023-01-22 22:03:13.580151: step: 676/527, loss: 0.0018172883428633213 2023-01-22 22:03:14.630912: step: 680/527, loss: 0.004993759095668793 2023-01-22 22:03:15.677109: step: 684/527, loss: 0.0005401379894465208 2023-01-22 22:03:16.714696: step: 688/527, loss: 0.003349890233948827 2023-01-22 22:03:17.766963: step: 692/527, loss: 0.004341718275099993 2023-01-22 22:03:18.823644: step: 696/527, loss: 0.02069842629134655 2023-01-22 22:03:19.878340: step: 700/527, loss: 0.006068716291338205 2023-01-22 22:03:20.931610: step: 704/527, loss: 0.04438595473766327 2023-01-22 22:03:21.980559: step: 708/527, loss: 0.03625955432653427 2023-01-22 22:03:23.014660: step: 712/527, loss: 0.009529301896691322 2023-01-22 22:03:24.068624: step: 716/527, loss: 0.00020313216373324394 2023-01-22 22:03:25.119385: step: 720/527, loss: 0.001596994698047638 2023-01-22 22:03:26.186708: step: 724/527, loss: 0.0011845908593386412 2023-01-22 22:03:27.247663: 
step: 728/527, loss: 0.0021616287995129824 2023-01-22 22:03:28.319328: step: 732/527, loss: 0.0030026764143258333 2023-01-22 22:03:29.364772: step: 736/527, loss: 0.01383097842335701 2023-01-22 22:03:30.414858: step: 740/527, loss: 0.006477289833128452 2023-01-22 22:03:31.461967: step: 744/527, loss: 0.035857681185007095 2023-01-22 22:03:32.514999: step: 748/527, loss: 0.005030815023928881 2023-01-22 22:03:33.579204: step: 752/527, loss: 0.004621890839189291 2023-01-22 22:03:34.643454: step: 756/527, loss: 0.01211671531200409 2023-01-22 22:03:35.702722: step: 760/527, loss: 0.018070314079523087 2023-01-22 22:03:36.771734: step: 764/527, loss: 0.007683371193706989 2023-01-22 22:03:37.823841: step: 768/527, loss: 0.023442300036549568 2023-01-22 22:03:38.903511: step: 772/527, loss: 0.0005778574850410223 2023-01-22 22:03:39.990434: step: 776/527, loss: 0.01411820761859417 2023-01-22 22:03:41.054183: step: 780/527, loss: 0.0008859537192620337 2023-01-22 22:03:42.094976: step: 784/527, loss: 0.0003062605392187834 2023-01-22 22:03:43.159319: step: 788/527, loss: 0.00015818291285540909 2023-01-22 22:03:44.202057: step: 792/527, loss: 4.957219061907381e-06 2023-01-22 22:03:45.263800: step: 796/527, loss: 0.030652323737740517 2023-01-22 22:03:46.302664: step: 800/527, loss: 0.023049531504511833 2023-01-22 22:03:47.355975: step: 804/527, loss: 4.4695683754980564e-05 2023-01-22 22:03:48.409185: step: 808/527, loss: 0.0038705794140696526 2023-01-22 22:03:49.472890: step: 812/527, loss: 0.009528076276183128 2023-01-22 22:03:50.533546: step: 816/527, loss: 0.0004005462396889925 2023-01-22 22:03:51.586379: step: 820/527, loss: 0.0013192713959142566 2023-01-22 22:03:52.637020: step: 824/527, loss: 0.00320342555642128 2023-01-22 22:03:53.720546: step: 828/527, loss: 0.0022633594926446676 2023-01-22 22:03:54.777074: step: 832/527, loss: 0.002621921943500638 2023-01-22 22:03:55.846915: step: 836/527, loss: 0.014133133925497532 2023-01-22 22:03:56.903352: step: 840/527, loss: 
0.00785229355096817 2023-01-22 22:03:57.960756: step: 844/527, loss: 0.00226704403758049 2023-01-22 22:03:59.028762: step: 848/527, loss: 0.021022062748670578 2023-01-22 22:04:00.088854: step: 852/527, loss: 0.00106845295522362 2023-01-22 22:04:01.159935: step: 856/527, loss: 0.006896406412124634 2023-01-22 22:04:02.219439: step: 860/527, loss: 0.0008146132458932698 2023-01-22 22:04:03.282293: step: 864/527, loss: 0.0014657324645668268 2023-01-22 22:04:04.324241: step: 868/527, loss: 0.0035612343344837427 2023-01-22 22:04:05.373639: step: 872/527, loss: 0.006471499800682068 2023-01-22 22:04:06.426711: step: 876/527, loss: 0.000636056181974709 2023-01-22 22:04:07.478308: step: 880/527, loss: 0.000731413543689996 2023-01-22 22:04:08.530559: step: 884/527, loss: 0.005597101524472237 2023-01-22 22:04:09.568897: step: 888/527, loss: 0.0018012278014793992 2023-01-22 22:04:10.628108: step: 892/527, loss: 0.0015583968488499522 2023-01-22 22:04:11.687948: step: 896/527, loss: 0.00021445844322443008 2023-01-22 22:04:12.732467: step: 900/527, loss: 0.004209148231893778 2023-01-22 22:04:13.795078: step: 904/527, loss: 0.0020082229748368263 2023-01-22 22:04:14.849669: step: 908/527, loss: 0.029260389506816864 2023-01-22 22:04:15.908832: step: 912/527, loss: 0.026747386902570724 2023-01-22 22:04:16.959471: step: 916/527, loss: 0.0013539609499275684 2023-01-22 22:04:18.008429: step: 920/527, loss: 0.002039816463366151 2023-01-22 22:04:19.050529: step: 924/527, loss: 0.006380030419677496 2023-01-22 22:04:20.104968: step: 928/527, loss: 0.002951527712866664 2023-01-22 22:04:21.192761: step: 932/527, loss: 0.0035444144159555435 2023-01-22 22:04:22.259557: step: 936/527, loss: 0.007331428583711386 2023-01-22 22:04:23.303757: step: 940/527, loss: 0.001937823137268424 2023-01-22 22:04:24.357369: step: 944/527, loss: 0.004884100519120693 2023-01-22 22:04:25.392832: step: 948/527, loss: 0.003096437081694603 2023-01-22 22:04:26.437330: step: 952/527, loss: 0.00049392826622352 2023-01-22 
22:04:27.500009: step: 956/527, loss: 0.001529772998765111 2023-01-22 22:04:28.537134: step: 960/527, loss: 0.0034300293773412704 2023-01-22 22:04:29.599239: step: 964/527, loss: 0.006670809350907803 2023-01-22 22:04:30.645064: step: 968/527, loss: 0.0070527647621929646 2023-01-22 22:04:31.716501: step: 972/527, loss: 0.004473906476050615 2023-01-22 22:04:32.762643: step: 976/527, loss: 0.009639867581427097 2023-01-22 22:04:33.820814: step: 980/527, loss: 9.810461779125035e-05 2023-01-22 22:04:34.879626: step: 984/527, loss: 0.053507447242736816 2023-01-22 22:04:35.943325: step: 988/527, loss: 0.006516705732792616 2023-01-22 22:04:36.987426: step: 992/527, loss: 0.00079623784404248 2023-01-22 22:04:38.056024: step: 996/527, loss: 0.001361334347166121 2023-01-22 22:04:39.098604: step: 1000/527, loss: 0.002607665490359068 2023-01-22 22:04:40.151657: step: 1004/527, loss: 0.005556870251893997 2023-01-22 22:04:41.213878: step: 1008/527, loss: 0.0016048513352870941 2023-01-22 22:04:42.270767: step: 1012/527, loss: 0.0065682721324265 2023-01-22 22:04:43.324327: step: 1016/527, loss: 0.002197975292801857 2023-01-22 22:04:44.366986: step: 1020/527, loss: 0.019346218556165695 2023-01-22 22:04:45.424372: step: 1024/527, loss: 0.00048493093345314264 2023-01-22 22:04:46.473234: step: 1028/527, loss: 0.037749119102954865 2023-01-22 22:04:47.533193: step: 1032/527, loss: 0.003849076572805643 2023-01-22 22:04:48.578245: step: 1036/527, loss: 0.0023239452857524157 2023-01-22 22:04:49.638466: step: 1040/527, loss: 0.004615205805748701 2023-01-22 22:04:50.683618: step: 1044/527, loss: 0.00832344125956297 2023-01-22 22:04:51.736169: step: 1048/527, loss: 0.003352670231834054 2023-01-22 22:04:52.783039: step: 1052/527, loss: 0.0014569121412932873 2023-01-22 22:04:53.820683: step: 1056/527, loss: 0.0007146981661207974 2023-01-22 22:04:54.864214: step: 1060/527, loss: 0.0008186764316633344 2023-01-22 22:04:55.918260: step: 1064/527, loss: 0.004233731888234615 2023-01-22 22:04:56.972771: 
step: 1068/527, loss: 0.0007094976026564837 2023-01-22 22:04:58.024017: step: 1072/527, loss: 0.0015996926231309772 2023-01-22 22:04:59.091777: step: 1076/527, loss: 0.0025237714871764183 2023-01-22 22:05:00.173253: step: 1080/527, loss: 0.009845957159996033 2023-01-22 22:05:01.258745: step: 1084/527, loss: 0.0008140782592818141 2023-01-22 22:05:02.311270: step: 1088/527, loss: 0.0001628329191589728 2023-01-22 22:05:03.378804: step: 1092/527, loss: 0.000677892763633281 2023-01-22 22:05:04.438309: step: 1096/527, loss: 0.017683546990156174 2023-01-22 22:05:05.497154: step: 1100/527, loss: 0.004512401297688484 2023-01-22 22:05:06.562313: step: 1104/527, loss: 0.012557548470795155 2023-01-22 22:05:07.613023: step: 1108/527, loss: 0.0024149129167199135 2023-01-22 22:05:08.663863: step: 1112/527, loss: 0.0062912022694945335 2023-01-22 22:05:09.722612: step: 1116/527, loss: 0.0020975188817828894 2023-01-22 22:05:10.786914: step: 1120/527, loss: 0.006300954148173332 2023-01-22 22:05:11.838242: step: 1124/527, loss: 0.00213739275932312 2023-01-22 22:05:12.889633: step: 1128/527, loss: 0.002461416181176901 2023-01-22 22:05:13.937031: step: 1132/527, loss: 0.003327315906062722 2023-01-22 22:05:14.993590: step: 1136/527, loss: 0.005420952569693327 2023-01-22 22:05:16.048351: step: 1140/527, loss: 0.0037704408168792725 2023-01-22 22:05:17.114392: step: 1144/527, loss: 0.08022844046354294 2023-01-22 22:05:18.171079: step: 1148/527, loss: 0.00016010383842512965 2023-01-22 22:05:19.257334: step: 1152/527, loss: 0.02749839425086975 2023-01-22 22:05:20.296979: step: 1156/527, loss: 9.632661385694519e-05 2023-01-22 22:05:21.374265: step: 1160/527, loss: 0.0013591231545433402 2023-01-22 22:05:22.424345: step: 1164/527, loss: 0.0029064109548926353 2023-01-22 22:05:23.475286: step: 1168/527, loss: 0.0025018402375280857 2023-01-22 22:05:24.522657: step: 1172/527, loss: 0.0017730684485286474 2023-01-22 22:05:25.565667: step: 1176/527, loss: 3.7252898543727042e-09 2023-01-22 
22:05:26.615616: step: 1180/527, loss: 0.0036277053877711296 2023-01-22 22:05:27.674648: step: 1184/527, loss: 0.0032393040601164103 2023-01-22 22:05:28.721316: step: 1188/527, loss: 0.00472364854067564 2023-01-22 22:05:29.763004: step: 1192/527, loss: 0.008162073791027069 2023-01-22 22:05:30.815588: step: 1196/527, loss: 0.018864743411540985 2023-01-22 22:05:31.874752: step: 1200/527, loss: 0.0028245439752936363 2023-01-22 22:05:32.938428: step: 1204/527, loss: 0.005585981998592615 2023-01-22 22:05:33.998197: step: 1208/527, loss: 0.000987582840025425 2023-01-22 22:05:35.036471: step: 1212/527, loss: 0.007033372763544321 2023-01-22 22:05:36.096912: step: 1216/527, loss: 0.018273040652275085 2023-01-22 22:05:37.151184: step: 1220/527, loss: 0.0035938937216997147 2023-01-22 22:05:38.197590: step: 1224/527, loss: 0.016538042575120926 2023-01-22 22:05:39.239828: step: 1228/527, loss: 0.0042722695507109165 2023-01-22 22:05:40.288687: step: 1232/527, loss: 0.0024743378162384033 2023-01-22 22:05:41.356063: step: 1236/527, loss: 0.0037654361221939325 2023-01-22 22:05:42.426918: step: 1240/527, loss: 0.00030759748187847435 2023-01-22 22:05:43.479756: step: 1244/527, loss: 0.0016991720767691731 2023-01-22 22:05:44.525268: step: 1248/527, loss: 0.005908094346523285 2023-01-22 22:05:45.560693: step: 1252/527, loss: 0.003163701854646206 2023-01-22 22:05:46.616642: step: 1256/527, loss: 0.009016034193336964 2023-01-22 22:05:47.670528: step: 1260/527, loss: 0.0025383138563483953 2023-01-22 22:05:48.717751: step: 1264/527, loss: 0.000447107624495402 2023-01-22 22:05:49.785817: step: 1268/527, loss: 6.376469536917284e-05 2023-01-22 22:05:50.825631: step: 1272/527, loss: 0.027500247582793236 2023-01-22 22:05:51.865749: step: 1276/527, loss: 4.464551238925196e-05 2023-01-22 22:05:52.912807: step: 1280/527, loss: 0.0036577354185283184 2023-01-22 22:05:53.958897: step: 1284/527, loss: 0.0011909445747733116 2023-01-22 22:05:55.026558: step: 1288/527, loss: 0.0076804026030004025 
2023-01-22 22:05:56.077912: step: 1292/527, loss: 0.011143893003463745 2023-01-22 22:05:57.138143: step: 1296/527, loss: 0.019095083698630333 2023-01-22 22:05:58.186742: step: 1300/527, loss: 0.001601521740667522 2023-01-22 22:05:59.244579: step: 1304/527, loss: 0.0014019741211086512 2023-01-22 22:06:00.299674: step: 1308/527, loss: 0.005991935729980469 2023-01-22 22:06:01.345818: step: 1312/527, loss: 0.0016050265403464437 2023-01-22 22:06:02.395110: step: 1316/527, loss: 0.000516237283591181 2023-01-22 22:06:03.443979: step: 1320/527, loss: 0.0013739397982135415 2023-01-22 22:06:04.492101: step: 1324/527, loss: 9.797236089070793e-07 2023-01-22 22:06:05.538183: step: 1328/527, loss: 0.0001518372300779447 2023-01-22 22:06:06.578576: step: 1332/527, loss: 2.3228280099374388e-07 2023-01-22 22:06:07.633206: step: 1336/527, loss: 0.012522008270025253 2023-01-22 22:06:08.683495: step: 1340/527, loss: 3.1365692620966e-07 2023-01-22 22:06:09.740108: step: 1344/527, loss: 0.007189786992967129 2023-01-22 22:06:10.801766: step: 1348/527, loss: 0.0028326334431767464 2023-01-22 22:06:11.854004: step: 1352/527, loss: 0.003496192628517747 2023-01-22 22:06:12.908703: step: 1356/527, loss: 0.000181456096470356 2023-01-22 22:06:13.959813: step: 1360/527, loss: 0.0009919545846059918 2023-01-22 22:06:15.009241: step: 1364/527, loss: 0.002171309432014823 2023-01-22 22:06:16.067791: step: 1368/527, loss: 0.0022930451668798923 2023-01-22 22:06:17.149163: step: 1372/527, loss: 0.01480209082365036 2023-01-22 22:06:18.209074: step: 1376/527, loss: 0.0024100271984934807 2023-01-22 22:06:19.266095: step: 1380/527, loss: 9.323685662820935e-05 2023-01-22 22:06:20.307061: step: 1384/527, loss: 0.009930646046996117 2023-01-22 22:06:21.354119: step: 1388/527, loss: 0.0022097628097981215 2023-01-22 22:06:22.397456: step: 1392/527, loss: 0.00019306884496472776 2023-01-22 22:06:23.458951: step: 1396/527, loss: 0.006641766522079706 2023-01-22 22:06:24.502361: step: 1400/527, loss: 
4.484779856284149e-05 2023-01-22 22:06:25.563282: step: 1404/527, loss: 0.0023347761016339064 2023-01-22 22:06:26.601611: step: 1408/527, loss: 0.0018095102859660983 2023-01-22 22:06:27.650556: step: 1412/527, loss: 0.0008096770034171641 2023-01-22 22:06:28.706486: step: 1416/527, loss: 0.0023772353306412697 2023-01-22 22:06:29.746325: step: 1420/527, loss: 0.002142679877579212 2023-01-22 22:06:30.796753: step: 1424/527, loss: 0.012929125688970089 2023-01-22 22:06:31.841625: step: 1428/527, loss: 0.0007798751466907561 2023-01-22 22:06:32.892285: step: 1432/527, loss: 0.0028679780662059784 2023-01-22 22:06:33.939191: step: 1436/527, loss: 0.0013708813348785043 2023-01-22 22:06:34.985590: step: 1440/527, loss: 0.0011231850367039442 2023-01-22 22:06:36.028570: step: 1444/527, loss: 0.0053117242641747 2023-01-22 22:06:37.066454: step: 1448/527, loss: 1.5033860108815134e-05 2023-01-22 22:06:38.119717: step: 1452/527, loss: 0.0025865137577056885 2023-01-22 22:06:39.177527: step: 1456/527, loss: 0.0006940962630324066 2023-01-22 22:06:40.230876: step: 1460/527, loss: 0.008366623893380165 2023-01-22 22:06:41.283437: step: 1464/527, loss: 0.005322176031768322 2023-01-22 22:06:42.330954: step: 1468/527, loss: 0.005060556810349226 2023-01-22 22:06:43.395226: step: 1472/527, loss: 0.004776624031364918 2023-01-22 22:06:44.439937: step: 1476/527, loss: 0.0036280089989304543 2023-01-22 22:06:45.492180: step: 1480/527, loss: 0.0009317616350017488 2023-01-22 22:06:46.537716: step: 1484/527, loss: 0.0012273200554773211 2023-01-22 22:06:47.589940: step: 1488/527, loss: 0.0038714574184268713 2023-01-22 22:06:48.642921: step: 1492/527, loss: 0.0029744389466941357 2023-01-22 22:06:49.715940: step: 1496/527, loss: 0.0015907816123217344 2023-01-22 22:06:50.769023: step: 1500/527, loss: 0.0036565284244716167 2023-01-22 22:06:51.804800: step: 1504/527, loss: 0.0011274943826720119 2023-01-22 22:06:52.839566: step: 1508/527, loss: 0.002733144210651517 2023-01-22 22:06:53.889417: step: 
1512/527, loss: 6.684953405056149e-05 2023-01-22 22:06:54.951340: step: 1516/527, loss: 0.004166835453361273 2023-01-22 22:06:55.994585: step: 1520/527, loss: 0.010164335370063782 2023-01-22 22:06:57.025382: step: 1524/527, loss: 0.0057418206706643105 2023-01-22 22:06:58.078049: step: 1528/527, loss: 0.00045034760842099786 2023-01-22 22:06:59.132684: step: 1532/527, loss: 0.0056198593229055405 2023-01-22 22:07:00.160142: step: 1536/527, loss: 0.00041610983316786587 2023-01-22 22:07:01.208896: step: 1540/527, loss: 0.0018941520247608423 2023-01-22 22:07:02.274921: step: 1544/527, loss: 0.011884275823831558 2023-01-22 22:07:03.323363: step: 1548/527, loss: 2.437141301925294e-05 2023-01-22 22:07:04.381780: step: 1552/527, loss: 0.003413716796785593 2023-01-22 22:07:05.439980: step: 1556/527, loss: 0.001881950069218874 2023-01-22 22:07:06.494898: step: 1560/527, loss: 0.003129188669845462 2023-01-22 22:07:07.555723: step: 1564/527, loss: 0.014781621284782887 2023-01-22 22:07:08.595616: step: 1568/527, loss: 0.0030736452899873257 2023-01-22 22:07:09.635473: step: 1572/527, loss: 0.00018176525190938264 2023-01-22 22:07:10.681626: step: 1576/527, loss: 0.00010842136543942615 2023-01-22 22:07:11.744696: step: 1580/527, loss: 0.004571137484163046 2023-01-22 22:07:12.803648: step: 1584/527, loss: 0.0005612316308543086 2023-01-22 22:07:13.856755: step: 1588/527, loss: 0.0025629668962210417 2023-01-22 22:07:14.907127: step: 1592/527, loss: 0.0004188601451460272 2023-01-22 22:07:15.945599: step: 1596/527, loss: 0.000501289265230298 2023-01-22 22:07:16.993816: step: 1600/527, loss: 0.02022780291736126 2023-01-22 22:07:18.037969: step: 1604/527, loss: 0.0006575792795047164 2023-01-22 22:07:19.083048: step: 1608/527, loss: 0.006200199481099844 2023-01-22 22:07:20.136161: step: 1612/527, loss: 0.001498966827057302 2023-01-22 22:07:21.184208: step: 1616/527, loss: 0.010210379958152771 2023-01-22 22:07:22.227869: step: 1620/527, loss: 0.0006134926225058734 2023-01-22 22:07:23.270234: 
step: 1624/527, loss: 0.006125845946371555 2023-01-22 22:07:24.325231: step: 1628/527, loss: 0.0005284885410219431 2023-01-22 22:07:25.389267: step: 1632/527, loss: 0.0023907991126179695 2023-01-22 22:07:26.430863: step: 1636/527, loss: 0.0030004247091710567 2023-01-22 22:07:27.473583: step: 1640/527, loss: 0.024753417819738388 2023-01-22 22:07:28.522065: step: 1644/527, loss: 0.00994983222335577 2023-01-22 22:07:29.580649: step: 1648/527, loss: 0.0014266620855778456 2023-01-22 22:07:30.625535: step: 1652/527, loss: 0.002152681350708008 2023-01-22 22:07:31.673817: step: 1656/527, loss: 0.0014415646437555552 2023-01-22 22:07:32.726784: step: 1660/527, loss: 0.015801381319761276 2023-01-22 22:07:33.784593: step: 1664/527, loss: 0.0010134992189705372 2023-01-22 22:07:34.838178: step: 1668/527, loss: 0.0028500196058303118 2023-01-22 22:07:35.878550: step: 1672/527, loss: 0.008396395482122898 2023-01-22 22:07:36.926870: step: 1676/527, loss: 0.0019125572871416807 2023-01-22 22:07:37.984851: step: 1680/527, loss: 0.00570964440703392 2023-01-22 22:07:39.021683: step: 1684/527, loss: 0.001295329537242651 2023-01-22 22:07:40.080875: step: 1688/527, loss: 0.006018343847244978 2023-01-22 22:07:41.139792: step: 1692/527, loss: 0.0012415233068168163 2023-01-22 22:07:42.201918: step: 1696/527, loss: 0.0037210476584732533 2023-01-22 22:07:43.248088: step: 1700/527, loss: 0.010629287920892239 2023-01-22 22:07:44.299424: step: 1704/527, loss: 0.0008832578314468265 2023-01-22 22:07:45.357068: step: 1708/527, loss: 0.0023887231945991516 2023-01-22 22:07:46.412458: step: 1712/527, loss: 0.008467582985758781 2023-01-22 22:07:47.440349: step: 1716/527, loss: 0.0008527770405635238 2023-01-22 22:07:48.488715: step: 1720/527, loss: 0.0028490840923041105 2023-01-22 22:07:49.557060: step: 1724/527, loss: 0.0015733069740235806 2023-01-22 22:07:50.597470: step: 1728/527, loss: 0.002432249952107668 2023-01-22 22:07:51.663815: step: 1732/527, loss: 0.0011606188490986824 2023-01-22 
22:07:52.710974: step: 1736/527, loss: 0.00139420700725168 2023-01-22 22:07:53.748293: step: 1740/527, loss: 0.006411610636860132 2023-01-22 22:07:54.798039: step: 1744/527, loss: 0.0023196344263851643 2023-01-22 22:07:55.854734: step: 1748/527, loss: 0.004349843133240938 2023-01-22 22:07:56.894478: step: 1752/527, loss: 0.0021164407953619957 2023-01-22 22:07:57.926163: step: 1756/527, loss: 0.0013786342460662127 2023-01-22 22:07:58.975412: step: 1760/527, loss: 0.0019900633487850428 2023-01-22 22:08:00.043469: step: 1764/527, loss: 0.005306210368871689 2023-01-22 22:08:01.094873: step: 1768/527, loss: 0.0003959077876061201 2023-01-22 22:08:02.135451: step: 1772/527, loss: 0.004017258062958717 2023-01-22 22:08:03.182072: step: 1776/527, loss: 0.0037442147731781006 2023-01-22 22:08:04.222819: step: 1780/527, loss: 0.002976828021928668 2023-01-22 22:08:05.270956: step: 1784/527, loss: 0.001570437685586512 2023-01-22 22:08:06.313953: step: 1788/527, loss: 0.0003771288029383868 2023-01-22 22:08:07.371269: step: 1792/527, loss: 0.005642472300678492 2023-01-22 22:08:08.415313: step: 1796/527, loss: 0.0005993598024360836 2023-01-22 22:08:09.458958: step: 1800/527, loss: 0.001115448772907257 2023-01-22 22:08:10.521162: step: 1804/527, loss: 0.007068478502333164 2023-01-22 22:08:11.580690: step: 1808/527, loss: 0.00014667969662696123 2023-01-22 22:08:12.631296: step: 1812/527, loss: 0.0027813890483230352 2023-01-22 22:08:13.686045: step: 1816/527, loss: 0.001548566622659564 2023-01-22 22:08:14.724469: step: 1820/527, loss: 0.0014729787362739444 2023-01-22 22:08:15.779134: step: 1824/527, loss: 0.00013281393330544233 2023-01-22 22:08:16.830832: step: 1828/527, loss: 0.006314554251730442 2023-01-22 22:08:17.877708: step: 1832/527, loss: 0.010976334102451801 2023-01-22 22:08:18.931718: step: 1836/527, loss: 0.002326847752556205 2023-01-22 22:08:19.981704: step: 1840/527, loss: 0.004039890132844448 2023-01-22 22:08:21.037705: step: 1844/527, loss: 0.002230801386758685 
2023-01-22 22:08:22.091185: step: 1848/527, loss: 6.312626646831632e-05 2023-01-22 22:08:23.159284: step: 1852/527, loss: 0.007334005553275347 2023-01-22 22:08:24.202003: step: 1856/527, loss: 0.00029763008933514357 2023-01-22 22:08:25.254084: step: 1860/527, loss: 0.0032405529636889696 2023-01-22 22:08:26.291082: step: 1864/527, loss: 0.0008879650849848986 2023-01-22 22:08:27.334922: step: 1868/527, loss: 0.0005904252175241709 2023-01-22 22:08:28.388854: step: 1872/527, loss: 0.0007783591863699257 2023-01-22 22:08:29.445656: step: 1876/527, loss: 0.014928046613931656 2023-01-22 22:08:30.520294: step: 1880/527, loss: 0.000410371896577999 2023-01-22 22:08:31.570031: step: 1884/527, loss: 0.0034347092732787132 2023-01-22 22:08:32.604879: step: 1888/527, loss: 0.0009367514867335558 2023-01-22 22:08:33.662996: step: 1892/527, loss: 0.0008500413969159126 2023-01-22 22:08:34.711052: step: 1896/527, loss: 0.004298130515962839 2023-01-22 22:08:35.761046: step: 1900/527, loss: 0.0014246077043935657 2023-01-22 22:08:36.804346: step: 1904/527, loss: 0.005420645698904991 2023-01-22 22:08:37.855156: step: 1908/527, loss: 0.001716962899081409 2023-01-22 22:08:38.903057: step: 1912/527, loss: 0.003293792949989438 2023-01-22 22:08:39.951145: step: 1916/527, loss: 0.006339459680020809 2023-01-22 22:08:41.020935: step: 1920/527, loss: 8.171771332854405e-05 2023-01-22 22:08:42.072917: step: 1924/527, loss: 0.0005772780859842896 2023-01-22 22:08:43.126933: step: 1928/527, loss: 0.0004736421979032457 2023-01-22 22:08:44.158629: step: 1932/527, loss: 0.0027515755500644445 2023-01-22 22:08:45.191998: step: 1936/527, loss: 0.003235356416553259 2023-01-22 22:08:46.238423: step: 1940/527, loss: 0.0015790046891197562 2023-01-22 22:08:47.295507: step: 1944/527, loss: 0.003959468100219965 2023-01-22 22:08:48.366394: step: 1948/527, loss: 0.009757568128407001 2023-01-22 22:08:49.410544: step: 1952/527, loss: 0.0024745729751884937 2023-01-22 22:08:50.455977: step: 1956/527, loss: 
0.0031782081350684166 2023-01-22 22:08:51.515352: step: 1960/527, loss: 0.0016110064461827278 2023-01-22 22:08:52.568918: step: 1964/527, loss: 0.004924602806568146 2023-01-22 22:08:53.613155: step: 1968/527, loss: 0.009240970946848392 2023-01-22 22:08:54.636577: step: 1972/527, loss: 0.0002643048937898129 2023-01-22 22:08:55.682893: step: 1976/527, loss: 0.00017130047490354627 2023-01-22 22:08:56.729846: step: 1980/527, loss: 0.007659686263650656 2023-01-22 22:08:57.778939: step: 1984/527, loss: 0.0033778122160583735 2023-01-22 22:08:58.837674: step: 1988/527, loss: 0.0025927498936653137 2023-01-22 22:08:59.898875: step: 1992/527, loss: 0.003156978404149413 2023-01-22 22:09:00.942462: step: 1996/527, loss: 0.009291576221585274 2023-01-22 22:09:01.991312: step: 2000/527, loss: 0.0014566507888957858 2023-01-22 22:09:03.042181: step: 2004/527, loss: 0.0353391095995903 2023-01-22 22:09:04.112715: step: 2008/527, loss: 0.002842916175723076 2023-01-22 22:09:05.186684: step: 2012/527, loss: 0.00791760440915823 2023-01-22 22:09:06.236537: step: 2016/527, loss: 0.00882710050791502 2023-01-22 22:09:07.280863: step: 2020/527, loss: 0.0035401280038058758 2023-01-22 22:09:08.326633: step: 2024/527, loss: 0.0005623472388833761 2023-01-22 22:09:09.380520: step: 2028/527, loss: 0.0023368163965642452 2023-01-22 22:09:10.434553: step: 2032/527, loss: 0.00010127259156433865 2023-01-22 22:09:11.482466: step: 2036/527, loss: 0.004729969892650843 2023-01-22 22:09:12.547547: step: 2040/527, loss: 0.0022714941296726465 2023-01-22 22:09:13.596005: step: 2044/527, loss: 0.0016250937478616834 2023-01-22 22:09:14.652204: step: 2048/527, loss: 0.0029007631819695234 2023-01-22 22:09:15.707608: step: 2052/527, loss: 0.0026187445037066936 2023-01-22 22:09:16.761747: step: 2056/527, loss: 0.014578443951904774 2023-01-22 22:09:17.822320: step: 2060/527, loss: 0.00037389600765891373 2023-01-22 22:09:18.864719: step: 2064/527, loss: 1.5922822058200836e-05 2023-01-22 22:09:19.951273: step: 2068/527, 
loss: 0.0005649236845783889 2023-01-22 22:09:21.002780: step: 2072/527, loss: 0.0036380284000188112 2023-01-22 22:09:22.059403: step: 2076/527, loss: 0.007161274552345276 2023-01-22 22:09:23.096630: step: 2080/527, loss: 4.421605990501121e-05 2023-01-22 22:09:24.140393: step: 2084/527, loss: 0.0026107062585651875 2023-01-22 22:09:25.181927: step: 2088/527, loss: 0.0009592160931788385 2023-01-22 22:09:26.227427: step: 2092/527, loss: 0.00039406042196787894 2023-01-22 22:09:27.265103: step: 2096/527, loss: 0.005230220500379801 2023-01-22 22:09:28.333107: step: 2100/527, loss: 0.0021274436730891466 2023-01-22 22:09:29.381133: step: 2104/527, loss: 0.002505266573280096 2023-01-22 22:09:30.445828: step: 2108/527, loss: 0.006490703206509352 ================================================== Loss: 0.005 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32669481981981985, 'r': 0.3440524193548387, 'f1': 0.3351490295748614}, 'combined': 0.24695191652884524, 'stategy': 1, 'epoch': 13} Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33828391470856145, 'r': 0.30876095487945066, 'f1': 0.3228489071933419}, 'combined': 0.2066233006037388, 'stategy': 1, 'epoch': 13} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32196007474470995, 'r': 0.3604486605301307, 'f1': 0.34011896884400866}, 'combined': 0.25061397704295374, 'stategy': 1, 'epoch': 13} Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.35941808999789143, 'r': 0.31596117547996455, 'f1': 0.33629152687756264}, 'combined': 0.21522657720164007, 'stategy': 1, 'epoch': 13} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3329207530108317, 'r': 0.3348159375630755, 'f1': 0.33386565581029476}, 'combined': 
0.24600627270232245, 'stategy': 1, 'epoch': 13} Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3656423286707144, 'r': 0.2961070723538997, 'f1': 0.32722139016283136}, 'combined': 0.23461156275825648, 'stategy': 1, 'epoch': 13} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 13} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 13} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'stategy': 1, 'epoch': 13} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0} Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 
'stategy': 1, 'epoch': 0} Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3319773567844166, 'r': 0.33386717095965995, 'f1': 0.3329195820165389}, 'combined': 0.24530916569639705, 'stategy': 1, 'epoch': 11} Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.36875593136201523, 'r': 0.3013128538335666, 'f1': 0.3316402867932796}, 'combined': 0.23777982826687974, 'stategy': 1, 'epoch': 11} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 11} ****************************** Epoch: 14 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 22:11:57.545148: step: 4/527, loss: 0.0034385870676487684 2023-01-22 22:11:58.584299: step: 8/527, loss: 0.00029713299591094255 2023-01-22 22:11:59.627062: step: 12/527, loss: 0.001968927448615432 2023-01-22 22:12:00.659200: step: 16/527, loss: 0.003574621630832553 2023-01-22 22:12:01.700846: step: 20/527, loss: 0.0016944973031058908 2023-01-22 22:12:02.765638: step: 24/527, loss: 0.0005614277324639261 2023-01-22 22:12:03.816778: step: 28/527, loss: 0.00016602064715698361 2023-01-22 22:12:04.847481: 
step: 32/527, loss: 0.0004911221330985427 2023-01-22 22:12:05.883057: step: 36/527, loss: 0.0036268248222768307 2023-01-22 22:12:06.912842: step: 40/527, loss: 0.00653064763173461 2023-01-22 22:12:07.971361: step: 44/527, loss: 0.004089450463652611 2023-01-22 22:12:09.023241: step: 48/527, loss: 0.005078588612377644 2023-01-22 22:12:10.062864: step: 52/527, loss: 0.0001291111548198387 2023-01-22 22:12:11.104682: step: 56/527, loss: 0.0007841204642318189 2023-01-22 22:12:12.156506: step: 60/527, loss: 0.013271546922624111 2023-01-22 22:12:13.193295: step: 64/527, loss: 0.009134400635957718 2023-01-22 22:12:14.233999: step: 68/527, loss: 0.0009942364413291216 2023-01-22 22:12:15.285098: step: 72/527, loss: 0.007302484009414911 2023-01-22 22:12:16.337724: step: 76/527, loss: 0.0002903318381868303 2023-01-22 22:12:17.377443: step: 80/527, loss: 0.0009116778965108097 2023-01-22 22:12:18.445724: step: 84/527, loss: 0.0031148213893175125 2023-01-22 22:12:19.490902: step: 88/527, loss: 0.008291029371321201 2023-01-22 22:12:20.523804: step: 92/527, loss: 4.698391057900153e-05 2023-01-22 22:12:21.552172: step: 96/527, loss: 0.0007949414430186152 2023-01-22 22:12:22.589112: step: 100/527, loss: 0.004479155410081148 2023-01-22 22:12:23.657021: step: 104/527, loss: 0.003892976325005293 2023-01-22 22:12:24.732809: step: 108/527, loss: 2.17921297007706e-06 2023-01-22 22:12:25.769862: step: 112/527, loss: 0.0015464965254068375 2023-01-22 22:12:26.799962: step: 116/527, loss: 0.00213803444057703 2023-01-22 22:12:27.845562: step: 120/527, loss: 0.0022411756217479706 2023-01-22 22:12:28.894222: step: 124/527, loss: 0.006723572965711355 2023-01-22 22:12:29.944036: step: 128/527, loss: 0.006680222228169441 2023-01-22 22:12:30.978540: step: 132/527, loss: 0.0024505641777068377 2023-01-22 22:12:32.034641: step: 136/527, loss: 0.003457698505371809 2023-01-22 22:12:33.078528: step: 140/527, loss: 0.011755745857954025 2023-01-22 22:12:34.123783: step: 144/527, loss: 0.002054257085546851 
2023-01-22 22:12:35.183839: step: 148/527, loss: 0.003046763828024268 2023-01-22 22:12:36.226741: step: 152/527, loss: 0.001982923364266753 2023-01-22 22:12:37.257443: step: 156/527, loss: 0.00019219951354898512 2023-01-22 22:12:38.303714: step: 160/527, loss: 0.002451239386573434 2023-01-22 22:12:39.338524: step: 164/527, loss: 0.00036641009501181543 2023-01-22 22:12:40.399753: step: 168/527, loss: 0.0019364228937774897 2023-01-22 22:12:41.437980: step: 172/527, loss: 0.0002594398392830044 2023-01-22 22:12:42.467858: step: 176/527, loss: 3.2482847700521233e-07 2023-01-22 22:12:43.534517: step: 180/527, loss: 0.005977904889732599 2023-01-22 22:12:44.573414: step: 184/527, loss: 0.0014887494035065174 2023-01-22 22:12:45.621319: step: 188/527, loss: 0.00010262204887112603 2023-01-22 22:12:46.679318: step: 192/527, loss: 0.0014058399247005582 2023-01-22 22:12:47.713617: step: 196/527, loss: 0.012003547511994839 2023-01-22 22:12:48.759726: step: 200/527, loss: 0.00013779512664768845 2023-01-22 22:12:49.800753: step: 204/527, loss: 0.0018862533615902066 2023-01-22 22:12:50.855114: step: 208/527, loss: 0.006187452934682369 2023-01-22 22:12:51.901610: step: 212/527, loss: 0.001590660191141069 2023-01-22 22:12:52.952101: step: 216/527, loss: 0.003272171365097165 2023-01-22 22:12:53.990603: step: 220/527, loss: 0.00725686177611351 2023-01-22 22:12:55.048182: step: 224/527, loss: 0.00021836531232111156 2023-01-22 22:12:56.110178: step: 228/527, loss: 0.007154582068324089 2023-01-22 22:12:57.163303: step: 232/527, loss: 0.00036996594280935824 2023-01-22 22:12:58.210017: step: 236/527, loss: 0.0007595289498567581 2023-01-22 22:12:59.248378: step: 240/527, loss: 0.00024765549460425973 2023-01-22 22:13:00.288293: step: 244/527, loss: 0.005371781066060066 2023-01-22 22:13:01.339682: step: 248/527, loss: 0.003217507852241397 2023-01-22 22:13:02.397006: step: 252/527, loss: 0.0028972842264920473 2023-01-22 22:13:03.449793: step: 256/527, loss: 0.00028066636878065765 2023-01-22 
22:13:04.501336: step: 260/527, loss: 0.0032610679045319557 2023-01-22 22:13:05.545743: step: 264/527, loss: 0.003325643250718713 2023-01-22 22:13:06.581460: step: 268/527, loss: 0.0 2023-01-22 22:13:07.638980: step: 272/527, loss: 0.014231090433895588 2023-01-22 22:13:08.682254: step: 276/527, loss: 0.005514030810445547 2023-01-22 22:13:09.726730: step: 280/527, loss: 0.0004978215438313782 2023-01-22 22:13:10.784988: step: 284/527, loss: 0.004339766688644886 2023-01-22 22:13:11.839289: step: 288/527, loss: 0.0029256227426230907 2023-01-22 22:13:12.904520: step: 292/527, loss: 0.0005764540401287377 2023-01-22 22:13:13.953271: step: 296/527, loss: 0.0019375586416572332 2023-01-22 22:13:14.994771: step: 300/527, loss: 0.004870260134339333 2023-01-22 22:13:16.051520: step: 304/527, loss: 0.004647491499781609 2023-01-22 22:13:17.101063: step: 308/527, loss: 0.0006293188198469579 2023-01-22 22:13:18.143355: step: 312/527, loss: 0.00012084191257599741 2023-01-22 22:13:19.198234: step: 316/527, loss: 0.007480214815586805 2023-01-22 22:13:20.254426: step: 320/527, loss: 1.3044937077211216e-05 2023-01-22 22:13:21.303135: step: 324/527, loss: 0.0016710786148905754 2023-01-22 22:13:22.347733: step: 328/527, loss: 7.654158980585635e-05 2023-01-22 22:13:23.424330: step: 332/527, loss: 3.666698103188537e-05 2023-01-22 22:13:24.463134: step: 336/527, loss: 0.007356188725680113 2023-01-22 22:13:25.515430: step: 340/527, loss: 0.0016076359897851944 2023-01-22 22:13:26.557021: step: 344/527, loss: 0.0037754394579678774 2023-01-22 22:13:27.619871: step: 348/527, loss: 0.006621331907808781 2023-01-22 22:13:28.669563: step: 352/527, loss: 6.784845754737034e-05 2023-01-22 22:13:29.717513: step: 356/527, loss: 0.009478322230279446 2023-01-22 22:13:30.757852: step: 360/527, loss: 6.330499218165642e-06 2023-01-22 22:13:31.796197: step: 364/527, loss: 0.013975173234939575 2023-01-22 22:13:32.852572: step: 368/527, loss: 0.0041032824665308 2023-01-22 22:13:33.906052: step: 372/527, loss: 
0.012271531857550144 2023-01-22 22:13:34.974320: step: 376/527, loss: 0.0003107993397861719 2023-01-22 22:13:36.032320: step: 380/527, loss: 0.0013480093330144882 2023-01-22 22:13:37.073869: step: 384/527, loss: 0.0015534843550994992 2023-01-22 22:13:38.113395: step: 388/527, loss: 0.0010218905517831445 2023-01-22 22:13:39.179865: step: 392/527, loss: 0.0012761206598952413 2023-01-22 22:13:40.232483: step: 396/527, loss: 0.005255566444247961 2023-01-22 22:13:41.283140: step: 400/527, loss: 0.0009776563383638859 2023-01-22 22:13:42.349990: step: 404/527, loss: 0.0006973003037273884 2023-01-22 22:13:43.401384: step: 408/527, loss: 0.0029390431009233 2023-01-22 22:13:44.445464: step: 412/527, loss: 0.0001560229720780626 2023-01-22 22:13:45.499593: step: 416/527, loss: 0.0025680658873170614 2023-01-22 22:13:46.551764: step: 420/527, loss: 0.010235791094601154 2023-01-22 22:13:47.623044: step: 424/527, loss: 0.0027746185660362244 2023-01-22 22:13:48.678533: step: 428/527, loss: 0.01127717923372984 2023-01-22 22:13:49.737933: step: 432/527, loss: 0.009032091125845909 2023-01-22 22:13:50.776643: step: 436/527, loss: 0.0017111834604293108 2023-01-22 22:13:51.827682: step: 440/527, loss: 0.00027508108178153634 2023-01-22 22:13:52.887350: step: 444/527, loss: 0.004775597248226404 2023-01-22 22:13:53.938005: step: 448/527, loss: 0.014230608940124512 2023-01-22 22:13:55.006599: step: 452/527, loss: 0.0013496255269274116 2023-01-22 22:13:56.054229: step: 456/527, loss: 0.012655236758291721 2023-01-22 22:13:57.104711: step: 460/527, loss: 0.0008285652147606015 2023-01-22 22:13:58.167015: step: 464/527, loss: 0.0006113981362432241 2023-01-22 22:13:59.208028: step: 468/527, loss: 3.422592271817848e-05 2023-01-22 22:14:00.282779: step: 472/527, loss: 0.012131537310779095 2023-01-22 22:14:01.340738: step: 476/527, loss: 0.002359824487939477 2023-01-22 22:14:02.398719: step: 480/527, loss: 0.002012055367231369 2023-01-22 22:14:03.429157: step: 484/527, loss: 0.0 2023-01-22 
22:14:04.501594: step: 488/527, loss: 0.00022573393653146923 2023-01-22 22:14:05.560258: step: 492/527, loss: 0.0004634474462363869 2023-01-22 22:14:06.629424: step: 496/527, loss: 0.0024044073652476072 2023-01-22 22:14:07.669178: step: 500/527, loss: 0.002115800743922591 2023-01-22 22:14:08.721889: step: 504/527, loss: 0.0043561081402003765 2023-01-22 22:14:09.785612: step: 508/527, loss: 0.0012777147348970175 2023-01-22 22:14:10.837666: step: 512/527, loss: 0.0017793085426092148 2023-01-22 22:14:11.886018: step: 516/527, loss: 0.028184030205011368 2023-01-22 22:14:12.940319: step: 520/527, loss: 0.007247556932270527 2023-01-22 22:14:13.990528: step: 524/527, loss: 8.568251359974965e-05 2023-01-22 22:14:15.046988: step: 528/527, loss: 0.003138698171824217 2023-01-22 22:14:16.106332: step: 532/527, loss: 0.0005081011913716793 2023-01-22 22:14:17.155336: step: 536/527, loss: 0.0 2023-01-22 22:14:18.213504: step: 540/527, loss: 0.0001450574054615572 2023-01-22 22:14:19.302426: step: 544/527, loss: 0.0062448144890367985 2023-01-22 22:14:20.357338: step: 548/527, loss: 0.007234330754727125 2023-01-22 22:14:21.409524: step: 552/527, loss: 0.001055950648151338 2023-01-22 22:14:22.469989: step: 556/527, loss: 0.00021663459483534098 2023-01-22 22:14:23.526041: step: 560/527, loss: 0.0018378297099843621 2023-01-22 22:14:24.585848: step: 564/527, loss: 0.0019507826073095202 2023-01-22 22:14:25.656140: step: 568/527, loss: 0.0020837734919041395 2023-01-22 22:14:26.706405: step: 572/527, loss: 0.003399675013497472 2023-01-22 22:14:27.753719: step: 576/527, loss: 0.0036772252060472965 2023-01-22 22:14:28.815409: step: 580/527, loss: 0.00118359609041363 2023-01-22 22:14:29.867797: step: 584/527, loss: 0.00029811804415658116 2023-01-22 22:14:30.919794: step: 588/527, loss: 0.002658726880326867 2023-01-22 22:14:31.976924: step: 592/527, loss: 0.01457090862095356 2023-01-22 22:14:33.047881: step: 596/527, loss: 0.004640995059162378 2023-01-22 22:14:34.101909: step: 600/527, loss: 
0.0015015759272500873 2023-01-22 22:14:35.157971: step: 604/527, loss: 0.0032119008246809244 2023-01-22 22:14:36.203692: step: 608/527, loss: 0.0036756584886461496 2023-01-22 22:14:37.251502: step: 612/527, loss: 0.000626731722149998 2023-01-22 22:14:38.313981: step: 616/527, loss: 0.00037523164064623415 2023-01-22 22:14:39.366481: step: 620/527, loss: 0.0007973259780555964 2023-01-22 22:14:40.434755: step: 624/527, loss: 0.0056456513702869415 2023-01-22 22:14:41.490955: step: 628/527, loss: 0.026532448828220367 2023-01-22 22:14:42.540421: step: 632/527, loss: 0.0005014989874325693 2023-01-22 22:14:43.619476: step: 636/527, loss: 0.0015794503269717097 2023-01-22 22:14:44.682855: step: 640/527, loss: 0.0008615500410087407 2023-01-22 22:14:45.735606: step: 644/527, loss: 0.006188173778355122 2023-01-22 22:14:46.786071: step: 648/527, loss: 0.0024688735138624907 2023-01-22 22:14:47.842936: step: 652/527, loss: 0.00021416146773844957 2023-01-22 22:14:48.898309: step: 656/527, loss: 0.005300307646393776 2023-01-22 22:14:49.955607: step: 660/527, loss: 0.012360951863229275 2023-01-22 22:14:50.997495: step: 664/527, loss: 0.00018883055599872023 2023-01-22 22:14:52.029678: step: 668/527, loss: 0.001723723253235221 2023-01-22 22:14:53.084509: step: 672/527, loss: 0.0009996925946325064 2023-01-22 22:14:54.136323: step: 676/527, loss: 0.0186931025236845 2023-01-22 22:14:55.207869: step: 680/527, loss: 0.006850207690149546 2023-01-22 22:14:56.259188: step: 684/527, loss: 0.0041246963664889336 2023-01-22 22:14:57.304776: step: 688/527, loss: 0.00033193855779245496 2023-01-22 22:14:58.344327: step: 692/527, loss: 0.001357655506581068 2023-01-22 22:14:59.392772: step: 696/527, loss: 0.00399990938603878 2023-01-22 22:15:00.451878: step: 700/527, loss: 0.00022381976305041462 2023-01-22 22:15:01.496319: step: 704/527, loss: 0.00813576765358448 2023-01-22 22:15:02.551372: step: 708/527, loss: 6.659329665126279e-05 2023-01-22 22:15:03.614220: step: 712/527, loss: 0.004132091533392668 
2023-01-22 22:15:04.666733: step: 716/527, loss: 0.004225397016853094 2023-01-22 22:15:05.714017: step: 720/527, loss: 0.0027986590284854174 2023-01-22 22:15:06.752478: step: 724/527, loss: 0.005259071476757526 2023-01-22 22:15:07.782835: step: 728/527, loss: 0.004428376909345388 2023-01-22 22:15:08.848684: step: 732/527, loss: 0.0061376020312309265 2023-01-22 22:15:09.885821: step: 736/527, loss: 0.00029465576517395675 2023-01-22 22:15:10.940039: step: 740/527, loss: 0.003344725351780653 2023-01-22 22:15:11.979297: step: 744/527, loss: 0.007926919497549534 2023-01-22 22:15:13.041931: step: 748/527, loss: 0.00021887487673666328 2023-01-22 22:15:14.096725: step: 752/527, loss: 6.21903091087006e-05 2023-01-22 22:15:15.149775: step: 756/527, loss: 0.0005317451432347298 2023-01-22 22:15:16.202636: step: 760/527, loss: 0.002144468016922474 2023-01-22 22:15:17.263453: step: 764/527, loss: 0.0006259052315726876 2023-01-22 22:15:18.324846: step: 768/527, loss: 0.02363005466759205 2023-01-22 22:15:19.380125: step: 772/527, loss: 0.0034788320772349834 2023-01-22 22:15:20.450068: step: 776/527, loss: 0.0005750986165367067 2023-01-22 22:15:21.498263: step: 780/527, loss: 0.010245602577924728 2023-01-22 22:15:22.554421: step: 784/527, loss: 0.001477353274822235 2023-01-22 22:15:23.617797: step: 788/527, loss: 0.004400107078254223 2023-01-22 22:15:24.658988: step: 792/527, loss: 0.03304194286465645 2023-01-22 22:15:25.711742: step: 796/527, loss: 0.001029366278089583 2023-01-22 22:15:26.769231: step: 800/527, loss: 0.0012383551802486181 2023-01-22 22:15:27.822922: step: 804/527, loss: 0.008803913369774818 2023-01-22 22:15:28.868880: step: 808/527, loss: 0.00018131342949345708 2023-01-22 22:15:29.920898: step: 812/527, loss: 0.005063209682703018 2023-01-22 22:15:30.969045: step: 816/527, loss: 0.0001871378335636109 2023-01-22 22:15:32.014034: step: 820/527, loss: 0.0017088191816583276 2023-01-22 22:15:33.046772: step: 824/527, loss: 0.000313976634060964 2023-01-22 
22:15:34.126339: step: 828/527, loss: 8.94362383405678e-05 2023-01-22 22:15:35.196272: step: 832/527, loss: 0.0009407568722963333 2023-01-22 22:15:36.248700: step: 836/527, loss: 0.004834346938878298 2023-01-22 22:15:37.299597: step: 840/527, loss: 2.4350756575586274e-05 2023-01-22 22:15:38.343714: step: 844/527, loss: 0.006917926017194986 2023-01-22 22:15:39.400564: step: 848/527, loss: 0.0005837543285451829 2023-01-22 22:15:40.459723: step: 852/527, loss: 0.0008525578887201846 2023-01-22 22:15:41.520161: step: 856/527, loss: 0.027686726301908493 2023-01-22 22:15:42.580569: step: 860/527, loss: 0.0026730350218713284 2023-01-22 22:15:43.643667: step: 864/527, loss: 0.004414840135723352 2023-01-22 22:15:44.699651: step: 868/527, loss: 0.00095394003437832 2023-01-22 22:15:45.749278: step: 872/527, loss: 0.001670580473728478 2023-01-22 22:15:46.813282: step: 876/527, loss: 0.00013169716112315655 2023-01-22 22:15:47.874363: step: 880/527, loss: 0.0005274789873510599 2023-01-22 22:15:48.922688: step: 884/527, loss: 0.00881276000291109 2023-01-22 22:15:49.980637: step: 888/527, loss: 2.3202443117043003e-05 2023-01-22 22:15:51.019933: step: 892/527, loss: 0.0013199002714827657 2023-01-22 22:15:52.077476: step: 896/527, loss: 0.0034865480847656727 2023-01-22 22:15:53.130115: step: 900/527, loss: 2.6759513275464997e-05 2023-01-22 22:15:54.169307: step: 904/527, loss: 0.00742027023807168 2023-01-22 22:15:55.212539: step: 908/527, loss: 0.0054139369167387486 2023-01-22 22:15:56.263540: step: 912/527, loss: 0.0008490922045893967 2023-01-22 22:15:57.319577: step: 916/527, loss: 0.0002286379021825269 2023-01-22 22:15:58.379088: step: 920/527, loss: 0.005094374530017376 2023-01-22 22:15:59.435340: step: 924/527, loss: 0.0010047341929748654 2023-01-22 22:16:00.489610: step: 928/527, loss: 0.0021650586277246475 2023-01-22 22:16:01.537377: step: 932/527, loss: 0.0035606506280601025 2023-01-22 22:16:02.590025: step: 936/527, loss: 0.0003091662365477532 2023-01-22 22:16:03.644621: 
step: 940/527, loss: 0.003999793436378241 2023-01-22 22:16:04.688249: step: 944/527, loss: 0.027652982622385025 2023-01-22 22:16:05.745653: step: 948/527, loss: 0.0031255122739821672 2023-01-22 22:16:06.782376: step: 952/527, loss: 0.0034520861227065325 2023-01-22 22:16:07.840775: step: 956/527, loss: 0.0015175495063886046 2023-01-22 22:16:08.887394: step: 960/527, loss: 0.003083609975874424 2023-01-22 22:16:09.948941: step: 964/527, loss: 0.0034182965755462646 2023-01-22 22:16:11.008317: step: 968/527, loss: 0.0019579913932830095 2023-01-22 22:16:12.055494: step: 972/527, loss: 0.003388741984963417 2023-01-22 22:16:13.116839: step: 976/527, loss: 0.0004289007920306176 2023-01-22 22:16:14.159928: step: 980/527, loss: 0.008413110859692097 2023-01-22 22:16:15.198774: step: 984/527, loss: 0.033625949174165726 2023-01-22 22:16:16.240907: step: 988/527, loss: 0.0008374184253625572 2023-01-22 22:16:17.278921: step: 992/527, loss: 7.134010957088321e-05 2023-01-22 22:16:18.325989: step: 996/527, loss: 0.0016638896195217967 2023-01-22 22:16:19.379419: step: 1000/527, loss: 0.01010317075997591 2023-01-22 22:16:20.438893: step: 1004/527, loss: 0.0029414936434477568 2023-01-22 22:16:21.485651: step: 1008/527, loss: 0.004308727104216814 2023-01-22 22:16:22.539430: step: 1012/527, loss: 0.0016412204131484032 2023-01-22 22:16:23.606435: step: 1016/527, loss: 0.007245340384542942 2023-01-22 22:16:24.677074: step: 1020/527, loss: 0.00031418513390235603 2023-01-22 22:16:25.717885: step: 1024/527, loss: 0.00016337065608240664 2023-01-22 22:16:26.751142: step: 1028/527, loss: 0.002298986306414008 2023-01-22 22:16:27.812188: step: 1032/527, loss: 0.0015857777325436473 2023-01-22 22:16:28.874057: step: 1036/527, loss: 0.029947228729724884 2023-01-22 22:16:29.946950: step: 1040/527, loss: 0.004765619989484549 2023-01-22 22:16:31.000299: step: 1044/527, loss: 0.009243758395314217 2023-01-22 22:16:32.046499: step: 1048/527, loss: 0.001570431748405099 2023-01-22 22:16:33.093231: step: 
1052/527, loss: 0.0003034933761227876 2023-01-22 22:16:34.126962: step: 1056/527, loss: 0.0003016614937223494 2023-01-22 22:16:35.191165: step: 1060/527, loss: 0.00451872032135725 2023-01-22 22:16:36.245979: step: 1064/527, loss: 0.004765608813613653 2023-01-22 22:16:37.307757: step: 1068/527, loss: 0.009984379634261131 2023-01-22 22:16:38.352517: step: 1072/527, loss: 0.011205597780644894 2023-01-22 22:16:39.416618: step: 1076/527, loss: 0.0015372848138213158 2023-01-22 22:16:40.471418: step: 1080/527, loss: 0.0028527549002319574 2023-01-22 22:16:41.542201: step: 1084/527, loss: 0.009391309693455696 2023-01-22 22:16:42.590228: step: 1088/527, loss: 0.0016450384864583611 2023-01-22 22:16:43.633723: step: 1092/527, loss: 0.0005399346118792892 2023-01-22 22:16:44.696922: step: 1096/527, loss: 0.0003490214003250003 2023-01-22 22:16:45.738896: step: 1100/527, loss: 0.010700157843530178 2023-01-22 22:16:46.810319: step: 1104/527, loss: 0.004406277555972338 2023-01-22 22:16:47.867337: step: 1108/527, loss: 0.00541723845526576 2023-01-22 22:16:48.923582: step: 1112/527, loss: 1.115364739234792e-05 2023-01-22 22:16:49.973729: step: 1116/527, loss: 0.0011980623239651322 2023-01-22 22:16:51.020537: step: 1120/527, loss: 0.009564312174916267 2023-01-22 22:16:52.085365: step: 1124/527, loss: 0.006887249648571014 2023-01-22 22:16:53.154057: step: 1128/527, loss: 0.0001486936234869063 2023-01-22 22:16:54.222577: step: 1132/527, loss: 0.002974584000185132 2023-01-22 22:16:55.279409: step: 1136/527, loss: 0.0036571312230080366 2023-01-22 22:16:56.332614: step: 1140/527, loss: 0.002105077961459756 2023-01-22 22:16:57.403910: step: 1144/527, loss: 0.00019063902436755598 2023-01-22 22:16:58.463667: step: 1148/527, loss: 0.006378160789608955 2023-01-22 22:16:59.512722: step: 1152/527, loss: 0.0002467613376211375 2023-01-22 22:17:00.551310: step: 1156/527, loss: 0.0039273472502827644 2023-01-22 22:17:01.601439: step: 1160/527, loss: 0.003809248795732856 2023-01-22 22:17:02.668729: 
step: 1164/527, loss: 0.011625951156020164 2023-01-22 22:17:03.710526: step: 1168/527, loss: 0.003630966879427433 2023-01-22 22:17:04.774425: step: 1172/527, loss: 0.008686323650181293 2023-01-22 22:17:05.830766: step: 1176/527, loss: 0.0007059941999614239 2023-01-22 22:17:06.883494: step: 1180/527, loss: 2.5776464099180885e-05 2023-01-22 22:17:07.943009: step: 1184/527, loss: 0.0026349988766014576 2023-01-22 22:17:08.982313: step: 1188/527, loss: 0.001742327818647027 2023-01-22 22:17:10.022625: step: 1192/527, loss: 2.044104576270911e-06 2023-01-22 22:17:11.085078: step: 1196/527, loss: 0.0019598728977143764 2023-01-22 22:17:12.128809: step: 1200/527, loss: 0.0006487110513262451 2023-01-22 22:17:13.172338: step: 1204/527, loss: 0.000515613064635545 2023-01-22 22:17:14.236041: step: 1208/527, loss: 0.003136194311082363 2023-01-22 22:17:15.283111: step: 1212/527, loss: 0.000336171971866861 2023-01-22 22:17:16.350904: step: 1216/527, loss: 0.028238510712981224 2023-01-22 22:17:17.389775: step: 1220/527, loss: 0.0007461044006049633 2023-01-22 22:17:18.446982: step: 1224/527, loss: 0.00046366697642952204 2023-01-22 22:17:19.488031: step: 1228/527, loss: 0.0033446340821683407 2023-01-22 22:17:20.565487: step: 1232/527, loss: 0.0006776587688364089 2023-01-22 22:17:21.609044: step: 1236/527, loss: 0.008448583073914051 2023-01-22 22:17:22.657416: step: 1240/527, loss: 0.004112632479518652 2023-01-22 22:17:23.700327: step: 1244/527, loss: 0.0035975042264908552 2023-01-22 22:17:24.742682: step: 1248/527, loss: 0.007639944553375244 2023-01-22 22:17:25.799170: step: 1252/527, loss: 0.00027725001564249396 2023-01-22 22:17:26.847543: step: 1256/527, loss: 0.0019933138974010944 2023-01-22 22:17:27.910123: step: 1260/527, loss: 0.005885405000299215 2023-01-22 22:17:28.961535: step: 1264/527, loss: 0.00020798925834242254 2023-01-22 22:17:30.025871: step: 1268/527, loss: 0.0014467276632785797 2023-01-22 22:17:31.075393: step: 1272/527, loss: 0.0012812019558623433 2023-01-22 
22:17:32.123524: step: 1276/527, loss: 0.0016509346896782517 2023-01-22 22:17:33.170433: step: 1280/527, loss: 0.029428355395793915 2023-01-22 22:17:34.224585: step: 1284/527, loss: 0.0003505543863866478 2023-01-22 22:17:35.289760: step: 1288/527, loss: 0.003971409518271685 2023-01-22 22:17:36.336859: step: 1292/527, loss: 0.0041485680267214775 2023-01-22 22:17:37.385878: step: 1296/527, loss: 6.258472421905026e-05 2023-01-22 22:17:38.453351: step: 1300/527, loss: 0.0056670717895030975 2023-01-22 22:17:39.505338: step: 1304/527, loss: 0.0008777391631156206 2023-01-22 22:17:40.559297: step: 1308/527, loss: 0.006410645321011543 2023-01-22 22:17:41.606873: step: 1312/527, loss: 0.001781641156412661 2023-01-22 22:17:42.664729: step: 1316/527, loss: 0.002438068389892578 2023-01-22 22:17:43.716519: step: 1320/527, loss: 0.002560111228376627 2023-01-22 22:17:44.762962: step: 1324/527, loss: 0.0008724459912627935 2023-01-22 22:17:45.817639: step: 1328/527, loss: 0.0013217119267210364 2023-01-22 22:17:46.874905: step: 1332/527, loss: 0.00022965823882259429 2023-01-22 22:17:47.923366: step: 1336/527, loss: 0.00047675950918346643 2023-01-22 22:17:48.965232: step: 1340/527, loss: 0.0002364874235354364 2023-01-22 22:17:50.030201: step: 1344/527, loss: 0.012648622505366802 2023-01-22 22:17:51.086217: step: 1348/527, loss: 0.009509393014013767 2023-01-22 22:17:52.143469: step: 1352/527, loss: 0.005710206925868988 2023-01-22 22:17:53.191656: step: 1356/527, loss: 0.005535817239433527 2023-01-22 22:17:54.263923: step: 1360/527, loss: 0.004796273075044155 2023-01-22 22:17:55.310620: step: 1364/527, loss: 0.02941475808620453 2023-01-22 22:17:56.341117: step: 1368/527, loss: 0.0031732227653265 2023-01-22 22:17:57.409009: step: 1372/527, loss: 0.009113037027418613 2023-01-22 22:17:58.461847: step: 1376/527, loss: 0.00972342025488615 2023-01-22 22:17:59.520807: step: 1380/527, loss: 0.004497257526963949 2023-01-22 22:18:00.584827: step: 1384/527, loss: 0.010309236124157906 2023-01-22 
22:18:01.635012: step: 1388/527, loss: 0.0024991300888359547 2023-01-22 22:18:02.697898: step: 1392/527, loss: 0.000653493101708591 2023-01-22 22:18:03.751584: step: 1396/527, loss: 0.003460506908595562 2023-01-22 22:18:04.800930: step: 1400/527, loss: 0.001133756130002439 2023-01-22 22:18:05.840249: step: 1404/527, loss: 0.0015325575368478894 2023-01-22 22:18:06.895016: step: 1408/527, loss: 0.00836519617587328 2023-01-22 22:18:07.951288: step: 1412/527, loss: 0.002205162076279521 2023-01-22 22:18:09.032706: step: 1416/527, loss: 0.08566229045391083 2023-01-22 22:18:10.091684: step: 1420/527, loss: 0.007607208099216223 2023-01-22 22:18:11.140877: step: 1424/527, loss: 0.0006425183382816613 2023-01-22 22:18:12.201595: step: 1428/527, loss: 0.001953200437128544 2023-01-22 22:18:13.258232: step: 1432/527, loss: 0.0045538912527263165 2023-01-22 22:18:14.301895: step: 1436/527, loss: 0.00012479489669203758 2023-01-22 22:18:15.353086: step: 1440/527, loss: 0.005342176649719477 2023-01-22 22:18:16.414945: step: 1444/527, loss: 0.009013012051582336 2023-01-22 22:18:17.455363: step: 1448/527, loss: 0.004276537336409092 2023-01-22 22:18:18.525376: step: 1452/527, loss: 0.005824533756822348 2023-01-22 22:18:19.569042: step: 1456/527, loss: 0.0006591712008230388 2023-01-22 22:18:20.629905: step: 1460/527, loss: 0.005048574414104223 2023-01-22 22:18:21.682888: step: 1464/527, loss: 0.004408989567309618 2023-01-22 22:18:22.713967: step: 1468/527, loss: 0.0023100601974874735 2023-01-22 22:18:23.756246: step: 1472/527, loss: 0.018858449533581734 2023-01-22 22:18:24.803026: step: 1476/527, loss: 0.0025698766112327576 2023-01-22 22:18:25.856526: step: 1480/527, loss: 0.004582853987812996 2023-01-22 22:18:26.897403: step: 1484/527, loss: 0.0024459497071802616 2023-01-22 22:18:27.944053: step: 1488/527, loss: 0.001870372798293829 2023-01-22 22:18:29.023585: step: 1492/527, loss: 0.0007779135485179722 2023-01-22 22:18:30.099836: step: 1496/527, loss: 0.0033443497959524393 2023-01-22 
22:18:31.137265: step: 1500/527, loss: 0.0025116491597145796 2023-01-22 22:18:32.178168: step: 1504/527, loss: 0.006434546783566475 2023-01-22 22:18:33.222821: step: 1508/527, loss: 0.002134869573637843 2023-01-22 22:18:34.272544: step: 1512/527, loss: 0.005965932738035917 2023-01-22 22:18:35.317099: step: 1516/527, loss: 0.0032697084825485945 2023-01-22 22:18:36.353657: step: 1520/527, loss: 0.011707708239555359 2023-01-22 22:18:37.402214: step: 1524/527, loss: 0.003924786113202572 2023-01-22 22:18:38.442458: step: 1528/527, loss: 0.005687150172889233 2023-01-22 22:18:39.485290: step: 1532/527, loss: 0.0015920015284791589 2023-01-22 22:18:40.547109: step: 1536/527, loss: 0.015378007665276527 2023-01-22 22:18:41.609848: step: 1540/527, loss: 0.004583601839840412 2023-01-22 22:18:42.657522: step: 1544/527, loss: 0.0028360383585095406 2023-01-22 22:18:43.705286: step: 1548/527, loss: 0.0021851013880223036 2023-01-22 22:18:44.745783: step: 1552/527, loss: 0.003037082962691784 2023-01-22 22:18:45.812235: step: 1556/527, loss: 0.012514110654592514 2023-01-22 22:18:46.871154: step: 1560/527, loss: 0.0019110996508970857 2023-01-22 22:18:47.922447: step: 1564/527, loss: 0.0006105828797444701 2023-01-22 22:18:48.968714: step: 1568/527, loss: 0.0004732580855488777 2023-01-22 22:18:50.022087: step: 1572/527, loss: 0.003760196967050433 2023-01-22 22:18:51.075338: step: 1576/527, loss: 0.003763664746657014 2023-01-22 22:18:52.121127: step: 1580/527, loss: 0.0012006523320451379 2023-01-22 22:18:53.171685: step: 1584/527, loss: 0.00037941866321489215 2023-01-22 22:18:54.231510: step: 1588/527, loss: 0.006562489550560713 2023-01-22 22:18:55.265252: step: 1592/527, loss: 0.03608640655875206 2023-01-22 22:18:56.308355: step: 1596/527, loss: 0.0073776342906057835 2023-01-22 22:18:57.364485: step: 1600/527, loss: 0.005907274316996336 2023-01-22 22:18:58.399627: step: 1604/527, loss: 0.0006821455899626017 2023-01-22 22:18:59.467964: step: 1608/527, loss: 0.01251176092773676 2023-01-22 
22:19:00.512755: step: 1612/527, loss: 0.0009520926978439093 2023-01-22 22:19:01.562492: step: 1616/527, loss: 0.0037912847474217415 2023-01-22 22:19:02.610761: step: 1620/527, loss: 8.045620779739693e-05 2023-01-22 22:19:03.667024: step: 1624/527, loss: 0.006598693784326315 2023-01-22 22:19:04.707072: step: 1628/527, loss: 0.0012025644537061453 2023-01-22 22:19:05.743075: step: 1632/527, loss: 4.749703293782659e-05 2023-01-22 22:19:06.788108: step: 1636/527, loss: 0.0009005185565911233 2023-01-22 22:19:07.834237: step: 1640/527, loss: 0.0019350543152540922 2023-01-22 22:19:08.881569: step: 1644/527, loss: 0.00863682385534048 2023-01-22 22:19:09.946742: step: 1648/527, loss: 0.0019152669701725245 2023-01-22 22:19:10.994695: step: 1652/527, loss: 0.0007904738886281848 2023-01-22 22:19:12.058029: step: 1656/527, loss: 0.004350421484559774 2023-01-22 22:19:13.119491: step: 1660/527, loss: 0.006533185951411724 2023-01-22 22:19:14.162232: step: 1664/527, loss: 0.00529212411493063 2023-01-22 22:19:15.208824: step: 1668/527, loss: 0.007193727884441614 2023-01-22 22:19:16.254230: step: 1672/527, loss: 0.0003473594842944294 2023-01-22 22:19:17.300597: step: 1676/527, loss: 0.014090189710259438 2023-01-22 22:19:18.350741: step: 1680/527, loss: 0.0037139994092285633 2023-01-22 22:19:19.404990: step: 1684/527, loss: 0.0020392511505633593 2023-01-22 22:19:20.453860: step: 1688/527, loss: 0.003827287582680583 2023-01-22 22:19:21.508350: step: 1692/527, loss: 0.003518494078889489 2023-01-22 22:19:22.551511: step: 1696/527, loss: 0.0015186556847766042 2023-01-22 22:19:23.614538: step: 1700/527, loss: 0.003926885314285755 2023-01-22 22:19:24.657101: step: 1704/527, loss: 0.002102620666846633 2023-01-22 22:19:25.694736: step: 1708/527, loss: 0.004677518270909786 2023-01-22 22:19:26.740125: step: 1712/527, loss: 0.017792485654354095 2023-01-22 22:19:27.808383: step: 1716/527, loss: 0.0030138115398585796 2023-01-22 22:19:28.854100: step: 1720/527, loss: 0.0008383300737477839 
2023-01-22 22:19:29.899471: step: 1724/527, loss: 0.005080461036413908 2023-01-22 22:19:30.941199: step: 1728/527, loss: 0.0003374902589712292 2023-01-22 22:19:32.002731: step: 1732/527, loss: 0.006319050677120686 2023-01-22 22:19:33.052810: step: 1736/527, loss: 0.00012671062722802162 2023-01-22 22:19:34.089481: step: 1740/527, loss: 0.0003456561535131186 2023-01-22 22:19:35.141633: step: 1744/527, loss: 0.042993925511837006 2023-01-22 22:19:36.176913: step: 1748/527, loss: 0.008083181455731392 2023-01-22 22:19:37.231482: step: 1752/527, loss: 0.003613003296777606 2023-01-22 22:19:38.264465: step: 1756/527, loss: 0.006234684959053993 2023-01-22 22:19:39.321768: step: 1760/527, loss: 0.006225149147212505 2023-01-22 22:19:40.355419: step: 1764/527, loss: 0.005681393668055534 2023-01-22 22:19:41.408623: step: 1768/527, loss: 0.0005418686778284609 2023-01-22 22:19:42.469274: step: 1772/527, loss: 0.0031710597686469555 2023-01-22 22:19:43.515615: step: 1776/527, loss: 0.0025270304176956415 2023-01-22 22:19:44.570012: step: 1780/527, loss: 0.0029379629995673895 2023-01-22 22:19:45.623104: step: 1784/527, loss: 0.004091819282621145 2023-01-22 22:19:46.668466: step: 1788/527, loss: 0.004152192268520594 2023-01-22 22:19:47.702983: step: 1792/527, loss: 0.0006632282165810466 2023-01-22 22:19:48.756160: step: 1796/527, loss: 0.006866875104606152 2023-01-22 22:19:49.800414: step: 1800/527, loss: 0.005559556186199188 2023-01-22 22:19:50.849553: step: 1804/527, loss: 0.000458006834378466 2023-01-22 22:19:51.886961: step: 1808/527, loss: 0.0033429362811148167 2023-01-22 22:19:52.930993: step: 1812/527, loss: 0.009467796422541142 2023-01-22 22:19:53.968160: step: 1816/527, loss: 0.004855016712099314 2023-01-22 22:19:55.018834: step: 1820/527, loss: 0.007009300868958235 2023-01-22 22:19:56.080669: step: 1824/527, loss: 0.002213296014815569 2023-01-22 22:19:57.144391: step: 1828/527, loss: 0.036922577768564224 2023-01-22 22:19:58.196184: step: 1832/527, loss: 0.0022862209007143974 
2023-01-22 22:19:59.242788: step: 1836/527, loss: 0.0054118698462843895 2023-01-22 22:20:00.297443: step: 1840/527, loss: 0.004481484182178974 2023-01-22 22:20:01.349514: step: 1844/527, loss: 0.002876381389796734 2023-01-22 22:20:02.400933: step: 1848/527, loss: 0.004492246545851231 2023-01-22 22:20:03.448190: step: 1852/527, loss: 5.41856097697746e-05 2023-01-22 22:20:04.489987: step: 1856/527, loss: 0.004717900417745113 2023-01-22 22:20:05.536949: step: 1860/527, loss: 0.0037742636632174253 2023-01-22 22:20:06.572893: step: 1864/527, loss: 0.0008915414218790829 2023-01-22 22:20:07.620152: step: 1868/527, loss: 0.0014861089875921607 2023-01-22 22:20:08.663926: step: 1872/527, loss: 0.0006288950680755079 2023-01-22 22:20:09.697427: step: 1876/527, loss: 0.005388900637626648 2023-01-22 22:20:10.751270: step: 1880/527, loss: 0.003864086465910077 2023-01-22 22:20:11.793924: step: 1884/527, loss: 0.006012776400893927 2023-01-22 22:20:12.867068: step: 1888/527, loss: 0.019474459812045097 2023-01-22 22:20:13.914708: step: 1892/527, loss: 0.004927025642246008 2023-01-22 22:20:14.954468: step: 1896/527, loss: 0.0017890379531309009 2023-01-22 22:20:16.010524: step: 1900/527, loss: 0.0036250848788768053 2023-01-22 22:20:17.056084: step: 1904/527, loss: 0.0021174089051783085 2023-01-22 22:20:18.105627: step: 1908/527, loss: 0.004929032642394304 2023-01-22 22:20:19.143554: step: 1912/527, loss: 0.013725148513913155 2023-01-22 22:20:20.191224: step: 1916/527, loss: 0.011531383730471134 2023-01-22 22:20:21.240693: step: 1920/527, loss: 0.003347012447193265 2023-01-22 22:20:22.279708: step: 1924/527, loss: 0.009026437066495419 2023-01-22 22:20:23.334943: step: 1928/527, loss: 0.0001847794046625495 2023-01-22 22:20:24.384592: step: 1932/527, loss: 0.015622053295373917 2023-01-22 22:20:25.434789: step: 1936/527, loss: 0.0037039590533822775 2023-01-22 22:20:26.480929: step: 1940/527, loss: 0.005754625424742699 2023-01-22 22:20:27.518662: step: 1944/527, loss: 0.004106899257749319 
2023-01-22 22:20:28.555120: step: 1948/527, loss: 0.0010922342771664262 2023-01-22 22:20:29.613325: step: 1952/527, loss: 0.011303143575787544 2023-01-22 22:20:30.658918: step: 1956/527, loss: 0.0007308688363991678 2023-01-22 22:20:31.719641: step: 1960/527, loss: 0.0007425081566907465 2023-01-22 22:20:32.775958: step: 1964/527, loss: 0.0027463282458484173 2023-01-22 22:20:33.815953: step: 1968/527, loss: 0.0021160945761948824 2023-01-22 22:20:34.861142: step: 1972/527, loss: 1.5059528095662245e-06 2023-01-22 22:20:35.915725: step: 1976/527, loss: 0.0024889616761356592 2023-01-22 22:20:36.975450: step: 1980/527, loss: 0.062308456748723984 2023-01-22 22:20:38.015343: step: 1984/527, loss: 0.0007120163645595312 2023-01-22 22:20:39.059912: step: 1988/527, loss: 0.000993556110188365 2023-01-22 22:20:40.106649: step: 1992/527, loss: 0.0003347903548274189 2023-01-22 22:20:41.143828: step: 1996/527, loss: 0.0061900257132947445 2023-01-22 22:20:42.198869: step: 2000/527, loss: 0.012815129943192005 2023-01-22 22:20:43.234470: step: 2004/527, loss: 2.3776488887961023e-05 2023-01-22 22:20:44.300631: step: 2008/527, loss: 0.013798339292407036 2023-01-22 22:20:45.367055: step: 2012/527, loss: 0.001623197109438479 2023-01-22 22:20:46.412605: step: 2016/527, loss: 0.000656258431263268 2023-01-22 22:20:47.495347: step: 2020/527, loss: 1.8884347809944302e-05 2023-01-22 22:20:48.545606: step: 2024/527, loss: 0.0037832241505384445 2023-01-22 22:20:49.578065: step: 2028/527, loss: 0.00029103129054419696 2023-01-22 22:20:50.647365: step: 2032/527, loss: 0.001067739212885499 2023-01-22 22:20:51.698084: step: 2036/527, loss: 0.01484745554625988 2023-01-22 22:20:52.728260: step: 2040/527, loss: 0.005156759172677994 2023-01-22 22:20:53.795931: step: 2044/527, loss: 0.004316109698265791 2023-01-22 22:20:54.844110: step: 2048/527, loss: 0.0006669743452221155 2023-01-22 22:20:55.889099: step: 2052/527, loss: 0.007700266782194376 2023-01-22 22:20:56.968948: step: 2056/527, loss: 
0.003209709655493498 2023-01-22 22:20:58.016029: step: 2060/527, loss: 0.002674531890079379 2023-01-22 22:20:59.055197: step: 2064/527, loss: 0.002433930989354849 2023-01-22 22:21:00.115624: step: 2068/527, loss: 0.0015503098256886005 2023-01-22 22:21:01.156041: step: 2072/527, loss: 0.004364384338259697 2023-01-22 22:21:02.226478: step: 2076/527, loss: 0.006797038484364748 2023-01-22 22:21:03.265017: step: 2080/527, loss: 7.489960262319073e-05 2023-01-22 22:21:04.320758: step: 2084/527, loss: 0.010618247091770172 2023-01-22 22:21:05.366938: step: 2088/527, loss: 0.00022566843836102635 2023-01-22 22:21:06.429176: step: 2092/527, loss: 0.0012280181981623173 2023-01-22 22:21:07.474803: step: 2096/527, loss: 0.0008603253518231213 2023-01-22 22:21:08.524430: step: 2100/527, loss: 0.005899379495531321 2023-01-22 22:21:09.578838: step: 2104/527, loss: 0.005639920011162758 2023-01-22 22:21:10.636646: step: 2108/527, loss: 0.0020226570777595043 ================================================== Loss: 0.005 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.321676856884058, 'r': 0.33693666982922205, 'f1': 0.3291299814643189}, 'combined': 0.24251682844739283, 'stategy': 1, 'epoch': 14} Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33765000036850124, 'r': 0.3060336821521779, 'f1': 0.3210653794634199}, 'combined': 0.20548184285658871, 'stategy': 1, 'epoch': 14} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3217239517290292, 'r': 0.3528585277028062, 'f1': 0.3365727495011382}, 'combined': 0.24800097331662813, 'stategy': 1, 'epoch': 14} Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.35969509690412604, 'r': 0.31391572093451003, 'f1': 0.33524979905627283}, 'combined': 0.21455987139601457, 'stategy': 1, 
'epoch': 14} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3294469679843575, 'r': 0.33007210454599767, 'f1': 0.3297592399919256}, 'combined': 0.24298049262562937, 'stategy': 1, 'epoch': 14} Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3656000818475574, 'r': 0.2947421951928442, 'f1': 0.3263694433420008}, 'combined': 0.23400073296218926, 'stategy': 1, 'epoch': 14} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 14} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 14} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 14} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0} Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0} -------------------- Dev for Korean: 
{'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0} Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3319773567844166, 'r': 0.33386717095965995, 'f1': 0.3329195820165389}, 'combined': 0.24530916569639705, 'stategy': 1, 'epoch': 11} Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.36875593136201523, 'r': 0.3013128538335666, 'f1': 0.3316402867932796}, 'combined': 0.23777982826687974, 'stategy': 1, 'epoch': 11} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 11} ****************************** Epoch: 15 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 22:23:37.072899: step: 4/527, loss: 0.0021173562854528427 2023-01-22 22:23:38.117672: step: 8/527, loss: 0.0038845541421324015 2023-01-22 22:23:39.165177: step: 12/527, loss: 0.0003737553779501468 2023-01-22 22:23:40.202178: step: 16/527, loss: 0.0037679087836295366 2023-01-22 22:23:41.230205: step: 20/527, 
loss: 0.00017279670282732695 2023-01-22 22:23:42.265879: step: 24/527, loss: 0.0021214622538536787 2023-01-22 22:23:43.310272: step: 28/527, loss: 0.002036649500951171 2023-01-22 22:23:44.349184: step: 32/527, loss: 0.004870116710662842 2023-01-22 22:23:45.383845: step: 36/527, loss: 0.0015003933804109693 2023-01-22 22:23:46.422149: step: 40/527, loss: 0.0008034825441427529 2023-01-22 22:23:47.465370: step: 44/527, loss: 7.095889304764569e-05 2023-01-22 22:23:48.520464: step: 48/527, loss: 0.005036995280534029 2023-01-22 22:23:49.556153: step: 52/527, loss: 0.008235444314777851 2023-01-22 22:23:50.612435: step: 56/527, loss: 0.003923624753952026 2023-01-22 22:23:51.654157: step: 60/527, loss: 0.003509890753775835 2023-01-22 22:23:52.705630: step: 64/527, loss: 0.00109007116407156 2023-01-22 22:23:53.766092: step: 68/527, loss: 0.009756636805832386 2023-01-22 22:23:54.803629: step: 72/527, loss: 0.0007047755643725395 2023-01-22 22:23:55.852376: step: 76/527, loss: 0.00788839440792799 2023-01-22 22:23:56.915366: step: 80/527, loss: 0.004746593534946442 2023-01-22 22:23:57.973071: step: 84/527, loss: 0.004120721016079187 2023-01-22 22:23:59.030999: step: 88/527, loss: 0.001955896383151412 2023-01-22 22:24:00.053770: step: 92/527, loss: 9.483610483584926e-06 2023-01-22 22:24:01.091159: step: 96/527, loss: 0.005657796747982502 2023-01-22 22:24:02.140854: step: 100/527, loss: 0.0008424916304647923 2023-01-22 22:24:03.181871: step: 104/527, loss: 0.004253414925187826 2023-01-22 22:24:04.241724: step: 108/527, loss: 0.03129945695400238 2023-01-22 22:24:05.273694: step: 112/527, loss: 0.10999766737222672 2023-01-22 22:24:06.327556: step: 116/527, loss: 0.003163097659125924 2023-01-22 22:24:07.379379: step: 120/527, loss: 0.003660310059785843 2023-01-22 22:24:08.439505: step: 124/527, loss: 0.0016737651312723756 2023-01-22 22:24:09.495746: step: 128/527, loss: 0.023412086069583893 2023-01-22 22:24:10.542728: step: 132/527, loss: 0.0010352160315960646 2023-01-22 
22:24:11.578972: step: 136/527, loss: 0.0007334492402151227 2023-01-22 22:24:12.622706: step: 140/527, loss: 0.0011000334052368999 2023-01-22 22:24:13.668054: step: 144/527, loss: 3.7636273191310465e-05 2023-01-22 22:24:14.737978: step: 148/527, loss: 0.0012713746400550008 2023-01-22 22:24:15.792768: step: 152/527, loss: 0.001818284043110907 2023-01-22 22:24:16.842946: step: 156/527, loss: 0.001599849434569478 2023-01-22 22:24:17.873448: step: 160/527, loss: 0.0005812745075672865 2023-01-22 22:24:18.925031: step: 164/527, loss: 0.010420789010822773 2023-01-22 22:24:19.972371: step: 168/527, loss: 0.0021030320785939693 2023-01-22 22:24:21.020923: step: 172/527, loss: 0.0006419811979867518 2023-01-22 22:24:22.044237: step: 176/527, loss: 0.00936819612979889 2023-01-22 22:24:23.079245: step: 180/527, loss: 0.006638983730226755 2023-01-22 22:24:24.120587: step: 184/527, loss: 0.0007522262167185545 2023-01-22 22:24:25.156895: step: 188/527, loss: 0.001532741473056376 2023-01-22 22:24:26.193607: step: 192/527, loss: 0.002265756484121084 2023-01-22 22:24:27.221125: step: 196/527, loss: 0.0009916160488501191 2023-01-22 22:24:28.269968: step: 200/527, loss: 0.0014992932556197047 2023-01-22 22:24:29.311589: step: 204/527, loss: 0.00023226144548971206 2023-01-22 22:24:30.366257: step: 208/527, loss: 0.002908761613070965 2023-01-22 22:24:31.411765: step: 212/527, loss: 0.00032628432381898165 2023-01-22 22:24:32.471401: step: 216/527, loss: 0.004422122612595558 2023-01-22 22:24:33.531847: step: 220/527, loss: 0.0019195893546566367 2023-01-22 22:24:34.562130: step: 224/527, loss: 3.886927879648283e-05 2023-01-22 22:24:35.608926: step: 228/527, loss: 0.0025922805070877075 2023-01-22 22:24:36.633992: step: 232/527, loss: 0.0059373872354626656 2023-01-22 22:24:37.682810: step: 236/527, loss: 0.0002580052532721311 2023-01-22 22:24:38.749930: step: 240/527, loss: 0.0038165636360645294 2023-01-22 22:24:39.802953: step: 244/527, loss: 0.0033947532065212727 2023-01-22 22:24:40.845148: 
step: 248/527, loss: 0.0008480420801788568 2023-01-22 22:24:41.898142: step: 252/527, loss: 0.011438749730587006 2023-01-22 22:24:42.942033: step: 256/527, loss: 0.002159717259928584 2023-01-22 22:24:43.992362: step: 260/527, loss: 0.005551984068006277 2023-01-22 22:24:45.041718: step: 264/527, loss: 0.0006147713284008205 2023-01-22 22:24:46.088174: step: 268/527, loss: 0.0032075196504592896 2023-01-22 22:24:47.155014: step: 272/527, loss: 0.006498089991509914 2023-01-22 22:24:48.191241: step: 276/527, loss: 0.0014903396368026733 2023-01-22 22:24:49.267550: step: 280/527, loss: 0.0015536813298240304 2023-01-22 22:24:50.302657: step: 284/527, loss: 0.00019825789786409587 2023-01-22 22:24:51.369925: step: 288/527, loss: 0.01223039161413908 2023-01-22 22:24:52.404238: step: 292/527, loss: 0.003496290883049369 2023-01-22 22:24:53.461969: step: 296/527, loss: 0.0032782205380499363 2023-01-22 22:24:54.517656: step: 300/527, loss: 0.0006147643434815109 2023-01-22 22:24:55.560850: step: 304/527, loss: 0.002437482587993145 2023-01-22 22:24:56.628253: step: 308/527, loss: 3.440062573645264e-05 2023-01-22 22:24:57.679489: step: 312/527, loss: 0.0014014571206644177 2023-01-22 22:24:58.722570: step: 316/527, loss: 0.0056559550575912 2023-01-22 22:24:59.761196: step: 320/527, loss: 0.0009812063071876764 2023-01-22 22:25:00.823984: step: 324/527, loss: 8.941252599470317e-05 2023-01-22 22:25:01.874120: step: 328/527, loss: 0.0015465703327208757 2023-01-22 22:25:02.940912: step: 332/527, loss: 0.008679852820932865 2023-01-22 22:25:04.040940: step: 336/527, loss: 0.0019940112251788378 2023-01-22 22:25:05.102451: step: 340/527, loss: 0.004313911776989698 2023-01-22 22:25:06.150302: step: 344/527, loss: 0.010324474424123764 2023-01-22 22:25:07.209312: step: 348/527, loss: 3.7121339119039476e-05 2023-01-22 22:25:08.255534: step: 352/527, loss: 0.0029345473740249872 2023-01-22 22:25:09.313338: step: 356/527, loss: 0.0031653819605708122 2023-01-22 22:25:10.361311: step: 360/527, loss: 
0.005418699234724045 2023-01-22 22:25:11.401557: step: 364/527, loss: 0.007249964401125908 2023-01-22 22:25:12.454987: step: 368/527, loss: 0.0008257487206719816 2023-01-22 22:25:13.517070: step: 372/527, loss: 0.0064759342931210995 2023-01-22 22:25:14.577639: step: 376/527, loss: 0.010194957256317139 2023-01-22 22:25:15.618065: step: 380/527, loss: 0.004651610739529133 2023-01-22 22:25:16.671995: step: 384/527, loss: 0.004355348646640778 2023-01-22 22:25:17.742833: step: 388/527, loss: 0.004689054097980261 2023-01-22 22:25:18.802620: step: 392/527, loss: 0.004069014452397823 2023-01-22 22:25:19.850800: step: 396/527, loss: 0.0035764339845627546 2023-01-22 22:25:20.899520: step: 400/527, loss: 0.0006211921572685242 2023-01-22 22:25:21.963887: step: 404/527, loss: 0.0020920531824231148 2023-01-22 22:25:23.010338: step: 408/527, loss: 0.0018426136812195182 2023-01-22 22:25:24.064639: step: 412/527, loss: 0.007245170418173075 2023-01-22 22:25:25.102816: step: 416/527, loss: 0.000667760381475091 2023-01-22 22:25:26.150040: step: 420/527, loss: 0.0025722994469106197 2023-01-22 22:25:27.198250: step: 424/527, loss: 0.0002297269820701331 2023-01-22 22:25:28.256847: step: 428/527, loss: 0.008032325655221939 2023-01-22 22:25:29.318667: step: 432/527, loss: 0.0023289441596716642 2023-01-22 22:25:30.370565: step: 436/527, loss: 0.015587732195854187 2023-01-22 22:25:31.407400: step: 440/527, loss: 0.0001631863706279546 2023-01-22 22:25:32.453261: step: 444/527, loss: 0.008715753443539143 2023-01-22 22:25:33.522210: step: 448/527, loss: 0.001279518473893404 2023-01-22 22:25:34.576719: step: 452/527, loss: 0.004508857149630785 2023-01-22 22:25:35.632607: step: 456/527, loss: 0.003289333777502179 2023-01-22 22:25:36.662033: step: 460/527, loss: 0.0036076297983527184 2023-01-22 22:25:37.701222: step: 464/527, loss: 0.0005100186681374907 2023-01-22 22:25:38.751350: step: 468/527, loss: 0.0017917719669640064 2023-01-22 22:25:39.813895: step: 472/527, loss: 0.0031433936674147844 
2023-01-22 22:25:40.863111: step: 476/527, loss: 0.001392496284097433 2023-01-22 22:25:41.923932: step: 480/527, loss: 0.016258245334029198 2023-01-22 22:25:42.967664: step: 484/527, loss: 0.0006095237913541496 2023-01-22 22:25:44.013506: step: 488/527, loss: 0.0002444768906570971 2023-01-22 22:25:45.075827: step: 492/527, loss: 0.005289440508931875 2023-01-22 22:25:46.126553: step: 496/527, loss: 0.0002598441205918789 2023-01-22 22:25:47.174513: step: 500/527, loss: 0.0010381884640082717 2023-01-22 22:25:48.210703: step: 504/527, loss: 0.0034570619463920593 2023-01-22 22:25:49.276464: step: 508/527, loss: 0.009726104326546192 2023-01-22 22:25:50.334899: step: 512/527, loss: 0.004402461927384138 2023-01-22 22:25:51.383078: step: 516/527, loss: 0.0024160800967365503 2023-01-22 22:25:52.418060: step: 520/527, loss: 0.002171527361497283 2023-01-22 22:25:53.460534: step: 524/527, loss: 0.001062641036696732 2023-01-22 22:25:54.525643: step: 528/527, loss: 0.003841681405901909 2023-01-22 22:25:55.571463: step: 532/527, loss: 0.0008453569607809186 2023-01-22 22:25:56.618074: step: 536/527, loss: 0.0037636910565197468 2023-01-22 22:25:57.683249: step: 540/527, loss: 0.0033669352997094393 2023-01-22 22:25:58.748161: step: 544/527, loss: 0.002339624334126711 2023-01-22 22:25:59.781393: step: 548/527, loss: 0.0001866282691480592 2023-01-22 22:26:00.824419: step: 552/527, loss: 0.0022400973830372095 2023-01-22 22:26:01.871289: step: 556/527, loss: 0.007030345033854246 2023-01-22 22:26:02.905395: step: 560/527, loss: 0.006746623665094376 2023-01-22 22:26:03.955694: step: 564/527, loss: 0.0021641217172145844 2023-01-22 22:26:04.992469: step: 568/527, loss: 0.0008133139344863594 2023-01-22 22:26:06.033261: step: 572/527, loss: 0.001131757628172636 2023-01-22 22:26:07.085351: step: 576/527, loss: 0.00015711480227764696 2023-01-22 22:26:08.139668: step: 580/527, loss: 0.0031663328409194946 2023-01-22 22:26:09.200619: step: 584/527, loss: 0.002116158837452531 2023-01-22 
22:26:10.260958: step: 588/527, loss: 0.015910262241959572 2023-01-22 22:26:11.316231: step: 592/527, loss: 0.00416777515783906 2023-01-22 22:26:12.408265: step: 596/527, loss: 6.097747245803475e-06 2023-01-22 22:26:13.478810: step: 600/527, loss: 0.0007248894544318318 2023-01-22 22:26:14.532604: step: 604/527, loss: 0.0027450760826468468 2023-01-22 22:26:15.574308: step: 608/527, loss: 0.0002297362661920488 2023-01-22 22:26:16.628485: step: 612/527, loss: 1.8520691810408607e-05 2023-01-22 22:26:17.682992: step: 616/527, loss: 0.003238700795918703 2023-01-22 22:26:18.726953: step: 620/527, loss: 0.0013879031175747514 2023-01-22 22:26:19.801271: step: 624/527, loss: 0.000962061167228967 2023-01-22 22:26:20.845657: step: 628/527, loss: 0.00020007911371067166 2023-01-22 22:26:21.885807: step: 632/527, loss: 3.51524940924719e-05 2023-01-22 22:26:22.940436: step: 636/527, loss: 3.9784629279893124e-07 2023-01-22 22:26:24.014895: step: 640/527, loss: 0.00030874053481966257 2023-01-22 22:26:25.065688: step: 644/527, loss: 0.0042197974398732185 2023-01-22 22:26:26.115375: step: 648/527, loss: 0.0009705985430628061 2023-01-22 22:26:27.163536: step: 652/527, loss: 0.00011235095007577911 2023-01-22 22:26:28.216574: step: 656/527, loss: 3.7596757465507835e-05 2023-01-22 22:26:29.264339: step: 660/527, loss: 0.0017915163189172745 2023-01-22 22:26:30.318893: step: 664/527, loss: 0.004214459098875523 2023-01-22 22:26:31.380055: step: 668/527, loss: 0.0016442033229395747 2023-01-22 22:26:32.435010: step: 672/527, loss: 0.0005432349862530828 2023-01-22 22:26:33.490581: step: 676/527, loss: 0.002872044686228037 2023-01-22 22:26:34.539431: step: 680/527, loss: 0.0011923682177439332 2023-01-22 22:26:35.606957: step: 684/527, loss: 0.002298399806022644 2023-01-22 22:26:36.671606: step: 688/527, loss: 0.002548930235207081 2023-01-22 22:26:37.750808: step: 692/527, loss: 0.005912081338465214 2023-01-22 22:26:38.803585: step: 696/527, loss: 0.004460288677364588 2023-01-22 22:26:39.870481: 
step: 700/527, loss: 0.009003069251775742 2023-01-22 22:26:40.919155: step: 704/527, loss: 0.0005650835810229182 2023-01-22 22:26:41.965026: step: 708/527, loss: 0.013968323357403278 2023-01-22 22:26:43.017468: step: 712/527, loss: 0.0002604085602797568 2023-01-22 22:26:44.067672: step: 716/527, loss: 3.4992392556887353e-06 2023-01-22 22:26:45.115033: step: 720/527, loss: 7.81009566708235e-06 2023-01-22 22:26:46.164582: step: 724/527, loss: 0.0014599731657654047 2023-01-22 22:26:47.222744: step: 728/527, loss: 0.0009206855320371687 2023-01-22 22:26:48.275888: step: 732/527, loss: 0.009889381006360054 2023-01-22 22:26:49.349292: step: 736/527, loss: 0.0003968965320382267 2023-01-22 22:26:50.397565: step: 740/527, loss: 0.00223132548853755 2023-01-22 22:26:51.449286: step: 744/527, loss: 0.0017901671817526221 2023-01-22 22:26:52.495334: step: 748/527, loss: 0.00043798726983368397 2023-01-22 22:26:53.559661: step: 752/527, loss: 0.00037257245276123285 2023-01-22 22:26:54.613286: step: 756/527, loss: 0.0026608763728290796 2023-01-22 22:26:55.672352: step: 760/527, loss: 0.004231521859765053 2023-01-22 22:26:56.736079: step: 764/527, loss: 0.009577380493283272 2023-01-22 22:26:57.805549: step: 768/527, loss: 0.0030072766821831465 2023-01-22 22:26:58.838955: step: 772/527, loss: 0.0008830654551275074 2023-01-22 22:26:59.895321: step: 776/527, loss: 0.02823844738304615 2023-01-22 22:27:00.947646: step: 780/527, loss: 0.004664203617721796 2023-01-22 22:27:01.998787: step: 784/527, loss: 0.00011153092054883018 2023-01-22 22:27:03.049809: step: 788/527, loss: 0.0011218494037166238 2023-01-22 22:27:04.119448: step: 792/527, loss: 0.023258119821548462 2023-01-22 22:27:05.196783: step: 796/527, loss: 0.00047314699622802436 2023-01-22 22:27:06.263835: step: 800/527, loss: 0.0074291592463850975 2023-01-22 22:27:07.341996: step: 804/527, loss: 0.01370174903422594 2023-01-22 22:27:08.379148: step: 808/527, loss: 0.0008737801108509302 2023-01-22 22:27:09.430189: step: 812/527, loss: 
0.0002674296556506306 2023-01-22 22:27:10.485515: step: 816/527, loss: 0.001801965176127851 2023-01-22 22:27:11.545406: step: 820/527, loss: 0.015570897608995438 2023-01-22 22:27:12.622609: step: 824/527, loss: 0.004089339170604944 2023-01-22 22:27:13.667996: step: 828/527, loss: 0.019110364839434624 2023-01-22 22:27:14.709104: step: 832/527, loss: 0.00032402921351604164 2023-01-22 22:27:15.756311: step: 836/527, loss: 0.004586229100823402 2023-01-22 22:27:16.819985: step: 840/527, loss: 0.00522241648286581 2023-01-22 22:27:17.869566: step: 844/527, loss: 0.007909129373729229 2023-01-22 22:27:18.909220: step: 848/527, loss: 5.823296669404954e-05 2023-01-22 22:27:19.967343: step: 852/527, loss: 0.0026080021634697914 2023-01-22 22:27:21.012775: step: 856/527, loss: 0.0010545202530920506 2023-01-22 22:27:22.066010: step: 860/527, loss: 0.013986063189804554 2023-01-22 22:27:23.123485: step: 864/527, loss: 0.011676576919853687 2023-01-22 22:27:24.174136: step: 868/527, loss: 0.0025391425006091595 2023-01-22 22:27:25.240984: step: 872/527, loss: 0.01674940623342991 2023-01-22 22:27:26.293543: step: 876/527, loss: 0.0009539996972307563 2023-01-22 22:27:27.361636: step: 880/527, loss: 0.0036104826722294092 2023-01-22 22:27:28.433448: step: 884/527, loss: 0.008421748876571655 2023-01-22 22:27:29.484711: step: 888/527, loss: 0.017133517190814018 2023-01-22 22:27:30.558786: step: 892/527, loss: 0.004269172437489033 2023-01-22 22:27:31.593404: step: 896/527, loss: 0.006812944542616606 2023-01-22 22:27:32.634384: step: 900/527, loss: 0.0011456196662038565 2023-01-22 22:27:33.687883: step: 904/527, loss: 0.00023756176233291626 2023-01-22 22:27:34.747696: step: 908/527, loss: 0.00012250669533386827 2023-01-22 22:27:35.801719: step: 912/527, loss: 0.0055575892329216 2023-01-22 22:27:36.843144: step: 916/527, loss: 0.004770447500050068 2023-01-22 22:27:37.882409: step: 920/527, loss: 0.006210679188370705 2023-01-22 22:27:38.957348: step: 924/527, loss: 0.002885744906961918 
2023-01-22 22:27:40.018740: step: 928/527, loss: 0.000533028447534889 2023-01-22 22:27:41.077198: step: 932/527, loss: 0.00919250026345253 2023-01-22 22:27:42.114106: step: 936/527, loss: 0.0027735168114304543 2023-01-22 22:27:43.154932: step: 940/527, loss: 0.004798294045031071 2023-01-22 22:27:44.194432: step: 944/527, loss: 0.001187857473269105 2023-01-22 22:27:45.262031: step: 948/527, loss: 0.0014265469508245587 2023-01-22 22:27:46.320020: step: 952/527, loss: 0.0018891862127929926 2023-01-22 22:27:47.366776: step: 956/527, loss: 0.00039537055999971926 2023-01-22 22:27:48.418875: step: 960/527, loss: 0.0005955604137852788 2023-01-22 22:27:49.474085: step: 964/527, loss: 0.003503769636154175 2023-01-22 22:27:50.525890: step: 968/527, loss: 0.0010721046710386872 2023-01-22 22:27:51.608770: step: 972/527, loss: 0.005274656228721142 2023-01-22 22:27:52.663276: step: 976/527, loss: 0.00016638064698781818 2023-01-22 22:27:53.721764: step: 980/527, loss: 0.0013780973386019468 2023-01-22 22:27:54.788280: step: 984/527, loss: 0.009462487883865833 2023-01-22 22:27:55.845727: step: 988/527, loss: 0.0006571573321707547 2023-01-22 22:27:56.900453: step: 992/527, loss: 0.000539962318725884 2023-01-22 22:27:57.939903: step: 996/527, loss: 0.012506379745900631 2023-01-22 22:27:58.995349: step: 1000/527, loss: 0.0006122874328866601 2023-01-22 22:28:00.051083: step: 1004/527, loss: 0.0007009954424574971 2023-01-22 22:28:01.099145: step: 1008/527, loss: 0.00013026964734308422 2023-01-22 22:28:02.155664: step: 1012/527, loss: 0.0017491503385826945 2023-01-22 22:28:03.237200: step: 1016/527, loss: 0.0003133631544187665 2023-01-22 22:28:04.292197: step: 1020/527, loss: 0.0021452170331031084 2023-01-22 22:28:05.343752: step: 1024/527, loss: 0.009732971899211407 2023-01-22 22:28:06.398030: step: 1028/527, loss: 0.0016120981890708208 2023-01-22 22:28:07.436120: step: 1032/527, loss: 0.00114544911775738 2023-01-22 22:28:08.487670: step: 1036/527, loss: 8.31125071272254e-05 2023-01-22 
22:28:09.548789: step: 1040/527, loss: 0.00758141977712512 2023-01-22 22:28:10.611617: step: 1044/527, loss: 0.0008135505486279726 2023-01-22 22:28:11.661126: step: 1048/527, loss: 0.0019955255556851625 2023-01-22 22:28:12.708813: step: 1052/527, loss: 0.006410179194062948 2023-01-22 22:28:13.762719: step: 1056/527, loss: 0.0026326850056648254 2023-01-22 22:28:14.809133: step: 1060/527, loss: 0.029356464743614197 2023-01-22 22:28:15.867879: step: 1064/527, loss: 0.0008149564964696765 2023-01-22 22:28:16.911007: step: 1068/527, loss: 0.0008090647752396762 2023-01-22 22:28:17.957456: step: 1072/527, loss: 0.002092592651024461 2023-01-22 22:28:19.026163: step: 1076/527, loss: 0.00047904730308800936 2023-01-22 22:28:20.090688: step: 1080/527, loss: 0.0025993366725742817 2023-01-22 22:28:21.153513: step: 1084/527, loss: 0.01467125490307808 2023-01-22 22:28:22.205262: step: 1088/527, loss: 0.0022456017322838306 2023-01-22 22:28:23.273417: step: 1092/527, loss: 6.169895641505718e-05 2023-01-22 22:28:24.333180: step: 1096/527, loss: 0.01206066831946373 2023-01-22 22:28:25.374910: step: 1100/527, loss: 0.0032102654222398996 2023-01-22 22:28:26.429218: step: 1104/527, loss: 0.0017116570379585028 2023-01-22 22:28:27.483765: step: 1108/527, loss: 0.006274400744587183 2023-01-22 22:28:28.542570: step: 1112/527, loss: 0.0005220805178396404 2023-01-22 22:28:29.588958: step: 1116/527, loss: 4.4285923650022596e-05 2023-01-22 22:28:30.638623: step: 1120/527, loss: 0.002910742536187172 2023-01-22 22:28:31.715682: step: 1124/527, loss: 0.008637990802526474 2023-01-22 22:28:32.778281: step: 1128/527, loss: 3.064343400183134e-05 2023-01-22 22:28:33.836421: step: 1132/527, loss: 0.002794864820316434 2023-01-22 22:28:34.900212: step: 1136/527, loss: 0.0001063033560058102 2023-01-22 22:28:35.945013: step: 1140/527, loss: 0.003263834398239851 2023-01-22 22:28:36.993147: step: 1144/527, loss: 0.023214466869831085 2023-01-22 22:28:38.046770: step: 1148/527, loss: 0.006799470167607069 
2023-01-22 22:28:39.105801: step: 1152/527, loss: 0.007932424545288086 2023-01-22 22:28:40.149139: step: 1156/527, loss: 0.004795227665454149 2023-01-22 22:28:41.208925: step: 1160/527, loss: 0.0021390633191913366 2023-01-22 22:28:42.253799: step: 1164/527, loss: 0.00022273171634878963 2023-01-22 22:28:43.308454: step: 1168/527, loss: 0.0035889961291104555 2023-01-22 22:28:44.362586: step: 1172/527, loss: 0.008758141659200191 2023-01-22 22:28:45.417721: step: 1176/527, loss: 0.0005906281294301152 2023-01-22 22:28:46.472251: step: 1180/527, loss: 0.0007719207787886262 2023-01-22 22:28:47.515897: step: 1184/527, loss: 0.0002030876639764756 2023-01-22 22:28:48.563818: step: 1188/527, loss: 0.002123411512002349 2023-01-22 22:28:49.616863: step: 1192/527, loss: 0.001337147201411426 2023-01-22 22:28:50.660497: step: 1196/527, loss: 0.00352330319583416 2023-01-22 22:28:51.734921: step: 1200/527, loss: 0.0020758798345923424 2023-01-22 22:28:52.807870: step: 1204/527, loss: 0.0015972491819411516 2023-01-22 22:28:53.848041: step: 1208/527, loss: 0.003596015740185976 2023-01-22 22:28:54.913829: step: 1212/527, loss: 0.003209357848390937 2023-01-22 22:28:55.964138: step: 1216/527, loss: 0.011255135759711266 2023-01-22 22:28:57.001708: step: 1220/527, loss: 9.44964776863344e-05 2023-01-22 22:28:58.067751: step: 1224/527, loss: 0.0042476835660636425 2023-01-22 22:28:59.128241: step: 1228/527, loss: 0.003568338230252266 2023-01-22 22:29:00.166860: step: 1232/527, loss: 1.5678507452321355e-06 2023-01-22 22:29:01.212327: step: 1236/527, loss: 0.013276586309075356 2023-01-22 22:29:02.277782: step: 1240/527, loss: 0.0020281258039176464 2023-01-22 22:29:03.322837: step: 1244/527, loss: 0.00019491919374559075 2023-01-22 22:29:04.384576: step: 1248/527, loss: 0.0005673468112945557 2023-01-22 22:29:05.455426: step: 1252/527, loss: 0.003406813135370612 2023-01-22 22:29:06.511828: step: 1256/527, loss: 0.002462834119796753 2023-01-22 22:29:07.570167: step: 1260/527, loss: 
0.004317640792578459 2023-01-22 22:29:08.656357: step: 1264/527, loss: 0.0018103390466421843 2023-01-22 22:29:09.704378: step: 1268/527, loss: 0.00046188171836547554 2023-01-22 22:29:10.758269: step: 1272/527, loss: 0.004303985740989447 2023-01-22 22:29:11.804548: step: 1276/527, loss: 0.00015098131552804261 2023-01-22 22:29:12.861469: step: 1280/527, loss: 7.784442277625203e-05 2023-01-22 22:29:13.914094: step: 1284/527, loss: 0.0012205367675051093 2023-01-22 22:29:14.976296: step: 1288/527, loss: 2.8824190394516336e-06 2023-01-22 22:29:16.047534: step: 1292/527, loss: 0.009323973208665848 2023-01-22 22:29:17.101695: step: 1296/527, loss: 0.0011361275101080537 2023-01-22 22:29:18.150886: step: 1300/527, loss: 0.008245636709034443 2023-01-22 22:29:19.199992: step: 1304/527, loss: 0.0014456275384873152 2023-01-22 22:29:20.251183: step: 1308/527, loss: 0.00014965976879466325 2023-01-22 22:29:21.312003: step: 1312/527, loss: 0.0015570215182378888 2023-01-22 22:29:22.360833: step: 1316/527, loss: 0.003109920769929886 2023-01-22 22:29:23.417763: step: 1320/527, loss: 0.006661924067884684 2023-01-22 22:29:24.461394: step: 1324/527, loss: 0.0045791869051754475 2023-01-22 22:29:25.537047: step: 1328/527, loss: 0.009706941433250904 2023-01-22 22:29:26.585924: step: 1332/527, loss: 0.0008802172960713506 2023-01-22 22:29:27.641495: step: 1336/527, loss: 0.005875221453607082 2023-01-22 22:29:28.690154: step: 1340/527, loss: 0.015969112515449524 2023-01-22 22:29:29.745856: step: 1344/527, loss: 0.004202917218208313 2023-01-22 22:29:30.795482: step: 1348/527, loss: 0.0037117262836545706 2023-01-22 22:29:31.845771: step: 1352/527, loss: 5.7789413403952494e-05 2023-01-22 22:29:32.884728: step: 1356/527, loss: 0.00034378370037302375 2023-01-22 22:29:33.938030: step: 1360/527, loss: 0.0006365476292558014 2023-01-22 22:29:34.994391: step: 1364/527, loss: 0.006595726124942303 2023-01-22 22:29:36.031039: step: 1368/527, loss: 0.0015415733214467764 2023-01-22 22:29:37.089563: step: 
1372/527, loss: 0.0033396631479263306 2023-01-22 22:29:38.143796: step: 1376/527, loss: 0.0010558458743616939 2023-01-22 22:29:39.210123: step: 1380/527, loss: 0.0034323404543101788 2023-01-22 22:29:40.284954: step: 1384/527, loss: 0.004915494471788406 2023-01-22 22:29:41.333373: step: 1388/527, loss: 0.0061089713126420975 2023-01-22 22:29:42.408975: step: 1392/527, loss: 0.004875743295997381 2023-01-22 22:29:43.469579: step: 1396/527, loss: 0.0006867261254228652 2023-01-22 22:29:44.533136: step: 1400/527, loss: 0.005687988828867674 2023-01-22 22:29:45.568218: step: 1404/527, loss: 0.00036953826202079654 2023-01-22 22:29:46.615601: step: 1408/527, loss: 0.000939302786719054 2023-01-22 22:29:47.671136: step: 1412/527, loss: 0.0013568548019975424 2023-01-22 22:29:48.727163: step: 1416/527, loss: 0.004481145180761814 2023-01-22 22:29:49.784843: step: 1420/527, loss: 0.0006092854891903698 2023-01-22 22:29:50.820907: step: 1424/527, loss: 0.0004368473601061851 2023-01-22 22:29:51.873258: step: 1428/527, loss: 0.004806303884834051 2023-01-22 22:29:52.913749: step: 1432/527, loss: 0.00019386038184165955 2023-01-22 22:29:53.966532: step: 1436/527, loss: 0.02582375518977642 2023-01-22 22:29:55.006748: step: 1440/527, loss: 0.0035245653707534075 2023-01-22 22:29:56.071416: step: 1444/527, loss: 0.012158765457570553 2023-01-22 22:29:57.122195: step: 1448/527, loss: 0.004373760428279638 2023-01-22 22:29:58.176581: step: 1452/527, loss: 0.004735062830150127 2023-01-22 22:29:59.197835: step: 1456/527, loss: 3.1292182711695204e-07 2023-01-22 22:30:00.242956: step: 1460/527, loss: 0.0021914043463766575 2023-01-22 22:30:01.299417: step: 1464/527, loss: 0.006319602485746145 2023-01-22 22:30:02.354997: step: 1468/527, loss: 0.004506241995841265 2023-01-22 22:30:03.403743: step: 1472/527, loss: 0.0023842931259423494 2023-01-22 22:30:04.439304: step: 1476/527, loss: 5.1615957090689335e-06 2023-01-22 22:30:05.488785: step: 1480/527, loss: 0.0025207032449543476 2023-01-22 
22:30:06.538282: step: 1484/527, loss: 0.013974564149975777 2023-01-22 22:30:07.586656: step: 1488/527, loss: 0.004074737895280123 2023-01-22 22:30:08.631098: step: 1492/527, loss: 0.011669040657579899 2023-01-22 22:30:09.668966: step: 1496/527, loss: 0.0002378278149990365 2023-01-22 22:30:10.720416: step: 1500/527, loss: 0.0023171287029981613 2023-01-22 22:30:11.784450: step: 1504/527, loss: 0.02757694199681282 2023-01-22 22:30:12.819400: step: 1508/527, loss: 0.005107685457915068 2023-01-22 22:30:13.865135: step: 1512/527, loss: 0.0011398512870073318 2023-01-22 22:30:14.915088: step: 1516/527, loss: 0.00486304797232151 2023-01-22 22:30:15.950745: step: 1520/527, loss: 3.814215597230941e-05 2023-01-22 22:30:17.002789: step: 1524/527, loss: 0.001960632624104619 2023-01-22 22:30:18.054938: step: 1528/527, loss: 0.00026713087572716177 2023-01-22 22:30:19.099101: step: 1532/527, loss: 0.0035333430860191584 2023-01-22 22:30:20.151942: step: 1536/527, loss: 0.0002909427566919476 2023-01-22 22:30:21.190523: step: 1540/527, loss: 0.0012683109380304813 2023-01-22 22:30:22.233929: step: 1544/527, loss: 0.0005423807888291776 2023-01-22 22:30:23.274563: step: 1548/527, loss: 0.0007960903458297253 2023-01-22 22:30:24.312296: step: 1552/527, loss: 0.0023042315151542425 2023-01-22 22:30:25.363985: step: 1556/527, loss: 0.005188549403101206 2023-01-22 22:30:26.418384: step: 1560/527, loss: 0.00650202389806509 2023-01-22 22:30:27.458430: step: 1564/527, loss: 0.0007333987159654498 2023-01-22 22:30:28.507594: step: 1568/527, loss: 0.0034712692722678185 2023-01-22 22:30:29.557471: step: 1572/527, loss: 0.0027577606961131096 2023-01-22 22:30:30.588449: step: 1576/527, loss: 0.006949407979846001 2023-01-22 22:30:31.641865: step: 1580/527, loss: 0.028929319232702255 2023-01-22 22:30:32.681854: step: 1584/527, loss: 0.0057602147571742535 2023-01-22 22:30:33.734941: step: 1588/527, loss: 0.0018962372560054064 2023-01-22 22:30:34.785789: step: 1592/527, loss: 0.002414478687569499 
2023-01-22 22:30:35.845323: step: 1596/527, loss: 0.0008136029355227947 2023-01-22 22:30:36.905362: step: 1600/527, loss: 0.0027432541828602552 2023-01-22 22:30:37.954125: step: 1604/527, loss: 0.005107726436108351 2023-01-22 22:30:38.995329: step: 1608/527, loss: 0.000485397526063025 2023-01-22 22:30:40.038086: step: 1612/527, loss: 0.0010145938722416759 2023-01-22 22:30:41.088672: step: 1616/527, loss: 0.0005520945996977389 2023-01-22 22:30:42.133677: step: 1620/527, loss: 0.0016862843185663223 2023-01-22 22:30:43.172097: step: 1624/527, loss: 0.0020822121296077967 2023-01-22 22:30:44.215632: step: 1628/527, loss: 2.6072260880027898e-05 2023-01-22 22:30:45.266155: step: 1632/527, loss: 0.005904187448322773 2023-01-22 22:30:46.294943: step: 1636/527, loss: 0.0014026776188984513 2023-01-22 22:30:47.334779: step: 1640/527, loss: 4.109400470042601e-05 2023-01-22 22:30:48.391442: step: 1644/527, loss: 0.0068221245892345905 2023-01-22 22:30:49.464186: step: 1648/527, loss: 0.009566273540258408 2023-01-22 22:30:50.501832: step: 1652/527, loss: 2.149449755961541e-07 2023-01-22 22:30:51.540667: step: 1656/527, loss: 7.826236833352596e-05 2023-01-22 22:30:52.598460: step: 1660/527, loss: 0.0042329756543040276 2023-01-22 22:30:53.651806: step: 1664/527, loss: 2.2546907985088183e-07 2023-01-22 22:30:54.698381: step: 1668/527, loss: 0.00874702725559473 2023-01-22 22:30:55.742025: step: 1672/527, loss: 0.0005634386907331645 2023-01-22 22:30:56.780203: step: 1676/527, loss: 0.0007338403956964612 2023-01-22 22:30:57.818258: step: 1680/527, loss: 0.0010113079333677888 2023-01-22 22:30:58.858431: step: 1684/527, loss: 0.004578796215355396 2023-01-22 22:30:59.902747: step: 1688/527, loss: 0.005295844282954931 2023-01-22 22:31:00.942777: step: 1692/527, loss: 0.004416812676936388 2023-01-22 22:31:01.981212: step: 1696/527, loss: 1.257691292266827e-05 2023-01-22 22:31:03.036321: step: 1700/527, loss: 0.0014993668301030993 2023-01-22 22:31:04.086208: step: 1704/527, loss: 
8.849309233482927e-05 2023-01-22 22:31:05.124373: step: 1708/527, loss: 0.0017898413352668285 2023-01-22 22:31:06.169601: step: 1712/527, loss: 0.001130671356804669 2023-01-22 22:31:07.233018: step: 1716/527, loss: 0.009923440404236317 2023-01-22 22:31:08.279109: step: 1720/527, loss: 0.00018083699978888035 2023-01-22 22:31:09.315839: step: 1724/527, loss: 3.8295198464766145e-05 2023-01-22 22:31:10.365803: step: 1728/527, loss: 0.00010048371768789366 2023-01-22 22:31:11.416455: step: 1732/527, loss: 0.003804548876360059 2023-01-22 22:31:12.471933: step: 1736/527, loss: 0.0009665421093814075 2023-01-22 22:31:13.528479: step: 1740/527, loss: 0.00425344705581665 2023-01-22 22:31:14.587427: step: 1744/527, loss: 0.019550809636712074 2023-01-22 22:31:15.638655: step: 1748/527, loss: 0.0010442856000736356 2023-01-22 22:31:16.697292: step: 1752/527, loss: 0.03415495529770851 2023-01-22 22:31:17.742909: step: 1756/527, loss: 0.008795936591923237 2023-01-22 22:31:18.786804: step: 1760/527, loss: 0.002059955382719636 2023-01-22 22:31:19.842298: step: 1764/527, loss: 0.004015587270259857 2023-01-22 22:31:20.885200: step: 1768/527, loss: 0.0015639765188097954 2023-01-22 22:31:21.937598: step: 1772/527, loss: 0.0009359652176499367 2023-01-22 22:31:22.989179: step: 1776/527, loss: 1.968520564332721e-06 2023-01-22 22:31:24.034086: step: 1780/527, loss: 0.0054413070902228355 2023-01-22 22:31:25.084289: step: 1784/527, loss: 0.004592582117766142 2023-01-22 22:31:26.122694: step: 1788/527, loss: 0.004925496876239777 2023-01-22 22:31:27.170828: step: 1792/527, loss: 0.0009588312823325396 2023-01-22 22:31:28.221034: step: 1796/527, loss: 0.00028223529807291925 2023-01-22 22:31:29.301146: step: 1800/527, loss: 0.002428020816296339 2023-01-22 22:31:30.333160: step: 1804/527, loss: 0.00016602493997197598 2023-01-22 22:31:31.395149: step: 1808/527, loss: 0.001802899525500834 2023-01-22 22:31:32.444345: step: 1812/527, loss: 0.00035033110179938376 2023-01-22 22:31:33.511890: step: 
1816/527, loss: 0.0053513795137405396 2023-01-22 22:31:34.556724: step: 1820/527, loss: 0.003625335171818733 2023-01-22 22:31:35.602926: step: 1824/527, loss: 0.009387193247675896 2023-01-22 22:31:36.667001: step: 1828/527, loss: 0.002989468863233924 2023-01-22 22:31:37.698924: step: 1832/527, loss: 0.001374842133373022 2023-01-22 22:31:38.741195: step: 1836/527, loss: 0.0033459945116192102 2023-01-22 22:31:39.801598: step: 1840/527, loss: 0.006262794137001038 2023-01-22 22:31:40.846518: step: 1844/527, loss: 0.0005649627419188619 2023-01-22 22:31:41.896244: step: 1848/527, loss: 0.0008230971870943904 2023-01-22 22:31:42.930969: step: 1852/527, loss: 0.0003535388095770031 2023-01-22 22:31:43.980340: step: 1856/527, loss: 0.0004460073832888156 2023-01-22 22:31:45.034815: step: 1860/527, loss: 4.401613477966748e-05 2023-01-22 22:31:46.071268: step: 1864/527, loss: 0.008959745988249779 2023-01-22 22:31:47.130225: step: 1868/527, loss: 0.001183580025099218 2023-01-22 22:31:48.189151: step: 1872/527, loss: 0.0006281028036028147 2023-01-22 22:31:49.255666: step: 1876/527, loss: 0.003985380753874779 2023-01-22 22:31:50.293077: step: 1880/527, loss: 0.004234537947922945 2023-01-22 22:31:51.353787: step: 1884/527, loss: 0.0033803682308644056 2023-01-22 22:31:52.387316: step: 1888/527, loss: 4.754704423248768e-05 2023-01-22 22:31:53.423820: step: 1892/527, loss: 0.0007918642950244248 2023-01-22 22:31:54.471650: step: 1896/527, loss: 0.004291980993002653 2023-01-22 22:31:55.503672: step: 1900/527, loss: 0.00036677898606285453 2023-01-22 22:31:56.545474: step: 1904/527, loss: 0.003098508110269904 2023-01-22 22:31:57.595512: step: 1908/527, loss: 0.005191630683839321 2023-01-22 22:31:58.634329: step: 1912/527, loss: 0.003917275927960873 2023-01-22 22:31:59.673635: step: 1916/527, loss: 0.003572305664420128 2023-01-22 22:32:00.731419: step: 1920/527, loss: 0.010544898919761181 2023-01-22 22:32:01.809592: step: 1924/527, loss: 3.4832566598197445e-05 2023-01-22 22:32:02.856519: 
step: 1928/527, loss: 0.009234145283699036 2023-01-22 22:32:03.908585: step: 1932/527, loss: 7.381310297205346e-06 2023-01-22 22:32:04.948220: step: 1936/527, loss: 5.0361388275632635e-05 2023-01-22 22:32:06.003059: step: 1940/527, loss: 0.00017151280189864337 2023-01-22 22:32:07.042003: step: 1944/527, loss: 4.3363597796997055e-05 2023-01-22 22:32:08.096330: step: 1948/527, loss: 0.0003087032528128475 2023-01-22 22:32:09.145728: step: 1952/527, loss: 0.004083825740963221 2023-01-22 22:32:10.198413: step: 1956/527, loss: 0.014585692435503006 2023-01-22 22:32:11.236168: step: 1960/527, loss: 0.0009473394602537155 2023-01-22 22:32:12.287940: step: 1964/527, loss: 0.00018962068133987486 2023-01-22 22:32:13.344221: step: 1968/527, loss: 0.0055191777646541595 2023-01-22 22:32:14.403973: step: 1972/527, loss: 0.000298607861623168 2023-01-22 22:32:15.446667: step: 1976/527, loss: 0.0019782360177487135 2023-01-22 22:32:16.491440: step: 1980/527, loss: 0.004785608034580946 2023-01-22 22:32:17.541003: step: 1984/527, loss: 0.007510562427341938 2023-01-22 22:32:18.602774: step: 1988/527, loss: 0.0012440590653568506 2023-01-22 22:32:19.674065: step: 1992/527, loss: 0.0020376089960336685 2023-01-22 22:32:20.715867: step: 1996/527, loss: 0.0005846908898092806 2023-01-22 22:32:21.757777: step: 2000/527, loss: 0.001342321396805346 2023-01-22 22:32:22.797449: step: 2004/527, loss: 0.00015501210873480886 2023-01-22 22:32:23.854352: step: 2008/527, loss: 0.0002958698896691203 2023-01-22 22:32:24.907865: step: 2012/527, loss: 0.0005573519156314433 2023-01-22 22:32:25.948136: step: 2016/527, loss: 0.004240536130964756 2023-01-22 22:32:27.016636: step: 2020/527, loss: 0.001750120078213513 2023-01-22 22:32:28.073976: step: 2024/527, loss: 0.01017381064593792 2023-01-22 22:32:29.132834: step: 2028/527, loss: 0.0007594551425427198 2023-01-22 22:32:30.189557: step: 2032/527, loss: 0.0028292520437389612 2023-01-22 22:32:31.227815: step: 2036/527, loss: 0.005003438331186771 2023-01-22 
22:32:32.276695: step: 2040/527, loss: 0.001561109209433198 2023-01-22 22:32:33.329338: step: 2044/527, loss: 0.005150608718395233 2023-01-22 22:32:34.385022: step: 2048/527, loss: 0.0023214875254780054 2023-01-22 22:32:35.435390: step: 2052/527, loss: 0.003653744701296091 2023-01-22 22:32:36.480252: step: 2056/527, loss: 0.006593538913875818 2023-01-22 22:32:37.538086: step: 2060/527, loss: 0.0036782920360565186 2023-01-22 22:32:38.583555: step: 2064/527, loss: 0.005909407511353493 2023-01-22 22:32:39.637146: step: 2068/527, loss: 0.00043682128307409585 2023-01-22 22:32:40.699447: step: 2072/527, loss: 0.001568496460095048 2023-01-22 22:32:41.744344: step: 2076/527, loss: 0.0002973460068460554 2023-01-22 22:32:42.787144: step: 2080/527, loss: 0.0007467272807843983 2023-01-22 22:32:43.837152: step: 2084/527, loss: 0.000901336723472923 2023-01-22 22:32:44.877267: step: 2088/527, loss: 0.000655244046356529 2023-01-22 22:32:45.932355: step: 2092/527, loss: 0.0003568140382412821 2023-01-22 22:32:46.971517: step: 2096/527, loss: 0.0023579972330480814 2023-01-22 22:32:48.011929: step: 2100/527, loss: 0.0003425839531701058 2023-01-22 22:32:49.062770: step: 2104/527, loss: 0.005569449160248041 2023-01-22 22:32:50.124315: step: 2108/527, loss: 0.004380214028060436 ================================================== Loss: 0.004 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3232776032315979, 'r': 0.3416805028462998, 'f1': 0.3322244003690037}, 'combined': 0.24479692658768692, 'stategy': 1, 'epoch': 15} Test Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3386490121166166, 'r': 0.31032564019413594, 'f1': 0.32386926395972443}, 'combined': 0.2072763289342236, 'stategy': 1, 'epoch': 15} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3153825069246265, 'r': 0.35428167760792956, 
'f1': 0.33370231295688807}, 'combined': 0.24588591481033856, 'stategy': 1, 'epoch': 15} Test Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3595633192437614, 'r': 0.31641572093451, 'f1': 0.33661246907926595}, 'combined': 0.21543198021073018, 'stategy': 1, 'epoch': 15} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33213909604462555, 'r': 0.3352903208647833, 'f1': 0.3337072693026266}, 'combined': 0.24588956685456698, 'stategy': 1, 'epoch': 15} Test Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3677573478917203, 'r': 0.2994930176188259, 'f1': 0.33013322604121337}, 'combined': 0.23669929414275678, 'stategy': 1, 'epoch': 15} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 15} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3275862068965517, 'r': 0.41304347826086957, 'f1': 0.3653846153846154}, 'combined': 0.1826923076923077, 'stategy': 1, 'epoch': 15} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 15} New best russian model... 
================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387919611307425, 'r': 0.347847485768501, 'f1': 0.3354357273559012}, 'combined': 0.24716316752540088, 'stategy': 1, 'epoch': 0} Test for Chinese: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.3270271194361053, 'r': 0.3091892765577723, 'f1': 0.31785813477901825}, 'combined': 0.20342920625857164, 'stategy': 1, 'epoch': 0} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'stategy': 1, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3167881131218918, 'r': 0.361871810435254, 'f1': 0.33783249619021943}, 'combined': 0.24892920771910904, 'stategy': 1, 'epoch': 0} Test for Korean: {'template': {'p': 0.927536231884058, 'r': 0.48854961832061067, 'f1': 0.6399999999999999}, 'slot': {'p': 0.33919419720971006, 'r': 0.3114419447107338, 'f1': 0.3247261982765945}, 'combined': 0.20782476689702045, 'stategy': 1, 'epoch': 0} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33064516129032256, 'r': 0.44565217391304346, 'f1': 0.37962962962962965}, 'combined': 0.18981481481481483, 'stategy': 1, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33213909604462555, 'r': 0.3352903208647833, 'f1': 0.3337072693026266}, 'combined': 0.24588956685456698, 'stategy': 1, 'epoch': 15} Test for Russian: {'template': {'p': 0.9382716049382716, 'r': 0.5801526717557252, 'f1': 0.7169811320754718}, 'slot': {'p': 0.3677573478917203, 'r': 0.2994930176188259, 'f1': 0.33013322604121337}, 'combined': 
0.23669929414275678, 'stategy': 1, 'epoch': 15} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'stategy': 1, 'epoch': 15} ****************************** Epoch: 16 command: python train.py --model_name coref --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --accumulate_step 4 --max_epoch 20 --event_hidden_num 500 --p1_data_weight 0.2 --learning_rate 9e-4 2023-01-22 22:35:26.213600: step: 4/527, loss: 0.00029362630448304117 2023-01-22 22:35:27.251667: step: 8/527, loss: 0.0023124818690121174 2023-01-22 22:35:28.286239: step: 12/527, loss: 0.0027675018645823 2023-01-22 22:35:29.341290: step: 16/527, loss: 0.001018871902488172 2023-01-22 22:35:30.370526: step: 20/527, loss: 0.0007350272499024868 2023-01-22 22:35:31.411087: step: 24/527, loss: 0.00016680097905918956 2023-01-22 22:35:32.454795: step: 28/527, loss: 1.1699240531015676e-05 2023-01-22 22:35:33.483321: step: 32/527, loss: 1.8017859474639408e-05 2023-01-22 22:35:34.522727: step: 36/527, loss: 8.809982682578266e-05 2023-01-22 22:35:35.573520: step: 40/527, loss: 0.000953048758674413 2023-01-22 22:35:36.631231: step: 44/527, loss: 0.0010760682635009289 2023-01-22 22:35:37.689822: step: 48/527, loss: 0.005208818707615137 2023-01-22 22:35:38.737380: step: 52/527, loss: 0.0007013630820438266 2023-01-22 22:35:39.776692: step: 56/527, loss: 0.0017439085058867931 2023-01-22 22:35:40.820542: step: 60/527, loss: 0.0005885157152079046 2023-01-22 22:35:41.858706: step: 64/527, loss: 0.0013615426141768694 2023-01-22 22:35:42.898006: step: 68/527, loss: 0.0001540377998026088 2023-01-22 22:35:43.952671: step: 72/527, loss: 0.0004917402984574437 2023-01-22 22:35:44.998920: step: 76/527, loss: 0.010568968020379543 2023-01-22 22:35:46.029029: step: 80/527, loss: 2.7583815608522855e-05 2023-01-22 22:35:47.069736: step: 84/527, loss: 0.0027480798307806253 
2023-01-22 22:35:48.122506: step: 88/527, loss: 0.0013642244739457965 2023-01-22 22:35:49.174508: step: 92/527, loss: 4.709641871158965e-05 2023-01-22 22:35:50.227690: step: 96/527, loss: 6.708733053528704e-06 2023-01-22 22:35:51.265419: step: 100/527, loss: 0.005249525420367718 2023-01-22 22:35:52.295967: step: 104/527, loss: 3.4182542094640667e-06 2023-01-22 22:35:53.333896: step: 108/527, loss: 0.0024652304127812386 2023-01-22 22:35:54.380354: step: 112/527, loss: 0.0014432478928938508 2023-01-22 22:35:55.413919: step: 116/527, loss: 0.008281020447611809 2023-01-22 22:35:56.479517: step: 120/527, loss: 0.0021952250972390175 2023-01-22 22:35:57.523064: step: 124/527, loss: 0.002521294867619872 2023-01-22 22:35:58.563726: step: 128/527, loss: 0.004664104897528887 2023-01-22 22:35:59.620000: step: 132/527, loss: 0.0003241867816541344 2023-01-22 22:36:00.659371: step: 136/527, loss: 0.0032002637162804604 2023-01-22 22:36:01.721851: step: 140/527, loss: 0.0007822461775504053 2023-01-22 22:36:02.773261: step: 144/527, loss: 0.0006887163617648184 2023-01-22 22:36:03.820571: step: 148/527, loss: 0.00031078883330337703 2023-01-22 22:36:04.858579: step: 152/527, loss: 0.00018609754624776542 2023-01-22 22:36:05.897551: step: 156/527, loss: 0.0006201851647347212 2023-01-22 22:36:06.948031: step: 160/527, loss: 8.416463242610916e-05 2023-01-22 22:36:07.995667: step: 164/527, loss: 1.5695754882472102e-06 2023-01-22 22:36:09.040741: step: 168/527, loss: 0.0015502817695960402 2023-01-22 22:36:10.086856: step: 172/527, loss: 1.1131889550597407e-05 2023-01-22 22:36:11.138825: step: 176/527, loss: 0.0031404956243932247 2023-01-22 22:36:12.177929: step: 180/527, loss: 0.0008724552462808788 2023-01-22 22:36:13.233069: step: 184/527, loss: 0.009682309813797474 2023-01-22 22:36:14.272575: step: 188/527, loss: 0.00014177162665873766 2023-01-22 22:36:15.320956: step: 192/527, loss: 0.00023851713922340423 2023-01-22 22:36:16.366749: step: 196/527, loss: 0.0005552918883040547 2023-01-22 
22:36:17.411096: step: 200/527, loss: 0.0029399204067885876 2023-01-22 22:36:18.470574: step: 204/527, loss: 0.00013852686970494688 2023-01-22 22:36:19.540034: step: 208/527, loss: 0.0038106085266917944 2023-01-22 22:36:20.571908: step: 212/527, loss: 4.388735305838054e-06 2023-01-22 22:36:21.629126: step: 216/527, loss: 0.006033284589648247 2023-01-22 22:36:22.688681: step: 220/527, loss: 0.0008205191697925329 2023-01-22 22:36:23.753667: step: 224/527, loss: 8.755196176934987e-05 2023-01-22 22:36:24.798537: step: 228/527, loss: 0.0008250846876762807 2023-01-22 22:36:25.852773: step: 232/527, loss: 0.00026991195045411587 2023-01-22 22:36:26.906744: step: 236/527, loss: 0.0038986585568636656 2023-01-22 22:36:27.965299: step: 240/527, loss: 0.010525080375373363 2023-01-22 22:36:29.014384: step: 244/527, loss: 2.4819946702336892e-05 2023-01-22 22:36:30.063455: step: 248/527, loss: 0.0006389497430063784 2023-01-22 22:36:31.110521: step: 252/527, loss: 0.0002910917974077165 2023-01-22 22:36:32.146178: step: 256/527, loss: 0.002244743285700679 2023-01-22 22:36:33.219277: step: 260/527, loss: 0.011275527998805046 2023-01-22 22:36:34.271943: step: 264/527, loss: 0.0006012019002810121 2023-01-22 22:36:35.321338: step: 268/527, loss: 0.000831753306556493 2023-01-22 22:36:36.367198: step: 272/527, loss: 0.00048162139137275517 2023-01-22 22:36:37.410817: step: 276/527, loss: 0.00014099193504080176 2023-01-22 22:36:38.451641: step: 280/527, loss: 0.008294143714010715 2023-01-22 22:36:39.495989: step: 284/527, loss: 0.0004632599593605846 2023-01-22 22:36:40.541614: step: 288/527, loss: 0.00790327787399292 2023-01-22 22:36:41.605913: step: 292/527, loss: 0.0026350372936576605 2023-01-22 22:36:42.653079: step: 296/527, loss: 0.001936080981977284 2023-01-22 22:36:43.721510: step: 300/527, loss: 0.00014472243492491543 2023-01-22 22:36:44.756130: step: 304/527, loss: 0.002501201583072543 2023-01-22 22:36:45.812671: step: 308/527, loss: 0.0032947431318461895 2023-01-22 
22:36:46.861928: step: 312/527, loss: 0.00435211043804884 2023-01-22 22:36:47.907614: step: 316/527, loss: 0.0003964546776842326 2023-01-22 22:36:48.953610: step: 320/527, loss: 0.00017366201791446656 2023-01-22 22:36:50.019803: step: 324/527, loss: 0.0015467011835426092 2023-01-22 22:36:51.064664: step: 328/527, loss: 0.011085574515163898 2023-01-22 22:36:52.111024: step: 332/527, loss: 0.00216551311314106 2023-01-22 22:36:53.140542: step: 336/527, loss: 0.0003555732255335897 2023-01-22 22:36:54.206347: step: 340/527, loss: 0.0002360201469855383 2023-01-22 22:36:55.247434: step: 344/527, loss: 0.0004038470215164125 2023-01-22 22:36:56.287807: step: 348/527, loss: 0.001179668353870511 2023-01-22 22:36:57.335715: step: 352/527, loss: 0.00355441402643919 2023-01-22 22:36:58.377676: step: 356/527, loss: 0.002574915299192071 2023-01-22 22:36:59.440513: step: 360/527, loss: 4.936311233905144e-05 2023-01-22 22:37:00.493362: step: 364/527, loss: 1.4885632481309585e-05 2023-01-22 22:37:01.557180: step: 368/527, loss: 0.0017851099837571383 2023-01-22 22:37:02.618820: step: 372/527, loss: 0.0005334490560926497 2023-01-22 22:37:03.665132: step: 376/527, loss: 0.0009558703750371933 2023-01-22 22:37:04.721867: step: 380/527, loss: 0.006907467730343342 2023-01-22 22:37:05.758891: step: 384/527, loss: 0.0038654108066111803 2023-01-22 22:37:06.820253: step: 388/527, loss: 0.0008677334990352392 2023-01-22 22:37:07.906363: step: 392/527, loss: 2.391696398262866e-06 2023-01-22 22:37:08.958459: step: 396/527, loss: 0.005411786027252674 2023-01-22 22:37:09.998767: step: 400/527, loss: 3.6597964935936034e-05 2023-01-22 22:37:11.047716: step: 404/527, loss: 0.0027184037026017904 2023-01-22 22:37:12.110263: step: 408/527, loss: 0.005789011716842651 2023-01-22 22:37:13.164909: step: 412/527, loss: 0.00034568869159556925 2023-01-22 22:37:14.213101: step: 416/527, loss: 0.002663221675902605 2023-01-22 22:37:15.271693: step: 420/527, loss: 4.014531077700667e-05 2023-01-22 22:37:16.332434: 
step: 424/527, loss: 0.0023471759632229805 2023-01-22 22:37:17.371491: step: 428/527, loss: 0.0010860760230571032 2023-01-22 22:37:18.437675: step: 432/527, loss: 0.001893392764031887 2023-01-22 22:37:19.492699: step: 436/527, loss: 0.01324647106230259 2023-01-22 22:37:20.541679: step: 440/527, loss: 0.021253157407045364 2023-01-22 22:37:21.591786: step: 444/527, loss: 0.0019748907070606947 2023-01-22 22:37:22.632634: step: 448/527, loss: 0.00029857363551855087 2023-01-22 22:37:23.673578: step: 452/527, loss: 9.424844904515339e-08 2023-01-22 22:37:24.731311: step: 456/527, loss: 0.003209330141544342 2023-01-22 22:37:25.783072: step: 460/527, loss: 0.0001931208826135844 2023-01-22 22:37:26.837059: step: 464/527, loss: 0.0003605674428399652 2023-01-22 22:37:27.894985: step: 468/527, loss: 0.002870088443160057 2023-01-22 22:37:28.938867: step: 472/527, loss: 0.001728259609080851 2023-01-22 22:37:29.995327: step: 476/527, loss: 0.0019172728061676025 2023-01-22 22:37:31.049923: step: 480/527, loss: 0.003202020190656185 2023-01-22 22:37:32.093204: step: 484/527, loss: 0.0026045667473226786 2023-01-22 22:37:33.152898: step: 488/527, loss: 0.004719897639006376 2023-01-22 22:37:34.197149: step: 492/527, loss: 0.0013467278331518173 2023-01-22 22:37:35.272425: step: 496/527, loss: 0.0006678794743493199 2023-01-22 22:37:36.320970: step: 500/527, loss: 0.009689641185104847 2023-01-22 22:37:37.376358: step: 504/527, loss: 0.00015618542965967208 2023-01-22 22:37:38.443097: step: 508/527, loss: 0.006487994454801083 2023-01-22 22:37:39.500913: step: 512/527, loss: 0.0011897038202732801 2023-01-22 22:37:40.556217: step: 516/527, loss: 0.003776845522224903 2023-01-22 22:37:41.632729: step: 520/527, loss: 0.005213642027229071 2023-01-22 22:37:42.691768: step: 524/527, loss: 0.0017803956288844347 2023-01-22 22:37:43.753474: step: 528/527, loss: 0.00786628108471632 2023-01-22 22:37:44.827039: step: 532/527, loss: 0.0012076753191649914 2023-01-22 22:37:45.876172: step: 536/527, loss: 
0.007799945771694183 2023-01-22 22:37:46.929516: step: 540/527, loss: 0.0019610016606748104 2023-01-22 22:37:47.996703: step: 544/527, loss: 0.006803759839385748 2023-01-22 22:37:49.046987: step: 548/527, loss: 0.002024806337431073 2023-01-22 22:37:50.120360: step: 552/527, loss: 0.0012042023008689284 2023-01-22 22:37:51.173274: step: 556/527, loss: 0.0009109866223298013 2023-01-22 22:37:52.234527: step: 560/527, loss: 0.001502372557297349 2023-01-22 22:37:53.307450: step: 564/527, loss: 0.01679421029984951 2023-01-22 22:37:54.358281: step: 568/527, loss: 0.00010982996900565922 2023-01-22 22:37:55.422742: step: 572/527, loss: 0.001854610163718462 2023-01-22 22:37:56.476230: step: 576/527, loss: 0.0024343417026102543 2023-01-22 22:37:57.532897: step: 580/527, loss: 0.007088626269251108 2023-01-22 22:37:58.590372: step: 584/527, loss: 0.005327480845153332 2023-01-22 22:37:59.649141: step: 588/527, loss: 0.0033573138061910868 2023-01-22 22:38:00.689366: step: 592/527, loss: 0.0027347393333911896 2023-01-22 22:38:01.761249: step: 596/527, loss: 0.01130823977291584 2023-01-22 22:38:02.815530: step: 600/527, loss: 0.00041065970435738564 2023-01-22 22:38:03.863152: step: 604/527, loss: 0.0006110537797212601 2023-01-22 22:38:04.902753: step: 608/527, loss: 2.780222985165892e-06 2023-01-22 22:38:05.977411: step: 612/527, loss: 0.002098517958074808 2023-01-22 22:38:07.036794: step: 616/527, loss: 0.0049666548147797585 2023-01-22 22:38:08.097066: step: 620/527, loss: 0.0011582697043195367 2023-01-22 22:38:09.132721: step: 624/527, loss: 0.001068922458216548 2023-01-22 22:38:10.183550: step: 628/527, loss: 0.0009036035626195371 2023-01-22 22:38:11.245541: step: 632/527, loss: 0.0040367040783166885 2023-01-22 22:38:12.308938: step: 636/527, loss: 0.03547500818967819 2023-01-22 22:38:13.344096: step: 640/527, loss: 0.00039093123632483184 2023-01-22 22:38:14.391992: step: 644/527, loss: 0.009030251763761044 2023-01-22 22:38:15.490225: step: 648/527, loss: 0.00578329199925065 
2023-01-22 22:38:16.558536: step: 652/527, loss: 0.0007094664615578949 2023-01-22 22:38:17.609508: step: 656/527, loss: 9.087818762054667e-05 2023-01-22 22:38:18.651147: step: 660/527, loss: 0.0010494156740605831 2023-01-22 22:38:19.706060: step: 664/527, loss: 1.5628816981916316e-05 2023-01-22 22:38:20.753337: step: 668/527, loss: 0.00026448926655575633 2023-01-22 22:38:21.796357: step: 672/527, loss: 0.002680274425074458 2023-01-22 22:38:22.845969: step: 676/527, loss: 0.004019627813249826 2023-01-22 22:38:23.886015: step: 680/527, loss: 0.003811656264588237 2023-01-22 22:38:24.926625: step: 684/527, loss: 1.1421186172810849e-05 2023-01-22 22:38:25.985392: step: 688/527, loss: 0.0013235409278422594 2023-01-22 22:38:27.036142: step: 692/527, loss: 0.0001925052492879331 2023-01-22 22:38:28.083337: step: 696/527, loss: 0.0002155368565581739 2023-01-22 22:38:29.130816: step: 700/527, loss: 0.0025221812538802624 2023-01-22 22:38:30.182368: step: 704/527, loss: 0.004898042418062687 2023-01-22 22:38:31.241538: step: 708/527, loss: 0.0017057096119970083 2023-01-22 22:38:32.306857: step: 712/527, loss: 0.0035163576249033213 2023-01-22 22:38:33.382562: step: 716/527, loss: 0.010717121884226799 2023-01-22 22:38:34.467981: step: 720/527, loss: 0.002483958378434181 2023-01-22 22:38:35.542508: step: 724/527, loss: 0.0004091110604349524 2023-01-22 22:38:36.618113: step: 728/527, loss: 0.00719792116433382 2023-01-22 22:38:37.664918: step: 732/527, loss: 0.0010332348756492138 2023-01-22 22:38:38.703077: step: 736/527, loss: 0.002130708657205105 2023-01-22 22:38:39.755926: step: 740/527, loss: 0.00012908552889712155 2023-01-22 22:38:40.805554: step: 744/527, loss: 0.00020655262051150203 2023-01-22 22:38:41.851559: step: 748/527, loss: 0.0029447698034346104 2023-01-22 22:38:42.896724: step: 752/527, loss: 0.0021500212606042624 2023-01-22 22:38:43.963258: step: 756/527, loss: 0.0029845640528947115 2023-01-22 22:38:45.021706: step: 760/527, loss: 0.0031347903423011303 2023-01-22 
22:38:46.055436: step: 764/527, loss: 9.04018379515037e-05 2023-01-22 22:38:47.111544: step: 768/527, loss: 0.0017972232308238745 2023-01-22 22:38:48.158274: step: 772/527, loss: 0.0014094715006649494 2023-01-22 22:38:49.240406: step: 776/527, loss: 0.007548716384917498 2023-01-22 22:38:50.277656: step: 780/527, loss: 4.1050832805922255e-05 2023-01-22 22:38:51.328017: step: 784/527, loss: 0.00357470172457397 2023-01-22 22:38:52.366944: step: 788/527, loss: 0.0010819025337696075 2023-01-22 22:38:53.432147: step: 792/527, loss: 0.018882377073168755 2023-01-22 22:38:54.496195: step: 796/527, loss: 0.0008510814514011145 2023-01-22 22:38:55.573520: step: 800/527, loss: 0.009258330799639225 2023-01-22 22:38:56.618075: step: 804/527, loss: 0.00013602181570604444 2023-01-22 22:38:57.681028: step: 808/527, loss: 0.0019705870654433966 2023-01-22 22:38:58.727355: step: 812/527, loss: 0.0008956373785622418 2023-01-22 22:38:59.780971: step: 816/527, loss: 0.003238560166209936 2023-01-22 22:39:00.844691: step: 820/527, loss: 0.004095160868018866 2023-01-22 22:39:01.885369: step: 824/527, loss: 0.00010957848280668259 2023-01-22 22:39:02.940108: step: 828/527, loss: 0.00027631016564555466 2023-01-22 22:39:03.992511: step: 832/527, loss: 0.007950243540108204 2023-01-22 22:39:05.068793: step: 836/527, loss: 0.0019664480350911617 2023-01-22 22:39:06.113831: step: 840/527, loss: 0.00014651849051006138 2023-01-22 22:39:07.166276: step: 844/527, loss: 0.0019699432887136936 2023-01-22 22:39:08.218235: step: 848/527, loss: 0.004299804102629423 2023-01-22 22:39:09.267071: step: 852/527, loss: 0.0016341921873390675 2023-01-22 22:39:10.323396: step: 856/527, loss: 0.0005780266947112978 2023-01-22 22:39:11.370397: step: 860/527, loss: 0.013584185391664505 2023-01-22 22:39:12.419343: step: 864/527, loss: 0.01736573502421379 2023-01-22 22:39:13.480211: step: 868/527, loss: 0.0060394564643502235 2023-01-22 22:39:14.542520: step: 872/527, loss: 0.0038760933093726635 2023-01-22 22:39:15.601983: 
step: 876/527, loss: 0.018327057361602783 2023-01-22 22:39:16.648287: step: 880/527, loss: 0.001958887092769146 2023-01-22 22:39:17.690516: step: 884/527, loss: 9.681533265393227e-05 2023-01-22 22:39:18.735805: step: 888/527, loss: 0.0038818400353193283 2023-01-22 22:39:19.792414: step: 892/527, loss: 0.0013985522091388702 2023-01-22 22:39:20.832659: step: 896/527, loss: 7.033974543446675e-05 2023-01-22 22:39:21.881358: step: 900/527, loss: 0.0055928910151124 2023-01-22 22:39:22.937367: step: 904/527, loss: 0.002146197482943535 2023-01-22 22:39:23.979337: step: 908/527, loss: 0.0023259862791746855 2023-01-22 22:39:25.032437: step: 912/527, loss: 0.006990176159888506 2023-01-22 22:39:26.068892: step: 916/527, loss: 0.0005071528721600771 2023-01-22 22:39:27.118904: step: 920/527, loss: 2.55736867984524e-05 2023-01-22 22:39:28.168752: step: 924/527, loss: 0.0015783029375597835 2023-01-22 22:39:29.216036: step: 928/527, loss: 0.0014483754057437181 2023-01-22 22:39:30.258928: step: 932/527, loss: 0.003942243754863739 2023-01-22 22:39:31.315466: step: 936/527, loss: 0.0016141952946782112 2023-01-22 22:39:32.366110: step: 940/527, loss: 0.000732183747459203 2023-01-22 22:39:33.412566: step: 944/527, loss: 0.0011083107674494386 2023-01-22 22:39:34.460878: step: 948/527, loss: 0.014129206538200378 2023-01-22 22:39:35.506546: step: 952/527, loss: 4.784448174177669e-05 2023-01-22 22:39:36.547688: step: 956/527, loss: 0.005685660056769848 2023-01-22 22:39:37.600439: step: 960/527, loss: 0.0021032216027379036 2023-01-22 22:39:38.643032: step: 964/527, loss: 0.0010349677177146077 2023-01-22 22:39:39.690508: step: 968/527, loss: 0.0005158516578376293 2023-01-22 22:39:40.753656: step: 972/527, loss: 0.005445067770779133 2023-01-22 22:39:41.817649: step: 976/527, loss: 0.00667544174939394 2023-01-22 22:39:42.879494: step: 980/527, loss: 0.0009074313566088676 2023-01-22 22:39:43.925532: step: 984/527, loss: 0.0011769848642870784 2023-01-22 22:39:44.980082: step: 988/527, loss: 
0.001433023950085044 2023-01-22 22:39:46.044618: step: 992/527, loss: 0.0006545026553794742 2023-01-22 22:39:47.100955: step: 996/527, loss: 0.0005731495330110192 2023-01-22 22:39:48.156833: step: 1000/527, loss: 0.00036194580025039613 2023-01-22 22:39:49.200061: step: 1004/527, loss: 0.0013803731417283416 2023-01-22 22:39:50.271642: step: 1008/527, loss: 0.00251230550929904 2023-01-22 22:39:51.341837: step: 1012/527, loss: 5.7000743254320696e-05 2023-01-22 22:39:52.385059: step: 1016/527, loss: 0.000785976939368993 2023-01-22 22:39:53.432101: step: 1020/527, loss: 3.270840898039751e-05 2023-01-22 22:39:54.474762: step: 1024/527, loss: 0.00022950013226363808 2023-01-22 22:39:55.528031: step: 1028/527, loss: 2.7459434932097793e-05 2023-01-22 22:39:56.571602: step: 1032/527, loss: 0.001861344208009541 2023-01-22 22:39:57.622838: step: 1036/527, loss: 0.0004174953792244196 2023-01-22 22:39:58.673843: step: 1040/527, loss: 0.0025279626715928316 2023-01-22 22:39:59.722416: step: 1044/527, loss: 0.0008298912434838712 2023-01-22 22:40:00.773922: step: 1048/527, loss: 0.0006148685934022069 2023-01-22 22:40:01.828159: step: 1052/527, loss: 0.0020226824562996626 2023-01-22 22:40:02.875570: step: 1056/527, loss: 0.0016761821461841464 2023-01-22 22:40:03.910496: step: 1060/527, loss: 0.00027071317890658975 2023-01-22 22:40:04.966656: step: 1064/527, loss: 0.0019281964050605893 2023-01-22 22:40:06.020316: step: 1068/527, loss: 0.002859218046069145 2023-01-22 22:40:07.069324: step: 1072/527, loss: 0.0043061599135398865 2023-01-22 22:40:08.100326: step: 1076/527, loss: 0.0018398403190076351 2023-01-22 22:40:09.160656: step: 1080/527, loss: 0.00013586811837740242 2023-01-22 22:40:10.203914: step: 1084/527, loss: 0.0031078208703547716 2023-01-22 22:40:11.269843: step: 1088/527, loss: 0.0012761126272380352 2023-01-22 22:40:12.331728: step: 1092/527, loss: 0.004461658652871847 2023-01-22 22:40:13.373577: step: 1096/527, loss: 0.0008428000728599727 2023-01-22 22:40:14.423343: step: 
1100/527, loss: 0.002554000820964575 2023-01-22 22:40:15.470455: step: 1104/527, loss: 0.00128261954523623 2023-01-22 22:40:16.528722: step: 1108/527, loss: 0.004220184404402971 2023-01-22 22:40:17.589728: step: 1112/527, loss: 0.006126233376562595 2023-01-22 22:40:18.643868: step: 1116/527, loss: 0.0021530345547944307 2023-01-22 22:40:19.732469: step: 1120/527, loss: 0.00011001452367054299 2023-01-22 22:40:20.783028: step: 1124/527, loss: 0.01634375937283039 2023-01-22 22:40:21.830639: step: 1128/527, loss: 0.0018732628086581826 2023-01-22 22:40:22.887581: step: 1132/527, loss: 0.00157389126252383 2023-01-22 22:40:23.928410: step: 1136/527, loss: 0.005545373074710369 2023-01-22 22:40:24.968845: step: 1140/527, loss: 0.0014116261154413223 2023-01-22 22:40:26.020288: step: 1144/527, loss: 0.0025831719394773245 2023-01-22 22:40:27.068754: step: 1148/527, loss: 0.007299414835870266 2023-01-22 22:40:28.117466: step: 1152/527, loss: 0.00012536680151242763 2023-01-22 22:40:29.180591: step: 1156/527, loss: 0.0029293829575181007 2023-01-22 22:40:30.237617: step: 1160/527, loss: 0.006933015305548906 2023-01-22 22:40:31.277290: step: 1164/527, loss: 0.0013301861472427845 2023-01-22 22:40:32.337526: step: 1168/527, loss: 0.002368565648794174 2023-01-22 22:40:33.388510: step: 1172/527, loss: 0.00046372567885555327 2023-01-22 22:40:34.452079: step: 1176/527, loss: 0.0009666255209594965 2023-01-22 22:40:35.495573: step: 1180/527, loss: 0.02743571810424328 2023-01-22 22:40:36.551992: step: 1184/527, loss: 0.0012554955901578069 2023-01-22 22:40:37.615537: step: 1188/527, loss: 0.0012396962847560644 2023-01-22 22:40:38.677994: step: 1192/527, loss: 0.0015350612811744213 2023-01-22 22:40:39.720117: step: 1196/527, loss: 0.0013064832892268896 2023-01-22 22:40:40.778820: step: 1200/527, loss: 0.003098790068179369 2023-01-22 22:40:41.837415: step: 1204/527, loss: 0.003295590402558446 2023-01-22 22:40:42.894401: step: 1208/527, loss: 0.0037896924186497927 2023-01-22 22:40:43.951112: 
step: 1212/527, loss: 0.0008170441724359989 2023-01-22 22:40:45.002719: step: 1216/527, loss: 0.013305151835083961 2023-01-22 22:40:46.047260: step: 1220/527, loss: 0.002453046850860119 2023-01-22 22:40:47.105476: step: 1224/527, loss: 0.0016749334754422307 2023-01-22 22:40:48.163239: step: 1228/527, loss: 0.004733951762318611 2023-01-22 22:40:49.255102: step: 1232/527, loss: 0.004380635917186737 2023-01-22 22:40:50.293663: step: 1236/527, loss: 0.001545277307741344 2023-01-22 22:40:51.344287: step: 1240/527, loss: 0.0009487916249781847 2023-01-22 22:40:52.394438: step: 1244/527, loss: 0.0031318538822233677 2023-01-22 22:40:53.460794: step: 1248/527, loss: 0.0031580699142068624 2023-01-22 22:40:54.501624: step: 1252/527, loss: 0.006127608008682728 2023-01-22 22:40:55.548916: step: 1256/527, loss: 0.004966470412909985 2023-01-22 22:40:56.591918: step: 1260/527, loss: 0.0005209269002079964 2023-01-22 22:40:57.641021: step: 1264/527, loss: 0.016081875190138817 2023-01-22 22:40:58.683085: step: 1268/527, loss: 7.874284165154677e-06 2023-01-22 22:40:59.750473: step: 1272/527, loss: 0.003223120467737317 2023-01-22 22:41:00.780004: step: 1276/527, loss: 2.6775867809192277e-05 2023-01-22 22:41:01.816283: step: 1280/527, loss: 3.3527435050473287e-08 2023-01-22 22:41:02.862308: step: 1284/527, loss: 0.0012703973334282637 2023-01-22 22:41:03.904570: step: 1288/527, loss: 0.0019685172010213137 2023-01-22 22:41:04.953386: step: 1292/527, loss: 0.0008686878136359155 2023-01-22 22:41:06.010663: step: 1296/527, loss: 0.0032676798291504383 2023-01-22 22:41:07.053391: step: 1300/527, loss: 0.0025982388760894537 2023-01-22 22:41:08.104700: step: 1304/527, loss: 0.0014656097628176212 2023-01-22 22:41:09.157819: step: 1308/527, loss: 2.2512227587867528e-05 2023-01-22 22:41:10.203704: step: 1312/527, loss: 1.3594108168035746e-05 2023-01-22 22:41:11.271736: step: 1316/527, loss: 0.010524352081120014 2023-01-22 22:41:12.309730: step: 1320/527, loss: 0.0001336935965809971 2023-01-22 
22:41:13.366545: step: 1324/527, loss: 0.009151014499366283 2023-01-22 22:41:14.422805: step: 1328/527, loss: 0.005606172140687704 2023-01-22 22:41:15.478320: step: 1332/527, loss: 0.0004165508144069463 2023-01-22 22:41:16.536036: step: 1336/527, loss: 0.014254912734031677 2023-01-22 22:41:17.609994: step: 1340/527, loss: 0.06589920818805695 2023-01-22 22:41:18.642902: step: 1344/527, loss: 0.00764111615717411 2023-01-22 22:41:19.701834: step: 1348/527, loss: 5.8863155572908e-05 2023-01-22 22:41:20.744841: step: 1352/527, loss: 0.0009382255375385284 2023-01-22 22:41:21.802899: step: 1356/527, loss: 0.0018174120923504233 2023-01-22 22:41:22.857584: step: 1360/527, loss: 0.00689328508451581 2023-01-22 22:41:23.923510: step: 1364/527, loss: 0.003093535779044032 2023-01-22 22:41:24.965930: step: 1368/527, loss: 0.005622304044663906 2023-01-22 22:41:26.011726: step: 1372/527, loss: 0.004492186475545168 2023-01-22 22:41:27.072408: step: 1376/527, loss: 0.0008804639219306409 2023-01-22 22:41:28.134505: step: 1380/527, loss: 0.00939116720110178 2023-01-22 22:41:29.174247: step: 1384/527, loss: 0.00021086931519676 2023-01-22 22:41:30.228475: step: 1388/527, loss: 0.0021975624840706587 2023-01-22 22:41:31.291004: step: 1392/527, loss: 0.00034912340925075114 2023-01-22 22:41:32.344104: step: 1396/527, loss: 0.0005080234259366989 2023-01-22 22:41:33.400391: step: 1400/527, loss: 0.0031887295190244913 2023-01-22 22:41:34.463882: step: 1404/527, loss: 0.007273495197296143 2023-01-22 22:41:35.498293: step: 1408/527, loss: 0.0002891934709623456 2023-01-22 22:41:36.545278: step: 1412/527, loss: 0.014580144546926022 2023-01-22 22:41:37.593751: step: 1416/527, loss: 0.012143217958509922 2023-01-22 22:41:38.651975: step: 1420/527, loss: 0.004270481877028942 2023-01-22 22:41:39.706415: step: 1424/527, loss: 0.0005022920668125153 2023-01-22 22:41:40.752955: step: 1428/527, loss: 0.0002557813422754407 2023-01-22 22:41:41.801166: step: 1432/527, loss: 0.0016120565123856068 2023-01-22 
22:41:42.845930: step: 1436/527, loss: 0.0076289186254143715 2023-01-22 22:41:43.900110: step: 1440/527, loss: 8.200816228054464e-05 2023-01-22 22:41:44.931500: step: 1444/527, loss: 0.004389632027596235 2023-01-22 22:41:45.973514: step: 1448/527, loss: 0.0009060506708920002 2023-01-22 22:41:47.002344: step: 1452/527, loss: 0.0001369757519569248 2023-01-22 22:41:48.059970: step: 1456/527, loss: 0.0022664363496005535